Inputs

Amazon SageMaker channel configurations for S3 data sources and file system data sources

class sagemaker.inputs.TrainingInput(s3_data, distribution=None, compression=None, content_type=None, record_wrapping=None, s3_data_type='S3Prefix', input_mode=None, attribute_names=None, target_attribute_name=None, shuffle_config=None)

Bases: object

Amazon SageMaker channel configurations for S3 data sources.

config

A SageMaker DataSource referencing a SageMaker S3DataSource.

Type

dict[str, dict]

Create a definition for input data used by a SageMaker training job.

See AWS documentation on the CreateTrainingJob API for more details on the parameters.

Parameters
  • s3_data (str) – Defines the location of the S3 data to train on.

  • distribution (str) – Valid values: ‘FullyReplicated’, ‘ShardedByS3Key’ (default: ‘FullyReplicated’).

  • compression (str) – Valid values: ‘Gzip’, None (default: None). This is used only in Pipe input mode.

  • content_type (str) – MIME type of the input data (default: None).

  • record_wrapping (str) – Valid values: ‘RecordIO’ (default: None).

  • s3_data_type (str) – Valid values: ‘S3Prefix’, ‘ManifestFile’, ‘AugmentedManifestFile’ (default: ‘S3Prefix’). If ‘S3Prefix’, s3_data defines a prefix of S3 objects to train on, and all objects whose S3 keys begin with s3_data are used for training. If ‘ManifestFile’ or ‘AugmentedManifestFile’, s3_data defines a single S3 manifest file or augmented manifest file (respectively) that lists the S3 data to train on. Both the ManifestFile and AugmentedManifestFile formats are described in the SageMaker API documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/API_S3DataSource.html

  • input_mode (str) –

    Optional override for this channel’s input mode (default: None). By default, channels will use the input mode defined on sagemaker.estimator.EstimatorBase.input_mode, but they will ignore that setting if this parameter is set.

    • None - Amazon SageMaker will use the input mode specified in the Estimator.

    • ’File’ - Amazon SageMaker copies the training dataset from the S3 location to a local directory.

    • ’Pipe’ - Amazon SageMaker streams data directly from S3 to the container via a Unix named pipe.

    • ’FastFile’ - Amazon SageMaker streams data from S3 on demand instead of downloading the entire dataset before training begins.

  • attribute_names (list[str]) – A list of one or more attribute names to use that are found in a specified AugmentedManifestFile.

  • target_attribute_name (str) – The name of the attribute that will be predicted (classified) in a SageMaker AutoML job. Required if the input is for a SageMaker AutoML job.

  • shuffle_config (sagemaker.inputs.ShuffleConfig) – If specified, this configuration enables shuffling on this channel. See the SageMaker API documentation for more information: https://docs.aws.amazon.com/sagemaker/latest/dg/API_ShuffleConfig.html
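The parameters above map onto the fields of a Channel in the CreateTrainingJob API. The following is a minimal sketch of that mapping in plain Python, so it runs without the SDK installed. The helper build_s3_channel, the shuffle_seed argument, and the example bucket names are hypothetical and are not part of sagemaker.inputs; the exact dictionary that TrainingInput stores in config may differ in detail.

```python
from typing import Any, Dict, List, Optional


def build_s3_channel(
    s3_data: str,
    distribution: Optional[str] = None,
    compression: Optional[str] = None,
    content_type: Optional[str] = None,
    record_wrapping: Optional[str] = None,
    s3_data_type: str = "S3Prefix",
    input_mode: Optional[str] = None,
    attribute_names: Optional[List[str]] = None,
    shuffle_seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: approximate a Channel dict for an S3 data source."""
    s3_source: Dict[str, Any] = {
        "S3DataType": s3_data_type,
        "S3Uri": s3_data,
        # The documented default distribution is 'FullyReplicated'.
        "S3DataDistributionType": distribution or "FullyReplicated",
    }
    if attribute_names is not None:
        s3_source["AttributeNames"] = attribute_names

    channel: Dict[str, Any] = {"DataSource": {"S3DataSource": s3_source}}

    # Optional channel-level settings are included only when they are set.
    optional = {
        "CompressionType": compression,
        "ContentType": content_type,
        "RecordWrapperType": record_wrapping,
        "InputMode": input_mode,
    }
    channel.update({k: v for k, v in optional.items() if v is not None})

    if shuffle_seed is not None:
        channel["ShuffleConfig"] = {"Seed": shuffle_seed}
    return channel


# A prefix-based channel: every object under the prefix is used for training.
prefix_channel = build_s3_channel(
    "s3://example-bucket/train/", content_type="text/csv"
)

# An augmented-manifest channel that overrides the estimator's input mode.
manifest_channel = build_s3_channel(
    "s3://example-bucket/train/data.manifest",
    s3_data_type="AugmentedManifestFile",
    attribute_names=["source-ref", "label"],
    input_mode="Pipe",
    record_wrapping="RecordIO",
    shuffle_seed=1234,
)

print(prefix_channel)
print(manifest_channel)
```

With the SDK installed, the equivalent objects would be created by passing the same keyword arguments to sagemaker.inputs.TrainingInput, with a ShuffleConfig instance in place of the hypothetical shuffle_seed.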

class sagemaker.inputs.ShuffleConfig(seed)

Bases: object

For configuring channel shuffling using a seed.

For more detail, see the AWS documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/API_ShuffleConfig.html

Create a ShuffleConfig.

Parameters

seed (long) – The long value used to seed the shuffled sequence.
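The role of a seed can be illustrated with a generic seeded shuffle: the same seed yields the same order on every run. This sketch uses Python's random module and hypothetical object keys. It is not the shuffling algorithm SageMaker applies to a channel, only an illustration of why a fixed seed makes a shuffle reproducible.

```python
import random
from typing import List, Sequence


def shuffled_keys(keys: Sequence[str], seed: int) -> List[str]:
    """Return a shuffled copy of keys, with the order determined by seed."""
    rng = random.Random(seed)  # private generator; global state is untouched
    result = list(keys)
    rng.shuffle(result)
    return result


# Hypothetical object keys standing in for the files of a channel.
keys = [f"train/part-{i:03d}.csv" for i in range(6)]

first = shuffled_keys(keys, seed=42)
second = shuffled_keys(keys, seed=42)

print(first)
print(first == second)
```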

class sagemaker.inputs.CreateModelInput(instance_type: str = None, accelerator_type: str = None)

Bases: object

A class containing parameters that can be used to create a SageMaker Model.

instance_type

The type of EC2 instance that will be used for model deployment.

Type

str

accelerator_type

The Elastic Inference accelerator type.

Type

str

Method generated by attrs for class CreateModelInput.

instance_type: str
accelerator_type: str

class sagemaker.inputs.TransformInput(data: str, data_type: str = 'S3Prefix', content_type: str = None, compression_type: str = None, split_type: str = None, input_filter: str = None, output_filter: str = None, join_source: str = None, model_client_config: dict = None)

Bases: object

Create a class containing all the parameters.

It can be used when calling sagemaker.transformer.Transformer.transform().

Method generated by attrs for class TransformInput.

data: str
data_type: str
content_type: str
compression_type: str
split_type: str
input_filter: str
output_filter: str
join_source: str
model_client_config: dict
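
The attribute list above can be pictured as a simple value container whose fields are forwarded to Transformer.transform() as keyword arguments. The sketch below uses a standard-library dataclass as a stand-in so it runs without the SDK or attrs installed. TransformInputSketch, to_transform_kwargs, and the bucket name are hypothetical; the real class is generated by attrs and may behave differently.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class TransformInputSketch:
    """Hypothetical stand-in mirroring the documented TransformInput fields."""

    data: str
    data_type: str = "S3Prefix"
    content_type: Optional[str] = None
    compression_type: Optional[str] = None
    split_type: Optional[str] = None
    input_filter: Optional[str] = None
    output_filter: Optional[str] = None
    join_source: Optional[str] = None
    model_client_config: Optional[dict] = None


def to_transform_kwargs(inputs: TransformInputSketch) -> Dict[str, Any]:
    """Collect the fields that are set, ready to pass as keyword arguments."""
    return {k: v for k, v in asdict(inputs).items() if v is not None}


inputs = TransformInputSketch(
    data="s3://example-bucket/batch/",
    content_type="text/csv",
    split_type="Line",
)
kwargs = to_transform_kwargs(inputs)
print(kwargs)
```

Dropping unset fields keeps the resulting call limited to the options that were chosen explicitly, leaving everything else at its default.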

class sagemaker.inputs.FileSystemInput(file_system_id, file_system_type, directory_path, file_system_access_mode='ro', content_type=None)

Bases: object

Amazon SageMaker channel configurations for file system data sources.

config

A SageMaker File System DataSource.

Type

dict[str, dict]

Create a new file system input used by a SageMaker training job.

Parameters