TensorFlow¶
TensorFlow Estimator¶
- class sagemaker.tensorflow.estimator.TensorFlow(py_version=None, framework_version=None, model_dir=None, image_uri=None, distribution=None, compiler_config=None, **kwargs)¶
Bases: Framework
Handle end-to-end training and deployment of user-provided TensorFlow code.
Initialize a TensorFlow estimator.
- Parameters
py_version (str) – Python version you want to use for executing your model training code. Defaults to None. Required unless image_uri is provided.
framework_version (str) – TensorFlow version you want to use for executing your model training code. Defaults to None. Required unless image_uri is provided. List of supported versions: https://aws.amazon.com/releasenotes/available-deep-learning-containers-images/.
model_dir (str or PipelineVariable) –
S3 location where the checkpoint data and models can be exported to during training (default: None). It will be passed in the training script as one of the command line arguments. If not specified, one is provided based on your training configuration:
distributed training with SMDistributed or MPI with Horovod - /opt/ml/model
single-machine training or distributed training without MPI - s3://{output_path}/model
Local Mode with local sources (file:// instead of s3://) - /opt/ml/shared/model
To disable having model_dir passed to your training script, set model_dir=False.
image_uri (str or PipelineVariable) –
If specified, the estimator will use this image for training and hosting, instead of selecting the appropriate SageMaker official image based on framework_version and py_version. It can be an ECR url or dockerhub image and tag.
Examples: 123.dkr.ecr.us-west-2.amazonaws.com/my-custom-image:1.0, custom-image:latest.
If framework_version or py_version is None, then image_uri is required. If image_uri is also None, then a ValueError will be raised.
distribution (dict) –
A dictionary with information on how to run distributed training (default: None); see the sketch after this parameter list for example usage.
To enable Multi Worker Mirrored Strategy:
{ "multi_worker_mirrored_strategy": { "enabled": True } }
This distribution strategy option is available for TensorFlow 2.9 and later in the SageMaker Python SDK v2.xx.yy and later. To learn more about the mirrored strategy for TensorFlow, see TensorFlow Distributed Training in the TensorFlow documentation.
To enable MPI:
{ "mpi": { "enabled": True } }
To learn more, see Training with Horovod.
To enable parameter server:
{ "parameter_server": { "enabled": True } }
To learn more, see Training with parameter servers.
Note
The SageMaker distributed data parallelism (SMDDP) library discontinued support for TensorFlow. The documentation for the SMDDP library v1.x is still available at Use the SMDDP library in your TensorFlow training script (deprecated) in the Amazon SageMaker User Guide, and the SMDDP v1 API reference in the SageMaker Python SDK v2.199.0 documentation.
Note
The SageMaker model parallelism (SMP) library v2 discontinued support for TensorFlow. The documentation for the SMP library v1.x is archived and available at Run distributed training with the SageMaker model parallelism library in the Amazon SageMaker User Guide, and the SMP v1 API reference in the SageMaker Python SDK v2.199.0 documentation.
compiler_config (TrainingCompilerConfig) – Configures SageMaker Training Compiler to accelerate training.
**kwargs – Additional kwargs passed to the Framework constructor.
Tip
You can find additional parameters for initializing this class at Framework and EstimatorBase.
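For illustration, here is a minimal sketch of constructing and fitting the estimator with the Multi Worker Mirrored Strategy distribution option described above; the entry point, role ARN, instance settings, and S3 paths are placeholders, not values prescribed by this reference:

    from sagemaker.tensorflow import TensorFlow

    # Placeholder training script, role ARN, and data location.
    estimator = TensorFlow(
        entry_point="train.py",
        role="arn:aws:iam::111122223333:role/SageMakerRole",
        instance_count=2,
        instance_type="ml.p3.2xlarge",
        framework_version="2.9",
        py_version="py39",
        # Optional: enable Multi Worker Mirrored Strategy (TF 2.9 and later).
        distribution={"multi_worker_mirrored_strategy": {"enabled": True}},
    )
    estimator.fit("s3://my-bucket/training-data")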
- create_model(role=None, vpc_config_override='VPC_CONFIG_DEFAULT', entry_point=None, source_dir=None, dependencies=None, **kwargs)¶
Creates a TensorFlowModel object to be used for creating SageMaker model entities. This can be done by deploying it to a SageMaker endpoint or starting SageMaker Batch Transform jobs.
- Parameters
role (str) – The IAM Role ARN for the TensorFlowModel, which is also used during transform jobs. If not specified, the role from the Estimator is used.
vpc_config_override (dict[str, list[str]]) –
Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator.
'Subnets' (list[str]): List of subnet ids.
'SecurityGroupIds' (list[str]): List of security group ids.
entry_point (str) – Path (absolute or relative) to the local Python source file which should be executed as the entry point to training. If source_dir is specified, then entry_point must point to a file located at the root of source_dir. If not specified and endpoint_type is 'tensorflow-serving', no entry point is used. If endpoint_type is also None, then the training entry point is used.
source_dir (str) – Path (absolute or relative or an S3 URI) to a directory with any other serving source code dependencies aside from the entry point file (default: None).
dependencies (list[str]) – A list of paths to directories (absolute or relative) with any additional libraries that will be exported to the container (default: None).
**kwargs – Additional kwargs passed to TensorFlowModel.
- Returns
A TensorFlowModel object. See TensorFlowModel for full details.
- Return type
TensorFlowModel
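As an illustrative sketch (assuming estimator has already completed a training job; the serving entry point, source directory, and instance type are placeholders):

    # Create a TensorFlowModel from the trained estimator and deploy it.
    model = estimator.create_model(entry_point="inference.py", source_dir="serving")
    predictor = model.deploy(initial_instance_count=1, instance_type="ml.c5.xlarge")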
- hyperparameters()¶
Return hyperparameters used by your custom TensorFlow code during model training.
- transformer(instance_count, instance_type, strategy=None, assemble_with=None, output_path=None, output_kms_key=None, accept=None, env=None, max_concurrent_transforms=None, max_payload=None, tags=None, role=None, volume_kms_key=None, entry_point=None, vpc_config_override='VPC_CONFIG_DEFAULT', enable_network_isolation=None, model_name=None)¶
Return a Transformer that uses a SageMaker Model based on the training job. It reuses the SageMaker Session and base job name used by the Estimator.
- Parameters
instance_count (int) – Number of EC2 instances to use.
instance_type (str) – Type of EC2 instance to use, for example, ‘ml.c4.xlarge’.
strategy (str) – The strategy used to decide how to batch records in a single request (default: None). Valid values: ‘MultiRecord’ and ‘SingleRecord’.
assemble_with (str) – How the output is assembled (default: None). Valid values: ‘Line’ or ‘None’.
output_path (str) – S3 location for saving the transform result. If not specified, results are stored to a default bucket.
output_kms_key (str) – Optional. KMS key ID for encrypting the transform output (default: None).
accept (str) – The accept header passed by the client to the inference endpoint. If it is supported by the endpoint, it will be the format of the batch transform output.
env (dict) – Environment variables to be set for use during the transform job (default: None).
max_concurrent_transforms (int) – The maximum number of HTTP requests to be made to each individual transform container at one time.
max_payload (int) – Maximum size of the payload in a single HTTP request to the container in MB.
tags (Optional[Tags]) – Tags for labeling a transform job. If none are specified, the tags used for the training job are used for the transform job.
role (str) – The IAM Role ARN for the TensorFlowModel, which is also used during transform jobs. If not specified, the role from the Estimator is used.
volume_kms_key (str) – Optional. KMS key ID for encrypting the volume attached to the ML compute instance (default: None).
entry_point (str) – Path (absolute or relative) to the local Python source file which should be executed as the entry point to training. If source_dir is specified, then entry_point must point to a file located at the root of source_dir. If not specified and endpoint_type is 'tensorflow-serving', no entry point is used. If endpoint_type is also None, then the training entry point is used.
vpc_config_override (dict[str, list[str]]) –
Optional override for the VpcConfig set on the model. Default: use subnets and security groups from this Estimator.
'Subnets' (list[str]): List of subnet ids.
'SecurityGroupIds' (list[str]): List of security group ids.
enable_network_isolation (bool) – Specifies whether container will run in network isolation mode. Network isolation mode restricts the container access to outside networks (such as the internet). The container does not make any inbound or outbound network calls. If True, a channel named “code” will be created for any user entry script for inference. Also known as Internet-free mode. If not specified, this setting is taken from the estimator’s current configuration.
model_name (str) – Name to use for creating an Amazon SageMaker model. If not specified, the estimator generates a default job name based on the training image name and current timestamp.
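A short sketch of creating a transformer from a trained estimator and running a batch job; the S3 paths, instance type, and content type are placeholder assumptions:

    # Build a Transformer from the trained estimator and start a transform job.
    transformer = estimator.transformer(
        instance_count=1,
        instance_type="ml.m5.xlarge",
        strategy="MultiRecord",
        assemble_with="Line",
        output_path="s3://my-bucket/transform-output",
    )
    transformer.transform("s3://my-bucket/transform-input", content_type="text/csv")
    transformer.wait()  # block until the batch transform job finishes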
- uploaded_code: Optional[UploadedCode]¶
TensorFlow Training Compiler Configuration¶
- class sagemaker.tensorflow.TrainingCompilerConfig(enabled=True, debug=False)¶
Bases: TrainingCompilerConfig
The SageMaker Training Compiler configuration class.
This class initializes a TrainingCompilerConfig instance. Amazon SageMaker Training Compiler is a feature of SageMaker Training and speeds up training jobs by optimizing model execution graphs.
You can compile TensorFlow models by passing the object of this configuration class to the compiler_config parameter of the TensorFlow estimator.
- Parameters
enabled (bool or PipelineVariable) – Optional. Switch to enable SageMaker Training Compiler. The default is True.
debug (bool or PipelineVariable) – Optional. Whether to dump detailed logs for debugging. This comes with a potential performance slowdown. The default is False.
Example: The following code shows the basic usage of the sagemaker.tensorflow.TrainingCompilerConfig() class to run a TensorFlow training job with the compiler.

    from sagemaker.tensorflow import TensorFlow, TrainingCompilerConfig

    tensorflow_estimator = TensorFlow(
        ...
        compiler_config=TrainingCompilerConfig()
    )
See also
For more information about how to enable SageMaker Training Compiler for various training settings such as using TensorFlow-based models, PyTorch-based models, and distributed training, see Enable SageMaker Training Compiler in the Amazon SageMaker Training Compiler developer guide.
- SUPPORTED_INSTANCE_CLASS_PREFIXES = ['p3', 'p3dn', 'g4dn', 'p4d', 'g5']¶
- MIN_SUPPORTED_VERSION = '2.9'¶
- MAX_SUPPORTED_VERSION = '2.11'¶
- classmethod validate(estimator)¶
Checks if SageMaker Training Compiler is configured correctly.
- Parameters
estimator (sagemaker.tensorflow.estimator.TensorFlow) – An estimator object. If SageMaker Training Compiler is enabled, it will validate whether the estimator is configured to be compatible with Training Compiler.
- Raises
ValueError – Raised if the requested configuration is not compatible with SageMaker Training Compiler.
TensorFlow Serving Model¶
- class sagemaker.tensorflow.model.TensorFlowModel(model_data, role=None, entry_point=None, image_uri=None, framework_version=None, container_log_level=None, predictor_cls=<class 'sagemaker.tensorflow.model.TensorFlowPredictor'>, **kwargs)¶
Bases: FrameworkModel
A FrameworkModel implementation for inference with TensorFlow Serving.
Initialize a Model.
- Parameters
model_data (str or PipelineVariable) – The S3 location of a SageMaker model data .tar.gz file.
role (str) – An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
entry_point (str) – Path (absolute or relative) to the Python source file which should be executed as the entry point to model hosting. If source_dir is specified, then entry_point must point to a file located at the root of source_dir.
image_uri (str or PipelineVariable) – A Docker image URI (default: None). If not specified, a default image for TensorFlow Serving will be used. If framework_version is None, then image_uri is required. If image_uri is also None, then a ValueError will be raised.
framework_version (str) – Optional. TensorFlow Serving version you want to use. Defaults to None. Required unless image_uri is provided.
container_log_level (int) – Log level to use within the container (default: logging.ERROR). Valid values are defined in the Python logging module.
predictor_cls (callable[str, sagemaker.session.Session]) – A function to call to create a predictor with an endpoint name and SageMaker Session. If specified, deploy() returns the result of invoking this function on the created endpoint name.
**kwargs – Keyword arguments passed to the superclass FrameworkModel and, subsequently, its superclass Model.
Tip
You can find additional parameters for initializing this class at FrameworkModel and Model.
- LOG_LEVEL_PARAM_NAME = 'SAGEMAKER_TFS_NGINX_LOGLEVEL'¶
- LOG_LEVEL_MAP = {10: 'debug', 20: 'info', 30: 'warn', 40: 'error', 50: 'crit'}¶
- LATEST_EIA_VERSION = [2, 3]¶
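For illustration, a minimal sketch of wrapping existing model artifacts for TensorFlow Serving; the S3 path and role ARN are placeholders:

    from sagemaker.tensorflow import TensorFlowModel

    # Point the model at pre-trained artifacts stored in S3.
    model = TensorFlowModel(
        model_data="s3://my-bucket/model/model.tar.gz",
        role="arn:aws:iam::111122223333:role/SageMakerRole",
        framework_version="2.9",
    )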
- register(content_types=None, response_types=None, inference_instances=None, transform_instances=None, model_package_name=None, model_package_group_name=None, image_uri=None, model_metrics=None, metadata_properties=None, marketplace_cert=False, approval_status=None, description=None, drift_check_baselines=None, customer_metadata_properties=None, domain=None, sample_payload_url=None, task=None, framework=None, framework_version=None, nearest_model_name=None, data_input_configuration=None, skip_model_validation=None, source_uri=None, model_card=None, model_life_cycle=None)¶
Creates a model package for creating SageMaker models or listing on Marketplace.
- Parameters
content_types (list[str] or list[PipelineVariable]) – The supported MIME types for the input data.
response_types (list[str] or list[PipelineVariable]) – The supported MIME types for the output data.
inference_instances (list[str] or list[PipelineVariable]) – A list of the instance types that are used to generate inferences in real-time (default: None).
transform_instances (list[str] or list[PipelineVariable]) – A list of the instance types on which a transformation job can be run or on which an endpoint can be deployed (default: None).
model_package_name (str or PipelineVariable) – Model Package name, exclusive to model_package_group_name; using model_package_name makes the Model Package un-versioned (default: None).
model_package_group_name (str or PipelineVariable) – Model Package Group name, exclusive to model_package_name; using model_package_group_name makes the Model Package versioned (default: None).
image_uri (str or PipelineVariable) – Inference image uri for the container. Model class’ self.image will be used if it is None (default: None).
model_metrics (ModelMetrics) – ModelMetrics object (default: None).
metadata_properties (MetadataProperties) – MetadataProperties object (default: None).
marketplace_cert (bool) – A boolean value indicating if the Model Package is certified for AWS Marketplace (default: False).
approval_status (str or PipelineVariable) – Model Approval Status, values can be “Approved”, “Rejected”, or “PendingManualApproval” (default: “PendingManualApproval”).
description (str) – Model Package description (default: None).
drift_check_baselines (DriftCheckBaselines) – DriftCheckBaselines object (default: None).
customer_metadata_properties (dict[str, str] or dict[str, PipelineVariable]) – A dictionary of key-value paired metadata properties (default: None).
domain (str or PipelineVariable) – Domain values can be “COMPUTER_VISION”, “NATURAL_LANGUAGE_PROCESSING”, “MACHINE_LEARNING” (default: None).
sample_payload_url (str or PipelineVariable) – The S3 path where the sample payload is stored (default: None).
task (str or PipelineVariable) – Task values which are supported by Inference Recommender are “FILL_MASK”, “IMAGE_CLASSIFICATION”, “OBJECT_DETECTION”, “TEXT_GENERATION”, “IMAGE_SEGMENTATION”, “CLASSIFICATION”, “REGRESSION”, “OTHER” (default: None).
framework (str or PipelineVariable) – Machine learning framework of the model package container image (default: None).
framework_version (str or PipelineVariable) – Framework version of the Model Package Container Image (default: None).
nearest_model_name (str or PipelineVariable) – Name of a pre-trained machine learning benchmarked by Amazon SageMaker Inference Recommender (default: None).
data_input_configuration (str or PipelineVariable) – Input object for the model (default: None).
skip_model_validation (str or PipelineVariable) – Indicates if you want to skip model validation. Values can be “All” or “None” (default: None).
source_uri (str or PipelineVariable) – The URI of the source for the model package (default: None).
model_card (ModelCard or ModelPackageModelCard) – A document that contains qualitative and quantitative information about a model (default: None).
model_life_cycle (ModelLifeCycle) – ModelLifeCycle object (default: None).
- Returns
A sagemaker.model.ModelPackage instance.
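A sketch of registering the model in a model package group; the group name, MIME types, and instance types are placeholder assumptions:

    # Register the model so it can be versioned and approved in Model Registry.
    model_package = model.register(
        content_types=["application/json"],
        response_types=["application/json"],
        inference_instances=["ml.c5.xlarge"],
        transform_instances=["ml.m5.xlarge"],
        model_package_group_name="my-tf-model-group",
    )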
- deploy(initial_instance_count=None, instance_type=None, serializer=None, deserializer=None, accelerator_type=None, endpoint_name=None, tags=None, kms_key=None, wait=True, data_capture_config=None, async_inference_config=None, serverless_inference_config=None, volume_size=None, model_data_download_timeout=None, container_startup_health_check_timeout=None, inference_recommendation_id=None, explainer_config=None, **kwargs)¶
Deploy a TensorFlow Model to a SageMaker Endpoint.
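A minimal sketch, assuming the model object from above; the endpoint name and payload are placeholders:

    # Deploy to a real-time endpoint and send a TensorFlow Serving-style request.
    predictor = model.deploy(
        initial_instance_count=1,
        instance_type="ml.c5.xlarge",
        endpoint_name="my-tf-endpoint",
    )
    result = predictor.predict({"instances": [[1.0, 2.0, 3.0]]})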
- prepare_container_def(instance_type=None, accelerator_type=None, serverless_inference_config=None, accept_eula=None, model_reference_arn=None)¶
Prepare the container definition.
- Parameters
instance_type – Instance type of the container.
accelerator_type – Accelerator type, if applicable.
serverless_inference_config (sagemaker.serverless.ServerlessInferenceConfig) – Specifies configuration related to serverless endpoint. Instance type is not provided in serverless inference. So this is used to find image URIs.
accept_eula (bool) – For models that require a Model Access Config, specify True or False to indicate whether model terms of use have been accepted. The accept_eula value must be explicitly defined as True in order to accept the end-user license agreement (EULA) that some models require. (Default: None).
- Returns
A container definition for deploying a Model to an Endpoint.
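For illustration, a sketch of inspecting the container definition the model would use for a given (placeholder) instance type:

    # Returns a dict describing the serving container (image URI, environment, model data).
    container_def = model.prepare_container_def(instance_type="ml.c5.xlarge")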
- serving_image_uri(region_name, instance_type, accelerator_type=None, serverless_inference_config=None)¶
Create a URI for the serving image.
- Parameters
region_name (str) – AWS region where the image is uploaded.
instance_type (str) – SageMaker instance type. Used to determine device type (cpu/gpu/family-specific optimized).
accelerator_type (str) – The Elastic Inference accelerator type to deploy to the instance for loading and making inferences to the model (default: None). For example, ‘ml.eia1.medium’.
serverless_inference_config (sagemaker.serverless.ServerlessInferenceConfig) – Specifies configuration related to serverless endpoint. Instance type is not provided in serverless inference. So this is used to determine device type.
- Returns
The appropriate image URI based on the given parameters.
- Return type
str
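A small sketch of resolving the serving image URI directly; the region and instance type are example values:

    # Look up the TensorFlow Serving image the model would be hosted with.
    image = model.serving_image_uri("us-west-2", "ml.c5.xlarge")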
TensorFlow Serving Predictor¶
- class sagemaker.tensorflow.model.TensorFlowPredictor(endpoint_name, sagemaker_session=None, serializer=<sagemaker.base_serializers.JSONSerializer object>, deserializer=<sagemaker.base_deserializers.JSONDeserializer object>, model_name=None, model_version=None, component_name=None, **kwargs)¶
Bases: Predictor
A Predictor implementation for inference against TensorFlow Serving endpoints.
Initialize a TensorFlowPredictor.
See Predictor for more info about parameters.
- Parameters
endpoint_name (str) – The name of the endpoint to perform inference on.
sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
serializer (callable) – Optional. The default serializes input data to JSON and handles dicts, lists, and numpy arrays.
deserializer (callable) – Optional. The default parses the response using json.load(...).
model_name (str) – Optional. The name of the SavedModel model that should handle the request. If not specified, the endpoint’s default model will handle the request.
model_version (str) – Optional. The version of the SavedModel model that should handle the request. If not specified, the latest version of the model will be used.
component_name (str) – Optional. Name of the Amazon SageMaker inference component corresponding to the predictor.
- classify(data)¶
Placeholder docstring.
- regress(data)¶
Placeholder docstring.
- predict(data, initial_args=None)¶
Placeholder docstring.
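A minimal usage sketch; the endpoint name is a placeholder, and the payload follows TensorFlow Serving's "instances" request format:

    from sagemaker.tensorflow.model import TensorFlowPredictor

    # Attach a predictor to an existing TensorFlow Serving endpoint.
    predictor = TensorFlowPredictor(endpoint_name="my-tf-endpoint")
    response = predictor.predict({"instances": [[1.0, 2.0, 3.0]]})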
TensorFlow Processor¶
- class sagemaker.tensorflow.processing.TensorFlowProcessor(framework_version, role=None, instance_count=None, instance_type=None, py_version='py3', image_uri=None, command=None, volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, code_location=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)¶
Bases: FrameworkProcessor
Handles Amazon SageMaker processing tasks for jobs using TensorFlow containers.
This processor executes a Python script in a TensorFlow execution environment.
Unless image_uri is specified, the TensorFlow environment is an Amazon-built Docker container that executes functions defined in the supplied code Python script.
The arguments have the exact same meaning as in FrameworkProcessor.
Tip
You can find additional parameters for initializing this class at FrameworkProcessor.
- Parameters
framework_version (str) –
role (Optional[Union[str, PipelineVariable]]) –
instance_count (Union[int, PipelineVariable]) –
instance_type (Union[str, PipelineVariable]) –
py_version (str) –
image_uri (Optional[Union[str, PipelineVariable]]) –
volume_size_in_gb (Union[int, PipelineVariable]) –
volume_kms_key (Optional[Union[str, PipelineVariable]]) –
output_kms_key (Optional[Union[str, PipelineVariable]]) –
max_runtime_in_seconds (Optional[Union[int, PipelineVariable]]) –
tags (Optional[Union[List[Dict[str, Union[str, PipelineVariable]]], Dict[str, Union[str, PipelineVariable]]]]) –
network_config (Optional[NetworkConfig]) –
- estimator_cls¶
alias of TensorFlow
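For illustration, a sketch of running a processing script in the TensorFlow container; the role ARN, script name, version strings, and S3 paths are placeholders:

    from sagemaker.processing import ProcessingInput, ProcessingOutput
    from sagemaker.tensorflow.processing import TensorFlowProcessor

    processor = TensorFlowProcessor(
        framework_version="2.9",
        py_version="py39",
        role="arn:aws:iam::111122223333:role/SageMakerRole",
        instance_count=1,
        instance_type="ml.m5.xlarge",
    )

    # Run a user script with S3 inputs and outputs mounted into the container.
    processor.run(
        code="preprocess.py",
        inputs=[ProcessingInput(source="s3://my-bucket/raw", destination="/opt/ml/processing/input")],
        outputs=[ProcessingOutput(source="/opt/ml/processing/output", destination="s3://my-bucket/processed")],
    )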