Use Version 2.x of the SageMaker Python SDK

Installation

To install the latest version:

pip install --upgrade sagemaker

If you are executing this pip install command in a notebook, make sure to restart your kernel.

Breaking Changes

This section is for major changes that may require updates to your SageMaker Python SDK code. For the full list of changes, see the CHANGELOG.

Removals

Python 2 Support

This library is no longer compatible with Python 2. Python 2 has been EOL since January 1, 2020. Please upgrade to Python 3 if you haven’t already.

Remove Legacy TensorFlow

TensorFlow versions 1.4-1.10 and some variations of versions 1.11-1.12 (see What Constitutes “Legacy TensorFlow Support”) are no longer natively supported by the SageMaker Python SDK.

To use those versions of TensorFlow, you must specify the Docker image URI explicitly, and configure settings via hyperparameters or environment variables rather than using SDK parameters. For more information, see Upgrade from Legacy TensorFlow Support.

SageMaker Python SDK CLI

The SageMaker Python SDK CLI has been removed. (This is different from the AWS CLI.)

delete_endpoint() for Estimators and HyperparameterTuner

The delete_endpoint() method for estimators and HyperparameterTuner is now a no-op. Please use sagemaker.predictor.Predictor.delete_endpoint() instead.

update_endpoint in deploy()

The update_endpoint argument in deploy() methods for estimators and models is now a no-op. Please use sagemaker.predictor.Predictor.update_endpoint() instead.

serializer and deserializer in create_model()

The serializer and deserializer arguments in sagemaker.estimator.Estimator.create_model() are now no-ops. Please specify serializers and deserializers in deploy() methods instead.

content_type and accept in the Predictor Constructor

The content_type and accept parameters are now no-ops in the following classes and methods:

  • sagemaker.predictor.Predictor

  • sagemaker.estimator.Estimator.create_model

  • sagemaker.algorithms.AlgorithmEstimator.create_model

  • sagemaker.tensorflow.model.TensorFlowPredictor

Please specify content types in a serializer or deserializer class instead.

Changes in Default Behavior

Require framework_version and py_version for Frameworks

Framework estimator and model classes now require framework_version and py_version instead of supplying defaults, unless an image URI is explicitly supplied.

For example:

from sagemaker.tensorflow import TensorFlow

TensorFlow(
    entry_point="script.py",
    framework_version="2.2.0",  # now required
    py_version="py37",  # now required
    role="my-role",
    instance_type="ml.m5.xlarge",
    instance_count=1,
)

from sagemaker.mxnet import MXNetModel

MXNetModel(
    model_data="s3://bucket/model.tar.gz",
    role="my-role",
    entry_point="inference.py",
    framework_version="1.6.0",  # now required
    py_version="py3",  # now required
)

Log Display Behavior with attach()

Logs are no longer printed when using attach() with an estimator. To view logs after attaching a training job to an estimator, use sagemaker.estimator.EstimatorBase.logs().

HyperparameterTuner.fit() and Transformer.transform()

sagemaker.tuner.HyperparameterTuner.fit() and sagemaker.transformer.Transformer.transform() now wait until the completion of the Hyperparameter Tuning Job or Batch Transform Job, respectively. To make the function non-blocking, use wait=False.

XGBoost Predictor

The default serializer of sagemaker.xgboost.model.XGBoostPredictor has been changed from NumpySerializer to LibSVMSerializer.

Parameter Order Changes

sagemaker.model.Model Parameter Order

The parameter order for sagemaker.model.Model changed: instead of model_data being first, image_uri (formerly image) is first. As a result, model_data has been made into an optional parameter.

If you are using the sagemaker.model.Model class, your code should be changed as follows:

# v1.x
Model("s3://bucket/path/model.tar.gz", "my-image:latest")

# v2.0 and later
Model("my-image:latest", model_data="s3://bucket/path/model.tar.gz")

Airflow Parameter Order

For sagemaker.workflow.airflow.model_config() and sagemaker.workflow.airflow.model_config_from_estimator(), instance_type is no longer the first positional argument and is now an optional keyword argument.

Dependency Changes

SciPy

SciPy is no longer a required dependency of the SageMaker Python SDK.

If you use sagemaker.amazon.common.write_spmatrix_to_sparse_tensor() and don’t already install SciPy in your environment, you can use our scipy installation target:

pip install sagemaker[scipy]

TensorFlow

The tensorflow installation target has been removed, as it is no longer needed for any SageMaker Python SDK functionality.

If you want to install TensorFlow, see the TensorFlow documentation.

Non-Breaking Changes

Deprecations

Pre-instantiated Serializer and Deserializer Objects

The csv_serializer, json_serializer, npy_serializer, csv_deserializer, json_deserializer, and numpy_deserializer objects have been deprecated.

Please instantiate the objects instead.

v1.x

v2.0 and later

sagemaker.predictor.csv_serializer

sagemaker.serializers.CSVSerializer()

sagemaker.predictor.json_serializer

sagemaker.serializers.JSONSerializer()

sagemaker.predictor.npy_serializer

sagemaker.serializers.NumpySerializer()

sagemaker.predictor.csv_deserializer

sagemaker.deserializers.CSVDeserializer()

sagemaker.predictor.json_deserializer

sagemaker.deserializers.JSONDeserializer()

sagemaker.predictor.numpy_deserializer

sagemaker.deserializers.NumpyDeserializer()

sagemaker.content_types

The sagemaker.content_types module is deprecated in v2.0 and later of the SageMaker Python SDK.

Instead of importing constants from sagemaker.content_types, explicitly write MIME types as a string.

v1.x

v2.0 and later

CONTENT_TYPE_JSON

"application/json"

CONTENT_TYPE_CSV

"text/csv"

CONTENT_TYPE_OCTET_STREAM

"application/octet-stream"

CONTENT_TYPE_NPY

"application/x-npy"

Image URI Functions (e.g. get_image_uri)

The following functions have been deprecated in favor of sagemaker.image_uris.retrieve():

  • sagemaker.amazon_estimator.get_image_uri()

  • sagemaker.fw_utils.create_image_uri()

  • sagemaker.fw_registry.registry()

  • sagemaker.utils.get_ecr_image_uri_prefix()

For more information about usage, see sagemaker.image_uris.retrieve().

enable_cloudwatch_metrics for Estimators and Models

The parameter enable_cloudwatch_metrics has been deprecated. CloudWatch metrics are already emitted for all Training Jobs, etc.

sagemaker.fw_utils.parse_s3_url

The sagemaker.fw_utils.parse_s3_url function has been deprecated. Please use sagemaker.s3.parse_s3_url() instead.

sagemaker.session.ModelContainer

The class sagemaker.session.ModelContainer has been deprecated, as it is not needed for creating inference pipelines.

Parameter and Class Name Changes

Estimators

Renamed Estimator Parameters

The following estimator parameters have been renamed:

v1.x

v2.0 and later

train_instance_count

instance_count

train_instance_type

instance_type

train_max_run

max_run

train_use_spot_instances

use_spot_instances

train_max_wait

max_wait

train_volume_size

volume_size

train_volume_kms_key

volume_kms_key

Serializer and Deserializer Classes

The follow serializer/deserializer classes have been renamed and/or moved:

v1.x

v2.0 and later

sagemaker.predictor._CsvDeserializer

sagemaker.deserializers.CSVDeserializer

sagemaker.predictor._CsvSerializer

sagemaker.serializers.CSVSerializer

sagemaker.predictor.BytesDeserializer

sagemaker.deserializers.BytesDeserializers

sagemaker.predictor.StringDeserializer

sagemaker.deserializers.StringDeserializer

sagemaker.predictor.StreamDeserializer

sagemaker.deserializers.StreamDeserializer

sagemaker.predictor._JsonSerializer

sagemaker.serializers.JSONSerializer

sagemaker.predictor._NumpyDeserializer

sagemaker.deserializers.NumpyDeserializer

sagemaker.predictor._NPYSerializer

sagemaker.serializers.NumpySerializer

sagemaker.amazon.common.numpy_to_record_serializer

sagemaker.amazon.common.RecordSerializer

sagemaker.amazon.common.record_deserializer

sagemaker.amazon.common.RecordDeserializer

sagemaker.predictor._JsonDeserializer

sagemaker.deserializers.JSONDeserializer

sagemaker.serializers.LibSVMSerializer has been added in v2.0.

distributions

For TensorFlow and MXNet estimators, distributions has been renamed to distribution.

Specify Custom Training Images

The image_name parameter has been renamed to image_uri for specifying a custom Docker image URI to use with training.

Models

Specify Custom Serving Image

The image parameter has been renamed to image_uri for specifying a custom Docker image URI to use with inference.

TensorFlow Serving Model

sagemaker.tensorflow.serving.Model has been renamed to sagemaker.tensorflow.model.TensorFlowModel. (For the previous implementation of that class, see Remove Legacy TensorFlow).

Predictors

Generic Predictor Class Name

sagemaker.predictor.RealTimePredictor has been renamed to sagemaker.predictor.Predictor.

Endpoint Argument Name

For sagemaker.predictor.Predictor, sagemaker.sparkml.model.SparkMLPredictor, and predictors for Amazon algorithm (e.g. Factorization Machines, Linear Learner, etc.), the endpoint attribute has been renamed to endpoint_name.

TensorFlow Serving Predictor

sagemaker.tensorflow.serving.Predictor has been renamed to sagemaker.tensorflow.model.TensorFlowPredictor. (For the previous implementation of that class, see Remove Legacy TensorFlow).

Inputs

s3_input

sagemaker.session.s3_input has been renamed to sagemaker.inputs.TrainingInput.

ShuffleConfig

sagemaker.session.ShuffleConfig has been renamed to sagemaker.inputs.ShuffleConfig.

Automatically Upgrade Your Code

To help make your transition as seamless as possible, v2 of the SageMaker Python SDK comes with a command-line tool to automate updating your code. It automates as much as possible, but there are still syntactical and stylistic changes that cannot be performed by the script.

Warning

While the tool is intended to be easy to use, we recommend using it as part of a process that includes testing before and after you run the tool.

Usage

Currently, the tool supports only converting one file at a time:

$ sagemaker-upgrade-v2 --in-file input.py --out-file output.py
$ sagemaker-upgrade-v2 --in-file input.ipynb --out-file output.ipynb

You can apply it to a set of files using a loop:

$ for file in $(find input-dir); do sagemaker-upgrade-v2 --in-file $file --out-file output-dir/$file; done

Limitations

Jupyter Notebook Cells with Shell Commands

If your Jupyter notebook has a code cell with lines that start with either %% or !, the tool ignores that cell. The other cells in the notebook are still updated.

Aliased Imports

The tool checks for a limited number of patterns when looking for constructors. For example, if you are using a TensorFlow estimator, only the following invocation styles are handled:

TensorFlow()
sagemaker.tensorflow.TensorFlow()
sagemaker.tensorflow.estimator.TensorFlow()

If you have aliased an import, e.g. from sagemaker.tensorflow import TensorFlow as TF, the tool does not take care of updating its parameters.

TensorFlow Serving

If you are using the sagemaker.tensorflow.serving.Model class, the tool does not take care of adding a framework version or changing it to sagemaker.tensorflow.TensorFlowModel.

sagemaker.model.Model

If you are using the sagemaker.model.Model class, the tool does not take care of switching the order between model_data and image_uri (formerly image).

update_endpoint and delete_endpoint

The tool does not take care of removing the update_endpoint argument from a deploy call. If you are using that argument, please modify your code to use sagemaker.predictor.Predictor.update_endpoint() instead.

The tool also does not handle delete_endpoint calls on estimators or HyperparameterTuner. If you are using that method, please modify your code to use sagemaker.predictor.Predictor.delete_endpoint() instead.