Use Version 2.0 of the SageMaker Python SDK (experimental)

Development on v2.0.0 of the SageMaker Python SDK is underway. For more info on our plans, see https://github.com/aws/sagemaker-python-sdk/issues/1459.

Installation

Warning

Version 2.0.0 is currently experimental, so proceed with caution. If you do run into issues or have any other feedback, please let us know by opening an issue or commenting on our planning issue.

To install the latest release candidate:

pip install git+https://github.com/aws/sagemaker-python-sdk.git@v2.0.0.rc1

To install the latest version of v2:

pip install git+https://github.com/aws/sagemaker-python-sdk.git@zwei

If you are executing this pip install command in a notebook, make sure to restart your kernel.
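After installing, it can help to confirm which version pip actually placed in the environment. The following is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata) and assuming the distribution is published under the name sagemaker; the helper name installed_version is hypothetical. It reports what is installed on disk, so a running notebook kernel still needs a restart to pick up the new code.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_version(dist_name="sagemaker"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("sagemaker is not installed in this environment")
elif version.split(".")[0] == "2":
    print(f"SageMaker Python SDK v2 is installed ({version})")
else:
    print(f"Installed version is {version}, not v2")
```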

Changes

This section covers major changes that may require updates to your SageMaker Python SDK code. For the full list of changes, see the CHANGELOG.

Deprecations

Python 2 Support

This library is no longer supported for Python 2. Please upgrade to Python 3 if you haven’t already.

Deprecate Legacy TensorFlow

TensorFlow versions 1.4-1.10 and some variations of versions 1.11-1.12 (see What Constitutes “Legacy TensorFlow Support”) are no longer natively supported by the SageMaker Python SDK.

To use those versions of TensorFlow, you must specify the Docker image URI explicitly, and configure settings via hyperparameters or environment variables rather than using SDK parameters. For more information, see Upgrade from Legacy TensorFlow Support.

delete_endpoint() for Estimators and HyperparameterTuner

The delete_endpoint() method for estimators and HyperparameterTuner has been deprecated. Please use sagemaker.predictor.Predictor.delete_endpoint() instead.
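As an illustration of the change, the sketch below deletes an endpoint through the predictor returned by deploy() rather than through the estimator. The function name deploy_and_clean_up and the default instance type are assumptions made for this example; the estimator is expected to be already fitted.

```python
def deploy_and_clean_up(estimator, instance_type="ml.m5.xlarge"):
    """Deploy a fitted estimator, then delete the endpoint via the predictor."""
    predictor = estimator.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
    )
    try:
        pass  # ... call predictor.predict(...) here ...
    finally:
        # v1.x:           estimator.delete_endpoint()
        # v2.0 and later: the predictor is responsible for endpoint deletion
        predictor.delete_endpoint()
    return predictor
```

Wrapping the call in try/finally is a design choice for this sketch, so that the endpoint is removed even if inference fails.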

update_endpoint in deploy()

The update_endpoint argument in deploy() methods for estimators and models has been deprecated. Please use sagemaker.predictor.Predictor.update_endpoint() instead.
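A minimal sketch of the replacement pattern follows. The helper name scale_endpoint is hypothetical, and the keyword names initial_instance_count and instance_type passed to update_endpoint() are assumptions not stated in this guide; check them against the Predictor API reference for your installed version.

```python
def scale_endpoint(predictor, instance_count, instance_type):
    """Update the endpoint behind an existing predictor instead of redeploying."""
    # v1.x:           estimator.deploy(..., update_endpoint=True)
    # v2.0 and later: call update_endpoint() on the predictor itself.
    # Keyword names below are assumed; verify them in the API reference.
    predictor.update_endpoint(
        initial_instance_count=instance_count,
        instance_type=instance_type,
    )
```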

Require framework_version and py_version for Frameworks

Framework estimator and model classes now require framework_version and py_version instead of supplying defaults, unless an image URI is explicitly supplied.

For example:

from sagemaker.tensorflow import TensorFlow

TensorFlow(
    entry_point="script.py",
    framework_version="2.2.0",  # now required
    py_version="py37",  # now required
    role="my-role",
    instance_type="ml.m5.xlarge",
    instance_count=1,
)

from sagemaker.mxnet import MXNetModel

MXNetModel(
    model_data="s3://bucket/model.tar.gz",
    role="my-role",
    entry_point="inference.py",
    framework_version="1.6.0",  # now required
    py_version="py3",  # now required
)

Parameter and Class Name Changes

Estimators

Renamed Estimator Parameters

The following estimator parameters have been renamed:

v1.x                        v2.0 and later
train_instance_count        instance_count
train_instance_type         instance_type
train_max_run               max_run
train_use_spot_instances    use_spot_instances
train_max_run_wait          max_run_wait
train_volume_size           volume_size
train_volume_kms_key        volume_kms_key

distributions

For TensorFlow and MXNet estimators, distributions has been renamed to distribution.

Specify Custom Training Images

The image_name parameter has been renamed to image_uri for specifying a custom Docker image URI to use with training.
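If estimator arguments are assembled as a dictionary, the renames in this section can be applied mechanically. The sketch below is one possible approach, not part of the SDK: the names RENAMED_ESTIMATOR_KWARGS and upgrade_estimator_kwargs are hypothetical, and the distributions rename is opt-in because it applies only to TensorFlow and MXNet estimators.

```python
# Old-to-new estimator keyword names, taken from the renames in this section.
RENAMED_ESTIMATOR_KWARGS = {
    "train_instance_count": "instance_count",
    "train_instance_type": "instance_type",
    "train_max_run": "max_run",
    "train_use_spot_instances": "use_spot_instances",
    "train_max_run_wait": "max_run_wait",
    "train_volume_size": "volume_size",
    "train_volume_kms_key": "volume_kms_key",
    "image_name": "image_uri",
}


def upgrade_estimator_kwargs(kwargs, rename_distributions=False):
    """Return a copy of kwargs with v1.x estimator names replaced by v2 names."""
    renames = dict(RENAMED_ESTIMATOR_KWARGS)
    if rename_distributions:  # only for TensorFlow and MXNet estimators
        renames["distributions"] = "distribution"

    upgraded = {}
    for name, value in kwargs.items():
        new_name = renames.get(name, name)
        if new_name in upgraded:
            raise ValueError(f"both old and new spellings given for {new_name!r}")
        upgraded[new_name] = value
    return upgraded


v1_kwargs = {
    "train_instance_count": 1,
    "train_instance_type": "ml.m5.xlarge",
    "image_name": "my-image:latest",
    "role": "my-role",
}
print(upgrade_estimator_kwargs(v1_kwargs))
```

Names that are not in the mapping, such as role, pass through unchanged.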

Models

sagemaker.model.Model Parameter Order

The parameter order for sagemaker.model.Model has changed: image_uri (formerly image) is now the first parameter instead of model_data. As a result, model_data is now an optional parameter.

If you are using the sagemaker.model.Model class, your code should be changed as follows:

# v1.x
Model("s3://bucket/path/model.tar.gz", "my-image:latest")

# v2.0 and later
Model("my-image:latest", model_data="s3://bucket/path/model.tar.gz")

Specify Custom Serving Image

The image parameter has been renamed to image_uri for specifying a custom Docker image URI to use with inference.

Predictors

sagemaker.predictor.RealTimePredictor has been renamed to sagemaker.predictor.Predictor.

In addition, for sagemaker.predictor.Predictor, sagemaker.sparkml.model.SparkMLPredictor, and predictors for Amazon algorithms (e.g. Factorization Machines and Linear Learner), the endpoint attribute has been renamed to endpoint_name.
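Code that has to run against both SDK versions during a migration can read the attribute defensively. This is a minimal sketch; the helper get_endpoint_name is hypothetical, and the SimpleNamespace objects are stand-ins for real predictors used only to make the example runnable.

```python
from types import SimpleNamespace


def get_endpoint_name(predictor):
    """Return the endpoint name from a predictor created by either SDK version."""
    # v2.0 and later expose `endpoint_name`; v1.x exposed `endpoint`.
    if hasattr(predictor, "endpoint_name"):
        return predictor.endpoint_name
    return predictor.endpoint


# Stand-ins for real predictor objects (assumption for illustration only).
v1_style = SimpleNamespace(endpoint="my-endpoint")
v2_style = SimpleNamespace(endpoint_name="my-endpoint")
print(get_endpoint_name(v1_style), get_endpoint_name(v2_style))
```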

Dependency Changes

SciPy

SciPy is no longer a required dependency of the SageMaker Python SDK.

If you use sagemaker.amazon.common.write_spmatrix_to_sparse_tensor() and don’t already have SciPy installed in your environment, you can use our scipy installation target:

pip install sagemaker[scipy]
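Because SciPy is now optional, code that writes sparse tensors may want to fail early with a clear message when SciPy is missing. The sketch below is one way to do that; scipy_available and write_sparse are hypothetical helpers, and the positional arguments passed to write_spmatrix_to_sparse_tensor() are assumed to be the file object, the sparse matrix, and optional labels.

```python
import importlib.util


def scipy_available():
    """Report whether SciPy can be imported in the current environment."""
    return importlib.util.find_spec("scipy") is not None


def write_sparse(file, matrix, labels=None):
    """Write a SciPy sparse matrix, raising a clear error if SciPy is missing."""
    if not scipy_available():
        raise RuntimeError(
            'SciPy is required here; install it with: pip install "sagemaker[scipy]"'
        )
    # Imported lazily so that this module loads even without the SDK extras.
    from sagemaker.amazon.common import write_spmatrix_to_sparse_tensor

    # Argument order (file, matrix, labels) is an assumption; check the API docs.
    write_spmatrix_to_sparse_tensor(file, matrix, labels)
```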

TensorFlow

The tensorflow installation target has been removed, as it is no longer needed for any SageMaker Python SDK functionality.

If you want to install TensorFlow, see the TensorFlow documentation.

Automatically Upgrade Your Code

To help make your transition as seamless as possible, v2 of the SageMaker Python SDK comes with a command-line tool to automate updating your code. It automates as much as possible, but there are still syntactical and stylistic changes that cannot be performed by the script.

Warning

While the tool is intended to be easy to use, we recommend using it as part of a process that includes testing before and after you run the tool.

Usage

Currently, the tool converts only one file at a time:

$ sagemaker-upgrade-v2 --in-file input.py --out-file output.py
$ sagemaker-upgrade-v2 --in-file input.ipynb --out-file output.ipynb

You can apply it to a set of files using a loop:

$ for file in $(find input-dir -type f \( -name "*.py" -o -name "*.ipynb" \)); do sagemaker-upgrade-v2 --in-file "$file" --out-file "output-dir/$file"; done
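The same batch conversion can be driven from Python, which avoids shell word-splitting problems with unusual file names. This is a sketch under stated assumptions: plan_upgrades and run_upgrades are hypothetical helpers, only the --in-file and --out-file options shown above are used, and output paths mirror the layout relative to the input directory (so input-dir/a.py becomes output-dir/a.py).

```python
import subprocess
from pathlib import Path


def plan_upgrades(input_dir, output_dir):
    """Return one sagemaker-upgrade-v2 command per .py/.ipynb file under input_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    commands = []
    for path in sorted(input_dir.rglob("*")):
        if path.is_file() and path.suffix in {".py", ".ipynb"}:
            target = output_dir / path.relative_to(input_dir)
            commands.append(
                ["sagemaker-upgrade-v2", "--in-file", str(path), "--out-file", str(target)]
            )
    return commands


def run_upgrades(input_dir, output_dir):
    """Run the planned commands, creating output directories as needed."""
    for command in plan_upgrades(input_dir, output_dir):
        Path(command[-1]).parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(command, check=True)  # stop at the first failing file
```

Separating planning from execution lets you print and review the commands before any file is rewritten.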

Limitations

Aliased Imports

The tool checks for a limited number of patterns when looking for constructors. For example, if you are using a TensorFlow estimator, only the following invocation styles are handled:

TensorFlow()
sagemaker.tensorflow.TensorFlow()
sagemaker.tensorflow.estimator.TensorFlow()

If you have aliased an import, e.g. from sagemaker.tensorflow import TensorFlow as TF, the tool does not update the parameters of calls made through the alias.
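To find the places that need manual attention, you can scan a file for aliased sagemaker imports before running the tool. The sketch below uses the standard-library ast module; it is not part of the upgrade tool, and the function name find_aliased_sagemaker_imports is hypothetical.

```python
import ast


def find_aliased_sagemaker_imports(source):
    """Return (line, imported_name, alias) for each aliased sagemaker import."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            module = node.module or ""
            if module == "sagemaker" or module.startswith("sagemaker."):
                for alias in node.names:
                    if alias.asname and alias.asname != alias.name:
                        found.append(
                            (node.lineno, f"{module}.{alias.name}", alias.asname)
                        )
        elif isinstance(node, ast.Import):
            for alias in node.names:
                is_sagemaker = (
                    alias.name == "sagemaker" or alias.name.startswith("sagemaker.")
                )
                if is_sagemaker and alias.asname:
                    found.append((node.lineno, alias.name, alias.asname))
    return sorted(found)


sample = (
    "from sagemaker.tensorflow import TensorFlow as TF\n"
    "import sagemaker as sm\n"
    "from sagemaker.mxnet import MXNet\n"
)
for line, name, alias in find_aliased_sagemaker_imports(sample):
    print(f"line {line}: {name} is imported as {alias}")
```

Each reported line marks an import whose constructor calls should be reviewed and updated by hand.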

TensorFlow Serving

If you are using the sagemaker.tensorflow.serving.Model class, the tool does not add a framework version or change the class to sagemaker.tensorflow.TensorFlowModel.

sagemaker.model.Model

If you are using the sagemaker.model.Model class, the tool does not swap the order of model_data and image_uri (formerly image).

update_endpoint and delete_endpoint

The tool does not remove the update_endpoint argument from a deploy() call. If you are using that argument, please modify your code to use sagemaker.predictor.Predictor.update_endpoint() instead.

The tool also does not handle delete_endpoint calls on estimators or HyperparameterTuner. If you are using that method, please modify your code to use sagemaker.predictor.Predictor.delete_endpoint() instead.