Use Version 2.x of the SageMaker Python SDK¶
Installation¶
To install the latest version:
pip install --upgrade sagemaker
If you are executing this pip install command in a notebook, make sure to restart your kernel.
Breaking Changes¶
This section is for major changes that may require updates to your SageMaker Python SDK code. For the full list of changes, see the CHANGELOG.
Removals¶
Python 2 Support¶
This library is no longer compatible with Python 2. Python 2 has been EOL since January 1, 2020. Please upgrade to Python 3 if you haven’t already.
Remove Legacy TensorFlow¶
TensorFlow versions 1.4-1.10 and some variations of versions 1.11-1.12 (see What Constitutes “Legacy TensorFlow Support”) are no longer natively supported by the SageMaker Python SDK.
To use those versions of TensorFlow, you must specify the Docker image URI explicitly, and configure settings via hyperparameters or environment variables rather than using SDK parameters. For more information, see Upgrade from Legacy TensorFlow Support.
SageMaker Python SDK CLI¶
The SageMaker Python SDK CLI has been removed. (This is different from the AWS CLI.)
delete_endpoint()
for Estimators and HyperparameterTuner
¶
The delete_endpoint()
method for estimators and HyperparameterTuner
is now a no-op.
Please use sagemaker.predictor.Predictor.delete_endpoint()
instead.
update_endpoint
in deploy()
¶
The update_endpoint
argument in deploy()
methods for estimators and models is now a no-op.
Please use sagemaker.predictor.Predictor.update_endpoint()
instead.
serializer
and deserializer
in create_model()
¶
The serializer
and deserializer
arguments in
sagemaker.estimator.Estimator.create_model()
are now no-ops.
Please specify serializers and deserializers in deploy()
methods instead.
content_type
and accept
in the Predictor Constructor¶
The content_type
and accept
parameters are now no-ops in the
following classes and methods:
sagemaker.predictor.Predictor
sagemaker.estimator.Estimator.create_model
sagemaker.algorithms.AlgorithmEstimator.create_model
sagemaker.tensorflow.model.TensorFlowPredictor
Please specify content types in a serializer or deserializer class instead.
Changes in Default Behavior¶
Require framework_version
and py_version
for Frameworks¶
Framework estimator and model classes now require framework_version
and py_version
instead of supplying defaults,
unless an image URI is explicitly supplied.
For example:
from sagemaker.tensorflow import TensorFlow
TensorFlow(
entry_point="script.py",
framework_version="2.2.0", # now required
py_version="py37", # now required
role="my-role",
instance_type="ml.m5.xlarge",
instance_count=1,
)
from sagemaker.mxnet import MXNetModel
MXNetModel(
model_data="s3://bucket/model.tar.gz",
role="my-role",
entry_point="inference.py",
framework_version="1.6.0", # now required
py_version="py3", # now required
)
Log Display Behavior with attach()
¶
Logs are no longer printed when using attach()
with an estimator.
To view logs after attaching a training job to an estimator, use sagemaker.estimator.EstimatorBase.logs()
.
HyperparameterTuner.fit()
and Transformer.transform()
¶
sagemaker.tuner.HyperparameterTuner.fit()
and sagemaker.transformer.Transformer.transform()
now wait
until the completion of the Hyperparameter Tuning Job or Batch Transform Job, respectively.
To make the function non-blocking, use wait=False
.
XGBoost Predictor¶
The default serializer of sagemaker.xgboost.model.XGBoostPredictor
has been changed from NumpySerializer
to LibSVMSerializer
.
Parameter Order Changes¶
sagemaker.model.Model
Parameter Order¶
The parameter order for sagemaker.model.Model
changed: instead of model_data
being first, image_uri
(formerly image
) is first.
As a result, model_data
has been made into an optional parameter.
If you are using the sagemaker.model.Model
class, your code should be changed as follows:
# v1.x
Model("s3://bucket/path/model.tar.gz", "my-image:latest")
# v2.0 and later
Model("my-image:latest", model_data="s3://bucket/path/model.tar.gz")
Airflow Parameter Order¶
For sagemaker.workflow.airflow.model_config()
and sagemaker.workflow.airflow.model_config_from_estimator()
,
instance_type
is no longer the first positional argument and is now an optional keyword argument.
Dependency Changes¶
SciPy¶
SciPy is no longer a required dependency of the SageMaker Python SDK.
If you use sagemaker.amazon.common.write_spmatrix_to_sparse_tensor()
and
don’t already install SciPy in your environment, you can use our scipy
installation target:
pip install sagemaker[scipy]
TensorFlow¶
The tensorflow
installation target has been removed, as it is no longer needed for any SageMaker Python SDK functionality.
If you want to install TensorFlow, see the TensorFlow documentation.
Non-Breaking Changes¶
Deprecations¶
Pre-instantiated Serializer and Deserializer Objects¶
The csv_serializer
, json_serializer
, npy_serializer
, csv_deserializer
,
json_deserializer
, and numpy_deserializer
objects have been deprecated.
Please instantiate the objects instead.
v1.x |
v2.0 and later |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
sagemaker.content_types
¶
The sagemaker.content_types
module is deprecated in v2.0 and later of the
SageMaker Python SDK.
Instead of importing constants from sagemaker.content_types
, explicitly
write MIME types as a string.
v1.x |
v2.0 and later |
---|---|
|
|
|
|
|
|
|
|
Image URI Functions (e.g. get_image_uri
)¶
The following functions have been deprecated in favor of sagemaker.image_uris.retrieve()
:
sagemaker.amazon_estimator.get_image_uri()
sagemaker.fw_utils.create_image_uri()
sagemaker.fw_registry.registry()
sagemaker.utils.get_ecr_image_uri_prefix()
For more information about usage, see sagemaker.image_uris.retrieve()
.
enable_cloudwatch_metrics
for Estimators and Models¶
The parameter enable_cloudwatch_metrics
has been deprecated.
CloudWatch metrics are already emitted for all Training Jobs, etc.
sagemaker.fw_utils.parse_s3_url
¶
The sagemaker.fw_utils.parse_s3_url
function has been deprecated.
Please use sagemaker.s3.parse_s3_url()
instead.
sagemaker.session.ModelContainer
¶
The class sagemaker.session.ModelContainer
has been deprecated, as it is not needed for creating inference pipelines.
sagemaker.workflow.condition_step.JsonGet
¶
The class sagemaker.workflow.condition_step.JsonGet
has been deprecated.
Please use sagemaker.workflow.functions.JsonGet
instead.
Parameter and Class Name Changes¶
Estimators¶
Renamed Estimator Parameters¶
The following estimator parameters have been renamed:
v1.x |
v2.0 and later |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Serializer and Deserializer Classes¶
The follow serializer/deserializer classes have been renamed and/or moved:
v1.x |
v2.0 and later |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
sagemaker.serializers.LibSVMSerializer
has been added in v2.0.
distributions
¶
For TensorFlow and MXNet estimators, distributions
has been renamed to distribution
.
Specify Custom Training Images¶
The image_name
parameter has been renamed to image_uri
for specifying a custom Docker image URI to use with training.
Models¶
Specify Custom Serving Image¶
The image
parameter has been renamed to image_uri
for specifying a custom Docker image URI to use with inference.
TensorFlow Serving Model¶
sagemaker.tensorflow.serving.Model
has been renamed to sagemaker.tensorflow.model.TensorFlowModel
.
(For the previous implementation of that class, see Remove Legacy TensorFlow).
Predictors¶
Generic Predictor Class Name¶
sagemaker.predictor.RealTimePredictor
has been renamed to sagemaker.predictor.Predictor
.
Endpoint Argument Name¶
For sagemaker.predictor.Predictor
, sagemaker.sparkml.model.SparkMLPredictor
,
and predictors for Amazon algorithm (e.g. Factorization Machines, Linear Learner, etc.),
the endpoint
attribute has been renamed to endpoint_name
.
TensorFlow Serving Predictor¶
sagemaker.tensorflow.serving.Predictor
has been renamed to sagemaker.tensorflow.model.TensorFlowPredictor
.
(For the previous implementation of that class, see Remove Legacy TensorFlow).
Inputs¶
s3_input
¶
sagemaker.session.s3_input
has been renamed to sagemaker.inputs.TrainingInput
.
ShuffleConfig
¶
sagemaker.session.ShuffleConfig
has been renamed to sagemaker.inputs.ShuffleConfig
.
Airflow¶
For sagemaker.workflow.airflow.model_config()
, sagemaker.workflow.airflow.model_config_from_estimator()
, and
sagemaker.workflow.airflow.transform_config_from_estimator()
, the image
argument has been renamed to image_uri
.
Automatically Upgrade Your Code¶
To help make your transition as seamless as possible, v2 of the SageMaker Python SDK comes with a command-line tool to automate updating your code. It automates as much as possible, but there are still syntactical and stylistic changes that cannot be performed by the script.
Warning
While the tool is intended to be easy to use, we recommend using it as part of a process that includes testing before and after you run the tool.
Usage¶
Currently, the tool supports only converting one file at a time:
$ sagemaker-upgrade-v2 --in-file input.py --out-file output.py
$ sagemaker-upgrade-v2 --in-file input.ipynb --out-file output.ipynb
You can apply it to a set of files using a loop:
$ for file in $(find input-dir); do sagemaker-upgrade-v2 --in-file $file --out-file output-dir/$file; done
Limitations¶
Jupyter Notebook Cells with Shell Commands¶
If your Jupyter notebook has a code cell with lines that start with either %%
or !
, the tool ignores that cell.
The other cells in the notebook are still updated.
Aliased Imports¶
The tool checks for a limited number of patterns when looking for constructors. For example, if you are using a TensorFlow estimator, only the following invocation styles are handled:
TensorFlow()
sagemaker.tensorflow.TensorFlow()
sagemaker.tensorflow.estimator.TensorFlow()
If you have aliased an import, e.g. from sagemaker.tensorflow import TensorFlow as TF
, the tool does not take care of updating its parameters.
TensorFlow Serving¶
If you are using the sagemaker.tensorflow.serving.Model
class, the tool does not take care of adding a framework version or changing it to sagemaker.tensorflow.TensorFlowModel
.
sagemaker.model.Model
¶
If you are using the sagemaker.model.Model
class, the tool does not take care of switching the order between model_data
and image_uri
(formerly image
).
update_endpoint
and delete_endpoint
¶
The tool does not take care of removing the update_endpoint
argument from a deploy
call.
If you are using that argument, please modify your code to use sagemaker.predictor.Predictor.update_endpoint()
instead.
The tool also does not handle delete_endpoint
calls on estimators or HyperparameterTuner
.
If you are using that method, please modify your code to use sagemaker.predictor.Predictor.delete_endpoint()
instead.