Serializers

Implements base methods for serializing data for an inference endpoint.

class sagemaker.base_serializers.BaseSerializer

Bases: ABC

Abstract base class for creation of new serializers.

Provides a skeleton for customization. Subclasses must override the serialize method and the CONTENT_TYPE class attribute.

abstract serialize(data)

Serialize data into the media type specified by CONTENT_TYPE.

Parameters

data (object) – Data to be serialized.

Returns

Serialized data used for a request.

Return type

object

abstract property CONTENT_TYPE

The MIME type of the data sent to the inference endpoint.
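The contract described above can be sketched without the SDK installed. This is a stand-in, not the library's code: BaseSerializerSketch and UpperCaseTextSerializer are hypothetical names that only mirror the documented shape (an abstract serialize method plus an abstract CONTENT_TYPE property).

```python
from abc import ABC, abstractmethod


class BaseSerializerSketch(ABC):
    """Hypothetical stand-in mirroring the documented BaseSerializer contract."""

    @abstractmethod
    def serialize(self, data):
        """Turn data into the media type named by CONTENT_TYPE."""

    @property
    @abstractmethod
    def CONTENT_TYPE(self):
        """The MIME type of the data sent to the endpoint."""


class UpperCaseTextSerializer(BaseSerializerSketch):
    """Hypothetical subclass that overrides both required members."""

    # A plain class attribute is enough to satisfy the abstract property.
    CONTENT_TYPE = "text/plain"

    def serialize(self, data):
        # Illustrative transformation only: upper-case, then encode to bytes.
        return str(data).upper().encode("utf-8")


serializer = UpperCaseTextSerializer()
payload = serializer.serialize("hello")
```

Leaving out either override keeps the subclass abstract, so Python refuses to instantiate it. That is the usual reason for declaring both members abstract.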

class sagemaker.base_serializers.SimpleBaseSerializer(content_type='application/json')

Bases: BaseSerializer

Abstract base class for creation of new serializers.

This class extends the API of BaseSerializer with more user-friendly options for setting the Content-Type header, in situations where it can be provided at init and freely updated.

Initialize a SimpleBaseSerializer instance.

Parameters
  • content_type (str) – The MIME type to signal to the inference endpoint when sending request data (default: “application/json”).

property CONTENT_TYPE

The data MIME type set in the Content-Type header on prediction endpoint requests.
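As a rough illustration of "provided at init and freely updated", the pattern might look like the sketch below. SimpleSerializerSketch is a hypothetical stand-in, and storing the value on a content_type instance attribute is an assumption made for this example, not something this page confirms.

```python
class SimpleSerializerSketch:
    """Hypothetical stand-in: content type supplied at init, exposed as a property."""

    def __init__(self, content_type="application/json"):
        # Assumption: the value is kept on a plain, writable instance attribute.
        self.content_type = content_type

    @property
    def CONTENT_TYPE(self):
        # Read the attribute on every access so later updates are reflected.
        return self.content_type


serializer = SimpleSerializerSketch()
default_type = serializer.CONTENT_TYPE

# "application/x-custom" is a made-up MIME type used only for illustration.
serializer.content_type = "application/x-custom"
updated_type = serializer.CONTENT_TYPE
```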

class sagemaker.base_serializers.CSVSerializer(content_type='text/csv')

Bases: SimpleBaseSerializer

Serialize data of various formats to a CSV-formatted string.

Initialize a CSVSerializer instance.

Parameters

content_type (str) – The MIME type to signal to the inference endpoint when sending request data (default: “text/csv”).

serialize(data)

Serialize data of various formats to a CSV-formatted string.

Parameters

data (object) – Data to be serialized. Can be a NumPy array, list, file, Pandas DataFrame, or buffer.

Returns

The data serialized as a CSV-formatted string.

Return type

str
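To show what a CSV-formatted string looks like for a list of rows, here is a small sketch using only the standard library. rows_to_csv is a hypothetical helper, not the serializer's implementation. How the library handles quoting, trailing newlines, or other input types is not specified on this page.

```python
import csv
import io


def rows_to_csv(rows):
    """Hypothetical helper: format a list of rows as one CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    # Drop the final newline so the payload ends with the last value.
    return buffer.getvalue().rstrip("\n")


csv_payload = rows_to_csv([[1, 2.5, "a"], [4, 5.0, "b"]])
```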

class sagemaker.base_serializers.NumpySerializer(dtype=None, content_type='application/x-npy')

Bases: SimpleBaseSerializer

Serialize data to a buffer using the .npy format.

Initialize a NumpySerializer instance.

Parameters
  • content_type (str) – The MIME type to signal to the inference endpoint when sending request data (default: “application/x-npy”).

  • dtype (str) – The dtype of the data.

serialize(data)

Serialize data to a buffer using the .npy format.

Parameters

data (object) – Data to be serialized. Can be a NumPy array, list, file, or buffer.

Returns

A buffer containing data serialized in the .npy format.

Return type

io.BytesIO

class sagemaker.base_serializers.JSONSerializer(content_type='application/json')

Bases: SimpleBaseSerializer

Serialize data to a JSON formatted string.

Initialize a JSONSerializer instance.

Parameters

content_type (str) – The MIME type to signal to the inference endpoint when sending request data (default: “application/json”).

serialize(data)

Serialize data of various formats to a JSON formatted string.

Parameters

data (object) – Data to be serialized.

Returns

The data serialized as a JSON string.

Return type

str

class sagemaker.base_serializers.IdentitySerializer(content_type='application/octet-stream')

Bases: SimpleBaseSerializer

Serialize data by returning data without modification.

This serializer may be useful if, for example, you’re sending raw bytes such as from an image file’s .read() method.

Initialize an IdentitySerializer instance.

Parameters

content_type (str) – The MIME type to signal to the inference endpoint when sending request data (default: “application/octet-stream”).

serialize(data)

Return data without modification.

Parameters

data (object) – Data to be serialized.

Returns

The unmodified data.

Return type

object

class sagemaker.base_serializers.JSONLinesSerializer(content_type='application/jsonlines')

Bases: SimpleBaseSerializer

Serialize data to a JSON Lines formatted string.

Initialize a JSONLinesSerializer instance.

Parameters

content_type (str) – The MIME type to signal to the inference endpoint when sending request data (default: “application/jsonlines”).

serialize(data)

Serialize data of various formats to a JSON Lines formatted string.

Parameters

data (object) – Data to be serialized. The data can be a string, iterable of JSON serializable objects, or a file-like object.

Returns

The data serialized as a string containing newline-separated JSON values.

Return type

str
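The "newline-separated JSON values" output can be pictured with a short standard-library sketch. to_json_lines is a hypothetical helper covering only the iterable-of-objects case. It is not the serializer's own code.

```python
import json


def to_json_lines(records):
    """Hypothetical helper: one JSON document per line, joined by newlines."""
    return "\n".join(json.dumps(record) for record in records)


records = [{"id": 1, "text": "first"}, {"id": 2, "text": "second"}]
jsonl_payload = to_json_lines(records)

# Each line parses independently, which is the point of the format.
round_trip = [json.loads(line) for line in jsonl_payload.split("\n")]
```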

class sagemaker.base_serializers.SparseMatrixSerializer(content_type='application/x-npz')

Bases: SimpleBaseSerializer

Serialize a sparse matrix to a buffer using the .npz format.

Initialize a SparseMatrixSerializer instance.

Parameters

content_type (str) – The MIME type to signal to the inference endpoint when sending request data (default: “application/x-npz”).

serialize(data)

Serialize a sparse matrix to a buffer using the .npz format.

Sparse matrices can be in the csc, csr, bsr, dia or coo formats.

Parameters

data (scipy.sparse.spmatrix) – The sparse matrix to serialize.

Returns

A buffer containing the serialized sparse matrix.

Return type

io.BytesIO

class sagemaker.base_serializers.LibSVMSerializer(content_type='text/libsvm')

Bases: SimpleBaseSerializer

Serialize data of various formats to a LibSVM-formatted string.

The data must already be in LIBSVM file format: <label> <index1>:<value1> <index2>:<value2> …

It is suitable for sparse datasets since it does not store zero-valued features.

Initialize a LibSVMSerializer instance.

Parameters

content_type (str) – The MIME type to signal to the inference endpoint when sending request data (default: “text/libsvm”).

serialize(data)

Serialize data of various formats to a LibSVM-formatted string.

Parameters

data (object) – Data to be serialized. Can be a string or a file-like object.

Returns

The data serialized as a LibSVM-formatted string.

Return type

str

Raises

ValueError – If the input format cannot be handled.
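Because the data must already be in LIBSVM format, the caller has to build that string first. The sketch below shows one way to do so. to_libsvm_line is a hypothetical helper, and the 1-based feature indices are an assumption based on common LIBSVM usage rather than anything stated on this page.

```python
def to_libsvm_line(label, features):
    """Hypothetical helper: build '<label> <index>:<value> ...' for one sample.

    features maps a feature index (assumed 1-based) to its value.
    """
    parts = [str(label)]
    for index in sorted(features):
        value = features[index]
        # Zero-valued features are omitted, which keeps sparse data compact.
        if value != 0:
            parts.append(f"{index}:{value}")
    return " ".join(parts)


libsvm_payload = "\n".join(
    [
        to_libsvm_line(1, {1: 0.5, 3: 0.0, 7: 2}),
        to_libsvm_line(0, {2: 1.5}),
    ]
)
```

A string built this way is the kind of input the data parameter describes.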

class sagemaker.base_serializers.DataSerializer(content_type='file-path/raw-bytes')

Bases: SimpleBaseSerializer

Serialize data in any file by extracting raw bytes from the file.

Initialize a DataSerializer instance.

Parameters

content_type (str) – The MIME type to signal to the inference endpoint when sending request data (default: “file-path/raw-bytes”).

serialize(data)

Serialize file data to raw bytes.

Parameters

data (object) – Data to be serialized. The data can be a string representing file-path or the raw bytes from a file.

Returns

The data serialized as raw-bytes from the input.

Return type

raw-bytes
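The two accepted input shapes (a file path string or raw bytes) suggest logic along the following lines. read_raw_bytes is a hypothetical helper written for illustration. Raising ValueError for other input types is an assumption, since this page lists no exceptions for this serializer.

```python
import os
import tempfile


def read_raw_bytes(data):
    """Hypothetical helper: return bytes from either a file path or bytes."""
    if isinstance(data, bytes):
        return data
    if isinstance(data, str):
        # Treat the string as a path and read the file in binary mode.
        with open(data, "rb") as handle:
            return handle.read()
    # Assumption: reject anything that is neither a path nor bytes.
    raise ValueError(f"Unsupported input type: {type(data).__name__}")


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.bin")
    with open(path, "wb") as handle:
        handle.write(b"\x00\x01payload")
    from_path = read_raw_bytes(path)

from_bytes = read_raw_bytes(b"\x00\x01payload")
```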

class sagemaker.base_serializers.StringSerializer(content_type='text/plain')

Bases: SimpleBaseSerializer

Encode the string to utf-8 bytes.

Initialize a StringSerializer instance.

Parameters

content_type (str) – The MIME type to signal to the inference endpoint when sending request data (default: “text/plain”).

serialize(data)

Encode the string to utf-8 bytes.

Parameters

data (object) – Data to be serialized.

Returns

The data serialized as raw-bytes from the input.

Return type

raw-bytes
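For reference, UTF-8 encoding in plain Python looks like this. It is a sketch of the encoding step only and does not use the serializer itself.

```python
text = "café"

# Non-ASCII characters such as "é" take more than one byte in UTF-8,
# so the byte length can exceed the character count.
encoded = text.encode("utf-8")
char_count = len(text)
byte_count = len(encoded)
```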

class sagemaker.base_serializers.TorchTensorSerializer(content_type='tensor/pt')

Bases: SimpleBaseSerializer

Serialize a torch.Tensor to a buffer by converting the tensor to a NumPy array and calling NumpySerializer.

Initialize a TorchTensorSerializer instance.

Parameters

content_type (str) – The MIME type to signal to the inference endpoint when sending request data (default: “tensor/pt”).

serialize(data)

Serialize torch.Tensor to a buffer.

Parameters

data (object) – Data to be serialized. The data must be of torch.Tensor type.

Returns

The data serialized as raw-bytes from the input.

Return type

raw-bytes

Implements methods for serializing data for an inference endpoint.

sagemaker.serializers.retrieve_options(region=None, model_id=None, model_version=None, tolerate_vulnerable_model=False, tolerate_deprecated_model=False, sagemaker_session=<sagemaker.session.Session object>)

Retrieves the supported serializers for the model matching the given arguments.

Parameters
  • region (str) – The AWS Region for which to retrieve the supported serializers. (Default: None).

  • model_id (str) – The model ID of the model for which to retrieve the supported serializers. (Default: None).

  • model_version (str) – The version of the model for which to retrieve the supported serializers. (Default: None).

  • tolerate_vulnerable_model (bool) – True if vulnerable versions of model specifications should be tolerated (exception not raised). If False, raises an exception if the script used by this version of the model has dependencies with known security vulnerabilities. (Default: False).

  • tolerate_deprecated_model (bool) – True if deprecated models should be tolerated (exception not raised). False if these models should raise an exception. (Default: False).

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions. If not specified, one is created using the default AWS configuration chain. (Default: sagemaker.jumpstart.constants.DEFAULT_JUMPSTART_SAGEMAKER_SESSION).

Returns

The supported serializers to use for the model.

Return type

List[SimpleBaseSerializer]

Raises

ValueError – If the combination of arguments specified is not supported.

sagemaker.serializers.retrieve_default(region=None, model_id=None, model_version=None, tolerate_vulnerable_model=False, tolerate_deprecated_model=False, model_type=JumpStartModelType.OPEN_WEIGHTS, sagemaker_session=<sagemaker.session.Session object>)

Retrieves the default serializer for the model matching the given arguments.

Parameters
  • region (str) – The AWS Region for which to retrieve the default serializer. (Default: None).

  • model_id (str) – The model ID of the model for which to retrieve the default serializer. (Default: None).

  • model_version (str) – The version of the model for which to retrieve the default serializer. (Default: None).

  • tolerate_vulnerable_model (bool) – True if vulnerable versions of model specifications should be tolerated (exception not raised). If False, raises an exception if the script used by this version of the model has dependencies with known security vulnerabilities. (Default: False).

  • tolerate_deprecated_model (bool) – True if deprecated models should be tolerated (exception not raised). False if these models should raise an exception. (Default: False).

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions. If not specified, one is created using the default AWS configuration chain. (Default: sagemaker.jumpstart.constants.DEFAULT_JUMPSTART_SAGEMAKER_SESSION).

  • model_type (JumpStartModelType) – The type of the JumpStart model. (Default: JumpStartModelType.OPEN_WEIGHTS).

Returns

The default serializer to use for the model.

Return type

SimpleBaseSerializer

Raises

ValueError – If the combination of arguments specified is not supported.