Model Builder

This module contains classes related to Amazon SageMaker Model Builder.

class sagemaker.serve.builder.model_builder.ModelBuilder(model_path='/tmp/sagemaker/model-builder/2978086cf87411eebdef0242ac110002', role_arn=None, sagemaker_session=None, name='model-name-29780ad8f87411eebdef0242ac110002', mode=Mode.SAGEMAKER_ENDPOINT, shared_libs=<factory>, dependencies=<factory>, env_vars=<factory>, log_level=10, content_type=None, accept_type=None, s3_model_data_url=None, instance_type='ml.c5.xlarge', schema_builder=None, model=None, inference_spec=None, image_uri=None, image_config=None, vpc_config=None, model_server=None, model_metadata=None)

Class that builds a deployable model.

Parameters
  • role_arn (Optional[str]) – The role for the endpoint.

  • model_path (Optional[str]) – The path of the model directory.

  • sagemaker_session (Optional[sagemaker.session.Session]) – The SageMaker session to use for the execution.

  • name (Optional[str]) – The model name.

  • mode (Optional[Mode]) –

    The mode of operation. The following modes are supported:

    • Mode.SAGEMAKER_ENDPOINT: Launch on a SageMaker endpoint

    • Mode.LOCAL_CONTAINER: Launch locally with a container

  • shared_libs (List[str]) – Any shared libraries you want to bring into the model packaging.

  • dependencies (Optional[Dict[str, Any]]) –

    The dependencies of the model or container. Takes a dict where you can enable or disable automatic dependency capture ("auto"), point to a requirements file, or list custom dependencies. A sample dependencies dict:

    {
        "auto": False,
        "requirements": "/path/to/requirements.txt",
        "custom": ["custom_module==1.2.3",
          "other_module@http://some/website.whl"],
    }
    

  • env_vars (Optional[Dict[str, str]]) – The environment variables for the runtime execution.

  • log_level (Optional[int]) – The log level. Possible values are CRITICAL, ERROR, WARNING, INFO, DEBUG, and NOTSET.

  • content_type (Optional[str]) – The content type of the endpoint input data. This value is automatically derived from the input sample, but you can override it.

  • accept_type (Optional[str]) – The content type of the data accepted from the endpoint. This value is automatically derived from the output sample, but you can override the value.

  • s3_model_data_url (Optional[str]) – The S3 location where you want to upload the model package. Defaults to an S3 bucket prefixed with the account ID.

  • instance_type (Optional[str]) – The instance type of the endpoint. Defaults to a CPU instance type (ml.c5.xlarge); the instance family is also used to help narrow down the container image selection.

  • schema_builder (Optional[SchemaBuilder]) – The input/output schema of the model. The schema builder translates the input into bytes and converts the response into a stream. All translations between the server and the client are handled automatically with the specified input and output.

  • model (Optional[Union[object, str]]) – A model object (with a predict method to perform inference) or a HuggingFace/JumpStart model ID. Either model or inference_spec is required for the model builder to build the artifact.

  • inference_spec (Optional[InferenceSpec]) – An InferenceSpec instance with your customized invoke and load functions.

  • image_uri (Optional[str]) – The container image URI (which is derived from a SageMaker-based container).

  • image_config (dict[str, str] or dict[str, PipelineVariable]) – Specifies whether the model container image is pulled from ECR or from a private registry in your VPC. By default, the model container image is pulled from ECR (default: None).

  • vpc_config (Optional[Dict[str, List[Union[str, PipelineVariable]]]]) –

    The VpcConfig set on the model (default: None):

    • ‘Subnets’ (List[Union[str, PipelineVariable]]): List of subnet IDs.

    • ‘SecurityGroupIds’ (List[Union[str, PipelineVariable]]): List of security group IDs.

  • model_server (Optional[ModelServer]) – The model server to deploy to. You must provide this argument when you specify an image_uri so that the model builder can build the artifacts correctly for that model server. Possible values for this argument are TORCHSERVE, MMS, TENSORFLOW_SERVING, DJL_SERVING, TRITON, and TGI.

  • model_metadata (Optional[Dict[str, Any]]) – Dictionary used to override model metadata. Currently, HF_TASK is overridable for HuggingFace models. MLFLOW_MODEL_PATH is available for providing a local path or an S3 path to MLflow artifacts. However, MLFLOW_MODEL_PATH is experimental and is not intended for production use at this time.
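For example, the parameters above can be wired together as in the following sketch. The role ARN, model object, and sample payloads are placeholders you would supply yourself; import paths follow the class paths documented on this page:

```python
def build_model_builder(model, role_arn):
    """Sketch: assemble a ModelBuilder from the parameters documented above.

    `model` is any object exposing a predict() method; `role_arn` is a
    caller-supplied IAM role ARN with SageMaker permissions.
    """
    # Imports are local so the sketch can be read without the package installed.
    from sagemaker.serve.builder.model_builder import ModelBuilder
    from sagemaker.serve.builder.schema_builder import SchemaBuilder

    # Sample payloads let SchemaBuilder derive content_type and accept_type.
    sample_input = {"inputs": "What is the capital of France?"}
    sample_output = [{"generated_text": "Paris."}]

    return ModelBuilder(
        model=model,
        role_arn=role_arn,
        schema_builder=SchemaBuilder(sample_input, sample_output),
        # Dependency capture, in the dict format shown above
        # (the requirements path is a placeholder):
        dependencies={"auto": False, "requirements": "requirements.txt"},
        env_vars={"MY_ENV_VAR": "example"},
        instance_type="ml.c5.xlarge",
    )
```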

ModelBuilder.build(mode=None, role_arn=None, sagemaker_session=None)

Create a deployable Model instance with ModelBuilder.

Parameters
  • mode (Type[Mode], optional) – The mode of operation (see above). Defaults to None.

  • role_arn (str, optional) – The IAM role arn. Defaults to None.

  • sagemaker_session (Optional[Session]) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the function creates one using the default AWS configuration chain.

Returns

A deployable Model object.

Return type

Type[Model]
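For example, a configured ModelBuilder can be turned into a live endpoint with a flow like the following sketch. Note that the deploy() call belongs to the returned Model object, not to ModelBuilder itself, and the instance settings are illustrative:

```python
def build_and_deploy(model_builder, role_arn):
    """Sketch: turn a configured ModelBuilder into a live endpoint."""
    # build() returns a deployable sagemaker Model object.
    deployable_model = model_builder.build(role_arn=role_arn)
    # The returned Model exposes the usual deploy() API.
    predictor = deployable_model.deploy(
        initial_instance_count=1,
        instance_type="ml.c5.xlarge",
    )
    return predictor
```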

ModelBuilder.save(save_path=None, s3_path=None, sagemaker_session=None, role_arn=None)

WARNING: This function is experimental and not intended for production use.

This function is available only for models served by DJL Serving.

Parameters
  • save_path (Optional[str]) – The path where you want to save resources.

  • s3_path (Optional[str]) – The path where you want to upload resources.

  • sagemaker_session (Optional[Session]) –

  • role_arn (Optional[str]) –

Return type

Type[Model]

class sagemaker.serve.spec.inference_spec.InferenceSpec

Abstract base class for holding custom load, invoke and prepare functions.

Provides a skeleton for customization by overriding the load, invoke, and prepare methods.
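For example, a custom InferenceSpec might look like the following sketch. The pickle-based loading and the predict call are illustrative assumptions, not part of the interface; substitute your framework's own loading and inference logic:

```python
def make_inference_spec():
    """Sketch: a custom InferenceSpec overriding load and invoke."""
    # Import is local so the sketch can be read without the package installed.
    from sagemaker.serve.spec.inference_spec import InferenceSpec

    class MyInferenceSpec(InferenceSpec):
        def load(self, model_dir):
            # Called with the unpacked model directory; must return the
            # ready-to-use model. Pickle is an illustrative placeholder.
            import os
            import pickle

            with open(os.path.join(model_dir, "model.pkl"), "rb") as f:
                return pickle.load(f)

        def invoke(self, input_object, model):
            # Called per request with the deserialized input and the
            # object returned by load().
            return model.predict(input_object)

    return MyInferenceSpec()
```

The resulting object can then be passed to ModelBuilder through the inference_spec parameter.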

class sagemaker.serve.builder.schema_builder.SchemaBuilder(sample_input, sample_output, input_translator=None, output_translator=None)

Automatically detects the serializer and deserializer for your model.

This is done by inspecting the sample_input and sample_output objects. Alternatively, provide your own serializer and deserializer for your request or response by creating a class that inherits from CustomPayloadTranslator and providing it to the SchemaBuilder.

Parameters
  • sample_input (object) – Sample input to the model which can be used for testing. The schema builder internally generates the content type and corresponding serializing functions.

  • sample_output (object) – Sample output to the model which can be used for testing. The schema builder internally generates the accept type and corresponding deserializing functions.

  • input_translator (Optional[CustomPayloadTranslator]) – If you want to define your own serialization method for the payload, you can implement your functions for translation.

  • output_translator (Optional[CustomPayloadTranslator]) – If you want to define your own serialization method for the output, you can implement your functions for translation.
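For example, a SchemaBuilder for a JSON-style request/response pair can be constructed as in this sketch (the sample payloads are illustrative placeholders, and translators are optional):

```python
def make_schema_builder(input_translator=None, output_translator=None):
    """Sketch: derive request/response (de)serialization from sample payloads.

    Without translators, serialization is inferred from the samples alone.
    """
    # Import is local so the sketch can be read without the package installed.
    from sagemaker.serve.builder.schema_builder import SchemaBuilder

    sample_input = {"inputs": "Hello, world"}            # request sample
    sample_output = [{"generated_text": "Hallo, Welt"}]  # response sample
    return SchemaBuilder(
        sample_input,
        sample_output,
        input_translator=input_translator,
        output_translator=output_translator,
    )
```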

class sagemaker.serve.marshalling.custom_payload_translator.CustomPayloadTranslator(content_type='application/custom', accept_type='application/custom')

Abstract base class for handling custom payload serialization and deserialization.

Provides a skeleton for customization that requires overriding the serialize_payload and deserialize_payload methods.

Parameters
  • content_type (str) – The content type of the endpoint input data.

  • accept_type (str) – The content type of the data accepted from the endpoint.
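For example, a JSON translator might be sketched as follows. It assumes the method names documented above (serialize_payload and deserialize_payload) and substitutes application/json for the default content types:

```python
def make_json_translator():
    """Sketch: a CustomPayloadTranslator subclass for JSON payloads."""
    import json

    # Import is local so the sketch can be read without the package installed.
    from sagemaker.serve.marshalling.custom_payload_translator import (
        CustomPayloadTranslator,
    )

    class JSONPayloadTranslator(CustomPayloadTranslator):
        def __init__(self):
            super().__init__(content_type="application/json",
                             accept_type="application/json")

        def serialize_payload(self, payload):
            # Python object -> bytes sent to the endpoint.
            return json.dumps(payload).encode("utf-8")

        def deserialize_payload(self, stream):
            # Response stream -> Python object.
            return json.loads(stream.read())

    return JSONPayloadTranslator()
```

The resulting translator can then be passed to SchemaBuilder as input_translator or output_translator.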