Model Builder

This module contains classes related to the Amazon SageMaker Model Builder.

class sagemaker.serve.builder.model_builder.ModelBuilder(model_path='/tmp/sagemaker/model-builder/50e90eeeb2b611ef86470242ac110002', role_arn=None, sagemaker_session=None, name='model-name-50e912eab2b611ef86470242ac110002', mode=Mode.SAGEMAKER_ENDPOINT, shared_libs=<factory>, dependencies=<factory>, env_vars=<factory>, log_level=10, content_type=None, accept_type=None, s3_model_data_url=None, instance_type='ml.c5.xlarge', schema_builder=None, model=None, inference_spec=None, image_uri=None, image_config=None, vpc_config=None, model_server=None, model_metadata=None)

Class that builds a deployable model.

Parameters
  • role_arn (Optional[str]) – The role for the endpoint.

  • model_path (Optional[str]) – The path of the model directory.

  • sagemaker_session (Optional[sagemaker.session.Session]) – The SageMaker session to use for the execution.

  • name (Optional[str]) – The model name.

  • mode (Optional[Mode]) –

    The mode of operation. The following modes are supported:

    • Mode.SAGEMAKER_ENDPOINT: Launch on a SageMaker endpoint

    • Mode.LOCAL_CONTAINER: Launch locally with a container

    • Mode.IN_PROCESS: Launch locally to a FastAPI server instead of using a container.

  • shared_libs (List[str]) – Any shared libraries you want to bring into the model packaging.

  • dependencies (Optional[Dict[str, Any]]) –

    The dependencies of the model or container. Takes a dict as an input where you can specify autocapture as True or False, a requirements file, or custom dependencies as a list. A sample dependencies dict:

    {
        "auto": False,
        "requirements": "/path/to/requirements.txt",
        "custom": ["custom_module==1.2.3",
          "other_module@http://some/website.whl"],
    }
    

  • env_vars (Optional[Dict[str, str]]) – The environment variables for the runtime execution.

  • log_level (Optional[int]) – The log level, given as an integer constant from Python's logging module. Possible values are CRITICAL, ERROR, WARNING, INFO, DEBUG, and NOTSET. The default, 10, corresponds to DEBUG.

  • content_type (Optional[str]) – The content type of the endpoint input data. This value is automatically derived from the input sample, but you can override it.

  • accept_type (Optional[str]) – The content type of the data accepted from the endpoint. This value is automatically derived from the output sample, but you can override the value.

  • s3_model_data_url (Optional[str]) – The S3 location where you want to upload the model package. Defaults to an S3 bucket prefixed with the account ID.

  • instance_type (Optional[str]) – The instance type of the endpoint. Defaults to a CPU instance type (ml.c5.xlarge). The instance family helps narrow the choice of container type.

  • schema_builder (Optional[SchemaBuilder]) – The input/output schema of the model. The schema builder translates the input into bytes and converts the response into a stream. All translations between the server and the client are handled automatically with the specified input and output. The schema builder can be omitted for HuggingFace models with task types TextGeneration, TextClassification, and QuestionAnswering. Omitting SchemaBuilder is in beta for the FillMask and AutomaticSpeechRecognition use cases.

  • model (Optional[Union[object, str, ModelTrainer, TrainingJob, Estimator]]) – The object from which training artifacts can be extracted. Either model or inference_spec is required for the model builder to build the artifact.

  • inference_spec (InferenceSpec) – The inference spec file with your customized invoke and load functions.

  • image_uri (Optional[str]) – The container image URI (which is derived from a SageMaker-based container).

  • image_config (dict[str, str] or dict[str, PipelineVariable]) – Specifies whether the model container image is pulled from ECR or from a private registry in your VPC. By default, the image is pulled from ECR (default: None).

  • vpc_config (Optional[Dict[str, List[Union[str, PipelineVariable]]]]) –

    The VpcConfig set on the model (default: None).

    • ‘Subnets’ (List[Union[str, PipelineVariable]]): List of subnet IDs.

    • ‘SecurityGroupIds’ (List[Union[str, PipelineVariable]]): List of security group IDs.

  • model_server (Optional[ModelServer]) – The model server to which to deploy. You need to provide this argument when you specify an image_uri in order for model builder to build the artifacts correctly (according to the model server). Possible values for this argument are TORCHSERVE, MMS, TENSORFLOW_SERVING, DJL_SERVING, TRITON, TGI, and TEI.

  • model_metadata (Optional[Dict[str, Any]]) –

    Dictionary used to override model metadata. The following keys are currently supported:

    • HF_TASK: Overrides the task for a HuggingFace model. Set it for new models that have no task metadata in the Hub. Adding an unsupported task type throws an exception.

    • MLFLOW_MODEL_PATH: A local path or S3 path to MLflow artifacts. This key is experimental and is not intended for production use at this time.

    • CUSTOM_MODEL_PATH: A local path or S3 path to model artifacts.

    • FINE_TUNING_MODEL_PATH: An S3 path to fine-tuned model artifacts.

    • FINE_TUNING_JOB_NAME: The name of a fine-tuning job.

    FINE_TUNING_MODEL_PATH and FINE_TUNING_JOB_NAME are mutually exclusive.
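As a rough illustration of how the dictionary-valued parameters above fit together, the following sketch assembles a set of keyword arguments in plain Python. It deliberately does not import the SDK. The helper check_model_metadata is hypothetical, every ARN, ID, and path is a placeholder, and the "text-classification" task string is an assumed value format. The exclusivity check only mirrors the rule stated for model_metadata and is not the library's own validation.

```python
import logging


def check_model_metadata(metadata):
    """Hypothetical helper: reject metadata that sets both fine-tuning keys."""
    exclusive = ("FINE_TUNING_MODEL_PATH", "FINE_TUNING_JOB_NAME")
    present = [key for key in exclusive if key in metadata]
    if len(present) > 1:
        raise ValueError(" and ".join(present) + " are mutually exclusive")
    return metadata


# All values below are placeholders chosen for illustration only.
builder_kwargs = {
    "role_arn": "arn:aws:iam::111122223333:role/ExampleRole",
    # log_level is an int; logging.INFO is 20, and the documented default 10 is DEBUG.
    "log_level": logging.INFO,
    "dependencies": {
        "auto": False,
        "requirements": "/path/to/requirements.txt",
        "custom": ["custom_module==1.2.3"],
    },
    "env_vars": {"EXAMPLE_FLAG": "1"},
    "vpc_config": {
        "Subnets": ["subnet-0123456789abcdef0"],
        "SecurityGroupIds": ["sg-0123456789abcdef0"],
    },
    # The task string format is an assumption; check the supported task names.
    "model_metadata": check_model_metadata({"HF_TASK": "text-classification"}),
}
```

With the SDK installed, a dictionary like this could be unpacked into the constructor, for example ModelBuilder(**builder_kwargs), together with either model or inference_spec.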

ModelBuilder.build(*args, **kwargs)
ModelBuilder.save(*args, **kwargs)
class sagemaker.serve.spec.inference_spec.InferenceSpec

Abstract base class for holding custom load, invoke and prepare functions.

Provides a skeleton for customization to override the methods load, invoke and prepare.
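The override pattern described above might look roughly like the following sketch. To stay runnable without the SDK installed, it defines a local stand-in base class (InferenceSpecStandIn, hypothetical) instead of importing sagemaker.serve.spec.inference_spec.InferenceSpec. The method signatures load(model_dir) and invoke(input_object, model) are assumptions about the real interface and should be checked against the installed SDK. WordCountSpec and its toy "model" are invented for illustration.

```python
from abc import ABC, abstractmethod


class InferenceSpecStandIn(ABC):
    """Local stand-in for InferenceSpec; signatures here are assumed."""

    @abstractmethod
    def load(self, model_dir):
        """Return a model object built from the artifacts in model_dir."""

    @abstractmethod
    def invoke(self, input_object, model):
        """Run the loaded model on one input and return the result."""

    def prepare(self, *args, **kwargs):
        """Optional hook; does nothing in this sketch."""


class WordCountSpec(InferenceSpecStandIn):
    """Toy spec whose 'model' counts whitespace-separated words."""

    def load(self, model_dir):
        # A real implementation would read artifacts from model_dir.
        return lambda text: len(text.split())

    def invoke(self, input_object, model):
        return model(input_object)


spec = WordCountSpec()
model = spec.load("/path/to/model")  # placeholder path, not read here
result = spec.invoke("a short example", model)
```

In real use, the subclass would inherit from InferenceSpec, and an instance would be passed to ModelBuilder through the inference_spec parameter.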

class sagemaker.serve.builder.schema_builder.SchemaBuilder(sample_input, sample_output, input_translator=None, output_translator=None)

Automatically detects the serializer and deserializer for your model.

This is done by inspecting the sample_input and sample_output objects. Alternatively, provide a custom serializer and deserializer for your request or response by creating a class that inherits from CustomPayloadTranslator and passing it to SchemaBuilder.

Parameters
  • sample_input (object) – Sample input to the model which can be used for testing. The schema builder internally generates the content type and corresponding serializing functions.

  • sample_output (object) – Sample output from the model which can be used for testing. The schema builder internally generates the accept type and corresponding serializing functions.

  • input_translator (Optional[CustomPayloadTranslator]) – If you want to define your own serialization method for the payload, you can implement your functions for translation.

  • output_translator (Optional[CustomPayloadTranslator]) – If you want to define your own serialization method for the output, you can implement your functions for translation.

class sagemaker.serve.marshalling.custom_payload_translator.CustomPayloadTranslator(content_type='application/custom', accept_type='application/custom')

Abstract base class for handling custom payload serialization and deserialization.

Provides a skeleton for customization requiring the overriding of the serialize_payload and deserialize_payload methods.

Parameters
  • content_type (str) – The content type of the endpoint input data.

  • accept_type (str) – The content type of the data accepted from the endpoint.
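A custom translator might be sketched as follows. The example uses a local stand-in base class (PayloadTranslatorStandIn, hypothetical) so that it runs without the SDK. The method names serialize_payload_to_bytes and deserialize_payload_from_stream are an assumption about the abstract methods that the text above refers to as serialize_payload and deserialize_payload. Verify the exact names against the installed SDK before relying on them. JsonTranslator is invented for illustration.

```python
import io
import json
from abc import ABC, abstractmethod


class PayloadTranslatorStandIn(ABC):
    """Local stand-in for CustomPayloadTranslator; method names are assumed."""

    def __init__(self, content_type="application/custom",
                 accept_type="application/custom"):
        self.content_type = content_type
        self.accept_type = accept_type

    @abstractmethod
    def serialize_payload_to_bytes(self, payload):
        """Convert a Python object into bytes for the request body."""

    @abstractmethod
    def deserialize_payload_from_stream(self, stream):
        """Read a binary stream and convert it back into a Python object."""


class JsonTranslator(PayloadTranslatorStandIn):
    """Encodes payloads as UTF-8 JSON and decodes them again."""

    def __init__(self):
        super().__init__(content_type="application/json",
                         accept_type="application/json")

    def serialize_payload_to_bytes(self, payload):
        return json.dumps(payload).encode("utf-8")

    def deserialize_payload_from_stream(self, stream):
        return json.loads(stream.read().decode("utf-8"))


translator = JsonTranslator()
raw = translator.serialize_payload_to_bytes({"inputs": [1, 2, 3]})
restored = translator.deserialize_payload_from_stream(io.BytesIO(raw))
```

A translator of this kind, once derived from the real CustomPayloadTranslator, would be supplied to SchemaBuilder as input_translator or output_translator.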