Model Builder
This module contains classes related to Amazon SageMaker Model Builder.
- class sagemaker.serve.builder.model_builder.ModelBuilder(model_path='/tmp/sagemaker/model-builder/8da4334066f811efbd5c0242ac110002', role_arn=None, sagemaker_session=None, name='model-name-8da4352a66f811efbd5c0242ac110002', mode=Mode.SAGEMAKER_ENDPOINT, shared_libs=<factory>, dependencies=<factory>, env_vars=<factory>, log_level=10, content_type=None, accept_type=None, s3_model_data_url=None, instance_type='ml.c5.xlarge', schema_builder=None, model=None, inference_spec=None, image_uri=None, image_config=None, vpc_config=None, model_server=None, model_metadata=None)
Class that builds a deployable model.
- Parameters
role_arn (Optional[str]) – The role for the endpoint.
model_path (Optional[str]) – The path of the model directory.
sagemaker_session (Optional[sagemaker.session.Session]) – The SageMaker session to use for the execution.
name (Optional[str]) – The model name.
mode (Optional[Mode]) – The mode of operation. The following modes are supported:
Mode.SAGEMAKER_ENDPOINT: Launch on a SageMaker endpoint
Mode.LOCAL_CONTAINER: Launch locally with a container
shared_libs (List[str]) – Any shared libraries you want to bring into the model packaging.
dependencies (Optional[Dict[str, Any]]) – The dependencies of the model or container. Takes a dict as input in which you can set autocapture to True or False, point to a requirements file, or list custom dependencies. A sample dependencies dict:
    {
        "auto": False,
        "requirements": "/path/to/requirements.txt",
        "custom": ["custom_module==1.2.3", "other_module@http://some/website.whl"],
    }
env_vars (Optional[Dict[str, str]]) – The environment variables for the runtime execution.
log_level (Optional[int]) – The log level. Possible values are CRITICAL, ERROR, WARNING, INFO, DEBUG, and NOTSET.
content_type (Optional[str]) – The content type of the endpoint input data. This value is automatically derived from the input sample, but you can override it.
accept_type (Optional[str]) – The content type of the data accepted from the endpoint. This value is automatically derived from the output sample, but you can override the value.
s3_model_data_url (Optional[str]) – The S3 location where you want to upload the model package. Defaults to an S3 bucket prefixed with the account ID.
instance_type (Optional[str]) – The instance type of the endpoint. Defaults to the CPU instance type to help narrow the container type based on the instance family.
schema_builder (Optional[SchemaBuilder]) – The input/output schema of the model. The schema builder translates the input into bytes and converts the response into a stream. All translations between the server and the client are handled automatically with the specified input and output. The schema builder can be omitted for HuggingFace models with task types TextGeneration, TextClassification, and QuestionAnswering. Omitting SchemaBuilder is in beta for FillMask and AutomaticSpeechRecognition use cases.
model (Optional[Union[object, str]]) – Model object (with a predict method to perform inference) or a HuggingFace/JumpStart Model ID. Either model or inference_spec is required for the model builder to build the artifact.
inference_spec (InferenceSpec) – The inference spec file with your customized invoke and load functions.
image_uri (Optional[str]) – The container image URI (which is derived from a SageMaker-based container).
image_config (dict[str, str] or dict[str, PipelineVariable]) – Specifies whether the model container image is pulled from ECR or from a private registry in your VPC. By default, the image is pulled from ECR (default: None).
vpc_config (Optional[Dict[str, List[Union[str, PipelineVariable]]]]) – The VpcConfig set on the model (default: None).
* 'Subnets' (List[Union[str, PipelineVariable]]): List of subnet IDs.
* 'SecurityGroupIds' (List[Union[str, PipelineVariable]]): List of security group IDs.
model_server (Optional[ModelServer]) – The model server to which to deploy. You need to provide this argument when you specify an image_uri in order for model builder to build the artifacts correctly (according to the model server). Possible values for this argument are TORCHSERVE, MMS, TENSORFLOW_SERVING, DJL_SERVING, TRITON, TGI, and TEI.
model_metadata (Optional[Dict[str, Any]]) – Dictionary used to override model metadata. Currently, HF_TASK is overridable for HuggingFace models. HF_TASK should be set for new models without task metadata in the Hub; adding unsupported task types will throw an exception. MLFLOW_MODEL_PATH is available for providing a local path or S3 path to MLflow artifacts. However, MLFLOW_MODEL_PATH is experimental and is not intended for production use at this moment. CUSTOM_MODEL_PATH is available for providing a local path or S3 path to model artifacts. FINE_TUNING_MODEL_PATH is available for providing an S3 path to fine-tuned model artifacts. FINE_TUNING_JOB_NAME is available for providing a fine-tuning job name. FINE_TUNING_MODEL_PATH and FINE_TUNING_JOB_NAME are mutually exclusive.
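As a rough sketch of how a few of these arguments fit together, the snippet below assembles keyword arguments using only the standard library. The helper `check_model_metadata` and the `HF_TASK` value are hypothetical and are introduced only for illustration; the dictionary keys and the mutual-exclusivity rule are taken from the parameter descriptions above.

```python
import logging

# Sample ``dependencies`` dict, following the shape described above.
dependencies = {
    "auto": False,  # turn off automatic dependency capture
    "requirements": "/path/to/requirements.txt",
    "custom": ["custom_module==1.2.3"],
}

# The standard ``logging`` constants are integers; DEBUG is 10, which
# matches the ``log_level=10`` default shown in the class signature.
log_level = logging.DEBUG


def check_model_metadata(metadata):
    """Hypothetical helper (not part of the SDK).

    Rejects metadata that sets both fine-tuning keys, which the
    parameter description says are mutually exclusive.
    """
    exclusive = {"FINE_TUNING_MODEL_PATH", "FINE_TUNING_JOB_NAME"}
    if exclusive <= set(metadata):
        raise ValueError(
            "FINE_TUNING_MODEL_PATH and FINE_TUNING_JOB_NAME are mutually exclusive"
        )
    return metadata


# "text-generation" is an illustrative task name, not a value from this page.
model_metadata = check_model_metadata({"HF_TASK": "text-generation"})

builder_kwargs = {
    "dependencies": dependencies,
    "log_level": log_level,
    "model_metadata": model_metadata,
}
```

These keyword arguments could then be passed to ModelBuilder together with either model or inference_spec.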
- ModelBuilder.build(mode=None, role_arn=None, sagemaker_session=None)
Create a deployable Model instance with ModelBuilder.
- Parameters
mode (Type[Mode], optional) – The mode. Defaults to None.
role_arn (str, optional) – The IAM role ARN. Defaults to None.
sagemaker_session (Optional[Session]) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the function creates one using the default AWS configuration chain.
- Returns
A deployable Model object.
- Return type
Type[Model]
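A minimal sketch of the build flow described above. It assumes the SageMaker Python SDK is installed and AWS credentials are configured when the function is called. The wrapper `build_model` and its argument names are hypothetical. The import is deferred into the function body so that defining the function does not itself require the SDK.

```python
def build_model(model, schema_builder, role_arn=None):
    """Hypothetical wrapper: construct a ModelBuilder and call build()."""
    # Deferred import: the SageMaker Python SDK is only needed at call time.
    from sagemaker.serve.builder.model_builder import ModelBuilder

    builder = ModelBuilder(
        model=model,                    # object with a predict method, or a model ID
        schema_builder=schema_builder,  # describes the sample input and output
        role_arn=role_arn,
    )
    # build() is documented to return a deployable Model object.
    return builder.build()
```

The mode, role_arn, and sagemaker_session arguments of build() are left at their None defaults here.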
- ModelBuilder.save(save_path=None, s3_path=None, sagemaker_session=None, role_arn=None)
WARNING: This function is experimental and not intended for production use.
This function is available for models served by DJL serving.
- Parameters
save_path (Optional[str]) – The path where you want to save resources. Defaults to None.
s3_path (Optional[str]) – The path where you want to upload resources. Defaults to None.
sagemaker_session (Optional[Session]) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the function creates one using the default AWS configuration chain. Defaults to None.
role_arn (Optional[str]) – The IAM role ARN. Defaults to None.
- Return type
- class sagemaker.serve.spec.inference_spec.InferenceSpec
Abstract base class for holding custom load, invoke and prepare functions.
Provides a skeleton for customization to override the methods load, invoke and prepare.
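To illustrate the override pattern, here is a sketch that uses a local stand-in base class so that it runs without the SDK. The method signatures are assumptions, since the text above only names the methods. `InferenceSpecSketch`, `DoublingSpec`, and the model path are hypothetical.

```python
from abc import ABC, abstractmethod


class InferenceSpecSketch(ABC):
    """Local stand-in for InferenceSpec; signatures are assumed."""

    @abstractmethod
    def load(self, model_dir):
        """Load and return the model found under ``model_dir``."""

    @abstractmethod
    def invoke(self, input_object, model):
        """Run inference on ``input_object`` with the loaded ``model``."""

    def prepare(self, *args, **kwargs):
        """Optional hook; a no-op in this sketch."""


class DoublingSpec(InferenceSpecSketch):
    """Toy spec whose 'model' doubles every input value."""

    def load(self, model_dir):
        # A real implementation would read artifacts from model_dir.
        return lambda values: [2 * v for v in values]

    def invoke(self, input_object, model):
        return model(input_object)


spec = DoublingSpec()
model = spec.load("/tmp/model")  # placeholder path; nothing is read from disk
result = spec.invoke([1, 2, 3], model)
print(result)  # prints [2, 4, 6]
```

With the SDK installed, a subclass would presumably inherit from sagemaker.serve.spec.inference_spec.InferenceSpec instead and be passed to ModelBuilder as inference_spec.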
- class sagemaker.serve.builder.schema_builder.SchemaBuilder(sample_input, sample_output, input_translator=None, output_translator=None)
Automatically detects the serializer and deserializer for your model.
This is done by inspecting the sample_input and sample_output objects. Alternatively, provide your custom serializer and deserializer for your request or response by creating a class that inherits from CustomPayloadTranslator and providing it to SchemaBuilder.
- Parameters
sample_input (object) – Sample input to the model which can be used for testing. The schema builder internally generates the content type and corresponding serializing functions.
sample_output (object) – Sample output from the model which can be used for testing. The schema builder internally generates the accept type and corresponding serializing functions.
input_translator (Optional[CustomPayloadTranslator]) – If you want to define your own serialization method for the payload, you can implement your functions for translation.
output_translator (Optional[CustomPayloadTranslator]) – If you want to define your own serialization method for the output, you can implement your functions for translation.
- class sagemaker.serve.marshalling.custom_payload_translator.CustomPayloadTranslator(content_type='application/custom', accept_type='application/custom')
Abstract base class for handling custom payload serialization and deserialization.
Provides a skeleton for customization requiring the overriding of the serialize_payload and deserialize_payload methods.
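The following sketch shows what such a translator might look like for JSON payloads. It uses a local stand-in base class so that it runs without the SDK. The method names follow the description above, but their signatures are assumptions, and the abstract method names in the SDK itself may differ. `PayloadTranslatorSketch` and `JsonTranslator` are hypothetical.

```python
import io
import json
from abc import ABC, abstractmethod


class PayloadTranslatorSketch(ABC):
    """Local stand-in for CustomPayloadTranslator; signatures are assumed."""

    def __init__(self, content_type="application/custom",
                 accept_type="application/custom"):
        self.content_type = content_type
        self.accept_type = accept_type

    @abstractmethod
    def serialize_payload(self, payload):
        """Turn a Python object into bytes."""

    @abstractmethod
    def deserialize_payload(self, stream):
        """Read bytes from a stream and return a Python object."""


class JsonTranslator(PayloadTranslatorSketch):
    """Encodes payloads as UTF-8 JSON and decodes them again."""

    def serialize_payload(self, payload):
        return json.dumps(payload).encode("utf-8")

    def deserialize_payload(self, stream):
        return json.loads(stream.read().decode("utf-8"))


translator = JsonTranslator(content_type="application/json",
                            accept_type="application/json")
raw = translator.serialize_payload({"inputs": [1.0, 2.0]})
restored = translator.deserialize_payload(io.BytesIO(raw))
```

An instance of a real subclass would be supplied to SchemaBuilder as input_translator or output_translator.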