sagemaker.serve.bedrock_model_builder
Holds the BedrockModelBuilder class.
Classes
| BedrockModelBuilder | Builder class for deploying models to Amazon Bedrock. |
- class sagemaker.serve.bedrock_model_builder.BedrockModelBuilder(model: ModelTrainer | MultiTurnRLTrainer | AgentRFTJob | TrainingJob | ModelPackage | None)
Bases: object
Builder class for deploying models to Amazon Bedrock.
This class provides functionality to deploy SageMaker models to Bedrock using either model import jobs or custom model creation, depending on the model type (Nova models vs. other models).
- Parameters:
model – The model to deploy. Can be a ModelTrainer, MultiTurnRLTrainer, AgentRFTJob, TrainingJob, or ModelPackage instance.
- create_deployment(model_arn: str, deployment_name: str | None = None, poll_interval: int = 60, max_wait: int = 3600, **kwargs) → Dict[str, Any]
Create a deployment for a Nova custom model.
Polls the model status until it becomes Active before creating the deployment, then polls the deployment status until it becomes Active.
- Parameters:
model_arn – ARN of the custom model to deploy.
deployment_name – Name for the deployment.
poll_interval – Seconds between status checks. Defaults to 60 for model, 30 for deployment.
max_wait – Maximum seconds to wait per polling phase. Defaults to 3600.
**kwargs – Additional parameters for create_custom_model_deployment.
- Returns:
Response from Bedrock create_custom_model_deployment API.
- Raises:
RuntimeError – If the model or deployment fails or times out.
ValueError – If model_arn is not provided.
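The two polling phases described above (model first, then deployment) share the same wait-until-Active pattern. The following is a minimal sketch of that pattern, not the library's actual implementation: `wait_for_status` and `get_status` are hypothetical names, and the `"Failed"` terminal status is an assumption.

```python
import time
from typing import Callable, Tuple


def wait_for_status(
    get_status: Callable[[], str],
    target: str = "Active",
    failed: Tuple[str, ...] = ("Failed",),  # assumed terminal failure status
    poll_interval: float = 60,
    max_wait: float = 3600,
) -> str:
    """Call get_status() until it returns `target`.

    Raises RuntimeError if a failure status is seen or if another
    poll would exceed `max_wait` seconds.
    """
    deadline = time.monotonic() + max_wait
    while True:
        status = get_status()
        if status == target:
            return status
        if status in failed:
            raise RuntimeError(f"Resource entered status {status!r}")
        # Stop before sleeping if the next check would land past the deadline.
        if time.monotonic() + poll_interval > deadline:
            raise RuntimeError(f"Timed out after {max_wait}s waiting for {target!r}")
        time.sleep(poll_interval)
```

Under this sketch, one call would guard the model phase and a second call would guard the deployment phase, each with its own `max_wait` budget, which matches the "per polling phase" wording of the max_wait parameter.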
- create_provisioned_throughput(model_id: str | None = None, provisioned_model_name: str | None = None, model_units: int = 1, commitment_duration: str | None = None, tags: list | None = None, poll_interval: int = 60, max_wait: int = 3600) → Dict[str, Any]
Create provisioned throughput for an imported model on Bedrock.
Calls CreateProvisionedModelThroughput and polls until the provisioned throughput reaches InService status.
- Parameters:
model_id – ARN or name of the model. If not provided, uses the model ID from the most recent deploy() call.
provisioned_model_name – Name for the provisioned throughput resource.
model_units – Number of model units to provision. Defaults to 1.
commitment_duration – Commitment duration. Valid values: 'OneMonth', 'SixMonths'. If not provided, no commitment is set (on-demand).
tags – Tags for the provisioned throughput resource.
poll_interval – Seconds between status checks. Defaults to 60.
max_wait – Maximum seconds to wait. Defaults to 3600.
- Returns:
Response from Bedrock create_provisioned_model_throughput API.
- Raises:
RuntimeError – If the provisioned throughput fails or times out.
ValueError – If model_id cannot be determined or provisioned_model_name is not provided.
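As an illustration of the parameters above, here is a small wrapper sketch that validates its arguments before delegating to create_provisioned_throughput(). The `provision` helper and `VALID_COMMITMENTS` are hypothetical names introduced for this example. `builder` is assumed to be a BedrockModelBuilder on which deploy() has already been called, so model_id can be omitted.

```python
from typing import Any, Dict, Optional

# Commitment values listed in the parameter description above.
VALID_COMMITMENTS = {"OneMonth", "SixMonths"}


def provision(
    builder: Any,
    provisioned_model_name: str,
    model_units: int = 1,
    commitment_duration: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate arguments locally, then call the documented method.

    `builder` is duck-typed: anything exposing create_provisioned_throughput()
    with the keyword arguments documented above will work.
    """
    if not provisioned_model_name:
        raise ValueError("provisioned_model_name is required")
    if commitment_duration is not None and commitment_duration not in VALID_COMMITMENTS:
        raise ValueError(f"commitment_duration must be one of {sorted(VALID_COMMITMENTS)}")
    # model_id is omitted so the builder falls back to the model from the
    # most recent deploy() call, as described for the model_id parameter.
    return builder.create_provisioned_throughput(
        provisioned_model_name=provisioned_model_name,
        model_units=model_units,
        commitment_duration=commitment_duration,
    )
```

Leaving commitment_duration as None requests no commitment, per the parameter description. Checking the value locally only surfaces a typo before any API call is made.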
- deploy(job_name: str | None = None, imported_model_name: str | None = None, custom_model_name: str | None = None, role_arn: str | None = None, job_tags: list | None = None, imported_model_tags: list | None = None, model_tags: list | None = None, client_request_token: str | None = None, imported_model_kms_key_id: str | None = None, deployment_name: str | None = None) → Dict[str, Any]
Deploy the model to Bedrock.
Automatically detects if the model is a Nova model and uses the appropriate Bedrock API (create_custom_model for Nova, create_model_import_job for OSS). For Nova models, creates a custom model deployment and polls until active. For OSS models, creates a model import job and polls until complete. Once deploy() returns, the model is ready for on-demand inference. For provisioned throughput, use the separate create_provisioned_throughput() method.
- Parameters:
job_name – Name for the model import job (OSS models only).
imported_model_name – Name for the imported model (OSS models only).
custom_model_name – Name for the custom model (Nova models only).
role_arn – IAM role ARN with permissions for Bedrock operations.
job_tags – Tags for the import job (OSS models only).
imported_model_tags – Tags for the imported model (OSS models only).
model_tags – Tags for the custom model (Nova models only).
client_request_token – Unique token for idempotency (OSS models only).
imported_model_kms_key_id – KMS key ID for encryption (OSS models only).
deployment_name – Name for the deployment (Nova models only). If not provided, defaults to custom_model_name suffixed with '-deployment'.
- Returns:
For Nova models: the create_custom_model_deployment response. For OSS models: the completed get_model_import_job response.
- Return type:
Dict[str, Any]
- Raises:
ValueError – If model_package is not set or required parameters are missing.
RuntimeError – If the import job or deployment fails or times out.
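Because several deploy() parameters apply only to Nova models and others only to OSS models, it can help to build the keyword arguments separately for each case. The sketch below is illustrative only: `build_deploy_kwargs` is a hypothetical helper, the caller supplies the `is_nova` flag (deploy() detects the model type itself), and the `-import-job` suffix is an arbitrary naming choice, not a library convention.

```python
from typing import Dict, Optional


def build_deploy_kwargs(
    is_nova: bool,
    name: str,
    role_arn: str,
    deployment_name: Optional[str] = None,
) -> Dict[str, str]:
    """Return only the deploy() keyword arguments relevant to the model type."""
    if is_nova:
        return {
            "custom_model_name": name,
            "role_arn": role_arn,
            # Mirrors the documented default: custom_model_name + '-deployment'.
            "deployment_name": deployment_name or f"{name}-deployment",
        }
    return {
        "job_name": f"{name}-import-job",  # arbitrary naming, for illustration
        "imported_model_name": name,
        "role_arn": role_arn,
    }
```

Given a BedrockModelBuilder instance `builder`, the result would be passed as `builder.deploy(**build_deploy_kwargs(...))`. Resolving deployment_name explicitly is optional, since deploy() applies the same default when the argument is omitted.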