AutoML
A class for SageMaker AutoML Jobs.
-
class sagemaker.automl.automl.AutoML(role, target_attribute_name, output_kms_key=None, output_path=None, base_job_name=None, compression_type=None, sagemaker_session=None, volume_kms_key=None, encrypt_inter_container_traffic=False, vpc_config=None, problem_type=None, max_candidates=None, max_runtime_per_training_job_in_seconds=None, total_job_runtime_in_seconds=None, job_objective=None, generate_candidate_definitions_only=False, tags=None)
Bases: object
A class for creating and interacting with SageMaker AutoML jobs.
-
fit(inputs=None, wait=True, logs=True, job_name=None)
Create an AutoML Job with the input dataset.
- Parameters
inputs (str or list[str] or AutoMLInput) – Local path or S3 URI where the training data is stored, or an AutoMLInput object. If a local path is provided, the dataset will be uploaded to an S3 location.
wait (bool) – Whether the call should wait until the job completes (default: True).
logs (bool) – Whether to show the logs produced by the job. Only meaningful when wait is True (default: True). If wait is False, logs will be set to False as well.
job_name (str) – Training job name. If not specified, the estimator generates a default job name, based on the training image name and current timestamp.
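A minimal sketch of starting a job with `fit`, assuming the SageMaker Python SDK is installed and AWS credentials are configured. The helper names `make_automl` and `start_job` are hypothetical; only the `AutoML(...)` constructor arguments and the `fit(...)` parameters come from the signatures above.

```python
def make_automl(role, target_attribute_name, output_path=None, max_candidates=None):
    """Build an AutoML object.

    The SDK is imported lazily so this module can be loaded without it.
    """
    from sagemaker.automl.automl import AutoML  # requires the sagemaker package

    return AutoML(
        role=role,
        target_attribute_name=target_attribute_name,
        output_path=output_path,
        max_candidates=max_candidates,
    )


def start_job(automl, inputs, job_name=None, wait=False):
    """Start an AutoML job, non-blocking by default.

    logs is tied to wait because logs are only meaningful while waiting.
    """
    automl.fit(inputs=inputs, wait=wait, logs=wait, job_name=job_name)
    return automl
```

With placeholder values, a call might look like `start_job(make_automl("arn:aws:iam::123456789012:role/ExampleRole", "label"), "s3://example-bucket/train.csv")`. The role ARN, bucket, and target column are illustrative only.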
-
classmethod attach(auto_ml_job_name, sagemaker_session=None)
Attach to an existing AutoML job.
Creates and returns an AutoML instance bound to an existing AutoML job.
- Parameters
auto_ml_job_name (str) – AutoML job name
sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, the one originally associated with the AutoML instance is used.
- Returns
An AutoML instance with the attached AutoML job.
- Return type
sagemaker.automl.AutoML
-
describe_auto_ml_job(job_name=None)
Returns the job description of an AutoML job for the given job name.
-
best_candidate(job_name=None)
Returns the best candidate of an AutoML job for a given name.
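The three calls above (`attach`, `describe_auto_ml_job`, `best_candidate`) are often used together to inspect a job started elsewhere. The sketch below assumes the returned dictionaries follow the shape of the underlying DescribeAutoMLJob response, with keys such as `AutoMLJobStatus`, `AutoMLJobSecondaryStatus`, and `CandidateName`; that shape is not stated on this page. The helper names are hypothetical.

```python
def attach_to_job(job_name):
    """Bind to an existing AutoML job by name (needs the sagemaker package)."""
    from sagemaker.automl.automl import AutoML

    return AutoML.attach(auto_ml_job_name=job_name)


def summarize_job(automl, job_name=None):
    """Return a small status summary for an AutoML job.

    The dictionary keys read here are assumptions based on the
    DescribeAutoMLJob API response, not on this page.
    """
    description = automl.describe_auto_ml_job(job_name=job_name)
    summary = {
        "status": description.get("AutoMLJobStatus"),
        "secondary_status": description.get("AutoMLJobSecondaryStatus"),
        "best_candidate": None,
    }
    # Only ask for the best candidate once the job has finished.
    if summary["status"] == "Completed":
        best = automl.best_candidate(job_name=job_name)
        summary["best_candidate"] = best.get("CandidateName")
    return summary
```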
-
list_candidates(job_name=None, status_equals=None, candidate_name=None, candidate_arn=None, sort_order=None, sort_by=None, max_results=None)
Returns the list of candidates of an AutoML job for a given name.
- Parameters
job_name (str) – The name of the AutoML job. If None, the object’s _current_job name is used.
status_equals (str) – Filter the results by candidate status. Valid values are “Completed”, “InProgress”, “Failed”, “Stopped”, and “Stopping”.
candidate_name (str) – The name of a specific candidate to list. Defaults to None.
candidate_arn (str) – The ARN of a specific candidate to list. Defaults to None.
sort_order (str) – The order in which the candidates are listed in the results. Defaults to None.
sort_by (str) – The value by which the candidates are sorted. Defaults to None.
max_results (int) – The number of candidates to list in the results, between 1 and 100. Defaults to None, in which case all candidates are returned.
- Returns
A list of dictionaries with candidates information.
- Return type
list
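The filter constraints described above (the allowed status values and the 1 to 100 range for `max_results`) can be checked before calling `list_candidates`. This is a sketch with hypothetical helper names. The `sort_by` and `sort_order` values used in `top_candidates` are assumptions taken from the underlying ListCandidatesForAutoMLJob API and are not listed on this page.

```python
VALID_STATUSES = {"Completed", "InProgress", "Failed", "Stopped", "Stopping"}


def build_list_candidates_kwargs(status_equals=None, sort_by=None,
                                 sort_order=None, max_results=None):
    """Validate filters and return only the keyword arguments that were set."""
    if status_equals is not None and status_equals not in VALID_STATUSES:
        raise ValueError("Unsupported status_equals: %r" % (status_equals,))
    if max_results is not None and not 1 <= max_results <= 100:
        raise ValueError("max_results must be between 1 and 100")
    kwargs = {
        "status_equals": status_equals,
        "sort_by": sort_by,
        "sort_order": sort_order,
        "max_results": max_results,
    }
    return {key: value for key, value in kwargs.items() if value is not None}


def top_candidates(automl, max_results=5):
    """List completed candidates, best objective metric first (assumed sort values)."""
    kwargs = build_list_candidates_kwargs(
        status_equals="Completed",
        sort_by="FinalObjectiveMetricValue",  # assumption: API-level sort key
        sort_order="Descending",              # assumption: API-level sort order
        max_results=max_results,
    )
    return automl.list_candidates(**kwargs)
```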
-
create_model(name, sagemaker_session=None, candidate=None, vpc_config=None, enable_network_isolation=False, model_kms_key=None, predictor_cls=None, inference_response_keys=None)
Creates a model from a given candidate or the best candidate from the job.
- Parameters
name (str) – The pipeline model name.
sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, the one originally associated with the AutoML instance is used.
candidate (CandidateEstimator or dict) – a CandidateEstimator used for deploying to a SageMaker Inference Pipeline. If None, the best candidate will be used. If the candidate input is a dict, a CandidateEstimator will be created from it.
vpc_config (dict) – Specifies a VPC that your training jobs and hosted models have access to. Contents include “SecurityGroupIds” and “Subnets”.
enable_network_isolation (bool) – Isolates the training container. No inbound or outbound network calls can be made, except for calls between peers within a training cluster for distributed training. Default: False
model_kms_key (str) – KMS key ARN used to encrypt the repacked model archive file if the model is repacked
predictor_cls (callable[string, sagemaker.session.Session]) – A function to call to create a predictor (default: None). If specified, deploy() returns the result of invoking this function on the created endpoint name.
inference_response_keys (list) – List of keys for response content. The order of the keys will dictate the content order in the response.
- Returns
PipelineModel object.
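As an illustration of the `vpc_config` and `enable_network_isolation` parameters, the sketch below builds the dictionary with the two documented keys and passes it to `create_model`. The helper names and the example IDs in the usage note are hypothetical.

```python
def make_vpc_config(security_group_ids, subnets):
    """Build a vpc_config dict with the two documented keys."""
    if not security_group_ids or not subnets:
        raise ValueError("Both security group IDs and subnets are required")
    return {
        "SecurityGroupIds": list(security_group_ids),
        "Subnets": list(subnets),
    }


def create_isolated_model(automl, name, vpc_config, candidate=None):
    """Create a model with network isolation enabled.

    When candidate is None, the best candidate of the job is used.
    """
    return automl.create_model(
        name=name,
        candidate=candidate,
        vpc_config=vpc_config,
        enable_network_isolation=True,
    )
```

For example, `make_vpc_config(["sg-0123456789abcdef0"], ["subnet-0123456789abcdef0"])` produces a dictionary suitable for the `vpc_config` argument; the IDs are placeholders.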
-
deploy(initial_instance_count, instance_type, serializer=None, deserializer=None, candidate=None, sagemaker_session=None, name=None, endpoint_name=None, tags=None, wait=True, vpc_config=None, enable_network_isolation=False, model_kms_key=None, predictor_cls=None, inference_response_keys=None)
Deploy a candidate to a SageMaker Inference Pipeline.
- Parameters
initial_instance_count (int) – The initial number of instances to run in the Endpoint created from this Model.
instance_type (str) – The EC2 instance type to deploy this Model to. For example, ‘ml.p2.xlarge’.
serializer (BaseSerializer) – A serializer object, used to encode data for an inference endpoint (default: None). If serializer is not None, then serializer will override the default serializer. The default serializer is set by the predictor_cls.
deserializer (BaseDeserializer) – A deserializer object, used to decode data from an inference endpoint (default: None). If deserializer is not None, then deserializer will override the default deserializer. The default deserializer is set by the predictor_cls.
candidate (CandidateEstimator or dict) – a CandidateEstimator used for deploying to a SageMaker Inference Pipeline. If None, the best candidate will be used. If the candidate input is a dict, a CandidateEstimator will be created from it.
sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, the one originally associated with the AutoML instance is used.
name (str) – The pipeline model name. If None, a default model name will be selected on each deploy.
endpoint_name (str) – The name of the endpoint to create (default: None). If not specified, a unique endpoint name will be created.
tags (List[dict[str, str]]) – The list of tags to attach to this specific endpoint.
wait (bool) – Whether the call should wait until the deployment of model completes (default: True).
vpc_config (dict) – Specifies a VPC that your training jobs and hosted models have access to. Contents include “SecurityGroupIds” and “Subnets”.
enable_network_isolation (bool) – Isolates the training container. No inbound or outbound network calls can be made, except for calls between peers within a training cluster for distributed training. Default: False
model_kms_key (str) – KMS key ARN used to encrypt the repacked model archive file if the model is repacked
predictor_cls (callable[string, sagemaker.session.Session]) – A function to call to create a predictor (default: None). If specified, deploy() returns the result of invoking this function on the created endpoint name.
inference_response_keys (list) – List of keys for response content. The order of the keys will dictate the content order in the response.
- Returns
If predictor_cls is specified, the invocation of self.predictor_cls on the created endpoint name. Otherwise, None.
- Return type
callable[string, sagemaker.session.Session] or None
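A sketch of deploying the best candidate and getting something back from `deploy`, based on the `predictor_cls` behavior described above. `EndpointHandle` is a hypothetical stand-in for a predictor class that accepts an endpoint name and a session; the instance type default and the helper name are illustrative assumptions.

```python
from collections import namedtuple

# Hypothetical predictor_cls: any callable taking (endpoint_name, session).
EndpointHandle = namedtuple("EndpointHandle", ["endpoint_name", "sagemaker_session"])


def deploy_best_candidate(automl, endpoint_name, instance_type="ml.m5.xlarge",
                          response_keys=None):
    """Deploy the best candidate (candidate is left unset) to a single instance.

    Because predictor_cls is passed, deploy() is expected to return the
    result of calling it on the created endpoint name, per the text above.
    """
    return automl.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        endpoint_name=endpoint_name,
        predictor_cls=EndpointHandle,
        inference_response_keys=response_keys,
        wait=True,
    )
```

Leaving `predictor_cls` out would, per the return description above, make `deploy` return None.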
-
classmethod validate_and_update_inference_response(inference_containers, inference_response_keys)
Validates the requested inference keys and updates response content.
On validation, also updates the inference containers to emit appropriate response content in the inference response.
- Parameters
- Raises
ValueError – if one or more of inference_response_keys are unsupported by the model
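The validation rule can be pictured with a small standalone check. This is not the SDK's implementation: the function name is hypothetical, and the default set of supported keys is an assumption for illustration, since the keys actually supported depend on the model's inference containers.

```python
# Assumption: an illustrative set of keys; the real set depends on the model.
ASSUMED_SUPPORTED_KEYS = ("predicted_label", "probability", "labels", "probabilities")


def check_response_keys(requested_keys, supported_keys=ASSUMED_SUPPORTED_KEYS):
    """Return the requested keys in order, or raise ValueError if any is unsupported."""
    if not requested_keys:
        return []
    unsupported = [key for key in requested_keys if key not in supported_keys]
    if unsupported:
        raise ValueError(
            "Unsupported inference response keys: %s" % ", ".join(unsupported)
        )
    # Order is preserved because it dictates the order of the response content.
    return list(requested_keys)
```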
-
-
class sagemaker.automl.automl.AutoMLInput(inputs, target_attribute_name, compression=None)
Bases: object
Accepts parameters that specify an S3 input for an AutoML job and provides a method to turn those parameters into a dictionary.
Converts an S3 URI or a list of S3 URIs to an AutoMLInput object.
- Parameters
inputs (str or list[str]) – A string or a list of strings that point to the S3 location(s) where input data is stored.
target_attribute_name (str) – The target attribute name for regression or classification.
compression (str) – If the training data is compressed, the compression type. The default value is None.
-
to_request_dict()
Generates a request dictionary using the parameters provided to the class.
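To give a sense of what such a request structure might look like, the sketch below builds one channel entry per S3 URI. The key names follow the InputDataConfig shape of the CreateAutoMLJob API, which is an assumption here; this page does not show the exact output of `to_request_dict`, and the function name is hypothetical.

```python
def sketch_input_data_config(inputs, target_attribute_name, compression=None):
    """Build a list of channel dicts, one per S3 URI (assumed API shape)."""
    if isinstance(inputs, str):
        inputs = [inputs]  # accept a single URI as well as a list
    channels = []
    for uri in inputs:
        channel = {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": uri}
            },
            "TargetAttributeName": target_attribute_name,
        }
        # Only include the compression type when one was given.
        if compression is not None:
            channel["CompressionType"] = compression
        channels.append(channel)
    return channels
```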
-
class sagemaker.automl.automl.AutoMLJob(sagemaker_session, job_name, inputs)
Bases: sagemaker.job._Job
A class for interacting with CreateAutoMLJob API.
Args: sagemaker_session: job_name:
-
classmethod start_new(auto_ml, inputs)
Create a new Amazon SageMaker AutoML job from auto_ml.
-
describe()
Prints out a response from the DescribeAutoMLJob API call.
-
classmethod
A class for AutoML Job’s Candidate.
-
class sagemaker.automl.candidate_estimator.CandidateEstimator(candidate, sagemaker_session=None)
Bases: object
A class for SageMaker AutoML Job Candidate
Constructor of CandidateEstimator.
- Parameters
candidate (dict) – a dictionary of candidate returned by AutoML.list_candidates() or AutoML.best_candidate().
sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.
-
get_steps()
Get the step job of a candidate so that users can construct estimators/transformers.
- Returns
A list of dictionaries that provide information about each step job’s name, type, inputs and description.
- Return type
-
fit(inputs, candidate_name=None, volume_kms_key=None, encrypt_inter_container_traffic=False, vpc_config=None, wait=True, logs=True)
Rerun a candidate’s step jobs with new input datasets or security config.
- Parameters
inputs (str or list[str]) – Local path or S3 Uri where the training data is stored. If a local path is provided, the dataset will be uploaded to an S3 location.
candidate_name (str) – name of the candidate to be rerun, if None, candidate’s original name will be used.
volume_kms_key (str) – The KMS key id to encrypt data on the storage volume attached to the ML compute instance(s).
encrypt_inter_container_traffic (bool) – To encrypt all communications between ML compute instances in distributed training. Default: False.
vpc_config (dict) – Specifies a VPC that jobs and hosted models have access to. Control access to and from training and model containers by configuring the VPC.
wait (bool) – Whether the call should wait until all jobs complete (default: True).
logs (bool) – Whether to show the logs produced by the job. Only meaningful when wait is True (default: True).
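A sketch of rerunning a candidate on new data with stricter security settings, using only the `fit` parameters listed above. The helper names are hypothetical, and `make_estimator` assumes the SageMaker Python SDK is installed.

```python
def make_estimator(candidate_dict, sagemaker_session=None):
    """Wrap a candidate dict (e.g. from best_candidate()) in a CandidateEstimator."""
    from sagemaker.automl.candidate_estimator import CandidateEstimator

    return CandidateEstimator(candidate_dict, sagemaker_session=sagemaker_session)


def rerun_candidate(estimator, inputs, new_name=None, volume_kms_key=None,
                    vpc_config=None):
    """Rerun the candidate's step jobs with inter-container encryption enabled."""
    estimator.fit(
        inputs=inputs,
        candidate_name=new_name,  # None keeps the candidate's original name
        volume_kms_key=volume_kms_key,
        encrypt_inter_container_traffic=True,
        vpc_config=vpc_config,
        wait=True,
        logs=True,
    )
    return estimator
```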
-
class sagemaker.automl.candidate_estimator.CandidateStep(name, inputs, step_type, description)
Bases: object
A class that maintains an AutoML Candidate step’s name, inputs, type, and description.
-
property name
Name of the candidate step -> (str)
-
property inputs
Inputs of the candidate step -> (dict)
-
property type
Type of the candidate step, Training or Transform -> (str)
-
property description
Description of candidate step job -> (dict)
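The properties above lend themselves to simple summaries. The sketch below groups step names by step type for any objects exposing `name` and `type` attributes, such as CandidateStep instances. The function name is hypothetical, and no particular set of type strings is assumed.

```python
from collections import defaultdict


def group_step_names_by_type(steps):
    """Map each step type to the names of the steps of that type, in input order."""
    grouped = defaultdict(list)
    for step in steps:
        grouped[step.type].append(step.name)
    return dict(grouped)
```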
-
property