HyperparameterTuner¶
-
class sagemaker.tuner.HyperparameterTuner(estimator, objective_metric_name, hyperparameter_ranges, metric_definitions=None, strategy='Bayesian', objective_type='Maximize', max_jobs=None, max_parallel_jobs=1, max_runtime_in_seconds=None, tags=None, base_tuning_job_name=None, warm_start_config=None, strategy_config=None, completion_criteria_config=None, early_stopping_type='Off', estimator_name=None, random_seed=None)¶
Bases: object
Defines interaction with Amazon SageMaker hyperparameter tuning jobs.
It also supports deploying the resulting models.
Creates a HyperparameterTuner instance.
It takes an estimator to obtain configuration information for training jobs that are created as the result of a hyperparameter tuning job.
- Parameters
estimator (sagemaker.estimator.EstimatorBase) – An estimator object that has been initialized with the desired configuration. There does not need to be a training job associated with this instance.
objective_metric_name (str or PipelineVariable) – Name of the metric for evaluating training jobs.
hyperparameter_ranges (dict[str, sagemaker.parameter.ParameterRange]) – Dictionary of parameter ranges. These parameter ranges can be one of three types: Continuous, Integer, or Categorical. The keys of the dictionary are the names of the hyperparameter, and the values are the appropriate parameter range class to represent the range.
metric_definitions (list[dict[str, str]] or list[dict[str, PipelineVariable]]) – A list of dictionaries that defines the metric(s) used to evaluate the training jobs (default: None). Each dictionary contains two keys: ‘Name’ for the name of the metric, and ‘Regex’ for the regular expression used to extract the metric from the logs. This should be defined only for hyperparameter tuning jobs that don’t use an Amazon algorithm.
strategy (str or PipelineVariable) – Strategy to be used for hyperparameter estimations (default: ‘Bayesian’).
objective_type (str or PipelineVariable) – The type of the objective metric for evaluating training jobs. This value can be either ‘Minimize’ or ‘Maximize’ (default: ‘Maximize’).
max_jobs (int or PipelineVariable) – Maximum total number of training jobs to start for the hyperparameter tuning job. The default value is unspecified for the GridSearch strategy, and 1 for all other strategies (default: None).
max_parallel_jobs (int or PipelineVariable) – Maximum number of parallel training jobs to start (default: 1).
max_runtime_in_seconds (int or PipelineVariable) – The maximum time in seconds that a training job launched by a hyperparameter tuning job can run.
tags (list[dict[str, str]] or list[dict[str, PipelineVariable]]) – List of tags for labeling the tuning job (default: None). For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
base_tuning_job_name (str) – Prefix for the hyperparameter tuning job name when the fit() method launches. If not specified, a default job name is generated, based on the training image name and current timestamp.
warm_start_config (sagemaker.tuner.WarmStartConfig) – A WarmStartConfig object that has been initialized with the configuration defining the nature of the warm start tuning job.
strategy_config (sagemaker.tuner.StrategyConfig) – A configuration for the hyperparameter tuning job optimization strategy.
completion_criteria_config (sagemaker.tuner.TuningJobCompletionCriteriaConfig) – A configuration for the completion criteria.
early_stopping_type (str or PipelineVariable) – Specifies whether early stopping is enabled for the job. Can be either ‘Auto’ or ‘Off’ (default: ‘Off’). If set to ‘Off’, early stopping will not be attempted. If set to ‘Auto’, early stopping of some training jobs may happen, but is not guaranteed to.
estimator_name (str) – A unique name to identify an estimator within the hyperparameter tuning job, when more than one estimator is used with the same tuning job (default: None).
random_seed (int) – An initial value used to initialize a pseudo-random number generator. Setting a random seed makes the hyperparameter tuning search strategies produce more consistent configurations for the same tuning job.
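Putting the constructor parameters together, a minimal sketch of building a tuner for a custom training container might look like the following. The image URI, role ARN, objective metric name, and regex are placeholders, not values defined by this API:

```python
# Sketch only: the image URI, role ARN, metric name, and regex below are
# placeholders. Imports are kept inside the function so the configuration
# data can be inspected without the SageMaker SDK installed.
metric_definitions = [
    # 'Name' must match objective_metric_name; 'Regex' extracts the metric
    # value from the training job's logs.
    {"Name": "validation:auc", "Regex": "auc: ([0-9\\.]+)"},
]

def make_tuner():
    from sagemaker.estimator import Estimator
    from sagemaker.tuner import (
        ContinuousParameter,
        HyperparameterTuner,
        IntegerParameter,
    )

    estimator = Estimator(
        image_uri="123456789012.dkr.ecr.us-west-2.amazonaws.com/my-algo:latest",
        role="arn:aws:iam::123456789012:role/SageMakerRole",
        instance_count=1,
        instance_type="ml.m5.xlarge",
    )
    return HyperparameterTuner(
        estimator=estimator,
        objective_metric_name="validation:auc",
        hyperparameter_ranges={
            "learning_rate": ContinuousParameter(0.01, 0.2),
            "num_round": IntegerParameter(50, 400),
        },
        metric_definitions=metric_definitions,
        max_jobs=20,
        max_parallel_jobs=2,
    )
```

No training job is required at construction time; jobs are only launched when fit() is called on the returned tuner.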
-
TUNING_JOB_NAME_MAX_LENGTH = 32¶
-
SAGEMAKER_ESTIMATOR_MODULE = 'sagemaker_estimator_module'¶
-
SAGEMAKER_ESTIMATOR_CLASS_NAME = 'sagemaker_estimator_class_name'¶
-
DEFAULT_ESTIMATOR_MODULE = 'sagemaker.estimator'¶
-
DEFAULT_ESTIMATOR_CLS_NAME = 'Estimator'¶
-
override_resource_config
(instance_configs)¶ Override the instance configuration of the estimators used by the tuner.
- Parameters
instance_configs (List[InstanceConfig] or Dict[str, List[InstanceConfig]]) – The InstanceConfigs to use as an override for the instance configuration of the estimator.
None will remove the override.
-
fit
(inputs=None, job_name=None, include_cls_metadata=False, estimator_kwargs=None, wait=True, **kwargs)¶ Start a hyperparameter tuning job.
- Parameters
inputs –
Information about the training data. Please refer to the fit() method of the associated estimator, as this can take any of the following forms:
- (str) - The S3 location where training data is saved.
- (dict[str, str] or dict[str, sagemaker.inputs.TrainingInput]) -
If using multiple channels for training data, you can specify a dict mapping channel names to strings or
TrainingInput()
objects.
- (sagemaker.inputs.TrainingInput) - Channel configuration for S3 data sources
that can provide additional information about the training dataset. See
sagemaker.inputs.TrainingInput()
for full details.
- (sagemaker.session.FileSystemInput) - channel configuration for
a file system data source that can provide additional information as well as the path to the training dataset.
- (sagemaker.amazon.amazon_estimator.RecordSet) - A collection of
Amazon Record objects serialized and stored in S3. For use with an estimator for an Amazon algorithm.
- (sagemaker.amazon.amazon_estimator.FileSystemRecordSet) -
Amazon SageMaker channel configuration for a file system data source for Amazon algorithms.
- (list[sagemaker.amazon.amazon_estimator.RecordSet]) - A list of
RecordSet objects, where each instance is a different channel of training data.
- (list[sagemaker.amazon.amazon_estimator.FileSystemRecordSet]) - A list of
FileSystemRecordSet objects, where each instance is a different channel of training data.
job_name (str) – Tuning job name. If not specified, the tuner generates a default job name, based on the training image name and current timestamp.
include_cls_metadata –
It can take one of the following two forms.
- (bool) - Whether or not the hyperparameter tuning job should include information
about the estimator class (default: False). This information is passed as a hyperparameter, so if the algorithm you are using cannot handle unknown hyperparameters (e.g. an Amazon SageMaker built-in algorithm that does not have a custom estimator in the Python SDK), then set include_cls_metadata to False.
- (dict[str, bool]) - This version should be used for tuners created via the
factory method create(), to specify the flag for each estimator provided in the estimator_dict argument of the method. The keys would be the same estimator names as in estimator_dict. If one estimator doesn’t need the flag set, it does not need to be included in the dictionary.
estimator_kwargs (dict[str, dict]) – Dictionary for other arguments needed for training. Should be used only for tuners created via the factory method create(). The keys are the estimator names for the estimator_dict argument of create() method. Each value is a dictionary for the other arguments needed for training of the corresponding estimator.
wait (bool) – Whether the call should wait until the job completes (default: True).
**kwargs – Other arguments needed for training. Please refer to the fit() method of the associated estimator to see what other arguments are needed.
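The string and dictionary inputs forms above can be sketched as follows; the S3 URIs and channel names are placeholders, and valid channel names depend on the algorithm being tuned:

```python
# Placeholder S3 URIs; the channel names required depend on the algorithm.
single_channel = "s3://my-bucket/train/"  # (str) form: one S3 prefix

multi_channel = {  # (dict[str, str]) form: channel name -> S3 location
    "train": "s3://my-bucket/train/",
    "validation": "s3://my-bucket/validation/",
}

def start_tuning(tuner, job_name=None):
    # tuner is an already-configured HyperparameterTuner; wait=False returns
    # immediately instead of blocking until the tuning job completes.
    tuner.fit(inputs=multi_channel, job_name=job_name, wait=False)
```

The dict values may also be TrainingInput or FileSystemInput objects when channels need extra configuration beyond an S3 prefix.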
-
classmethod
attach
(tuning_job_name, sagemaker_session=None, job_details=None, estimator_cls=None)¶ Attach to an existing hyperparameter tuning job.
Create a HyperparameterTuner bound to an existing hyperparameter tuning job. After attaching, if there exists a best training job (or any other completed training job), that can be deployed to create an Amazon SageMaker Endpoint and return a Predictor.
The HyperparameterTuner instance could be created in one of the following two forms.
- If the ‘TrainingJobDefinition’ field is present in the tuning job description, the tuner
will be created using the default constructor with a single estimator.
- If the ‘TrainingJobDefinitions’ field (list) is present in the tuning job description,
the tuner will be created using the factory method create() with one or several estimators. Each estimator corresponds to one item in the ‘TrainingJobDefinitions’ field, while the estimator names would come from the ‘DefinitionName’ field of items in the ‘TrainingJobDefinitions’ field. For more details on how tuners are created from multiple estimators, see the create() documentation.
For more details on ‘TrainingJobDefinition’ and ‘TrainingJobDefinitions’ fields in tuning job description, see https://botocore.readthedocs.io/en/latest/reference/services/sagemaker.html#SageMaker.Client.create_hyper_parameter_tuning_job
- Parameters
tuning_job_name (str) – The name of the hyperparameter tuning job to attach to.
sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.
job_details (dict) – The response to a DescribeHyperParameterTuningJob call. If not specified, the HyperparameterTuner will perform one such call with the provided hyperparameter tuning job name.
estimator_cls –
It can take one of the following two forms.
- (str): The estimator class name associated with the training jobs, e.g.
’sagemaker.estimator.Estimator’. If not specified, the HyperparameterTuner will try to derive the correct estimator class from training job metadata, defaulting to sagemaker.estimator.Estimator if it is unable to determine a more specific class.
- (dict[str, str]): This form should be used only when the ‘TrainingJobDefinitions’
field (list) is present in the tuning job description. In this scenario training jobs could be created from different training job definitions in the ‘TrainingJobDefinitions’ field, each of which would be mapped to a different estimator after the attach() call. The estimator_cls should then be a dictionary to specify estimator class names for individual estimators as needed. The keys should be the ‘DefinitionName’ value of items in ‘TrainingJobDefinitions’, which would be used as estimator names in the resulting tuner instance.
Examples
Example #1 - assuming we have the following tuning job description, which has the ‘TrainingJobDefinition’ field present using a SageMaker built-in algorithm (i.e. PCA), and attach() can derive the estimator class from the training image. So estimator_cls would not be needed.
{
    'BestTrainingJob': 'best_training_job_name',
    'TrainingJobDefinition': {
        'AlgorithmSpecification': {
            'TrainingImage': '174872318107.dkr.ecr.us-west-2.amazonaws.com/pca:1',
        },
    },
}
>>> my_tuner.fit()
>>> job_name = my_tuner.latest_tuning_job.name
Later on:
>>> attached_tuner = HyperparameterTuner.attach(job_name)
>>> attached_tuner.deploy()
Example #2 - assuming we have the following tuning job description, which has a 2-item list for the ‘TrainingJobDefinitions’ field. In this case ‘estimator_cls’ is only needed for the 2nd item since the 1st item uses a SageMaker built-in algorithm (i.e. PCA).
{
    'BestTrainingJob': 'best_training_job_name',
    'TrainingJobDefinitions': [
        {
            'DefinitionName': 'estimator_pca',
            'AlgorithmSpecification': {
                'TrainingImage': '174872318107.dkr.ecr.us-west-2.amazonaws.com/pca:1',
            },
        },
        {
            'DefinitionName': 'estimator_byoa',
            'AlgorithmSpecification': {
                'TrainingImage': '123456789012.dkr.ecr.us-west-2.amazonaws.com/byoa:latest',
            },
        }
    ]
}
>>> my_tuner.fit()
>>> job_name = my_tuner.latest_tuning_job.name
Later on:
>>> attached_tuner = HyperparameterTuner.attach(
>>>     job_name,
>>>     estimator_cls={
>>>         'estimator_byoa': 'org.byoa.Estimator'
>>>     })
>>> attached_tuner.deploy()
- Returns
A HyperparameterTuner instance with the attached hyperparameter tuning job.
- Return type
HyperparameterTuner
-
deploy
(initial_instance_count, instance_type, serializer=None, deserializer=None, accelerator_type=None, endpoint_name=None, wait=True, model_name=None, kms_key=None, data_capture_config=None, **kwargs)¶ Deploy the best trained or user-specified model to an Amazon SageMaker endpoint.
It also returns a sagemaker.Predictor object.
For more information: http://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-training.html
- Parameters
initial_instance_count (int) – Minimum number of EC2 instances to deploy to an endpoint for prediction.
instance_type (str) – Type of EC2 instance to deploy to an endpoint for prediction, for example, ‘ml.c4.xlarge’.
serializer (BaseSerializer) – A serializer object, used to encode data for an inference endpoint (default: None). If serializer is not None, then serializer will override the default serializer. The default serializer is set by the predictor_cls.
deserializer (BaseDeserializer) – A deserializer object, used to decode data from an inference endpoint (default: None). If deserializer is not None, then deserializer will override the default deserializer. The default deserializer is set by the predictor_cls.
accelerator_type (str) – Type of Elastic Inference accelerator to attach to an endpoint for model loading and inference, for example, ‘ml.eia1.medium’. If not specified, no Elastic Inference accelerator will be attached to the endpoint. For more information: https://docs.aws.amazon.com/sagemaker/latest/dg/ei.html
endpoint_name (str) – Name to use for creating an Amazon SageMaker endpoint. If not specified, the name of the training job is used.
wait (bool) – Whether the call should wait until the deployment of model completes (default: True).
model_name (str) – Name to use for creating an Amazon SageMaker model. If not specified, the name of the training job is used.
kms_key (str) – The ARN of the KMS key that is used to encrypt the data on the storage volume attached to the instance hosting the endpoint.
data_capture_config (sagemaker.model_monitor.DataCaptureConfig) – Specifies configuration related to Endpoint data capture for use with Amazon SageMaker Model Monitoring. Default: None.
**kwargs – Other arguments needed for deployment. Please refer to the
create_model()
method of the associated estimator to see what other arguments are needed.
- Returns
A predictor that provides a predict() method, which can be used to send requests to the Amazon SageMaker endpoint and obtain inferences.
- Return type
sagemaker.predictor.Predictor
-
stop_tuning_job
()¶ Stop the latest running hyperparameter tuning job.
-
describe
()¶ Returns a response from the DescribeHyperParameterTuningJob API call.
-
wait
()¶ Wait for the latest hyperparameter tuning job to finish.
-
best_estimator
(best_training_job=None)¶ Return the estimator that has the best training job attached.
The trained model can then be deployed to an Amazon SageMaker endpoint to return a sagemaker.Predictor object.
- Parameters
best_training_job (dict) –
Dictionary containing “TrainingJobName” and “TrainingJobDefinitionName”.
Example:
{
    "TrainingJobName": "my_training_job_name",
    "TrainingJobDefinitionName": "my_training_job_definition_name"
}
- Returns
The estimator that has the best training job attached.
- Return type
sagemaker.estimator.EstimatorBase
- Raises
Exception – If there is no best training job available for the hyperparameter tuning job.
-
best_training_job
()¶ Return the name of the best training job for the latest hyperparameter tuning job.
- Raises
Exception – If there is no best training job available for the hyperparameter tuning job.
-
hyperparameter_ranges
()¶ Return the hyperparameter ranges in a dictionary.
The dictionary is suitable for use as part of a request for creating a hyperparameter tuning job.
-
hyperparameter_ranges_dict
()¶ Return a dictionary of hyperparameter ranges for all estimators in estimator_dict.
-
property
sagemaker_session
¶ Convenience property for accessing the SageMaker session.
It accesses the Session object associated with the estimator for the HyperparameterTuner.
-
analytics
()¶ Return an instance of HyperparameterTuningJobAnalytics for the latest tuning job of this tuner.
The analytics object gives you access to tuning results summarized into a pandas DataFrame.
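As a sketch, retrieving the summary frame might look like the following; it assumes the tuner has already launched at least one tuning job:

```python
def tuning_results(tuner):
    # analytics() returns a HyperparameterTuningJobAnalytics instance for the
    # tuner's latest tuning job; dataframe() summarizes the results with one
    # row per training job, including the sampled hyperparameter values and
    # the final objective metric value.
    return tuner.analytics().dataframe()
```

The frame can then be sorted on the final objective value to inspect the best-performing configurations.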
-
transfer_learning_tuner
(additional_parents=None, estimator=None)¶ Creates a new HyperparameterTuner.
Creation is done by copying the request fields from the provided parent to the new instance of HyperparameterTuner, followed by the addition of a warm start configuration with the type “TransferLearning” and parents as the union of the provided additional_parents and self. Also, the training image in the new tuner’s estimator is updated with the provided training_image.
Examples
>>> parent_tuner = HyperparameterTuner.attach(tuning_job_name="parent-job-1")
>>> transfer_learning_tuner = parent_tuner.transfer_learning_tuner(
>>>     additional_parents={"parent-job-2"})
Later On:
>>> transfer_learning_tuner.fit(inputs={})
- Parameters
additional_parents (set{str}) – Set of additional parents, along with self, to be used in warm starting.
estimator (sagemaker.estimator.EstimatorBase) – An estimator object that has been initialized with the desired configuration. There does not need to be a training job associated with this instance.
- Returns
A HyperparameterTuner instance which can be used to launch a transfer learning tuning job.
- Return type
HyperparameterTuner
-
identical_dataset_and_algorithm_tuner
(additional_parents=None)¶ Creates a new HyperparameterTuner.
Creation is done by copying the request fields from the provided parent to the new instance of HyperparameterTuner, followed by the addition of a warm start configuration with the type “IdenticalDataAndAlgorithm” and parents as the union of the provided additional_parents and self.
Examples
>>> parent_tuner = HyperparameterTuner.attach(tuning_job_name="parent-job-1")
>>> identical_dataset_algo_tuner = parent_tuner.identical_dataset_and_algorithm_tuner(
>>>     additional_parents={"parent-job-2"})
Later On:
>>> identical_dataset_algo_tuner.fit(inputs={})
- Parameters
additional_parents (set{str}) – Set of additional parents, along with self, to be used in warm starting.
- Returns
A HyperparameterTuner instance which can be used to launch an identical dataset and algorithm tuning job.
- Return type
HyperparameterTuner
-
classmethod
create
(estimator_dict, objective_metric_name_dict, hyperparameter_ranges_dict, metric_definitions_dict=None, base_tuning_job_name=None, strategy='Bayesian', strategy_config=None, completion_criteria_config=None, objective_type='Maximize', max_jobs=None, max_parallel_jobs=1, max_runtime_in_seconds=None, tags=None, warm_start_config=None, early_stopping_type='Off', random_seed=None)¶ Factory method to create a HyperparameterTuner instance.
It takes one or more estimators to obtain configuration information for training jobs that are created as the result of a hyperparameter tuning job. The estimators are provided through a dictionary (i.e. estimator_dict) with unique estimator names as the keys. For individual estimators, separate objective metric names and hyperparameter ranges should be provided in two dictionaries, i.e. objective_metric_name_dict and hyperparameter_ranges_dict, with the same estimator names as the keys. Optional metric definitions could also be provided for individual estimators via another dictionary, metric_definitions_dict.
- Parameters
estimator_dict (dict[str, sagemaker.estimator.EstimatorBase]) – Dictionary of estimator instances that have been initialized with the desired configuration. There does not need to be a training job associated with the estimator instances. The keys of the dictionary would be referred to as “estimator names”.
objective_metric_name_dict (dict[str, str]) – Dictionary of names of the objective metric for evaluating training jobs. The keys are the same set of estimator names as in estimator_dict, and there must be one entry for each estimator in estimator_dict.
hyperparameter_ranges_dict (dict[str, dict[str, sagemaker.parameter.ParameterRange]]) – Dictionary of tunable hyperparameter ranges. The keys are the same set of estimator names as in estimator_dict, and there must be one entry for each estimator in estimator_dict. Each value is a dictionary of sagemaker.parameter.ParameterRange instances, which can be one of three types: Continuous, Integer, or Categorical. The keys of each ParameterRange dictionary are the names of the hyperparameter, and the values are the appropriate parameter range class to represent the range.
metric_definitions_dict (dict[str, list[dict]]) – Dictionary of metric definitions. The keys are the same set or a subset of estimator names as in estimator_dict. Each value is a list of dictionaries that defines the metric(s) used to evaluate the training jobs (default: None). Each of these dictionaries contains two keys: ‘Name’ for the name of the metric, and ‘Regex’ for the regular expression used to extract the metric from the logs. This should be defined only for hyperparameter tuning jobs that don’t use an Amazon algorithm.
base_tuning_job_name (str) – Prefix for the hyperparameter tuning job name when the fit() method launches. If not specified, a default job name is generated, based on the training image name and current timestamp.
strategy (str) – Strategy to be used for hyperparameter estimations (default: ‘Bayesian’).
strategy_config (dict) – The configuration for a training job launched by a hyperparameter tuning job.
completion_criteria_config (dict) – The configuration for tuning job completion criteria.
objective_type (str) – The type of the objective metric for evaluating training jobs. This value can be either ‘Minimize’ or ‘Maximize’ (default: ‘Maximize’).
max_jobs (int) – Maximum total number of training jobs to start for the hyperparameter tuning job. The default value is unspecified for the GridSearch strategy, and 1 for all other strategies (default: None).
max_parallel_jobs (int) – Maximum number of parallel training jobs to start (default: 1).
max_runtime_in_seconds (int) – The maximum time in seconds that a training job launched by a hyperparameter tuning job can run.
tags (list[dict]) – List of tags for labeling the tuning job (default: None). For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
warm_start_config (sagemaker.tuner.WarmStartConfig) – A WarmStartConfig object that has been initialized with the configuration defining the nature of the warm start tuning job.
early_stopping_type (str) – Specifies whether early stopping is enabled for the job. Can be either ‘Auto’ or ‘Off’ (default: ‘Off’). If set to ‘Off’, early stopping will not be attempted. If set to ‘Auto’, early stopping of some training jobs may happen, but is not guaranteed to.
random_seed (int) – An initial value used to initialize a pseudo-random number generator. Setting a random seed makes the hyperparameter tuning search strategies produce more consistent configurations for the same tuning job.
- Returns
A new HyperparameterTuner object that can start a hyperparameter tuning job with one or more estimators.
- Return type
HyperparameterTuner
-
delete_endpoint
(**kwargs)¶
-
class sagemaker.tuner.ContinuousParameter(min_value, max_value, scaling_type='Auto')¶
Bases: sagemaker.parameter.ParameterRange
A class for representing hyperparameters that have a continuous range of possible values.
- Parameters
min_value (float) – The minimum value for the range.
max_value (float) – The maximum value for the range.
scaling_type (Union[str, sagemaker.workflow.entities.PipelineVariable]) – The scale used for searching the range during tuning (default: ‘Auto’). Valid values: ‘Auto’, ‘Linear’, ‘Logarithmic’ and ‘ReverseLogarithmic’.
Initialize a parameter range.
- Parameters
min_value (float or int or PipelineVariable) – The minimum value for the range.
max_value (float or int or PipelineVariable) – The maximum value for the range.
scaling_type (str or PipelineVariable) – The scale used for searching the range during tuning (default: ‘Auto’). Valid values: ‘Auto’, ‘Linear’, ‘Logarithmic’ and ‘ReverseLogarithmic’.
-
classmethod
cast_to_type
(value)¶ Placeholder docstring
-
class sagemaker.tuner.IntegerParameter(min_value, max_value, scaling_type='Auto')¶
Bases: sagemaker.parameter.ParameterRange
A class for representing hyperparameters that have an integer range of possible values.
- Parameters
min_value (int) – The minimum value for the range.
max_value (int) – The maximum value for the range.
scaling_type (Union[str, sagemaker.workflow.entities.PipelineVariable]) – The scale used for searching the range during tuning (default: ‘Auto’). Valid values: ‘Auto’, ‘Linear’, ‘Logarithmic’ and ‘ReverseLogarithmic’.
Initialize a parameter range.
- Parameters
min_value (float or int or PipelineVariable) – The minimum value for the range.
max_value (float or int or PipelineVariable) – The maximum value for the range.
scaling_type (str or PipelineVariable) – The scale used for searching the range during tuning (default: ‘Auto’). Valid values: ‘Auto’, ‘Linear’, ‘Logarithmic’ and ‘ReverseLogarithmic’.
-
classmethod
cast_to_type
(value)¶ Placeholder docstring
-
class sagemaker.tuner.CategoricalParameter(values)¶
Bases: sagemaker.parameter.ParameterRange
A class for representing hyperparameters that have a discrete list of possible values.
Initialize a CategoricalParameter.
- Parameters
values (list or object) – The possible values for the hyperparameter. This input will be converted into a list of strings.
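The three range types above can be combined in a single hyperparameter_ranges dictionary, as in this sketch (the hyperparameter names and bounds are placeholders; the import is kept inside the function so the sketch reads without the SDK installed):

```python
def example_ranges():
    from sagemaker.tuner import (
        CategoricalParameter,
        ContinuousParameter,
        IntegerParameter,
    )

    return {
        # Logarithmic scaling suits ranges spanning orders of magnitude.
        "learning_rate": ContinuousParameter(0.0001, 0.1,
                                             scaling_type="Logarithmic"),
        "num_layers": IntegerParameter(2, 8),
        # Categorical values are converted to a list of strings.
        "optimizer": CategoricalParameter(["sgd", "adam"]),
    }
```

Such a dictionary is what the HyperparameterTuner constructor expects for its hyperparameter_ranges argument, with the dictionary keys naming the hyperparameters to tune.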
-
as_tuning_range
(name)¶ Represent the parameter range as a dictionary.
It is suitable for a request to create an Amazon SageMaker hyperparameter tuning job.
-
as_json_range
(name)¶ Represent the parameter range as a dictionary.
Dictionary is suitable for a request to create an Amazon SageMaker hyperparameter tuning job using one of the deep learning frameworks.
The deep learning framework images require that hyperparameters be serialized as JSON.
-
is_valid
(value)¶ Placeholder docstring
-
classmethod
cast_to_type
(value)¶ Placeholder docstring