Analytics

class sagemaker.analytics.AnalyticsMetricsBase

Bases: object

Base class for tuning job or training job analytics classes.

Implements common functionality such as persistence and caching.

Initializes AnalyticsMetricsBase instance.

export_csv(filename)

Persists the analytics dataframe to a file.

Parameters

filename (str) – The name of the file to save to.

dataframe(force_refresh=False)

A pandas dataframe summarizing the results for this object.

Created by calling SageMaker List and Describe APIs and converting them into a convenient tabular summary.

Parameters

force_refresh (bool) – Set to True to fetch the latest data from the SageMaker API.

clear_cache()

Clear the object of all local caches of API methods, so that the next time any properties are accessed they are refreshed from the service.
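
The caching contract described above (`dataframe`, `force_refresh`, `clear_cache`, `export_csv`) can be illustrated with a small stand-in class. This is a minimal sketch, not the library's implementation: `CachedTable`, its `fetch` callable, and the use of a list of dicts in place of a pandas dataframe are all assumptions made to keep the example self-contained.

```python
import csv


class CachedTable:
    """Hypothetical stand-in that mimics the caching contract described above."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable standing in for the service calls
        self._rows = None    # local cache; None means nothing has been fetched yet

    def clear_cache(self):
        """Drop the cached result so the next access fetches again."""
        self._rows = None

    def dataframe(self, force_refresh=False):
        """Return the cached rows, fetching first if asked to or if nothing is cached."""
        if force_refresh:
            self.clear_cache()
        if self._rows is None:
            self._rows = self._fetch()
        return self._rows

    def export_csv(self, filename):
        """Write the current rows to a CSV file."""
        rows = self.dataframe()
        fieldnames = list(rows[0]) if rows else []
        with open(filename, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)


calls = []


def fake_fetch():
    """Record each call so cache hits and misses can be counted."""
    calls.append(1)
    return [{"job": "a", "value": 0.1}]


table = CachedTable(fake_fetch)
table.dataframe()                    # first access fetches
table.dataframe()                    # served from the local cache
table.dataframe(force_refresh=True)  # fetches again
```

In this sketch, `force_refresh=True` is simply a `clear_cache()` followed by a normal access, which is one straightforward way to satisfy the behavior documented here.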

class sagemaker.analytics.HyperparameterTuningJobAnalytics(hyperparameter_tuning_job_name, sagemaker_session=None)

Bases: AnalyticsMetricsBase

Fetch results about a hyperparameter tuning job and make them accessible for analytics.

Initialize a HyperparameterTuningJobAnalytics instance.

Parameters
  • hyperparameter_tuning_job_name (str) – name of the HyperparameterTuningJob to analyze.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

property name

Name of the HyperparameterTuningJob being analyzed.

clear_cache()

Clear the object of all local caches of API methods.

property tuning_ranges

A dictionary describing the ranges of all tuned hyperparameters.

The keys are the names of the hyperparameters, and the values are the ranges.

The output can take one of two forms:

  • If the ‘TrainingJobDefinition’ field is present in the job description, the output is a dictionary constructed from ‘ParameterRanges’ in ‘HyperParameterTuningJobConfig’ of the job description. The keys are the parameter names, while the values are the parameter ranges. Example:

    >>> {
    >>>     "eta": {"MaxValue": "1", "MinValue": "0", "Name": "eta"},
    >>>     "gamma": {"MaxValue": "10", "MinValue": "0", "Name": "gamma"},
    >>>     "iterations": {"MaxValue": "100", "MinValue": "50", "Name": "iterations"},
    >>>     "num_layers": {"MaxValue": "30", "MinValue": "5", "Name": "num_layers"},
    >>> }

  • If the ‘TrainingJobDefinitions’ field (list) is present in the job description, the output is a dictionary whose keys are the ‘DefinitionName’ values from all items in ‘TrainingJobDefinitions’. Each value is a dictionary constructed from ‘HyperParameterRanges’ in the corresponding item, in the same format as above. Example:

    >>> {
    >>>     "estimator_1": {
    >>>         "eta": {"MaxValue": "1", "MinValue": "0", "Name": "eta"},
    >>>         "gamma": {"MaxValue": "10", "MinValue": "0", "Name": "gamma"},
    >>>     },
    >>>     "estimator_2": {
    >>>         "framework": {"Values": ["TF", "MXNet"], "Name": "framework"},
    >>>         "gamma": {"MaxValue": "1.0", "MinValue": "0.2", "Name": "gamma"}
    >>>     }
    >>> }

For more details about the ‘TrainingJobDefinition’ and ‘TrainingJobDefinitions’ fields in job description, see https://botocore.readthedocs.io/en/latest/reference/services/sagemaker.html#SageMaker.Client.create_hyper_parameter_tuning_job
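
The two output forms can be sketched as a plain function over a job description dictionary. This is an illustrative sketch rather than the property's actual code: the helper names `flatten_ranges` and `tuning_ranges_from` are hypothetical, and it assumes that the ranges are grouped into typed lists (for example ‘ContinuousParameterRanges’ and ‘IntegerParameterRanges’) whose items each carry a ‘Name’.

```python
def flatten_ranges(parameter_ranges):
    """Key every range by its 'Name', regardless of which typed list it came from."""
    flat = {}
    for ranges in parameter_ranges.values():
        for item in ranges:
            flat[item["Name"]] = item
    return flat


def tuning_ranges_from(description):
    """Build the dictionary in either of the two shapes described above."""
    if "TrainingJobDefinition" in description:
        config = description["HyperParameterTuningJobConfig"]
        return flatten_ranges(config["ParameterRanges"])
    return {
        definition["DefinitionName"]: flatten_ranges(definition["HyperParameterRanges"])
        for definition in description["TrainingJobDefinitions"]
    }


# Hand-written descriptions (assumed shapes, not real service output).
single = {
    "TrainingJobDefinition": {},
    "HyperParameterTuningJobConfig": {
        "ParameterRanges": {
            "ContinuousParameterRanges": [
                {"Name": "eta", "MinValue": "0", "MaxValue": "1"}
            ],
            "IntegerParameterRanges": [
                {"Name": "num_layers", "MinValue": "5", "MaxValue": "30"}
            ],
        }
    },
}
multi = {
    "TrainingJobDefinitions": [
        {
            "DefinitionName": "estimator_1",
            "HyperParameterRanges": {
                "ContinuousParameterRanges": [
                    {"Name": "eta", "MinValue": "0", "MaxValue": "1"}
                ]
            },
        }
    ]
}

single_ranges = tuning_ranges_from(single)
multi_ranges = tuning_ranges_from(multi)
```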

description(force_refresh=False)

Call DescribeHyperParameterTuningJob for the hyperparameter tuning job.

Parameters

force_refresh (bool) – Set to True to fetch the latest data from the SageMaker API.

Returns

The Amazon SageMaker response for DescribeHyperParameterTuningJob.

Return type

dict

training_job_summaries(force_refresh=False)

A (paginated) list of everything from ListTrainingJobsForTuningJob.

Parameters

force_refresh (bool) – Set to True to fetch the latest data from the SageMaker API.

Returns

The Amazon SageMaker response for ListTrainingJobsForTuningJob.

Return type

dict
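
Collecting a paginated listing such as ListTrainingJobsForTuningJob typically means following ‘NextToken’ until the service stops returning one. The sketch below shows that loop against a fake page function; `list_all_summaries`, `list_page`, and `fake_list_page` are hypothetical names, and this is not necessarily how the method is implemented internally.

```python
def list_all_summaries(list_page):
    """Collect 'TrainingJobSummaries' from every page until no 'NextToken' is returned."""
    summaries = []
    token = None
    while True:
        # Only pass NextToken once the service has handed one back.
        kwargs = {"NextToken": token} if token else {}
        page = list_page(**kwargs)
        summaries.extend(page["TrainingJobSummaries"])
        token = page.get("NextToken")
        if not token:
            return summaries


def fake_list_page(NextToken=None):
    """Stand-in for the service call: two pages linked by a token."""
    pages = {
        None: {"TrainingJobSummaries": [{"TrainingJobName": "job-1"}], "NextToken": "t1"},
        "t1": {"TrainingJobSummaries": [{"TrainingJobName": "job-2"}]},
    }
    return pages[NextToken]


names = [s["TrainingJobName"] for s in list_all_summaries(fake_list_page)]
```

With a real boto3 SageMaker client, `list_page` could be something like `functools.partial(client.list_training_jobs_for_tuning_job, HyperParameterTuningJobName=...)`, assuming valid credentials and an existing tuning job.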

class sagemaker.analytics.TrainingJobAnalytics(training_job_name, metric_names=None, sagemaker_session=None, start_time=None, end_time=None, period=None)

Bases: AnalyticsMetricsBase

Fetch training curve data from CloudWatch Metrics for a specific training job.

Initialize a TrainingJobAnalytics instance.

Parameters
  • training_job_name (str) – name of the TrainingJob to analyze.

  • metric_names (list, optional) – string names of all the metrics to collect for this training job. If not specified, then it will use all metric names configured for this job.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • start_time

  • end_time

  • period

CLOUDWATCH_NAMESPACE = '/aws/sagemaker/TrainingJobs'

property name

Name of the TrainingJob being analyzed.

clear_cache()

Clear the object of all local caches of API methods.

This is so that the next time any properties are accessed they will be refreshed from the service.

class sagemaker.analytics.ExperimentAnalytics(experiment_name=None, search_expression=None, sort_by=None, sort_order=None, metric_names=None, parameter_names=None, sagemaker_session=None, input_artifact_names=None, output_artifact_names=None)

Bases: AnalyticsMetricsBase

Fetch trial component data and make it accessible for analytics.

Initialize an ExperimentAnalytics instance.

Parameters
  • experiment_name (str, optional) – Name of the experiment if you want to constrain the search to only trial components belonging to an experiment.

  • search_expression (dict, optional) – The search query to find the set of trial components to use to populate the data frame.

  • sort_by (str, optional) – The name of the resource property used to sort the set of trial components.

  • sort_order (str, optional) – How trial components are ordered. Valid values are Ascending and Descending. The default is Descending.

  • metric_names (list, optional) – string names of all the metrics to be shown in the data frame. If not specified, all metrics of all trials are shown.

  • parameter_names (list, optional) – string names of the parameters to be shown in the data frame. If not specified, all parameters of all trials are shown.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • input_artifact_names (dict, optional) – The input artifacts for the experiment. Examples of input artifacts are datasets, algorithms, hyperparameters, source code, and instance types.

  • output_artifact_names (dict, optional) – The output artifacts for the experiment. Examples of output artifacts are metrics, snapshots, logs, and images.
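
A search_expression is a plain dictionary in the format used by the SageMaker Search API. The sketch below builds one that keeps only completed trial components; the helper `equals_filter` is hypothetical, and the experiment name in the trailing comment is a placeholder.

```python
def equals_filter(name, value):
    """Build one filter entry; Search API filter values are passed as strings."""
    return {"Name": name, "Operator": "Equals", "Value": str(value)}


# Keep only trial components whose primary status is "Completed".
search_expression = {
    "Filters": [equals_filter("Status.PrimaryStatus", "Completed")]
}

# Passing it on would then look like this (requires AWS credentials, so it is
# left as a comment here):
#   ExperimentAnalytics(
#       experiment_name="my-experiment",  # placeholder name
#       search_expression=search_expression,
#   )
```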

MAX_TRIAL_COMPONENTS = 10000

property name

Name of the Experiment being analyzed.

clear_cache()

Clear the object of all local caches of API methods.