Experiments
Run
- class sagemaker.experiments.Run(experiment_name, run_name=None, experiment_display_name=None, run_display_name=None, tags=None, sagemaker_session=None)
A collection of parameters, metrics, and artifacts used to create an ML model.
Construct a Run instance.
SageMaker Experiments automatically tracks the inputs, parameters, configurations, and results of your iterations as runs. You can assign, group, and organize these runs into experiments. You can also create, compare, and evaluate runs.
The code sample below shows how to initialize a run, log parameters to the Run object, and invoke a training job under the context of this Run object, which automatically passes the run’s experiment_config (including the experiment name, run name, etc.) to the training job.
Note
All log methods (e.g. log_parameter, log_metric, etc.) have to be called within the run context (i.e. the with statement). Otherwise, a RuntimeError is thrown.
    with Run(experiment_name="my-exp", run_name="my-run", ...) as run:
        run.log_parameter(...)
        ...
        estimator.fit(job_name="my-job")  # Create a training job
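The rule in the note above can be illustrated without an AWS account. The following is a minimal sketch, not the SageMaker implementation: `SketchRun` and `try_log_outside_context` are hypothetical names, and the class only mimics the documented behavior that logging outside the `with` statement raises a RuntimeError.

```python
class SketchRun:
    """Hypothetical stand-in for Run; it does not talk to SageMaker."""

    def __init__(self, experiment_name, run_name=None):
        self.experiment_name = experiment_name
        self.run_name = run_name
        self.parameters = {}
        self._inside_with_block = False

    def __enter__(self):
        self._inside_with_block = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._inside_with_block = False
        return False  # never swallow exceptions raised in the block

    def log_parameter(self, name, value):
        if not self._inside_with_block:
            raise RuntimeError("log_parameter must be called inside a `with` block")
        self.parameters[name] = value  # a repeated name overwrites the old value


def log_inside_context():
    """Log the same parameter twice inside the context and return the result."""
    with SketchRun(experiment_name="my-exp", run_name="my-run") as run:
        run.log_parameter("learning_rate", 0.1)
        run.log_parameter("learning_rate", 0.2)
    return run.parameters


def try_log_outside_context():
    """Return the error message produced by logging outside the context."""
    run = SketchRun(experiment_name="my-exp", run_name="my-run")
    try:
        run.log_parameter("learning_rate", 0.1)
    except RuntimeError as err:
        return str(err)
    return None
```

With the real Run class the same pattern applies: keep every log call, and the call that launches the job, indented under the `with` statement.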
To reuse an existing run to log extra data, load_run is recommended. For example, in a job script, use load_run instead of the Run constructor to load the run that was created before the job launch. Otherwise, a new run may be created each time you launch a job.
The code snippet below shows how to load the run initialized above in a custom training job script. No run_name or experiment_name is passed, because they are retrieved automatically from the experiment config in the job environment.
    with load_run(sagemaker_session=sagemaker_session) as run:
        run.log_metric(...)
        ...
- Parameters
experiment_name (str) – The name of the experiment. The name must be unique within an account.
run_name (str) – The name of the run. If it is not specified, one is auto generated.
experiment_display_name (str) – Name of the experiment that will appear in UI, such as SageMaker Studio. (default: None). This display name is used in a create experiment call. If an experiment with the specified name already exists, this display name won’t take effect.
run_display_name (str) – The display name of the run used in UI (default: None). This display name is used in a create run call. If a run with the specified name already exists, this display name won’t take effect.
tags (List[Dict[str, str]]) – A list of tags to be used for all create calls, e.g. to create an experiment, a run group, etc. (default: None).
sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.
- property experiment_config
Get experiment config from run attributes.
- log_parameter(name, value)
Record a single parameter value for this run.
Overwrites any previous value recorded for the specified parameter name.
- log_parameters(parameters)
Record a collection of parameter values for this run.
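Training configurations are often nested, while a collection of parameters is easiest to compare when each value has its own flat name. The helper below is a sketch under that assumption: `flatten_parameters` is a hypothetical function, not part of the SDK, and it assumes `log_parameters` accepts a dict mapping parameter names to values.

```python
def flatten_parameters(config, prefix=""):
    """Flatten a nested dict into {"outer.inner": value} pairs."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections, carrying the dotted prefix along.
            flat.update(flatten_parameters(value, prefix=name))
        else:
            flat[name] = value
    return flat


config = {"optimizer": {"name": "adam", "lr": 0.001}, "epochs": 10}
flat_config = flatten_parameters(config)
```

Inside a run context, the result could then be passed along as `run.log_parameters(flat_config)`.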
- log_metric(name, value, timestamp=None, step=None)
Record a custom scalar metric value for this run.
Note
This method is for manual custom metrics. For automatic metrics, see the enable_sagemaker_metrics parameter on the estimator class.
- Parameters
name (str) – The name of the metric.
value (float) – The value of the metric.
timestamp (datetime.datetime) – The timestamp of the metric. If not specified, the current UTC time will be used.
step (int) – The integer iteration number of the metric value (default: None).
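A common use of `step` is to log one value per epoch so that the metric can be plotted as a series. The sketch below shows that pattern against a local stub so it runs without AWS access. `RecordingRun` and `log_epoch_losses` are hypothetical names; the stub only mirrors the `log_metric(name, value, timestamp=None, step=None)` signature documented above, including the default of the current UTC time.

```python
from datetime import datetime, timezone


class RecordingRun:
    """Hypothetical stub that records log_metric calls instead of sending them."""

    def __init__(self):
        self.metrics = []

    def log_metric(self, name, value, timestamp=None, step=None):
        if timestamp is None:
            # Mirror the documented default: fall back to the current UTC time.
            timestamp = datetime.now(timezone.utc)
        self.metrics.append(
            {"name": name, "value": float(value), "timestamp": timestamp, "step": step}
        )


def log_epoch_losses(run, losses, metric_name="train:loss"):
    """Log one metric value per epoch, using the epoch index as the step."""
    for step, loss in enumerate(losses):
        run.log_metric(name=metric_name, value=loss, step=step)
```

Because `log_epoch_losses` only relies on the `log_metric` method, the same function should work with a real Run object inside its `with` block.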
- log_precision_recall(y_true, predicted_probabilities, positive_label=None, title=None, is_output=True, no_skill=None)
Create and log a precision recall graph artifact for Studio UI to render.
The artifact is stored in S3 and represented as a lineage artifact with an association with the run.
You can view the artifact in the UI. If your job is created by a pipeline execution, you can view the artifact by selecting the corresponding step in the pipelines UI. See also SageMaker Pipelines.
This method requires the sklearn library.
- Parameters
y_true (list or array) – True labels. If labels are not binary then positive_label should be given.
predicted_probabilities (list or array) – Estimated/predicted probabilities.
positive_label (str or int) – Label of the positive class (default: None).
title (str) – Title of the graph (default: None).
is_output (bool) – Determines direction of association to the run. Defaults to True (output artifact). If set to False then represented as input association.
no_skill (int) – The precision threshold under which the classifier cannot discriminate between the classes and would predict a random class or a constant class in all cases (default: None).
- log_roc_curve(y_true, y_score, title=None, is_output=True)
Create and log a receiver operating characteristic (ROC curve) artifact.
The artifact is stored in S3 and represented as a lineage artifact with an association with the run.
You can view the artifact in the UI. If your job is created by a pipeline execution, you can view the artifact by selecting the corresponding step in the pipelines UI. See also SageMaker Pipelines.
This method requires the sklearn library.
- Parameters
y_true (list or array) – True labels. If labels are not binary then positive_label should be given.
y_score (list or array) – Estimated/predicted probabilities.
title (str) – Title of the graph (default: None).
is_output (bool) – Determines direction of association to the run. Defaults to True (output artifact). If set to False then represented as input association.
- log_confusion_matrix(y_true, y_pred, title=None, is_output=True)
Create and log a confusion matrix artifact.
The artifact is stored in S3 and represented as a lineage artifact with an association with the run.
You can view the artifact in the UI. If your job is created by a pipeline execution, you can view the artifact by selecting the corresponding step in the pipelines UI. See also SageMaker Pipelines.
This method requires the sklearn library.
- Parameters
y_true (list or array) – True labels. If labels are not binary then positive_label should be given.
y_pred (list or array) – Predicted labels.
title (str) – Title of the graph (default: None).
is_output (bool) – Determines direction of association to the run. Defaults to True (output artifact). If set to False then represented as input association.
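To make the inputs concrete, the sketch below counts a confusion matrix by hand from `y_true` and `y_pred`. It is an illustration only: the method itself relies on sklearn, and `confusion_matrix_sketch` is a hypothetical helper that uses the common convention of rows for true labels and columns for predicted labels.

```python
def confusion_matrix_sketch(y_true, y_pred):
    """Return (labels, matrix) where matrix[i][j] counts true i predicted as j."""
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: position for position, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return labels, matrix


labels, matrix = confusion_matrix_sketch(
    y_true=["cat", "dog", "cat", "dog"],
    y_pred=["cat", "cat", "cat", "dog"],
)
```

Here both cats are classified correctly, while one of the two dogs is predicted as a cat.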
- log_artifact(name, value, media_type=None, is_output=True)
Record a single artifact for this run.
Overwrites any previous value recorded for the specified name.
- Parameters
- log_file(file_path, name=None, media_type=None, is_output=True)
Upload a file to S3 and store it as an input/output artifact in this run.
- Parameters
file_path (str) – The path of the local file to upload.
name (str) – The name of the artifact (default: None).
media_type (str) – The MediaType (MIME type) of the file. If not specified, this library will attempt to infer the media type from the file extension of file_path.
is_output (bool) – Determines direction of association to the run. Defaults to True (output artifact). If set to False then represented as input association.
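When the inferred media type matters, it can be resolved up front and passed explicitly as `media_type`. The sketch below uses the standard library's `mimetypes` module as a stand-in; whether the SDK infers types the same way is an assumption, and both `guess_media_type` and its `application/octet-stream` fallback are hypothetical choices.

```python
import mimetypes


def guess_media_type(file_path, default="application/octet-stream"):
    """Guess a MIME type from the file extension, with a generic fallback."""
    media_type, _encoding = mimetypes.guess_type(file_path)
    return media_type or default


json_type = guess_media_type("metrics/report.json")
png_type = guess_media_type("plots/roc.png")
unknown_type = guess_media_type("model.unknownext123")
```

The result could then be supplied in a call such as `run.log_file(file_path, media_type=guess_media_type(file_path))`.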
- close()
Persist any data saved locally.
- run.load_run(run_name=None, experiment_name=None, sagemaker_session=None)
Load an existing run.
To reuse an existing run to log extra data, load_run is recommended. It can be used in several ways:
1. Use load_run by explicitly passing in run_name and experiment_name.
If run_name and experiment_name are passed in, they are honored over the default experiment config in the job environment or the run context (i.e. within the with block).
Note
Both run_name and experiment_name should be supplied to make this usage work. Otherwise, you may get a ValueError.
    with load_run(experiment_name="my-exp", run_name="my-run") as run:
        run.log_metric(...)
        ...
2. Use load_run in a job script without supplying run_name and experiment_name.
In this case, the default experiment config (specified when creating the job) is fetched from the job environment to load the run.
    # In a job script
    with load_run() as run:
        run.log_metric(...)
        ...
3. Use load_run in a notebook within a run context (i.e. the with block) but without supplying run_name and experiment_name.
Every time we call with Run(...) as run1:, the initialized run1 is tracked in the run context. Then, when we call load_run() under this with statement, the run1 in the context is loaded by default.
    # In a notebook
    with Run(experiment_name="my-exp", run_name="my-run", ...) as run1:
        run1.log_parameter(...)

        with load_run() as run2:  # run2 is the same object as run1
            run2.log_metric(...)
            ...
- Parameters
run_name (str) – The name of the run to be loaded (default: None). If it is None, the RunName in the ExperimentConfig of the job will be fetched to load the run.
experiment_name (str) – The name of the experiment that the run to be loaded is associated with (default: None). Note: the experiment_name must be supplied along with a valid run_name. Otherwise, it will be ignored.
sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.
- Returns
The loaded Run object.
- Return type
Run
- experiments.list_runs(experiment_name, created_before=None, created_after=None, sagemaker_session=None, max_results=None, next_token=None, sort_by=<SortByType.CREATION_TIME: 'CreationTime'>, sort_order=<SortOrderType.DESCENDING: 'Descending'>)
Return a list of Run objects matching the given criteria.
- Parameters
experiment_name (str) – Only Run objects related to the specified experiment are returned.
created_before (datetime.datetime) – Return Run objects created before this instant (default: None).
created_after (datetime.datetime) – Return Run objects created after this instant (default: None).
sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.
max_results (int) – Maximum number of Run objects to retrieve (default: None).
next_token (str) – Token for next page of results (default: None).
sort_by (SortByType) – The property to sort results by. One of NAME, CREATION_TIME (default: CREATION_TIME).
sort_order (SortOrderType) – One of ASCENDING or DESCENDING (default: DESCENDING).
- Returns
A list of Run objects.
- Return type
List[Run]
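The `created_after` and `created_before` filters take `datetime.datetime` values, so a rolling time window can be built with the standard library. The helper below is a sketch: `creation_window` is a hypothetical function, and using timezone-aware UTC datetimes is a choice made here rather than a documented requirement.

```python
from datetime import datetime, timedelta, timezone


def creation_window(days, now=None):
    """Build created_after/created_before filters covering the last `days` days."""
    now = now or datetime.now(timezone.utc)
    return {"created_after": now - timedelta(days=days), "created_before": now}


fixed_now = datetime(2024, 1, 8, tzinfo=timezone.utc)
window = creation_window(7, now=fixed_now)
```

The resulting dict could be unpacked into a call such as `list_runs(experiment_name="my-exp", **creation_window(7))`.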