Debugger

Amazon SageMaker Debugger provides full visibility into training jobs of state-of-the-art machine learning models. This SageMaker Debugger module provides high-level methods to set up Debugger configurations to monitor, profile, and debug your training job. Configure the Debugger-specific parameters when constructing a SageMaker estimator to gain visibility and insights into your training job.

sagemaker.debugger.get_rule_container_image_uri(region)

Return the Debugger rule image URI for the given AWS Region.

For a full list of rule image URIs, see Use Debugger Docker Images for Built-in or Custom Rules.

Parameters

region (str) – The name of the AWS Region. For example, 'us-east-1'.

Returns

Formatted image URI for the given AWS Region and the rule container type.

Return type

str

sagemaker.debugger.get_default_profiler_rule()

Return the default built-in profiler rule with a unique name.

Returns

The instance of the built-in ProfilerRule.

Return type

sagemaker.debugger.ProfilerRule

class sagemaker.debugger.rule_configs

A helper module to configure the SageMaker Debugger built-in rules with the Rule classmethods and the ProfilerRule classmethods.

For a full list of built-in rules, see List of Debugger Built-in Rules.

This module is imported from the Debugger client library for rule configuration. For more information, see Amazon SageMaker Debugger RulesConfig.

class sagemaker.debugger.RuleBase(name, image_uri, instance_type, container_local_output_path, s3_output_path, volume_size_in_gb, rule_parameters)

Bases: abc.ABC

The SageMaker Debugger rule base class that cannot be instantiated directly.

Tip

Debugger rule classes inheriting this RuleBase class are Rule and ProfilerRule. Do not directly use the rule base class to instantiate a SageMaker Debugger rule. Use the Rule classmethods for debugging and the ProfilerRule classmethods for profiling.

name

The name of the rule.

Type

str

image_uri

The image URI to use for the rule.

Type

str

instance_type

Type of EC2 instance to use. For example, ‘ml.c4.xlarge’.

Type

str

container_local_output_path

The local path to store the Rule output.

Type

str

s3_output_path

The location in S3 to store the output.

Type

str

volume_size_in_gb

Size in GB of the EBS volume to use for storing data.

Type

int

rule_parameters

A dictionary of parameters for the rule.

Type

dict

Method generated by attrs for class RuleBase.

class sagemaker.debugger.Rule(name, image_uri, instance_type, container_local_output_path, s3_output_path, volume_size_in_gb, rule_parameters, collections_to_save, actions=None)

Bases: sagemaker.debugger.debugger.RuleBase

The SageMaker Debugger Rule class configures debugging rules to debug your training job.

The debugging rules analyze tensor outputs from your training job and monitor conditions that are critical for the success of the training job.

SageMaker Debugger comes pre-packaged with built-in debugging rules. For example, the debugging rules can detect whether gradients are getting too large or too small, or if a model is overfitting. For a full list of built-in rules for debugging, see List of Debugger Built-in Rules. You can also write your own rules using the custom rule classmethod.

Configure the debugging rules using the following classmethods.

Tip

Use the following Rule.sagemaker class method for built-in debugging rules or the Rule.custom class method for custom debugging rules. Do not directly use the Rule initialization method.

classmethod sagemaker(base_config, name=None, container_local_output_path=None, s3_output_path=None, other_trials_s3_input_paths=None, rule_parameters=None, collections_to_save=None, actions=None)

Initialize a Rule object for a built-in debugging rule.

Parameters
  • base_config (dict) –

    Required. This is the base rule config dictionary returned from the rule_configs method. For example, rule_configs.dead_relu(). For a full list of built-in rules for debugging, see List of Debugger Built-in Rules.

  • name (str) – Optional. The name of the debugger rule. If one is not provided, the name of the base_config will be used.

  • container_local_output_path (str) – Optional. The local path in the rule processing container.

  • s3_output_path (str) – Optional. The location in Amazon S3 to store the output tensors. The default Debugger output path for debugging data is created under the default output path of the Estimator class. For example, s3://sagemaker-<region>-<12digit_account_id>/<training-job-name>/debug-output/.

  • other_trials_s3_input_paths ([str]) – Optional. The Amazon S3 input paths of other trials to use the SimilarAcrossRuns rule.

  • rule_parameters (dict) – Optional. A dictionary of parameters for the rule.

  • collections_to_save ([sagemaker.debugger.CollectionConfig]) – Optional. A list of CollectionConfig objects to be saved.

Returns

An instance of the built-in rule.

Return type

Rule

Example of how to create a built-in rule instance:

from sagemaker.debugger import Rule, rule_configs

built_in_rules = [
    Rule.sagemaker(rule_configs.built_in_rule_name_in_pysdk_format_1()),
    Rule.sagemaker(rule_configs.built_in_rule_name_in_pysdk_format_2()),
    # ...
    Rule.sagemaker(rule_configs.built_in_rule_name_in_pysdk_format_n())
]

You need to replace the built_in_rule_name_in_pysdk_format_* with the names of built-in rules. You can find the rule names at List of Debugger Built-in Rules.

Example of creating a built-in rule instance with adjusted parameter values:

from sagemaker.debugger import CollectionConfig, Rule, rule_configs

built_in_rules = [
    Rule.sagemaker(
        base_config=rule_configs.built_in_rule_name_in_pysdk_format(),
        rule_parameters={
            "key": "value"
        },
        collections_to_save=[
            CollectionConfig(
                name="tensor_collection_name",
                parameters={
                    "key": "value"
                }
            )
        ]
    )
]

For more information about setting up the rule_parameters parameter, see List of Debugger Built-in Rules.

For more information about setting up the collections_to_save parameter, see the CollectionConfig class.

classmethod custom(name, image_uri, instance_type, volume_size_in_gb, source=None, rule_to_invoke=None, container_local_output_path=None, s3_output_path=None, other_trials_s3_input_paths=None, rule_parameters=None, collections_to_save=None, actions=None)

Initialize a Rule object for a custom debugging rule.

You can create a custom rule that analyzes tensors emitted during the training of a model and monitors conditions that are critical for the success of a training job. For more information, see Create Debugger Custom Rules for Training Job Analysis.

Parameters
  • name (str) – Required. The name of the debugger rule.

  • image_uri (str) – Required. The URI of the image to be used by the debugger rule.

  • instance_type (str) – Required. Type of EC2 instance to use, for example, ‘ml.c4.xlarge’.

  • volume_size_in_gb (int) – Required. Size in GB of the EBS volume to use for storing data.

  • source (str) – Optional. A source file containing a rule to invoke. If provided, you must also provide rule_to_invoke. This can either be an S3 uri or a local path.

  • rule_to_invoke (str) – Optional. The name of the rule to invoke within the source. If provided, you must also provide source.

  • container_local_output_path (str) – Optional. The local path in the container.

  • s3_output_path (str) – Optional. The location in Amazon S3 to store the output tensors. The default Debugger output path for debugging data is created under the default output path of the Estimator class. For example, s3://sagemaker-<region>-<12digit_account_id>/<training-job-name>/debug-output/.

  • other_trials_s3_input_paths ([str]) – Optional. The Amazon S3 input paths of other trials to use the SimilarAcrossRuns rule.

  • rule_parameters (dict) – Optional. A dictionary of parameters for the rule.

  • collections_to_save ([sagemaker.debugger.CollectionConfig]) – Optional. A list of CollectionConfig objects to be saved.

Returns

The instance of the custom rule.

Return type

Rule

prepare_actions(training_job_name)

Prepare actions for Debugger Rule.

Parameters

training_job_name (str) – The training job name. To be set as the default training job prefix for the StopTraining action if it is specified.

to_debugger_rule_config_dict()

Generates a request dictionary using the parameters provided when initializing the object.

Returns

A portion of an API request as a dictionary.

Return type

dict

class sagemaker.debugger.ProfilerRule(name, image_uri, instance_type, container_local_output_path, s3_output_path, volume_size_in_gb, rule_parameters)

Bases: sagemaker.debugger.debugger.RuleBase

The SageMaker Debugger ProfilerRule class configures profiling rules.

SageMaker Debugger profiling rules automatically analyze hardware system resource utilization and framework metrics of a training job to identify performance bottlenecks.

SageMaker Debugger comes pre-packaged with built-in profiling rules. For example, the profiling rules can detect if GPUs are underutilized due to CPU bottlenecks or IO bottlenecks. For a full list of built-in rules for debugging, see List of Debugger Built-in Rules. You can also write your own profiling rules using the Amazon SageMaker Debugger APIs.

Tip

Use the following ProfilerRule.sagemaker class method for built-in profiling rules or the ProfilerRule.custom class method for custom profiling rules. Do not directly use the ProfilerRule initialization method.

classmethod sagemaker(base_config, name=None, container_local_output_path=None, s3_output_path=None)

Initialize a ProfilerRule object for a built-in profiling rule.

The rule analyzes system and framework metrics of a given training job to identify performance bottlenecks.

Parameters
  • base_config (rule_configs.ProfilerRule) –

    The base rule configuration object returned from the rule_configs method. For example, ‘rule_configs.ProfilerReport()’. For a full list of built-in rules for debugging, see List of Debugger Built-in Rules.

  • name (str) – The name of the profiler rule. If one is not provided, the name of the base_config will be used.

  • container_local_output_path (str) – The path in the container.

  • s3_output_path (str) – The location in Amazon S3 to store the profiling output data. The default Debugger output path for profiling data is created under the default output path of the Estimator class. For example, s3://sagemaker-<region>-<12digit_account_id>/<training-job-name>/profiler-output/.

Returns

The instance of the built-in ProfilerRule.

Return type

ProfilerRule

classmethod custom(name, image_uri, instance_type, volume_size_in_gb, source=None, rule_to_invoke=None, container_local_output_path=None, s3_output_path=None, rule_parameters=None)

Initialize a ProfilerRule object for a custom profiling rule.

You can create a rule that analyzes system and framework metrics emitted during the training of a model and monitors conditions that are critical for the success of a training job.

Parameters
  • name (str) – The name of the profiler rule.

  • image_uri (str) – The URI of the image to be used by the profiler rule.

  • instance_type (str) – Type of EC2 instance to use, for example, ‘ml.c4.xlarge’.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data.

  • source (str) – A source file containing a rule to invoke. If provided, you must also provide rule_to_invoke. This can either be an S3 uri or a local path.

  • rule_to_invoke (str) – The name of the rule to invoke within the source. If provided, you must also provide the source.

  • container_local_output_path (str) – The path in the container.

  • s3_output_path (str) – The location in Amazon S3 to store the output. The default Debugger output path for profiling data is created under the default output path of the Estimator class. For example, s3://sagemaker-<region>-<12digit_account_id>/<training-job-name>/profiler-output/.

  • rule_parameters (dict) – A dictionary of parameters for the rule.

Returns

The instance of the custom ProfilerRule.

Return type

ProfilerRule

to_profiler_rule_config_dict()

Generates a request dictionary using the parameters provided when initializing the object.

Returns

A portion of an API request as a dictionary.

Return type

dict

class sagemaker.debugger.CollectionConfig(name, parameters=None)

Bases: object

Creates tensor collections for SageMaker Debugger.

Constructor for collection configuration.

Parameters
  • name (str) – Required. The name of the collection configuration.

  • parameters (dict) – Optional. The parameters for the collection configuration.

Example of creating a CollectionConfig object:

from sagemaker.debugger import CollectionConfig

collection_configs=[
    CollectionConfig(name="tensor_collection_1"),
    CollectionConfig(name="tensor_collection_2"),
    # ...
    CollectionConfig(name="tensor_collection_n")
]

For a full list of Debugger built-in collections, see Debugger Built in Collections.

Example of creating a CollectionConfig object with parameter adjustment:

You can use the following CollectionConfig template in two ways: (1) to adjust the parameters of the built-in tensor collections, and (2) to create custom tensor collections.

If you pass a built-in collection name to the name parameter, CollectionConfig matches it to that built-in collection and adjusts its parameters. If you pass a new name to the name parameter, CollectionConfig creates a new tensor collection, and you must use the include_regex parameter to specify regex patterns for the tensors you want to collect.

from sagemaker.debugger import CollectionConfig

collection_configs=[
    CollectionConfig(
        name="tensor_collection",
        parameters={
            "key_1": "value_1",
            "key_2": "value_2",
            # ...
            "key_n": "value_n"
        }
    )
]
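As a rough illustration of what include_regex does for a custom collection, the following plain-Python sketch filters tensor names against a list of patterns. It does not use the SageMaker SDK. The helper name select_tensors, the sample tensor names, and the use of re.search (rather than a full match) are all assumptions made for this example.

```python
import re


def select_tensors(tensor_names, include_regex):
    """Return the tensor names that match at least one pattern.

    Hypothetical helper: it only models the idea that tensors whose
    names match the include_regex patterns are the ones collected.
    """
    patterns = [re.compile(pattern) for pattern in include_regex]
    return [
        name for name in tensor_names
        if any(pattern.search(name) for pattern in patterns)
    ]


# Made-up tensor names, for illustration only.
names = ["conv1/weight", "conv1/bias", "fc/weight", "loss"]
selected = select_tensors(names, [r"weight$", r"^loss$"])
print(selected)  # ['conv1/weight', 'fc/weight', 'loss']
```

A stricter pattern list collects fewer tensors, which is usually the point of defining a custom collection.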

The following list shows the available CollectionConfig parameters.

Parameter key – Description

  • include_regex – Specify a list of regex patterns of tensors to save. Tensors whose names match these patterns will be saved.

  • save_histogram – Set to True to save histogram output data for TensorFlow visualization.

  • reductions – Specify certain reduction values of tensors. This helps reduce the amount of data saved and increase training speed. Available values are min, max, median, mean, std, variance, sum, and prod.

  • save_interval, train.save_interval, eval.save_interval, predict.save_interval, global.save_interval – Specify how often to save tensors in steps. You can also specify the save intervals in TRAIN, EVAL, PREDICT, and GLOBAL modes. The default value is 500 steps.

  • save_steps, train.save_steps, eval.save_steps, predict.save_steps, global.save_steps – Specify the exact step numbers to save tensors. You can also specify the save steps in TRAIN, EVAL, PREDICT, and GLOBAL modes.

  • start_step, train.start_step, eval.start_step, predict.start_step, global.start_step – Specify the exact start step to save tensors. You can also specify the start steps in TRAIN, EVAL, PREDICT, and GLOBAL modes.

  • end_step, train.end_step, eval.end_step, predict.end_step, global.end_step – Specify the exact end step to save tensors. You can also specify the end steps in TRAIN, EVAL, PREDICT, and GLOBAL modes.

For example, the following code shows how to control the save_interval parameters of the built-in losses tensor collection. With the following collection configuration, Debugger collects loss values every 100 steps from training loops and every 10 steps from evaluation loops.

collection_configs=[
    CollectionConfig(
        name="losses",
        parameters={
            "train.save_interval": "100",
            "eval.save_interval": "10"
        }
    )
]
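To make the effect of the two intervals concrete, here is a small plain-Python model of the configuration above. It is a sketch, not SDK code. The helper names, the lookup order (mode-specific key, then plain key, then the 500-step default), and the rule that a step is saved when it is divisible by the interval are assumptions made for illustration.

```python
DEFAULT_SAVE_INTERVAL = 500  # default stated in the parameter list


def interval_for(mode, parameters):
    """Look up the save interval for a mode such as "train" or "eval".

    Assumption: a mode-specific key takes precedence over the plain
    "save_interval" key, which takes precedence over the default.
    """
    raw = parameters.get(
        f"{mode}.save_interval",
        parameters.get("save_interval", DEFAULT_SAVE_INTERVAL),
    )
    return int(raw)  # the example passes the values as strings


def saved_steps(total_steps, interval):
    """Assumption: steps count from 0 and are saved when step % interval == 0."""
    return [step for step in range(total_steps) if step % interval == 0]


losses_parameters = {"train.save_interval": "100", "eval.save_interval": "10"}
print(saved_steps(301, interval_for("train", losses_parameters)))  # [0, 100, 200, 300]
print(saved_steps(31, interval_for("eval", losses_parameters)))    # [0, 10, 20, 30]
```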

class sagemaker.debugger.DebuggerHookConfig(s3_output_path=None, container_local_output_path=None, hook_parameters=None, collection_configs=None)

Bases: object

Create a Debugger hook configuration object to save tensors for debugging.

DebuggerHookConfig provides options to customize how debugging information is emitted and saved. This high-level DebuggerHookConfig class runs based on the smdebug.SaveConfig class.

Initialize the DebuggerHookConfig instance.

Parameters
  • s3_output_path (str) – Optional. The location in Amazon S3 to store the output tensors. The default Debugger output path is created under the default output path of the Estimator class. For example, s3://sagemaker-<region>-<12digit_account_id>/<training-job-name>/debug-output/.

  • container_local_output_path (str) – Optional. The local path in the container.

  • hook_parameters (dict) – Optional. A dictionary of parameters.

  • collection_configs ([sagemaker.debugger.CollectionConfig]) – Required. A list of CollectionConfig objects to be saved at the s3_output_path.
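The default output location follows the pattern shown for s3_output_path. The sketch below only formats that pattern in plain Python so the layout is easy to see. The helper default_debugger_output_path and its kind argument are hypothetical and are not part of the SDK; the region, account ID, and job name are made-up sample values.

```python
def default_debugger_output_path(region, account_id, training_job_name, kind="debug"):
    """Format the documented default S3 path for Debugger output.

    Hypothetical helper for illustration. kind selects the suffix:
    "debug" gives debug-output, "profiler" gives profiler-output.
    """
    if len(account_id) != 12 or not account_id.isdigit():
        raise ValueError("account_id must be a 12-digit string")
    suffix = {"debug": "debug-output", "profiler": "profiler-output"}[kind]
    return f"s3://sagemaker-{region}-{account_id}/{training_job_name}/{suffix}/"


# Sample values only.
path = default_debugger_output_path("us-east-1", "123456789012", "my-training-job")
print(path)  # s3://sagemaker-us-east-1-123456789012/my-training-job/debug-output/
```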

Example of creating a DebuggerHookConfig object:

from sagemaker.debugger import CollectionConfig, DebuggerHookConfig

collection_configs=[
    CollectionConfig(name="tensor_collection_1"),
    CollectionConfig(name="tensor_collection_2"),
    # ...
    CollectionConfig(name="tensor_collection_n")
]

hook_config = DebuggerHookConfig(
    collection_configs=collection_configs
)

class sagemaker.debugger.TensorBoardOutputConfig(s3_output_path, container_local_output_path=None)

Bases: object

Create a tensor output configuration object for debugging visualizations on TensorBoard.

Initialize the TensorBoardOutputConfig instance.

Parameters
  • s3_output_path (str) – Required. The location in Amazon S3 to store the output.

  • container_local_output_path (str) – Optional. The local path in the container.

class sagemaker.debugger.ProfilerConfig(s3_output_path=None, system_monitor_interval_millis=None, framework_profile_params=None)

Bases: object

Configuration for collecting system and framework metrics of SageMaker training jobs.

SageMaker Debugger collects system and framework profiling information of training jobs and identifies performance bottlenecks.

Initialize a ProfilerConfig instance.

Pass the output of this class to the profiler_config parameter of the generic Estimator class and SageMaker Framework estimators.

Parameters
  • s3_output_path (str) – The location in Amazon S3 to store the output. The default Debugger output path for profiling data is created under the default output path of the Estimator class. For example, s3://sagemaker-<region>-<12digit_account_id>/<training-job-name>/profiler-output/.

  • system_monitor_interval_millis (int) – The time interval in milliseconds to collect system metrics. Available values are 100, 200, 500, 1000 (1 second), 5000 (5 seconds), and 60000 (1 minute) milliseconds. The default is 500 milliseconds.

  • framework_profile_params (FrameworkProfile) – A parameter object for framework metrics profiling. Configure it using the FrameworkProfile class. To use the default framework profile parameters, pass FrameworkProfile(). For more information about the default values, see FrameworkProfile.
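The allowed values for system_monitor_interval_millis can be checked before a configuration is built. The following is a minimal plain-Python sketch of such a check, using only the values and the default listed above. The helper name check_monitor_interval and the choice of ValueError are assumptions for this example and do not reflect how the SDK reports invalid values.

```python
ALLOWED_INTERVALS_MS = (100, 200, 500, 1000, 5000, 60000)
DEFAULT_INTERVAL_MS = 500


def check_monitor_interval(interval_ms=None):
    """Return a usable monitoring interval in milliseconds.

    Hypothetical helper: falls back to the documented default when no
    value is given and rejects values outside the documented list.
    """
    if interval_ms is None:
        return DEFAULT_INTERVAL_MS
    if interval_ms not in ALLOWED_INTERVALS_MS:
        raise ValueError(
            f"{interval_ms} is not one of {ALLOWED_INTERVALS_MS}"
        )
    return interval_ms


print(check_monitor_interval())      # 500
print(check_monitor_interval(5000))  # 5000
```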

Example: The following example shows the basic profiler_config parameter configuration, enabling system monitoring every 5000 milliseconds and framework profiling with default parameter values.

from sagemaker.debugger import ProfilerConfig, FrameworkProfile

profiler_config = ProfilerConfig(
    system_monitor_interval_millis=5000,
    framework_profile_params=FrameworkProfile()
)

class sagemaker.debugger.FrameworkProfile(local_path='/opt/ml/output/profiler', file_max_size=10485760, file_close_interval=60, file_open_fail_threshold=50, detailed_profiling_config=None, dataloader_profiling_config=None, python_profiling_config=None, horovod_profiling_config=None, smdataparallel_profiling_config=None, start_step=None, num_steps=None, start_unix_time=None, duration=None)

Bases: object

Sets up the profiling configuration for framework metrics.

Validates user inputs and fills in default values if no input is provided. There are three main profiling options to choose from: DetailedProfilingConfig, DataloaderProfilingConfig, and PythonProfilingConfig.

The following list shows available scenarios of configuring the profiling options.

1. None of the profiling configuration, step range, or time range is specified. SageMaker Debugger activates framework profiling based on the default settings of each profiling option.

from sagemaker.debugger import ProfilerConfig, FrameworkProfile

profiler_config=ProfilerConfig(
    framework_profile_params=FrameworkProfile()
)

2. A target step or time range is specified for this FrameworkProfile class. The requested target step or time range setting propagates to all of the framework profiling options. For example, if you configure this class as follows, all of the profiling options profile one step starting at step 6:

from sagemaker.debugger import ProfilerConfig, FrameworkProfile

profiler_config=ProfilerConfig(
    framework_profile_params=FrameworkProfile(start_step=6, num_steps=1)
)

3. Individual profiling configurations are specified through the *_profiling_config parameters. SageMaker Debugger profiles framework metrics only for the specified profiling configurations. For example, if the DetailedProfilingConfig class is configured but not the other profiling options, Debugger only profiles based on the settings specified for the DetailedProfilingConfig class. The following example shows a profiling configuration that performs detailed profiling at step 10, data loader profiling at steps 9 and 10, and Python profiling at step 12.

from sagemaker.debugger import (
    ProfilerConfig,
    FrameworkProfile,
    DetailedProfilingConfig,
    DataloaderProfilingConfig,
    PythonProfilingConfig
)

profiler_config=ProfilerConfig(
    framework_profile_params=FrameworkProfile(
        detailed_profiling_config=DetailedProfilingConfig(start_step=10, num_steps=1),
        dataloader_profiling_config=DataloaderProfilingConfig(start_step=9, num_steps=2),
        python_profiling_config=PythonProfilingConfig(start_step=12, num_steps=1),
    )
)

If the individual profiling configurations are specified in addition to the step or time range, SageMaker Debugger prioritizes the individual profiling configurations and ignores the step or time range. For example, in the following code, the start_step=1 and num_steps=10 will be ignored.

from sagemaker.debugger import (
    ProfilerConfig,
    FrameworkProfile,
    DetailedProfilingConfig,
    DataloaderProfilingConfig,
    PythonProfilingConfig
)

profiler_config=ProfilerConfig(
    framework_profile_params=FrameworkProfile(
        start_step=1,
        num_steps=10,
        detailed_profiling_config=DetailedProfilingConfig(start_step=10, num_steps=1),
        dataloader_profiling_config=DataloaderProfilingConfig(start_step=9, num_steps=2),
        python_profiling_config=PythonProfilingConfig(start_step=12, num_steps=1)
    )
)
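The three scenarios and the precedence rule can be summarized with a small plain-Python model. This is a sketch of the behavior described above, not the SDK's implementation. The function effective_ranges and the (start_step, num_steps) tuples are hypothetical; the default values are the ones documented for DetailedProfilingConfig, DataloaderProfilingConfig, and PythonProfilingConfig.

```python
# Documented defaults as (start_step, num_steps) per profiling option.
DEFAULT_RANGES = {"detailed": (5, 1), "dataloader": (7, 1), "python": (9, 3)}


def effective_ranges(start_step=None, num_steps=None, **per_option):
    """Model which step range each profiling option ends up using.

    Hypothetical helper. per_option maps an option name to its own
    (start_step, num_steps) tuple.
    """
    if per_option:
        # Scenario 3: only the explicitly configured options are profiled,
        # and any shared start_step/num_steps is ignored.
        return dict(per_option)
    if start_step is not None or num_steps is not None:
        # Scenario 2: the shared range propagates to every option.
        return {name: (start_step, num_steps) for name in DEFAULT_RANGES}
    # Scenario 1: nothing specified, so each option uses its default.
    return dict(DEFAULT_RANGES)


print(effective_ranges())
print(effective_ranges(start_step=6, num_steps=1))
print(effective_ranges(start_step=1, num_steps=10, detailed=(10, 1)))
```

In the last call the shared range (1, 10) has no effect, which mirrors the ignored start_step=1 and num_steps=10 in the example above.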

Initialize the FrameworkProfile class object.

Parameters
  • detailed_profiling_config (DetailedProfilingConfig) – The configuration for detailed profiling. Configure it using the DetailedProfilingConfig class. Pass DetailedProfilingConfig() to use the default configuration.

  • dataloader_profiling_config (DataloaderProfilingConfig) – The configuration for dataloader metrics profiling. Configure it using the DataloaderProfilingConfig class. Pass DataloaderProfilingConfig() to use the default configuration.

  • python_profiling_config (PythonProfilingConfig) – The configuration for stats collected by the Python profiler (cProfile or Pyinstrument). Configure it using the PythonProfilingConfig class. Pass PythonProfilingConfig() to use the default configuration.

  • start_step (int) – The step at which to start profiling.

  • num_steps (int) – The number of steps to profile.

  • start_unix_time (int) – The Unix time at which to start profiling.

  • duration (float) – The duration in seconds to profile.

Tip

Available profiling range parameter pairs are (start_step and num_steps) and (start_unix_time and duration). The two parameter pairs are mutually exclusive, and this class validates if one of the two pairs is used. If both pairs are specified, a conflict error occurs.
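The mutual exclusion between the two parameter pairs can be sketched in plain Python as follows. This is only a model of the rule stated in the tip. The function name resolve_profiling_range, the returned dictionary keys, and the use of ValueError are assumptions for illustration and may differ from what the SDK actually raises or returns.

```python
def resolve_profiling_range(start_step=None, num_steps=None,
                            start_unix_time=None, duration=None):
    """Pick the step-based or time-based range, rejecting a mix of both.

    Hypothetical helper; the dictionary layout is illustrative only.
    """
    step_given = start_step is not None or num_steps is not None
    time_given = start_unix_time is not None or duration is not None
    if step_given and time_given:
        raise ValueError(
            "Use either (start_step, num_steps) or "
            "(start_unix_time, duration), not both."
        )
    if step_given:
        return {"kind": "steps", "start_step": start_step, "num_steps": num_steps}
    if time_given:
        return {"kind": "time", "start_unix_time": start_unix_time, "duration": duration}
    return None  # nothing specified, so defaults would apply


print(resolve_profiling_range(start_step=6, num_steps=1))
print(resolve_profiling_range(start_unix_time=1600000000, duration=30.0))
```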

class sagemaker.debugger.DetailedProfilingConfig(start_step=None, num_steps=None, start_unix_time=None, duration=None, profile_default_steps=False)

Bases: sagemaker.debugger.metrics_config.MetricsConfigBase

The configuration for framework metrics to be collected for detailed profiling.

Specify target steps or a target duration to profile.

By default, it profiles step 5 of training.

If profile_default_steps is set to True and none of the other range parameters is specified, the class uses the default configuration for detailed profiling.

Parameters
  • start_step (int) – The step to start profiling. The default is step 5.

  • num_steps (int) – The number of steps to profile. The default is for 1 step.

  • start_unix_time (int) – The Unix time to start profiling.

  • duration (float) – The duration in seconds to profile.

  • profile_default_steps (bool) – Indicates whether the default config should be used.

Tip

Available profiling range parameter pairs are (start_step and num_steps) and (start_unix_time and duration). The two parameter pairs are mutually exclusive, and this class validates if one of the two pairs is used. If both pairs are specified, a conflict error occurs.

class sagemaker.debugger.DataloaderProfilingConfig(start_step=None, num_steps=None, start_unix_time=None, duration=None, profile_default_steps=False, metrics_regex='.*')

Bases: sagemaker.debugger.metrics_config.MetricsConfigBase

The configuration for framework metrics to be collected for data loader profiling.

Specify target steps or a target duration to profile.

By default, it profiles step 7 of training. If profile_default_steps is set to True and none of the other range parameters is specified, the class uses the default config for dataloader profiling.

Parameters
  • start_step (int) – The step to start profiling. The default is step 7.

  • num_steps (int) – The number of steps to profile. The default is for 1 step.

  • start_unix_time (int) – The Unix time to start profiling.

  • duration (float) – The duration in seconds to profile.

  • profile_default_steps (bool) – Indicates whether the default config should be used.

class sagemaker.debugger.PythonProfilingConfig(start_step=None, num_steps=None, start_unix_time=None, duration=None, profile_default_steps=False, python_profiler=<PythonProfiler.CPROFILE: 'cprofile'>, cprofile_timer=<cProfileTimer.TOTAL_TIME: 'total_time'>)

Bases: sagemaker.debugger.metrics_config.MetricsConfigBase

The configuration for framework metrics to be collected for Python profiling.

Choose a Python profiler: cProfile or Pyinstrument.

Specify target steps or a target duration to profile. If no parameter is specified, it profiles based on profiling configurations preset by the profile_default_steps parameter, which is set to True by default. If you specify the following parameters, then the profile_default_steps parameter will be ignored.

Parameters
  • start_step (int) – The step to start profiling. The default is step 9.

  • num_steps (int) – The number of steps to profile. The default is for 3 steps.

  • start_unix_time (int) – The Unix time to start profiling.

  • duration (float) – The duration in seconds to profile.

  • profile_default_steps (bool) – Indicates whether the default configuration should be used. If set to True, Python profiling will be done at steps 9, 10, and 11 of training, using cProfile and collecting metrics based on the total time, cpu time, and off cpu time for these three steps respectively. The default is True.

  • python_profiler (PythonProfiler) – The Python profiler to use to collect python profiling stats. Available options are "cProfile" and "Pyinstrument". The default is "cProfile". Instead of passing the string values, you can also use the enumerator util, PythonProfiler, to choose one of the available options.

  • cprofile_timer (cProfileTimer) – The timer to be used by cProfile when collecting python profiling stats. Available options are "total_time", "cpu_time", and "off_cpu_time". The default is "total_time". If you choose Pyinstrument, this parameter is ignored. Instead of passing the string values, you can also use the enumerator util, cProfileTimer, to choose one of the available options.
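The default Python profiling plan described for profile_default_steps pairs each of the three default steps with one cProfile timer. The plain-Python sketch below just spells out that pairing. The function default_python_profiling_plan is a hypothetical helper and not part of the SDK; the step numbers and timer names come from the parameter descriptions above.

```python
DEFAULT_START_STEP = 9
DEFAULT_TIMERS = ("total_time", "cpu_time", "off_cpu_time")


def default_python_profiling_plan():
    """Return (step, timer) pairs for the documented default steps.

    Hypothetical helper: one timer per step, in the documented order.
    """
    return [
        (DEFAULT_START_STEP + offset, timer)
        for offset, timer in enumerate(DEFAULT_TIMERS)
    ]


plan = default_python_profiling_plan()
print(plan)  # [(9, 'total_time'), (10, 'cpu_time'), (11, 'off_cpu_time')]
```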

class sagemaker.debugger.PythonProfiler(value)

Bases: enum.Enum

Enum to list the Python profiler options for Python profiling.

CPROFILE

Use to choose "cProfile".

PYINSTRUMENT

Use to choose "Pyinstrument".

class sagemaker.debugger.cProfileTimer(value)

Bases: enum.Enum

Enum to list the possible cProfile timers for Python profiling.

TOTAL_TIME

Use to choose "total_time".

CPU_TIME

Use to choose "cpu_time".

OFF_CPU_TIME

Use to choose "off_cpu_time".

The various types of metrics configurations that can be specified in FrameworkProfile.

class sagemaker.debugger.metrics_config.StepRange(start_step, num_steps)

Configuration for the range of steps to profile.

It returns the target steps in dictionary format that you can pass to the FrameworkProfile class.

Set the start step and num steps.

If the start step is not specified, Debugger starts profiling at step 0. If the number of steps is not specified, Debugger profiles for 1 step.

Parameters
  • start_step (int) – The step to start profiling.

  • num_steps (int) – The number of steps to profile.

to_json()

Convert the step range into a dictionary.

Returns

The step range as a dictionary.

Return type

dict

class sagemaker.debugger.metrics_config.TimeRange(start_unix_time, duration)

Configuration for the range of Unix time to profile.

It returns the target time duration in dictionary format that you can pass to the FrameworkProfile class.

Set the start Unix time and duration.

If the start Unix time is not specified, Debugger starts profiling at step 0. If the duration is not specified, Debugger profiles for 1 step.

Parameters
  • start_unix_time (int) – The Unix time to start profiling.

  • duration (float) – The duration in seconds to profile.

to_json()

Convert the time range into a dictionary.

Returns

The time range as a dictionary.

Return type

dict