Model Monitor

This module contains code related to Amazon SageMaker Model Monitoring.

These classes assist with suggesting baselines and creating monitoring schedules for data captured by SageMaker Endpoints.

class sagemaker.model_monitor.model_monitoring.ModelMonitor(role, image_uri, instance_count=1, instance_type='ml.m5.xlarge', entrypoint=None, volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)

Bases: object

Sets up Amazon SageMaker Monitoring Schedules and baseline suggestions.

Use this class when you want to provide your own container image containing the code you’d like to run, in order to produce your own statistics and constraint validation files. For a more guided experience, consider using the DefaultModelMonitor class instead.

Initializes a Monitor instance.

The Monitor handles baselining datasets and creating Amazon SageMaker Monitoring Schedules to monitor SageMaker endpoints.

Parameters
  • role (str) – An AWS IAM role. The Amazon SageMaker jobs use this role.

  • image_uri (str) – The uri of the image to use for the jobs started by the Monitor.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • entrypoint ([str]) – The entrypoint for the job.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the job’s volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • base_job_name (str) – Prefix for the job name. If not specified, a default name is generated based on the training image name and current timestamp.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • env (dict) – Environment variables to be passed to the job.

  • tags ([dict]) – List of tags to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.
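A minimal sketch of constructing a ModelMonitor from the parameters above. The role ARN and image URI are placeholder assumptions, and build_monitor_kwargs and make_monitor are hypothetical helper names, not part of the SDK. The sagemaker import is deferred so the argument-building step can be checked without AWS access.

```python
def build_monitor_kwargs(role, image_uri, max_runtime_in_seconds=3600):
    """Collect constructor arguments for ModelMonitor (hypothetical helper)."""
    return {
        "role": role,
        "image_uri": image_uri,
        "instance_count": 1,
        "instance_type": "ml.m5.xlarge",
        "volume_size_in_gb": 30,
        "max_runtime_in_seconds": max_runtime_in_seconds,
    }


def make_monitor(**kwargs):
    """Create the monitor; requires the sagemaker package and AWS credentials."""
    from sagemaker.model_monitor import ModelMonitor  # deferred import

    return ModelMonitor(**kwargs)


# Placeholder values for illustration only, not real resources.
kwargs = build_monitor_kwargs(
    role="arn:aws:iam::111122223333:role/ExampleMonitorRole",
    image_uri="111122223333.dkr.ecr.us-east-1.amazonaws.com/example-monitor:latest",
)
# my_monitor = make_monitor(**kwargs)
```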

run_baseline(baseline_inputs, output, arguments=None, wait=True, logs=True, job_name=None)

Run a processing job meant to baseline your dataset.

Parameters
  • baseline_inputs ([sagemaker.processing.ProcessingInput]) – Input files for the processing job. These must be provided as ProcessingInput objects.

  • output (sagemaker.processing.ProcessingOutput) – Destination of the constraint_violations and statistics json files.

  • arguments ([str]) – A list of string arguments to be passed to a processing job.

  • wait (bool) – Whether the call should wait until the job completes (default: True).

  • logs (bool) – Whether to show the logs produced by the job. Only meaningful when wait is True (default: True).

  • job_name (str) – Processing job name. If not specified, the processor generates a default job name, based on the image name and current timestamp.
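A baselining run on a custom image might be started as sketched below. The container paths /opt/ml/processing/input and /opt/ml/processing/output and the helper name run_custom_baseline are assumptions for illustration; substitute the paths your image actually reads from and writes to.

```python
def run_custom_baseline(monitor, dataset_s3_uri, output_s3_uri):
    """Start a baselining job on a ModelMonitor without blocking the caller."""
    from sagemaker.processing import ProcessingInput, ProcessingOutput

    baseline_input = ProcessingInput(
        source=dataset_s3_uri,
        destination="/opt/ml/processing/input",  # assumed container path
    )
    baseline_output = ProcessingOutput(
        source="/opt/ml/processing/output",  # assumed container path
        destination=output_s3_uri,
    )
    # wait=False returns immediately; logs only matter when wait is True.
    monitor.run_baseline(
        baseline_inputs=[baseline_input],
        output=baseline_output,
        wait=False,
        logs=False,
    )
```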

create_monitoring_schedule(endpoint_input, output, statistics=None, constraints=None, monitor_schedule_name=None, schedule_cron_expression=None)

Creates a monitoring schedule to monitor an Amazon SageMaker Endpoint.

If constraints and statistics are provided, or if they are able to be retrieved from a previous baselining job associated with this monitor, those will be used. If constraints and statistics cannot be automatically retrieved, baseline_inputs will be required in order to kick off a baselining job.

Parameters
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.

  • output (sagemaker.model_monitor.MonitoringOutput) – The output of the monitoring schedule.

  • statistics (sagemaker.model_monitor.Statistics or str) – If provided alongside constraints, these will be used for monitoring the endpoint. This can be a sagemaker.model_monitor.Statistics object or an S3 uri pointing to a statistics JSON file.

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided alongside statistics, these will be used for monitoring the endpoint. This can be a sagemaker.model_monitor.Constraints object or an S3 uri pointing to a constraints JSON file.

  • monitor_schedule_name (str) – Schedule name. If not specified, the processor generates a default job name, based on the image name and current timestamp.

  • schedule_cron_expression (str) – The cron expression that dictates the frequency that this job runs at. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.

update_monitoring_schedule(endpoint_input=None, output=None, statistics=None, constraints=None, schedule_cron_expression=None, instance_count=None, instance_type=None, entrypoint=None, volume_size_in_gb=None, volume_kms_key=None, output_kms_key=None, arguments=None, max_runtime_in_seconds=None, env=None, network_config=None, role=None, image_uri=None)

Updates the existing monitoring schedule.

If more options than schedule_cron_expression are to be updated, a new job definition will be created to hold them. The old job definition will not be deleted.

Parameters
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.

  • output (sagemaker.model_monitor.MonitoringOutput) – The output of the monitoring schedule.

  • statistics (sagemaker.model_monitor.Statistics or str) – If provided alongside constraints, these will be used for monitoring the endpoint. This can be a sagemaker.model_monitor.Statistics object or an S3 uri pointing to a statistics JSON file.

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided alongside statistics, these will be used for monitoring the endpoint. This can be a sagemaker.model_monitor.Constraints object or an S3 uri pointing to a constraints JSON file.

  • schedule_cron_expression (str) – The cron expression that dictates the frequency that this job runs at. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • entrypoint (str) – The entrypoint for the job.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the job’s volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • arguments ([str]) – A list of string arguments to be passed to a processing job.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • env (dict) – Environment variables to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.

  • role (str) – An AWS IAM role name or ARN. The Amazon SageMaker jobs use this role.

  • image_uri (str) – The uri of the image to use for the jobs started by the Monitor.
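Because every argument defaults to None, one way to use this method is to pass only the options that actually change. The sketch below filters out unset options and, following the note above, flags whether the update goes beyond schedule_cron_expression and so implies a new job definition. The helper names and the example cron string are assumptions for illustration.

```python
def plan_schedule_update(**options):
    """Drop unset options and report whether a new job definition is implied."""
    changes = {name: value for name, value in options.items() if value is not None}
    # Anything other than the cron expression requires a new job definition.
    needs_new_job_definition = any(
        name != "schedule_cron_expression" for name in changes
    )
    return changes, needs_new_job_definition


def apply_schedule_update(monitor, **options):
    """Call update_monitoring_schedule only when something changed."""
    changes, _ = plan_schedule_update(**options)
    if changes:
        monitor.update_monitoring_schedule(**changes)
    return changes


# Assumed hourly expression; see CronExpressionGenerator for valid values.
changes, new_definition = plan_schedule_update(
    schedule_cron_expression="cron(0 * ? * * *)",
    instance_type=None,
)
```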

start_monitoring_schedule()

Starts the monitoring schedule.

stop_monitoring_schedule()

Stops the monitoring schedule.

delete_monitoring_schedule()

Deletes the monitoring schedule. Subclasses are responsible for deleting the job definition.

baseline_statistics(file_name='statistics.json')

Returns a Statistics object representing the statistics JSON file generated by the latest baselining job.

Parameters

file_name (str) – The name of the .json statistics file

Returns

The Statistics object representing the file that was generated by the job.

Return type

sagemaker.model_monitor.Statistics

suggested_constraints(file_name='constraints.json')

Returns a Constraints object representing the constraints JSON file generated by the latest baselining job.

Parameters

file_name (str) – The name of the .json constraints file

Returns

The Constraints object representing the file that was generated by the job.

Return type

sagemaker.model_monitor.Constraints
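As a rough sketch, the two baseline accessors can be used together once a baselining job has finished. The body_dict attribute and the "features", "name", and "inferred_type" keys reflect the layout written by the built-in SageMaker container and are assumptions here; a custom image may produce a different structure. The helper names are hypothetical.

```python
def load_baseline(monitor):
    """Fetch the statistics and constraints from the latest baselining job."""
    statistics = monitor.baseline_statistics()
    constraints = monitor.suggested_constraints()
    return statistics, constraints


def baseline_feature_types(monitor):
    """Map each feature name to its inferred type (assumed JSON layout)."""
    statistics, _ = load_baseline(monitor)
    features = statistics.body_dict.get("features", [])
    return {feature["name"]: feature["inferred_type"] for feature in features}
```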

latest_monitoring_statistics(file_name='statistics.json')

Returns the sagemaker.model_monitor.Statistics generated by the latest monitoring execution.

Parameters

file_name (str) – The name of the statistics file to be retrieved. Only override if generating a custom file name.

Returns

The Statistics object representing the file generated by the latest monitoring execution.

Return type

sagemaker.model_monitor.Statistics

latest_monitoring_constraint_violations(file_name='constraint_violations.json')

Returns the sagemaker.model_monitor.ConstraintViolations generated by the latest monitoring execution.

Parameters

file_name (str) – The name of the constraint violations file to be retrieved. Only override if generating a custom file name.

Returns

The ConstraintViolations object representing the file generated by the latest monitoring execution.

Return type

sagemaker.model_monitor.ConstraintViolations
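One possible way to summarize the latest violations is to count them per check type. This sketch assumes the file follows the built-in container's layout, with a top-level "violations" list whose entries carry a "constraint_check_type" key, and that the returned object exposes the parsed JSON as body_dict. The helper name is hypothetical.

```python
from collections import Counter


def count_violations_by_check(monitor):
    """Count the latest constraint violations per check type (assumed layout)."""
    violations = monitor.latest_monitoring_constraint_violations()
    entries = violations.body_dict.get("violations", [])
    return Counter(entry["constraint_check_type"] for entry in entries)
```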

describe_latest_baselining_job()

Describe the latest baselining job kicked off by the suggest workflow.

describe_schedule()

Describes the schedule that this object represents.

Returns

A dictionary response with the monitoring schedule description.

Return type

dict

list_executions()

Get the list of the latest monitoring executions in descending order of “ScheduledTime”.

Statistics or violations can be retrieved from an execution as in this example:

>>> my_executions = my_monitor.list_executions()
>>> last_execution_statistics = my_executions[-1].statistics()
>>> last_execution_violations = my_executions[-1].constraint_violations()

Returns

List of MonitoringExecutions in descending order of “ScheduledTime”.

Return type

[sagemaker.model_monitor.MonitoringExecution]
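A sketch of walking the returned list and collecting statistics only from executions whose underlying processing job finished. It assumes that describe(), inherited from ProcessingJob, returns a dictionary with a "ProcessingJobStatus" key and that "Completed" marks a finished job; the helper name is hypothetical.

```python
def completed_execution_statistics(monitor):
    """Return the statistics of every execution whose job has completed."""
    results = []
    for execution in monitor.list_executions():
        status = execution.describe()["ProcessingJobStatus"]
        if status == "Completed":  # skip in-progress, failed, or stopped jobs
            results.append(execution.statistics())
    return results
```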

classmethod attach(monitor_schedule_name, sagemaker_session=None)

Sets this object’s schedule name to point to the given Amazon SageMaker Monitoring Schedule name.

This allows subsequent describe_schedule or list_executions calls to point to the given schedule.

Parameters
  • monitor_schedule_name (str) – The name of the schedule to attach to.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.
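For example, attaching to an existing schedule and reading its status could look like the sketch below. The "MonitoringScheduleStatus" key is assumed from the underlying DescribeMonitoringSchedule response, and the monitor_cls parameter is a hypothetical convenience that lets a subclass such as DefaultModelMonitor be substituted.

```python
def attach_and_get_status(schedule_name, monitor_cls=None):
    """Attach to an existing schedule and return its reported status."""
    if monitor_cls is None:
        from sagemaker.model_monitor import ModelMonitor  # deferred import

        monitor_cls = ModelMonitor
    monitor = monitor_cls.attach(monitor_schedule_name=schedule_name)
    description = monitor.describe_schedule()
    return description["MonitoringScheduleStatus"]
```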

classmethod monitoring_type()

Type of the monitoring job.

class sagemaker.model_monitor.model_monitoring.DefaultModelMonitor(role, instance_count=1, instance_type='ml.m5.xlarge', volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)

Bases: sagemaker.model_monitor.model_monitoring.ModelMonitor

Sets up Amazon SageMaker Monitoring Schedules and baseline suggestions.

Use this class when you want to utilize Amazon SageMaker Monitoring’s plug-and-play solution that only requires your dataset and optional pre/postprocessing scripts. For a more customized experience, consider using the ModelMonitor class instead.

Initializes a Monitor instance.

The Monitor handles baselining datasets and creating Amazon SageMaker Monitoring Schedules to monitor SageMaker endpoints.

Parameters
  • role (str) – An AWS IAM role name or ARN. The Amazon SageMaker jobs use this role.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the processing volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • base_job_name (str) – Prefix for the job name. If not specified, a default name is generated based on the training image name and current timestamp.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • env (dict) – Environment variables to be passed to the job.

  • tags ([dict]) – List of tags to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.

JOB_DEFINITION_BASE_NAME = 'data-quality-job-definition'

classmethod monitoring_type()

Type of the monitoring job.

suggest_baseline(baseline_dataset, dataset_format, record_preprocessor_script=None, post_analytics_processor_script=None, output_s3_uri=None, wait=True, logs=True, job_name=None)

Suggest baselines for use with Amazon SageMaker Model Monitoring Schedules.

Parameters
  • baseline_dataset (str) – The path to the baseline_dataset file. This can be a local path or an S3 uri.

  • dataset_format (dict) – The format of the baseline_dataset.

  • record_preprocessor_script (str) – The path to the record preprocessor script. This can be a local path or an S3 uri.

  • post_analytics_processor_script (str) – The path to the record post-analytics processor script. This can be a local path or an S3 uri.

  • output_s3_uri (str) – Desired S3 destination of the constraint_violations and statistics json files. Default: “s3://<default_session_bucket>/<job_name>/output”

  • wait (bool) – Whether the call should wait until the job completes (default: True).

  • logs (bool) – Whether to show the logs produced by the job. Only meaningful when wait is True (default: True).

  • job_name (str) – Processing job name. If not specified, the processor generates a default job name, based on the image name and current timestamp.

Returns

The ProcessingJob object representing the baselining job.

Return type

sagemaker.processing.ProcessingJob
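A minimal sketch of suggesting a baseline for a CSV dataset with a header row. DatasetFormat.csv is used to build the dataset_format dictionary when none is supplied; the helper name and the choice to block until the job finishes are assumptions for illustration.

```python
def suggest_csv_baseline(monitor, baseline_dataset, output_s3_uri, dataset_format=None):
    """Run suggest_baseline on a DefaultModelMonitor for a CSV dataset."""
    if dataset_format is None:
        from sagemaker.model_monitor.dataset_format import DatasetFormat

        dataset_format = DatasetFormat.csv(header=True)
    return monitor.suggest_baseline(
        baseline_dataset=baseline_dataset,  # local path or S3 uri
        dataset_format=dataset_format,
        output_s3_uri=output_s3_uri,
        wait=True,   # block until the baselining job completes
        logs=False,  # do not stream job logs
    )
```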

create_monitoring_schedule(endpoint_input, record_preprocessor_script=None, post_analytics_processor_script=None, output_s3_uri=None, constraints=None, statistics=None, monitor_schedule_name=None, schedule_cron_expression=None, enable_cloudwatch_metrics=True)

Creates a monitoring schedule to monitor an Amazon SageMaker Endpoint.

If constraints and statistics are provided, or if they are able to be retrieved from a previous baselining job associated with this monitor, those will be used. If constraints and statistics cannot be automatically retrieved, baseline_inputs will be required in order to kick off a baselining job.

Parameters
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.

  • record_preprocessor_script (str) – The path to the record preprocessor script. This can be a local path or an S3 uri.

  • post_analytics_processor_script (str) – The path to the record post-analytics processor script. This can be a local path or an S3 uri.

  • output_s3_uri (str) – Desired S3 destination of the constraint_violations and statistics json files. Default: “s3://<default_session_bucket>/<job_name>/output”

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided alongside statistics, these will be used for monitoring the endpoint. This can be a sagemaker.model_monitor.Constraints object or an s3_uri pointing to a constraints JSON file.

  • statistics (sagemaker.model_monitor.Statistics or str) – If provided alongside constraints, these will be used for monitoring the endpoint. This can be a sagemaker.model_monitor.Statistics object or an s3_uri pointing to a statistics JSON file.

  • monitor_schedule_name (str) – Schedule name. If not specified, the processor generates a default job name, based on the image name and current timestamp.

  • schedule_cron_expression (str) – The cron expression that dictates the frequency that this job runs at. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.

  • enable_cloudwatch_metrics (bool) – Whether to publish cloudwatch metrics as part of the baselining or monitoring jobs.
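The sketch below schedules data-quality monitoring and passes the baseline from the monitor's latest baselining job explicitly, rather than relying on automatic retrieval. The helper name is hypothetical; leaving cron_expression as None falls back to the documented daily default.

```python
def schedule_data_quality_monitoring(
    monitor, endpoint_name, output_s3_uri, schedule_name, cron_expression=None
):
    """Create a schedule that reuses the monitor's latest baseline files."""
    monitor.create_monitoring_schedule(
        endpoint_input=endpoint_name,  # an endpoint name is accepted as a str
        output_s3_uri=output_s3_uri,
        statistics=monitor.baseline_statistics(),
        constraints=monitor.suggested_constraints(),
        monitor_schedule_name=schedule_name,
        schedule_cron_expression=cron_expression,
        enable_cloudwatch_metrics=True,
    )
```

A value for cron_expression can be obtained from sagemaker.model_monitor.CronExpressionGenerator, for instance CronExpressionGenerator.hourly().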

update_monitoring_schedule(endpoint_input=None, record_preprocessor_script=None, post_analytics_processor_script=None, output_s3_uri=None, statistics=None, constraints=None, schedule_cron_expression=None, instance_count=None, instance_type=None, volume_size_in_gb=None, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, env=None, network_config=None, enable_cloudwatch_metrics=None, role=None)

Updates the existing monitoring schedule.

Parameters
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.

  • record_preprocessor_script (str) – The path to the record preprocessor script. This can be a local path or an S3 uri.

  • post_analytics_processor_script (str) – The path to the record post-analytics processor script. This can be a local path or an S3 uri.

  • output_s3_uri (str) – Desired S3 destination of the constraint_violations and statistics json files.

  • statistics (sagemaker.model_monitor.Statistics or str) – If provided alongside constraints, these will be used for monitoring the endpoint. This can be a sagemaker.model_monitor.Statistics object or an S3 uri pointing to a statistics JSON file.

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided alongside statistics, these will be used for monitoring the endpoint. This can be a sagemaker.model_monitor.Constraints object or an S3 uri pointing to a constraints JSON file.

  • schedule_cron_expression (str) – The cron expression that dictates the frequency that this job runs at. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the job’s volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • env (dict) – Environment variables to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.

  • enable_cloudwatch_metrics (bool) – Whether to publish cloudwatch metrics as part of the baselining or monitoring jobs.

  • role (str) – An AWS IAM role name or ARN. The Amazon SageMaker jobs use this role.

delete_monitoring_schedule()

Deletes the monitoring schedule and its job definition.

run_baseline()

Not implemented.

‘.run_baseline()’ is only allowed for ModelMonitor objects. Please use suggest_baseline for DefaultModelMonitor objects, instead.

Raises

NotImplementedError

classmethod attach(monitor_schedule_name, sagemaker_session=None)

Sets this object’s schedule name to the name provided.

This allows subsequent describe_schedule or list_executions calls to point to the given schedule.

Parameters
  • monitor_schedule_name (str) – The name of the schedule to attach to.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

latest_monitoring_statistics()

Returns the sagemaker.model_monitor.Statistics.

These are the statistics generated by the latest monitoring execution.

Returns

The Statistics object representing the file generated by the latest monitoring execution.

Return type

sagemaker.model_monitor.Statistics

latest_monitoring_constraint_violations()

Returns the sagemaker.model_monitor.ConstraintViolations generated by the latest monitoring execution.

Returns

The ConstraintViolations object representing the file generated by the latest monitoring execution.

Return type

sagemaker.model_monitor.ConstraintViolations

class sagemaker.model_monitor.model_monitoring.ModelQualityMonitor(role, instance_count=1, instance_type='ml.m5.xlarge', volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)

Bases: sagemaker.model_monitor.model_monitoring.ModelMonitor

Amazon SageMaker model monitor to monitor quality metrics for an endpoint.

Please see the __init__ method of its base class for how to instantiate it.

Initializes a monitor instance.

The monitor handles baselining datasets and creating Amazon SageMaker Monitoring Schedules to monitor SageMaker endpoints.

Parameters
  • role (str) – An AWS IAM role. The Amazon SageMaker jobs use this role.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the job’s volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • base_job_name (str) – Prefix for the job name. If not specified, a default name is generated based on the training image name and current timestamp.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • env (dict) – Environment variables to be passed to the job.

  • tags ([dict]) – List of tags to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.

JOB_DEFINITION_BASE_NAME = 'model-quality-job-definition'

classmethod monitoring_type()

Type of the monitoring job.

suggest_baseline(baseline_dataset, dataset_format, problem_type, inference_attribute=None, probability_attribute=None, ground_truth_attribute=None, probability_threshold_attribute=None, post_analytics_processor_script=None, output_s3_uri=None, wait=False, logs=False, job_name=None)

Suggest baselines for use with Amazon SageMaker Model Monitoring Schedules.

Parameters
  • baseline_dataset (str) – The path to the baseline_dataset file. This can be a local path or an S3 uri.

  • dataset_format (dict) – The format of the baseline_dataset.

  • problem_type (str) – The type of problem of this model quality monitoring. Valid values are “Regression”, “BinaryClassification”, “MulticlassClassification”.

  • inference_attribute (str) – Index or JSONpath to locate predicted label(s).

  • probability_attribute (str or int) – Index or JSONpath to locate probabilities.

  • ground_truth_attribute (str) – Index or JSONpath to locate actual label(s).

  • probability_threshold_attribute (float) – Threshold to convert probabilities to binaries. Only used for ModelQualityMonitor, ModelBiasMonitor and ModelExplainabilityMonitor.

  • post_analytics_processor_script (str) – The path to the record post-analytics processor script. This can be a local path or an S3 uri.

  • output_s3_uri (str) – Desired S3 destination of the constraint_violations and statistics json files. Default: “s3://<default_session_bucket>/<job_name>/output”

  • wait (bool) – Whether the call should wait until the job completes (default: False).

  • logs (bool) – Whether to show the logs produced by the job. Only meaningful when wait is True (default: False).

  • job_name (str) – Processing job name. If not specified, the processor generates a default job name, based on the image name and current timestamp.

Returns

The ProcessingJob object representing the baselining job.

Return type

sagemaker.processing.ProcessingJob
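Since problem_type accepts only three values, it can help to validate it locally before starting a job. The following is a sketch under that assumption; check_problem_type and suggest_model_quality_baseline are hypothetical helpers, and the attribute arguments depend on how your baseline dataset is laid out.

```python
VALID_PROBLEM_TYPES = ("Regression", "BinaryClassification", "MulticlassClassification")


def check_problem_type(problem_type):
    """Return problem_type unchanged, or raise ValueError if it is not valid."""
    if problem_type not in VALID_PROBLEM_TYPES:
        raise ValueError(
            f"problem_type must be one of {VALID_PROBLEM_TYPES}, got {problem_type!r}"
        )
    return problem_type


def suggest_model_quality_baseline(
    monitor, baseline_dataset, dataset_format, problem_type,
    inference_attribute, ground_truth_attribute,
):
    """Start a model-quality baselining job after validating problem_type."""
    return monitor.suggest_baseline(
        baseline_dataset=baseline_dataset,
        dataset_format=dataset_format,
        problem_type=check_problem_type(problem_type),
        inference_attribute=inference_attribute,
        ground_truth_attribute=ground_truth_attribute,
        wait=False,  # matches this method's documented default
        logs=False,
    )
```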

create_monitoring_schedule(endpoint_input, ground_truth_input, problem_type, record_preprocessor_script=None, post_analytics_processor_script=None, output_s3_uri=None, constraints=None, monitor_schedule_name=None, schedule_cron_expression=None, enable_cloudwatch_metrics=True)

Creates a monitoring schedule.

Parameters
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.

  • ground_truth_input (str) – S3 URI to ground truth dataset.

  • problem_type (str) – The type of problem of this model quality monitoring. Valid values are “Regression”, “BinaryClassification”, “MulticlassClassification”.

  • record_preprocessor_script (str) – The path to the record preprocessor script. This can be a local path or an S3 uri.

  • post_analytics_processor_script (str) – The path to the record post-analytics processor script. This can be a local path or an S3 uri.

  • output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output”

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided it will be used for monitoring the endpoint. It can be a Constraints object or an S3 uri pointing to a constraints JSON file.

  • monitor_schedule_name (str) – Schedule name. If not specified, the processor generates a default job name, based on the image name and current timestamp.

  • schedule_cron_expression (str) – The cron expression that dictates the frequency that this job runs at. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.

  • enable_cloudwatch_metrics (bool) – Whether to publish cloudwatch metrics as part of the baselining or monitoring jobs.

update_monitoring_schedule(endpoint_input=None, ground_truth_input=None, problem_type=None, record_preprocessor_script=None, post_analytics_processor_script=None, output_s3_uri=None, constraints=None, schedule_cron_expression=None, enable_cloudwatch_metrics=None, role=None, instance_count=None, instance_type=None, volume_size_in_gb=None, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, env=None, network_config=None)

Updates the existing monitoring schedule.

If more options than schedule_cron_expression are to be updated, a new job definition will be created to hold them. The old job definition will not be deleted.

Parameters
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.

  • ground_truth_input (str) – S3 URI to ground truth dataset.

  • problem_type (str) – The type of problem of this model quality monitoring. Valid values are “Regression”, “BinaryClassification”, “MulticlassClassification”.

  • record_preprocessor_script (str) – The path to the record preprocessor script. This can be a local path or an S3 uri.

  • post_analytics_processor_script (str) – The path to the record post-analytics processor script. This can be a local path or an S3 uri.

  • output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output”

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided it will be used for monitoring the endpoint. It can be a Constraints object or an S3 uri pointing to a constraints JSON file.

  • schedule_cron_expression (str) – The cron expression that dictates the frequency that this job runs at. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.

  • enable_cloudwatch_metrics (bool) – Whether to publish cloudwatch metrics as part of the baselining or monitoring jobs.

  • role (str) – An AWS IAM role. The Amazon SageMaker jobs use this role.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the job’s volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • env (dict) – Environment variables to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.

delete_monitoring_schedule()

Deletes the monitoring schedule and its job definition.

classmethod attach(monitor_schedule_name, sagemaker_session=None)

Sets this object’s schedule name to the name provided.

This allows subsequent describe_schedule or list_executions calls to point to the given schedule.

Parameters
  • monitor_schedule_name (str) – The name of the schedule to attach to.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

class sagemaker.model_monitor.model_monitoring.BaseliningJob(sagemaker_session, job_name, inputs, outputs, output_kms_key=None)

Bases: sagemaker.processing.ProcessingJob

Provides functionality to retrieve baseline-specific files output by a baselining job.

Initializes a Baselining job.

It tracks a baselining job kicked off by the suggest workflow.

Parameters
  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • job_name (str) – Name of the Amazon SageMaker Model Monitoring Baselining Job.

  • inputs ([sagemaker.processing.ProcessingInput]) – A list of ProcessingInput objects.

  • outputs ([sagemaker.processing.ProcessingOutput]) – A list of ProcessingOutput objects.

  • output_kms_key (str) – The output kms key associated with the job. Defaults to None if not provided.

classmethod from_processing_job(processing_job)

Initializes a Baselining job from a processing job.

Parameters

processing_job (sagemaker.processing.ProcessingJob) – The ProcessingJob from which to create the BaseliningJob instance.

Returns

The instance of ProcessingJob created using the current job name.

Return type

sagemaker.model_monitor.BaseliningJob

baseline_statistics(file_name='statistics.json', kms_key=None)

Returns a sagemaker.model_monitor.Statistics object representing the statistics JSON file generated by this baselining job.

Parameters
  • file_name (str) – The name of the json-formatted statistics file

  • kms_key (str) – The kms key to use when retrieving the file.

Returns

The Statistics object representing the file that was generated by the job.

Return type

sagemaker.model_monitor.Statistics

Raises

UnexpectedStatusException – Raised if the job is not in a ‘Completed’ state.

suggested_constraints(file_name='constraints.json', kms_key=None)

Returns a sagemaker.model_monitor.Constraints object representing the constraints JSON file generated by this baselining job.

Parameters
  • file_name (str) – The name of the json-formatted constraints file

  • kms_key (str) – The kms key to use when retrieving the file.

Returns

The Constraints object representing the file that was generated by the job.

Return type

sagemaker.model_monitor.Constraints

Raises

UnexpectedStatusException – Raised if the job is not in a ‘Completed’ state.
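The two accessors above are typically used together once a baselining job has finished. The following is a minimal sketch, not the library's own code: the helper name load_baseline_files is hypothetical, and the body_dict attribute is assumed to hold the parsed JSON, as the Statistics and Constraints constructors suggest. The helper accepts any object that exposes baseline_statistics() and suggested_constraints(), so a stand-in object is used here instead of a real job.

```python
from types import SimpleNamespace


def load_baseline_files(baselining_job, kms_key=None):
    """Return (statistics_body, constraints_body) for a finished baselining job.

    `baselining_job` only needs the two accessor methods documented above.
    """
    statistics = baselining_job.baseline_statistics(kms_key=kms_key)
    constraints = baselining_job.suggested_constraints(kms_key=kms_key)
    # Assumption: each returned object keeps its parsed JSON in `body_dict`.
    return statistics.body_dict, constraints.body_dict


# Stand-in for a real BaseliningJob, so the sketch runs without AWS access.
fake_job = SimpleNamespace(
    baseline_statistics=lambda file_name="statistics.json", kms_key=None: SimpleNamespace(
        body_dict={"features": [{"name": "age"}]}
    ),
    suggested_constraints=lambda file_name="constraints.json", kms_key=None: SimpleNamespace(
        body_dict={"features": [{"name": "age", "completeness": 1.0}]}
    ),
)

stats_body, constraints_body = load_baseline_files(fake_job)
```

With a real job, both accessors raise UnexpectedStatusException until the job has finished, so callers may want to wait for completion before calling such a helper.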

class sagemaker.model_monitor.model_monitoring.MonitoringExecution(sagemaker_session, job_name, inputs, output, output_kms_key=None)

Bases: sagemaker.processing.ProcessingJob

Provides functionality to retrieve monitoring-specific files from monitoring executions.

Initializes a MonitoringExecution job that tracks a monitoring execution.

It is kicked off by an Amazon SageMaker Model Monitoring Schedule.

Parameters
  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • job_name (str) – The name of the monitoring execution job.

  • output (sagemaker.processing.ProcessingOutput) – The output associated with the monitoring execution.

  • output_kms_key (str) – The output kms key associated with the job. Defaults to None if not provided.

classmethod from_processing_arn(sagemaker_session, processing_job_arn)

Initializes a MonitoringExecution from a processing job ARN.

Parameters
  • processing_job_arn (str) – ARN of the processing job to create a MonitoringExecution out of.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

Returns

The instance of MonitoringExecution created using the current job name.

Return type

sagemaker.model_monitor.MonitoringExecution

statistics(file_name='statistics.json', kms_key=None)

Returns a sagemaker.model_monitor.Statistics object representing the statistics JSON file generated by this monitoring execution.

Parameters
  • file_name (str) – The name of the json-formatted statistics file

  • kms_key (str) – The kms key to use when retrieving the file.

Returns

The Statistics object representing the file that was generated by the execution.

Return type

sagemaker.model_monitor.Statistics

Raises

UnexpectedStatusException – Raised if the job is not in a ‘Completed’ state.

constraint_violations(file_name='constraint_violations.json', kms_key=None)

Returns a sagemaker.model_monitor.ConstraintViolations object representing the constraint violations JSON file generated by this monitoring execution.

Parameters
  • file_name (str) – The name of the json-formatted constraint violations file.

  • kms_key (str) – The kms key to use when retrieving the file.

Returns

The ConstraintViolations object representing the file that was generated by the monitoring execution.

Return type

sagemaker.model_monitor.ConstraintViolations

Raises

UnexpectedStatusException – Raised if the job is not in a ‘Completed’ state.
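As an illustration of how the constraint violations file might be consumed, the sketch below counts violations per check type. It is a sketch under stated assumptions: the helper summarize_violations is hypothetical, the body_dict attribute is assumed to hold the parsed JSON, and the file layout (a top-level "violations" list whose items carry a "constraint_check_type" key) is assumed rather than confirmed by this reference. The sample data is made up.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_violations(execution):
    """Count violations per check type for a finished monitoring execution."""
    violations_file = execution.constraint_violations()
    # Assumed layout: {"violations": [{"feature_name": ..., "constraint_check_type": ...}]}
    violations = violations_file.body_dict.get("violations", [])
    return Counter(item["constraint_check_type"] for item in violations)


# Made-up sample data and a stand-in execution object, for illustration only.
fake_body = {
    "violations": [
        {"feature_name": "age", "constraint_check_type": "data_type_check"},
        {"feature_name": "income", "constraint_check_type": "data_type_check"},
        {"feature_name": "zip", "constraint_check_type": "completeness_check"},
    ]
}
fake_execution = SimpleNamespace(
    constraint_violations=lambda: SimpleNamespace(body_dict=fake_body)
)

summary = summarize_violations(fake_execution)
```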

class sagemaker.model_monitor.model_monitoring.EndpointInput(endpoint_name, destination, s3_input_mode='File', s3_data_distribution_type='FullyReplicated', start_time_offset=None, end_time_offset=None, features_attribute=None, inference_attribute=None, probability_attribute=None, probability_threshold_attribute=None)

Bases: object

Accepts parameters that specify an endpoint input for monitoring execution.

It also provides a method to turn those parameters into a dictionary.

Initialize an EndpointInput instance.

EndpointInput accepts parameters that specify an endpoint input for a monitoring job and provides a method to turn those parameters into a dictionary.

Parameters
  • endpoint_name (str) – The name of the endpoint.

  • destination (str) – The destination of the input.

  • s3_input_mode (str) – The S3 input mode. Can be one of: “File”, “Pipe”. Default: “File”.

  • s3_data_distribution_type (str) – The S3 data distribution type. Can be one of: “FullyReplicated”, “ShardedByS3Key”. Default: “FullyReplicated”.

  • start_time_offset (str) – Monitoring start time offset, e.g. “-PT1H”.

  • end_time_offset (str) – Monitoring end time offset, e.g. “-PT0H”.

  • features_attribute (str) – JSONPath to locate features in a JSON Lines dataset. Only used for ModelBiasMonitor and ModelExplainabilityMonitor.

  • inference_attribute (str) – Index or JSONPath to locate predicted label(s). Only used for ModelQualityMonitor, ModelBiasMonitor, and ModelExplainabilityMonitor.

  • probability_attribute (str or int) – Index or JSONPath to locate probabilities. Only used for ModelQualityMonitor, ModelBiasMonitor, and ModelExplainabilityMonitor.

  • probability_threshold_attribute (float) – Threshold used to convert probabilities to binary labels. Only used for ModelQualityMonitor, ModelBiasMonitor, and ModelExplainabilityMonitor.
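The offset strings such as “-PT1H” look like ISO 8601 durations. The sketch below shows how a pair of offsets could translate into a time window, assuming the offsets are applied relative to the scheduled execution time. That interpretation is an assumption, the helper names are hypothetical, and only the whole-hour form used in the examples above is handled.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches only whole-hour negative offsets such as "-PT1H" or "-PT0H".
_HOUR_OFFSET = re.compile(r"^-PT(\d+)H$")


def parse_hour_offset(offset):
    """Convert an offset such as "-PT1H" into a non-positive timedelta."""
    match = _HOUR_OFFSET.match(offset)
    if match is None:
        raise ValueError(f"unsupported offset: {offset!r}")
    return -timedelta(hours=int(match.group(1)))


def monitoring_window(scheduled_time, start_time_offset="-PT1H", end_time_offset="-PT0H"):
    """Return the (start, end) of the data window implied by the two offsets."""
    start = scheduled_time + parse_hour_offset(start_time_offset)
    end = scheduled_time + parse_hour_offset(end_time_offset)
    return start, end


scheduled = datetime(2021, 1, 1, 12, 0, tzinfo=timezone.utc)
window_start, window_end = monitoring_window(scheduled)
```

Under these assumptions, a run scheduled for 12:00 UTC with offsets “-PT1H” and “-PT0H” would analyze data captured between 11:00 and 12:00 UTC.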

class sagemaker.model_monitor.model_monitoring.MonitoringOutput(source, destination=None, s3_upload_mode='Continuous')

Bases: object

Accepts parameters that specify an S3 output for a monitoring job.

It also provides a method to turn those parameters into a dictionary.

Initialize a MonitoringOutput instance.

MonitoringOutput accepts parameters that specify an S3 output for a monitoring job and provides a method to turn those parameters into a dictionary.

Parameters
  • source (str) – The source for the output.

  • destination (str) – The destination of the output. Optional. Default: s3://<default-session-bucket>/schedule_name/output

  • s3_upload_mode (str) – The S3 upload mode.

This module contains code related to the ModelMonitoringFile class.

The code is used for managing the constraints and statistics JSON files generated and consumed by Amazon SageMaker Model Monitoring Schedules.

class sagemaker.model_monitor.monitoring_files.ModelMonitoringFile(body_dict, file_s3_uri, kms_key, sagemaker_session)

Bases: object

Represents a file with a body and an S3 uri.

Initializes a file with a body and an S3 uri.

Parameters
  • body_dict (str) – The body of the JSON file.

  • file_s3_uri (str) – The uri of the JSON file.

  • kms_key (str) – The kms key to be used to decrypt the file in S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

save(new_save_location_s3_uri=None)

Save the current instance’s body to s3 using the instance’s s3 path.

The S3 path can be overridden by providing one. This also overrides the default save location for this object.

Parameters

new_save_location_s3_uri (str) – Optional. The S3 path to save the file to. If not provided, the file is saved in place in S3. If provided, the file’s S3 path is permanently updated.

Returns

The s3 location to which the file was saved.

Return type

str

class sagemaker.model_monitor.monitoring_files.Statistics(body_dict, statistics_file_s3_uri, kms_key=None, sagemaker_session=None)

Bases: sagemaker.model_monitor.monitoring_files.ModelMonitoringFile

Represents the statistics JSON file used in Amazon SageMaker Model Monitoring.

Initializes the Statistics object used in Amazon SageMaker Model Monitoring.

Parameters
  • body_dict (str) – The body of the statistics JSON file.

  • statistics_file_s3_uri (str) – The uri of the statistics JSON file.

  • kms_key (str) – The kms key to be used to decrypt the file in S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

classmethod from_s3_uri(statistics_file_s3_uri, kms_key=None, sagemaker_session=None)

Generates a Statistics object from an s3 uri.

Parameters
  • statistics_file_s3_uri (str) – The uri of the statistics JSON file.

  • kms_key (str) – The kms key to be used to decrypt the file in S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

Returns

The instance of Statistics generated from the s3 uri.

Return type

sagemaker.model_monitor.Statistics

classmethod from_string(statistics_file_string, kms_key=None, file_name=None, sagemaker_session=None)

Generates a Statistics object from a JSON string.

Parameters
  • statistics_file_string (str) – The contents of the statistics JSON file, as a string.

  • kms_key (str) – The kms key to be used to encrypt the file in S3.

  • file_name (str) – The file name to use when uploading to S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

Returns

The instance of Statistics generated from the string.

Return type

sagemaker.model_monitor.Statistics

classmethod from_file_path(statistics_file_path, kms_key=None, sagemaker_session=None)

Initializes a Statistics object from a file path.

Parameters
  • statistics_file_path (str) – The path to the statistics file.

  • kms_key (str) – The kms_key to use when encrypting the file in S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

Returns

The instance of Statistics generated from the local file path.

Return type

sagemaker.model_monitor.Statistics

class sagemaker.model_monitor.monitoring_files.Constraints(body_dict, constraints_file_s3_uri, kms_key=None, sagemaker_session=None)

Bases: sagemaker.model_monitor.monitoring_files.ModelMonitoringFile

Represents the constraints JSON file used in Amazon SageMaker Model Monitoring.

Initializes the Constraints object used in Amazon SageMaker Model Monitoring.

Parameters
  • body_dict (str) – The body of the constraints JSON file.

  • constraints_file_s3_uri (str) – The uri of the constraints JSON file.

  • kms_key (str) – The kms key to be used to decrypt the file in S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

classmethod from_s3_uri(constraints_file_s3_uri, kms_key=None, sagemaker_session=None)

Generates a Constraints object from an s3 uri.

Parameters
  • constraints_file_s3_uri (str) – The uri of the constraints JSON file.

  • kms_key (str) – The kms key to be used to decrypt the file in S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

Returns

The instance of Constraints generated from the s3 uri.

Return type

sagemaker.model_monitor.Constraints

classmethod from_string(constraints_file_string, kms_key=None, file_name=None, sagemaker_session=None)

Generates a Constraints object from a JSON string.

Parameters
  • constraints_file_string (str) – The contents of the constraints JSON file, as a string.

  • kms_key (str) – The kms key to be used to encrypt the file in S3.

  • file_name (str) – The file name to use when uploading to S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

Returns

The instance of Constraints generated from the string.

Return type

sagemaker.model_monitor.Constraints

classmethod from_file_path(constraints_file_path, kms_key=None, sagemaker_session=None)

Initializes a Constraints object from a file path.

Parameters
  • constraints_file_path (str) – The path to the constraints file.

  • kms_key (str) – The kms_key to use when encrypting the file in S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

Returns

The instance of Constraints generated from the local file path.

Return type

sagemaker.model_monitor.Constraints

set_monitoring(enable_monitoring, feature_name=None)

Sets the monitoring flags on this Constraints object.

If feature_name is provided, the feature-level override is modified. Otherwise, the top-level monitoring flag is modified.

Parameters
  • enable_monitoring (bool) – Whether to enable monitoring or not.

  • feature_name (str) – Sets the feature-level monitoring flag if provided. Otherwise, sets the file-level override.
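The difference between the top-level flag and a feature-level override can be sketched on a plain dictionary. This is an illustration only, not the library's implementation: the function operates on a dict rather than a Constraints object, and the key names ("monitoring_config", "evaluate_constraints", "monitoring_config_overrides") and the "Enabled"/"Disabled" values are assumptions that may not match the real constraints file layout.

```python
def set_monitoring_flag(body, enable_monitoring, feature_name=None):
    """Toggle monitoring on a constraints-like dict (hypothetical layout)."""
    flag = "Enabled" if enable_monitoring else "Disabled"
    if feature_name is None:
        # No feature given: change the file-level (top-level) flag.
        body.setdefault("monitoring_config", {})["evaluate_constraints"] = flag
        return body
    # Feature given: record an override on the matching feature only.
    for feature in body.get("features", []):
        if feature.get("name") == feature_name:
            overrides = feature.setdefault("monitoring_config_overrides", {})
            overrides["evaluate_constraints"] = flag
    return body


constraints_body = {"features": [{"name": "age"}, {"name": "income"}]}
set_monitoring_flag(constraints_body, True)                        # top-level flag
set_monitoring_flag(constraints_body, False, feature_name="age")   # override for one feature
```

After these two calls the file-level flag is enabled, while the "age" feature carries a disabling override and "income" is left untouched.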

class sagemaker.model_monitor.monitoring_files.ConstraintViolations(body_dict, constraint_violations_file_s3_uri, kms_key=None, sagemaker_session=None)

Bases: sagemaker.model_monitor.monitoring_files.ModelMonitoringFile

Represents the constraint violations JSON file used in Amazon SageMaker Model Monitoring.

Initializes the ConstraintViolations object used in Amazon SageMaker Model Monitoring.

Parameters
  • body_dict (str) – The body of the constraint violations JSON file.

  • constraint_violations_file_s3_uri (str) – The uri of the constraint violations JSON file.

  • kms_key (str) – The kms key to be used to decrypt the file in S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

classmethod from_s3_uri(constraint_violations_file_s3_uri, kms_key=None, sagemaker_session=None)

Generates a ConstraintViolations object from an s3 uri.

Parameters
  • constraint_violations_file_s3_uri (str) – The uri of the constraint violations JSON file.

  • kms_key (str) – The kms key to be used to decrypt the file in S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

Returns

The instance of ConstraintViolations generated from the s3 uri.

Return type

sagemaker.model_monitor.ConstraintViolations

classmethod from_string(constraint_violations_file_string, kms_key=None, file_name=None, sagemaker_session=None)

Generates a ConstraintViolations object from a JSON string.

Parameters
  • constraint_violations_file_string (str) – The contents of the constraint violations JSON file, as a string.

  • kms_key (str) – The kms key to be used to encrypt the file in S3.

  • file_name (str) – The file name to use when uploading to S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

Returns

The instance of ConstraintViolations generated from the string.

Return type

sagemaker.model_monitor.ConstraintViolations

classmethod from_file_path(constraint_violations_file_path, kms_key=None, sagemaker_session=None)

Initializes a ConstraintViolations object from a file path.

Parameters
  • constraint_violations_file_path (str) – The path to the constraint violations file.

  • kms_key (str) – The kms_key to use when encrypting the file in S3.

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

Returns

The instance of ConstraintViolations generated from the local file path.

Return type

sagemaker.model_monitor.ConstraintViolations

This module contains code related to the DatasetFormat class.

The code is used for managing the constraints JSON file generated and consumed by Amazon SageMaker Model Monitoring Schedules.

class sagemaker.model_monitor.dataset_format.DatasetFormat

Bases: object

Represents a Dataset Format that is used when calling a DefaultModelMonitor.

static csv(header=True, output_columns_position='START')

Returns a DatasetFormat dictionary (JSON-serializable) for use with a DefaultModelMonitor.

Parameters
  • header (bool) – Whether the csv dataset to baseline and monitor has a header. Default: True.

  • output_columns_position (str) – The position of the output columns. Must be one of (“START”, “END”). Default: “START”.

Returns

Dictionary containing the DatasetFormat to be used by DefaultModelMonitor.

Return type

dict

static json(lines=True)

Returns a DatasetFormat dictionary (JSON-serializable) for use with a DefaultModelMonitor.

Parameters

lines (bool) – Whether the file should be read as a json object per line. Default: True.

Returns

Dictionary containing the DatasetFormat to be used by DefaultModelMonitor.

Return type

dict

static sagemaker_capture_json()

Returns a DatasetFormat dictionary for the SageMaker capture JSON format, for use with a DefaultModelMonitor.

Returns

Dictionary containing the DatasetFormat to be used by DefaultModelMonitor.

Return type

dict
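Because the return type of these helpers is listed as dict, the result can be serialized with the standard json module when a string is needed. The stand-in functions below are hypothetical and only illustrate a plausible shape of such dictionaries. The key names ("csv", "json", "header", "output_columns_position", "lines") are assumptions, and the validation step is added for illustration rather than taken from the library.

```python
import json


def csv_format(header=True, output_columns_position="START"):
    """Hypothetical stand-in for DatasetFormat.csv (assumed dict shape)."""
    if output_columns_position not in ("START", "END"):
        raise ValueError("output_columns_position must be 'START' or 'END'")
    return {"csv": {"header": header, "output_columns_position": output_columns_position}}


def json_format(lines=True):
    """Hypothetical stand-in for DatasetFormat.json (assumed dict shape)."""
    return {"json": {"lines": lines}}


# A dict can be turned into a JSON string, and back, with the json module.
serialized = json.dumps(csv_format(header=False))
round_tripped = json.loads(serialized)
```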

This module contains code related to the DataCaptureConfig class.

The code is used for configuring the capture, collection, and storage of prediction requests and responses for models hosted on SageMaker Endpoints.

class sagemaker.model_monitor.data_capture_config.DataCaptureConfig(enable_capture, sampling_percentage=20, destination_s3_uri=None, kms_key_id=None, capture_options=None, csv_content_types=None, json_content_types=None, sagemaker_session=None)

Bases: object

Configuration object passed in when deploying models to Amazon SageMaker Endpoints.

This object specifies configuration related to endpoint data capture for use with Amazon SageMaker Model Monitoring.

Initialize a DataCaptureConfig object for capturing data from Amazon SageMaker Endpoints.

Parameters
  • enable_capture (bool) – Required. Whether data capture should be enabled or not.

  • sampling_percentage (int) – Optional. Default=20. The percentage of data to sample. Must be between 0 and 100.

  • destination_s3_uri (str) – Optional. Defaults to “s3://<default-session-bucket>/model-monitor/data-capture”.

  • kms_key_id (str) – Optional. Default=None. The kms key to use when writing to S3.

  • capture_options ([str]) – Optional. Must be a list containing any combination of the following values: “REQUEST”, “RESPONSE”. Default=[“REQUEST”, “RESPONSE”]. Denotes which data to capture between request and response.

  • csv_content_types ([str]) – Optional. Default=[“text/csv”].

  • json_content_types ([str]) – Optional. Default=[“application/json”].

  • sagemaker_session (sagemaker.session.Session) – A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.

API_MAPPING = {'REQUEST': 'Input', 'RESPONSE': 'Output'}
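The API_MAPPING constant suggests that the user-facing values “REQUEST” and “RESPONSE” are translated into the API's “Input” and “Output” terms. A minimal sketch of such a translation follows. The helper to_capture_modes and the {"CaptureMode": ...} item shape are assumptions made for illustration, not the class's actual internals. The defaults and the 0 to 100 range come from the parameter descriptions above.

```python
API_MAPPING = {"REQUEST": "Input", "RESPONSE": "Output"}


def to_capture_modes(capture_options=None, sampling_percentage=20):
    """Validate the inputs and translate capture options through API_MAPPING."""
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")
    # Documented default: capture both requests and responses.
    options = capture_options if capture_options is not None else ["REQUEST", "RESPONSE"]
    unknown = [option for option in options if option not in API_MAPPING]
    if unknown:
        raise ValueError(f"unsupported capture options: {unknown}")
    # Assumed request shape: one {"CaptureMode": ...} item per option.
    return [{"CaptureMode": API_MAPPING[option]} for option in options]


default_modes = to_capture_modes()
response_only = to_capture_modes(["RESPONSE"], sampling_percentage=50)
```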

This module contains code related to the CronExpressionGenerator class.

The code is used for generating cron expressions compatible with Amazon SageMaker Model Monitoring Schedules.

class sagemaker.model_monitor.cron_expression_generator.CronExpressionGenerator

Bases: object

Generates cron expression strings for the SageMaker Model Monitoring Schedule API.

static hourly()

Generates hourly cron expression that denotes that a job runs at the top of every hour.

Returns

The cron expression format accepted by the Amazon SageMaker Model Monitoring Schedule API.

Return type

str

static daily(hour=0)

Generates daily cron expression that denotes that a job runs once a day, at the top of the specified hour.

Parameters

hour (int) – The hour in HH24 format (UTC) to run the job at, on a daily schedule. Examples: 00, 12, 17, 23.

Returns

The cron expression format accepted by the Amazon SageMaker Model Monitoring Schedule API.

Return type

str

static daily_every_x_hours(hour_interval, starting_hour=0)

Generates “daily every x hours” cron expression.

The expression denotes that a job runs every day starting at the specified hour, and then every x hours after that, as specified in hour_interval.

Example:

>>> daily_every_x_hours(hour_interval=2, starting_hour=0)

This will run every 2 hours starting at midnight.

>>> daily_every_x_hours(hour_interval=10, starting_hour=0)

This will run at midnight, 10am, and 8pm every day.

Parameters
  • hour_interval (int) – The hour interval to run the job at.

  • starting_hour (int) – The hour at which to begin in HH24 format (UTC).

Returns

The cron expression format accepted by the Amazon SageMaker Model Monitoring Schedule API.

Return type

str
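The run times implied by hour_interval and starting_hour can be worked out with a small helper. This is a sketch for checking a schedule by hand: run_hours is a hypothetical function, not part of CronExpressionGenerator, and it assumes the schedule restarts at starting_hour each day rather than wrapping past midnight, which matches the "midnight, 10am, and 8pm" example above.

```python
def run_hours(hour_interval, starting_hour=0):
    """List the UTC hours (0-23) at which a job would run each day."""
    if not 1 <= hour_interval <= 24:
        raise ValueError("hour_interval must be between 1 and 24")
    if not 0 <= starting_hour <= 23:
        raise ValueError("starting_hour must be between 0 and 23")
    # Step from the starting hour in increments of hour_interval until the day ends.
    return list(range(starting_hour, 24, hour_interval))


every_ten = run_hours(hour_interval=10, starting_hour=0)
every_two = run_hours(hour_interval=2, starting_hour=0)
every_six_from_three = run_hours(hour_interval=6, starting_hour=3)
```

Note that an interval that does not divide 24 evenly leaves an uneven gap at the end of the day. With hour_interval=10, the last run is at 20:00 and the next one is at midnight, four hours later.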

This module contains code related to Amazon SageMaker Explainability AI Model Monitoring.

These classes assist with suggesting baselines and creating monitoring schedules for monitoring bias metrics and feature attribution of SageMaker Endpoints.

class sagemaker.model_monitor.clarify_model_monitoring.ClarifyModelMonitor(role, instance_count=1, instance_type='ml.m5.xlarge', volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)

Bases: sagemaker.model_monitor.model_monitoring.ModelMonitor

Base class of Amazon SageMaker Explainability API model monitors.

This class is an abstract base class. Instantiate one of its subclasses to monitor bias metrics or feature attribution of an endpoint.

Initializes a monitor instance.

The monitor handles baselining datasets and creating Amazon SageMaker Monitoring Schedules to monitor SageMaker endpoints.

Parameters
  • role (str) – An AWS IAM role. The Amazon SageMaker jobs use this role.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the job’s volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • base_job_name (str) – Prefix for the job name. If not specified, a default name is generated based on the training image name and current timestamp.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • env (dict) – Environment variables to be passed to the job.

  • tags ([dict]) – List of tags to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.

run_baseline(**_)

Not implemented.

‘.run_baseline()’ is only allowed for ModelMonitor objects. Please use suggest_baseline instead.

Raises

NotImplementedError

latest_monitoring_statistics(**_)

Not implemented.

The class doesn’t support statistics.

Raises

NotImplementedError

list_executions()

Get the list of the latest monitoring executions in descending order of “ScheduledTime”.

Returns

List of ClarifyMonitoringExecution in descending order of “ScheduledTime”.

Return type

[sagemaker.model_monitor.ClarifyMonitoringExecution]

class sagemaker.model_monitor.clarify_model_monitoring.ModelBiasMonitor(role, instance_count=1, instance_type='ml.m5.xlarge', volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)

Bases: sagemaker.model_monitor.clarify_model_monitoring.ClarifyModelMonitor

Amazon SageMaker model monitor to monitor bias metrics of an endpoint.

Please see the __init__ method of its base class for how to instantiate it.

Initializes a monitor instance.

The monitor handles baselining datasets and creating Amazon SageMaker Monitoring Schedules to monitor SageMaker endpoints.

Parameters
  • role (str) – An AWS IAM role. The Amazon SageMaker jobs use this role.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the job’s volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • base_job_name (str) – Prefix for the job name. If not specified, a default name is generated based on the training image name and current timestamp.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • env (dict) – Environment variables to be passed to the job.

  • tags ([dict]) – List of tags to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.

JOB_DEFINITION_BASE_NAME = 'model-bias-job-definition'

classmethod monitoring_type()

Type of the monitoring job.

suggest_baseline(data_config, bias_config, model_config, model_predicted_label_config=None, wait=False, logs=False, job_name=None, kms_key=None)

Suggests baselines for use with Amazon SageMaker Model Monitoring Schedules.

Parameters
  • data_config (DataConfig) – Config of the input/output data.

  • bias_config (BiasConfig) – Config of sensitive groups.

  • model_config (ModelConfig) – Config of the model and its endpoint to be created.

  • model_predicted_label_config (ModelPredictedLabelConfig) – Config of how to extract the predicted label from the model output.

  • wait (bool) – Whether the call should wait until the job completes (default: False).

  • logs (bool) – Whether to show the logs produced by the job. Only meaningful when wait is True (default: False).

  • job_name (str) – Processing job name. If not specified, the processor generates a default job name, based on the image name and current timestamp.

  • kms_key (str) – The ARN of the KMS key that is used to encrypt the user code file (default: None).

Returns

The ProcessingJob object representing the baselining job.

Return type

sagemaker.processing.ProcessingJob

create_monitoring_schedule(endpoint_input, ground_truth_input, analysis_config=None, output_s3_uri=None, constraints=None, monitor_schedule_name=None, schedule_cron_expression=None, enable_cloudwatch_metrics=True)

Creates a monitoring schedule.

Parameters
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.

  • ground_truth_input (str) – S3 URI to ground truth dataset.

  • analysis_config (str or BiasAnalysisConfig) – URI to the analysis_config for the bias job. If None, the configuration of the latest baselining job is reused; if there is no baselining job, the call fails.

  • output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output”

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided it will be used for monitoring the endpoint. It can be a Constraints object or an S3 uri pointing to a constraints JSON file.

  • monitor_schedule_name (str) – Schedule name. If not specified, the processor generates a default job name, based on the image name and current timestamp.

  • schedule_cron_expression (str) – The cron expression that dictates how frequently this job runs. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.

  • enable_cloudwatch_metrics (bool) – Whether to publish cloudwatch metrics as part of the baselining or monitoring jobs.
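A call to create_monitoring_schedule might be wrapped as in the sketch below. The wrapper schedule_bias_monitoring is hypothetical, and the endpoint name, S3 URI, schedule name, and cron string are placeholder values. In practice the cron expression would come from CronExpressionGenerator. The wrapper accepts any object with a create_monitoring_schedule method, so a recording stand-in replaces a real ModelBiasMonitor here.

```python
from types import SimpleNamespace


def schedule_bias_monitoring(monitor, endpoint_name, ground_truth_s3_uri,
                             cron_expression, schedule_name=None):
    """Create a schedule using only parameters documented above.

    analysis_config is omitted, so (per the parameter description) the
    configuration of the latest baselining job would be reused.
    """
    monitor.create_monitoring_schedule(
        endpoint_input=endpoint_name,
        ground_truth_input=ground_truth_s3_uri,
        monitor_schedule_name=schedule_name,
        schedule_cron_expression=cron_expression,
        enable_cloudwatch_metrics=True,
    )


# Recording stand-in for a ModelBiasMonitor, so the sketch runs without AWS access.
recorded = {}
fake_monitor = SimpleNamespace(
    create_monitoring_schedule=lambda **kwargs: recorded.update(kwargs)
)

schedule_bias_monitoring(
    fake_monitor,
    endpoint_name="my-endpoint",                             # placeholder
    ground_truth_s3_uri="s3://example-bucket/ground-truth",  # placeholder
    cron_expression="cron(0 * ? * * *)",                     # placeholder value
    schedule_name="bias-schedule",                           # placeholder
)
```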

update_monitoring_schedule(endpoint_input=None, ground_truth_input=None, analysis_config=None, output_s3_uri=None, constraints=None, schedule_cron_expression=None, enable_cloudwatch_metrics=None, role=None, instance_count=None, instance_type=None, volume_size_in_gb=None, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, env=None, network_config=None)

Updates the existing monitoring schedule.

If more options than schedule_cron_expression are to be updated, a new job definition will be created to hold them. The old job definition will not be deleted.

Parameters
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.

  • ground_truth_input (str) – S3 URI to ground truth dataset.

  • analysis_config (str or BiasAnalysisConfig) – URI to the analysis_config for the bias job. If None, the configuration of the latest baselining job is reused; if there is no baselining job, the call fails.

  • output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output”

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided it will be used for monitoring the endpoint. It can be a Constraints object or an S3 uri pointing to a constraints JSON file.

  • schedule_cron_expression (str) – The cron expression that dictates how frequently this job runs. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.

  • enable_cloudwatch_metrics (bool) – Whether to publish cloudwatch metrics as part of the baselining or monitoring jobs.

  • role (str) – An AWS IAM role. The Amazon SageMaker jobs use this role.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the job’s volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • env (dict) – Environment variables to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.

delete_monitoring_schedule()

Deletes the monitoring schedule and its job definition.

classmethod attach(monitor_schedule_name, sagemaker_session=None)

Sets this object’s schedule name to the name provided.

This allows subsequent describe_schedule or list_executions calls to point to the given schedule.

Parameters
  • monitor_schedule_name (str) – The name of the schedule to attach to.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

class sagemaker.model_monitor.clarify_model_monitoring.BiasAnalysisConfig(bias_config, headers=None, label=None)

Bases: object

Analysis configuration for ModelBiasMonitor.

Creates an analysis config dictionary.

Parameters
  • bias_config (sagemaker.clarify.BiasConfig) – Config object related to bias configurations.

  • headers (list[str]) – A list of column names in the input dataset.

  • label (str) – Target attribute for the model required by bias metrics. Specified as column name or index for CSV dataset, or as JSONPath for JSONLines.
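
Because label may be given either as a column name or as an index for a CSV dataset, it can help to check it against headers before building the config. The following is a minimal sketch under stated assumptions: resolve_csv_label and build_bias_analysis_config are hypothetical helper names, and the validation rules are illustrative rather than anything the SDK enforces. The SDK import is deferred so the validation helper can be tried without the SDK installed.

```python
def resolve_csv_label(headers, label):
    """Return the zero-based column index that `label` refers to.

    Hypothetical helper: for a CSV dataset the label may be a column name
    or an index, so both forms are normalised to an index here.
    """
    if isinstance(label, int):
        if not 0 <= label < len(headers):
            raise ValueError(f"label index {label} is out of range")
        return label
    if label not in headers:
        raise ValueError(f"label {label!r} is not one of the headers")
    return headers.index(label)


def build_bias_analysis_config(bias_config, headers, label):
    """Validate the label, then build the config (requires the SageMaker SDK)."""
    # Imported lazily so the validation helper above works on its own.
    from sagemaker.model_monitor.clarify_model_monitoring import BiasAnalysisConfig

    resolve_csv_label(headers, label)  # raises ValueError on a bad label
    return BiasAnalysisConfig(bias_config, headers=headers, label=label)
```

Here bias_config is expected to be an already constructed sagemaker.clarify.BiasConfig object.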

class sagemaker.model_monitor.clarify_model_monitoring.ModelExplainabilityMonitor(role, instance_count=1, instance_type='ml.m5.xlarge', volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)

Bases: sagemaker.model_monitor.clarify_model_monitoring.ClarifyModelMonitor

Amazon SageMaker model monitor to monitor feature attribution of an endpoint.

Please see the __init__ method of its base class for how to instantiate it.

Initializes a monitor instance.

The monitor handles baselining datasets and creating Amazon SageMaker Monitoring Schedules to monitor SageMaker endpoints.

Parameters
  • role (str) – An AWS IAM role. The Amazon SageMaker jobs use this role.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the job’s volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • base_job_name (str) – Prefix for the job name. If not specified, a default name is generated based on the training image name and current timestamp.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • env (dict) – Environment variables to be passed to the job.

  • tags ([dict]) – List of tags to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.

JOB_DEFINITION_BASE_NAME = 'model-explainability-job-definition'

classmethod monitoring_type()

Type of the monitoring job.

suggest_baseline(data_config, explainability_config, model_config, model_scores=None, wait=False, logs=False, job_name=None, kms_key=None)

Suggest baselines for use with Amazon SageMaker Model Monitoring Schedules.

Parameters
  • data_config (DataConfig) – Config of the input/output data.

  • explainability_config (ExplainabilityConfig) – Config of the specific explainability method. Currently, only SHAP is supported.

  • model_config (ModelConfig) – Config of the model and its endpoint to be created.

  • model_scores (int or str) – Index or JSONPath location in the model output for the predicted scores to be explained. This is not required if the model output is a single score.

  • wait (bool) – Whether the call should wait until the job completes (default: False).

  • logs (bool) – Whether to show the logs produced by the job. Only meaningful when wait is True (default: False).

  • job_name (str) – Processing job name. If not specified, the processor generates a default job name, based on the image name and current timestamp.

  • kms_key (str) – The ARN of the KMS key that is used to encrypt the user code file (default: None).

Returns

The ProcessingJob object representing the baselining job.

Return type

sagemaker.processing.ProcessingJob

create_monitoring_schedule(endpoint_input, analysis_config=None, output_s3_uri=None, constraints=None, monitor_schedule_name=None, schedule_cron_expression=None, enable_cloudwatch_metrics=True)

Creates a monitoring schedule.

Parameters
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.

  • analysis_config (str or ExplainabilityAnalysisConfig) – URI to the analysis_config for the explainability job. If None, the configuration of the latest baselining job is reused; if there is no baselining job, the call fails.

  • output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output”

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided it will be used for monitoring the endpoint. It can be a Constraints object or an S3 uri pointing to a constraints JSON file.

  • monitor_schedule_name (str) – Schedule name. If not specified, the processor generates a default job name, based on the image name and current timestamp.

  • schedule_cron_expression (str) – The cron expression that dictates how often this job runs. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.

  • enable_cloudwatch_metrics (bool) – Whether to publish CloudWatch metrics as part of the baselining or monitoring jobs.
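
Since most of these arguments are optional, one way to call create_monitoring_schedule is to collect only the values that were actually set and let the documented defaults cover the rest. The sketch below assumes this pattern; schedule_kwargs and create_schedule are hypothetical helper names, and the endpoint name and S3 URI in the usage note are placeholders.

```python
def schedule_kwargs(endpoint_input, analysis_config=None, output_s3_uri=None,
                    constraints=None, monitor_schedule_name=None,
                    schedule_cron_expression=None, enable_cloudwatch_metrics=True):
    """Collect create_monitoring_schedule arguments, omitting unset optional ones."""
    optional = {
        "analysis_config": analysis_config,
        "output_s3_uri": output_s3_uri,
        "constraints": constraints,
        "monitor_schedule_name": monitor_schedule_name,
        "schedule_cron_expression": schedule_cron_expression,
    }
    kwargs = {
        "endpoint_input": endpoint_input,
        "enable_cloudwatch_metrics": enable_cloudwatch_metrics,
    }
    # Leave out anything still None so the documented defaults apply.
    kwargs.update({name: value for name, value in optional.items() if value is not None})
    return kwargs


def create_schedule(monitor, **options):
    """Create the schedule on an already constructed monitor object."""
    return monitor.create_monitoring_schedule(**schedule_kwargs(**options))
```

For example, create_schedule(monitor, endpoint_input="my-endpoint", output_s3_uri="s3://example-bucket/monitoring") would pass only those two values plus enable_cloudwatch_metrics=True.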

update_monitoring_schedule(endpoint_input=None, analysis_config=None, output_s3_uri=None, constraints=None, schedule_cron_expression=None, enable_cloudwatch_metrics=None, role=None, instance_count=None, instance_type=None, volume_size_in_gb=None, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, env=None, network_config=None)

Updates the existing monitoring schedule.

If any option other than schedule_cron_expression is updated, a new job definition is created to hold the changes. The old job definition is not deleted.

Parameters
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.

  • analysis_config (str or ExplainabilityAnalysisConfig) – URI to the analysis_config for the explainability job. If None, the configuration of the latest baselining job is reused; if there is no baselining job, the call fails.

  • output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output”

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided it will be used for monitoring the endpoint. It can be a Constraints object or an S3 uri pointing to a constraints JSON file.

  • schedule_cron_expression (str) – The cron expression that dictates how often this job runs. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.

  • enable_cloudwatch_metrics (bool) – Whether to publish CloudWatch metrics as part of the baselining or monitoring jobs.

  • role (str) – An AWS IAM role. The Amazon SageMaker jobs use this role.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the job’s volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • env (dict) – Environment variables to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.
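
The rule about job definitions can be expressed as a small predicate. This is a sketch of the documented behavior only: needs_new_job_definition is a hypothetical helper, and the SDK's internal check may differ in detail.

```python
def needs_new_job_definition(**updates):
    """Return True when an update touches anything besides the cron expression.

    Arguments left as None are treated as "not updated", matching the
    defaults in the update_monitoring_schedule signature.
    """
    changed = {name for name, value in updates.items() if value is not None}
    return bool(changed - {"schedule_cron_expression"})
```

Under this reading, changing only the schedule leaves the existing job definition in place, while also changing instance_count (for example) leads to a new one alongside the old.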

delete_monitoring_schedule()

Deletes the monitoring schedule and its job definition.

classmethod attach(monitor_schedule_name, sagemaker_session=None)

Sets this object’s schedule name to the name provided.

This allows subsequent describe_schedule or list_executions calls to point to the given schedule.

Parameters
  • monitor_schedule_name (str) – The name of the schedule to attach to.

  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

class sagemaker.model_monitor.clarify_model_monitoring.ExplainabilityAnalysisConfig(explainability_config, model_config, headers=None)

Bases: object

Analysis configuration for ModelExplainabilityMonitor.

Creates an analysis config dictionary.

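Parameters
  • explainability_config (sagemaker.clarify.ExplainabilityConfig) – Config object related to explainability configurations.

  • model_config (sagemaker.clarify.ModelConfig) – Config object related to the model and its endpoint.

  • headers (list[str]) – A list of feature names (without the label) of the model/endpoint input.

class sagemaker.model_monitor.clarify_model_monitoring.ClarifyBaseliningConfig(analysis_config, features_attribute=None, inference_attribute=None, probability_attribute=None, probability_threshold_attribute=None)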

Bases: object

Data class that holds essential analysis configuration of a ClarifyBaseliningJob.

Initialization.

Parameters
  • analysis_config (BiasAnalysisConfig or ExplainabilityAnalysisConfig) – analysis config from configurations of the baselining job.

  • features_attribute (str) – JSONPath to locate features in predictor request payload. Only required when predictor content type is JSONLines.

  • inference_attribute (str) – Index, header or JSONPath to locate predicted label in predictor response payload.

  • probability_attribute (str) – Index or JSONPath location in the model output for probabilities or scores to be used for explainability.

  • probability_threshold_attribute (float) – Value to indicate the threshold to select the binary label in the case of binary classification. Default is 0.5.

class sagemaker.model_monitor.clarify_model_monitoring.ClarifyBaseliningJob(processing_job)

Bases: sagemaker.model_monitor.model_monitoring.BaseliningJob

Provides functionality to retrieve baseline-specific output from Clarify baselining job.

Initializes a ClarifyBaseliningJob that tracks a baselining job started by suggest_baseline().

Parameters

processing_job (sagemaker.processing.ProcessingJob) – The ProcessingJob instance used for baselining.

baseline_statistics(**_)

Not implemented.

The class doesn’t support statistics.

Raises

NotImplementedError

suggested_constraints(file_name=None, kms_key=None)

Returns a sagemaker.model_monitor.Constraints object representing the constraints JSON file generated by this baselining job.

Parameters
  • file_name (str) – Ignored. Kept only to align with the method signature in the superclass.

  • kms_key (str) – The KMS key to use when retrieving the file.

Returns

The Constraints object representing the file that was generated by the job.

Return type

sagemaker.model_monitor.Constraints

Raises

UnexpectedStatusException – Raised if the job is not in a ‘Complete’ state.

class sagemaker.model_monitor.clarify_model_monitoring.ClarifyMonitoringExecution(sagemaker_session, job_name, inputs, output, output_kms_key=None)

Bases: sagemaker.model_monitor.model_monitoring.MonitoringExecution

Provides functionality to retrieve monitoring-specific files output from executions.

Initializes an object that tracks a monitoring execution started by a Clarify model monitor.

Parameters
  • sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

  • job_name (str) – The name of the monitoring execution job.

  • output (sagemaker.processing.ProcessingOutput) – The output associated with the monitoring execution.

  • output_kms_key (str) – The output KMS key associated with the job. Defaults to None if not provided.

statistics(**_)

Not implemented.

The class doesn’t support statistics.

Raises

NotImplementedError