Pipelines

ConditionStep

class sagemaker.workflow.condition_step.ConditionStep(name: str, depends_on: Union[List[str], List[sagemaker.workflow.steps.Step]] = None, display_name: str = None, description: str = None, conditions: List[sagemaker.workflow.conditions.Condition] = None, if_steps: List[Union[sagemaker.workflow.steps.Step, sagemaker.workflow.step_collections.StepCollection]] = None, else_steps: List[Union[sagemaker.workflow.steps.Step, sagemaker.workflow.step_collections.StepCollection]] = None)

Conditional step for pipelines to support conditional branching in the execution of steps.

Construct a ConditionStep for pipelines to support conditional branching.

If all of the conditions in the condition list evaluate to True, the if_steps are marked as ready for execution. Otherwise, the else_steps are marked as ready for execution.

Parameters
  • name (str) – The name of the condition step.

  • display_name (str) – The display name of the condition step.

  • description (str) – The description of the condition step.

  • conditions (List[Condition]) – A list of sagemaker.workflow.conditions.Condition instances.

  • if_steps (List[Union[Step, StepCollection]]) – A list of sagemaker.workflow.steps.Step or sagemaker.workflow.step_collections.StepCollection instances that are marked as ready for execution if the list of conditions evaluates to True.

  • else_steps (List[Union[Step, StepCollection]]) – A list of sagemaker.workflow.steps.Step or sagemaker.workflow.step_collections.StepCollection instances that are marked as ready for execution if the list of conditions evaluates to False.
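
The branching rule described above can be modelled in plain Python. This is a minimal sketch of the documented semantics only, not SDK code: select_branch is a hypothetical helper, conditions are represented as already-evaluated booleans, and steps are represented by their names.

```python
from typing import List, Sequence


def select_branch(
    conditions: Sequence[bool],
    if_steps: List[str],
    else_steps: List[str],
) -> List[str]:
    """Return the steps that are marked as ready for execution."""
    # The if_steps run only when every condition in the list holds.
    return if_steps if all(conditions) else else_steps


# One failing condition is enough to select the else branch.
ready = select_branch([True, False], ["RegisterModel"], ["NotifyFailure"])
print(ready)
```

Because the conditions are combined with a logical AND, an OR across conditions has to be expressed explicitly, for example by wrapping them in a single ConditionOr.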

Deprecated: sagemaker.workflow.condition_step.JsonGet is deprecated. Use sagemaker.workflow.functions.JsonGet instead.

Conditions

class sagemaker.workflow.conditions.ConditionTypeEnum(*args, value=<object object>, **kwargs)

Condition type enum.

class sagemaker.workflow.conditions.Condition(condition_type: sagemaker.workflow.conditions.ConditionTypeEnum = NOTHING)

Abstract Condition entity.

condition_type

The type of condition.

Type

ConditionTypeEnum

Method generated by attrs for class Condition.

class sagemaker.workflow.conditions.ConditionComparison(condition_type: sagemaker.workflow.conditions.ConditionTypeEnum = NOTHING, left: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]] = None, right: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]] = None)

Generic comparison condition that can be used to derive specific condition comparisons.

left

The execution variable, parameter, property, or Python primitive value to use in the comparison.

Type

Union[ConditionValueType, PrimitiveType]

right

The execution variable, parameter, property, or Python primitive value to compare to.

Type

Union[ConditionValueType, PrimitiveType]

Method generated by attrs for class ConditionComparison.

class sagemaker.workflow.conditions.ConditionEquals(left: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]], right: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]])

A condition for equality comparisons.

Construct a condition for equality comparisons.

Parameters
  • left (Union[ConditionValueType, PrimitiveType]) – The execution variable, parameter, property, or Python primitive value to use in the comparison.

  • right (Union[ConditionValueType, PrimitiveType]) – The execution variable, parameter, property, or Python primitive value to compare to.

class sagemaker.workflow.conditions.ConditionGreaterThan(left: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]], right: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]])

A condition for greater than comparisons.

Construct an instance of ConditionGreaterThan for greater than comparisons.

Parameters
  • left (Union[ConditionValueType, PrimitiveType]) – The execution variable, parameter, property, or Python primitive value to use in the comparison.

  • right (Union[ConditionValueType, PrimitiveType]) – The execution variable, parameter, property, or Python primitive value to compare to.

class sagemaker.workflow.conditions.ConditionGreaterThanOrEqualTo(left: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]], right: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]])

A condition for greater than or equal to comparisons.

Construct an instance of ConditionGreaterThanOrEqualTo for greater than or equal to comparisons.

Parameters
  • left (Union[ConditionValueType, PrimitiveType]) – The execution variable, parameter, property, or Python primitive value to use in the comparison.

  • right (Union[ConditionValueType, PrimitiveType]) – The execution variable, parameter, property, or Python primitive value to compare to.

class sagemaker.workflow.conditions.ConditionLessThan(left: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]], right: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]])

A condition for less than comparisons.

Construct an instance of ConditionLessThan for less than comparisons.

Parameters
  • left (Union[ConditionValueType, PrimitiveType]) – The execution variable, parameter, property, or Python primitive value to use in the comparison.

  • right (Union[ConditionValueType, PrimitiveType]) – The execution variable, parameter, property, or Python primitive value to compare to.

class sagemaker.workflow.conditions.ConditionLessThanOrEqualTo(left: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]], right: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]])

A condition for less than or equal to comparisons.

Construct an instance of ConditionLessThanOrEqualTo for less than or equal to comparisons.

Parameters
  • left (Union[ConditionValueType, PrimitiveType]) – The execution variable, parameter, property, or Python primitive value to use in the comparison.

  • right (Union[ConditionValueType, PrimitiveType]) – The execution variable, parameter, property, or Python primitive value to compare to.

class sagemaker.workflow.conditions.ConditionIn(value: Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]], in_values: List[Optional[Union[sagemaker.workflow.execution_variables.ExecutionVariable, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.properties.Properties, str, int, float]]])

A condition to check membership.

Construct a ConditionIn condition to check membership.

Parameters
  • value (Union[ConditionValueType, PrimitiveType]) – The execution variable, parameter, property or primitive value to check for membership.

  • in_values (List[Union[ConditionValueType, PrimitiveType]]) – The list of values to check for membership in.

class sagemaker.workflow.conditions.ConditionNot(expression: sagemaker.workflow.conditions.Condition)

A condition for negating another Condition.

Construct a ConditionNot condition for negating another Condition.

expression

A Condition to take the negation of.

Type

Condition

class sagemaker.workflow.conditions.ConditionOr(conditions: List[sagemaker.workflow.conditions.Condition] = None)

A condition for taking the logical OR of a list of Condition instances.

Construct a ConditionOr condition.

conditions

A list of Condition instances to logically OR.

Type

List[Condition]
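
What each condition class in this section computes can be sketched in plain Python. This is an illustrative model, not the SDK implementation: the COMPARISONS table and the helper functions are hypothetical names, and operands are plain values rather than execution variables, parameters, or properties.

```python
import operator
from typing import Any, Callable, Dict, List

# Hypothetical lookup table: comparison name -> equivalent Python operator.
COMPARISONS: Dict[str, Callable[[Any, Any], bool]] = {
    "Equals": operator.eq,
    "GreaterThan": operator.gt,
    "GreaterThanOrEqualTo": operator.ge,
    "LessThan": operator.lt,
    "LessThanOrEqualTo": operator.le,
}


def compare(kind: str, left: Any, right: Any) -> bool:
    """Model a ConditionComparison subclass applied to resolved values."""
    return COMPARISONS[kind](left, right)


def condition_in(value: Any, in_values: List[Any]) -> bool:
    """Model ConditionIn: membership of value in in_values."""
    return value in in_values


def condition_not(expression: bool) -> bool:
    """Model ConditionNot: negation of another condition's result."""
    return not expression


def condition_or(conditions: List[bool]) -> bool:
    """Model ConditionOr: logical OR over a list of condition results."""
    return any(conditions)


accuracy = 0.87
deploy = condition_or([
    compare("GreaterThanOrEqualTo", accuracy, 0.9),
    condition_in("staging", ["staging", "dev"]),
])
print(deploy)
```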

Entities

class sagemaker.workflow.entities.Entity

Base object for workflow entities.

Entities must implement the to_request method.

class sagemaker.workflow.entities.DefaultEnumMeta(cls, bases, classdict)

An EnumMeta which defaults to the first value in the Enum list.

class sagemaker.workflow.entities.Expression

Base object for expressions.

Expressions must implement the expr property.

Execution Variables

class sagemaker.workflow.execution_variables.ExecutionVariable(name: str)

Pipeline execution variables for workflow.

Create a pipeline execution variable.

Parameters

name (str) – The name of the execution variable.

class sagemaker.workflow.execution_variables.ExecutionVariables

All available ExecutionVariable instances.

Functions

class sagemaker.workflow.functions.Join(on: str = NOTHING, values: List = NOTHING)

Join together properties.

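Example: Build an Amazon S3 URI from a bucket name parameter and the pipeline execution ID, and use it as training input:

from sagemaker.inputs import TrainingInput
from sagemaker.workflow.execution_variables import ExecutionVariables
from sagemaker.workflow.functions import Join
from sagemaker.workflow.parameters import ParameterString

bucket = ParameterString('bucket', default_value='my-bucket')

TrainingInput(
    s3_data=Join(on='/', values=['s3:/', bucket, ExecutionVariables.PIPELINE_EXECUTION_ID]),
    content_type="text/csv")

values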

The primitive type values, parameters, step properties, expressions to join.

Type

List[Union[PrimitiveType, Parameter, Expression]]

on

The string to join the values on (defaults to the empty string, "").

Type

str

Method generated by attrs for class Join.
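
The effect of Join can be modelled in plain Python to show why the S3 example starts with 's3:/' (a single slash). This is a sketch under stated assumptions, not SDK code: Placeholder and resolve_join are hypothetical names, and the runtime dictionary stands in for the values that would be substituted when the pipeline executes.

```python
from typing import Dict, List, Union


class Placeholder:
    """Hypothetical stand-in for a parameter or execution variable."""

    def __init__(self, name: str) -> None:
        self.name = name


def resolve_join(
    on: str,
    values: List[Union[str, int, float, Placeholder]],
    runtime: Dict[str, str],
) -> str:
    """Substitute placeholders with runtime values, then join with `on`."""
    resolved = [
        runtime[v.name] if isinstance(v, Placeholder) else str(v)
        for v in values
    ]
    return on.join(resolved)


# 's3:/' plus the '/' separator yields the 's3://' scheme prefix.
uri = resolve_join(
    "/",
    ["s3:/", Placeholder("bucket"), Placeholder("execution_id")],
    {"bucket": "my-bucket", "execution_id": "abc123"},
)
print(uri)  # s3://my-bucket/abc123
```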

class sagemaker.workflow.functions.JsonGet(step_name: str, property_file: Union[sagemaker.workflow.properties.PropertyFile, str], json_path: str)

Get JSON properties from PropertyFiles.

step_name

The step name from which to get the property file.

Type

str

property_file

Either a PropertyFile instance or the name of a property file.

Type

Union[PropertyFile, str]

json_path

The JSON path expression to the requested value.

Type

str

Method generated by attrs for class JsonGet.
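
The idea behind json_path can be illustrated with a simplified resolver over a JSON document. This is only a sketch: json_get is a hypothetical helper, the dotted-key and [index] syntax handled here is an assumption made for illustration, and the path syntax actually accepted by the service is not specified in this section.

```python
import json
import re
from typing import Any


def json_get(document: str, json_path: str) -> Any:
    """Walk a parsed JSON document along a path such as 'a.b[0].c'."""
    value = json.loads(document)
    # Each match is either a dictionary key or a bracketed list index.
    for key, index in re.findall(r"([^.\[\]]+)|\[(\d+)\]", json_path):
        value = value[key] if key else value[int(index)]
    return value


# Stand-in for the contents of a property file written by a processing job.
report = '{"metrics": {"accuracy": {"value": 0.91}}, "runs": [{"score": 1}, {"score": 2}]}'
print(json_get(report, "metrics.accuracy.value"))
print(json_get(report, "runs[1].score"))
```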

Parameters

class sagemaker.workflow.parameters.ParameterTypeEnum(*args, value=<object object>, **kwargs)

Parameter type enum.

class sagemaker.workflow.parameters.Parameter(name: str = NOTHING, parameter_type: sagemaker.workflow.parameters.ParameterTypeEnum = NOTHING, default_value: Optional[Union[str, int, float]] = None)

Pipeline parameter for workflow.

name

The name of the parameter.

Type

str

parameter_type

The type of the parameter.

Type

ParameterTypeEnum

default_value

The default value of the parameter.

Type

PrimitiveType

Method generated by attrs for class Parameter.

class sagemaker.workflow.parameters.ParameterString(*args, **kwargs)

String parameter for pipelines.

Create a pipeline string parameter.

Parameters
  • name (str) – The name of the parameter.

  • default_value (str) – The default value of the parameter. The default value could be overridden at start of an execution. If not set or it is set to None, a value must be provided at the start of the execution.

  • enum_values (List[str]) – Enum values for this parameter.

class sagemaker.workflow.parameters.ParameterInteger(*args, **kwargs)

Integer parameter for pipelines.

Create a pipeline integer parameter.

Parameters
  • name (str) – The name of the parameter.

  • default_value (int) – The default value of the parameter. The default value could be overridden at start of an execution. If not set or it is set to None, a value must be provided at the start of the execution.

class sagemaker.workflow.parameters.ParameterFloat(*args, **kwargs)

Float parameter for pipelines.

Create a pipeline float parameter.

Parameters
  • name (str) – The name of the parameter.

  • default_value (float) – The default value of the parameter. The default value could be overridden at start of an execution. If not set or it is set to None, a value must be provided at the start of the execution.
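
The default-value rule shared by the parameter classes above can be sketched as follows. This is a plain-Python model, not SDK code: resolve_parameters is a hypothetical helper, declared parameters are represented as a mapping from name to default value, and raising ValueError is an assumption chosen for the sketch.

```python
from typing import Dict, Optional, Union

Primitive = Union[str, int, float]


def resolve_parameters(
    defaults: Dict[str, Optional[Primitive]],
    overrides: Optional[Dict[str, Primitive]] = None,
) -> Dict[str, Primitive]:
    """Combine declared defaults with the values given at execution start."""
    overrides = overrides or {}
    resolved: Dict[str, Primitive] = {}
    for name, default in defaults.items():
        # A value supplied at start takes precedence over the default.
        value = overrides.get(name, default)
        if value is None:
            raise ValueError(
                f"parameter {name!r} has no default; provide a value at start"
            )
        resolved[name] = value
    return resolved


declared = {"InstanceCount": 1, "Bucket": None}
print(resolve_parameters(declared, {"Bucket": "my-bucket"}))
```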

Pipeline

class sagemaker.workflow.pipeline.Pipeline(name: str = NOTHING, parameters: Sequence[sagemaker.workflow.parameters.Parameter] = NOTHING, pipeline_experiment_config: Optional[sagemaker.workflow.pipeline_experiment_config.PipelineExperimentConfig] = <sagemaker.workflow.pipeline_experiment_config.PipelineExperimentConfig object>, steps: Sequence[Union[sagemaker.workflow.steps.Step, sagemaker.workflow.step_collections.StepCollection]] = NOTHING, sagemaker_session: sagemaker.session.Session = NOTHING)

Pipeline for workflow.

name

The name of the pipeline.

Type

str

parameters

The list of the parameters.

Type

Sequence[Parameter]

pipeline_experiment_config

If set, the workflow will attempt to create an experiment and trial before executing the steps. Creation will be skipped if an experiment or a trial with the same name already exists. By default, pipeline name is used as experiment name and execution id is used as the trial name. If set to None, no experiment or trial will be created automatically.

Type

Optional[PipelineExperimentConfig]

steps

The list of the non-conditional steps associated with the pipeline. Any steps that are within the if_steps or else_steps of a ConditionStep cannot be listed in the steps of a pipeline. Of particular note, the workflow service rejects any pipeline definitions that specify a step in the list of steps of a pipeline and that step in the if_steps or else_steps of any ConditionStep.

Type

Sequence[Union[Step, StepCollection]]
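
The restriction on steps can be checked before submitting a definition. The sketch below is an assumption-laden illustration rather than SDK code: steps_listed_twice is a hypothetical helper, and steps are identified by name only.

```python
from typing import Iterable, List


def steps_listed_twice(
    pipeline_steps: Iterable[str],
    branch_steps: Iterable[str],
) -> List[str]:
    """Return step names present both at pipeline level and in a branch."""
    return sorted(set(pipeline_steps) & set(branch_steps))


conflicts = steps_listed_twice(
    pipeline_steps=["Process", "Train", "CheckAccuracy", "Register"],
    branch_steps=["Register", "Notify"],  # if_steps and else_steps combined
)
print(conflicts)  # ['Register']
```

An empty result means no step appears in both places; any name that is returned should be removed from the pipeline-level list and kept only in the if_steps or else_steps of its ConditionStep.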

sagemaker_session

Session object that manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the pipeline creates one using the default AWS configuration chain.

Type

sagemaker.session.Session

Method generated by attrs for class Pipeline.

to_request() → Union[Dict[str, Any], List[Dict[str, Any]]]

Gets the request structure for workflow service calls.

create(role_arn: str, description: str = None, tags: List[Dict[str, str]] = None) → Dict[str, Any]

Creates a Pipeline in the Pipelines service.

Parameters
  • role_arn (str) – The role arn that is assumed by the pipeline to create step artifacts.

  • description (str) – A description of the pipeline.

  • tags (List[Dict[str, str]]) – A list of {“Key”: “string”, “Value”: “string”} dicts as tags.

Returns

A response dict from the service.

describe() → Dict[str, Any]

Describes a Pipeline in the Workflow service.

Returns

A response dict from the service. See the boto3 client documentation.

update(role_arn: str, description: str = None) → Dict[str, Any]

Updates a Pipeline in the Workflow service.

Parameters
  • role_arn (str) – The role arn that is assumed by pipelines to create step artifacts.

  • description (str) – A description of the pipeline.

Returns

A response dict from the service.

upsert(role_arn: str, description: str = None, tags: List[Dict[str, str]] = None) → Dict[str, Any]

Creates a pipeline or updates it, if it already exists.

Parameters
  • role_arn (str) – The role arn that is assumed by workflow to create step artifacts.

  • description (str) – A description of the pipeline.

  • tags (List[Dict[str, str]]) – A list of {“Key”: “string”, “Value”: “string”} dicts as tags.

Returns

A response dict from the service.

delete() → Dict[str, Any]

Deletes a Pipeline in the Workflow service.

Returns

A response dict from the service.

start(parameters: Dict[str, Union[str, int, float]] = None, execution_display_name: str = None, execution_description: str = None)

Starts a Pipeline execution in the Workflow service.

Parameters
  • parameters (Dict[str, Union[str, bool, int, float]]) – Values to override pipeline parameters.

  • execution_display_name (str) – The display name of the pipeline execution.

  • execution_description (str) – A description of the execution.

Returns

A _PipelineExecution instance, if successful.

definition() → str

Converts a request structure to string representation for workflow service calls.

class sagemaker.workflow.pipeline._PipelineExecution(arn: str, sagemaker_session: sagemaker.session.Session = NOTHING)

Internal class for encapsulating pipeline execution instances.

arn

The arn of the pipeline execution.

Type

str

sagemaker_session

Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the pipeline creates one using the default AWS configuration chain.

Type

sagemaker.session.Session

Method generated by attrs for class _PipelineExecution.

stop()

Stops a pipeline execution.

describe()

Describes a pipeline execution.

Returns

Information about the pipeline execution. See boto3 client describe_pipeline_execution.

list_steps()

Describes a pipeline execution’s steps.

Returns

Information about the steps of the pipeline execution. See boto3 client list_pipeline_execution_steps.

wait(delay=30, max_attempts=60)

Waits for a pipeline execution.

Parameters
  • delay (int) – The polling interval. (Defaults to 30 seconds)

  • max_attempts (int) – The maximum number of polling attempts. (Defaults to 60 polling attempts)
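
The relationship between delay and max_attempts can be sketched as a polling loop. This is a minimal model, not the SDK implementation: wait_for_execution is a hypothetical helper, "Executing" is assumed to be the in-progress status, and raising TimeoutError when attempts run out is a choice made for the sketch.

```python
import time
from typing import Callable


def wait_for_execution(
    get_status: Callable[[], str],
    delay: float = 30,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until the execution leaves the in-progress state."""
    for _ in range(max_attempts):
        status = get_status()
        if status != "Executing":  # assumed name of the in-progress status
            return status
        sleep(delay)
    raise TimeoutError(f"still executing after {max_attempts} attempts")


# Simulated execution that finishes on the third poll; sleeping is stubbed out.
statuses = iter(["Executing", "Executing", "Succeeded"])
final = wait_for_execution(lambda: next(statuses), sleep=lambda seconds: None)
print(final)
```

With the documented defaults, 60 attempts spaced 30 seconds apart bound the wait at roughly 30 minutes, so longer-running executions need a larger delay or max_attempts.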

Pipeline Experiment Config

class sagemaker.workflow.pipeline_experiment_config.PipelineExperimentConfig(experiment_name: Union[str, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.entities.Expression], trial_name: Union[str, sagemaker.workflow.parameters.Parameter, sagemaker.workflow.entities.Expression])

Experiment config for SageMaker pipeline.

Create a PipelineExperimentConfig.

Examples: Use pipeline name as the experiment name and pipeline execution id as the trial name:

PipelineExperimentConfig(
    ExecutionVariables.PIPELINE_NAME, ExecutionVariables.PIPELINE_EXECUTION_ID)

Use a customized experiment name and pipeline execution id as the trial name:

PipelineExperimentConfig(
    'MyExperiment', ExecutionVariables.PIPELINE_EXECUTION_ID)
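
Parameters
  • experiment_name (Union[str, Parameter, Expression]) – The name of the experiment that will be created.

  • trial_name (Union[str, Parameter, Expression]) – The name of the trial that will be created.
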
class sagemaker.workflow.pipeline_experiment_config.PipelineExperimentConfigProperty(name: str)

Reference to pipeline experiment config property.

Create a reference to pipeline experiment property.

Parameters

name (str) – The name of the pipeline experiment config property.

Properties

class sagemaker.workflow.properties.PropertiesMeta(*args, **kwargs)

Load an internal shapes attribute from the botocore sagemaker service model.

Loads up the shapes from the botocore sagemaker service model.

class sagemaker.workflow.properties.Properties(path: str, shape_name: str = None, shape_names: List[str] = None)

Properties for use in workflow expressions.

Create a Properties instance representing the given shape.

Parameters
  • path (str) – The parent path of the Properties instance.

  • shape_name (str) – The botocore sagemaker service model shape name.

  • shape_names (List[str]) – A list of botocore sagemaker service model shape names.

class sagemaker.workflow.properties.PropertiesList(path: str, shape_name: str = None)

PropertiesList for use in workflow expressions.

Create a PropertiesList instance representing the given shape.

Parameters
  • path (str) – The parent path of the PropertiesList instance.

  • shape_name (str) – The botocore sagemaker service model shape name.

  • root_shape_name (str) – The botocore sagemaker service model shape name.

class sagemaker.workflow.properties.PropertyFile(name: str, output_name: str, path: str)

Provides a property file struct.

name

The name of the property file for reference with JsonGet functions.

output_name

The name of the processing job output channel.

path

The path to the file at the output channel location.

Method generated by attrs for class PropertyFile.

Step Collections

class sagemaker.workflow.step_collections.StepCollection(steps: List[sagemaker.workflow.steps.Step] = NOTHING)

A wrapper of pipeline steps for workflow.

steps

A list of steps.

Type

List[Step]

Method generated by attrs for class StepCollection.

class sagemaker.workflow.step_collections.RegisterModel(name: str, content_types, response_types, inference_instances, transform_instances, estimator: sagemaker.estimator.EstimatorBase = None, model_data=None, depends_on: Union[List[str], List[sagemaker.workflow.steps.Step]] = None, repack_model_step_retry_policies: List[sagemaker.workflow.retry.RetryPolicy] = None, register_model_step_retry_policies: List[sagemaker.workflow.retry.RetryPolicy] = None, model_package_group_name=None, model_metrics=None, approval_status=None, image_uri=None, compile_model_family=None, display_name=None, description=None, tags=None, model: Union[sagemaker.model.Model, sagemaker.pipeline.PipelineModel] = None, **kwargs)

Register Model step collection for workflow.

Construct steps _RepackModelStep and _RegisterModelStep based on the estimator.

Parameters
  • name (str) – The name of the training step.

  • estimator – The estimator instance.

  • model_data – The S3 uri to the model data from training.

  • content_types (list) – The supported MIME types for the input data (default: None).

  • response_types (list) – The supported MIME types for the output data (default: None).

  • inference_instances (list) – A list of the instance types that are used to generate inferences in real-time (default: None).

  • transform_instances (list) – A list of the instance types on which a transformation job can be run or on which an endpoint can be deployed (default: None).

  • depends_on (List[str] or List[Step]) – The list of step names or step instances the first step in the collection depends on.

  • repack_model_step_retry_policies (List[RetryPolicy]) – The list of retry policies for the repack model step.

  • register_model_step_retry_policies (List[RetryPolicy]) – The list of retry policies for the register model step.

  • model_package_group_name (str) – The Model Package Group name, exclusive to model_package_name, using model_package_group_name makes the Model Package versioned (default: None).

  • model_metrics (ModelMetrics) – ModelMetrics object (default: None).

  • approval_status (str) – Model Approval Status, values can be “Approved”, “Rejected”, or “PendingManualApproval” (default: “PendingManualApproval”).

  • image_uri (str) – The container image uri for Model Package, if not specified, Estimator’s training container image is used (default: None).

  • compile_model_family (str) – The instance family for the compiled model. If specified, a compiled model is used (default: None).

  • description (str) – Model Package description (default: None).

  • tags (List[dict[str, str]]) – The list of tags to attach to the model package group. Note that tags will only be applied to newly created model package groups; if the name of an existing group is passed to “model_package_group_name”, tags will not be applied.

  • model (object or Model) – Either a PipelineModel object, which comprises a list of models that are executed as a serial inference pipeline, or a Model object.

  • **kwargs – additional arguments to create_model.

class sagemaker.workflow.step_collections.EstimatorTransformer(name: str, estimator: sagemaker.estimator.EstimatorBase, model_data, model_inputs, instance_count, instance_type, transform_inputs, description: str = None, display_name: str = None, image_uri=None, predictor_cls=None, env=None, strategy=None, assemble_with=None, output_path=None, output_kms_key=None, accept=None, max_concurrent_transforms=None, max_payload=None, tags=None, volume_kms_key=None, depends_on: Union[List[str], List[sagemaker.workflow.steps.Step]] = None, repack_model_step_retry_policies: List[sagemaker.workflow.retry.RetryPolicy] = None, model_step_retry_policies: List[sagemaker.workflow.retry.RetryPolicy] = None, transform_step_retry_policies: List[sagemaker.workflow.retry.RetryPolicy] = None, **kwargs)

Creates a Transformer step collection for workflow.

Construct steps required for a Transformer step collection:

An estimator-centric step collection. It models what happens in workflows when invoking the transform() method on an estimator instance: First, if custom model artifacts are required, a _RepackModelStep is included. Second, a CreateModelStep is added with the model data passed in from a training step or other training job output. Finally, a TransformStep is added.

If repacking the model artifacts is not necessary, only the CreateModelStep and TransformStep are in the step collection.
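
The composition rule described above can be summarised in a few lines of plain Python. This is a sketch only: transformer_collection_steps is a hypothetical helper, the needs_repack flag stands in for whatever check decides that custom model artifacts are required, and the steps are represented by class names (the final one being the transform step, sagemaker.workflow.steps.TransformStep).

```python
from typing import List


def transformer_collection_steps(needs_repack: bool) -> List[str]:
    """Return the step classes in the collection, in execution order."""
    steps: List[str] = []
    if needs_repack:
        # Only included when custom model artifacts are required.
        steps.append("_RepackModelStep")
    steps += ["CreateModelStep", "TransformStep"]
    return steps


print(transformer_collection_steps(needs_repack=True))
print(transformer_collection_steps(needs_repack=False))
```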

Parameters
  • name (str) – The name of the Transform Step.

  • estimator – The estimator instance.

  • instance_count (int) – The number of EC2 instances to use.

  • instance_type (str) – The type of EC2 instance to use.

  • strategy (str) – The strategy used to decide how to batch records in a single request (default: None). Valid values: ‘MultiRecord’ and ‘SingleRecord’.

  • assemble_with (str) – How the output is assembled (default: None). Valid values: ‘Line’ or ‘None’.

  • output_path (str) – The S3 location for saving the transform result. If not specified, results are stored to a default bucket.

  • output_kms_key (str) – Optional. A KMS key ID for encrypting the transform output (default: None).

  • accept (str) – The accept header passed by the client to the inference endpoint. If it is supported by the endpoint, it will be the format of the batch transform output.

  • env (dict) – The Environment variables to be set for use during the transform job (default: None).

  • depends_on (List[str] or List[Step]) – The list of step names or step instances the first step in the collection depends on.

  • repack_model_step_retry_policies (List[RetryPolicy]) – The list of retry policies for the repack model step.

  • model_step_retry_policies (List[RetryPolicy]) – The list of retry policies for the model step.

  • transform_step_retry_policies (List[RetryPolicy]) – The list of retry policies for the transform step.

Steps

class sagemaker.workflow.steps.StepTypeEnum(*args, value=<object object>, **kwargs)

Enum of step types.

class sagemaker.workflow.steps.Step(name: str = NOTHING, display_name: str = None, description: str = None, step_type: sagemaker.workflow.steps.StepTypeEnum = NOTHING, depends_on: Union[List[str], List[Step]] = None)

Pipeline step for workflow.

name

The name of the step.

Type

str

display_name

The display name of the step.

Type

str

description

The description of the step.

Type

str

step_type

The type of the step.

Type

StepTypeEnum

depends_on

The list of step names or step instances the current step depends on.

Type

List[str] or List[Step]

retry_policies

The custom retry policy configuration.

Type

List[RetryPolicy]

Method generated by attrs for class Step.
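
The depends_on attribute defines a dependency graph between steps. The sketch below shows how such a graph implies an execution order, using the standard library's graphlib module (Python 3.9 or later). It is an illustration, not the scheduling logic of the service: execution_order is a hypothetical helper and steps are represented by name.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def execution_order(depends_on: Dict[str, List[str]]) -> List[str]:
    """Order step names so that every step follows its dependencies."""
    # TopologicalSorter expects a mapping of node -> predecessors,
    # which matches the shape of a depends_on declaration.
    return list(TopologicalSorter(depends_on).static_order())


order = execution_order({
    "Process": [],
    "Train": ["Process"],
    "Evaluate": ["Train"],
})
print(order)  # ['Process', 'Train', 'Evaluate']
```

A cycle in the declarations makes static_order() raise graphlib.CycleError, which is a convenient way to detect circular dependencies before building a pipeline.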

class sagemaker.workflow.steps.TrainingStep(name: str, estimator: sagemaker.estimator.EstimatorBase, display_name: str = None, description: str = None, inputs: Union[sagemaker.inputs.TrainingInput, dict, str, sagemaker.inputs.FileSystemInput] = None, cache_config: sagemaker.workflow.steps.CacheConfig = None, depends_on: Union[List[str], List[sagemaker.workflow.steps.Step]] = None, retry_policies: List[sagemaker.workflow.retry.RetryPolicy] = None)

Training step for workflow.

Construct a TrainingStep, given an EstimatorBase instance.

In addition to the estimator instance, the other arguments are those that are supplied to the fit method of the sagemaker.estimator.Estimator.

Parameters
  • name (str) – The name of the training step.

  • estimator (EstimatorBase) – A sagemaker.estimator.EstimatorBase instance.

  • display_name (str) – The display name of the training step.

  • description (str) – The description of the training step.

  • inputs (Union[str, dict, TrainingInput, FileSystemInput]) –

    Information about the training data. This can be one of four types:

    • (str) the S3 location where training data is saved, or a file:// path in local mode.

    • (dict[str, str] or dict[str, sagemaker.inputs.TrainingInput]) If using multiple channels for training data, you can specify a dict mapping channel names to strings or TrainingInput() objects.

    • (sagemaker.inputs.TrainingInput) - channel configuration for S3 data sources that can provide additional information as well as the path to the training dataset. See sagemaker.inputs.TrainingInput() for full details.

    • (sagemaker.inputs.FileSystemInput) - channel configuration for a file system data source that can provide additional information as well as the path to the training dataset.

  • cache_config (CacheConfig) – A sagemaker.workflow.steps.CacheConfig instance.

  • depends_on (List[str] or List[Step]) – A list of step names or step instances this sagemaker.workflow.steps.TrainingStep depends on.

  • retry_policies (List[RetryPolicy]) – A list of retry policies.

class sagemaker.workflow.steps.TuningStep(name: str, tuner: sagemaker.tuner.HyperparameterTuner, display_name: str = None, description: str = None, inputs=None, job_arguments: List[str] = None, cache_config: sagemaker.workflow.steps.CacheConfig = None, depends_on: Union[List[str], List[sagemaker.workflow.steps.Step]] = None, retry_policies: List[sagemaker.workflow.retry.RetryPolicy] = None)

Tuning step for workflow.

Construct a TuningStep, given a HyperparameterTuner instance.

In addition to the tuner instance, the other arguments are those that are supplied to the fit method of the sagemaker.tuner.HyperparameterTuner.

Parameters
  • name (str) – The name of the tuning step.

  • tuner (HyperparameterTuner) – A sagemaker.tuner.HyperparameterTuner instance.

  • display_name (str) – The display name of the tuning step.

  • description (str) – The description of the tuning step.

  • inputs –

    Information about the training data. Please refer to the fit() method of the associated estimator, as this can take any of the following forms:

    • (str) - The S3 location where training data is saved.

    • (dict[str, str] or dict[str, sagemaker.inputs.TrainingInput]) - If using multiple channels for training data, you can specify a dict mapping channel names to strings or TrainingInput() objects.

    • (sagemaker.inputs.TrainingInput) - Channel configuration for S3 data sources that can provide additional information about the training dataset. See sagemaker.inputs.TrainingInput() for full details.

    • (sagemaker.inputs.FileSystemInput) - Channel configuration for a file system data source that can provide additional information as well as the path to the training dataset.

    • (sagemaker.amazon.amazon_estimator.RecordSet) - A collection of Amazon Record objects serialized and stored in S3. For use with an estimator for an Amazon algorithm.

    • (sagemaker.amazon.amazon_estimator.FileSystemRecordSet) - Amazon SageMaker channel configuration for a file system data source for Amazon algorithms.

    • (list[sagemaker.amazon.amazon_estimator.RecordSet]) - A list of sagemaker.amazon.amazon_estimator.RecordSet objects, where each instance is a different channel of training data.

    • (list[sagemaker.amazon.amazon_estimator.FileSystemRecordSet]) - A list of sagemaker.amazon.amazon_estimator.FileSystemRecordSet objects, where each instance is a different channel of training data.

  • job_arguments (List[str]) – A list of strings to be passed into the processing job. Defaults to None.

  • cache_config (CacheConfig) – A sagemaker.workflow.steps.CacheConfig instance.

  • depends_on (List[str] or List[Step]) – A list of step names or step instances this sagemaker.workflow.steps.TuningStep depends on.

  • retry_policies (List[RetryPolicy]) – A list of retry policies.

TuningStep.get_top_model_s3_uri(self, top_k: int, s3_bucket: str, prefix: str = '') → sagemaker.workflow.functions.Join

Get the model artifact s3 uri from the top performing training jobs.

Parameters
  • top_k (int) – The index of the top performing training job. The tuning step stores up to 50 top performing training jobs, so valid top_k values range from 0 to 49. The best training job model is at index 0.

  • s3_bucket (str) – the s3 bucket to store the training job output artifact

  • prefix (str) – the s3 key prefix to store the training job output artifact
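The index rule and the way the three arguments might combine into a URI can be sketched in plain Python. This is only an illustration under stated assumptions: the helper name sketch_top_model_uri, the job_name argument, and the output/model.tar.gz suffix are hypothetical, and the real method returns a sagemaker.workflow.functions.Join that is resolved at pipeline execution time rather than a plain string.

```python
MAX_TOP_JOBS = 50  # the tuning step stores up to 50 top performing jobs


def sketch_top_model_uri(top_k, s3_bucket, prefix="", job_name="<training-job-name>"):
    """Hypothetical helper: build an S3 URI string for the top_k-th best job."""
    if not 0 <= top_k < MAX_TOP_JOBS:
        raise ValueError(f"top_k must be between 0 and {MAX_TOP_JOBS - 1}, got {top_k}")
    parts = ["s3:/", s3_bucket]
    if prefix:
        # An empty prefix is skipped so the URI has no empty path segment.
        parts.append(prefix)
    # Assumed layout: <job name>/output/model.tar.gz under the bucket and prefix.
    parts += [job_name, "output/model.tar.gz"]
    return "/".join(parts)


best_model_uri = sketch_top_model_uri(0, "my-bucket", "tuning", "job-0")
```

Index 0 selects the best training job, and any value outside 0 to 49 is rejected before a URI is built.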

class sagemaker.workflow.steps.TransformStep(name: str, transformer: sagemaker.transformer.Transformer, inputs: sagemaker.inputs.TransformInput, display_name: str = None, description: str = None, cache_config: sagemaker.workflow.steps.CacheConfig = None, depends_on: Union[List[str], List[sagemaker.workflow.steps.Step]] = None, retry_policies: List[sagemaker.workflow.retry.RetryPolicy] = None)

Transform step for workflow.

Constructs a TransformStep, given a Transformer instance.

In addition to the transformer instance, the other arguments are those that are supplied to the transform method of the sagemaker.transformer.Transformer.

Parameters
  • name (str) – The name of the transform step.

  • transformer (Transformer) – A sagemaker.transformer.Transformer instance.

  • inputs (TransformInput) – A sagemaker.inputs.TransformInput instance.

  • cache_config (CacheConfig) – A sagemaker.workflow.steps.CacheConfig instance.

  • display_name (str) – The display name of the transform step.

  • description (str) – The description of the transform step.

  • depends_on (List[str] or List[Step]) – A list of step names or step instances that this sagemaker.workflow.steps.TransformStep depends on.

  • retry_policies (List[RetryPolicy]) – A list of retry policies.

class sagemaker.workflow.steps.ProcessingStep(name: str, processor: sagemaker.processing.Processor, display_name: str = None, description: str = None, inputs: List[sagemaker.processing.ProcessingInput] = None, outputs: List[sagemaker.processing.ProcessingOutput] = None, job_arguments: List[str] = None, code: str = None, property_files: List[sagemaker.workflow.properties.PropertyFile] = None, cache_config: sagemaker.workflow.steps.CacheConfig = None, depends_on: Union[List[str], List[sagemaker.workflow.steps.Step]] = None, retry_policies: List[sagemaker.workflow.retry.RetryPolicy] = None)

Processing step for workflow.

Construct a ProcessingStep, given a Processor instance.

In addition to the processor instance, the other arguments are those that are supplied to the process method of the sagemaker.processing.Processor.

Parameters
  • name (str) – The name of the processing step.

  • processor (Processor) – A sagemaker.processing.Processor instance.

  • display_name (str) – The display name of the processing step.

  • description (str) – The description of the processing step.

  • inputs (List[ProcessingInput]) – A list of sagemaker.processing.ProcessingInput instances. Defaults to None.

  • outputs (List[ProcessingOutput]) – A list of sagemaker.processing.ProcessingOutput instances. Defaults to None.

  • job_arguments (List[str]) – A list of strings to be passed into the processing job. Defaults to None.

  • code (str) – This can be an S3 URI or a local path to a file with the framework script to run. Defaults to None.

  • property_files (List[PropertyFile]) – A list of property files that workflow looks for and resolves from the configured processing output list.

  • cache_config (CacheConfig) – A sagemaker.workflow.steps.CacheConfig instance.

  • depends_on (List[str] or List[Step]) – A list of step names or step instances that this sagemaker.workflow.steps.ProcessingStep depends on.

  • retry_policies (List[RetryPolicy]) – A list of retry policies.

class sagemaker.workflow.steps.CreateModelStep(name: str, model: sagemaker.model.Model, inputs: sagemaker.inputs.CreateModelInput, depends_on: Union[List[str], List[sagemaker.workflow.steps.Step]] = None, retry_policies: List[sagemaker.workflow.retry.RetryPolicy] = None, display_name: str = None, description: str = None)

CreateModel step for workflow.

Construct a CreateModelStep, given a sagemaker.model.Model instance.

In addition to the Model instance, the other arguments are those that are supplied to the _create_sagemaker_model method of sagemaker.model.Model.

Parameters
  • name (str) – The name of the CreateModel step.

  • model (Model) – A sagemaker.model.Model instance.

  • inputs (CreateModelInput) – A sagemaker.inputs.CreateModelInput instance. Defaults to None.

  • depends_on (List[str] or List[Step]) – A list of step names or step instances this sagemaker.workflow.steps.CreateModelStep depends on

  • retry_policies (List[RetryPolicy]) – A list of retry policies.

  • display_name (str) – The display name of the CreateModel step.

  • description (str) – The description of the CreateModel step.

class sagemaker.workflow.callback_step.CallbackStep(name: str, sqs_queue_url: str, inputs: dict, outputs: List[sagemaker.workflow.callback_step.CallbackOutput], display_name: str = None, description: str = None, cache_config: sagemaker.workflow.steps.CacheConfig = None, depends_on: Union[List[str], List[sagemaker.workflow.steps.Step]] = None)

Callback step for workflow.

Constructs a CallbackStep.

Parameters
  • name (str) – The name of the callback step.

  • sqs_queue_url (str) – An SQS queue URL for receiving callback messages.

  • inputs (dict) – Input arguments that will be provided in the SQS message body of callback messages.

  • outputs (List[CallbackOutput]) – Outputs that can be provided when completing a callback.

  • display_name (str) – The display name of the callback step.

  • description (str) – The description of the callback step.

  • cache_config (CacheConfig) – A sagemaker.workflow.steps.CacheConfig instance.

  • depends_on (List[str] or List[Step]) – A list of step names or step instances this sagemaker.workflow.steps.CallbackStep depends on
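The flow described by inputs and outputs can be sketched from the consumer's side: a worker reads a callback message, does its work, and reports outputs by name. The message field names ("token", "arguments"), the helper build_success_request, and the example values are assumptions for illustration. The returned dict is shaped like the arguments one might pass to a completion call such as the SageMaker client's send_pipeline_execution_step_success, which this sketch does not invoke.

```python
import json

# Hypothetical message body; the field names are assumptions for illustration.
EXAMPLE_BODY = json.dumps({
    "token": "example-token",
    "arguments": {"input_data": "s3://example-bucket/data"},
})


def build_success_request(message_body, results):
    """Turn a callback message and computed results into a completion request."""
    message = json.loads(message_body)
    return {
        # The token identifies which waiting callback step is being completed.
        "CallbackToken": message["token"],
        # Each result name is meant to match a declared CallbackOutput name.
        "OutputParameters": [
            {"Name": name, "Value": str(value)}
            for name, value in sorted(results.items())
        ],
    }


request = build_success_request(EXAMPLE_BODY, {"output_uri": "s3://example-bucket/out"})
```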

class sagemaker.workflow.steps.CacheConfig(enable_caching: bool = False, expire_after=None)

Configuration class to enable caching in pipeline workflow.

If caching is enabled, the pipeline attempts to find a previous execution of a step that was called with the same arguments. Step caching only considers successful executions. If a successful previous execution is found, the pipeline propagates the values from that execution rather than recomputing the step. When multiple successful executions exist within the timeout period, it uses the result of the most recent successful execution.

enable_caching

To enable step caching. Defaults to False.

Type

bool

expire_after

If step caching is enabled, a timeout also needs to be defined. It defines how old a previous execution can be and still be considered for reuse. The value should be an ISO 8601 duration string. Defaults to None.

Examples:

'p30d' # 30 days
'P4DT12H' # 4 days and 12 hours
'T12H' # 12 hours

Type

str

Method generated by attrs for class CacheConfig.
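The lookup rule described above (same arguments, successful only, within the timeout, most recent wins) can be sketched in plain Python. This is a simplified illustration, not the service's implementation: parse_duration, find_cache_hit, and the execution record fields are hypothetical names. The parser handles only day and time components, and it is deliberately lenient about case and a missing leading 'P' so that it accepts the example strings shown above.

```python
import re
from datetime import datetime, timedelta

_DURATION = re.compile(
    r"^P?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$",
    re.IGNORECASE,
)


def parse_duration(text):
    """Parse a day/time subset of ISO 8601 durations, e.g. 'P4DT12H'."""
    match = _DURATION.match(text)
    if not match or not any(match.groupdict().values()):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {key: int(value) for key, value in match.groupdict().items() if value}
    return timedelta(**parts)


def find_cache_hit(executions, arguments, expire_after, now):
    """Return the most recent successful execution with matching arguments
    that finished within the expire_after window, or None."""
    cutoff = now - parse_duration(expire_after)
    candidates = [
        e for e in executions
        if e["status"] == "Succeeded"
        and e["arguments"] == arguments
        and e["finished_at"] >= cutoff
    ]
    return max(candidates, key=lambda e: e["finished_at"], default=None)


now = datetime(2024, 1, 31)
history = [
    {"id": "a", "status": "Succeeded", "arguments": {"x": 1}, "finished_at": datetime(2024, 1, 10)},
    {"id": "b", "status": "Failed", "arguments": {"x": 1}, "finished_at": datetime(2024, 1, 30)},
    {"id": "c", "status": "Succeeded", "arguments": {"x": 1}, "finished_at": datetime(2024, 1, 20)},
]
hit = find_cache_hit(history, {"x": 1}, "p30d", now)
```

With a 30-day window the failed execution is ignored and the later of the two successful executions is reused. With a shorter window such as 'P4DT12H', no successful execution qualifies and the step would run again.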

class sagemaker.workflow.lambda_step.LambdaStep(name: str, lambda_func: sagemaker.lambda_helper.Lambda, display_name: str = None, description: str = None, inputs: dict = None, outputs: List[sagemaker.workflow.lambda_step.LambdaOutput] = None, cache_config: sagemaker.workflow.steps.CacheConfig = None, depends_on: List[str] = None)

Lambda step for workflow.

Constructs a LambdaStep.

Parameters
  • name (str) – The name of the lambda step.

  • display_name (str) – The display name of the Lambda step.

  • description (str) – The description of the Lambda step.

  • lambda_func (Lambda) – An instance of sagemaker.lambda_helper.Lambda. If a Lambda ARN is specified in the instance, LambdaStep only invokes the function; otherwise, the Lambda function is created when the pipeline is created.

  • inputs (dict) – Input arguments that will be provided to the lambda function.

  • outputs (List[LambdaOutput]) – List of outputs from the lambda function.

  • cache_config (CacheConfig) – A sagemaker.workflow.steps.CacheConfig instance.

  • depends_on (List[str]) – A list of step names this sagemaker.workflow.steps.LambdaStep depends on
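To make the relationship between inputs and outputs concrete, the following is a minimal sketch of a function that such a step might invoke. It assumes that the inputs dict arrives as the handler's event and that the keys of the returned dict correspond to the names of the declared LambdaOutput entries. The handler name, the "score" and "threshold" keys, and the "approved" output are hypothetical.

```python
def handler(event, context):
    """Hypothetical Lambda handler: 'event' is assumed to carry the step's inputs."""
    threshold = float(event.get("threshold", 0.5))
    score = float(event["score"])
    # Return a flat dict; keys are meant to match the step's declared output names.
    return {
        "approved": str(score >= threshold),
        "score": str(score),
    }


# Local invocation with an inputs-style dict; no AWS resources are involved.
result = handler({"score": "0.8", "threshold": "0.7"}, None)
```

Keeping the return value flat and string-valued makes each output easy to reference by name from later steps.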