Remote function classes and methods specification¶
@remote decorator¶
- client.remote(*, dependencies=None, pre_execution_commands=None, pre_execution_script=None, environment_variables=None, image_uri=None, include_local_workdir=None, custom_file_filter=None, instance_count=1, instance_type=None, job_conda_env=None, job_name_prefix=None, keep_alive_period_in_seconds=0, max_retry_attempts=1, max_runtime_in_seconds=86400, role=None, s3_kms_key=None, s3_root_uri=None, sagemaker_session=None, security_group_ids=None, subnets=None, tags=None, volume_kms_key=None, volume_size=30, encrypt_inter_container_traffic=None, spark_config=None, use_spot_instances=False, max_wait_time_in_seconds=None, use_torchrun=False, nproc_per_node=1)¶
Decorator for running the annotated function as a SageMaker training job.
This decorator wraps the annotated code and runs it as a new SageMaker job synchronously with the provided runtime settings.
If a parameter value is not set, the decorator first looks up the value from the SageMaker configuration file. If no value is specified in the configuration file or no configuration file is found, the decorator selects the default as specified below. For more information, see Configuring and using defaults with the SageMaker Python SDK.
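For illustration, a minimal sketch of the decorator in use; the instance type and dependencies file below are example choices, not defaults:

    from sagemaker.remote_function import remote

    # Example configuration; the instance type and dependencies file are
    # illustrative choices.
    @remote(instance_type="ml.m5.xlarge", dependencies="./requirements.txt")
    def divide(x, y):
        return x / y

    # Calling the decorated function runs a SageMaker training job
    # synchronously and returns the deserialized result.
    print(divide(10, 2))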
- Parameters
_func (Optional) – A Python function to run as a SageMaker training job.
dependencies (str) – Either the path to a dependencies file or the reserved keyword auto_capture. Defaults to None. If dependencies is provided, the value must be one of the following:
- A path to a conda environment.yml file. The following conditions apply:
  - If job_conda_env is set, then the conda environment is updated by installing dependencies from the yaml file and the function is invoked within that conda environment. For this to succeed, the specified conda environment must already exist in the image.
  - If the environment variable SAGEMAKER_JOB_CONDA_ENV is set in the image, then the conda environment is updated by installing dependencies from the yaml file and the function is invoked within that conda environment. For this to succeed, the conda environment name must already be set in SAGEMAKER_JOB_CONDA_ENV, and SAGEMAKER_JOB_CONDA_ENV must already exist in the image.
  - If neither of the previous conditions is met, a new conda environment named sagemaker-runtime-env is created and the function annotated with the remote decorator is invoked in that conda environment.
- A path to a requirements.txt file. The following conditions apply:
  - If job_conda_env is set in the remote decorator, dependencies are installed within that conda environment and the function annotated with the remote decorator is invoked in the same conda environment. For this to succeed, the specified conda environment must already exist in the image.
  - If the environment variable SAGEMAKER_JOB_CONDA_ENV is set in the image, dependencies are installed within that conda environment and the function annotated with the remote decorator is invoked in the same environment. For this to succeed, the conda environment name must already be set in SAGEMAKER_JOB_CONDA_ENV, and SAGEMAKER_JOB_CONDA_ENV must already exist in the image.
  - If neither of the above conditions is met, conda is not used. Dependencies are installed at the system level, without any virtual environment, and the function annotated with the remote decorator is invoked using the Python runtime available in the system path.
- The reserved keyword auto_capture. SageMaker automatically generates an env_snapshot.yml corresponding to a snapshot of the currently active conda environment; you do not need to provide a dependencies file. The following conditions apply:
  - You must run the remote function within an active conda environment.
  - When installing the dependencies on the training job, the same conditions apply as when dependencies is set to the path of a conda environment.yml file (listed above).
- None. SageMaker assumes that there are no dependencies to install while executing the remote annotated function in the training job.
pre_execution_commands (List[str]) – A list of commands to be executed prior to executing the remote function. Only one of pre_execution_commands or pre_execution_script can be specified at the same time. Defaults to None.
pre_execution_script (str) – The path to a script file to be executed prior to executing the remote function. Only one of pre_execution_commands or pre_execution_script can be specified at the same time. Defaults to None.
environment_variables (Dict) – The environment variables used inside the decorator function. Defaults to None.
image_uri (str) – The Uniform Resource Identifier (URI) of a Docker image in Amazon Elastic Container Registry (ECR). Defaults to the following, based on where the SDK is running:
- For users who specify spark_config and want to run the function in a Spark application, image_uri should be None; a SageMaker Spark image is used for training. Otherwise, a ValueError is thrown.
- For users on SageMaker Studio notebooks, the image used as the kernel image for the notebook is used.
- For other users, it is resolved to a base Python image with the same Python version as the environment running the local code.
If no compatible image is found, a ValueError is thrown.
include_local_workdir (bool) – A flag indicating whether the remote function should include local directories. Set to True if the remote function code imports local modules and methods that are not available via PyPI or conda. Only Python files are included. Defaults to False.
custom_file_filter (Callable[[str, List], List], CustomFileFilter) – Either a function that filters the job dependencies to be uploaded to S3, or a CustomFileFilter object that specifies the local directories and files to be included in the remote function. If a callable is passed in, it must follow the protocol of the ignore argument of shutil.copytree. Defaults to None, which means only Python files are accepted and uploaded to S3.
instance_count (int) – The number of instances to use. Defaults to 1. NOTE: Remote function does not support instance_count > 1 for non-Spark jobs.
instance_type (str) – The Amazon Elastic Compute Cloud (EC2) instance type to use to run the SageMaker job, for example ml.c4.xlarge. If not provided, a ValueError is thrown.
job_conda_env (str) – The name of the conda environment to activate during the job's runtime. Defaults to None.
job_name_prefix (str) – The prefix used to create the underlying SageMaker job.
keep_alive_period_in_seconds (int) – The duration in seconds to retain and reuse provisioned infrastructure after the completion of a training job, also known as SageMaker managed warm pools. Using warm pools reduces the latency of provisioning new resources. Defaults to 0. NOTE: Additional charges associated with warm pools may apply. Using this parameter also activates a persistent cache feature, which further reduces job start-up latency beyond what SageMaker managed warm pools alone provide, by caching the package sources downloaded in previous runs.
max_retry_attempts (int) – The maximum number of times the job is retried after an InternalServerFailure error from the SageMaker service. Defaults to 1.
max_runtime_in_seconds (int) – The upper limit in seconds to be used for training. After this specified amount of time, SageMaker terminates the job regardless of its current status. Defaults to 1 day (86400 seconds).
role (str) – The IAM role (either name or full ARN) used to run your SageMaker training job. Defaults to:
- the SageMaker default IAM role, if the SDK is running in SageMaker Notebooks or SageMaker Studio Notebooks;
- otherwise, a ValueError is thrown.
s3_kms_key (str) – The key used to encrypt the input and output data. Defaults to None.
s3_root_uri (str) – The root S3 folder to which the code archives and data are uploaded. Defaults to s3://<sagemaker-default-bucket>.
sagemaker_session (sagemaker.session.Session) – The underlying SageMaker session to which SageMaker service calls are delegated (default: None). If not provided, one is created using a default configuration chain.
security_group_ids (List[str]) – A list of security group IDs. Defaults to None, and the training job is created without a VPC config.
subnets (List[str]) – A list of subnet IDs. Defaults to None, and the job is created without a VPC config.
tags (List[Tuple[str, str]]) – A list of tags attached to the job. Defaults to None, and the training job is created without tags.
volume_kms_key (str) – An Amazon Key Management Service (KMS) key used to encrypt an Amazon Elastic Block Storage (EBS) volume attached to the training instance. Defaults to None.
volume_size (int) – The size in GB of the storage volume used for storing input and output data during training. Defaults to 30.
encrypt_inter_container_traffic (bool) – A flag that specifies whether traffic between training containers is encrypted for the training job. Defaults to False.
spark_config (SparkConfig) – Configuration for the Spark application that runs on a Spark image. If spark_config is specified, a SageMaker Spark image URI is used for training. Note that image_uri cannot be specified at the same time, otherwise a ValueError is thrown. Defaults to None.
use_spot_instances (bool) – Specifies whether to use SageMaker Managed Spot instances for training. If enabled, the max_wait_time_in_seconds argument should also be set. Defaults to False.
max_wait_time_in_seconds (int) – Timeout in seconds waiting for a spot training job. After this amount of time, Amazon SageMaker stops waiting for the managed spot training job to complete. Defaults to None.
use_torchrun (bool) – Specifies whether to use torchrun for distributed training. Defaults to False.
nproc_per_node (int) – Specifies the number of processes per node for distributed training. Defaults to 1.
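As a sketch of how several of these parameters combine, assuming illustrative values throughout (the file path, commands, instance type, and timings are examples, not recommendations):

    from sagemaker.remote_function import remote

    @remote(
        dependencies="./environment.yml",        # or "./requirements.txt" or "auto_capture"
        pre_execution_commands=["apt-get update"],  # mutually exclusive with pre_execution_script
        instance_type="ml.m5.xlarge",
        keep_alive_period_in_seconds=300,        # opt in to SageMaker managed warm pools
        use_spot_instances=True,
        max_wait_time_in_seconds=3600,           # should be set when spot instances are enabled
    )
    def train():
        ...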
RemoteExecutor¶
- class sagemaker.remote_function.RemoteExecutor(*, dependencies=None, pre_execution_commands=None, pre_execution_script=None, environment_variables=None, image_uri=None, include_local_workdir=None, custom_file_filter=None, instance_count=1, instance_type=None, job_conda_env=None, job_name_prefix=None, keep_alive_period_in_seconds=0, max_parallel_jobs=1, max_retry_attempts=1, max_runtime_in_seconds=86400, role=None, s3_kms_key=None, s3_root_uri=None, sagemaker_session=None, security_group_ids=None, subnets=None, tags=None, volume_kms_key=None, volume_size=30, encrypt_inter_container_traffic=None, spark_config=None, use_spot_instances=False, max_wait_time_in_seconds=None, use_torchrun=False, nproc_per_node=1)¶
Run Python functions asynchronously as SageMaker jobs.
Constructor for RemoteExecutor.
If a parameter value is not set, the constructor first looks up the value from the SageMaker configuration file. If no value is specified in the configuration file or no configuration file is found, the constructor selects the default as specified below. For more information, see Configuring and using defaults with the SageMaker Python SDK.
- Parameters
dependencies (str) – Either the path to a dependencies file or the reserved keyword auto_capture. Defaults to None. If dependencies is provided, the value must be one of the following:
- A path to a conda environment.yml file. The following conditions apply:
  - If job_conda_env is set, then the conda environment is updated by installing dependencies from the yaml file and the function is invoked within that conda environment. For this to succeed, the specified conda environment must already exist in the image.
  - If the environment variable SAGEMAKER_JOB_CONDA_ENV is set in the image, then the conda environment is updated by installing dependencies from the yaml file and the function is invoked within that conda environment. For this to succeed, the conda environment name must already be set in SAGEMAKER_JOB_CONDA_ENV, and SAGEMAKER_JOB_CONDA_ENV must already exist in the image.
  - If neither of the previous conditions is met, a new conda environment named sagemaker-runtime-env is created and the function annotated with the remote decorator is invoked in that conda environment.
- A path to a requirements.txt file. The following conditions apply:
  - If job_conda_env is set in the remote decorator, dependencies are installed within that conda environment and the function annotated with the remote decorator is invoked in the same conda environment. For this to succeed, the specified conda environment must already exist in the image.
  - If the environment variable SAGEMAKER_JOB_CONDA_ENV is set in the image, dependencies are installed within that conda environment and the function annotated with the remote decorator is invoked in the same environment. For this to succeed, the conda environment name must already be set in SAGEMAKER_JOB_CONDA_ENV, and SAGEMAKER_JOB_CONDA_ENV must already exist in the image.
  - If neither of the above conditions is met, conda is not used. Dependencies are installed at the system level, without any virtual environment, and the function annotated with the remote decorator is invoked using the Python runtime available in the system path.
- The reserved keyword auto_capture. SageMaker automatically generates an env_snapshot.yml corresponding to a snapshot of the currently active conda environment; you do not need to provide a dependencies file. The following conditions apply:
  - You must run the remote function within an active conda environment.
  - When installing the dependencies on the training job, the same conditions apply as when dependencies is set to the path of a conda environment.yml file (listed above).
- None. SageMaker assumes that there are no dependencies to install while executing the remote annotated function in the training job.
pre_execution_commands (List[str]) – A list of commands to be executed prior to executing the remote function. Only one of pre_execution_commands or pre_execution_script can be specified at the same time. Defaults to None.
pre_execution_script (str) – The path to a script file to be executed prior to executing the remote function. Only one of pre_execution_commands or pre_execution_script can be specified at the same time. Defaults to None.
environment_variables (Dict) – The environment variables used inside the decorator function. Defaults to None.
image_uri (str) – The Uniform Resource Identifier (URI) of a Docker image in Amazon Elastic Container Registry (ECR). Defaults to the following, based on where the SDK is running:
- For users who specify spark_config and want to run the function in a Spark application, image_uri should be None; a SageMaker Spark image is used for training. Otherwise, a ValueError is thrown.
- For users on SageMaker Studio notebooks, the image used as the kernel image for the notebook is used.
- For other users, it is resolved to a base Python image with the same Python version as the environment running the local code.
If no compatible image is found, a ValueError is thrown.
include_local_workdir (bool) – A flag indicating whether the remote function should include local directories. Set to True if the remote function code imports local modules and methods that are not available via PyPI or conda. Defaults to False.
custom_file_filter (Callable[[str, List], List], CustomFileFilter) – Either a function that filters the job dependencies to be uploaded to S3, or a CustomFileFilter object that specifies the local directories and files to be included in the remote function. If a callable is passed in, that function is passed to the ignore argument of shutil.copytree. Defaults to None, which means only Python files are accepted and uploaded to S3.
instance_count (int) – The number of instances to use. Defaults to 1. NOTE: Remote function does not support instance_count > 1 for non-Spark jobs.
instance_type (str) – The Amazon Elastic Compute Cloud (EC2) instance type to use to run the SageMaker job, for example ml.c4.xlarge. If not provided, a ValueError is thrown.
job_conda_env (str) – The name of the conda environment to activate during the job's runtime. Defaults to None.
job_name_prefix (str) – The prefix used to create the underlying SageMaker job.
keep_alive_period_in_seconds (int) – The duration in seconds to retain and reuse provisioned infrastructure after the completion of a training job, also known as SageMaker managed warm pools. Using warm pools reduces the latency of provisioning new resources. Defaults to 0. NOTE: Additional charges associated with warm pools may apply. Using this parameter also activates a persistent cache feature, which further reduces job start-up latency beyond what SageMaker managed warm pools alone provide, by caching the package sources downloaded in previous runs.
max_parallel_jobs (int) – The maximum number of jobs that run in parallel. Defaults to 1.
max_retry_attempts (int) – The maximum number of times the job is retried after an InternalServerFailure error from the SageMaker service. Defaults to 1.
max_runtime_in_seconds (int) – The upper limit in seconds to be used for training. After this specified amount of time, SageMaker terminates the job regardless of its current status. Defaults to 1 day (86400 seconds).
role (str) – The IAM role (either name or full ARN) used to run your SageMaker training job. Defaults to:
- the SageMaker default IAM role, if the SDK is running in SageMaker Notebooks or SageMaker Studio Notebooks;
- otherwise, a ValueError is thrown.
s3_kms_key (str) – The key used to encrypt the input and output data. Defaults to None.
s3_root_uri (str) – The root S3 folder to which the code archives and data are uploaded. Defaults to s3://<sagemaker-default-bucket>.
sagemaker_session (sagemaker.session.Session) – The underlying SageMaker session to which SageMaker service calls are delegated (default: None). If not provided, one is created using a default configuration chain.
security_group_ids (List[str]) – A list of security group IDs. Defaults to None, and the training job is created without a VPC config.
subnets (List[str]) – A list of subnet IDs. Defaults to None, and the job is created without a VPC config.
tags (List[Tuple[str, str]]) – A list of tags attached to the job. Defaults to None, and the training job is created without tags.
volume_kms_key (str) – An Amazon Key Management Service (KMS) key used to encrypt an Amazon Elastic Block Storage (EBS) volume attached to the training instance. Defaults to None.
volume_size (int) – The size in GB of the storage volume used for storing input and output data during training. Defaults to 30.
encrypt_inter_container_traffic (bool) – A flag that specifies whether traffic between training containers is encrypted for the training job. Defaults to False.
spark_config (SparkConfig) – Configuration for the Spark application that runs on a Spark image. If spark_config is specified, a SageMaker Spark image URI is used for training. Note that image_uri cannot be specified at the same time, otherwise a ValueError is thrown. Defaults to None.
use_spot_instances (bool) – Specifies whether to use SageMaker Managed Spot instances for training. If enabled, the max_wait_time_in_seconds argument should also be set. Defaults to False.
max_wait_time_in_seconds (int) – Timeout in seconds waiting for a spot training job. After this amount of time, Amazon SageMaker stops waiting for the managed spot training job to complete. Defaults to None.
use_torchrun (bool) – Specifies whether to use torchrun for distributed training. Defaults to False.
nproc_per_node (int) – Specifies the number of processes per node. Defaults to 1.
- submit(func, *args, **kwargs)¶
Execute the input function as a SageMaker job asynchronously.
- Parameters
func – Python function to run as a SageMaker job.
*args – Positional arguments to the input function.
**kwargs – Keyword arguments to the input function.
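A sketch of submitting work, assuming an illustrative configuration and a hypothetical function:

    from sagemaker.remote_function import RemoteExecutor

    def multiply(x, y):
        return x * y

    # submit returns a Future immediately; result() blocks until the
    # underlying SageMaker job finishes and returns the deserialized value.
    with RemoteExecutor(instance_type="ml.m5.xlarge", max_parallel_jobs=2) as executor:
        future = executor.submit(multiply, 3, y=4)
    print(future.result())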
- map(func, *iterables)¶
Return an iterator that applies func to every item of the iterables, yielding the results.
If additional iterables are passed, func must take that many arguments and is applied to the items from all iterables in parallel. With multiple iterables, the iterator stops when the shortest iterable is exhausted.
- Parameters
func – Python function to run as a SageMaker job.
*iterables – Arguments of the input Python function.
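A sketch of fanning out over a list of inputs (the configuration values are illustrative):

    from sagemaker.remote_function import RemoteExecutor

    def square(x):
        return x * x

    # Each item is dispatched as a SageMaker job, with at most
    # max_parallel_jobs running at a time; one result is yielded per input.
    with RemoteExecutor(instance_type="ml.m5.xlarge", max_parallel_jobs=4) as executor:
        results = executor.map(square, [1, 2, 3, 4])
    print(list(results))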
- shutdown()¶
Prevents further function executions from being submitted to this executor.
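When the executor is not used as a context manager, shutdown should be called explicitly; a sketch with a hypothetical function:

    from sagemaker.remote_function import RemoteExecutor

    def add_one(x):
        return x + 1

    executor = RemoteExecutor(instance_type="ml.m5.xlarge")
    try:
        future = executor.submit(add_one, 41)
    finally:
        # After shutdown, this executor accepts no further submissions.
        executor.shutdown()
    print(future.result())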
Future¶
- class sagemaker.remote_function.client.Future¶
Class representing a reference to a SageMaker job result.
Reference to the SageMaker job created as a result of the remote function run. The job may or may not have finished running.
- static from_describe_response(describe_training_job_response, sagemaker_session)¶
Construct a Future from a describe_training_job_response object.
- result(timeout=None)¶
Returns the SageMaker job result.
This method waits for the SageMaker job created from the remote function execution to complete, for up to the timeout value (if specified). If timeout is None, this method waits until the SageMaker job completes.
- wait(timeout=None)¶
Wait for the underlying SageMaker job to complete.
This method waits for the SageMaker job created as a result of the remote function run to complete, for up to the timeout value (if specified). If timeout is None, this method blocks until the job completes.
- Parameters
timeout (int) – Timeout in seconds to wait for the job to complete before the wait is stopped. Defaults to None.
- Returns
None
- Return type
None
- cancel()¶
Cancel the function execution.
This method prevents the SageMaker job from being created, or stops the underlying SageMaker job early if it is already in progress.
- Returns
True if the underlying SageMaker job created as a result of the remote function run is cancelled.
- Return type
bool
- running()¶
Check if the underlying SageMaker job is running.
- Returns
True if the underlying SageMaker job is still running; False otherwise.
- Return type
bool
- cancelled()¶
Check if the underlying SageMaker job was cancelled.
- Returns
True if the underlying SageMaker job was cancelled; False otherwise.
- Return type
bool
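Putting the Future methods together, a sketch of inspecting and controlling a submitted job (assumes future was obtained from RemoteExecutor.submit or get_future):

    # Block for up to 10 minutes while the job runs.
    future.wait(timeout=600)

    if future.running():
        # Stop the underlying SageMaker job early, or prevent its creation.
        future.cancel()

    if future.cancelled():
        print("Job was cancelled before completion.")
    else:
        # Blocks until the job result is available.
        print(future.result())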
- client.list_futures(job_name_prefix, *, sagemaker_session=None)¶
Generates Future objects with information about jobs with the given job_name_prefix.
- Parameters
job_name_prefix (str) – A prefix used to identify the SageMaker jobs associated with the remote function run.
sagemaker_session (sagemaker.session.Session) – A session object that manages interactions with Amazon SageMaker APIs and any other AWS services needed.
- Yields
A sagemaker.remote_function.client.Future instance.
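For example, enumerating jobs that share a prefix (the prefix is a placeholder):

    from sagemaker.remote_function.client import list_futures

    # Yields one Future per SageMaker job whose name starts with the prefix.
    for future in list_futures(job_name_prefix="my-job-prefix"):
        print(future.running())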
- client.get_future(job_name, sagemaker_session=None)¶
Get a future object with information about a job with the given job_name.
- Parameters
job_name (str) – The name of the underlying SageMaker job created as a result of the remote function run.
sagemaker_session (sagemaker.session.Session) – A session object that manages interactions with Amazon SageMaker APIs and any other AWS services needed.
- Returns
A sagemaker.remote_function.client.Future instance.
- Return type
sagemaker.remote_function.client.Future
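For example, re-attaching to a previously created job by name (the job name is a placeholder):

    from sagemaker.remote_function.client import get_future

    # Re-attach to an existing remote function job and fetch its result.
    future = get_future(job_name="my-job-name")
    print(future.result())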
CustomFileFilter¶
- class sagemaker.remote_function.custom_file_filter.CustomFileFilter(*, ignore_name_patterns=None)¶
Configuration that specifies how the local working directory should be packaged.
Initialize a CustomFileFilter.
- Parameters
ignore_name_patterns (List[str]) – Ignore files or directories with names that match one of the glob-style patterns. Defaults to None.
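For example, excluding non-source directories when packaging the local working directory (the patterns are illustrative):

    from sagemaker.remote_function import remote
    from sagemaker.remote_function.custom_file_filter import CustomFileFilter

    # Skip data directories and notebook checkpoints when uploading the
    # local working directory to S3.
    @remote(
        include_local_workdir=True,
        custom_file_filter=CustomFileFilter(
            ignore_name_patterns=["data", ".ipynb_checkpoints"]
        ),
    )
    def process():
        ...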