With Reinforcement Learning (RL) Estimators, you can train reinforcement learning models on Amazon SageMaker.
For supported RL toolkits and their versions, see https://github.com/aws/sagemaker-rl-container/#rl-images-provided-by-sagemaker
Training RL models using RLEstimator is a two-step process:
Prepare a training script to run on SageMaker
Run this script on SageMaker via an
You should prepare your script in a source file separate from the notebook, terminal session, or source file you’re using to submit the script to SageMaker via an RLEstimator. This will be discussed in further detail below.
Suppose that you already have a training script called
You can then create an RLEstimator with keyword arguments to point to this script and define how SageMaker runs it:
    from sagemaker.rl import RLEstimator, RLToolkit, RLFramework

    rl_estimator = RLEstimator(entry_point='coach-train.py',
                               toolkit=RLToolkit.COACH,
                               toolkit_version='0.11.1',
                               framework=RLFramework.TENSORFLOW,
                               role='SageMakerRole',
                               train_instance_type='ml.p3.2xlarge',
                               train_instance_count=1)
After that, you simply tell the estimator to start a training job by calling fit.
In the following sections, we’ll discuss how to prepare a training script for execution on SageMaker
and how to run that script on SageMaker using
Your RL training script must be a Python 3.5 compatible source file for the MXNet framework, or a Python 3.6 compatible source file for TensorFlow.
The training script is very similar to a training script you might run outside of SageMaker, but you can access useful properties of the training environment through various environment variables, such as:
SM_MODEL_DIR: A string representing the path to the directory to write model artifacts to. These artifacts are uploaded to S3 for model hosting.
SM_NUM_GPUS: An integer representing the number of GPUs available to the host.
SM_OUTPUT_DATA_DIR: A string representing the filesystem path to write output artifacts to. Output artifacts may include checkpoints, graphs, and other files to save, not including model artifacts. These artifacts are compressed and uploaded to S3 to the same S3 prefix as the model artifacts.
For the exhaustive list of available environment variables, see the SageMaker Containers documentation.
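As a minimal sketch of how a training script might read these variables: the helper names, the fallback values, and the placeholder artifact file name below are assumptions made for illustration and are not part of SageMaker.

```python
import json
import os


def read_training_env(environ=None):
    """Collect the SageMaker-provided settings described above.

    The fallback values are assumptions so that the script can also be
    run outside a SageMaker container.
    """
    environ = os.environ if environ is None else environ
    return {
        "model_dir": environ.get("SM_MODEL_DIR", "/opt/ml/model"),
        "num_gpus": int(environ.get("SM_NUM_GPUS", "0")),
        "output_data_dir": environ.get("SM_OUTPUT_DATA_DIR", "/opt/ml/output/data"),
    }


def save_placeholder_artifact(model_dir, payload):
    """Write a stand-in artifact (hypothetical file name) into model_dir."""
    os.makedirs(model_dir, exist_ok=True)
    path = os.path.join(model_dir, "model-info.json")
    with open(path, "w") as f:
        json.dump(payload, f)
    return path
```

A real training script would write its actual model files to the model directory instead of the placeholder JSON file used here.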
The RLEstimator constructor takes both required and optional arguments.
The following are required arguments to the RLEstimator constructor. When you create an instance of RLEstimator, you must include these in the constructor, either positionally or as keyword arguments.
entry_point: Path (absolute or relative) to the Python file which should be executed as the entry point to training.
role: An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role if it needs to access an AWS resource.
train_instance_count: Number of Amazon EC2 instances to use for training.
train_instance_type: Type of EC2 instance to use for training, for example, ‘ml.m4.xlarge’.
You must also include either toolkit, toolkit_version, and framework together, or image_name:
toolkit: RL toolkit (Ray RLlib or Coach) you want to use for executing your model training code.
toolkit_version: RL toolkit version you want to use for executing your model training code.
framework: Framework (MXNet or TensorFlow) you want to use as the toolkit backend for reinforcement learning training.
image_name: An alternative Docker image to use for training and serving. If specified, the estimator will use this image for training and hosting, instead of selecting the appropriate SageMaker official image based on framework_version and py_version. Refer to SageMaker RL Docker Containers for details on what the official images support and where to find the source code to build your custom image.
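The either/or rule for choosing the training image can be illustrated with a small check. This is only a sketch: `check_image_selection` is a hypothetical helper written for this example, and it assumes the rule is "image_name alone, or toolkit, toolkit_version, and framework together". It is not the validation code used by the SDK.

```python
def check_image_selection(toolkit=None, toolkit_version=None,
                          framework=None, image_name=None):
    """Return how the training image would be chosen, or raise ValueError.

    Hypothetical helper: mirrors the rule described above, not the SDK.
    """
    if image_name is not None:
        # A custom image replaces the toolkit-based selection.
        return "image_name"
    required = (
        ("toolkit", toolkit),
        ("toolkit_version", toolkit_version),
        ("framework", framework),
    )
    missing = [name for name, value in required if value is None]
    if missing:
        raise ValueError(
            "Provide image_name, or also set: " + ", ".join(missing)
        )
    return "toolkit"
```

For example, passing only a toolkit and a toolkit version raises a ValueError that names framework as the missing argument.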
The following are optional arguments. When you create an RLEstimator object, you can specify these as keyword arguments.
source_dir: Path (absolute or relative) to a directory with any other training source code dependencies including the entry point file. Structure within this directory will be preserved when training on SageMaker.
dependencies (list[str]): A list of paths to directories (absolute or relative) with any additional libraries that will be exported to the container (default: ). The library folders will be copied to SageMaker in the same folder where the entrypoint is copied. If the source_dir points to S3, code will be uploaded and the S3 location will be used instead.
For example, the following call:
    >>> RLEstimator(entry_point='train.py',
    ...             toolkit=RLToolkit.COACH,
    ...             toolkit_version='0.11.0',
    ...             framework=RLFramework.TENSORFLOW,
    ...             dependencies=['my/libs/common', 'virtual-env'])
results in the following inside the container:
    $ ls

    opt/ml/code
    ├── train.py
    ├── common
    └── virtual-env
hyperparameters: Hyperparameters that will be used for training. Will be made accessible as a dict[str, str] to the training code on SageMaker. For convenience, accepts other types besides strings, but str will be called on keys and values to convert them before training.
train_volume_size: Size in GB of the EBS volume to use for storing input data during training. Must be large enough to store training data if input_mode='File' is used (which is the default).
train_max_run: Timeout in seconds for training, after which Amazon SageMaker terminates the job regardless of its current status.
input_mode: The input mode that the algorithm supports. Valid modes:
  ‘File’ - Amazon SageMaker copies the training dataset from the S3 location to a directory in the Docker container.
  ‘Pipe’ - Amazon SageMaker streams data directly from S3 to the container via a Unix named pipe.
output_path: S3 location where you want the training result (model artifacts and optional output files) saved. If not specified, results are stored to a default bucket. If the bucket with the specific name does not exist, the estimator creates the bucket during the
output_kms_key: Optional KMS key ID used to encrypt training output.
job_name: Name to assign to the training job that the fit method launches. If not specified, the estimator generates a default job name based on the training image name and current timestamp.
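The string conversion described for hyperparameters can be sketched as follows. The `stringify_hyperparameters` helper is hypothetical and only mimics the documented behavior (str applied to keys and values); the SDK's own serialization may differ in detail.

```python
def stringify_hyperparameters(hyperparameters):
    """Apply str() to every key and value, as described for hyperparameters."""
    return {str(key): str(value) for key, value in hyperparameters.items()}


# Example values are made up for illustration.
received = stringify_hyperparameters({"learning_rate": 0.01, "num_layers": 3})

# Inside the training script, values arrive as strings and need parsing.
learning_rate = float(received["learning_rate"])
num_layers = int(received["num_layers"])
```

Because everything reaches the training code as a string, the script has to convert numeric or boolean settings back to their intended types itself.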
You start your training script by calling
fit on an
fit takes a few optional
inputs: This can take one of the following forms:
  A string S3 URI, for example s3://my-bucket/my-training-data. In this case, the S3 objects rooted at the my-training-data prefix will be available in the default train channel.
  A dict from string channel names to S3 URIs. In this case, the objects rooted at each S3 prefix will be available as files in each channel directory.
wait: Whether to block and wait for the training script to complete before returning. Defaults to True.
logs: Whether to show logs produced by the training job in the Python session. Defaults to True. Only meaningful when wait is True.
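To make the two accepted forms of inputs concrete, the sketch below maps either form to a channel-to-URI dictionary. `describe_inputs` is a hypothetical helper, the bucket and channel names are made up, and the default channel name is taken from the description above. The SDK performs its own handling internally.

```python
def describe_inputs(inputs):
    """Map either accepted form of `inputs` to {channel_name: s3_uri}.

    Hypothetical helper for illustration only.
    """
    if isinstance(inputs, str):
        # A single S3 URI is exposed through the default channel.
        return {"train": inputs}
    if isinstance(inputs, dict):
        # A dict already names one channel per S3 prefix.
        return dict(inputs)
    raise TypeError("inputs must be an S3 URI string or a dict of channels")


single = describe_inputs("s3://my-bucket/my-training-data")
multiple = describe_inputs({
    "train": "s3://my-bucket/train",
    "evaluation": "s3://my-bucket/eval",
})
```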
Amazon SageMaker RL supports multi-core and multi-instance distributed training. Depending on your use case, training and/or environment rollout can be distributed.
Please see the Amazon SageMaker examples on how it can be done using different RL toolkits.
In order to save your trained RL model for deployment on SageMaker, your training script should save your model to the filesystem path /opt/ml/model. This value is also accessible through the environment variable
After an RL Estimator has been fit, you can host the newly created model in SageMaker.
After calling fit, you can call deploy on an RLEstimator to create a SageMaker Endpoint.
The Endpoint runs one of the SageMaker-provided model servers based on the
specified in the
RLEstimator constructor and hosts the model produced by your training script,
which was run when you called
fit. This was the model you saved to
If image_name was specified, the provided image is used for the deployment.
deploy returns a sagemaker.mxnet.MXNetPredictor for MXNet or sagemaker.tensorflow.serving.Predictor for TensorFlow.
predict returns the result of inference against your model.
    from sagemaker.rl import RLEstimator, RLToolkit, RLFramework

    # Train my estimator
    rl_estimator = RLEstimator(entry_point='coach-train.py',
                               toolkit=RLToolkit.COACH,
                               toolkit_version='0.11.0',
                               framework=RLFramework.MXNET,
                               role='SageMakerRole',
                               train_instance_type='ml.c4.2xlarge',
                               train_instance_count=1)
    rl_estimator.fit()

    # Deploy my estimator to a SageMaker Endpoint and get a MXNetPredictor
    predictor = rl_estimator.deploy(instance_type='ml.m4.xlarge',
                                    initial_instance_count=1)
    response = predictor.predict(data)
You can attach an RL Estimator to an existing training job using the
    my_training_job_name = 'MyAwesomeRLTrainingJob'
    rl_estimator = RLEstimator.attach(my_training_job_name)
After attaching, if the training job has finished with job status “Completed”, it can be deployed to create a SageMaker Endpoint and return a Predictor. If the training job is in progress, attach will block and display log messages from the training job until the training job completes.
The attach method accepts the following arguments:
training_job_name: The name of the training job to attach to.
sagemaker_session: The Session used to interact with SageMaker.
Amazon provides several example Jupyter notebooks that demonstrate end-to-end training on Amazon SageMaker using RL. Please refer to:
These are also available in SageMaker Notebook Instance hosted Jupyter notebooks under the sample notebooks folder.