[!WARNING] As of April 2024, SageMaker RL containers no longer accept new pull requests. Please follow Building Your Image to build your own RL images.
Amazon SageMaker RL Containers
A set of Dockerfiles that enables Reinforcement Learning (RL) solutions to be used in SageMaker.
The SageMaker team uses this repository to build its official RL images. To learn how to use any of these images on SageMaker, see the Python SDK. For end users, this repository is typically of interest if you need implementation details of the official images, or if you want to use it to build your own customized RL image.
For information on running RL jobs on SageMaker: SageMaker RLEstimators.
For notebook examples: SageMaker Notebook Examples.
Getting Started
Prerequisites
Make sure you have installed all of the following prerequisites on your development machine:
For Testing on GPU
Recommended
A Python environment management tool (e.g. PyEnv, VirtualEnv).
Terminologies
Toolkit
Toolkits are libraries that provide specific algorithms to train a Reinforcement Learning model. We currently provide Dockerfiles for these three toolkits: Coach, Ray, and Vowpal Wabbit (VW).
Framework
Framework refers to a deep learning framework or library that a toolkit may need in order to train an algorithm. Where required, a toolkit’s Dockerfile uses the prebuilt Amazon SageMaker framework Docker images as its base image. Currently we are using these two frameworks: TensorFlow and MXNet.
Note: VW doesn’t require a framework.
RL Images Provided by SageMaker
MXNet Coach Images:
TensorFlow Coach Images:
TensorFlow Ray Images:
PyTorch Ray Images:
Vowpal Wabbit Images:
List of supported SageMaker regions.
Building Your Image
Amazon SageMaker utilizes Docker containers to run all training jobs and inference endpoints.
The Docker images are built from the Dockerfiles specified in this repository at:
The Dockerfiles are grouped by RL toolkit and toolkit version. Within each version, they are separated by framework (if needed). For example, the Dockerfile for Coach v0.11.0 with the MXNet framework can be found at coach/docker/0.11.0/Dockerfile.mxnet.
For toolkits Ray and Coach, the Dockerfiles use deep learning framework images provided by SageMaker as their “base” images.
These “base” images are specified with the following naming convention:
<framework> can be tensorflow-scriptmode (with <framework_version> 1.11.0 or higher, depending on the toolkit requirements) or mxnet (with <framework_version> 1.3.0 or higher, depending on the toolkit requirements);
<processor> can be cpu or gpu;
for <region> values, see the [list of supported SageMaker regions](https://docs.aws.amazon.com/general/latest/gr/rande.html#sagemaker_region).
Before building images:
Pull the deep learning framework “base” image, which requires Docker, AWS credentials, and the AWS CLI.
To build the RL Docker image:
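As a rough illustration of the layout described above, the sketch below derives a Dockerfile path from a toolkit, version, and framework, then prints (rather than runs) a `docker build` command. The function names, the image name and tag, and the assumption that every toolkit follows the `<toolkit>/docker/<version>/Dockerfile.<framework>` pattern are illustrative, generalized from the single Coach example; check the repository for the actual paths and for any build arguments a given Dockerfile may require.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of the repository.

# Compose a Dockerfile path, generalizing the documented example
# coach/docker/0.11.0/Dockerfile.mxnet (assumed pattern).
dockerfile_path() {
    local toolkit="$1" version="$2" framework="$3"
    printf '%s/docker/%s/Dockerfile.%s\n' "$toolkit" "$version" "$framework"
}

# Print the docker build command instead of running it (dry run).
# -t names the image, -f selects the Dockerfile, "." is the build context.
print_build_command() {
    local toolkit="$1" version="$2" framework="$3" image="$4" tag="$5"
    local dockerfile
    dockerfile="$(dockerfile_path "$toolkit" "$version" "$framework")"
    printf 'docker build -t %s:%s -f %s .\n' "$image" "$tag" "$dockerfile"
}

print_build_command coach 0.11.0 mxnet custom-rl-coach-image 1.0
```

Run the printed command from the repository root once the path and image name look right.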
Running the Tests
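Running the tests requires installation of test dependencies:
git clone https://github.com/aws/sagemaker-rl-container.git
cd sagemaker-rl-container
pip install .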
Tests are defined in test/ and include local integration and SageMaker integration tests.
Local Integration Tests
Running local integration tests requires Docker and AWS credentials, as the local integration tests make calls to a couple of AWS services. Both the local integration tests and the SageMaker integration tests require the configuration specified in their respective conftest.py.
Local integration tests on GPU require Nvidia-Docker.
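Before running local integration tests:
Build your Docker image.
Pass in the correct pytest arguments to run tests against your Docker image.
If you want to run local integration tests, then use:
# Required arguments for integration tests are found in test/conftest.py
pytest test/integration/local --toolkit <toolkit_to_run_tests_for> \
--docker-base-name <your_docker_image> \
--tag <your_docker_image_tag> \
--processor <cpu_or_gpu>
# Example
pytest test/integration/local --toolkit coach \
--docker-base-name custom-rl-coach-image \
--tag 1.0 \
--processor cpu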
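A minimal sketch of a helper that assembles the local integration test command. It assumes the pytest options `--toolkit`, `--docker-base-name`, `--tag`, and `--processor` defined in test/conftest.py; the function name `local_test_command` is hypothetical, and the command is only printed so you can review it before running.

```shell
#!/usr/bin/env bash
# Hypothetical helper; not part of the repository.

# Print a pytest command for local integration tests (dry run).
# Returns non-zero if the processor is neither "cpu" nor "gpu".
local_test_command() {
    local toolkit="$1" image="$2" tag="$3" processor="$4"
    case "$processor" in
        cpu|gpu) ;;
        *) echo "processor must be cpu or gpu, got: $processor" >&2
           return 1 ;;
    esac
    printf 'pytest test/integration/local --toolkit %s --docker-base-name %s --tag %s --processor %s\n' \
        "$toolkit" "$image" "$tag" "$processor"
}

local_test_command coach custom-rl-coach-image 1.0 cpu
```

Rejecting an unknown processor value early avoids starting a GPU test run on a machine without Nvidia-Docker by mistake.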
SageMaker Integration Tests
SageMaker integration tests require your Docker image to be within an [Amazon ECR repository](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ECS_Console_Repositories.html).
The Docker base name is your [ECR repository namespace](https://docs.aws.amazon.com/AmazonECR/latest/userguide/Repositories.html).
The instance type is the Amazon SageMaker instance type on which the SageMaker integration test will run.
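For reference, images in Amazon ECR in standard AWS regions are addressed as `<account_id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>`. The sketch below composes such a URI; the function name and the sample account ID, region, and repository are placeholders, not values taken from this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper; all sample values are placeholders.

# Compose an ECR image URI from account ID, region, repository, and tag.
ecr_image_uri() {
    local account_id="$1" region="$2" repository="$3" tag="$4"
    printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' \
        "$account_id" "$region" "$repository" "$tag"
}

ecr_image_uri 123456789012 us-west-2 custom-rl-coach-image 1.0
```

The resulting URI is the name you would use with `docker tag` and `docker push` after authenticating Docker to your registry.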
Before running SageMaker integration tests:
If you want to run a SageMaker integration end-to-end test on Amazon SageMaker, then use:
Contributing
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
License
This library is licensed under the Apache 2.0 License.
Note: Specific licenses for toolkits and frameworks, if any, can be found in /docker/LICENSE or in the framework’s image.