The Buildkite Elastic CI Stack gives you a private, autoscaling Buildkite Agent cluster. Use it to parallelize legacy tests across hundreds of nodes, run tests and deployments for all your Linux-based services and apps, or run AWS ops tasks.
See Releases for the current and older releases, or Versions for development builds. For documentation on a specific release, such as the latest stable release, see that release's Documentation section.
Although the stack will create its own VPC by default, we highly recommend following best practice by setting up a separate development AWS account and using role switching and consolidated billing. See the Delegate Access Across AWS Accounts tutorial for more information.
See the Elastic CI Stack for AWS guide for a step-by-step walkthrough, or jump straight in: if you'd like to use the AWS CLI, download config.json.example, rename it to config.json, and then run the following command:
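A minimal sketch of that command, assuming a stack named buildkite and the latest template URL from the Versions section (adjust the name, region, and parameters for your setup):

```shell
# Create the stack from the latest published template.
# Assumes AWS credentials are configured and config.json holds your parameters.
aws cloudformation create-stack \
  --output text \
  --stack-name buildkite \
  --template-url "https://s3.amazonaws.com/buildkite-aws-stack/aws-stack.yml" \
  --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM \
  --parameters file://config.json
```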
The stack will have created an S3 bucket for you (or used the one you provided as the SecretsBucket parameter). This is where the agent fetches your SSH private keys for source control, and the environment hooks that provide other secrets to your builds.
The following S3 objects are downloaded and processed:

/env - An agent environment hook
/private_ssh_key - A private key that is added to ssh-agent for your builds
/git-credentials - A git-credentials file for git over HTTPS
/{pipeline-slug}/env - An agent environment hook, specific to a pipeline
/{pipeline-slug}/private_ssh_key - A private key that is added to ssh-agent for your builds, specific to the pipeline
/{pipeline-slug}/git-credentials - A git-credentials file for git over HTTPS, specific to a pipeline

Setting BUILDKITE_PLUGIN_S3_SECRETS_BUCKET_PREFIX overrides the {pipeline-slug} prefix. These files are encrypted using Amazon's KMS service; see the Security section for more details.

Note: currently only the default KMS key for S3 can be used; follow #235 for progress on using specific KMS keys. If you really want to store your secrets unencrypted, you can disable encryption entirely with BUILDKITE_USE_KMS=false.
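Here's an example that shows how to generate a private SSH key and upload it with KMS encryption to an S3 bucket (the bucket name is a placeholder; use your stack's secrets bucket):

```shell
# Generate a deploy key for your builds (no passphrase).
ssh-keygen -t rsa -b 4096 -f id_rsa_buildkite -N ''

# Upload the private key, encrypted with the default KMS key for S3.
aws s3 cp --acl private --sse aws:kms id_rsa_buildkite "s3://my-secrets-bucket/private_ssh_key"
```

If you want to set secrets that your build can access, create a file that sets environment variables and upload it:

```shell
# Write an environment hook that exports secrets (names and values are placeholders).
echo 'export MY_SECRET_TOKEN="hypothetical-value"' > env

# Upload it alongside the SSH key.
aws s3 cp --acl private --sse aws:kms env "s3://my-secrets-bucket/env"
```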
What Type of Builds Does This Support?

This stack is designed to run your builds in a share-nothing pattern, similar to the 12 Factor application principles. By following these simple conventions you get a scalable, repeatable, and source-controlled CI environment that any team within your organization can use.
Multiple Instances of the Stack
If you need different instance sizes and scaling characteristics between pipelines, you can create multiple stacks. Each can run on a different Agent Queue, with its own configuration, or even in a different AWS account.

Examples:

A docker-builders stack that provides always-on workers with hot docker caches (see Optimizing for Slow Docker Builds).
A pipeline-uploaders stack with tiny, always-on instances for lightning-fast buildkite-agent pipeline upload jobs.
A deploy stack with added credentials and permissions specifically for deployment.
Autoscaling
If you have configured MinSize < MaxSize, the stack will automatically scale up and down based on the number of scheduled jobs.
This means you can scale down to zero when idle, letting you use larger instances for the same cost.
Metrics are collected with a Lambda function, polling every minute based on the queue the stack is configured with. The autoscaler monitors only one queue.
Terminating the instance after job is complete
You may set BuildkiteTerminateInstanceAfterJob to true to force the instance to terminate after it completes a job. Setting this value to true tells the stack to enable disconnect-after-job in the buildkite-agent.cfg file.
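With this setting enabled, the stack effectively sets the following in buildkite-agent.cfg on each instance:

```
disconnect-after-job=true
```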
We strongly encourage you to find an alternative to this setting if at all possible. The turnaround time for replacing these instances is currently slow (5-10 minutes depending on other stack configuration settings). If you need single-use jobs, we suggest looking at our container plugins like docker, docker-compose, and ecs, all of which can be found here.
Docker Registry Support
If you want to push or pull from registries such as Docker Hub or Quay you can use the environment hook in your secrets bucket to export the following environment variables:
DOCKER_LOGIN_USER="the-user-name"
DOCKER_LOGIN_PASSWORD="the-password"
DOCKER_LOGIN_SERVER="" - optional. By default it will log into Docker Hub
Setting these will perform a docker login before each pipeline step is run, allowing you to docker push to the registry from within your build scripts.
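For example, the environment hook in your secrets bucket might contain (all values are placeholders):

```shell
# environment hook: log in to a private registry before each step runs
export DOCKER_LOGIN_USER="the-user-name"
export DOCKER_LOGIN_PASSWORD="the-password"
# Optional: leave empty to log in to Docker Hub.
export DOCKER_LOGIN_SERVER=""
```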
If you are using Amazon ECR you can set the ECRAccessPolicy parameter on the stack to either readonly, poweruser, or full, depending on the access level you want your builds to have.
You can disable this in individual pipelines by setting AWS_ECR_LOGIN=false.
If you want to login to an ECR server on another AWS account, you can set AWS_ECR_LOGIN_REGISTRY_IDS="id1,id2,id3".
The AWS ECR options are powered by an embedded version of the ECR plugin, so if you require options that aren't listed here, you can disable the embedded version as above and call the plugin directly. See its README for more examples (requires Agent v3.x).
Versions
We recommend running the latest release, which is available at https://s3.amazonaws.com/buildkite-aws-stack/aws-stack.yml, or on the releases page.
The latest build of the stack is published to https://s3.amazonaws.com/buildkite-aws-stack/master/aws-stack.yml, along with a version for each commit in the form of https://s3.amazonaws.com/buildkite-aws-stack/master/${COMMIT}.aws-stack.yml.
Branches are published in the form of https://s3.amazonaws.com/buildkite-aws-stack/${BRANCH}/aws-stack.yml.
Updating Your Stack
To update your stack to the latest version, use CloudFormation's stack update tools with one of the URLs from the Versions section.
Prior to updating, it's a good idea to manually set the desired instance count on the AutoScalingGroup to 0.
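As a sketch, an update via the AWS CLI might look like this (the stack name is an assumption; pick a template URL from the Versions section):

```shell
# Update an existing stack to a newer template version.
# Re-supply or reuse stack parameters as needed for your setup.
aws cloudformation update-stack \
  --stack-name buildkite \
  --template-url "https://s3.amazonaws.com/buildkite-aws-stack/aws-stack.yml" \
  --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM
```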
CloudWatch Metrics
Metrics are calculated every minute from the Buildkite API using a Lambda function.
You’ll find the stack’s metrics under “Custom Namespaces > Buildkite” within CloudWatch.
Reading Instance and Agent Logs
Each instance streams file system logs such as /var/log/messages and /var/log/docker into namespaced AWS Log Groups. A full list of files and log groups can be found in the relevant Linux CloudWatch agent config.json file.
Within each stream the logs are grouped by instance id.
To debug an agent, first find the instance id from the agent in Buildkite, head to your CloudWatch Logs Dashboard, choose the desired log group, and then search for the instance id in the list of log streams.
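You can also do this from the CLI; a sketch, assuming a hypothetical log group name (check the CloudWatch agent config.json for the real group names):

```shell
# List log streams for a given instance id.
# The log group name below is an assumption, not the stack's actual group name.
aws logs describe-log-streams \
  --log-group-name "/buildkite/buildkite-agent" \
  --log-stream-name-prefix "i-0123456789abcdef0"
```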
Customizing Instances with a Bootstrap Script
You can customize your stack’s instances by using the BootstrapScriptUrl stack parameter to run a bash script on instance boot. To set up a bootstrap script, create an S3 bucket with the script, and set the BootstrapScriptUrl parameter, for example s3://my_bucket_name/my_bootstrap.sh.
If the file is private, you’ll also need to create an IAM policy to allow the instances to read the file, for example:
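A minimal policy sketch granting read access to the bootstrap object (the bucket and key match the example above and are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::my_bucket_name/my_bootstrap.sh"
    }
  ]
}
```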
Once you’ve created the policy, you must specify the policy’s ARN in the ManagedPolicyARN stack parameter.
Optimizing for Slow Docker Builds
For large legacy applications the Docker build process might take a long time on new instances. For these cases it’s recommended to create an optimized “builder” stack which doesn’t scale down, keeps a warm docker cache and is responsible for building and pushing the application to Docker Hub before running the parallel build jobs across your normal CI stack.
An example of how to set this up:
Create a Docker Hub repository for pushing images to
Update the pipeline’s env hook in your secrets bucket to perform a docker login
Create a builder stack with its own queue (e.g. elastic-builders)
Here is an example build pipeline based on a production Rails application:
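A sketch of such a pipeline (the queue names, scripts, and parallelism are illustrative, not taken from a real project):

```yaml
steps:
  # Build and push the app image on the warm builder stack.
  - name: ":docker: build"
    command: script/build_and_push
    agents:
      queue: elastic-builders

  - wait

  # Fan the tests out across the normal CI stack.
  - name: ":rspec: test %n"
    command: script/test
    parallelism: 25
    agents:
      queue: elastic
```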
See Issue 81 for ideas on other solutions (contributions welcome!).
Security
This repository hasn't been reviewed by security researchers, so exercise caution and careful thought about which credentials you make available to your builds.
Anyone with commit access to your codebase (including third-party pull-requests if you’ve enabled them in Buildkite) will have access to your secrets bucket files.
Also keep in mind the EC2 HTTP metadata server is available from within builds, which means builds act with the same IAM permissions as the instance.
Development
To get started with customizing your own stack, or contributing fixes and features:
# Checkout all submodules
git submodule update --init --recursive
# Build all AMIs and render a CloudFormation template - this requires AWS credentials (in the ENV)
# to build an AMI with packer
make build
# To create a new stack on AWS using the local template
make create-stack
# You can use any of the AWS* environment variables that the aws-cli supports
AWS_PROFILE="some-profile" make create-stack
# You can also use aws-vault or similar
aws-vault exec some-profile -- make create-stack
If you need to build your own AMI (because you’ve changed something in the packer directory), run:
make packer
Questions and Support
Feel free to drop an email to support@buildkite.com with questions. It helps us if you can provide the following details:
# List your stack parameters
aws cloudformation describe-stacks --stack-name MY_STACK_NAME \
--query 'Stacks[].Parameters[].[ParameterKey,ParameterValue]' --output table
Provide us with logs from CloudWatch Logs:
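For example (the log group name and time range are assumptions; use the groups your stack actually created):

```shell
# Pull the last hour of agent logs from a hypothetical log group.
aws logs filter-log-events \
  --log-group-name "/buildkite/buildkite-agent" \
  --start-time "$(( ($(date +%s) - 3600) * 1000 ))"
```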
Alternatively, drop by the #aws-stack and #aws channels in the Buildkite Community Slack and ask your question!

Licence

See Licence.md (MIT)