In a recent blog post, I spoke about a migration from Heroku to AWS that Todd Trimble and I did for a client project. If you are interested in some of the backstory of this migration, you should check that out.
In this post I would like to dive into what we did to move our continuous integration off of Heroku CI. This was one of the more interesting challenges in this migration as Heroku CI and Heroku Pipelines have some great tools for building and deploying software on the Heroku platform. Replacing these tools while maintaining most of the functionality that the development team had come to rely on was no easy task. Here is a short list of the major items that we wanted to ensure we could replicate in AWS:
- Automated builds and unit tests on every code check-in
- Report the status of each build back to GitHub
- Automated deployment of successful builds on the integration branch to the development environment
- Automated deployment of review apps upon the opening of a pull request
We researched a few different options but ultimately settled on Jenkins. We found that we could achieve all of the desired functionality with a small number of plugins beyond the core Jenkins plugins. SEP also has organizational experience with Jenkins, so choosing a tool that the development team already knew was an added benefit.
In the course of setting up the new Jenkins installation in AWS, one concern that Todd and I had was achieving the same build concurrency that we had with Heroku CI. We knew that at times there could be many builds running at once, and matching that in AWS with Jenkins would take some work.
I am sure many of you are saying to yourselves, “What are you talking about? You are on AWS! Just create a bigger virtual machine!” Depending on your situation, that might be the right answer. It can also be an expensive answer.
While that larger instance will work when the team has 5, 10, 20, or even 100 builds going at a time, there are large chunks of the day (or week) where no builds are going on at all. This larger instance size can be costly, and taking advantage of some of the scaling services that AWS offers could help us achieve the same throughput at a much lower cost.
Build Agents as Docker Containers
In our research we came across the Amazon EC2 Container Service Plugin. This plugin allows you to dynamically create Jenkins agents running as Docker containers using Amazon Elastic Container Service (ECS). It takes advantage of the Java Network Launch Protocol (JNLP), which is a great fit for the ephemeral nature of Docker containers. It will dynamically launch a build agent as a Docker container for each new build queued up on the master, provided there are enough available resources in ECS to start the container.
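The capacity condition can be illustrated with a simplified model. This sketch is not how the plugin or ECS is implemented; it only shows the idea that a queued build gets an agent container when some instance in the cluster still has enough unreserved CPU and memory. The names `Instance`, `AgentSpec`, and `place_agent`, and all of the numbers, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    """One EC2 instance in the cluster (hypothetical, simplified model)."""
    name: str
    free_cpu_units: int   # unreserved CPU units
    free_memory_mib: int  # unreserved memory in MiB


@dataclass
class AgentSpec:
    """Resources one build agent container reserves (illustrative values)."""
    cpu_units: int
    memory_mib: int


def place_agent(cluster: List[Instance], agent: AgentSpec) -> Optional[str]:
    """Reserve resources on the first instance that fits the agent.

    Returns the instance name, or None if the build has to stay queued.
    """
    for instance in cluster:
        if (instance.free_cpu_units >= agent.cpu_units
                and instance.free_memory_mib >= agent.memory_mib):
            instance.free_cpu_units -= agent.cpu_units
            instance.free_memory_mib -= agent.memory_mib
            return instance.name
    return None


cluster = [Instance("node-a", free_cpu_units=1024, free_memory_mib=2048)]
agent = AgentSpec(cpu_units=512, memory_mib=1024)

# Two agents fit on node-a; the third build waits until capacity frees up.
placements = [place_agent(cluster, agent) for _ in range(3)]
print(placements)
```

In this model the third build stays queued until a running agent finishes or the cluster grows, which is why the size of the cluster matters for build concurrency.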
Since these build agents are defined as Docker images, that means that we can define multiple agent images and run them all on the same underlying EC2 instance. Amazon ECS allows us to define a cluster of EC2 instances running Docker that we can use to run our Jenkins agents as Docker containers.
Using Auto Scaling groups, we can also define rules for scaling the number of EC2 instances in this cluster up and down. AWS gives the option of scaling based on any of the available CloudWatch metrics, such as CPU or memory utilization, or on a schedule. Currently we have a schedule defined that increases the number of instances during the workday and automatically reduces it to 1 at the end of the workday. That way we have plenty of compute resources available during the workday and keep costs lower during off hours, when we really only need resources to execute the nightly builds.
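To make the cost argument concrete, here is a rough sketch of a workday schedule and the instance-hours it produces in a week. In AWS the schedule itself would live on the Auto Scaling group rather than in code like this. The workday hours, the weekday-only assumption, and the workday cluster size are made up for illustration; only the off-hours size of 1 comes from our setup.

```python
from datetime import datetime

# Assumed values for illustration, not the team's real settings,
# except the off-hours size of 1.
WORKDAY_START_HOUR = 8
WORKDAY_END_HOUR = 18
WORKDAY_INSTANCES = 4
OFF_HOURS_INSTANCES = 1


def desired_instances(moment: datetime) -> int:
    """Return the cluster size the schedule asks for at `moment`."""
    is_weekday = moment.weekday() < 5  # Monday=0 ... Friday=4
    in_work_hours = WORKDAY_START_HOUR <= moment.hour < WORKDAY_END_HOUR
    if is_weekday and in_work_hours:
        return WORKDAY_INSTANCES
    return OFF_HOURS_INSTANCES


def weekly_instance_hours() -> int:
    """Sum the desired size over every hour of one week (Monday to Sunday)."""
    # 2024-01-01 is a Monday.
    return sum(
        desired_instances(datetime(2024, 1, 1 + day, hour))
        for day in range(7)
        for hour in range(24)
    )


scheduled = weekly_instance_hours()
always_on = WORKDAY_INSTANCES * 24 * 7
print(scheduled, always_on)  # prints: 318 672
```

Under these assumed numbers, the scheduled cluster uses less than half the instance-hours of keeping the workday capacity running all week.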
If we were to run Jenkins on one or more EC2 instances, we would either have to install all of the build tools that we need on one very large instance, or run multiple build agents, each one with a different set of build tools required for the various jobs to be performed in Jenkins. This can become a maintenance nightmare because as the build tools need updating, these agents can become outdated quickly.
By defining our build agents using Docker images, when the build tools require updating, we only need to update the Dockerfile, build a new image, and push the updated image to a Docker registry so that Jenkins can start using it. Jenkins helps with building these images by maintaining a JNLP agent base image on Docker Hub, so all we have to maintain in our build agent images is the build tools each agent requires. Although we don’t currently do this, it is possible to define a Jenkins job that builds and publishes these build agent images whenever updates are made in the GitHub repository containing the Dockerfile.
Builds on Every Commit, Every Branch
To execute our builds, we use the multibranch pipeline functionality that comes by default with Jenkins. With Jenkins Pipelines, we can declare how each of our repositories is built and tested by placing a Jenkinsfile in the root of the repository. This means that the bulk of the configuration that would typically be required for a freestyle Jenkins job instead resides in a file under source control. All that needs to be configured in Jenkins is the repository from which to pull the code. Coupled with webhooks, this gave us automated builds and tests for each commit to our repository.
In order to achieve automated deployments, we created a freestyle job that is triggered upon the successful completion of one of the build jobs on the integration branch. All of the infrastructure that we use in AWS is defined using CloudFormation and we maintain the templates in a separate repository. The deployment job simply clones the latest version of this repository and creates or updates a stack based upon the parameters with which the freestyle job was triggered.
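The "creates or updates" decision in the deployment job can be sketched as follows. This is only an illustration of the logic, not our actual job: it makes no AWS calls, and the function names, stack names, and the `ImageTag` parameter are all hypothetical.

```python
from typing import Dict, Set


def plan_stack_action(existing_stacks: Set[str], stack_name: str) -> str:
    """Choose 'update' when the stack already exists, otherwise 'create'."""
    return "update" if stack_name in existing_stacks else "create"


def build_deploy_request(existing_stacks: Set[str], stack_name: str,
                         parameters: Dict[str, str]) -> Dict[str, object]:
    """Describe the CloudFormation operation the job would perform."""
    return {
        "action": plan_stack_action(existing_stacks, stack_name),
        "stack_name": stack_name,
        "parameters": dict(parameters),  # copy so callers can reuse theirs
    }


existing = {"myapp-development"}  # hypothetical stack already deployed
request = build_deploy_request(existing, "myapp-development",
                               {"ImageTag": "build-42"})
print(request["action"])  # prints: update
```

Because the same job handles both cases, the first deployment of a new environment and every later deployment to it go through one code path, with only the parameters changing.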
Finally, review apps were achieved using the same deployment job. However, we needed a way to determine if the current build being performed was part of a pull request. The GitHub pull request builder plugin gave us this ability. It works quite nicely with multibranch pipelines and allowed us to call the deployment job with different parameters for deployments stemming from pull requests. This gives us the control to create new infrastructure in AWS that could deploy the code from a pull request so that developers have an easy way to review the running code before approving the code to be merged into the integration branch.
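As a rough sketch of how a finished build could map to deployment parameters, consider the function below. It assumes the build knows its branch name and, for pull requests, a pull request identifier. The integration branch name `develop`, the parameter names, and the stack naming scheme are assumptions for illustration and not what our job uses.

```python
from typing import Dict, Optional


def deployment_parameters(branch: str, pull_request_id: Optional[str],
                          integration_branch: str = "develop"
                          ) -> Optional[Dict[str, str]]:
    """Pick deployment-job parameters for a successful build.

    Returns None when the build should not trigger any deployment.
    """
    if pull_request_id is not None:
        # Each pull request gets its own review-app stack.
        return {
            "StackName": f"review-pr-{pull_request_id}",
            "Environment": "review",
        }
    if branch == integration_branch:
        # Successful integration builds go to the development environment.
        return {"StackName": "development", "Environment": "development"}
    return None  # ordinary feature-branch build: build and test only


print(deployment_parameters("feature/login", "128"))
print(deployment_parameters("develop", None))
print(deployment_parameters("feature/login", None))
```

Keying the stack name on the pull request identifier keeps review apps isolated from each other and from the development environment.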
Making the Move and Next Steps
We were able to get all of this up and running in a couple of weeks. At first, we were running the Jenkins master on a standalone EC2 instance, with the build agents running as Docker containers in ECS. We were also able to get this solution building in parallel with Heroku so that we could ensure that things were working smoothly before cutting everything over to AWS. The client was pleased during the demonstration of this solution but had one minor suggestion for improvement: What if we ran the Jenkins master as a Docker container as well?
In a future blog post, I will talk about the challenges we faced as well as the benefits we enjoy by running our entire Jenkins infrastructure using Docker and ECS.