Migrating from Heroku CI to Jenkins on AWS – Part Two
In the previous post, I went into depth about our migration from Heroku CI to Jenkins on AWS by containerizing our CI/CD using Amazon Elastic Container Service (ECS) and the Amazon EC2 Container Service Plugin for Jenkins. This gave us the flexibility to define each type of build agent we required as its own Docker image, along with the ability to scale our compute resources up and down as demand requires.
At first, we were only running the build agents as Docker containers, but at the client’s suggestion, we also investigated running the Jenkins master as a Docker container. There were a few tripping points along the way, but we ultimately were able to achieve this as well and the benefits of this configuration made it worth the extra effort.
One key consideration in running the Jenkins master as a container is maintaining state. By design, each time you create a Docker container from an image, it starts from a clean slate. This works well for the build agents, as it ensures consistent results: each job execution is unaffected by the results of a previous execution. This, however, is not the behavior we want for the Jenkins master, since we want to be able to upgrade the master without losing job configurations or build results.
Docker volumes do provide persistent storage for a container, ensuring that data is maintained beyond the lifecycle of any single Docker container. This only gets us one step further, however, because the data would still be tied to the lifecycle of an EC2 instance. With AWS frequently updating the AMI it maintains for ECS, keeping the operating system up to date would quickly become a headache, requiring frequent manual maintenance that would take time away from other areas of the project. It also isn't a cloud-native approach to the problem.
Amazon Elastic File System (EFS) allows us to create a network file store that, like Docker volumes, lives beyond the lifecycle of the EC2 instances it is attached to. It also automatically scales to accommodate the files stored on it, so we won't need to constantly expand storage as our needs grow.
To leverage EFS, we first modified the launch configuration for our Auto Scaling group to mount the EFS file system as an NFS share each time a new instance launches in the group. This ensures that the Jenkins master's files are always available at a known location on each instance in the Auto Scaling group. Second, the task definition we create for the Jenkins master mounts a data volume that maps that host location to the directory the Jenkins master container uses to store files (/var/jenkins_home by default). Now each time we upgrade our EC2 instances or the container running the Jenkins master, our configuration and build results are preserved.
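The EFS mount in the launch configuration's user data can be sketched roughly as follows. The file system ID, region, and mount point below are placeholders, not values from our actual setup:

```shell
#!/bin/bash
# Launch configuration user data (sketch) -- runs once when a new
# instance in the Auto Scaling group boots. Placeholder values:
EFS_ID="fs-12345678"
REGION="us-east-1"
MOUNT_POINT="/mnt/efs/jenkins-home"

mkdir -p "$MOUNT_POINT"

# Mount the EFS file system as an NFSv4.1 share so the Jenkins home
# directory survives container and instance replacement.
mount -t nfs4 \
  -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
  "${EFS_ID}.efs.${REGION}.amazonaws.com:/" "$MOUNT_POINT"

# Persist the mount across reboots.
echo "${EFS_ID}.efs.${REGION}.amazonaws.com:/ ${MOUNT_POINT} nfs4 nfsvers=4.1,hard,timeo=600,retrans=2 0 0" >> /etc/fstab
```

The task definition then declares a volume pointing at that host path and maps it into the container at /var/jenkins_home.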
Accessing Jenkins Logs
Now that the master is running as a container and persisting its files to an EFS volume, we need to perform the initial configuration. During setup, Jenkins writes an initial admin password to a file in the Jenkins home directory and also outputs it in the logs. You can get to the logs by SSHing into the EC2 instance and pulling them directly from the container, but this is tedious, and it may not be desirable to grant certain individuals SSH access to the EC2 instance (or anyone, for that matter).
ECS does allow for the configuration of log forwarding to CloudWatch. The task definition for the Jenkins master container can be configured to forward its logs to CloudWatch, so we can view them directly within the AWS console instead of having to SSH into the EC2 instance and read them off of the container. With this configuration in place, once the Jenkins master container has started, the initial admin password will appear in the CloudWatch logs within the log group configured on the task definition.
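In the task definition, this is a logConfiguration block on the container definition using the awslogs driver. A minimal fragment might look like this, with the log group name and region as placeholders:

```json
{
  "containerDefinitions": [
    {
      "name": "jenkins-master",
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/jenkins-master",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "jenkins"
        }
      }
    }
  ]
}
```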
Networking Master and Agents
Now that the Jenkins master is installed and running, we needed to set up networking so that the build agents and the master could communicate. While we could have used the public web address we were already using for the master, that would mean communication between the master and the build agents would travel over the public internet, which is not something we wanted. This is also complicated by the fact that the Jenkins master container, as well as the underlying EC2 instance, could be terminated at any time if either fails its health checks.
Our solution was to run the Jenkins master service behind an internal load balancer. While it may seem like overkill for a single container running on a single EC2 instance, it gives us a consistent private web address with which to communicate with the Jenkins master. This way, even if a new container or a new EC2 instance is created, the address of the master stays the same. By using an internal load balancer, we also ensure that communication between the master and build agents stays within a VPC and off of the public web.
Accessing Jenkins Master from the Public Web
Now that the Jenkins master is running behind an internal load balancer, we need to make it accessible, preferably over the public internet. NGINX maintains a Docker image for their HTTP server that works great as a reverse proxy. By mounting a configuration file at a known location within the container, we can configure the NGINX container to act as a reverse proxy for the internal load balancer. By placing this container behind an internet-facing load balancer, we were able to access it from the public web. And again, by using a load balancer, we can reach it at a consistent web address even if the NGINX containers are recreated. By attaching an SSL certificate generated by AWS Certificate Manager and configuring a Route 53 alias to the load balancer, we can access the Jenkins master at a consistent, user-friendly web address over SSL.
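The reverse proxy configuration amounts to a small NGINX server block, mounted into the container (for example, at /etc/nginx/conf.d/default.conf). The internal load balancer DNS name below is a placeholder:

```nginx
server {
    listen 80;

    location / {
        # Forward all requests to the internal load balancer in
        # front of the Jenkins master (placeholder DNS name).
        proxy_pass http://internal-jenkins-lb.example.internal;

        # Preserve the original host and client address so Jenkins
        # generates correct URLs when running behind a proxy.
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```

SSL termination happens at the internet-facing load balancer, so the NGINX container itself only needs to listen on plain HTTP.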
Granting Permissions to Jenkins Master
In order for the Jenkins master to launch build agents in ECS, it requires access to parts of the AWS API. Amazon ECS allows for the assignment of IAM roles to ECS tasks. By assigning a role to the task that runs the Jenkins master, you don't have to save an AWS access key in the Jenkins credential store. Instead, the master uses the credentials assigned to the task itself. This way we can define a role specific to the Jenkins master that contains only the permissions it requires to function.
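The task role's policy grants the ECS actions the plugin needs to register agent task definitions and start and stop agent tasks. The exact set of actions depends on the plugin version, so treat this as an illustrative sketch and consult the plugin's documentation for the authoritative list:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:ListClusters",
        "ecs:ListContainerInstances",
        "ecs:DescribeContainerInstances",
        "ecs:RegisterTaskDefinition",
        "ecs:DeregisterTaskDefinition",
        "ecs:ListTaskDefinitions",
        "ecs:RunTask",
        "ecs:StopTask",
        "ecs:DescribeTasks"
      ],
      "Resource": "*"
    }
  ]
}
```

In practice the Resource element can be narrowed to the agent cluster and task definition family rather than "*".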
A Couple of Notes
By default, EC2 instances are configured with a timezone of UTC, and Docker containers inherit the timezone of the underlying host when they are created. This means that by default, the Jenkins master container will run in UTC, which in most cases is not desirable. This can be changed by setting the TZ environment variable in the task definition for the Jenkins master. Now when the task runs, it will run in the specified timezone, and the timestamps (and cron jobs) in the Jenkins master will reflect it.
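In the task definition, this is a single entry in the container's environment array; the timezone value here is just an example:

```json
{
  "containerDefinitions": [
    {
      "name": "jenkins-master",
      "environment": [
        { "name": "TZ", "value": "America/New_York" }
      ]
    }
  ]
}
```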
Another lesson we learned was to keep the Jenkins master and the build agents in separate ECS clusters on separate EC2 instances. While both run the ECS-optimized AMI, we found that running the master and agents on the same instances led to situations where the build agents would consume too much CPU or memory, causing the master to become slow or unresponsive. With health checks occurring at regular intervals, this led to the master occasionally being restarted by ECS. Not the end of the world, but it led to the occasional job that was started and never finished (due to the master being forcibly stopped by ECS). By keeping the master and agents separated, we ensure that there are always enough resources for the master. Additionally, we don't run any builds on the master; we only utilize agents running as containers, ensuring that the master doesn't get bogged down by any builds.
This post (and the last one) has been more about the story of how we achieved the migration from Heroku CI to Jenkins on AWS. In my next post, I hope to walk through the CloudFormation template we use to maintain our Jenkins infrastructure and dive into some of the details of how this solution works on AWS.