Multi-Architecture Container Builds with CodeCatalyst

AWS Graviton processors are designed by AWS to deliver the best price performance for your cloud workloads running in Amazon Elastic Compute Cloud (Amazon EC2). Amazon CodeCatalyst recently added support to run workflow actions using on-demand or pre-provisioned compute powered by AWS Graviton processors. Customers can now access high-performance AWS Graviton processors to build artifacts for Arm, or improve their price performance. In this post, I will show you how to create a multi-architecture Docker image using CodeCatalyst that can run on both amd64 and arm64 processors.

Background

Container images only run on a system with the same CPU architecture for which they were targeted. For example, an amd64 image runs on Intel and AMD processors, while an arm64 image runs on AWS Graviton. Note that amd64 and x86_64 are often used interchangeably, and I have chosen to use amd64 in this post. Rather than maintaining multiple repositories for each image type, you can combine variants for multiple architectures in the same repository. In addition, you can create a manifest describing which image to use for each architecture. These are known as multi-architecture, or multi-platform, images.

Let us look at an example to further understand multi-arch images. In this screenshot from Amazon Elastic Container Registry (Amazon ECR), I have created two images for a simple hello-world application. One image is tagged latest-amd64 for the amd64 architecture, and one is tagged latest-arm64 for the arm64 architecture.

In addition, I have created an Image Index tagged latest. The image index is a map describing which image to use for each architecture. This allows my users to simply pull hello-world:latest and the index will identify the correct image based on the target platform. The image index contains the following manifest.

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 1573,
      "digest": "sha256:eccb6dd2c2dbfc9…",
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 1573,
      "digest": "sha256:c64812837fbd43…",
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    }
  ]
}
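If you want to inspect an image index like this yourself, Docker can fetch it directly from the registry. A minimal sketch, assuming AWS_ACCOUNT_ID and AWS_DEFAULT_REGION are set in your shell and the repository is named hello-world:

aws ecr get-login-password | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
docker manifest inspect $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/hello-world:latest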

Now that I have explained what a multi-arch image is, I will explain how to create one in a CodeCatalyst workflow. A CodeCatalyst workflow is an automated procedure that describes how to build, test, and deploy your code as part of a continuous integration and continuous delivery (CI/CD) system. A workflow defines a series of steps, or actions, to take during a workflow run. Let’s get started.

Prerequisites

If you would like to follow along with this walkthrough, you will need:

A CodeCatalyst space and associated AWS account.

An empty CodeCatalyst project and source repository in the space.
An Amazon ECR private repository in the associated AWS account.
A CodeCatalyst environment connected to the associated AWS account.

Walkthrough

In this walkthrough I will create a simple application using an Apache HTTP Server serving a static hello world page. The workload itself is inconsequential; I will focus on the process of building the container image using a CodeCatalyst workflow. The workflow will build two container images, one for amd64 and one for arm64. The two build tasks will run in parallel on different compute architectures. When both builds are complete, the workflow will build the Docker manifest. At the end of this post, my workflow will look like this.

Note that Docker also offers a plugin called buildx that allows you to build a multi-architecture image with a single command; a sketch of that approach is shown below. In a real-world application, the workflow would also build the source code, run unit tests, and so on for each architecture. The sample application used in this post is so simple that there is no need to build and test the source code. Let's examine the sample application now.
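Before moving on, here is a rough sketch of what that buildx approach might look like, assuming Docker Buildx is available, a builder that supports both platforms (for example, via QEMU emulation) is configured, and you are already logged in to your Amazon ECR registry:

docker buildx create --use
docker buildx build --platform linux/amd64,linux/arm64 \
  -t $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:latest \
  --push .

The rest of this post builds each architecture separately instead, which lets every image be produced natively on its own compute type.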

Sample Application

Initially the empty repository will only have a README.md file. By the end of this post, my repository will look like this.

I'll begin by creating the file named index.html. I used the Create file button in the CodeCatalyst console, shown previously. My index.html file has the following content:

<html>
<head>
<title>Hello World!</title>
</head>
<body>
<h1>Hello World!</h1>
<p>Hello from a multi-architecture container created in CodeCatalyst.</p>
</body>
</html>

I’ll also create a Dockerfile that contains two commands. The first command instructs Docker to build a new image from the Apache HTTP Server Project image called httpd. It is important to note that the httpd image already supports multiple architectures including amd64 and arm64. When creating a multi-architecture image, the base image must also support these architectures. The second command simply copies the index.html file above into the new image. My Dockerfile file has the following content.

FROM httpd
COPY ./index.html /usr/local/apache2/htdocs/
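Before wiring this into a workflow, you can sanity-check the image locally. A minimal sketch, assuming Docker is installed and port 8080 is free (the httpd base image listens on port 80 inside the container):

docker build -t hello-world:local .
docker run --rm -p 8080:80 hello-world:local

Browsing to http://localhost:8080 should show the hello world page.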

With the source code for my sample application complete, I can turn my attention to the workflow.

CI/CD Workflow

To create a new workflow, select CI/CD from navigation on the left and then select Workflows (1). Then, select Create workflow (2), leave the default options, and select Create (3).

If the workflow editor opens in YAML mode, select Visual to open the visual designer. Now, I can start adding actions to the workflow.

Build Action for the AMD64 Variant

I’ll begin by adding a build action for the amd64 container. Select “+ Actions” to open the actions list. Find the Build action and click “+” to add a new build action to the workflow.

On the Inputs tab, create three variables named AWS_DEFAULT_REGION, IMAGE_REPO_NAME, and IMAGE_TAG. Set the first two values to the AWS Region and the name of your Amazon ECR repository. Set the third to latest-amd64. For example:

Now select the Configuration tab and rename the action docker_build_amd64. Select the Environment, AWS account connection, and Role for the associated AWS account where you created the Amazon ECR repository. For example:

Then, copy and paste the following code into the Shell commands. This code will build the image using the Dockerfile you created previously. Then, it logs into Amazon ECR, and finally, pushes the new image to ECR.

- Run: AWS_ACCOUNT_ID=`aws sts get-caller-identity --query "Account" --output text`
- Run: docker build -t $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG .
- Run: aws ecr get-login-password | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
- Run: docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG

If you switch back to the YAML view, you can see that the designer has added the following action to the workflow definition.

docker_build_amd64:
  Identifier: aws/build@v1
  Compute:
    Type: EC2
  Inputs:
    Sources:
      - WorkflowSource
    Variables:
      - Name: AWS_DEFAULT_REGION
        Value: us-west-2
      - Name: IMAGE_REPO_NAME
        Value: hello-world
      - Name: IMAGE_TAG
        Value: latest-amd64
  Environment:
    Name: demo
    Connections:
      - Role: CodeCatalystPreviewDevelopmentAdministrator
        Name: development
  Configuration:
    Steps:
      - Run: AWS_ACCOUNT_ID=`aws sts get-caller-identity --query "Account" --output text`
      - Run: docker build -t $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG .
      - Run: aws ecr get-login-password | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - Run: docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG

With the amd64 image complete, you can move on to the arm64 image.

Build Action for the ARM64 Variant

Add a second build action named docker_build_arm64 for the arm64 container. The configuration is nearly identical to the previous action with two minor changes. First, on the Inputs tab, I set the IMAGE_TAG to latest-arm64.

Second, on the Configuration tab, change the compute fleet to Linux.Arm64.Large. That is all you need to do to run your action on AWS Graviton. For example:

The Shell commands are identical to the amd64 build action. In addition, don't forget to select the Environment, AWS account connection, and Role on the Configuration tab. The complete configuration for the second action looks like this:

docker_build_arm64:
  Identifier: aws/build@v1
  Compute:
    Type: EC2
    Fleet: Linux.Arm64.Large
  Inputs:
    Sources:
      - WorkflowSource
    Variables:
      - Name: AWS_DEFAULT_REGION
        Value: us-west-2
      - Name: IMAGE_REPO_NAME
        Value: hello-world
      - Name: IMAGE_TAG
        Value: latest-arm64
  Environment:
    Name: demo
    Connections:
      - Role: CodeCatalystPreviewDevelopmentAdministrator
        Name: development
  Configuration:
    Steps:
      - Run: AWS_ACCOUNT_ID=`aws sts get-caller-identity --query "Account" --output text`
      - Run: docker build -t $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG .
      - Run: aws ecr get-login-password | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - Run: docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG

Now that you have a build action for the amd64 and arm64 images, you simply need to create a manifest file describing which image to use for each architecture.

Build Action for the Manifest

The final step in the workflow is to create the Docker manifest. Create a third build action named docker_manifest. You want this action to wait for the prior two actions to complete. Therefore, select the prior two actions from the Depends on drop down, like this:

Also configure four variables. AWS_DEFAULT_REGION and IMAGE_REPO_NAME are identical to the prior actions. In addition, IMAGE_TAG_AMD64 and IMAGE_TAG_ARM64 include the tags you created in the prior actions.

On the configuration tab, select the Environment, AWS account connection, and Role as you did in the prior actions. Then, copy and paste the following Shell commands.

- Run: AWS_ACCOUNT_ID=`aws sts get-caller-identity --query "Account" --output text`
- Run: aws ecr get-login-password | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
- Run: docker manifest create $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_ARM64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_AMD64
- Run: docker manifest annotate --arch amd64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_AMD64
- Run: docker manifest annotate --arch arm64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_ARM64
- Run: docker manifest push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME

The shell commands create a manifest and then annotate it with the correct image for both amd64 and arm64. The final action looks like this.

docker_manifest:
  Identifier: aws/build@v1
  DependsOn:
    - docker_build_arm64
    - docker_build_amd64
  Compute:
    Type: EC2
  Inputs:
    Sources:
      - WorkflowSource
    Variables:
      - Name: AWS_DEFAULT_REGION
        Value: us-west-2
      - Name: IMAGE_REPO_NAME
        Value: hello-world
      - Name: IMAGE_TAG_AMD64
        Value: latest-amd64
      - Name: IMAGE_TAG_ARM64
        Value: latest-arm64
  Environment:
    Name: demo
    Connections:
      - Role: CodeCatalystPreviewDevelopmentAdministrator
        Name: development
  Configuration:
    Steps:
      - Run: AWS_ACCOUNT_ID=`aws sts get-caller-identity --query "Account" --output text`
      - Run: aws ecr get-login-password | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - Run: docker manifest create $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_ARM64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_AMD64
      - Run: docker manifest annotate --arch amd64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_AMD64
      - Run: docker manifest annotate --arch arm64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_ARM64
      - Run: docker manifest push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME

I now have a complete CI/CD workflow that creates container images for both amd64 and arm64. When I commit the changes, CodeCatalyst will execute my workflow, build the images, and push them to Amazon ECR.
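Once the run completes, a quick way to confirm that everything landed in the repository is to list the image tags with the AWS CLI; a sketch, assuming the hello-world repository and credentials for the associated account:

aws ecr describe-images --repository-name hello-world --query 'imageDetails[].imageTags' --output text

You should see the latest-amd64 and latest-arm64 images as well as the latest image index.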

Cleanup

If you have been following along with this workflow, you should delete the resources you deployed so you do not continue to incur charges. First, delete the Amazon ECR repository using the AWS console. Second, delete the project from CodeCatalyst by navigating to Project settings and choosing Delete project.

Conclusion

AWS Graviton processors are custom-built by AWS to deliver the best price performance for cloud workloads. In this post I explained how to configure CodeCatalyst workflow actions to run on AWS Graviton. I used CodeCatalyst to create a workflow that builds a multi-architecture container image that can run on both amd64 and arm64 architectures. Get started building your multi-arch containers in Amazon CodeCatalyst today! You can read more about CodeCatalyst workflows in the documentation.

Run Open Source FFMPEG at Lower Cost and Better Performance on a VT1 Instance for VOD Encoding Workloads 

FFmpeg is an open source tool commonly used by media technology companies to encode and transcode video and audio formats. FFmpeg users can leverage a cost efficient Amazon Web Services (AWS) instance for their video on demand (VOD) encoding workloads now that AWS offers VT1 support on Amazon Elastic Compute Cloud (Amazon EC2).

VT1 offers improved visual quality for 4K video, support for a newer version of FFmpeg (4.4), expanded OS/kernel support, and bug fixes. These instances are powered by the AMD-Xilinx Alveo U30 media accelerator. Changing a single line of an FFmpeg command is enough to have the Alveo U30 take over the transcoding work. The Xilinx Video SDK includes an enhanced version of FFmpeg that can communicate with the hardware accelerated transcode pipeline in Xilinx devices to deliver up to 30% lower cost per stream than Amazon EC2 GPU-based instances and up to 60% lower cost per stream than Amazon EC2 CPU-based instances.
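To illustrate what that single-line change looks like in practice, here is a hedged sketch of two FFmpeg invocations; the input file name and bitrate are placeholders, and the exact U30 options depend on the Xilinx Video SDK version you install:

# Software encode on a C5/C6 instance using the x264 codec
ffmpeg -i input.mp4 -c:v libx264 -preset faster -b:v 8M -y output_x264.mp4

# Hardware-accelerated encode on a VT1 instance: swap the video codec to the U30 plugin
ffmpeg -i input.mp4 -c:v mpsoc_vcu_h264 -b:v 8M -y output_u30.mp4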

Companies typically use EC2 CPU instances such as the C5 and C6 coupled with FFmpeg for their VOD encoding workloads. These workloads can be costly in cases where companies encode thousands of VOD assets. The cost of an EC2 workload is influenced by the number of concurrent encoding jobs that an instance can support, which in turn affects the time it takes to encode the targeted outputs. As VOD libraries expand, companies typically auto scale to increase the size or number of C5 and C6 instances, or allow the instances to operate longer. In both cases, these workloads experience an increase in cost. Important note: There is no additional charge for AWS Auto Scaling. You pay only for the AWS resources needed to run your applications and for Amazon CloudWatch monitoring fees.

Amazon EC2 VT1 instances are designed to accelerate real-time video transcoding and deliver low-cost transcoding for live video streams. VT1 is also a cost-effective and performance-enhancing alternative for VOD encoding workloads. Using FFmpeg as the transcoding tool, AWS performed an evaluation of VT1, C5, and C6 instances to compare price performance and speed of encode for VOD assets. When compared to C5 and C6 instances, VT1 instances can achieve up to 75% cost savings. The results show that you could operate two VT1 instances for the price of one C5 or C6 instance.

Benchmarking Method

First, let's determine the best instance type to use for our VOD workload. C5 and C6 instances are commonly used for transcoding. We used C5.9xl and C6i.8xl instances and compared them against VT1.3xl instances to transcode 4K and 1080p VOD assets. The two assets were encoded into the output targets shown in the Evaluation ABR targets section below. For each instance type, we measured the amount of time it took to complete the targeted encodes.

As shown here in the screenshot from the AWS console for various instance types, VT1.3xl is the smallest instance type in the VT1 family. Even though VT1.6xl compares closely to C5 and C6 in terms of CPU/memory, we chose VT1.3xl for a closer price/performance comparison.

VT1 family instance type comparison to C5 in terms of CPU/memory.

Input data points

Sample input content

The following table summarizes the key parameters of the source content video files used in measuring the encoding performance for the benchmarking:

| Clip Name | Frame Count | Duration | Frame Rate | Codec | Resolution | Chroma Sampling |
|---|---|---|---|---|---|---|
| 1080p | 43092 | 12 mins | 60 fps | H.264, High Profile | 1920×1080 | 4:2:0 YUV |
| 4K | 776 | 13 secs | 60 fps | H.264, High Profile | 3840×2160 | 4:2:0 YUV |

Evaluation Adaptive Bit Rate (ABR) targets

Adaptive bitrate streaming (ABR or ABS) is a technology designed to stream files efficiently over HTTP networks. Multiple renditions of the same content, at different resolutions and bitrates, are offered to the user's video player, and the client chooses the most suitable file to play back on the device. This involves transcoding a single input stream to multiple output formats optimized for different viewing resolutions.

For the benchmarking tests, the input 4K and 1080p files were transcoded to various target resolutions that can be used to support different device and network capabilities: 1080p, 720p, 540p, and 360p. The bitrate (br) in the graphic shown here indicates the target bitrate associated with each output resolution. For example, the 4K input file was transcoded to a 360p resolution at a bitrate of 640 Kbps.

Figure 1: Adaptive bitrate ladder used to transcode output video files. Source: https://developer.att.com/video-optimizer/docs/best-practices/adaptive-bitrate-video-streaming
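As a concrete illustration of such a ladder, a single FFmpeg command can produce several renditions from one input. This is only a sketch: the 640 Kbps figure for 360p comes from the example above, while the other bitrates and the input file name are placeholders rather than the benchmark's actual settings.

ffmpeg -i input_4k.mp4 \
  -vf scale=-2:360  -c:v libx264 -preset faster -b:v 640k  out_360p.mp4 \
  -vf scale=-2:540  -c:v libx264 -preset faster -b:v 1200k out_540p.mp4 \
  -vf scale=-2:720  -c:v libx264 -preset faster -b:v 2400k out_720p.mp4 \
  -vf scale=-2:1080 -c:v libx264 -preset faster -b:v 4500k out_1080p.mp4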

Output results

Target duration analysis

For the 4K clip, the VT1.3xl instance completed the targeted encodes 15.709 seconds faster than the C5.9xl instance and 12.58 seconds faster than the C6i.8xl instance. The results in the following charts detail that the VT1.3xl instance has better speed and price performance when compared to the C5.9xl and C6i.8xl instances.

% Price Performance = ((C5/C6 price performance - VT1 price performance) / C5/C6 price performance) * 100

% Speed = ((C5/C6 duration - VT1 duration) / C5/C6 duration) * 100
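For example, taking the 4K results from the first table below: the C5.9xl costs $0.01298 per clip transcoded and the VT1.3xl costs $0.002605 per clip, so the price performance difference is (0.01298 - 0.002605) / 0.01298 * 100 ≈ 79.9%, which matches the value reported in the C5.9xl column.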

H.264 4K Clip (13 seconds duration)

| Instance Type | VT1.3xl | C5.9xl | C6i.8xl |
|---|---|---|---|
| Codec | mpsoc_vcu_h264 | x264 | x264 |
| Duration to complete ABR targets (seconds; see Evaluation ABR targets) | 14.47632 | 30.186221 | 27.05856 |
| Speed compared to VT1.3xl (%) | N/A | 52.043284% | 46.500035% |
| Instance cost ($/hour) | $0.65 | $1.53 | $1.22 |
| Instance cost ($/second) | $0.00018 | $0.00043 | $0.000338 |
| Price performance: $/(clip transcoded) | $0.002605 | $0.01298 | $0.009145 |
| Price performance compared to VT1.3xl (%) | N/A | 79.930662% | 71.514488% |

H.264 1080p Clip (12 minutes duration)

| Instance Type | VT1.3xl | C5.9xl | C6i.8xl |
|---|---|---|---|
| Codec | mpsoc_vcu_h264 | x264 | x264 |
| Duration to complete ABR targets (seconds; see Evaluation ABR targets) | 490.82074 | 837.20001 | 762.63252 |
| Speed compared to VT1.3xl (%) | N/A | 41.373538% | 35.641252% |
| Instance cost ($/hour) | $0.65 | $1.53 | $1.22 |
| Instance cost ($/second) | $0.00018 | $0.00043 | $0.000338 |
| Price performance: $/(clip transcoded) | $0.0883 | $0.3599 | $0.2577 |
| Price performance compared to VT1.3xl (%) | N/A | 75.465407% | 65.735351% |

The following section explains the encoding parameters used for testing. FFmpeg was installed on one C5.9xl, one C6i.8xl, and one VT1.3xl instance. The two input files described in the Sample input content section were transcoded in parallel on each instance type, and the total duration to complete the transcoding to the various output target resolutions was calculated.

Technical specifications

EC2 Instances: 1 x C5.9xl, 1x C6i.8xl, 1 x VT1.3xl
Video framework: FFmpeg
Video codecs: x264 (CPU), XMA (Xilinx U30)
Quality objective: x264 faster
Operating system

For C5 and C6 (x264): Amazon Linux 2 (Linux kernel 4.14)
For VT1 (Xilinx): Amazon Linux 2 (Linux kernel 5.4.0-1038-aws)

Video codec settings for encoding performance tests

| Instance | C5, C6 | VT1 |
|---|---|---|
| Codec | x264 | mpsoc_vcu_h264 |
| Preset | Faster | n/a |
| Output Bitrate (CBR) | See Evaluation ABR targets above | See Evaluation ABR targets above |
| Chroma Subsampling | YUV 4:2:0 | YUV 4:2:0 |
| Color Bit Depth | 8 bits | 8 bits |
| Profile | High | n/a |

Conclusion

Amazon EC2 VT1 instances are typically used for live real-time encoding; however, this blog post demonstrates the speed and price performance advantages of VT1 for VOD encoding when compared to C5 and C6 EC2 instances. VT1 instances can encode VOD assets up to 52% faster and achieve up to a 75% reduction in cost compared to C5 and C6 instances. VT1 is best utilized for VOD encoding workloads where both cost and the time to complete output targets matter. Please visit the Amazon EC2 VT1 instances page for more details.


Unlock the power of EC2 Graviton with GitLab CI/CD and EKS Runners

Many AWS customers are using GitLab for their DevOps needs, including source control, and continuous integration and continuous delivery (CI/CD). Many of our customers are using GitLab SaaS (the hosted edition), while others are using GitLab Self-managed to meet their security and compliance requirements.

Customers can easily add runners to their GitLab instance to perform various CI/CD jobs. These jobs include compiling source code, building software packages or container images, performing unit and integration testing, etc.—even all the way to production deployment. For the SaaS edition, GitLab offers hosted runners, and customers can provide their own runners as well. Customers who run GitLab Self-managed must provide their own runners.

In this post, we’ll discuss how customers can maximize their CI/CD capabilities by managing their GitLab runner and executor fleet with Amazon Elastic Kubernetes Service (Amazon EKS). We’ll leverage both x86 and Graviton runners, allowing customers for the first time to build and test their applications both on x86 and on AWS Graviton, our most powerful, cost-effective, and sustainable instance family. In keeping with AWS’s philosophy of “pay only for what you use,” we’ll keep our Amazon Elastic Compute Cloud (Amazon EC2) instances as small as possible, and launch ephemeral runners on Spot instances. We’ll demonstrate building and testing a simple demo application on both architectures. Finally, we’ll build and deliver a multi-architecture container image that can run on Amazon EC2 instances or AWS Fargate, both on x86 and Graviton.

Figure 1.  Managed GitLab runner architecture overview.

Let’s go through the components:

Runners

A runner is an application to which GitLab sends jobs that are defined in a CI/CD pipeline. The runner receives jobs from GitLab and executes them, either by itself or by passing them to an executor (we'll visit the executor in the next section).

In our design, we’ll be using a pair of self-hosted runners. One runner will accept jobs for the x86 CPU architecture, and the other will accept jobs for the arm64 (Graviton) CPU architecture. To help us route our jobs to the proper runner, we’ll apply some tags to each runner indicating the architecture for which it will be responsible. We’ll tag the x86 runner with x86, x86-64, and amd64, thereby reflecting the most common nicknames for the architecture, and we’ll tag the arm64 runner with arm64.

Currently, these runners must always be running so that they can receive jobs as they are created. Our runners only require a small amount of memory and CPU, so that we can run them on small EC2 instances to minimize cost. These include t4g.micro for Graviton builds, or t3.micro or t3a.micro for x86 builds.

To save money on these runners, consider purchasing a Savings Plan or Reserved Instances for them. Savings Plans and Reserved Instances can save you up to 72% over on-demand pricing, and there’s no minimum spend required to use them.

Kubernetes executors

In GitLab CI/CD, the executor’s job is to perform the actual build. The runner can create hundreds or thousands of executors as needed to meet current demand, subject to the concurrency limits that you specify. Executors are created only when needed, and they are ephemeral: once a job has finished running on an executor, the runner will terminate it.

In our design, we’ll use the Kubernetes executor that’s built into the GitLab runner. The Kubernetes executor simply schedules a new pod to run each job. Once the job completes, the pod terminates, thereby freeing the node to run other jobs.
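If you want to watch this behavior, you can follow the executor pods as jobs start and finish. A small sketch, assuming the default gitlab namespace and the gitlab-role=runner pod label configured later in this post:

kubectl -n gitlab get pods -l gitlab-role=runner --watch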

The Kubernetes executor is highly customizable. We’ll configure each runner with a nodeSelector that makes sure that the jobs are scheduled only onto nodes that are running the specified CPU architecture. Other possible customizations include CPU and memory reservations, node and pod tolerations, service accounts, volume mounts, and much more.

Scaling worker nodes

For most customers, CI/CD jobs aren’t likely to be running all of the time. To save cost, we only want to run worker nodes when there’s a job to run.

To make this happen, we’ll turn to Karpenter. Karpenter provisions EC2 instances as soon as needed to fit newly-scheduled pods. If a new executor pod is scheduled, and there isn’t a qualified instance with enough capacity remaining on it, then Karpenter will quickly and automatically launch a new instance to fit the pod. Karpenter will also periodically scan the cluster and terminate idle nodes, thereby saving on costs. Karpenter can terminate a vacant node in as little as 30 seconds.

Karpenter can launch either Amazon EC2 on-demand or Spot instances depending on your needs. With Spot instances, you can save up to 90% over on-demand instance prices. Since CI/CD jobs often aren’t time-sensitive, Spot instances can be an excellent choice for GitLab execution pods. Karpenter will even automatically find the best Spot instance type to speed up the time it takes to launch an instance and minimize the likelihood of job interruption.
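A quick way to see what Karpenter has provisioned is to list the nodes along with the labels used in this post; a sketch:

kubectl get nodes -L kubernetes.io/arch -L karpenter.sh/capacity-type

The output shows each node's CPU architecture and whether it was launched as Spot or On-Demand capacity.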

Deploying our solution

To deploy our solution, we’ll write a small application using the AWS Cloud Development Kit (AWS CDK) and the EKS Blueprints library. AWS CDK is an open-source software development framework to define your cloud application resources using familiar programming languages. EKS Blueprints is a library designed to make it simple to deploy complex Kubernetes resources to an Amazon EKS cluster with minimum coding.

The high-level infrastructure code, which can be found in our GitLab repo, is very simple. I've included comments to explain how it works.

// All CDK applications start with a new cdk.App object.
const app = new cdk.App();

// Create a new EKS cluster at v1.23. Run all non-DaemonSet pods in the
// `kube-system` (coredns, etc.) and `karpenter` namespaces in Fargate
// so that we don't have to maintain EC2 instances for them.
const clusterProvider = new blueprints.GenericClusterProvider({
  version: KubernetesVersion.V1_23,
  fargateProfiles: {
    main: {
      selectors: [
        { namespace: 'kube-system' },
        { namespace: 'karpenter' },
      ]
    }
  },
  clusterLogging: [
    ClusterLoggingTypes.API,
    ClusterLoggingTypes.AUDIT,
    ClusterLoggingTypes.AUTHENTICATOR,
    ClusterLoggingTypes.CONTROLLER_MANAGER,
    ClusterLoggingTypes.SCHEDULER
  ]
});

// EKS Blueprints uses a Builder pattern.
blueprints.EksBlueprint.builder()
  .clusterProvider(clusterProvider) // start with the Cluster Provider
  .addOns(
    // Use the EKS add-ons that manage coredns and the VPC CNI plugin
    new blueprints.addons.CoreDnsAddOn('v1.8.7-eksbuild.3'),
    new blueprints.addons.VpcCniAddOn('v1.12.0-eksbuild.1'),
    // Install Karpenter
    new blueprints.addons.KarpenterAddOn({
      provisionerSpecs: {
        // Karpenter examines scheduled pods for the following labels
        // in their `nodeSelector` or `nodeAffinity` rules and routes
        // the pods to the node with the best fit, provisioning a new
        // node if necessary to meet the requirements.
        //
        // Allow either amd64 or arm64 nodes to be provisioned
        'kubernetes.io/arch': ['amd64', 'arm64'],
        // Allow either Spot or On-Demand nodes to be provisioned
        'karpenter.sh/capacity-type': ['spot', 'on-demand']
      },
      // Launch instances in the VPC private subnets
      subnetTags: {
        Name: 'gitlab-runner-eks-demo/gitlab-runner-eks-demo-vpc/PrivateSubnet*'
      },
      // Apply security groups that match the following tags to the launched instances
      securityGroupTags: {
        'kubernetes.io/cluster/gitlab-runner-eks-demo': 'owned'
      }
    }),
    // Create a pair of new GitLab runner deployments, one running on an
    // arm64 (Graviton) instance, the other on an x86_64 instance.
    // We'll show the definition of the GitLabRunner class below.
    new GitLabRunner({
      arch: CpuArch.ARM_64,
      // If you're using an on-premise GitLab installation, you'll want
      // to change the URL below.
      gitlabUrl: 'https://gitlab.com',
      // Kubernetes Secret containing the runner registration token
      // (discussed later)
      secretName: 'gitlab-runner-secret'
    }),
    new GitLabRunner({
      arch: CpuArch.X86_64,
      gitlabUrl: 'https://gitlab.com',
      secretName: 'gitlab-runner-secret'
    }),
  )
  .build(app,
    // Stack name
    'gitlab-runner-eks-demo');

The GitLabRunner class is a HelmAddOn subclass that takes a few parameters from the top-level application:

// The location and name of the GitLab Runner Helm chart
const CHART_REPO = 'https://charts.gitlab.io';
const HELM_CHART = 'gitlab-runner';

// The default namespace for the runner
const DEFAULT_NAMESPACE = 'gitlab';

// The default Helm chart version
const DEFAULT_VERSION = '0.40.1';

export enum CpuArch {
  ARM_64 = 'arm64',
  X86_64 = 'amd64'
}

// Configuration parameters
interface GitLabRunnerProps {
  // The CPU architecture of the node on which the runner pod will reside
  arch: CpuArch
  // The GitLab API URL
  gitlabUrl: string
  // Kubernetes Secret containing the runner registration token (discussed later)
  secretName: string
  // Optional tags for the runner. These will be added to the default list
  // corresponding to the runner's CPU architecture.
  tags?: string[]
  // Optional Kubernetes namespace in which the runner will be installed
  namespace?: string
  // Optional Helm chart version
  chartVersion?: string
}

export class GitLabRunner extends HelmAddOn {
  private arch: CpuArch;
  private gitlabUrl: string;
  private secretName: string;
  private tags: string[] = [];

  constructor(props: GitLabRunnerProps) {
    // Invoke the superclass (HelmAddOn) constructor
    super({
      name: `gitlab-runner-${props.arch}`,
      chart: HELM_CHART,
      repository: CHART_REPO,
      namespace: props.namespace || DEFAULT_NAMESPACE,
      version: props.chartVersion || DEFAULT_VERSION,
      release: `gitlab-runner-${props.arch}`,
    });

    this.arch = props.arch;
    this.gitlabUrl = props.gitlabUrl;
    this.secretName = props.secretName;

    // Set default runner tags
    switch (this.arch) {
      case CpuArch.X86_64:
        this.tags.push('amd64', 'x86', 'x86-64', 'x86_64');
        break;
      case CpuArch.ARM_64:
        this.tags.push('arm64');
        break;
    }
    this.tags.push(...props.tags || []); // Add any custom tags
  };

  // `deploy` method required by the abstract class definition. Our implementation
  // simply installs a Helm chart to the cluster with the proper values.
  deploy(clusterInfo: ClusterInfo): void | Promise<Construct> {
    const chart = this.addHelmChart(clusterInfo, this.getValues(), true);
    return Promise.resolve(chart);
  }

  // Returns the values for the GitLab Runner Helm chart
  private getValues(): Values {
    return {
      gitlabUrl: this.gitlabUrl,
      runners: {
        config: this.runnerConfig(), // runner config.toml file, from below
        name: `demo-runner-${this.arch}`, // name as seen in GitLab UI
        tags: uniq(this.tags).join(','),
        secret: this.secretName, // see below
      },
      // Labels to constrain the nodes where this runner can be placed
      nodeSelector: {
        'kubernetes.io/arch': this.arch,
        'karpenter.sh/capacity-type': 'on-demand'
      },
      // Default pod label
      podLabels: {
        'gitlab-role': 'manager'
      },
      // Create all the necessary RBAC resources including the ServiceAccount
      rbac: {
        create: true
      },
      // Required resources (memory/CPU) for the runner pod. The runner
      // is fairly lightweight as it's a self-contained Golang app.
      resources: {
        requests: {
          memory: '128Mi',
          cpu: '256m'
        }
      }
    };
  }

  // This string contains the runner's `config.toml` file including the
  // Kubernetes executor's configuration. Note the nodeSelector constraints
  // (including the use of Spot capacity and the CPU architecture).
  private runnerConfig(): string {
    return `
[[runners]]
  [runners.kubernetes]
    namespace = "{{.Release.Namespace}}"
    image = "ubuntu:16.04"
  [runners.kubernetes.node_selector]
    "kubernetes.io/arch" = "${this.arch}"
    "kubernetes.io/os" = "linux"
    "karpenter.sh/capacity-type" = "spot"
  [runners.kubernetes.pod_labels]
    gitlab-role = "runner"
`.trim();
  }
}

For security reasons, we store the GitLab registration token in a Kubernetes Secret – never in our source code. For additional security, we recommend encrypting Secrets using an AWS Key Management Service (AWS KMS) key that you supply by specifying the encryption configuration when you create your Amazon EKS cluster. It’s a good practice to restrict access to this Secret via Kubernetes RBAC rules.

To create the Secret, run the following command:

# These two values must match the parameters supplied to the GitLabRunner constructor
NAMESPACE=gitlab
SECRET_NAME=gitlab-runner-secret
# The value of the registration token.
TOKEN=GRxxxxxxxxxxxxxxxxxxxxxx

kubectl -n $NAMESPACE create secret generic $SECRET_NAME \
  --from-literal="runner-registration-token=$TOKEN" \
  --from-literal="runner-token="
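With the Secret in place, deploying the stack follows the usual CDK workflow. A sketch, assuming a standard Node.js project layout in the repository and AWS credentials configured locally:

npm install
npx cdk bootstrap                      # one-time setup per account and Region
npx cdk deploy gitlab-runner-eks-demo  # the stack name defined in the code above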

Building a multi-architecture container image

Now that we've launched our GitLab runners and configured the executors, we can build and test a simple multi-architecture container image. If the tests pass, we can then upload it to our project's GitLab container registry. Our application will be pretty simple: we'll create a web server in Go that prints out "Hello World" along with the current architecture.

Find the source code of our sample app in our GitLab repo.

In GitLab, the CI/CD configuration lives in the .gitlab-ci.yml file at the root of the source repository. In this file, we declare a list of ordered build stages, and then we declare the specific jobs associated with each stage.

Our stages are:

The build stage, in which we compile our code, produce our architecture-specific images, and upload these images to the GitLab container registry. These uploaded images are tagged with a suffix indicating the architecture on which they were built. This job uses a matrix variable to run it in parallel against two different runners – one for each supported architecture. Furthermore, rather than using docker build to produce our images, we use Kaniko to build them. This lets us build our images in an unprivileged container environment and improve the security posture considerably.
The test stage, in which we test the code. As with the build stage, we use a matrix variable to run the tests in parallel in separate pods on each supported architecture.

The assembly stage, in which we create a multi-architecture image manifest from the two architecture-specific images. Then, we push the manifest into the image registry so that we can refer to it in future deployments.

Figure 2. Example CI/CD pipeline for multi-architecture images.

Here’s what our top-level configuration looks like:

variables:
  # These are used by the runner to configure the Kubernetes executor, and define
  # the values of spec.containers[].resources.limits.{memory,cpu} for the Pod(s).
  KUBERNETES_MEMORY_REQUEST: 1Gi
  KUBERNETES_CPU_REQUEST: 1

# List of stages for jobs, and their order of execution
stages:
  - build
  - test
  - create-multiarch-manifest

Here's what our build stage job looks like. Note the matrix of variables which are set in BUILD_ARCH as the two jobs are run in parallel:
build-job:
  stage: build
  parallel:
    matrix:  # This job is run twice, once on amd64 (x86), once on arm64
      - BUILD_ARCH: amd64
      - BUILD_ARCH: arm64
  tags: [$BUILD_ARCH]  # Associate the job with the appropriate runner
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    - mkdir -p /kaniko/.docker
    # Configure authentication data for Kaniko so it can push to the
    # GitLab container registry
    - echo "{\"auths\":{\"${CI_REGISTRY}\":{\"auth\":\"$(printf "%s:%s" "${CI_REGISTRY_USER}" "${CI_REGISTRY_PASSWORD}" | base64 | tr -d '\n')\"}}}" > /kaniko/.docker/config.json
    # Build the image and push to the registry. In this stage, we append the build
    # architecture as a tag suffix.
    - >-
      /kaniko/executor
      --context "${CI_PROJECT_DIR}"
      --dockerfile "${CI_PROJECT_DIR}/Dockerfile"
      --destination "${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHORT_SHA}-${BUILD_ARCH}"

Here's what our test stage job looks like. This time we use the image that we just produced. Our source code is copied into the application container. Then, we can run make test-container to execute the server test suite.

test-job:
  stage: test
  parallel:
    matrix:  # This job is run twice, once on amd64 (x86), once on arm64
      - BUILD_ARCH: amd64
      - BUILD_ARCH: arm64
  tags: [$BUILD_ARCH]  # Associate the job with the appropriate runner
  image:
    # Use the image we just built
    name: "${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHORT_SHA}-${BUILD_ARCH}"
  script:
    - make test-container

Finally, here’s what our assembly stage looks like. We use Podman to build the multi-architecture manifest and push it into the image registry. Traditionally we might have used docker buildx to do this, but using Podman lets us do this work in an unprivileged container for additional security.

create-manifest-job:
  stage: create-multiarch-manifest
  tags: [arm64]
  image: public.ecr.aws/docker/library/fedora:36
  script:
    - yum -y install podman
    - echo "${CI_REGISTRY_PASSWORD}" | podman login -u "${CI_REGISTRY_USER}" --password-stdin "${CI_REGISTRY}"
    - COMPOSITE_IMAGE=${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHORT_SHA}
    - podman manifest create ${COMPOSITE_IMAGE}
    - >-
      for arch in arm64 amd64; do
        podman manifest add ${COMPOSITE_IMAGE} docker://${COMPOSITE_IMAGE}-${arch};
      done
    - podman manifest inspect ${COMPOSITE_IMAGE}
    # The composite image manifest omits the architecture from the tag suffix.
    - podman manifest push ${COMPOSITE_IMAGE} docker://${COMPOSITE_IMAGE}

Trying it out

I’ve created a public test GitLab project containing the sample source code, and attached the runners to the project. We can see them at Settings > CI/CD > Runners:

Figure 3. GitLab runner configurations.

Here we can also see some pipeline executions, where some have succeeded, and others have failed.

Figure 4. GitLab sample pipeline executions.

We can also see the specific jobs associated with a pipeline execution:

Figure 5. GitLab sample job executions.

Finally, here are our container images:

Figure 6. GitLab sample container registry.

Conclusion

In this post, we’ve illustrated how you can quickly and easily construct multi-architecture container images with GitLab, Amazon EKS, Karpenter, and Amazon EC2, using both x86 and Graviton instance families. We indexed on using as many managed services as possible, maximizing security, and minimizing complexity and TCO. We dove deep on multiple facets of the process, and discussed how to save up to 90% of the solution’s cost by using Spot instances for CI/CD executions.

Find the sample code, including everything shown here today, in our GitLab repository.

Building multi-architecture images will unlock the value and performance of running your applications on AWS Graviton and give you increased flexibility over compute choice. We encourage you to get started today.

About the author:

Michael Fischer

Michael Fischer is a Principal Specialist Solutions Architect at Amazon Web Services. He focuses on helping customers build more cost-effectively and sustainably with AWS Graviton. Michael has an extensive background in systems programming, monitoring, and observability. His hobbies include world travel, diving, and playing the drums.