How Cirrusgo enabled rapid resolution with Amazon DevOps Guru

In this blog post, we walk through how Cirrusgo used Amazon DevOps Guru for RDS to quickly identify and resolve an operational issue related to database performance and reduce the impact on their business. Amazon DevOps Guru for RDS uses machine learning to help organizations identify and resolve operational issues in their applications and infrastructure.

Challenge:

Knowledge Beam, one of Cirrusgo’s managed service customers, has an e-learning web application that serves as a mission-critical tool for nearly 90,000 teachers. The application tracks daily activities, including teaching and evaluating homework and quizzes submitted by students. Any interruption to the availability of this application causes significant inconvenience to teachers and students, as well as damage to the company’s reputation. Ensuring the continuous and reliable performance of customer workloads is of utmost importance to Cirrusgo.

Identification of Operational issues with Amazon DevOps Guru:

To streamline the troubleshooting process and avoid time-consuming manual analysis of logs, Cirrusgo leveraged the power of Amazon DevOps Guru to monitor Knowledge Beam’s stack. With just a few clicks in the AWS console, Cirrusgo seamlessly enabled DevOps Guru that uses advanced machine learning techniques to analyze Amazon CloudWatch metrics, AWS CloudTrail, and Amazon Relational Database Service (Amazon RDS) Performance Insights. This enables it to quickly identify behaviors that deviate from standard operating patterns and pinpoint the root cause of operational issues.

When users reported difficulty submitting assignments via the e-learning portal, Cirrusgo’s team launched an investigation. The team discovered 4xx and 5xx Elastic Load Balancing errors in the CloudWatch metrics, but no additional information was available. While examining the load balancer and application logs, the engineers received Amazon DevOps Guru notifications about Amazon RDS replica lag. The team promptly investigated and confirmed the replica lag, then ran commands to stop traffic to the replica instance and shift all traffic to the Amazon RDS primary node. Thanks to DevOps Guru’s insightful recommendations, the team identified and resolved the issue. With the root cause in hand, the team took additional steps to prevent recurrence, including creating a new Amazon RDS read replica and upgrading the instance type to match the current workload.

Cirrusgo quickly identified and addressed critical operational issues in Knowledge Beam’s application. This enabled them to minimize the immediate impact and enhance their customer’s applications’ future reliability and performance.

“Amazon DevOps Guru was very beneficial and helped us identify incidents in Amazon RDS. It provided useful insights we previously didn’t have, and it helped reduce our mitigation time. We have implemented it in some of the accounts we manage and are taking advantage of it,” says Mohammed Douglas Otaibi, Technical Co-Founder of Cirrusgo.

Conclusion:

This post highlights how Cirrusgo leveraged Amazon DevOps Guru to identify and quickly address anomalous behavior.

Are you looking for a way to improve the monitoring of your Amazon RDS databases? Look no further than Amazon DevOps Guru. With DevOps Guru’s RDS monitoring capabilities, you can gain deep insights into the performance and health of your databases. This includes automatic anomaly detection, proactive recommendations, and alerts for issues that require your attention.

About the authors:

Harish Bannai

Harish Bannai is a Sr. Technical Account Manager at Amazon Web Services. He holds the AWS Solutions Architect Professional, Developer Associate, Analytics Specialty, and Database Specialty certifications. He works with enterprise customers, providing technical assistance on Amazon RDS and AWS Database Migration Service operational performance and sharing database best practices.

Adnan Bilwani

Adnan Bilwani is a Senior Specialist at Amazon Web Services. Adnan focuses on improving application qualification and availability by leveraging AWS DevOps services and tools.

Lucy Hartung

Lucy Hartung is a Senior Specialist at Amazon Web Services. Lucy focuses on improving application qualification and availability by leveraging AWS.

Fully Automated Deployment of an Open Source Mail Server on AWS

Many AWS customers have the requirement to host their own email solution and prefer to operate mail servers themselves over using fully managed solutions (e.g., Amazon WorkMail). While there are certainly merits to either approach, motivations for self-managing often include:

full ownership and control
need for customized configuration
restricting access to internal networks or when connected via VPN.

In order to achieve this, customers frequently rely on open source mail servers due to flexibility and free licensing as compared to proprietary solutions like Microsoft Exchange. However, running and operating open source mail servers can be challenging as several components need to be configured and integrated to provide the desired end-to-end functionality. For example, to provide a fully functional email system, you need to combine dedicated software packages to send and receive email, access and manage inboxes, filter spam, manage users etc. Hence, this can be complex and error-prone to configure and maintain. Troubleshooting issues often calls for expert knowledge. As a result, several open source projects emerged that aim at simplifying setup of open source mail servers, such as Mail-in-a-Box, Docker Mailserver, Mailu, Modoboa, iRedMail, and several others.

In this blog post, we take this one step further by adding infrastructure automation and integrations with AWS services to fully automate the deployment of an open source mail server on AWS. We present an automated setup of a single instance mail server, striving for minimal complexity and cost, while still providing high resiliency by leveraging incremental backups and automations. As such, the solution is best suited for small to medium organizations that are looking to run open source mail servers but do not want to deal with the associated operational complexity.

The solution in this blog uses AWS CloudFormation templates to automatically set up and configure an Amazon Elastic Compute Cloud (Amazon EC2) instance running Mail-in-a-Box, which integrates features such as email, webmail, calendar, contacts, and file sharing, thus providing functionality similar to popular SaaS tools or commercial solutions. All resources to reproduce the solution are provided in a public GitHub repository under an open source license (MIT-0).

Amazon Simple Storage Service (Amazon S3) is used both for offloading user data and for storing incremental application-level backups. Aside from high resiliency, this backup strategy enables an immutable infrastructure approach, where new deployments can be rolled out to implement updates and recover from failures, which drastically simplifies operations and enhances security.

We also provide an optional integration with Amazon Simple Email Service (Amazon SES) so customers can relay their emails through reputable AWS servers and have their outgoing email accepted by third-party servers. All of this enables customers to deploy a fully featured open source mail server within minutes from AWS Management Console, or restore an existing server from an Amazon S3 backup for immutable upgrades, migration, or recovery purposes.

Overview of Solution

The following diagram shows an overview of the solution and interactions with users and other AWS services.

After preparing the AWS account and environment, an administrator deploys the solution using an AWS CloudFormation template (1.). Optionally, a backup from Amazon S3 can be referenced during deployment to restore a previous installation of the solution (1a.). The admin can then proceed with the setup by accessing the web UI (2.) to, e.g., provision TLS certificates and create new users. After the admin has provisioned their accounts, users can access the web interface (3.) to send email, manage their inboxes, access calendars and contacts, and share files. Optionally, outgoing emails are relayed via Amazon SES (3a.) and user data is stored in a dedicated Amazon S3 bucket (3b.). Furthermore, the solution is configured to automatically and periodically create incremental backups and store them in a dedicated S3 backup bucket (4.).

On top of popular open source mail server packages such as Postfix for SMTP and Dovecot for IMAP, Mail-in-a-box integrates Nextcloud for calendar, contacts, and file sharing. However, note that Nextcloud capabilities in this context are limited. It’s primarily intended to be used alongside the core mail server functionalities to maintain calendar and contacts and for lightweight file sharing (e.g. for sharing files via links that are too large for email attachments). If you are looking for a fully featured, customizable and scalable Nextcloud deployment on AWS, have a look at this AWS Sample instead.

Deploying the Solution

Prerequisites

For this walkthrough, you should have the following prerequisites:

An AWS account

An existing external email address to test your new mail server. In the context of this sample, we will use [email protected] as the address.
A domain that can be exclusively used by the mail server in the sample. In the context of this sample, we will use aws-opensource-mailserver.org as the domain. If you don’t have a domain available, you can register a new one with Amazon Route 53. If you do so, you can go ahead and delete the associated hosted zone that gets automatically created via the Amazon Route 53 Console. We won’t need this hosted zone, because the mail server we deploy will also act as the Domain Name System (DNS) server for the domain.
An SSH key pair for command line access to the instance. Command line access to the mail server is optional in this tutorial, but a key pair is still required for the setup. If you don’t already have a key pair, go ahead and create one in the EC2 Management Console:

(Optional) In this blog, we verify end-to-end functionality by sending an email to a single email address ([email protected]) leveraging Amazon SES in sandbox mode. In case you want to adapt this sample for your use case and send email beyond that, you need to request removal of the email sending limitations for EC2 or, if you relay your mail via Amazon SES, request moving out of the Amazon SES sandbox.

Preliminary steps: Setting up DNS and creating S3 Buckets

Before deploying the solution, we need to set up DNS and create Amazon S3 buckets for backups and user data.

1.     Allocate an Elastic IP address: We use the address 52.6.x.y in this sample.

2.     Configure DNS: If you have your domain registered with Amazon Route 53, you can use the AWS Management Console to change the name server and glue records for your domain. Configure two DNS servers ns1.box.<your-domain> and ns2.box.<your-domain> by placing your Elastic IP (allocated in step 1) into the Glue records field for each name server:

If you use a third-party DNS service, check their corresponding documentation on how to set the glue records.

It may take a while until the updates to the glue records propagate through the global DNS system. Optionally, before proceeding with the deployment, you can verify your glue records are setup correctly with the dig command line utility:

# Get a list of root servers for your top level domain
dig +short org. NS
# Query one of the root servers for an NS record of your domain
dig c0.org.afilias-nst.info. aws-opensource-mailserver.org. NS

This should give you output as follows:

;; ADDITIONAL SECTION:
ns1.box.aws-opensource-mailserver.org. 3600 IN A 52.6.x.y
ns2.box.aws-opensource-mailserver.org. 3600 IN A 52.6.x.y

3.     Create S3 buckets for backups and user data: Finally, in the Amazon S3 Console, create a bucket to store Nextcloud data and another bucket for backups, choosing globally unique names for both of them. In the context of this sample, we use the two buckets aws-opensource-mailserver-backup and aws-opensource-mailserver-nextcloud, as shown here:
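If you prefer the command line, steps 1 and 3 can also be performed with the AWS CLI. The following is a minimal sketch, assuming your AWS CLI is configured for the target region and that the example bucket names above are still available:

# Step 1: allocate an Elastic IP address in your VPC
aws ec2 allocate-address --domain vpc

# Step 3: create the two S3 buckets used by the solution
aws s3 mb s3://aws-opensource-mailserver-backup
aws s3 mb s3://aws-opensource-mailserver-nextcloud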

Deploying and Configuring Mail-in-a-Box

Click the Launch Stack button to deploy the solution and specify the parameters as shown in the screenshot below to match the resources created in the previous section. Leave the other parameters at their default values, then click Next and Submit.

This will deploy your mail server into a public subnet of your default VPC, which takes about 10 minutes. You can monitor the progress in the AWS CloudFormation Console. Meanwhile, retrieve and note the admin password for the web UI from AWS Systems Manager Parameter Store via the MailInABoxAdminPassword parameter.
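You can retrieve the password from the console or with the AWS CLI; a minimal sketch, assuming the parameter is stored as a SecureString under the name shown in the console:

# Read the admin password from Systems Manager Parameter Store
aws ssm get-parameter --name MailInABoxAdminPassword --with-decryption \
    --query Parameter.Value --output text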

Roughly one minute after your mail server finishes deploying, you can log in at its admin web UI residing at https://52.6.x.y/admin with username admin@<your-domain>, as shown in the following picture (you need to confirm the certificate exception warning from your browser):

Finally, in the admin UI navigate to System > TLS(SSL) Certificates and click Provision to obtain a valid SSL certificate and complete the setup (you might need to click on Provision twice to have all domains included in your certificate, as shown here).

At this point, you could further customize your mail server setup (e.g., by creating inboxes for additional users). However, we will continue to use the admin user in this sample for testing the setup in the next section.

Note: If your AWS account is subject to email sending restrictions on EC2, you will see an error in your admin dashboard under System > System Status Checks that says ‘Incoming Email (SMTP/postfix) is running but not publicly accessible’. You are safe to ignore this and should be able to receive emails regardless.

Testing the Solution

Receiving Email

With your existing email account, compose and send an email to admin@<your-domain>. Then login as admin@<your-domain> to the webmail UI of your AWS mail server at https://box.<your-domain>/mail and verify you received the email:

Test file sharing, calendar and contacts with Nextcloud

Your Nextcloud installation can be accessed under https://box.<your-domain>/cloud, as shown in the next figure. Here you can manage your calendar, contacts, and shared files. Contacts created and managed here are also accessible in your webmail UI when you compose an email. Refer to the Nextcloud documentation for more details. In order to keep your Nextcloud installation consistent and automatically managed by Mail-in-a-box setup scripts, admin users are advised to refrain from changing and customizing the Nextcloud configuration.

Sending Email

For this sample, we use Amazon SES to forward your outgoing email, as this is a simple way to get the emails you send accepted by other mail servers on the web. Achieving this is not trivial otherwise, as several popular email services tend to block public IP ranges of cloud providers.

Alternatively, if the email sending limitations for EC2 have been removed from your AWS account, you can send emails directly from your mail server. In this case, you can skip the next section and continue with Send test email, but make sure you’ve deployed your mail server stack with the SesRelay parameter set to false. In that case, you can also bring your own IP addresses to AWS and continue using your reputable addresses, or build reputation for addresses you own.

Verify your domain and existing Email address to Amazon SES

In order to use Amazon SES to accept and forward email for your domain, you first need to prove ownership of it. Navigate to Verified identities in the Amazon SES Console, click Create identity, select Domain, and enter your domain. You will then be presented with a screen as shown here:

You now need to copy-paste the three CNAME DNS records from this screen over to your mail server admin dashboard. Open the admin web UI of your mail server again, select System > Custom DNS, and add the records as shown in the next screenshot.

Amazon SES will detect these records, thereby recognizing you as the owner and verifying the domain for sending emails. Similarly, while still in sandbox mode, you also need to verify ownership of the recipient email address. Navigate again to Verified Identities in the Amazon SES Console, click Create identity, choose Email Address, and enter your existing email address.

Amazon SES will then send a verification link to this address, and once you’ve confirmed via the link that you own this address, you can send emails to it.
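If you prefer the AWS CLI, the same identities can be created and inspected there as well; a minimal sketch, where <your-existing-address> is a placeholder for the external address you want to verify:

# Create (and trigger verification of) an email address identity
aws sesv2 create-email-identity --email-identity <your-existing-address>

# Check the verification status of the domain identity added via the console above
aws sesv2 get-email-identity --email-identity aws-opensource-mailserver.org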

Summing up, your verified identities section should look similar to the next screenshot before sending the test email:

Finally, if you intend to send email to arbitrary addresses with Amazon SES beyond testing in the next step, refer to the documentation on how to request production access.

Send test email

Now you are set to log back into your webmail UI and reply to the test mail you received before:

Checking the inbox of your existing mail, you should see the mail you just sent from your AWS server.

Congratulations! You have now verified full functionality of your open source mail server on AWS.

Restoring from backup

Finally, we demonstrate how to roll out immutable deployments and restore from a backup for simple recovery, migration, and upgrades. In this context, we test recreating the entire mail server from a backup stored in Amazon S3.

For that, we use the restore feature of the CloudFormation template we deployed earlier to migrate from the initial t2.micro installation to an AWS Graviton arm64-based t4g.micro instance. This exemplifies the power of the immutable infrastructure approach made possible by the automated application level backups, allowing for simple migration between instance types with different CPU architectures.

Verify you have a backup

By default, your server is configured to create an initial backup upon installation and nightly incremental backups. Using your ssh key pair, you can connect to your instance and trigger a manual backup to make sure the emails you just sent and received when testing will be included in the backup:

ssh -i aws-opensource-mailserver.pem [email protected] sudo /opt/mailinabox/management/backup.py
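Since backups are written to the backup bucket created earlier, you can also confirm that backup objects have arrived there; a quick check, assuming the example bucket name from above:

# List the incremental backup objects in the backup bucket
aws s3 ls s3://aws-opensource-mailserver-backup/ --recursive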

You can then go to your mail server’s admin dashboard at https://box.<your-domain>/admin and verify the backup status under System > Backup Status:

Recreate your mail server and restore from backup

First, double check that you have saved the admin password, as you will no longer be able to retrieve it from Parameter Store once you delete the original installation of your mail server. Then go ahead and delete the aws-opensource-mailserver stack from your CloudFormation Console and redeploy it by clicking the Launch Stack button again. This time, however, adapt the parameters as shown below, changing the instance type and corresponding AMI as well as specifying the prefix in your backup S3 bucket to restore from.

Within a couple of minutes, your mail server will be up and running again, in exactly the same state it was in before you deleted it, but running on a completely new instance powered by AWS Graviton. You can verify this by going to your webmail UI at https://box.<your-domain>/mail and logging in with your old admin credentials.

Cleaning up

Delete the mail server stack from the CloudFormation Console

Empty and delete both the backup and Nextcloud data S3 buckets

Release the Elastic IP

In case you registered your domain with Amazon Route 53 and do not want to hold onto it, you need to disable automatic renewal. Further, if you haven’t already, delete the hosted zone that got created automatically when registering it.

Outlook

The solution discussed so far focuses on minimal operational complexity and cost and hence is based on a single Amazon EC2 instance comprising all functions of an open source mail server, including a management UI, user database, Nextcloud, and DNS. With a suitably sized instance, this setup can meet the demands of small to medium organizations. In particular, the continuous incremental backups to Amazon S3 provide high resiliency and can be leveraged in conjunction with the CloudFormation automations to quickly recover in case of instance or single Availability Zone (AZ) failures.

Depending on your requirements, extending the solution and distributing components across AZs allows you to meet more stringent high availability and scalability requirements for larger deployments. Being based on open source software, there is a straightforward migration path towards these more complex distributed architectures once you outgrow the setup discussed in this post.

Conclusion

In this blog post, we showed how to automate the deployment of an open source mail server on AWS and how to quickly and effortlessly restore from a backup for rolling out immutable updates and providing high resiliency. Using AWS CloudFormation infrastructure automations and integrations with managed services such as Amazon S3 and Amazon SES, the lifecycle management and operation of open source mail servers on AWS can be simplified significantly. Once deployed, the solution provides an end-user experience similar to popular SaaS and commercial offerings.

You can go ahead and use the automations provided in this blog and the corresponding GitHub repository to get started with running your own open source mail server on AWS!


Using Open Source Cedar to Write and Enforce Custom Authorization Policies

Cedar is an open source language and software development kit (SDK) for writing and enforcing authorization policies for your applications. You can use Cedar to control access to resources such as photos in a photo-sharing app, compute nodes in a micro-services cluster, or components in a workflow automation system. You specify fine-grained permissions as Cedar policies, and your application authorizes access requests by calling the Cedar SDK’s authorization engine. Cedar has a simple and expressive syntax that supports common authorization paradigms, including both role-based access control (RBAC) and attribute-based access control (ABAC). Because Cedar policies are separate from application code, they can be independently authored, analyzed, and audited, and even shared among multiple applications.

In this blog post, we introduce Cedar and the SDK using an example application, TinyTodo, whose users and teams can organize, track, and share their todo lists. We present examples of TinyTodo permissions as Cedar policies and how TinyTodo uses the Cedar authorization engine to ensure that only intended users are granted access. A more detailed version of this post is included with the TinyTodo code.

TinyTodo

TinyTodo allows individuals, called Users, and groups, called Teams, to organize, track, and share their todo lists. Users create Lists which they can populate with tasks. As tasks are completed, they can be checked off the list.

TinyTodo Permissions

We don’t want to allow TinyTodo users to see or make changes to just any task list. TinyTodo uses Cedar to control who has access to what. A List’s creator, called its owner, can share the list with other Users or Teams. Owners can share lists in two different modes: reader and editor. A reader can get details of a List and the tasks inside it. An editor can do those things as well, but may also add new tasks, as well as edit, (un)check, and remove existing tasks.

We specify and enforce these access permissions using Cedar. Here is one of TinyTodo’s Cedar policies.

// policy 1: A User can perform any action on a List they own
permit(principal, action, resource)
when {
    resource has owner && resource.owner == principal
};

This policy states that any principal (a TinyTodo User) can perform any action on any resource (a TinyTodo List) as long as the resource has an owner attribute that matches the requesting principal.

Here’s another TinyTodo Cedar policy.

// policy 2: A User can see a List if they are either a reader or editor
permit (
    principal,
    action == Action::"GetList",
    resource
)
when {
    principal in resource.readers || principal in resource.editors
};

This policy states that any principal can read the contents of a task list (Action::"GetList") so long as they are in either the list’s readers group, or its editors group.

Cedar’s authorizer enforces default deny: A request is authorized only if a specific permit policy grants it.

The full set of policies can be found in the TinyTodo file policies.cedar (discussed below). To learn more about Cedar’s syntax and capabilities, check out the Cedar online tutorial at https://www.cedarpolicy.com/.

Building TinyTodo

To build TinyTodo you need to install Rust and Python3, and the Python3 requests module. Download and build the TinyTodo code by doing the following:

> git clone https://github.com/cedar-policy/tinytodo
…downloading messages here
> cd tinytodo
> cargo build
…build messages here

The cargo build command will automatically download and build the Cedar Rust packages cedar-policy-core, cedar-policy-validator, and others, from Rust’s standard package registry, crates.io, and build the TinyTodo server, tiny-todo-server. The TinyTodo CLI is a Python script, tinytodo.py, which interacts with the server. The basic architecture is shown in Figure 1.

Figure 1: TinyTodo application architecture

Running TinyTodo

Let’s run TinyTodo. To begin, we start the server, assume the identity of user andrew, create a new todo list called Cedar blog post, add two tasks to that list, and then complete one of the tasks.

> python -i tinytodo.py
>>> start_server()
TinyTodo server started on port 8080
>>> set_user(andrew)
User is now andrew
>>> get_lists()
No lists for andrew
>>> create_list("Cedar blog post")
Created list ID 0
>>> get_list(0)
=== Cedar blog post ===
List ID: 0
Owner: User::"andrew"
Tasks:
>>> create_task(0,"Draft the post")
Created task on list ID 0
>>> create_task(0,"Revise and polish")
Created task on list ID 0
>>> get_list(0)
=== Cedar blog post ===
List ID: 0
Owner: User::"andrew"
Tasks:
1. [ ] Draft the post
2. [ ] Revise and polish
>>> toggle_task(0,1)
Toggled task on list ID 0
>>> get_list(0)
=== Cedar blog post ===
List ID: 0
Owner: User::”andrew”
Tasks:
1. [X] Draft the post
2. [ ] Revise and polish

Figure 2: Users and Teams in TinyTodo

The get_list, create_task, and toggle_task commands are all authorized by the Cedar Policy 1 we saw above: since andrew is the owner of List ID 0, he is allowed to carry out any action on it.

Now, continuing as user andrew, we share the list with team interns as a reader. TinyTodo is configured so that the relationship between users and teams is as shown in Figure 2. We switch the user identity to aaron, list the tasks, and attempt to complete another task, but the attempt is denied because aaron is only allowed to view the list (since he’s a member of interns), not edit it. Finally, we switch to user kesha and attempt to view the list, but the attempt is not allowed (interns is a member of temp, but not the reverse).

>>> share_list(0,interns,read_only=True)
Shared list ID 0 with interns as reader
>>> set_user(aaron)
User is now aaron
>>> get_list(0)
=== Cedar blog post ===
List ID: 0
Owner: User::"andrew"
Tasks:
1. [X] Draft the post
2. [ ] Revise and polish
>>> toggle_task(0,2)
Access denied. User aaron is not authorized to Toggle Task on [0, 2]
>>> set_user(kesha)
User is now kesha
>>> get_list(0)
Access denied. User kesha is not authorized to Get List on [0]
>>> stop_server()
TinyTodo server stopped on port 8080

Here, aaron's get_list command is authorized by the Cedar Policy 2 we saw above, since aaron is a member of the Team interns, which andrew made a reader of List 0. aaron's toggle_task and kesha's get_list commands are both denied because no specific policy exists that authorizes them.

Extending TinyTodo’s Policies with Administrator Privileges

We can change the policies with no updates to the application code because they are defined and maintained independently. To see this, add the following policy to the end of the policies.cedar file:

permit(
    principal in Team::"Admin",
    action,
    resource in Application::"TinyTodo");

This policy states that any user who is a member of Team::"Admin" is able to carry out any action on any List (all of which are part of the Application::"TinyTodo" group). Since user emina is defined to be a member of Team::"Admin" (see Figure 2), if we restart TinyTodo to use this new policy, we can see emina is able to view and edit any list:

> python -i tinytodo.py
>>> start_server()
=== TinyTodo started on port 8080
>>> set_user(andrew)
User is now andrew
>>> create_list("Cedar blog post")
Created list ID 0
>>> set_user(emina)
User is now emina
>>> get_list(0)
=== Cedar blog post ===
List ID: 0
Owner: User::"andrew"
Tasks:
>>> delete_list(0)
List Deleted
>>> stop_server()
TinyTodo server stopped on port 8080

Enforcing access requests

When the TinyTodo server receives a command from the client, such as get_list or toggle_task, it checks to see if that command is allowed by invoking the Cedar authorization engine. To do so, it translates the command information into a Cedar request and passes it with relevant data to the Cedar authorization engine, which either allows or denies the request.

Here’s what that looks like in the server code, written in Rust. Each command has a corresponding handler, and that handler first calls the function self.is_authorized to authorize the request before continuing with the command logic. Here’s what that function looks like:

pub fn is_authorized(
    &self,
    principal: impl AsRef<EntityUid>,
    action: impl AsRef<EntityUid>,
    resource: impl AsRef<EntityUid>,
) -> Result<()> {
    let es = self.entities.as_entities();
    let q = Request::new(
        Some(principal.as_ref().clone().into()),
        Some(action.as_ref().clone().into()),
        Some(resource.as_ref().clone().into()),
        Context::empty(),
    );
    info!("is_authorized request: …");
    let resp = self.authorizer.is_authorized(&q, &self.policies, &es);
    info!("Auth response: {:?}", resp);
    match resp.decision() {
        Decision::Allow => Ok(()),
        Decision::Deny => Err(Error::AuthDenied(resp.diagnostics().clone())),
    }
}

The Cedar authorization engine is stored in the variable self.authorizer and is invoked via the call self.authorizer.is_authorized(&q, &self.policies, &es). The first argument is the access request &q — can the principal perform action on resource with an empty context? An example from our sample run above is whether User::"kesha" can perform action Action::"GetList" on resource List::"0". (The notation Type::"id" used here is of a Cedar entity UID, which has Rust type cedar_policy::EntityUid in the code.) The second argument is the set of Cedar policies &self.policies the engine will consult when deciding the request; these were read in by the server when it started up. The last argument &es is the set of entities the engine will consider when consulting the policies. These are data objects that represent TinyTodo’s Users, Teams, and Lists, to which the policies may refer. The Cedar authorizer returns a decision: If Decision::Allow then the TinyTodo command can proceed; if Decision::Deny then the server returns that access is denied. The request and its outcome are logged by the calls to info!(…).

Learn More

We are just getting started with TinyTodo, and we have only seen some of what the Cedar SDK can do. You can find a full tutorial in TUTORIAL.md in the tinytodo source code directory which explains (1) the full set of TinyTodo Cedar policies; (2) information about TinyTodo’s Cedar data model, i.e., how TinyTodo stores information about users, teams, lists and tasks as Cedar entities; (3) how we specify the expected data model and structure of TinyTodo access requests as a Cedar schema, and use the Cedar SDK’s validator to ensure that policies conform to the schema; and (4) challenge problems for extending TinyTodo to be even more full featured.

Cedar and Open Source

Cedar is the authorization policy language used by customers of the Amazon Verified Permissions and AWS Verified Access managed services. With the release of the Cedar SDK on GitHub, we provide transparency into Cedar’s development, invite community contributions, and hope to build trust in Cedar’s security.

All of Cedar’s code is available at https://github.com/cedar-policy/. Check out the roadmap and issues list on the site to see where it is going and how you could contribute. We welcome submissions of issues and feature requests via GitHub issues. We built the core Cedar SDK components (for example, the authorizer) using a new process called verification-guided development in order to provide extra assurance that they are safe and secure. To contribute to these components, you can submit a “request for comments” and engage with the core team to get your change approved.

To learn more, feel free to submit questions, comments, and suggestions via the public Cedar Slack workspace, https://cedar-policy.slack.com. You can also complete the online Cedar tutorial and play with it via the language playground at https://www.cedarpolicy.com/.


S3 URI Parsing is now available in AWS SDK for Java 2.x

The AWS SDK for Java team is pleased to announce the general availability of Amazon Simple Storage Service (Amazon S3) URI parsing in the AWS SDK for Java 2.x. You can now parse path-style and virtual-hosted-style S3 URIs to easily retrieve the bucket, key, region, style, and query parameters. The new parseUri() API and S3Uri class provide the highly requested parsing features that many customers missed from the AWS SDK for Java 1.x. Please note that Amazon S3 Access Points and Amazon S3 on Outposts URI parsing are not supported.

Motivation

Users often need to extract important components like bucket and key from stored S3 URIs to use in S3Client operations. The new parsing APIs allow users to conveniently do so, bypassing the need for manual parsing or storing the components separately.

Getting Started

To begin, first add the dependency for S3 to your project.

<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>s3</artifactId>
    <version>${s3.version}</version>
</dependency>

Next, instantiate S3Client and S3Utilities objects.

S3Client s3Client = S3Client.create();
S3Utilities s3Utilities = s3Client.utilities();

Parsing an S3 URI

To parse your S3 URI, call parseUri() from S3Utilities, passing in the URI. This returns a parsed S3Uri object. If you have a String of the URI, you’ll need to convert it into a URI object first.

String url = "https://s3.us-west-1.amazonaws.com/myBucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88";
URI uri = URI.create(url);
S3Uri s3Uri = s3Utilities.parseUri(uri);

With the S3Uri, you can call the appropriate getter methods to retrieve the bucket, key, region, style, and query parameters. If the bucket, key, or region is not specified in the URI, an empty Optional will be returned. If query parameters are not specified in the URI, an empty map will be returned. If the field is encoded in the URI, it will be returned decoded.

Region region = s3Uri.region().orElse(null); // Region.US_WEST_1
String bucket = s3Uri.bucket().orElse(null); // "myBucket"
String key = s3Uri.key().orElse(null); // "resources/doc.txt"
boolean isPathStyle = s3Uri.isPathStyle(); // true

Retrieving query parameters

There are several APIs for retrieving the query parameters. You can return a Map<String, List<String>> of the query parameters. Alternatively, you can specify a query parameter to return the first value for the given query, or return the list of values for the given query.

Map<String, List<String>> queryParams = s3Uri.rawQueryParameters(); // {versionId=["abc123"], partNumber=["77", "88"]}
String versionId = s3Uri.firstMatchingRawQueryParameter("versionId").orElse(null); // "abc123"
String partNumber = s3Uri.firstMatchingRawQueryParameter("partNumber").orElse(null); // "77"
List<String> partNumbers = s3Uri.firstMatchingRawQueryParameters("partNumber"); // ["77", "88"]

Caveats

Special Characters

If you work with object keys or query parameters with reserved or unsafe characters, they must be URL-encoded, e.g., replace whitespace " " with "%20".

Valid:
"https://s3.us-west-1.amazonaws.com/myBucket/object%20key?query=%5Bbrackets%5D"

Invalid:
"https://s3.us-west-1.amazonaws.com/myBucket/object key?query=[brackets]"

Virtual-hosted-style URIs

If you work with virtual-hosted-style URIs with bucket names that contain a dot, i.e., ".", the dot must not be URL-encoded.

Valid:
"https://my.Bucket.s3.us-west-1.amazonaws.com/key"

Invalid:
"https://my%2EBucket.s3.us-west-1.amazonaws.com/key"

Conclusion

In this post, I discussed parsing S3 URIs in the AWS SDK for Java 2.x and provided code examples for retrieving the bucket, key, region, style, and query parameters. To learn more about how to set up and begin using the feature, visit our Developer Guide. If you are curious about how it is implemented, check out the source code on GitHub. As always, the AWS SDK for Java team welcomes bug reports, feature requests, and pull requests on the aws-sdk-java-v2 GitHub repository.

The history and future roadmap of the AWS CloudFormation Registry

AWS CloudFormation is an Infrastructure as Code (IaC) service that allows you to model your cloud resources in template files that can be authored or generated in a variety of languages. You can manage stacks that deploy those resources via the AWS Management Console, the AWS Command Line Interface (AWS CLI), or the API. CloudFormation helps customers to quickly and consistently deploy and manage cloud resources, but like all IaC tools, it faced challenges keeping up with the rapid pace of innovation of AWS services. In this post, we will review the history of the CloudFormation registry, which is the result of a strategy we developed to address scaling and standardization, as well as integration with other leading IaC tools and partner products. We will also give an update on the current state of CloudFormation resource coverage and review the future state, which has a goal of keeping CloudFormation and other IaC tools up to date with the latest AWS services and features.

History

The CloudFormation service was first announced in February of 2011, with sample templates that showed how to deploy common applications like blogs and wikis. At launch, CloudFormation supported 13 out of 15 available AWS services with 48 total resource types. At first, resource coverage was tightly coupled to the core CloudFormation engine, and all development on those resources was done by the CloudFormation team itself. Over the past decade, AWS has grown at a rapid pace, and there are currently 200+ services in total. A challenge over the years has been the coverage gap between what was possible for a customer to achieve using AWS services, and what was possible to define in a CloudFormation template.

It became obvious that we needed a change in strategy to scale resource development in a way that could keep up with the rapid pace of innovation set by hundreds of service teams delivering new features on a daily basis. Over the last decade, our pace of innovation has increased nearly 40-fold, with 80 significant new features launched in 2011 versus more than 3,000 in 2021. Since CloudFormation was a key adoption driver (or blocker) for new AWS services, those teams needed a way to create and manage their own resources. The goal was to enable day one support of new services at the time of launch with complete CloudFormation resource coverage.

In 2016, we launched an internal self-service platform that allowed service teams to control their own resources. This began to solve the scaling problems inherent in the prior model where the core CloudFormation team had to do all the work themselves. The benefits went beyond simply distributing developer effort, as the service teams have deep domain knowledge on their products, which allowed them to create more effective IaC components. However, as we developed resources on this model, we realized that additional design features were needed, such as standardization that could enable automatic support for features like drift detection and resource imports.

We embarked on a new project to address these concerns, with the goal of improving the internal developer experience as well as providing a public registry where customers could use the same programming model to define their own resource types. We realized that it wasn’t enough to simply make the new model available—we had to evangelize it with a training campaign, conduct engineering boot-camps, build better tooling like dashboards and deployment pipeline templates, and produce comprehensive on-boarding documentation. Most importantly, we made CloudFormation support a required item on the feature launch checklist for new services, a requirement that goes beyond documentation and is built into internal release tooling (exceptions to this requirement are rare as training and awareness around the registry have improved over time). This was a prime example of one of the maxims we repeat often at Amazon: good mechanisms are better than good intentions.

In 2019, we made this new functionality available to customers when we announced the CloudFormation registry, a capability that allowed developers to create and manage private resource types. We followed up in 2021 with the public registry where third parties, such as partners in the AWS Partner Network (APN), can publish extensions. The open source resource model that customers and partners use to publish third-party registry extensions is the same model used by AWS service teams to provide CloudFormation support for their features.

Once a service team on-boards their resources to the new resource model and builds the expected Create, Read, Update, Delete, and List (CRUDL) handlers, managed experiences like drift detection and resource import are all supported with no additional development effort. One recent example of day-1 CloudFormation support for a popular new feature was Lambda Function URLs, which offered a built-in HTTPS endpoint for single-function micro-services. We also migrated the Amazon Relational Database Service (Amazon RDS) Database Instance resource (AWS::RDS::DBInstance) to the new resource model in September 2022, and within a month, Amazon RDS delivered support for Amazon Aurora Serverless v2 in CloudFormation. This accelerated delivery is possible because teams can now publish independently by taking advantage of the de-centralized Registry ownership model.

Current State

We are building out future innovations for the CloudFormation service on top of this new standardized resource model so that customers can benefit from a consistent implementation of event handlers. We built AWS Cloud Control API on top of this new resource model. Cloud Control API takes the Create-Read-Update-Delete-List (CRUDL) handlers written for the new resource model and makes them available as a consistent API for provisioning resources. APN partner products such as HashiCorp Terraform, Pulumi, and Red Hat Ansible use Cloud Control API to stay in sync with AWS service launches without recurring development effort.

Figure 1. Cloud Control API Resource Handler Diagram
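For example, resources that are on the new model can be read and listed through this single, uniform API. A minimal sketch with the AWS CLI, where my-example-bucket is a placeholder identifier:

# List buckets through Cloud Control API
aws cloudcontrol list-resources --type-name AWS::S3::Bucket

# Read a single resource's current state
aws cloudcontrol get-resource --type-name AWS::S3::Bucket --identifier my-example-bucket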

Besides 3rd party application support, the public registry can also be used by the developer community to create useful extensions on top of AWS services. A common solution to extending the capabilities of CloudFormation resources is to write a custom resource, which generally involves inline AWS Lambda function code that runs in response to CREATE, UPDATE, and DELETE signals during stack operations. Some of those use cases can now be solved by writing a registry extension resource type instead. For more information on custom resources and resource types, and the differences between the two, see Managing resources using AWS CloudFormation Resource Types.

CloudFormation Registry modules, which are building blocks authored in JSON or YAML, give customers a way to replace fragile copy-paste template reuse with template snippets that are published in the registry and consumed as if they were resource types. Best practices can be encapsulated and shared across an organization, which allows infrastructure developers to easily adhere to those best practices using modular components that abstract away the intricate details of resource configuration.

CloudFormation Registry hooks give security and compliance teams a vital tool to validate stack deployments before any resources are created, modified, or deleted. An infrastructure team can activate hooks in an account to ensure that stack deployments cannot avoid or suppress preventative controls implemented in hook handlers. Provisioning tools that are strictly client-side do not have this level of enforcement.

A useful by-product of publishing a resource type to the public registry is that you get automatic support for the AWS Cloud Development Kit (CDK) via an experimental open source repository on GitHub called cdk-cloudformation. In large organizations it is typical to see a mix of CloudFormation deployments using declarative templates and deployments that make use of the CDK in languages like TypeScript and Python. By publishing re-usable resource types to the registry, all of your developers can benefit from higher level abstractions, regardless of the tool they choose to create and deploy their applications. (Note that this project is still considered a developer preview and is subject to change)

If you want to see if a given CloudFormation resource is on the new registry model or not, check if the provisioning type is either Fully Mutable or Immutable by invoking the DescribeType API and inspecting the ProvisioningType response element.

Here is a sample CLI command that gets a description for the AWS::Lambda::Function resource, which is on the new registry model.

$ aws cloudformation describe-type --type RESOURCE \
    --type-name AWS::Lambda::Function | grep ProvisioningType

"ProvisioningType": "FULLY_MUTABLE",

The difference between FULLY_MUTABLE and IMMUTABLE is the presence of the Update handler. FULLY_MUTABLE types include an update handler to process updates to the type during stack update operations, whereas IMMUTABLE types do not, so the type can’t be updated and must instead be replaced during stack update operations. Legacy resource types will be NON_PROVISIONABLE.
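You can also list types by provisioning model rather than querying them one at a time; a minimal sketch using the ListTypes filters:

# List public resource types that are on the new, fully mutable registry model
aws cloudformation list-types --visibility PUBLIC --provisioning-type FULLY_MUTABLE --type RESOURCE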

Opportunities for improvement

As we continue to strive towards our ultimate goal of achieving full feature coverage and a complete migration away from the legacy resource model, we are constantly identifying opportunities for improvement. We are currently addressing feature gaps in supported resources, such as tagging support for EC2 VPC Endpoints and boosting coverage for resource types to support drift detection, resource import, and Cloud Control API. We have fully migrated more than 130 resources, and acknowledge that there are many left to go, and the migration has taken longer than we initially anticipated. Our top priority is to maintain the stability of existing stacks—we simply cannot break backwards compatibility in the interest of meeting a deadline, so we are being careful and deliberate. One of the big benefits of a server-side provisioning engine like CloudFormation is operational stability—no matter how long ago you deployed a stack, any future modifications to it will work without needing to worry about upgrading client libraries. We remain committed to streamlining the migration process for service teams and making it as easy and efficient as possible.

The developer experience for creating registry extensions has some rough edges, particularly for languages other than Java, which is the language of choice on AWS service teams for their resource types. It needs to be easier to author schemas, write handler functions, and test the code to make sure it performs as expected. We are devoting more resources to the maintenance of the CLI and plugins for Python, Typescript, and Go. Our response times to issues and pull requests in these and other repositories in the aws-cloudformation GitHub organization have not been as fast as they should be, and we are making improvements. One example is the cloudformation-cli repository, where we have merged more than 30 pull requests since October of 2022.

To keep up with progress on resource coverage, check out the CloudFormation Coverage Roadmap, a GitHub project where we catalog all of the open issues to be resolved. You can submit bug reports and feature requests related to resource coverage in this repository and keep tabs on the status of open requests. One of the steps we took recently to improve responses to feature requests and bugs reported on GitHub is to create a system that converts GitHub issues into tickets in our internal issue tracker. These tickets go directly to the responsible service teams—an example is the Amazon RDS resource provider, which has hundreds of merged pull requests.

We have recently announced a new GitHub repository called community-registry-extensions where we are managing a namespace for public registry extensions. You can submit and discuss new ideas for extensions and contribute to any of the related projects. We handle the testing, validation, and deployment of all resources under the AwsCommunity:: namespace, which can be activated in any AWS account for use in your own templates.

To get started with the CloudFormation registry, visit the user guide, and then dive in to the detailed developer guide for information on how to use the CloudFormation Command Line Interface (CFN-CLI) to write your own resource types, modules, and hooks.
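As a rough sketch of what that workflow looks like, assuming you develop your handlers in Python and use the cloudformation-cli tooling:

# Install the CloudFormation CLI and the Python language plugin
pip install cloudformation-cli cloudformation-cli-python-plugin

# Scaffold a new resource type project and answer the interactive prompts
cfn init

# Validate the schema, run contract tests locally, then do a dry-run registration
cfn validate
cfn test
cfn submit --dry-run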

We recently created a new Discord server dedicated to CloudFormation. Please join us to ask questions, discuss best practices, provide feedback, or just hang out! We look forward to seeing you there.

Conclusion

In this post, we hope you gained some insights into the history of the CloudFormation registry, and the design decisions that were made during our evolution towards a standardized, scalable model for resource development that can be shared by AWS service teams, customers, and APN partners. Some of the lessons that we learned along the way might be applicable to complex design initiatives at your own company. We hope to see you on Discord and GitHub as we build out a rich set of registry resources together!

About the authors:

Eric Beard

Eric is a Solutions Architect at Amazon Web Services in Seattle, Washington, where he leads the field specialist group for Infrastructure as Code. His technology career spans two decades, preceded by service in the United States Marine Corps as a Russian interpreter and arms control inspector.

Rahul Sharma

Rahul is a Senior Product Manager-Technical at Amazon Web Services with over two years of product management spanning AWS CloudFormation and AWS Cloud Control API.

Integrating DevOps Guru Insights with CloudWatch Dashboard

Many customers use Amazon CloudWatch dashboards to monitor applications and often ask how they can integrate Amazon DevOps Guru insights to have a unified monitoring dashboard. This blog post showcases integrating DevOps Guru proactive and reactive insights into a CloudWatch dashboard by using Custom Widgets. Displaying related data from different sources side by side gives you a single pane of glass in the CloudWatch dashboard and helps you correlate trends over time and spot issues more efficiently.

Amazon DevOps Guru is a machine learning (ML) powered service that helps developers and operators automatically detect anomalies and improve application availability. DevOps Guru’s anomaly detectors can proactively detect anomalous behavior even before it occurs, helping you address issues before they happen; detailed insights provide recommendations to mitigate that behavior.

An Amazon CloudWatch dashboard is a customizable home page in the CloudWatch console that monitors multiple resources in a single view. You can use CloudWatch dashboards to create customized views of the metrics and alarms for your AWS resources.

Solution overview

This post will help you to create a Custom Widget for Amazon CloudWatch dashboard that displays DevOps Guru Insights. A custom widget is part of your CloudWatch dashboard that calls an AWS Lambda function containing your custom code. The Lambda function accepts custom parameters, generates your dataset or visualization, and then returns HTML to the CloudWatch dashboard. The CloudWatch dashboard will display this HTML as a widget. In this post, we are providing sample code for the Lambda function that will call DevOps Guru APIs to retrieve the insights information and displays as a widget in the CloudWatch dashboard. The architecture diagram of the solution is below.

Figure 1: Reference architecture diagram

Prerequisites and Assumptions

An AWS account. To sign up:

Create an AWS account. For instructions, see Sign Up For AWS.

DevOps Guru should be enabled in the account. To enable DevOps Guru, see DevOps Guru Setup.

Follow this Workshop to deploy a sample application in your AWS account, which can help generate some DevOps Guru insights.
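A quick way to confirm from the command line that DevOps Guru is enabled and already producing insights in the account; a minimal sketch using the DevOps Guru CLI:

# Returns counts of open reactive and proactive insights if DevOps Guru is enabled
aws devops-guru describe-account-health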

Solution Deployment

We are providing two options to deploy the solution – using the AWS console and AWS CloudFormation. The first section has instructions to deploy using the AWS console followed by instructions for using CloudFormation. The key difference is that we will create one Widget while using the Console, but three Widgets are created when we use AWS CloudFormation.

Using the AWS Console:

We will first create a Lambda function that will retrieve the DevOps Guru insights. We will then modify the default IAM role associated with the Lambda function to add DevOps Guru permissions. Finally we will create a CloudWatch dashboard and add a custom widget to display the DevOps Guru insights.

Navigate to the Lambda console after logging in to your AWS account, and click on Create function.

Figure 2a: Create Lambda Function

Choose Author from Scratch and use the runtime Node.js 16.x. Leave the rest of the settings at default and create the function.

Figure 2b: Create Lambda Function

After a few seconds, the Lambda function will be created and you will see a code source box. Copy the code below and replace the code present in the code source, as shown in the screen print below.

// SPDX-License-Identifier: MIT-0
// CloudWatch Custom Widget sample: displays count of Amazon DevOps Guru Insights
const aws = require('aws-sdk');

const DOCS = `## DevOps Guru Insights Count
Displays the total counts of Proactive and Reactive Insights in DevOps Guru.
`;

async function getProactiveInsightsCount(DevOpsGuru, StartTime, EndTime) {
    let NextToken = null;
    let proactivecount = 0;

    do {
        const args = { StatusFilter: { Any: { StartTimeRange: { FromTime: StartTime, ToTime: EndTime }, Type: 'PROACTIVE' } } };
        if (NextToken) {
            args.NextToken = NextToken; // continue pagination from the previous page
        }
        const result = await DevOpsGuru.listInsights(args).promise();
        console.log(result);
        NextToken = result.NextToken;
        result.ProactiveInsights.forEach(res => {
            console.log(result.ProactiveInsights[0].Status);
            proactivecount++;
        });
    } while (NextToken);
    return proactivecount;
}

async function getReactiveInsightsCount(DevOpsGuru, StartTime, EndTime) {
    let NextToken = null;
    let reactivecount = 0;

    do {
        const args = { StatusFilter: { Any: { StartTimeRange: { FromTime: StartTime, ToTime: EndTime }, Type: 'REACTIVE' } } };
        if (NextToken) {
            args.NextToken = NextToken; // continue pagination from the previous page
        }
        const result = await DevOpsGuru.listInsights(args).promise();
        NextToken = result.NextToken;
        result.ReactiveInsights.forEach(res => {
            reactivecount++;
        });
    } while (NextToken);
    return reactivecount;
}

function getHtmlOutput(proactivecount, reactivecount, region, event, context) {
    return `DevOps Guru Proactive Insights<br><font size="+10" color="#FF9900">${proactivecount}</font>
<p>DevOps Guru Reactive Insights</p><font size="+10" color="#FF9900">${reactivecount}`;
}

exports.handler = async (event, context) => {
    if (event.describe) {
        return DOCS;
    }
    const widgetContext = event.widgetContext;
    const timeRange = widgetContext.timeRange.zoom || widgetContext.timeRange;
    const StartTime = new Date(timeRange.start);
    const EndTime = new Date(timeRange.end);
    const region = event.region || process.env.AWS_REGION;
    const DevOpsGuru = new aws.DevOpsGuru({ region });

    const proactivecount = await getProactiveInsightsCount(DevOpsGuru, StartTime, EndTime);
    const reactivecount = await getReactiveInsightsCount(DevOpsGuru, StartTime, EndTime);

    return getHtmlOutput(proactivecount, reactivecount, region, event, context);
};

Figure 3: Lambda Function Source Code

Click on Deploy to save the function code.
Since we used the default settings while creating the function, a default Execution role is created and associated with the function. We will need to modify the IAM role to grant DevOps Guru permissions to retrieve Proactive and Reactive insights.
Click on the Configuration tab and select Permissions from the left side option list. You can see the IAM execution role associated with the function as shown in figure 4.

Figure 4: Lambda function execution role

Click on the IAM role name to open the role in the IAM console. Click on Add Permissions and select Attach policies.

Figure 5: IAM Role Update

Search for DevOps and select the AmazonDevOpsGuruReadOnlyAccess managed policy. Click on Add permissions to update the IAM role.

Figure 6: IAM Role Policy Update

Now that we have created the Lambda function for our custom widget and assigned appropriate permissions, we can navigate to CloudWatch to create a Dashboard.
Navigate to CloudWatch and click on Dashboards in the left side list. You can choose to create a new dashboard or add the widget to an existing dashboard.
We will choose to create a new dashboard.

Figure 7: Create New CloudWatch dashboard

Choose Custom widget on the Add widget page.

Figure 8: Add widget

Click Next on the Custom widget page without choosing a sample.

Figure 9: Custom Widget Selection

Choose the Region where DevOps Guru is enabled. Select the Lambda function that we created earlier. In the preview pane, click on Preview to view the DevOps Guru insight counts. Once the preview is successful, create the widget.

Figure 10: Create Custom Widget

Congratulations, you have now successfully created a CloudWatch dashboard with a custom widget to get insights from DevOps Guru. The sample code that we provided can be customized to suit your needs.
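
For example, one small customization (an illustrative sketch of my own, not part of the provided sample) is to have getHtmlOutput render the two counts side by side in a table instead of stacked text:

// Illustrative variation of getHtmlOutput: render the counts in a two-column table
function getHtmlOutput(proactivecount, reactivecount) {
  return `<table>
  <tr><th>Proactive Insights</th><th>Reactive Insights</th></tr>
  <tr>
    <td><font size="+10" color="#FF9900">${proactivecount}</font></td>
    <td><font size="+10" color="#FF9900">${reactivecount}</font></td>
  </tr>
</table>`;
}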

Using AWS CloudFormation

You may skip this section if you have already created the resources using the AWS console.

In this step we will show you how to deploy the solution using AWS CloudFormation. AWS CloudFormation lets you model, provision, and manage AWS and third-party resources by treating infrastructure as code. Customers define an initial template and then revise it as their requirements change. For more information on CloudFormation stack creation, refer to this blog post.

The following resources are created.

Three Lambda functions that will support CloudWatch Dashboard custom widgets
An AWS Identity and Access Management (IAM) role that allows the Lambda functions to access DevOps Guru insights and publish logs to CloudWatch
Three Log Groups under CloudWatch
A CloudWatch dashboard with widgets to pull data from the Lambda Functions

To deploy the solution by using the CloudFormation template

You can use this downloadable template to set up the resources. To launch directly through the console, choose the Launch Stack button, which creates the stack in the us-east-1 AWS Region.
Choose Next to go to the Specify stack details page.
(Optional) On the Configure Stack Options page, enter any tags, and then choose Next.
On the Review page, select I acknowledge that AWS CloudFormation might create IAM resources.
Choose Create stack.

It takes approximately 2-3 minutes for the provisioning to complete. After the stack status is CREATE_COMPLETE, proceed to validate the resources as listed below.

Validate the resources

Now that the stack creation has completed successfully, you should validate the resources that were created.

On the AWS console, head to CloudWatch; under Dashboards, there will be a dashboard created with the name <StackName-Region>.
On the AWS console, head to CloudWatch; under Log groups, there will be three new log groups created with the names:

lambdaProactiveLogGroup
lambdaReactiveLogGroup
lambdaSummaryLogGroup

On the AWS console, head to Lambda; there will be three Lambda functions with the names:

lambdaFunctionDGProactive
lambdaFunctionDGReactive
lambdaFunctionDGSummary

On the AWS console, head to IAM; under Roles, there will be a new role created with the name "lambdaIAMRole".

To View Results/Outcome

With the appropriate time range set on the CloudWatch dashboard, you will be able to navigate through the insights that DevOps Guru has generated, directly on the dashboard.

Figure 11: DevOps Guru insights in CloudWatch dashboard

Cleanup

For cost optimization, after you complete and test this solution, clean up the resources. You can delete them manually if you used the AWS console, or delete the AWS CloudFormation stack named devopsguru-cloudwatch-dashboard if you used AWS CloudFormation.

For more information on deleting the stacks, see Deleting a stack on the AWS CloudFormation console.

Conclusion

This blog post outlined how you can integrate DevOps Guru insights into a CloudWatch Dashboard. As a customer, you can start leveraging CloudWatch Custom Widgets to include DevOps Guru Insights in an existing Operational dashboard.

AWS Customers are now using Amazon DevOps Guru to monitor and improve application performance. You can start monitoring your applications by following the instructions in the product documentation. Head over to the Amazon DevOps Guru console to get started today.

To learn more about AIOps for Serverless using Amazon DevOps Guru check out this video.

Suresh Babu

Suresh Babu is a DevOps Consultant at Amazon Web Services (AWS) with 21 years of experience in designing and implementing software solutions for various industries. He helps customers with application modernization and DevOps adoption. Suresh is a passionate public speaker and often speaks about DevOps and Artificial Intelligence (AI).

Venkat Devarajan

Venkat Devarajan is a Senior Solutions Architect at Amazon Web Services (AWS) supporting enterprise automotive customers. He has over 18 years of industry experience in helping customers design, build, implement, and operate enterprise applications.

Ashwin Bhargava

Ashwin is a DevOps Consultant at AWS working in Professional Services Canada. He is a DevOps expert and a security enthusiast with more than 15 years of development and consulting experience.

Murty Chappidi

Murty is an APJ Partner Solutions Architecture Lead at Amazon Web Services with a focus on helping customers with accelerated and seamless journey to AWS by providing solutions through our GSI partners. He has more than 25 years’ experience in software and technology and has worked in multiple industry verticals. He is the APJ SME for AI for DevOps Focus Area. In his free time, he enjoys gardening and cooking.

10 ways to build applications faster with Amazon CodeWhisperer

Amazon CodeWhisperer is a powerful generative AI tool that gives me coding superpowers. Ever since I have incorporated CodeWhisperer into my workflow, I have become faster, smarter, and even more delighted when building applications. However, learning to use any generative AI tool effectively requires a beginner’s mindset and a willingness to embrace new ways of working.

Best practices for tapping into CodeWhisperer’s power are still emerging. But, as an early explorer, I’ve discovered several techniques that have allowed me to get the most out of this amazing tool. In this article, I’m excited to share these techniques with you, using practical examples to illustrate just how CodeWhisperer can enhance your programming workflow. I’ll explore:

Typing less
Generating functions using code
Generating functions using comments
Generating classes
Implementing algorithms
Writing unit tests
Creating sample data
Simplifying regular expressions
Learning third-party code libraries faster
Documenting code

Before we begin

If you would like to try these techniques for yourself, you will need to use a code editor with the AWS Toolkit extension installed. VS Code, AWS Cloud9, and most editors from JetBrains will work. Refer to the CodeWhisperer “Getting Started” resources for setup instructions.

CodeWhisperer will present suggestions automatically as you type. If you aren’t presented with a suggestion, you can always manually trigger a suggestion using the Option + C (Mac) or Alt + C (Windows) shortcut. CodeWhisperer will also sometimes present you with multiple suggestions to choose from. You can press the → and ← keys to cycle through all available suggestions.

The suggestions CodeWhisperer offers are non-deterministic, which means you may receive slightly different suggestions than the ones shown in this article. If you receive a suggestion that doesn’t suit your needs, try cycling through all available suggestions using the arrow keys or try rephrasing any code comment prompts slightly. Even punctuation changes can produce fresh suggestions.

The examples in this article are written in JavaScript and TypeScript. However, the techniques I’ll demonstrate are equally suited to many of the programming languages supported by CodeWhisperer. Now, onto the examples!

Typing less

At the most basic level, CodeWhisperer can be used as a supercharged auto-complete, significantly reducing your typing when writing individual lines or blocks of code. To use CodeWhisperer in this way, just write code as you normally would. CodeWhisperer will make automatic suggestions as you type. Accept suggestions when they’re helpful, and ignore suggestions when they’re not. In my experience, using CodeWhisperer in this way alone reduces my typing by 50% or more.

Generating functions

Automatically generating function bodies from signatures

Skilled programmers use clear, descriptive names for their functions and parameters. If you follow this practice in your own programming, CodeWhisperer will have all the context it needs to implement many functions for you.

After I enter the function signature below, CodeWhisperer generates the function body for me.

User input (JavaScript):

function standardDeviation(numbers)

Result:

function standardDeviation(numbers) {
  let mean = numbers.reduce((a, b) => a + b) / numbers.length;
  let variance =
    numbers.reduce((a, b) => a + Math.pow(b - mean, 2)) / numbers.length;
  return Math.sqrt(variance);
}

Generating functions from comments

Sometimes a good function signature alone isn’t enough to convey the desired behavior of a function. In those cases, you can use a detailed code comment as a prompt to CodeWhisperer. I recommend using your language’s doc comment delimiter when using this technique.

User Input (JavaScript):

/**
 * Given a list that contains some numbers and strings, format them into a string
 * in which the numbers are prepended with "#" and the strings are wrapped in
 * double quotes.
 */

Result:

/**
 * Given a list that contains some numbers and strings, format them into a string
 * in which the numbers are prepended with "#" and the strings are wrapped in
 * double quotes.
 */
function formatList(list) {
  return list
    .map((item) => {
      if (typeof item === "number") {
        return `#${item}`;
      } else {
        return `"${item}"`;
      }
    })
    .join(", ");
}

Generating classes

You can generate full or partial class implementations from a single code comment. The comment should describe the key properties and behaviors of the class. Simple descriptions are often all that is required.

User input (TypeScript):

/**
* A folder has a name, children, and an optional parent. A child can’t be added
* twice.
*/

Result:

After I write the above block comment, CodeWhisperer provides multiple suggestions, including the following implementation, complete with documentation comments for each method!

/**
* A folder has a name, children, and an optional parent. A child can’t be added
* twice.
*/
export class Folder {
name: string;
children: Folder[];
parent?: Folder;

constructor(name: string, parent?: Folder) {
this.name = name;
this.children = [];
this.parent = parent;
}

/**
* Add a child to the folder.
* @param child The child to add.
* @returns True if the child was added, false if it was already present.
*/
addChild(child: Folder): boolean {
if (this.children.includes(child)) {
return false;
}
this.children.push(child);
child.parent = this;
return true;
}

/**
* Remove a child from the folder.
* @param child The child to remove.
* @returns True if the child was removed, false if it wasn’t present.
*/
removeChild(child: Folder): boolean {
const index = this.children.indexOf(child);
if (index === -1) {
return false;
}
this.children.splice(index, 1);
child.parent = undefined;
return true;
}
}
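
As a quick sanity check (a usage sketch of my own, not CodeWhisperer output), the generated class behaves as its doc comments describe:

// Usage sketch for the generated Folder class
const root = new Folder("root");
const docs = new Folder("docs");

console.log(root.addChild(docs)); // true: added, and docs.parent is now root
console.log(root.addChild(docs)); // false: the duplicate is rejected
console.log(root.children.length); // 1
console.log(root.removeChild(docs)); // true: removed, and docs.parent is undefined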

Implementing algorithms

CodeWhisperer enables you to implement well-known algorithms with ease. Simply tell CodeWhisperer which algorithm you want to apply to a particular coding problem, and it will oblige.

User input (JavaScript):

const actors = [ … ];

// Sort the actors by age using the merge sort algorithm.

Result:

const actors = [ … ];

// Sort the actors by age using the merge sort algorithm.
function mergeSort(arr) {
if (arr.length < 2) {
return arr;
}

const middle = Math.floor(arr.length / 2);
const left = arr.slice(0, middle);
const right = arr.slice(middle);

return merge(mergeSort(left), mergeSort(right));
}

function merge(left, right) {
const result = [];

while (left.length && right.length) {
if (left[0].age < right[0].age) {
result.push(left.shift());
} else {
result.push(right.shift());
}
}

while (left.length) {
result.push(left.shift());
}

while (right.length) {
result.push(right.shift());
}

return result;
}

console.log(mergeSort(actors));

Writing unit tests

CodeWhisperer accelerates the implementation of unit tests. Not only is it able to implement the logic for most of the explicit unit tests I ask for, but it often suggests unit tests for edge cases that hadn’t even occurred to me.

When I want CodeWhisperer to implement a unit test, I start with a declaration of the condition I’m testing and the expected result as in the following example.

User input (JavaScript):

describe("Folder.addChild()", () => {
  it("when adding an existing folder should not add duplicate"

Result:

describe("Folder.addChild()", () => {
  it("when adding an existing folder should not add duplicate", () => {
    const folder = new Folder("root");
    const child = new Folder("child");
    folder.addChild(child);
    folder.addChild(child);
    expect(folder.children.length).toBe(1);
  });
});

After you’ve written a test or two, CodeWhisperer is likely to suggest more tests, even without explicit prompting.

Tip: You’ll notice that my test description includes 1) the function I’m testing, 2) the condition I’m testing, and 3) what I expect the result to be. Following this three-part pattern is a great way to ensure CodeWhisperer has the context it needs to implement high quality test logic.
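
For instance, applying the same three-part pattern to the removeChild() method, a prompt could look like this (an illustrative prompt of my own; CodeWhisperer's completion will vary):

describe("Folder.removeChild()", () => {
  it("when removing a folder that is not a child should return false"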

Creating sample data

One of the delightful discoveries I’ve made is that CodeWhisperer can generate dummy data—or really any code that has a repeating pattern. As an illustration, for the following list I entered only the first “user” object as an example, and CodeWhisperer helpfully continued the pattern.

User input (JavaScript):

const fakeUsers = [
  { name: "User 1", id: "user1" },

Result:

const fakeUsers = [
  { name: "User 1", id: "user1" },
  { name: "User 2", id: "user2" },
  { name: "User 3", id: "user3" },
  { name: "User 4", id: "user4" },
  { name: "User 5", id: "user5" },
  { name: "User 6", id: "user6" }
]

It even works for generating random but plausible values like the names seen below.

User input (JavaScript):

const fakeUsers = [
  { name: "Sally", id: "user1" },

Result:

const fakeUsers = [
  { name: "Sally", id: "user1" },
  { name: "Bob", id: "user2" },
  { name: "Joe", id: "user3" },
  { name: "Jane", id: "user4" },
  { name: "Jill", id: "user5" },
  { name: "Jack", id: "user6" }
]

Simplifying regular expressions

While CodeWhisperer can’t translate natural language into custom regular expressions (yet), it does know a lot of commonly used regular expressions. This saves you time by putting these regular expressions at your fingertips whenever you need them.

Let’s look at one notoriously complex but common regular expression.

User input (JavaScript):

/**
* Function that validates whether a string is a valid email address.
*/

Result:

/**
* Function that validates whether a string is a valid email address.
*/
function validateEmail(email: string): boolean {
  const re =
    /^(([^<>()[\]\\.,;:\s@"]+(\.[^<>()[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;
  return re.test(String(email).toLowerCase());
}
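
A quick usage sketch of my own confirms the expected behavior of the suggested helper:

console.log(validateEmail("jane.doe@example.com")); // true
console.log(validateEmail("not-an-email")); // false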

Learning third-party code libraries faster

CodeWhisperer helps accelerate my learning when I’m using an unfamiliar library, module, or framework. In the example below, I needed to write a utility function that would take a gzip compressed value coming from an API and translate it into a value that the rest of my code could understand. I knew the “pako” library would help me, but I had no idea how to use it. I just prompted CodeWhisperer with an import statement and a comment describing my goal.

import pako from "pako";

/**
* Accepts a base64-encoded, gzip-compressed JSON string and returns
* a parsed JSON object.
*/

Result:

import pako from "pako";

/**
 * Accepts a base64-encoded, gzip-compressed JSON string and returns
 * a parsed JSON object.
 */
export function decodeGzipJson(base64String) {
  const compressed = atob(base64String);
  const uncompressed = pako.inflate(compressed, { to: "string" });
  return JSON.parse(uncompressed);
}
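
To sanity-check the helper, I can round-trip a small object in the browser console. This is my own sketch; it assumes pako.inflate auto-detects the gzip wrapper, which the pako documentation describes as the default behavior.

// Round-trip sketch: gzip and base64-encode a payload, then decode it with the helper
const payload = { greeting: "hello" };
const gzipped = pako.gzip(JSON.stringify(payload)); // Uint8Array
const base64String = btoa(String.fromCharCode(...gzipped));

console.log(decodeGzipJson(base64String)); // { greeting: "hello" }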

Documenting code

CodeWhisperer is capable of generating docstrings and comments for the code it generates, as well as for your existing code. For example, let’s say I want CodeWhisperer to document the matches() method of this FavoritesFilter TypeScript class I’ve implemented (I’ve omitted some implementation details for brevity).

class FavoritesFilter implements IAssetFilter {

matches(asset: Asset): boolean {

}
}

I can just type a doc comment delimiter (/** */) immediately above the method name and CodeWhisperer will generate the body of the doc comment for me.

Note: When using CodeWhisperer in this way you may have to manually trigger a suggestion using Option + C (Mac) or Alt + C (Windows).

class FavoritesFilter implements IAssetFilter {

/**
* Determines whether the asset matches the filter.
*/
matches(asset: Asset): boolean {

}
}

Conclusion

I hope the techniques above inspire ideas for how CodeWhisperer can make you a more productive coder. Install CodeWhisperer today to start using these time-saving techniques in your own projects. These examples only scratch the surface. As additional creative minds start applying CodeWhisperer to their daily workflows, I’m sure new techniques and best practices will continue to emerge. If you discover a novel approach that you find useful, post a comment to share what you’ve discovered. Perhaps your technique will make it into a future article and help others in the CodeWhisperer community enhance their superpowers.

Kris Schultz (he/him)

Kris Schultz has spent over 25 years bringing engaging user experiences to life by combining emerging technologies with world class design. In his role as 3D Specialist Solutions Architect, Kris helps customers leverage AWS services to power 3D applications of all sorts.

How Zomato Boosted Performance 25% and Cut Compute Cost 30% Migrating Trino and Druid Workloads to AWS Graviton

Zomato is an India-based restaurant aggregator, food delivery, and dining-out company with over 350,000 listed restaurants across more than 1,000 cities in India. The company relies heavily on data analytics to enrich the customer experience and improve business efficiency. Zomato's engineering and product teams use data insights to refine their platform's restaurant and cuisine recommendations, improve the accuracy of waiting times at restaurants, speed up the matching of delivery partners, and improve the overall food delivery process.

At Zomato, different teams have different requirements for data discovery based upon their business functions. For example, a city lead team needs the number of orders placed in a specific area, the customer support team needs queries resolved per minute, and marketing and other teams need the most searched dishes on special events or days. Zomato's Data Platform team is responsible for building and maintaining a reliable platform which serves these data insights to all business units.

Zomato’s Data Platform is powered by AWS services including Amazon EMR, Amazon Aurora MySQL-Compatible Edition and Amazon DynamoDB along with open source software Trino (formerly PrestoSQL) and Apache Druid for serving the previously mentioned business metrics to different teams. Trino clusters process over 250K queries by scanning 2PB of data and Apache Druid ingests over 20 billion events and serves 8 million queries every week. To deliver performance at Zomato scale, these massively parallel systems utilize horizontal scaling of nodes running on Amazon Elastic Compute Cloud (Amazon EC2) instances in their clusters on AWS. Performance of both these data platform components is critical to support all business functions reliably and efficiently in Zomato. To improve performance in a cost-effective manner, Zomato migrated these Trino and Druid workloads onto AWS Graviton-based Amazon EC2 instances.

Graviton-based EC2 instances are powered by Arm-based AWS Graviton processors. They deliver up to 40% better price performance than comparable x86-based Amazon EC2 instances. CPU and Memory intensive Java-based applications including Trino and Druid are suitable candidates for AWS Graviton based instances to optimize price-performance, as Java is well supported and generally performant out-of-the-box on arm64.

In this blog, we will walk you through an overview of Trino and Druid, how they fit into the overall Data Platform architecture, and the migration journey onto AWS Graviton based instances for these workloads. We will also cover the challenges faced during migration, how the Zomato team overcame those challenges, the business gains in terms of cost savings and better performance, and Zomato's future plans for Graviton adoption for more workloads.

Trino overview

Trino is a fast, distributed SQL query engine for querying petabyte scale data, implementing massively parallel processing (MPP) architecture. It was designed as an alternative to tools that query Apache Hadoop Distributed File System (HDFS) using pipelines of MapReduce jobs, such as Apache Hive or Apache Pig, but Trino is not limited to querying HDFS only. It has been extended to operate over a multitude of data sources, including Amazon Simple Storage Service (Amazon S3), traditional relational databases and distributed data stores including Apache Cassandra, Apache Druid, MongoDB and more. When Trino executes a query, it does so by breaking up the execution into a hierarchy of stages, which are implemented as a series of tasks distributed over a network of Trino workers. This reduces end-to-end latency and makes Trino a fast tool for ad hoc data exploration over very large data sets.

Figure 1 – Trino architecture overview

Trino coordinator is responsible for parsing statements, planning queries, and managing Trino worker nodes. Every Trino installation must have a coordinator alongside one or more Trino workers. Client applications including Apache Superset and Redash connect to the coordinator via Presto Gateway to submit statements for execution. The coordinator creates a logical model of a query involving a series of stages, which is then translated into a series of connected tasks running on a cluster of Trino workers. Presto Gateway acts as a proxy/load-balancer for multiple Trino clusters.

Druid overview

Apache Druid is a real-time database that powers modern analytics applications for use cases where real-time ingest, fast query performance, and high uptime are important. Druid processes are deployed on three types of server nodes: Master nodes govern data availability and ingestion; Query nodes accept queries, execute them across the system, and return the results; and Data nodes ingest and store queryable data. Broker processes receive queries from external clients and forward those queries to Data servers. Historicals are the workhorses that handle storage and querying of "historical" data. MiddleManager processes handle ingestion of new data into the cluster. Refer here to learn more about the detailed Druid architecture design.

Figure 2 – Druid architecture overview

Zomato’s Data Platform Architecture on AWS

Figure 3 – Zomato’s Data Platform landscape on AWS

Zomato’s Data Platform covers data ingestion, storage, distributed processing (enrichment and enhancement), batch and real-time data pipelines unification and a robust consumption layer, through which petabytes of data is queried daily for ad-hoc and near real-time analytics. In this section, we will explain the data flow of pipelines serving data to Trino and Druid clusters in the overall Data Platform architecture.

Data Pipeline-1: Amazon Aurora MySQL-Compatible database is used to store data by various microservices at Zomato. Apache Sqoop on Amazon EMR run Extract, Transform, Load (ETL) jobs at scheduled intervals to fetch data from Aurora MySQL-Compatible to transfer it to Amazon S3 in the Optimized Row Columnar (ORC) format, which is then queried by Trino clusters.

Data Pipeline-2: Debezium Kafka connector deployed on Amazon Elastic Container Service (Amazon ECS) acts as producer and continuously polls data from Aurora MySQL-Compatible database. On detecting changes in the data, it identifies the change type and publishes the change data event to Apache Kafka in Avro format. Apache Flink on Amazon EMR consumes data from Kafka topic, performs data enrichment and transformation and writes it in ORC format in Iceberg tables on Amazon S3. Trino clusters then query data from Amazon S3.

Data Pipeline-3: Moving away from other databases, Zomato had decided to go serverless with Amazon DynamoDB because of its high performance (single-digit millisecond latency), request rate (millions per second), extreme scale as per Zomato peak expectations, economics (pay as you go) and data volume (TB, PB, EB) for their business-critical apps including Food Cart, Product Catalog and Customer preferences. DynamoDB streams publish data from these apps to Amazon S3 in JSON format to serve this data pipeline. Apache Spark on Amazon EMR reads JSON data, performs transformations including conversion into ORC format and writes data back to Amazon S3 which is used by Trino clusters for querying.

Data Pipeline-4: Zomato’s core business applications serving end users include microservices, web and mobile applications. To get near real-time insights from these core applications is critical to serve customers and win their trust continuously. Services use a custom SDK developed by data platform team to publish events to the Apache Kafka topic. Then, two downstream data pipelines consume these application events available on Kafka via Apache Flink on Amazon EMR. Flink performs data conversion into ORC format and publishes data to Amazon S3 and in a parallel data pipeline, Flink also publishes enriched data onto another Kafka topic, which further serves data to an Apache Druid cluster deployed on Amazon EC2 instances.

Performance requirements for querying at scale

All of the described data pipelines ingest data into an Amazon S3 based data lake, which is then leveraged by three types of Trino clusters: Ad-hoc clusters for ad-hoc query use cases, with a maximum query runtime of 20 minutes; ETL clusters for creating materialized views to enhance the performance of dashboard queries; and Reporting clusters to run queries for dashboards with various Key Performance Indicators (KPIs), with query runtimes of up to 3 minutes. ETL queries are run via Apache Airflow with a built-in query retry mechanism and a runtime of up to 3 hours.

Druid is used to serve two types of queries: computing aggregated metrics based on recent events and comparing aggregated metrics to historical data. For example, how is a specific metric in the current hour compared to the same last week. Depending on the use case, the service level objective for Druid query response time ranges from a few milliseconds to a few seconds.

Graviton migration of Druid cluster

Zomato first moved Druid nodes to AWS Graviton based instances in their test cluster environment to determine query performance. Nodes running brokers and MiddleManagers were moved from R5 to R6g instances, and nodes running historicals were migrated from i3 to R6gd instances. Zomato logged real-world queries from their production cluster and replayed them in their test cluster to validate the performance. Post validation, Zomato saw significant performance gains and reduced cost:

Performance gains

For queries in Druid, performance was measured using typical business hours (12:00 to 22:00 Hours) load of 14K queries, as shown here, where p99 query runtime reduced by 25%.

Figure 4 – Overall Druid query performance (Intel x86-64 vs. AWS Graviton)

Also, query performance improvement on the historical nodes of the Druid cluster are shown here, where p95 query runtime reduced by 66%.

Figure 5 – Query performance on Druid Historicals (Intel x86-64 vs. AWS Graviton)

Under peak load during business hours (12:00 to 22:00 Hours as shown in the provided graph), with increasingly loaded CPUs, Graviton based instances demonstrated close to linear performance resulting in better query runtime than equivalent Intel x86 based instances. This provided headroom to Zomato to reduce their overall node count in the Druid cluster for serving the same peak load query traffic.

Figure 6 – CPU utilization (Intel x86-64 vs. AWS Graviton)

Cost savings

A cost comparison of Intel x86 vs. AWS Graviton based instances running Druid in a test environment, along with the number of instances, instance types, and hourly On-Demand prices in the Singapore Region, is shown here. There are cost savings of ~24% running the same number of Graviton based instances. Further, the Druid cluster auto scales in the production environment based upon performance metrics, so average cost savings with Graviton based instances are even higher, at ~30%, due to better performance.

Figure 7 – Cost savings analysis (Intel x86-64 vs. AWS Graviton)

Graviton migration of Trino clusters

Zomato also moved their Trino cluster in their test environment to AWS Graviton based instances and monitored query performance for different short and long-running queries. As shown here, mean wall (elapsed) time value for different Trino queries is lower on AWS Graviton instances than equivalent Intel x86 based instances, for most of the queries (lower is better).

Figure 8 – Mean Wall Time for Trino queries (Intel x86-64 vs. AWS Graviton)

Also, p99 query runtime reduced by ~33% after migrating the Trino cluster to AWS Graviton instances for a typical business day’s (7am – 7pm) mixed query load with ~15K queries.

Figure 9 – Query performance for a typical day (7am - 7pm) load

Zomato’s team further optimized overall Trino query performance by enhancing Advanced Encryption Standard (AES) performance on Graviton for TLS negotiation with Amazon S3. It was achieved by enabling -XX:+UnlockDiagnosticVMOptions and -XX:+UseAESCTRIntrinsics in extra JVM flags. As shown here, mean CPU time for queries is lower after enabling extra JVM flags, for most of the queries.

Figure 10 – Query performance after enabling extra JVM options with Graviton instances
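
For reference, the two flags called out above are added as separate lines in the Trino JVM configuration; a minimal sketch, assuming the standard Trino layout where JVM options live in etc/jvm.config:

-XX:+UnlockDiagnosticVMOptions
-XX:+UseAESCTRIntrinsics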

Migration challenges and approach

The Zomato team uses Trino version 359, and a multi-arch (Arm64-compatible) Docker image was not available for this Trino version. As the team wanted to migrate their Trino cluster to Graviton based instances with minimal engineering effort and time, they backported the Trino multi-arch UBI8 based Docker image to their Trino version 359. This approach allowed faster adoption of Graviton based instances, eliminating the heavy lift of upgrading, testing, and benchmarking the workload on a newer Trino version.

Next Steps

Zomato has already migrated AWS managed services including Amazon EMR and Amazon Aurora MySQL-Compatible database to AWS Graviton based instances. With the successful migration of two main open source software components (Trino and Druid) of their data platform to AWS Graviton with visible and immediate price-performance gains, the Zomato team plans to replicate that success with other open source applications running on Amazon EC2 including Apache Kafka, Apache Pinot, etc.

Conclusion

This post demonstrated the price/performance benefits of adopting AWS Graviton based instances for high throughput, near real-time big data analytics workloads running on Java-based, open source Apache Druid and Trino applications. Overall, Zomato reduced the cost of its Amazon EC2 usage by 30%, while improving performance for both time-critical and ad-hoc querying by as much as 25%. Due to better performance, Zomato was also able to right size compute footprint for these workloads on a smaller number of Amazon EC2 instances, with peak capacity of Apache Druid and Trino clusters reduced by 25% and 20% respectively.

Zomato migrated these open source software applications faster by quickly implementing customizations needed for optimum performance and compatibility with Graviton based instances. Zomato’s mission is “better food for more people” and Graviton adoption is helping with this mission by providing a more sustainable, performant, and cost-effective compute platform on AWS. This is certainly a “food for thought” for customers looking forward to improve price-performance and sustainability for their business-critical workloads running on Open Source Software (OSS).


Multi-Architecture Container Builds with CodeCatalyst

AWS Graviton Processors are designed by AWS to deliver the best price performance for your cloud workloads running in Amazon Elastic Compute Cloud (Amazon EC2). Amazon CodeCatalyst recently added support to run workflow actions using on-demand or pre-provisioned compute powered by AWS Graviton processors. Customers can now access high performance AWS Graviton processors to build artifacts for Arm, or improve their price performance. In this post I will show you how to create a multi-architecture docker image using CodeCatalyst that can run on both amd64 and arm64 processors.

Background

Container images only run on a system with the same CPU architecture for which they were targeted. For example, an amd64 image runs on Intel and AMD processors, while an arm64 image runs on AWS Graviton. Note that amd64 and x86_64 are often used interchangeably, and I have chosen to use amd64 in this post. Rather than maintaining multiple repositories for each image type, you can combine variants for multiple architectures in the same repository. In addition, you can create a manifest describing which image to use for each architecture. This is known as a multi-architecture, or multi-platform, image.

Let us look at an example to further understand multi-arch images. In this screenshot from Amazon Elastic Container Registry (Amazon ECR), I have created two images for a simple hello-world application. One image is tagged latest-amd64 for amd64 (x86-64) processors and one is tagged latest-arm64 for arm64 processors.

In addition, I have created an Image Index tagged latest. The image index is a map describing which image to use for each architecture. This allows my users to simply pull hello-world:latest and the index will identify the correct image based on the target platform. The image index contains the following manifest.

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 1573,
      "digest": "sha256:eccb6dd2c2dbfc9…",
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 1573,
      "digest": "sha256:c64812837fbd43…",
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    }
  ]
}

Now that I have explained what a multi-arch image is, I will explain how to create one in a CodeCatalyst workflow. A CodeCatalyst workflow is an automated procedure that describes how to build, test, and deploy your code as part of a continuous integration and continuous delivery (CI/CD) system. A workflow defines a series of steps, or actions, to take during a workflow run. Let’s get started.

Prerequisites

If you would like to follow along with this walkthrough, you will need:

A CodeCatalyst space and associated AWS account.

An empty CodeCatalyst project and source repository in the space.
An Amazon ECR private repository in the associated AWS account.
A CodeCatalyst environment connected to the associated AWS account.

Walkthrough

In this walkthrough I will create a simple application using an Apache HTTP Server serving a static hello world page. The workload is inconsequential. I will focus on the process of building the container image using a CodeCatalyst workflow. The Workflow will build two container images, one for amd64 and one for arm64. The two build tasks will run in parallel on different compute architectures. When both builds are complete, the workflow will build the docker manifest. At the end of this post, my workflow will look like this.

Note that docker also offers a plugin called buildx that will allow you to build a multi-architecture image with a single command. In a real-world application, the workflow would also build the source code, run unit tests, etc. on each architecture. The sample application used in this post is so simple that there is no need to build and test the source code. Let’s examine the sample application now.

Sample Application

Initially the empty repository will only have a README.md file. By the end of this post, my repository will look like this.

I’ll begin by creating the file named index.html. I used the Create file button in CodeCatalyst console shown previously. My index.html file has the following content:

<html>
<head>
<title>Hello World!</title>
</head>
<body>
<h1>Hello World!</h1>
<p>Hello from a multi-architecture container created in CodeCatalyst.</p>
</body>
</html>

I’ll also create a Dockerfile that contains two commands. The first command instructs Docker to build a new image from the Apache HTTP Server Project image called httpd. It is important to note that the httpd image already supports multiple architectures including amd64 and arm64. When creating a multi-architecture image, the base image must also support these architectures. The second command simply copies the index.html file above into the new image. My Dockerfile file has the following content.

FROM httpd
COPY ./index.html /usr/local/apache2/htdocs/

With the source code for my sample application complete, I can turn my attention to the workflow.

CI/CD Workflow

To create a new workflow, select CI/CD from navigation on the left and then select Workflows (1). Then, select Create workflow (2), leave the default options, and select Create (3).

If the workflow editor opens in YAML mode, select Visual to open the visual designer. Now, I can start adding actions to the workflow.

Build Action for the AMD64 Variant

I’ll begin by adding a build action for the amd64 container. Select “+ Actions” to open the actions list. Find the Build action and click “+” to add a new build action to the workflow.

On the Inputs tab, create three variables named AWS_DEFAULT_REGION, IMAGE_REPO_NAME, and IMAGE_TAG. Set the first two values to the Region and name of your Amazon ECR repository. Set the third to latest-amd64. For example:

Now select the Configuration tab and rename the action docker_build_amd64. Select the Environment, AWS account connection, and Role for the associated AWS account where you created the Amazon ECR repository. For example:

Then, copy and paste the following code into the Shell commands. This code will build the image using the Dockerfile you created previously. Then, it logs into Amazon ECR, and finally, pushes the new image to ECR.

- Run: AWS_ACCOUNT_ID=`aws sts get-caller-identity --query "Account" --output text`
- Run: docker build -t $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG .
- Run: aws ecr get-login-password | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
- Run: docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG

If you switch back to the YAML view, you can see that the designer has added the following action to the workflow definition.

docker_build_amd64:
  Identifier: aws/build@v1
  Compute:
    Type: EC2
  Inputs:
    Sources:
      - WorkflowSource
    Variables:
      - Name: AWS_DEFAULT_REGION
        Value: us-west-2
      - Name: IMAGE_REPO_NAME
        Value: hello-world
      - Name: IMAGE_TAG
        Value: latest-amd64
  Environment:
    Name: demo
    Connections:
      - Role: CodeCatalystPreviewDevelopmentAdministrator
        Name: development
  Configuration:
    Steps:
      - Run: AWS_ACCOUNT_ID=`aws sts get-caller-identity --query "Account" --output text`
      - Run: docker build -t $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG .
      - Run: aws ecr get-login-password | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - Run: docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG

With the amd64 image complete, you can move on to the arm64 image.

Build Action for the ARM64 Variant

Add a second build action named docker_build_arm64 for the arm64 container. The configuration is nearly identical to the previous action with two minor changes. First, on the Inputs tab, I set the IMAGE_TAG to latest-arm64.

Second, on the Configuration tab, change the compute fleet to Linux.Arm64.Large. That is all you need to do to run your action on AWS Graviton. For example:

The Shell commands are identical to the amd64 build action. In addition, don't forget to select the Environment, AWS account connection, and Role on the Configuration tab. The complete configuration for the second action looks like this:

docker_build_arm64:
  Identifier: aws/build@v1
  Compute:
    Type: EC2
    Fleet: Linux.Arm64.Large
  Inputs:
    Sources:
      - WorkflowSource
    Variables:
      - Name: AWS_DEFAULT_REGION
        Value: us-west-2
      - Name: IMAGE_REPO_NAME
        Value: hello-world
      - Name: IMAGE_TAG
        Value: latest-arm64
  Environment:
    Name: demo
    Connections:
      - Role: CodeCatalystPreviewDevelopmentAdministrator
        Name: development
  Configuration:
    Steps:
      - Run: AWS_ACCOUNT_ID=`aws sts get-caller-identity --query "Account" --output text`
      - Run: docker build -t $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG .
      - Run: aws ecr get-login-password | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - Run: docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG

Now that you have a build action for the amd64 and arm64 images, you simply need to create a manifest file describing which image to use for each architecture.

Build Action for the Manifest

The final step in the workflow is to create the Docker manifest. Create a third build action named docker_manifest. You want this action to wait for the prior two actions to complete. Therefore, select the prior two actions from the Depends on drop down, like this:

Also configure four variables. AWS_DEFAULT_REGION and IMAGE_REPO_NAME are identical to the prior actions. In addition, IMAGE_TAG_AMD64 and IMAGE_TAG_ARM64 include the tags you created in the prior actions.

On the configuration tab, select the Environment, AWS account connection, and Role as you did in the prior actions. Then, copy and paste the following Shell commands.

- Run: AWS_ACCOUNT_ID=`aws sts get-caller-identity --query "Account" --output text`
- Run: aws ecr get-login-password | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
- Run: docker manifest create $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_ARM64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_AMD64
- Run: docker manifest annotate --arch amd64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_AMD64
- Run: docker manifest annotate --arch arm64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_ARM64
- Run: docker manifest push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME

The shell commands create a manifest and then annotate it with the correct image for both amd64 and arm64. The final action looks like this.

docker_manifest:
  Identifier: aws/build@v1
  DependsOn:
    - docker_build_arm64
    - docker_build_amd64
  Compute:
    Type: EC2
  Inputs:
    Sources:
      - WorkflowSource
    Variables:
      - Name: AWS_DEFAULT_REGION
        Value: us-west-2
      - Name: IMAGE_REPO_NAME
        Value: hello-world
      - Name: IMAGE_TAG_AMD64
        Value: latest-amd64
      - Name: IMAGE_TAG_ARM64
        Value: latest-arm64
  Environment:
    Name: demo
    Connections:
      - Role: CodeCatalystPreviewDevelopmentAdministrator
        Name: development
  Configuration:
    Steps:
      - Run: AWS_ACCOUNT_ID=`aws sts get-caller-identity --query "Account" --output text`
      - Run: aws ecr get-login-password | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - Run: docker manifest create $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_ARM64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_AMD64
      - Run: docker manifest annotate --arch amd64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_AMD64
      - Run: docker manifest annotate --arch arm64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG_ARM64
      - Run: docker manifest push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME

I now have a complete CI/CD workflow that creates container images for both amd64 and arm64. When I commit the changes, CodeCatalyst will execute my workflow, build the images, and push them to ECR.

Cleanup

If you have been following along with this workflow, you should delete the resources you deployed so you do not continue to incur charges. First, delete the Amazon ECR repository using the AWS console. Second, delete the project from CodeCatalyst by navigating to Project settings and choosing Delete project.

Conclusion

AWS Graviton processors are custom-built by AWS to deliver the best price performance for cloud workloads. In this post I explained how to configure CodeCatalyst workflow actions to run on AWS Graviton. I used CodeCatalyst to create a workflow that builds a multi-architecture container image that can run on both amd64 and arm64 architectures. Get started building your multi-arch containers in Amazon CodeCatalyst today! You can read more about CodeCatalyst workflows in the documentation.

Announcing General Availability of Amazon CodeCatalyst

We are pleased to announce that Amazon CodeCatalyst is now generally available. CodeCatalyst is a unified software development service that brings together everything teams need to get started planning, coding, building, testing, and deploying applications on AWS. CodeCatalyst was designed to make it easier for developers to spend more time developing application features and less time setting up project tools, creating and managing continuous integration and continuous delivery (CI/CD) pipelines, provisioning and configuring various development and deployment environments, and onboarding project collaborators. You can learn more and get started building in minutes on the AWS Free Tier at the CodeCatalyst website.

Launched in preview at AWS re:Invent in December 2022, CodeCatalyst provides an easy way for professional developers to build and deploy applications on AWS. We built CodeCatalyst based on feedback we received from customers looking for a more streamlined way to build using DevOps best practices. They want a complete software development service that lets them start new projects more quickly and gives them confidence that it will continue delivering a great long term experience throughout their application’s lifecycle.

Do more of what you love, and less of what you don’t

Starting a new project is an exciting time of imagining the possibilities: what can you build and how can you enable your end users to do something that wasn’t possible before? However, the joy of creating something new can also come with anxiety about all of the decisions to be made about tooling and integrations. Once your project is in production, managing tools and wrangling project collaborators can take your focus away from being creative and doing your best work. If you are spending too much time keeping brittle pipelines running and your teammates are constantly struggling with tooling, the day to day experience of building new features can start to feel less than joyful.

That is where CodeCatalyst comes in. It isn’t just about developer productivity – it is about helping developers and teams spend more time using the tools they are most comfortable with. Teams deliver better, more impactful outcomes to customers when they have more freedom to focus on their highest-value work and have to concern themselves less with activities that feel like roadblocks. Everything we do stems from that premise, and today’s launch marks a major milestone in helping to enable developers to have a better DevOps experience on AWS.

How CodeCatalyst delivers a great experience

There are four foundational elements of CodeCatalyst that are designed to help minimize distraction and maximize joy in the software development process: blueprints for quick project creation, actions-based CI/CD automation for managing day-to-day software lifecycle tasks, remote Dev Environments for a consistent build experience, and project and issue management for a more streamlined team collaboration.

Blueprints get you started quickly. CodeCatalyst blueprints set up an application code repository (complete with a working sample app), define cloud infrastructure, and run pre-configured CI/CD workflows for your project. Blueprints bring together the elements that are necessary both to begin a new project and deploy it into production. Blueprints can help to significantly reduce the time it takes to set up a new project. They are built by AWS for many use cases, and you can configure them with the programming languages and frameworks that you need both for your application and the underlying infrastructure-as-code. When it comes to incorporating existing tools like Jira or GitHub, CodeCatalyst has extensions that you can use to integrate them into your projects from the beginning without a lot of extra effort. Learn more about blueprints.

“CodeCatalyst helps us spend more time refining our customers’ build, test, and deploy workflows instead of implementing the underlying toolchains,” said Sean Bratcher, CEO of Buildstr. “The tight integration with AWS CDK means that definitions for infrastructure, environments, and configs live alongside the applications themselves as first-class code. This helps reduce friction when integrating with customers’ broader deployment approach.”

Actions-based CI/CD workflows take the pain out of pipeline management. CI/CD workflows in CodeCatalyst run on flexible, managed infrastructure. When you create a project with a blueprint, it comes with a complete CI/CD pipeline composed of actions from the included actions library. You can modify these pipelines with an action from the library or you can use any GitHub Action directly in the project to edit existing pipelines or build new ones from scratch. CodeCatalyst makes composing these actions into pipelines easier: you can switch back and forth between a text-based editor for declaring which actions you want to use through YAML and a visual drag-and-drop pipeline editor. Updating CI/CD workflows with new capabilities is a matter of incorporating new actions. Having CodeCatalyst create pipelines for you, based on your intent, means that you get the benefits of CI/CD automation without the ongoing pain of maintaining disparate tools.

“We needed a streamlined way within AWS to rapidly iterate development of our Reading Partners Connects e-learning platform while maintaining the highest possible quality standards,” said Yaseer Khanani, Senior Product Manager at Reading Partners. “CodeCatalyst’s built-in CI/CD workflows make it easy to efficiently deploy code and conduct testing across a distributed team.”

Automated dev environments make consistency achievable. A big friction point for developers collaborating on a software project is getting everyone on the same set of dependencies and settings in their local machines, and ensuring that all other environments from test to staging to production are also consistent. To help address this, CodeCatalyst has Dev Environments that are hosted in the cloud. Dev Environments are defined using the devfile standard, ensuring that everyone working on a project gets a consistent and repeatable experience. Dev Environments connect to popular IDEs like AWS Cloud9, VS Code, and multiple JetBrains IDEs, giving you a local IDE feel while running in the cloud.

“Working closely with customers in the software developer education space, we value the reproducible and pre-configured environments Amazon CodeCatalyst provides for improving learning outcomes for new developers. CodeCatalyst allows you to personalize student experiences while providing facilitators with control over the entire experience.” said Tia Dubuisson, President of Belle Fleur Technologies.

Issue management and simplified team onboarding streamline collaboration. CodeCatalyst is designed to help provide the benefits of building in a unified software development service by making it easier to onboard and collaborate with teammates. It starts with the process of inviting new collaborators: you can invite people to work together on your project with their email address, bypassing the need for everyone to have an individual AWS account. Once they have access, collaborators can see the history and context of the project and can start contributing by creating a Dev Environment.

CodeCatalyst also has built-in issue management that is tied to your code repo, so that you can assign tasks such as code reviews and pull requests to teammates and help track progress using agile methodologies right in the service. As with the rest of CodeCatalyst, collaboration comes without the distraction of managing separate services with separate logins and disparate commercial agreements. Once you give a new teammate access, they can quickly start contributing.

New to CodeCatalyst since the Preview launch

Along with the announcement of general availability, we are excited to share a few new CodeCatalyst features. First, you can now create a new project from an existing GitHub repository. In addition, CodeCatalyst Dev Environments now support GitHub repositories allowing you to work on code stored in GitHub.

Second, CodeCatalyst Dev Environments now support Amazon CodeWhisperer. CodeWhisperer is an artificial intelligence (AI) coding companion that generates real-time code suggestions in your integrated development environment (IDE) to help you more quickly build software. CodeWhisperer is currently supported in CodeCatalyst Dev Environments using AWS Cloud9 or Visual Studio Code.

Third, Amazon CodeCatalyst recently added support to run workflow actions using on-demand or pre-provisioned compute powered by AWS Graviton processors. AWS Graviton Processors are designed by AWS to deliver the best price performance for your cloud workloads running in Amazon Elastic Compute Cloud (Amazon EC2). Customers can use workflow actions running on AWS Graviton processors to build applications that target Arm architecture, create multi-architecture containers, and modernize legacy applications to help customers reduce costs.

Finally, the library of CodeCatalyst blueprints is continuously growing. The CodeCatalyst preview release included blueprints for common workloads like single-page web applications, serverless applications, and many others. In addition, we have recently added blueprints for Static Websites with Hugo and Jekyll, as well as Intelligent Document Processing workflows.

Learn more about CodeCatalyst at Developer Innovation Day

Next Wednesday, April 26th, we are hosting Developer Innovation Day, a free 7-hour virtual event focused on helping developers and teams learn to be productive and collaborate, from discovery to delivery to running software and building applications. Developers can discover how the breadth and depth of AWS tools and the right practices can unlock your team's ability to find success and take opportunities from ideas to impact.

CodeCatalyst plays a big part in Developer Innovation Day, with five sessions designed to help you see real examples of how you can spend more time doing the work you love best! Get an overview of the service, see how to deploy a working static website in minutes, learn how to collaborate effectively with teammates, and more.

Try CodeCatalyst

Ready to try CodeCatalyst? You can get started on the AWS Free Tier today and quickly deploy a blueprint with working sample code. If you would like to learn more, you can read through a collection of DevOps blogs about CodeCatalyst or read the documentation. We can’t wait to see how you innovate with CodeCatalyst!