Fully Automated Deployment of an Open Source Mail Server on AWS

Many AWS customers have the requirement to host their own email solution and prefer to operate mail servers themselves over using fully managed solutions (e.g., Amazon WorkMail). While there are certainly merits to either approach, motivations for self-managing often include:

full ownership and control
need for customized configuration
restricting access to internal networks or when connected via VPN.

To achieve this, customers frequently rely on open source mail servers due to their flexibility and free licensing compared to proprietary solutions like Microsoft Exchange. However, running and operating open source mail servers can be challenging: several components need to be configured and integrated to provide the desired end-to-end functionality. For example, to provide a fully functional email system, you need to combine dedicated software packages to send and receive email, access and manage inboxes, filter spam, manage users, and so on. This can be complex and error-prone to configure and maintain, and troubleshooting issues often calls for expert knowledge. As a result, several open source projects have emerged that aim to simplify the setup of open source mail servers, such as Mail-in-a-Box, Docker Mailserver, Mailu, Modoboa, iRedMail, and several others.

In this blog post, we take this one step further by adding infrastructure automation and integrations with AWS services to fully automate the deployment of an open source mail server on AWS. We present an automated setup of a single instance mail server, striving for minimal complexity and cost, while still providing high resiliency by leveraging incremental backups and automations. As such, the solution is best suited for small to medium organizations that are looking to run open source mail servers but do not want to deal with the associated operational complexity.

The solution in this blog uses AWS CloudFormation templates to automatically set up and configure an Amazon Elastic Compute Cloud (Amazon EC2) instance running Mail-in-a-Box, which integrates features such as email, webmail, calendar, contacts, and file sharing, thus providing functionality similar to popular SaaS tools or commercial solutions. All resources to reproduce the solution are provided in a public GitHub repository under an open source license (MIT-0).

Amazon Simple Storage Service (Amazon S3) is used both for offloading user data and for storing incremental application-level backups. Aside from high resiliency, this backup strategy enables an immutable infrastructure approach, where new deployments can be rolled out to implement updates and recover from failures, which drastically simplifies operation and enhances security.

We also provide an optional integration with Amazon Simple Email Service (Amazon SES) so customers can relay their emails through reputable AWS servers and have their outgoing email accepted by third-party servers. All of this enables customers to deploy a fully featured open source mail server within minutes from AWS Management Console, or restore an existing server from an Amazon S3 backup for immutable upgrades, migration, or recovery purposes.

Overview of Solution

The following diagram shows an overview of the solution and interactions with users and other AWS services.

After preparing the AWS Account and environment, an administrator deploys the solution using an AWS CloudFormation template (1.). Optionally, a backup from Amazon S3 can be referenced during deployment to restore a previous installation of the solution (1a.). The admin can then proceed to setup via accessing the web UI (2.) to e.g., provision TLS certificates and create new users. After the admin has provisioned their accounts, users can access the web interface (3.) to send email, manage their inboxes, access calendar and contacts and share files. Optionally, outgoing emails are relayed via Amazon SES (3a.) and user data is stored in a dedicated Amazon S3 bucket (3b.). Furthermore, the solution is configured to automatically and periodically create incremental backups and store them into an S3 bucket for backups (4.).

On top of popular open source mail server packages such as Postfix for SMTP and Dovecot for IMAP, Mail-in-a-box integrates Nextcloud for calendar, contacts, and file sharing. However, note that Nextcloud capabilities in this context are limited. It’s primarily intended to be used alongside the core mail server functionalities to maintain calendar and contacts and for lightweight file sharing (e.g. for sharing files via links that are too large for email attachments). If you are looking for a fully featured, customizable and scalable Nextcloud deployment on AWS, have a look at this AWS Sample instead.

Deploying the Solution

Prerequisites

For this walkthrough, you should have the following prerequisites:

An AWS account

An existing external email address to test your new mail server. In the context of this sample, we will use [email protected] as the address.
A domain that can be exclusively used by the mail server in the sample. In the context of this sample, we will use aws-opensource-mailserver.org as the domain. If you don’t have a domain available, you can register a new one with Amazon Route 53. In case you do so, you can go ahead and delete the associated hosted zone that gets automatically created via the Amazon Route 53 Console. We won’t need this hosted zone because the mail server we deploy will also act as Domain Name System (DNS) server for the domain.
An SSH key pair for command line access to the instance. Command line access to the mail server is optional in this tutorial, but a key pair is still required for the setup. If you don’t already have a key pair, go ahead and create one in the EC2 Management Console:

(Optional) In this blog, we verify end-to-end functionality by sending an email to a single email address ([email protected]) leveraging Amazon SES in sandbox mode. In case you want to adapt this sample for your use case and send email beyond that, you need to request removal of email sending limitations for EC2 or, alternatively, if you relay your mail via Amazon SES, request moving out of the Amazon SES sandbox.

Preliminary steps: Setting up DNS and creating S3 Buckets

Before deploying the solution, we need to set up DNS and create Amazon S3 buckets for backups and user data.

1.     Allocate an Elastic IP address: We use the address 52.6.x.y in this sample.
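If you prefer the command line, the allocation can also be done with the AWS CLI; a minimal sketch (the address returned for your account will differ from the 52.6.x.y placeholder used in this sample):

# Allocate a new Elastic IP for use in a VPC and print the public IP
aws ec2 allocate-address --domain vpc --query 'PublicIp' --output text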

2.     Configure DNS: If you have your domain registered with Amazon Route 53, you can use the AWS Management Console to change the name server and glue records for your domain. Configure two DNS servers ns1.box.<your-domain> and ns2.box.<your-domain> by placing your Elastic IP (allocated in step 1) into the Glue records field for each name server:
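If your domain is registered with Route 53, the glue records can also be set with the Route 53 Domains CLI. The following is a hedged sketch that assumes the shorthand syntax of the update-domain-nameservers command, with the sample domain and the Elastic IP placeholder substituted in; double-check the command reference before running it:

# Route 53 Domains is a global service served from us-east-1
aws route53domains update-domain-nameservers \
  --region us-east-1 \
  --domain-name aws-opensource-mailserver.org \
  --nameservers Name=ns1.box.aws-opensource-mailserver.org,GlueIps=52.6.x.y \
                Name=ns2.box.aws-opensource-mailserver.org,GlueIps=52.6.x.y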

If you use a third-party DNS service, check their corresponding documentation on how to set the glue records.

It may take a while until the updates to the glue records propagate through the global DNS system. Optionally, before proceeding with the deployment, you can verify your glue records are setup correctly with the dig command line utility:

# Get a list of name servers for your top-level domain
dig +short org. NS
# Query one of these name servers for the NS records of your domain
dig @c0.org.afilias-nst.info. aws-opensource-mailserver.org. NS

This should give you output as follows:

;; ADDITIONAL SECTION:
ns1.box.aws-opensource-mailserver.org. 3600 IN A 52.6.x.y
ns2.box.aws-opensource-mailserver.org. 3600 IN A 52.6.x.y

3.     Create S3 buckets for backups and user data: Finally, in the Amazon S3 Console, create a bucket to store Nextcloud data and another bucket for backups, choosing globally unique names for both of them. In context of this sample, we will be using the two buckets (aws-opensource-mailserver-backup and aws-opensource-mailserver-nextcloud) as shown here:
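Equivalently, the two buckets can be created from the command line (the names below are the ones used in this sample; yours must be globally unique):

aws s3 mb s3://aws-opensource-mailserver-backup
aws s3 mb s3://aws-opensource-mailserver-nextcloud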

Deploying and Configuring Mail-in-a-Box

Launch the AWS CloudFormation stack and specify the parameters as shown in the screenshot below to match the resources created in the previous section, leave the other parameters at their default values, then click Next and Submit.

This will deploy your mail server into a public subnet of your default VPC which takes about 10 minutes. You can monitor the progress in the AWS CloudFormation Console. Meanwhile, retrieve and note the admin password for the web UI from AWS Systems Manager Parameter Store via the MailInABoxAdminPassword parameter.
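For example, the password can be fetched with the AWS CLI; a minimal sketch, assuming the parameter is stored under the plain name shown in the console (adjust the name if your deployment prefixes it, e.g. with the stack name):

aws ssm get-parameter \
  --name MailInABoxAdminPassword \
  --with-decryption \
  --query 'Parameter.Value' \
  --output text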

Roughly one minute after your mail server finishes deploying, you can log in at its admin web UI residing at https://52.6.x.y/admin with username admin@<your-domain>, as shown in the following picture (you need to confirm the certificate exception warning from your browser):

Finally, in the admin UI navigate to System > TLS(SSL) Certificates and click Provision to obtain a valid SSL certificate and complete the setup (you might need to click on Provision twice to have all domains included in your certificate, as shown here).

At this point, you could further customize your mail server setup (e.g., by creating inboxes for additional users). However, we will continue to use the admin user in this sample for testing the setup in the next section.

Note: If your AWS account is subject to email sending restrictions on EC2, you will see an error in your admin dashboard under System > System Status Checks that says ‘Incoming Email (SMTP/postfix) is running but not publicly accessible’. You are safe to ignore this and should be able to receive emails regardless.

Testing the Solution

Receiving Email

With your existing email account, compose and send an email to admin@<your-domain>. Then login as admin@<your-domain> to the webmail UI of your AWS mail server at https://box.<your-domain>/mail and verify you received the email:

Test file sharing, calendar and contacts with Nextcloud

Your Nextcloud installation can be accessed under https://box.<your-domain>/cloud, as shown in the next figure. Here you can manage your calendar, contacts, and shared files. Contacts created and managed here are also accessible in your webmail UI when you compose an email. Refer to the Nextcloud documentation for more details. In order to keep your Nextcloud installation consistent and automatically managed by Mail-in-a-box setup scripts, admin users are advised to refrain from changing and customizing the Nextcloud configuration.

Sending Email

For this sample, we use Amazon SES to forward your outgoing email, as this is a simple way to get the emails you send accepted by other mail servers on the web. Achieving this is not trivial otherwise, as several popular email services tend to block public IP ranges of cloud providers.

Alternatively, if the email sending limitations for EC2 have been lifted on your AWS account, you can send emails directly from your mail server. In that case, you can skip the next section and continue with Send test email, but make sure you have deployed your mail server stack with the SesRelay parameter set to false. You can then also bring your own IP addresses to AWS to continue using your reputable addresses, or build reputation for addresses you own.

Verify your domain and existing Email address to Amazon SES

In order to use Amazon SES to accept and forward email for your domain, you first need to prove ownership of it. Navigate to Verified Identities in the Amazon SES Console and click Create identity, select domain and enter your domain. You will then be presented with a screen as shown here:

You now need to copy-paste the three CNAME DNS records from this screen over to your mail server admin dashboard. Open the admin web UI of your mail server again, select System > Custom DNS, and add the records as shown in the next screenshot.

Amazon SES will detect these records, thereby recognizing you as the owner and verifying the domain for sending emails. Similarly, while still in sandbox mode, you also need to verify ownership of the recipient email address. Navigate again to Verified Identities in the Amazon SES Console, click Create identity, choose Email Address, and enter your existing email address.

Amazon SES will then send a verification link to this address, and once you’ve confirmed via the link that you own this address, you can send emails to it.
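If you prefer the command line over the console, both identities can also be created with the Amazon SES v2 CLI; a sketch with placeholder values (the domain verification still requires publishing the returned CNAME records in your mail server's custom DNS as described above):

# Create the domain identity (returns the DKIM CNAME records to publish)
aws sesv2 create-email-identity --email-identity aws-opensource-mailserver.org
# Create the email address identity (triggers the verification email)
aws sesv2 create-email-identity --email-identity your-address@example.org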

Summing up, your verified identities section should look similar to the next screenshot before sending the test email:

Finally, if you intend to send email to arbitrary addresses with Amazon SES beyond testing in the next step, refer to the documentation on how to request production access.

Send test email

Now you are set to log back into your webmail UI and reply to the test mail you received before:

Checking the inbox of your existing email account, you should see the mail you just sent from your AWS server.

Congratulations! You have now verified full functionality of your open source mail server on AWS.

Restoring from backup

Finally, as a last step, we demonstrate how to roll out immutable deployments and restore from a backup for simple recovery, migration and upgrades. In this context, we test recreating the entire mail server from a backup stored in Amazon S3.

For that, we use the restore feature of the CloudFormation template we deployed earlier to migrate from the initial t2.micro installation to an AWS Graviton arm64-based t4g.micro instance. This exemplifies the power of the immutable infrastructure approach made possible by the automated application level backups, allowing for simple migration between instance types with different CPU architectures.

Verify you have a backup

By default, your server is configured to create an initial backup upon installation and nightly incremental backups. Using your ssh key pair, you can connect to your instance and trigger a manual backup to make sure the emails you just sent and received when testing will be included in the backup:

ssh -i aws-opensource-mailserver.pem [email protected] sudo /opt/mailinabox/management/backup.py

You can then go to your mail server's admin dashboard at https://box.<your-domain>/admin and verify the backup status under System > Backup Status:

Recreate your mail server and restore from backup

First, double-check that you have saved the admin password, as you will no longer be able to retrieve it from Parameter Store once you delete the original installation of your mail server. Then go ahead and delete the aws-opensource-mailserver stack from your CloudFormation Console and redeploy it from the same CloudFormation template. This time, however, adapt the parameters as shown below, changing the instance type and corresponding AMI as well as specifying the prefix in your backup S3 bucket to restore from.

Within a couple of minutes, your mail server will be up and running again, in exactly the same state it was in before you deleted it, but now running on a completely new instance powered by AWS Graviton. You can verify this by going to your webmail UI at https://box.<your-domain>/mail and logging in with your old admin credentials.

Cleaning up

 Delete the mail server stack from CloudFormation Console

Empty and delete both the backup and Nextcloud data S3 Buckets

Release the Elastic IP
In case you registered your domain from Amazon Route 53 and do not want to hold onto it, you need to disable automatic renewal. Further, if you haven’t already, delete the hosted zone that got created automatically when registering it.

Outlook

The solution discussed so far focuses on minimal operational complexity and cost and hence is based on a single Amazon EC2 instance comprising all functions of an open source mail server, including a management UI, user database, Nextcloud, and DNS. With a suitably sized instance, this setup can meet the demands of small to medium organizations. In particular, the continuous incremental backups to Amazon S3 provide high resiliency and can be leveraged in conjunction with the CloudFormation automations to quickly recover in case of instance or single Availability Zone (AZ) failures.

Depending on your requirements, extending the solution and distributing components across AZs allows for meeting more stringent requirements regarding high availability and scalability in the context of larger deployments. Being based on open source software, there is a straightforward migration path towards these more complex distributed architectures once you outgrow the setup discussed in this post.

Conclusion

In this blog post, we showed how to automate the deployment of an open source mail server on AWS and how to quickly and effortlessly restore from a backup for rolling out immutable updates and providing high resiliency. Using AWS CloudFormation infrastructure automations and integrations with managed services such as Amazon S3 and Amazon SES, the lifecycle management and operation of open source mail servers on AWS can be simplified significantly. Once deployed, the solution provides an end-user experience similar to popular SaaS and commercial offerings.

You can go ahead and use the automations provided in this blog and the corresponding GitHub repository to get started with running your own open source mail server on AWS!


S3 URI Parsing is now available in AWS SDK for Java 2.x

The AWS SDK for Java team is pleased to announce the general availability of Amazon Simple Storage Service (Amazon S3) URI parsing in the AWS SDK for Java 2.x. You can now parse path-style and virtual-hosted-style S3 URIs to easily retrieve the bucket, key, region, style, and query parameters. The new parseUri() API and S3Uri class provide the highly requested parsing features that many customers miss from the AWS SDK for Java 1.x. Please note that Amazon S3 AccessPoints and Amazon S3 on Outposts URI parsing are not supported.

Motivation

Users often need to extract important components like bucket and key from stored S3 URIs to use in S3Client operations. The new parsing APIs allow users to conveniently do so, bypassing the need for manual parsing or storing the components separately.

Getting Started

To begin, first add the dependency for S3 to your project.

<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>s3</artifactId>
<version>${s3.version}</version>
</dependency>

Next, instantiate S3Client and S3Utilities objects.

S3Client s3Client = S3Client.create();
S3Utilities s3Utilities = s3Client.utilities();

Parsing an S3 URI

To parse your S3 URI, call parseUri() from S3Utilities, passing in the URI. This will return a parsed S3Uri object. If you have a String of the URI, you'll need to convert it into a URI object first.

String url = "https://s3.us-west-1.amazonaws.com/myBucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88";
URI uri = URI.create(url);
S3Uri s3Uri = s3Utilities.parseUri(uri);

With the S3Uri, you can call the appropriate getter methods to retrieve the bucket, key, region, style, and query parameters. If the bucket, key, or region is not specified in the URI, an empty Optional will be returned. If query parameters are not specified in the URI, an empty map will be returned. If the field is encoded in the URI, it will be returned decoded.

Region region = s3Uri.region().orElse(null); // Region.US_WEST_1
String bucket = s3Uri.bucket().orElse(null); // "myBucket"
String key = s3Uri.key().orElse(null); // "resources/doc.txt"
boolean isPathStyle = s3Uri.isPathStyle(); // true
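
The same getters work for virtual-hosted-style URIs; a short sketch using an assumed example URI:

URI vhUri = URI.create("https://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt");
S3Uri vhS3Uri = s3Utilities.parseUri(vhUri);
String vhBucket = vhS3Uri.bucket().orElse(null); // "my-bucket"
String vhKey = vhS3Uri.key().orElse(null); // "resources/doc.txt"
boolean vhIsPathStyle = vhS3Uri.isPathStyle(); // false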

Retrieving query parameters

There are several APIs for retrieving the query parameters. You can return a Map<String, List<String>> of the query parameters. Alternatively, you can specify a query parameter to return the first value for the given query, or return the list of values for the given query.

Map<String, List<String>> queryParams = s3Uri.rawQueryParameters(); // {versionId=["abc123"], partNumber=["77", "88"]}
String versionId = s3Uri.firstMatchingRawQueryParameter("versionId").orElse(null); // "abc123"
String partNumber = s3Uri.firstMatchingRawQueryParameter("partNumber").orElse(null); // "77"
List<String> partNumbers = s3Uri.firstMatchingRawQueryParameters("partNumber"); // ["77", "88"]

Caveats

Special Characters

If you work with object keys or query parameters with reserved or unsafe characters, they must be URL-encoded, e.g., replace whitespace " " with "%20".

Valid:
"https://s3.us-west-1.amazonaws.com/myBucket/object%20key?query=%5Bbrackets%5D"

Invalid:
"https://s3.us-west-1.amazonaws.com/myBucket/object key?query=[brackets]"

Virtual-hosted-style URIs

If you work with virtual-hosted-style URIs with bucket names that contain a dot, i.e., “.”, the dot must not be URL-encoded.

Valid:
"https://my.Bucket.s3.us-west-1.amazonaws.com/key"

Invalid:
"https://my%2EBucket.s3.us-west-1.amazonaws.com/key"

Conclusion

In this post, I discussed parsing S3 URIs in the AWS SDK for Java 2.x and provided code examples for retrieving the bucket, key, region, style, and query parameters. To learn more about how to set up and begin using the feature, visit our Developer Guide. If you are curious about how it is implemented, check out the source code on GitHub. As always, the AWS SDK for Java team welcomes bug reports, feature requests, and pull requests on the aws-sdk-java-v2 GitHub repository.

Integrating DevOps Guru Insights with CloudWatch Dashboard

Many customers use Amazon CloudWatch dashboards to monitor applications and often ask how they can integrate Amazon DevOps Guru Insights in order to have a unified dashboard for monitoring. This blog post showcases integrating DevOps Guru proactive and reactive insights into a CloudWatch dashboard by using Custom Widgets. Displaying related data from different sources side by side helps you correlate trends over time, spot issues more efficiently, and have a single pane of glass visualization in the CloudWatch dashboard.

Amazon DevOps Guru is a machine learning (ML) powered service that helps developers and operators automatically detect anomalies and improve application availability. DevOps Guru’s anomaly detectors can proactively detect anomalous behavior even before it occurs, helping you address issues before they happen; detailed insights provide recommendations to mitigate that behavior.

Amazon CloudWatch dashboard is a customizable home page in the CloudWatch console that monitors multiple resources in a single view. You can use CloudWatch dashboards to create customized views of the metrics and alarms for your AWS resources.

Solution overview

This post will help you create a Custom Widget for your Amazon CloudWatch dashboard that displays DevOps Guru Insights. A custom widget is part of your CloudWatch dashboard that calls an AWS Lambda function containing your custom code. The Lambda function accepts custom parameters, generates your dataset or visualization, and then returns HTML to the CloudWatch dashboard. The CloudWatch dashboard will display this HTML as a widget. In this post, we provide sample code for a Lambda function that calls DevOps Guru APIs to retrieve the insights information and displays it as a widget in the CloudWatch dashboard. The architecture diagram of the solution is below.

Figure 1: Reference architecture diagram

Prerequisites and Assumptions

An AWS account. To sign up:

Create an AWS account. For instructions, see Sign Up For AWS.

DevOps Guru should be enabled in the account. For enabling DevOps Guru, see DevOps Guru Setup.

Follow this Workshop to deploy a sample application in your AWS Account which can help generate some DevOps Guru insights.

Solution Deployment

We are providing two options to deploy the solution – using the AWS console and AWS CloudFormation. The first section has instructions to deploy using the AWS console followed by instructions for using CloudFormation. The key difference is that we will create one Widget while using the Console, but three Widgets are created when we use AWS CloudFormation.

Using the AWS Console:

We will first create a Lambda function that will retrieve the DevOps Guru insights. We will then modify the default IAM role associated with the Lambda function to add DevOps Guru permissions. Finally we will create a CloudWatch dashboard and add a custom widget to display the DevOps Guru insights.

Navigate to the Lambda Console after logging in to your AWS Account and click on Create function.

Figure 2a: Create Lambda Function

Choose Author from Scratch and use the runtime Node.js 16.x. Leave the rest of the settings at default and create the function.

Figure 2b: Create Lambda Function

After a few seconds, the Lambda function will be created and you will see a code source box. Copy the code from the text box below and replace the code present in the code source, as shown in the screenshot below.

// SPDX-License-Identifier: MIT-0
// CloudWatch Custom Widget sample: displays count of Amazon DevOps Guru Insights
const aws = require('aws-sdk');

const DOCS = `## DevOps Guru Insights Count
Displays the total counts of Proactive and Reactive Insights in DevOps Guru.
`;

async function getProactiveInsightsCount(DevOpsGuru, StartTime, EndTime) {
let NextToken = null;
let proactivecount=0;

do {
const args = { StatusFilter: { Any : { StartTimeRange: { FromTime: StartTime, ToTime: EndTime }, Type: 'PROACTIVE' }}}
const result = await DevOpsGuru.listInsights(args).promise();
console.log(result)
NextToken = result.NextToken;
result.ProactiveInsights.forEach(res => {
console.log(result.ProactiveInsights[0].Status)
proactivecount++;
});
} while (NextToken);
return proactivecount;
}

async function getReactiveInsightsCount(DevOpsGuru, StartTime, EndTime) {
let NextToken = null;
let reactivecount=0;

do {
const args = { StatusFilter: { Any : { StartTimeRange: { FromTime: StartTime, ToTime: EndTime }, Type: 'REACTIVE' }}}
const result = await DevOpsGuru.listInsights(args).promise();
NextToken = result.NextToken;
result.ReactiveInsights.forEach(res => {
reactivecount++;
});
} while (NextToken);
return reactivecount;
}

function getHtmlOutput(proactivecount, reactivecount, region, event, context) {

return `DevOps Guru Proactive Insights<br><font size="+10" color="#FF9900">${proactivecount}</font>
<p>DevOps Guru Reactive Insights</p><font size="+10" color="#FF9900">${reactivecount}`;
}

exports.handler = async (event, context) => {
if (event.describe) {
return DOCS;
}
const widgetContext = event.widgetContext;
const timeRange = widgetContext.timeRange.zoom || widgetContext.timeRange;
const StartTime = new Date(timeRange.start);
const EndTime = new Date(timeRange.end);
const region = event.region || process.env.AWS_REGION;
const DevOpsGuru = new aws.DevOpsGuru({ region });

const proactivecount = await getProactiveInsightsCount(DevOpsGuru, StartTime, EndTime);
const reactivecount = await getReactiveInsightsCount(DevOpsGuru, StartTime, EndTime);

return getHtmlOutput(proactivecount, reactivecount, region, event, context);

};

Figure 3: Lambda Function Source Code

Click on Deploy to save the function code
Since we used the default settings while creating the function, a default Execution role is created and associated with the function. We will need to modify the IAM role to grant DevOps Guru permissions to retrieve Proactive and Reactive insights.
Click on the Configuration tab and select Permissions from the left side option list. You can see the IAM execution role associated with the function as shown in figure 4.

Figure 4: Lambda function execution role

Click on the IAM role name to open the role in the IAM console. Click on Add Permissions and select Attach policies.

Figure 5: IAM Role Update

Search for DevOps and select the AmazonDevOpsGuruReadOnlyAccess. Click on Add permissions to update the IAM role.

Figure 6: IAM Role Policy Update
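Alternatively, the managed policy can be attached from the command line; a sketch where the role name placeholder stands in for the auto-generated execution role name shown in your console:

aws iam attach-role-policy \
  --role-name <your-lambda-execution-role-name> \
  --policy-arn arn:aws:iam::aws:policy/AmazonDevOpsGuruReadOnlyAccess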

Now that we have created the Lambda function for our custom widget and assigned appropriate permissions, we can navigate to CloudWatch to create a Dashboard.
Navigate to CloudWatch and click on dashboards from the left side list. You can choose to create a new dashboard or add the widget in an existing dashboard.
We will choose to create a new dashboard

Figure 7: Create New CloudWatch dashboard

Choose Custom Widget in the Add widget page

Figure 8: Add widget

Click Next in the custom widget page without choosing a sample

Figure 9: Custom Widget Selection

Choose the region where DevOps Guru is enabled. Select the Lambda function that we created earlier. In the preview pane, click on Preview to view DevOps Guru metrics. Once the preview is successful, create the widget.

Figure 10: Create Custom Widget

Congratulations, you have now successfully created a CloudWatch dashboard with a custom widget to get insights from DevOps Guru. The sample code that we provided can be customized to suit your needs.

Using AWS CloudFormation

You may skip this section if you have already created the resources using the AWS Console.

In this step we will show you how to deploy the solution using AWS CloudFormation. AWS CloudFormation lets you model, provision, and manage AWS and third-party resources by treating infrastructure as code. Customers define an initial template and then revise it as their requirements change. For more information on CloudFormation stack creation, refer to this blog post.

The following resources are created.

Three Lambda functions that will support CloudWatch Dashboard custom widgets
An AWS Identity and Access Management (IAM) role that allows the Lambda functions to access DevOps Guru Insights and publish logs to CloudWatch
Three Log Groups under CloudWatch
A CloudWatch dashboard with widgets to pull data from the Lambda Functions

To deploy the solution by using the CloudFormation template

You can use this downloadable template to set up the resources. To launch directly through the console, choose the Launch Stack button, which creates the stack in the us-east-1 AWS Region.
Choose Next to go to the Specify stack details page.
(Optional) On the Configure Stack Options page, enter any tags, and then choose Next.
On the Review page, select I acknowledge that AWS CloudFormation might create IAM resources.
Choose Create stack.
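
If you would rather deploy the template from the command line than through the console, a hedged sketch follows; the template file path is a placeholder for the downloadable template mentioned above, and the stack name matches the one referenced in the Cleanup section:

aws cloudformation create-stack \
  --stack-name devopsguru-cloudwatch-dashboard \
  --template-body file://devopsguru-cloudwatch-dashboard.yaml \
  --capabilities CAPABILITY_IAM \
  --region us-east-1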

It takes approximately 2-3 minutes for the provisioning to complete. After the status is “Complete”, proceed to validate the resources as listed below.

Validate the resources

Now that the stack creation has completed successfully, you should validate the resources that were created.

On AWS Console, head to CloudWatch, under Dashboards – there will be a dashboard created with name <StackName-Region>.
On AWS Console, head to CloudWatch, under Log groups there will be three new log groups created with the names:

lambdaProactiveLogGroup
lambdaReactiveLogGroup
lambdaSummaryLogGroup

On AWS Console, head to Lambda, there will be three Lambda functions with the names:

lambdaFunctionDGProactive
lambdaFunctionDGReactive
lambdaFunctionDGSummary

On AWS Console, head to IAM, under Roles there will be a new role created with name “lambdaIAMRole”

To View Results/Outcome

With the appropriate time-range setup on CloudWatch Dashboard, you will be able to navigate through the insights that have been generated from DevOps Guru on the CloudWatch Dashboard.

Figure 11: DevOps Guru Insights in CloudWatch Dashboard

Cleanup

For cost optimization, after you complete and test this solution, clean up the resources. You can delete them manually if you used the AWS Console or by deleting the AWS CloudFormation stack called devopsguru-cloudwatch-dashboard if you used AWS CloudFormation.
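For the CloudFormation path, the stack deletion can also be done from the CLI:

aws cloudformation delete-stack --stack-name devopsguru-cloudwatch-dashboard --region us-east-1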

For more information on deleting the stacks, see Deleting a stack on the AWS CloudFormation console.

Conclusion

This blog post outlined how you can integrate DevOps Guru insights into a CloudWatch Dashboard. As a customer, you can start leveraging CloudWatch Custom Widgets to include DevOps Guru Insights in an existing Operational dashboard.

AWS Customers are now using Amazon DevOps Guru to monitor and improve application performance. You can start monitoring your applications by following the instructions in the product documentation. Head over to the Amazon DevOps Guru console to get started today.

To learn more about AIOps for Serverless using Amazon DevOps Guru check out this video.

Suresh Babu

Suresh Babu is a DevOps Consultant at Amazon Web Services (AWS) with 21 years of experience in designing and implementing software solutions across various industries. He helps customers in Application Modernization and DevOps adoption. Suresh is a passionate public speaker and often speaks about DevOps and Artificial Intelligence (AI).

Venkat Devarajan

Venkat Devarajan is a Senior Solutions Architect at Amazon Web Services (AWS) supporting enterprise automotive customers. He has over 18 years of industry experience in helping customers design, build, implement and operate enterprise applications.

Ashwin Bhargava

Ashwin is a DevOps Consultant at AWS working in Professional Services Canada. He is a DevOps expert and a security enthusiast with more than 15 years of development and consulting experience.

Murty Chappidi

Murty is an APJ Partner Solutions Architecture Lead at Amazon Web Services with a focus on helping customers with accelerated and seamless journey to AWS by providing solutions through our GSI partners. He has more than 25 years’ experience in software and technology and has worked in multiple industry verticals. He is the APJ SME for AI for DevOps Focus Area. In his free time, he enjoys gardening and cooking.

10 ways to build applications faster with Amazon CodeWhisperer

Amazon CodeWhisperer is a powerful generative AI tool that gives me coding superpowers. Ever since I have incorporated CodeWhisperer into my workflow, I have become faster, smarter, and even more delighted when building applications. However, learning to use any generative AI tool effectively requires a beginner’s mindset and a willingness to embrace new ways of working.

Best practices for tapping into CodeWhisperer’s power are still emerging. But, as an early explorer, I’ve discovered several techniques that have allowed me to get the most out of this amazing tool. In this article, I’m excited to share these techniques with you, using practical examples to illustrate just how CodeWhisperer can enhance your programming workflow. I’ll explore:

Typing less
Generating functions using code
Generating functions using comments
Generating classes
Implementing algorithms
Writing unit tests
Creating sample data
Simplifying regular expressions
Learning third-party code libraries faster
Documenting code

Before we begin

If you would like to try these techniques for yourself, you will need to use a code editor with the AWS Toolkit extension installed. VS Code, AWS Cloud9, and most editors from JetBrains will work. Refer to the CodeWhisperer “Getting Started” resources for setup instructions.

CodeWhisperer will present suggestions automatically as you type. If you aren’t presented with a suggestion, you can always manually trigger a suggestion using the Option + C (Mac) or Alt + C (Windows) shortcut. CodeWhisperer will also sometimes present you with multiple suggestions to choose from. You can press the → and ← keys to cycle through all available suggestions.

The suggestions CodeWhisperer offers are non-deterministic, which means you may receive slightly different suggestions than the ones shown in this article. If you receive a suggestion that doesn’t suit your needs, try cycling through all available suggestions using the arrow keys or try rephrasing any code comment prompts slightly. Even punctuation changes can produce fresh suggestions.

The examples in this article are written in JavaScript and TypeScript. However, the techniques I’ll demonstrate are equally suited to many of the programming languages supported by CodeWhisperer. Now, onto the examples!

Typing less

At the most basic level, CodeWhisperer can be used as a supercharged auto-complete, significantly reducing your typing when writing individual lines or blocks of code. To use CodeWhisperer in this way, just write code as you normally would. CodeWhisperer will make automatic suggestions as you type. Accept suggestions when they’re helpful, and ignore suggestions when they’re not. In my experience, using CodeWhisperer in this way alone reduces my typing by 50% or more.

Generating functions

Automatically generating function bodies from signatures

Skilled programmers use clear, descriptive names for their functions and parameters. If you follow this practice in your own programming, CodeWhisperer will have all the context it needs to implement many functions for you.

After I enter the function signature below, CodeWhisperer generates the function body for me.

User input (JavaScript):

function standardDeviation(numbers)

Result:

function standardDeviation(numbers) {
let mean = numbers.reduce((a, b) => a + b) / numbers.length;
let variance =
numbers.reduce((a, b) => a + Math.pow(b - mean, 2)) / numbers.length;
return Math.sqrt(variance);
}

Generating functions from comments

Sometimes a good function signature alone isn’t enough to convey the desired behavior of a function. In those cases, you can use a detailed code comment as a prompt to CodeWhisperer. I recommend using your language’s doc comment delimiter when using this technique.

User Input (JavaScript):

/**
* Given a list that contains some numbers and strings, format them into a string
* in which the numbers are prepended with "#" and the strings are wrapped in
* double quotes.
*/

Result:

/**
* Given a list that contains some numbers and strings, format them into a string
* in which the numbers are prepended with "#" and the strings are wrapped in
* double quotes.
*/
function formatList(list) {
return list
.map((item) => {
if (typeof item === "number") {
return `#${item}`;
} else {
return `"${item}"`;
}
})
.join(", ");
}

Generating classes

You can generate full or partial class implementations from a single code comment. The comment should describe the key properties and behaviors of the class. Simple descriptions are often all that is required.

User input (TypeScript):

/**
* A folder has a name, children, and an optional parent. A child can’t be added
* twice.
*/

Result:

After I write the above block comment, CodeWhisperer provides multiple suggestions, including the following implementation, complete with documentation comments for each method!

/**
* A folder has a name, children, and an optional parent. A child can’t be added
* twice.
*/
export class Folder {
name: string;
children: Folder[];
parent?: Folder;

constructor(name: string, parent?: Folder) {
this.name = name;
this.children = [];
this.parent = parent;
}

/**
* Add a child to the folder.
* @param child The child to add.
* @returns True if the child was added, false if it was already present.
*/
addChild(child: Folder): boolean {
if (this.children.includes(child)) {
return false;
}
this.children.push(child);
child.parent = this;
return true;
}

/**
* Remove a child from the folder.
* @param child The child to remove.
* @returns True if the child was removed, false if it wasn’t present.
*/
removeChild(child: Folder): boolean {
const index = this.children.indexOf(child);
if (index === -1) {
return false;
}
this.children.splice(index, 1);
child.parent = undefined;
return true;
}
}

Implementing algorithms

CodeWhisperer enables you to implement well-known algorithms with ease. Simply tell CodeWhisperer which algorithm you want to apply to a particular coding problem, and it will oblige.

User input (JavaScript):

const actors = [ … ];

// Sort the actors by age using the merge sort algorithm.

Result:

const actors = [ … ];

// Sort the actors by age using the merge sort algorithm.
function mergeSort(arr) {
if (arr.length < 2) {
return arr;
}

const middle = Math.floor(arr.length / 2);
const left = arr.slice(0, middle);
const right = arr.slice(middle);

return merge(mergeSort(left), mergeSort(right));
}

function merge(left, right) {
const result = [];

while (left.length && right.length) {
if (left[0].age < right[0].age) {
result.push(left.shift());
} else {
result.push(right.shift());
}
}

while (left.length) {
result.push(left.shift());
}

while (right.length) {
result.push(right.shift());
}

return result;
}

console.log(mergeSort(actors));

Writing unit tests

CodeWhisperer accelerates the implementation of unit tests. Not only is it able to implement the logic for most of the explicit unit tests I ask for, but it often suggests unit tests for edge cases that hadn’t even occurred to me.

When I want CodeWhisperer to implement a unit test, I start with a declaration of the condition I’m testing and the expected result as in the following example.

User input (JavaScript):

describe("Folder.addChild()", () => {
it("when adding an existing folder should not add duplicate"

Result:

describe("Folder.addChild()", () => {
it("when adding an existing folder should not add duplicate", () => {
const folder = new Folder("root");
const child = new Folder("child");
folder.addChild(child);
folder.addChild(child);
expect(folder.children.length).toBe(1);
});
});

After you’ve written a test or two, CodeWhisperer is likely to suggest more tests, even without explicit prompting.

Tip: You’ll notice that my test description includes 1) the function I’m testing, 2) the condition I’m testing, and 3) what I expect the result to be. Following this three-part pattern is a great way to ensure CodeWhisperer has the context it needs to implement high quality test logic.

Creating sample data

One of the delightful discoveries I’ve made is that CodeWhisperer can generate dummy data—or really any code that has a repeating pattern. As an illustration, for the following list I entered only the first “user” object as an example, and CodeWhisperer helpfully continued the pattern.

User input (JavaScript):

const fakeUsers = [
{ name: "User 1", id: "user1" },

Result:

const fakeUsers = [
{ name: "User 1", id: "user1" },
{ name: "User 2", id: "user2" },
{ name: "User 3", id: "user3" },
{ name: "User 4", id: "user4" },
{ name: "User 5", id: "user5" },
{ name: "User 6", id: "user6" }
]

It even works for generating random but plausible values like the names seen below.

User input (JavaScript):

const fakeUsers = [
{ name: "Sally", id: "user1" },

Result:

const fakeUsers = [
{ name: "Sally", id: "user1" },
{ name: "Bob", id: "user2" },
{ name: "Joe", id: "user3" },
{ name: "Jane", id: "user4" },
{ name: "Jill", id: "user5" },
{ name: "Jack", id: "user6" }
]

Simplifying regular expressions

While CodeWhisperer can’t translate natural language into custom regular expressions (yet), it does know a lot of commonly used regular expressions. This saves you time by putting these regular expressions at your fingertips whenever you need them.

Let’s look at one notoriously complex but common regular expression.

User input (JavaScript):

/**
* Function that validates whether a string is a valid email address.
*/

Result:

/**
* Function that validates whether a string is a valid email address.
*/
function validateEmail(email: string): boolean {
const re =
/^(([^<>()[\]\\.,;:\s@"]+(\.[^<>()[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;
return re.test(String(email).toLowerCase());
}

Learning third-party code libraries faster

CodeWhisperer helps accelerate my learning when I’m using an unfamiliar library, module, or framework. In the example below, I needed to write a utility function that would take a gzip compressed value coming from an API and translate it into a value that the rest of my code could understand. I knew the “pako” library would help me, but I had no idea how to use it. I just prompted CodeWhisperer with an import statement and a comment describing my goal.

import pako from "pako";

/**
* Accepts a base64-encoded, gzip-compressed JSON string and returns
* a parsed JSON object.
*/

Result:

import pako from "pako";

/**
* Accepts a base64-encoded, gzip-compressed JSON string and returns
* a parsed JSON object.
*/
export function decodeGzipJson(base64String) {
const compressed = atob(base64String);
const uncompressed = pako.inflate(compressed, { to: "string" });
return JSON.parse(uncompressed);
}

Documenting code

CodeWhisperer is capable of generating docstrings and comments for the code it generates, as well as for your existing code. For example, let’s say I want CodeWhisperer to document the matches() method of this FavoritesFilter TypeScript class I’ve implemented (I’ve omitted some implementation details for brevity).

class FavoritesFilter implements IAssetFilter {

matches(asset: Asset): boolean {

}
}

I can just type a doc comment delimiter (/** */) immediately above the method name and CodeWhisperer will generate the body of the doc comment for me.

Note: When using CodeWhisperer in this way you may have to manually trigger a suggestion using Option + C (Mac) or Alt + C (Windows).

class FavoritesFilter implements IAssetFilter {

/**
* Determines whether the asset matches the filter.
*/
matches(asset: Asset): boolean {

}
}

Conclusion

I hope the techniques above inspire ideas for how CodeWhisperer can make you a more productive coder. Install CodeWhisperer today to start using these time-saving techniques in your own projects. These examples only scratch the surface. As additional creative minds start applying CodeWhisperer to their daily workflows, I’m sure new techniques and best practices will continue to emerge. If you discover a novel approach that you find useful, post a comment to share what you’ve discovered. Perhaps your technique will make it into a future article and help others in the CodeWhisperer community enhance their superpowers.

Kris Schultz (he/him)

Kris Schultz has spent over 25 years bringing engaging user experiences to life by combining emerging technologies with world class design. In his role as 3D Specialist Solutions Architect, Kris helps customers leverage AWS services to power 3D applications of all sorts.

How Zomato Boosted Performance 25% and Cut Compute Cost 30% Migrating Trino and Druid Workloads to AWS Graviton

Zomato is an India-based restaurant aggregator, food delivery, and dining-out company with over 350,000 listed restaurants across more than 1,000 cities in India. The company relies heavily on data analytics to enrich the customer experience and improve business efficiency. Zomato's engineering and product teams use data insights to refine their platform's restaurant and cuisine recommendations, improve the accuracy of waiting times at restaurants, speed up the matching of delivery partners, and improve the overall food delivery process.

At Zomato, different teams have different requirements for data discovery based upon their business functions. For example, a city lead team needs the number of orders placed in a specific area, the customer support team needs queries resolved per minute, and marketing and other teams need the most searched dishes on special events or days. Zomato's Data Platform team is responsible for building and maintaining a reliable platform which serves these data insights to all business units.

Zomato’s Data Platform is powered by AWS services including Amazon EMR, Amazon Aurora MySQL-Compatible Edition and Amazon DynamoDB along with open source software Trino (formerly PrestoSQL) and Apache Druid for serving the previously mentioned business metrics to different teams. Trino clusters process over 250K queries by scanning 2PB of data and Apache Druid ingests over 20 billion events and serves 8 million queries every week. To deliver performance at Zomato scale, these massively parallel systems utilize horizontal scaling of nodes running on Amazon Elastic Compute Cloud (Amazon EC2) instances in their clusters on AWS. Performance of both these data platform components is critical to support all business functions reliably and efficiently in Zomato. To improve performance in a cost-effective manner, Zomato migrated these Trino and Druid workloads onto AWS Graviton-based Amazon EC2 instances.

Graviton-based EC2 instances are powered by Arm-based AWS Graviton processors. They deliver up to 40% better price performance than comparable x86-based Amazon EC2 instances. CPU- and memory-intensive Java-based applications, including Trino and Druid, are suitable candidates for AWS Graviton based instances to optimize price-performance, as Java is well supported and generally performant out-of-the-box on arm64.

In this blog, we will walk you through an overview of Trino and Druid, how they fit into the overall Data Platform architecture, and the migration journey onto AWS Graviton based instances for these workloads. We will also cover challenges faced during migration, how the Zomato team overcame those challenges, business gains in terms of cost savings and better performance, and Zomato's future plans for Graviton adoption for more workloads.

Trino overview

Trino is a fast, distributed SQL query engine for querying petabyte scale data, implementing massively parallel processing (MPP) architecture. It was designed as an alternative to tools that query Apache Hadoop Distributed File System (HDFS) using pipelines of MapReduce jobs, such as Apache Hive or Apache Pig, but Trino is not limited to querying HDFS only. It has been extended to operate over a multitude of data sources, including Amazon Simple Storage Service (Amazon S3), traditional relational databases and distributed data stores including Apache Cassandra, Apache Druid, MongoDB and more. When Trino executes a query, it does so by breaking up the execution into a hierarchy of stages, which are implemented as a series of tasks distributed over a network of Trino workers. This reduces end-to-end latency and makes Trino a fast tool for ad hoc data exploration over very large data sets.

Figure 1 – Trino architecture overview

Trino coordinator is responsible for parsing statements, planning queries, and managing Trino worker nodes. Every Trino installation must have a coordinator alongside one or more Trino workers. Client applications including Apache Superset and Redash connect to the coordinator via Presto Gateway to submit statements for execution. The coordinator creates a logical model of a query involving a series of stages, which is then translated into a series of connected tasks running on a cluster of Trino workers. Presto Gateway acts as a proxy/load-balancer for multiple Trino clusters.

Druid overview

Apache Druid is a real-time database to power modern analytics applications for use cases where real-time ingest, fast query performance and high uptime are important. Druid processes are deployed on three types of server nodes: Master nodes govern data availability and ingestion; Query nodes accept queries, execute them across the system, and return the results; and Data nodes ingest and store queryable data. Broker processes receive queries from external clients and forward those queries to Data servers. Historicals are the workhorses that handle storage and querying of “historical” data. MiddleManager processes handle ingestion of new data into the cluster. Please refer here to learn more about the detailed Druid architecture design.

Figure 2 – Druid architecture overview

Zomato’s Data Platform Architecture on AWS

Figure 3 – Zomato’s Data Platform landscape on AWS

Zomato’s Data Platform covers data ingestion, storage, distributed processing (enrichment and enhancement), batch and real-time data pipelines unification and a robust consumption layer, through which petabytes of data is queried daily for ad-hoc and near real-time analytics. In this section, we will explain the data flow of pipelines serving data to Trino and Druid clusters in the overall Data Platform architecture.

Data Pipeline-1: An Amazon Aurora MySQL-Compatible database is used to store data by various microservices at Zomato. Apache Sqoop on Amazon EMR runs Extract, Transform, Load (ETL) jobs at scheduled intervals to fetch data from Aurora MySQL-Compatible and transfer it to Amazon S3 in the Optimized Row Columnar (ORC) format, which is then queried by Trino clusters.

Data Pipeline-2: A Debezium Kafka connector deployed on Amazon Elastic Container Service (Amazon ECS) acts as a producer and continuously polls data from the Aurora MySQL-Compatible database. On detecting changes in the data, it identifies the change type and publishes the change data event to Apache Kafka in Avro format. Apache Flink on Amazon EMR consumes data from the Kafka topic, performs data enrichment and transformation, and writes it in ORC format to Iceberg tables on Amazon S3. Trino clusters then query data from Amazon S3.

Data Pipeline-3: Moving away from other databases, Zomato had decided to go serverless with Amazon DynamoDB because of its high performance (single-digit millisecond latency), request rate (millions per second), extreme scale as per Zomato peak expectations, economics (pay as you go) and data volume (TB, PB, EB) for their business-critical apps including Food Cart, Product Catalog and Customer preferences. DynamoDB streams publish data from these apps to Amazon S3 in JSON format to serve this data pipeline. Apache Spark on Amazon EMR reads JSON data, performs transformations including conversion into ORC format and writes data back to Amazon S3 which is used by Trino clusters for querying.

Data Pipeline-4: Zomato's core business applications serving end users include microservices, web and mobile applications. Getting near real-time insights from these core applications is critical to serve customers and win their trust continuously. Services use a custom SDK developed by the data platform team to publish events to an Apache Kafka topic. Then, two downstream data pipelines consume these application events available on Kafka via Apache Flink on Amazon EMR. Flink performs data conversion into ORC format and publishes data to Amazon S3, and in a parallel data pipeline, Flink also publishes enriched data onto another Kafka topic, which further serves data to an Apache Druid cluster deployed on Amazon EC2 instances.

Performance requirements for querying at scale

All of the described data pipelines ingest data into an Amazon S3 based data lake, which is then leveraged by three types of Trino clusters – Ad-hoc clusters for ad-hoc query use cases, with a maximum query runtime of 20 minutes; ETL clusters for creating materialized views to enhance performance of dashboard queries; and Reporting clusters to run queries for dashboards with various Key Performance Indicators (KPIs), with query runtime up to 3 minutes. ETL queries are run via Apache Airflow with a built-in query retry mechanism and a runtime of up to 3 hours.

Druid is used to serve two types of queries: computing aggregated metrics based on recent events and comparing aggregated metrics to historical data. For example, how is a specific metric in the current hour compared to the same last week. Depending on the use case, the service level objective for Druid query response time ranges from a few milliseconds to a few seconds.

Graviton migration of Druid cluster

Zomato first moved Druid nodes to AWS Graviton based instances in their test cluster environment to determine query performance. Nodes running brokers and middle-managers were moved from R5 to R6g instances, and nodes running historicals were migrated from i3 to R6gd instances. Zomato logged real-world queries from their production cluster and replayed them in their test cluster to validate the performance. Post validation, Zomato saw significant performance gains and reduced cost:

Performance gains

For queries in Druid, performance was measured using a typical business hours (12:00 to 22:00 Hours) load of 14K queries, as shown here, where p99 query runtime was reduced by 25%.

Figure 4 – Overall Druid query performance (Intel x86-64 vs. AWS Graviton)

Also, query performance improvements on the historical nodes of the Druid cluster are shown here, where p95 query runtime was reduced by 66%.

Figure 5 –Query performance on Druid Historicals (Intel x86-64 vs. AWS Graviton)

Under peak load during business hours (12:00 to 22:00 Hours as shown in the provided graph), with increasingly loaded CPUs, Graviton based instances demonstrated close to linear performance resulting in better query runtime than equivalent Intel x86 based instances. This provided headroom to Zomato to reduce their overall node count in the Druid cluster for serving the same peak load query traffic.

Figure 6 – CPU utilization (Intel x86-64 vs. AWS Graviton)

Cost savings

A cost comparison of Intel x86 vs. AWS Graviton based instances running Druid in a test environment, along with the number of instances, instance types and hourly On-Demand prices in the Singapore region, is shown here. There are cost savings of ~24% running the same number of Graviton based instances. Further, the Druid cluster auto scales in the production environment based upon performance metrics, so average cost savings with Graviton based instances are even higher, at ~30%, due to better performance.

Figure 7 – Cost savings analysis (Intel x86-64 vs. AWS Graviton)

Graviton migration of Trino clusters

Zomato also moved the Trino cluster in their test environment to AWS Graviton based instances and monitored query performance for different short and long-running queries. As shown here, the mean wall (elapsed) time for different Trino queries is lower on AWS Graviton based instances than on equivalent Intel x86 based instances for most queries (lower is better).

Figure 8 – Mean Wall Time for Trino queries (Intel x86-64 vs. AWS Graviton)

Also, p99 query runtime was reduced by ~33% after migrating the Trino cluster to AWS Graviton instances for a typical business day’s (7am – 7pm) mixed query load of ~15K queries.

Figure 9 – Query performance for a typical day (7am – 7pm) load

Zomato’s team further optimized overall Trino query performance by improving Advanced Encryption Standard (AES) performance on Graviton for TLS negotiation with Amazon S3. This was achieved by enabling the -XX:+UnlockDiagnosticVMOptions and -XX:+UseAESCTRIntrinsics JVM flags. As shown here, mean CPU time for queries is lower after enabling these extra JVM flags, for most of the queries.

Figure 10 – Query performance after enabling extra JVM options with Graviton instances
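
These are standard HotSpot options rather than anything Trino-specific. As a minimal sketch, the two flags would be appended to each worker’s extra JVM options (for example, a jvm.config entry or launch arguments; the exact file location is deployment-specific and assumed here):

-XX:+UnlockDiagnosticVMOptions
-XX:+UseAESCTRIntrinsics

The unlock flag must precede the intrinsics flag so the JVM accepts it; you can verify the effective value with java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep UseAESCTRIntrinsics.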

Migration challenges and approach

The Zomato team uses Trino version 359, for which no multi-arch or ARM64-compatible Docker image was available. Because the team wanted to migrate their Trino cluster to Graviton based instances with minimal engineering effort and time, they backported the multi-arch, UBI8 based Trino Docker image to their Trino version 359. This approach allowed faster adoption of Graviton based instances and eliminated the heavy lift of upgrading, testing, and benchmarking the workload on a newer Trino version.
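
As an illustration only (not Zomato’s actual build process), Docker Buildx can produce a single multi-architecture image from an existing Dockerfile; the registry and tag below are hypothetical placeholders:

# Create a builder that can target multiple platforms, then build and push
# one image manifest covering both x86-64 and ARM64 (Graviton)
docker buildx create --use
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t <REGISTRY>/trino:359-multiarch \
  --push .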

Next Steps

Zomato has already migrated AWS managed services including Amazon EMR and Amazon Aurora MySQL-Compatible database to AWS Graviton based instances. With the successful migration of two main open source software components (Trino and Druid) of their data platform to AWS Graviton with visible and immediate price-performance gains, the Zomato team plans to replicate that success with other open source applications running on Amazon EC2 including Apache Kafka, Apache Pinot, etc.

Conclusion

This post demonstrated the price/performance benefits of adopting AWS Graviton based instances for high throughput, near real-time big data analytics workloads running on Java-based, open source Apache Druid and Trino applications. Overall, Zomato reduced the cost of its Amazon EC2 usage by 30%, while improving performance for both time-critical and ad-hoc querying by as much as 25%. Due to better performance, Zomato was also able to right size compute footprint for these workloads on a smaller number of Amazon EC2 instances, with peak capacity of Apache Druid and Trino clusters reduced by 25% and 20% respectively.

Zomato migrated these open source software applications faster by quickly implementing the customizations needed for optimum performance and compatibility with Graviton based instances. Zomato’s mission is “better food for more people” and Graviton adoption is helping with this mission by providing a more sustainable, performant, and cost-effective compute platform on AWS. This is certainly “food for thought” for customers looking to improve price-performance and sustainability for their business-critical workloads running on open source software (OSS).


Create a CI/CD pipeline for .NET Lambda functions with AWS CDK Pipelines

The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework to define cloud infrastructure in familiar programming languages and provision it through AWS CloudFormation.

In this blog post, we will explore the process of creating a Continuous Integration/Continuous Deployment (CI/CD) pipeline for a .NET AWS Lambda function using the CDK Pipelines. We will cover all the necessary steps to automate the deployment of the .NET Lambda function, including setting up the development environment, creating the pipeline with AWS CDK, configuring the pipeline stages, and publishing the test reports. Additionally, we will show how to promote the deployment from a lower environment to a higher environment with manual approval.

Background

AWS CDK makes it easy to deploy a stack that provisions your infrastructure to AWS from your workstation by simply running cdk deploy. This is useful when you are doing initial development and testing. However, in most real-world scenarios, there are multiple environments, such as development, testing, staging, and production. It may not be the best approach to deploy your CDK application in all these environments using cdk deploy. Deployment to these environments should happen through more reliable, automated pipelines. CDK Pipelines makes it easy to set up a continuous deployment pipeline for your CDK applications, powered by AWS CodePipeline.

The AWS CDK Developer Guide’s Continuous integration and delivery (CI/CD) using CDK Pipelines page shows you how you can use CDK Pipelines to deploy a Node.js based Lambda function. However, .NET based Lambda functions are different from Node.js or Python based Lambda functions in that .NET code first needs to be compiled to create a deployment package. As a result, we decided to write this blog as a step-by-step guide to assist our .NET customers with deploying their Lambda functions utilizing CDK Pipelines.

In this post, we dive deeper into creating a real-world pipeline that runs build and unit tests, and deploys a .NET Lambda function to one or multiple environments.

Architecture

CDK Pipelines is a construct library that allows you to provision a CodePipeline pipeline. The pipeline created by CDK Pipelines is self-mutating. This means you only need to run cdk deploy once to get the pipeline started. After that, the pipeline automatically updates itself if you add new application stages or stacks in the source code.

The following diagram captures the architecture of the CI/CD pipeline created with CDK Pipelines. Let’s explore this architecture at a high level before diving deeper into the details.

Figure 1: Reference architecture diagram

The solution creates a CodePipeline with an AWS CodeCommit repository as the source (CodePipeline Source Stage). When code is checked into CodeCommit, the pipeline is automatically triggered and retrieves the code from the CodeCommit repository branch to proceed to the Build stage.

Build stage compiles the CDK application code and generates the cloud assembly.

Update Pipeline stage updates the pipeline (if necessary).

Publish Assets stage uploads the CDK assets to Amazon S3.

After Publish Assets is complete, the pipeline deploys the Lambda function to both the development and production environments. For added control, the architecture includes a manual approval step for releases that target the production environment.

Prerequisites

For this tutorial, you should have:

An AWS account

Visual Studio 2022
AWS Toolkit for Visual Studio
Node.js 18.x or later
AWS CDK v2 (2.67.0 or later required)
Git

Bootstrapping

Before you use AWS CDK to deploy CDK Pipelines, you must bootstrap the AWS environments where you want to deploy the Lambda function. An environment is the target AWS account and Region into which the stack is intended to be deployed.

In this post, you deploy the Lambda function into a development environment and, optionally, a production environment. This requires bootstrapping both environments. However, deployment to a production environment is optional; you can skip bootstrapping that environment for the time being, as we will cover that later.

This is a one-time activity for each environment to which you want to deploy CDK applications. To bootstrap the development environment, run the below command, substituting in the AWS account ID for your dev account, the Region you will use for your dev environment, and the locally-configured AWS CLI profile you wish to use for that account. See the documentation for additional details.

cdk bootstrap aws://<DEV-ACCOUNT-ID>/<DEV-REGION> \
    --profile <DEV-PROFILE> \
    --cloudformation-execution-policies arn:aws:iam::aws:policy/AdministratorAccess

--profile specifies the AWS CLI credential profile that will be used to bootstrap the environment. If not specified, the default profile will be used. The profile should have sufficient permissions to provision the resources for the AWS CDK during the bootstrap process.

--cloudformation-execution-policies specifies the ARNs of managed policies that should be attached to the deployment role assumed by AWS CloudFormation during deployment of your stacks.

Note: By default, stacks are deployed with full administrator permissions using the AdministratorAccess policy, but for real-world usage you should define a more restrictive IAM policy and use that instead. Refer to Customizing bootstrapping in the AWS CDK documentation and Secure CDK deployments with IAM permission boundaries to see how to do that.
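
As an illustrative sketch only, the same bootstrap command can point at a customer managed policy instead (CdkDeployPolicy is a hypothetical policy you would create and scope yourself):

cdk bootstrap aws://<DEV-ACCOUNT-ID>/<DEV-REGION> \
    --profile <DEV-PROFILE> \
    --cloudformation-execution-policies arn:aws:iam::<DEV-ACCOUNT-ID>:policy/CdkDeployPolicy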

Create a Git repository in AWS CodeCommit

For this post, you will use CodeCommit to store your source code. First, create a git repository named dotnet-lambda-cdk-pipeline in CodeCommit by following these steps in the CodeCommit documentation.

After you have created the repository, generate git credentials to access the repository from your local machine if you don’t already have them. Follow the steps below to generate git credentials.

Sign in to the AWS Management Console and open the IAM console.
Create an IAM user (for example, git-user).
Once the user is created, attach the AWSCodeCommitPowerUser policy to the user.
Next, open the user details page, choose the Security Credentials tab, and in HTTPS Git credentials for AWS CodeCommit, choose Generate.

Choose Download credentials to save this information as a .CSV file.

Clone the recently created repository to your workstation, then cd into dotnet-lambda-cdk-pipeline directory.

git clone <CODECOMMIT-CLONE-URL>
cd dotnet-lambda-cdk-pipeline

Alternatively, you can use git-remote-codecommit to clone the repository with the git clone codecommit::<REGION>://<PROFILE>@<REPOSITORY-NAME> command, replacing the placeholders with your values. Using git-remote-codecommit does not require you to create additional IAM users to manage git credentials. To learn more, refer to the AWS CodeCommit with git-remote-codecommit documentation page.

Initialize the CDK project

From the command prompt, inside the dotnet-lambda-cdk-pipeline directory, initialize an AWS CDK project by running the following command.

cdk init app --language csharp

Open the generated C# solution in Visual Studio, right-click the DotnetLambdaCdkPipeline project and select Properties. Set the Target framework to .NET 6.

Create a CDK stack to provision the CodePipeline

Your CDK Pipelines application includes at least two stacks: one that represents the pipeline itself, and one or more stacks that represent the application(s) deployed via the pipeline. In this step, you create the first stack that deploys a CodePipeline pipeline in your AWS account.

From Visual Studio, open the solution by opening the .sln solution file (in the src/ folder). Once the solution has loaded, open the DotnetLambdaCdkPipelineStack.cs file, and replace its contents with the following code. Note that the filename, namespace and class name all assume you named your Git repository as shown earlier.

Note: be sure to replace “<CODECOMMIT-REPOSITORY-NAME>” in the code below with the name of your CodeCommit repository (in this blog post, we have used dotnet-lambda-cdk-pipeline).

using Amazon.CDK;
using Amazon.CDK.AWS.CodeBuild;
using Amazon.CDK.AWS.CodeCommit;
using Amazon.CDK.AWS.IAM;
using Amazon.CDK.Pipelines;
using Constructs;
using System.Collections.Generic;

namespace DotnetLambdaCdkPipeline
{
    public class DotnetLambdaCdkPipelineStack : Stack
    {
        internal DotnetLambdaCdkPipelineStack(Construct scope, string id, IStackProps props = null) : base(scope, id, props)
        {
            var repository = Repository.FromRepositoryName(this, "repository", "<CODECOMMIT-REPOSITORY-NAME>");

            // This construct creates a pipeline with 3 stages: Source, Build, and UpdatePipeline
            var pipeline = new CodePipeline(this, "pipeline", new CodePipelineProps
            {
                PipelineName = "LambdaPipeline",
                SelfMutation = true,

                // Synth represents a build step that produces the CDK Cloud Assembly.
                // The primary output of this step needs to be the cdk.out directory generated by the cdk synth command.
                Synth = new CodeBuildStep("Synth", new CodeBuildStepProps
                {
                    // The files downloaded from the repository will be placed in the working directory when the script is executed
                    Input = CodePipelineSource.CodeCommit(repository, "master"),

                    // Commands to run to generate CDK Cloud Assembly
                    Commands = new string[] { "npm install -g aws-cdk", "cdk synth" },

                    // Build environment configuration
                    BuildEnvironment = new BuildEnvironment
                    {
                        BuildImage = LinuxBuildImage.AMAZON_LINUX_2_4,
                        ComputeType = ComputeType.MEDIUM,

                        // Specify true to get a privileged container inside the build environment image
                        Privileged = true
                    }
                })
            });
        }
    }
}

In the preceding code, you use CodeBuildStep instead of ShellStep, since ShellStep doesn’t provide a property to specify BuildEnvironment. We need to specify the build environment in order to set privileged mode, which allows access to the Docker daemon in order to build container images in the build environment. This is necessary to use the CDK’s bundling feature, which is explained later in this blog post.

Open the file src/DotnetLambdaCdkPipeline/Program.cs, and edit its contents to reflect the below. Be sure to replace the placeholders with your AWS account ID and region for your dev environment.

using Amazon.CDK;

namespace DotnetLambdaCdkPipeline
{
    sealed class Program
    {
        public static void Main(string[] args)
        {
            var app = new App();
            new DotnetLambdaCdkPipelineStack(app, "DotnetLambdaCdkPipelineStack", new StackProps
            {
                Env = new Amazon.CDK.Environment
                {
                    Account = "<DEV-ACCOUNT-ID>",
                    Region = "<DEV-REGION>"
                }
            });
            app.Synth();
        }
    }
}

Note: Instead of committing the account ID and region to source control, you can set environment variables on the CodeBuild agent and use them; see Environments in the AWS CDK documentation for more information. Because the CodeBuild agent is also configured in your CDK code, you can use the BuildEnvironmentVariableType property to store environment variables in AWS Systems Manager Parameter Store or AWS Secrets Manager.

After you make the code changes, build the solution to ensure there are no build issues. Next, commit and push all the changes you just made. Run the following commands (or alternatively use Visual Studio’s built-in Git functionality to commit and push your changes):

git add --all .
git commit -m 'Initial commit'
git push

Then navigate to the root directory of the repository, where your cdk.json file is present, and run the cdk deploy command to deploy the initial version of the CodePipeline. Note that the deployment can take several minutes.
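
For example, assuming the same dev profile used during bootstrapping:

# Deploys the stack that contains the pipeline itself
cdk deploy --profile <DEV-PROFILE>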

The pipeline created by CDK Pipelines is self-mutating. This means you only need to run cdk deploy one time to get the pipeline started. After that, the pipeline automatically updates itself if you add new CDK applications or stages in the source code.

After the deployment has finished, a CodePipeline is created and automatically runs. The pipeline includes three stages as shown below.

Source – It fetches the source of your AWS CDK app from your CodeCommit repository and triggers the pipeline every time you push new commits to it.

Build – This stage compiles your code (if necessary) and performs a cdk synth. The output of that step is a cloud assembly.

UpdatePipeline – This stage runs the cdk deploy command on the cloud assembly generated in the previous stage and modifies the pipeline if necessary. For example, if you update your code to add a new deployment stage to the pipeline, the pipeline is automatically updated to reflect the changes you made.

Figure 2: Initial CDK pipeline stages

Define a CodePipeline stage to deploy .NET Lambda function

In this step, you create a stack containing a simple Lambda function and place that stack in a stage. Then you add the stage to the pipeline so it can be deployed.

To create a Lambda project, do the following:

In Visual Studio, right-click on the solution, choose Add, then choose New Project.
In the New Project dialog box, choose the AWS Lambda Project (.NET Core – C#) template, and then choose OK or Next.
For Project Name, enter SampleLambda, and then choose Create.
From the Select Blueprint dialog, choose Empty Function, then choose Finish.

Next, create a new file in the CDK project at src/DotnetLambdaCdkPipeline/SampleLambdaStack.cs to define your application stack containing a Lambda function. Update the file with the following contents (adjust the namespace as necessary):

using Amazon.CDK;
using Amazon.CDK.AWS.Lambda;
using Constructs;
using AssetOptions = Amazon.CDK.AWS.S3.Assets.AssetOptions;

namespace DotnetLambdaCdkPipeline
{
    class SampleLambdaStack : Stack
    {
        public SampleLambdaStack(Construct scope, string id, StackProps props = null) : base(scope, id, props)
        {
            // Commands executed in an AWS CDK pipeline to build, package, and extract a .NET function.
            var buildCommands = new[]
            {
                "cd /asset-input",
                "export DOTNET_CLI_HOME=\"/tmp/DOTNET_CLI_HOME\"",
                "export PATH=\"$PATH:/tmp/DOTNET_CLI_HOME/.dotnet/tools\"",
                "dotnet build",
                "dotnet tool install -g Amazon.Lambda.Tools",
                "dotnet lambda package -o output.zip",
                "unzip -o -d /asset-output output.zip"
            };

            new Function(this, "LambdaFunction", new FunctionProps
            {
                Runtime = Runtime.DOTNET_6,
                Handler = "SampleLambda::SampleLambda.Function::FunctionHandler",

                // Asset path should point to the folder where the .csproj file is present.
                // Also, this path should be relative to the cdk.json file.
                Code = Code.FromAsset("./src/SampleLambda", new AssetOptions
                {
                    Bundling = new BundlingOptions
                    {
                        Image = Runtime.DOTNET_6.BundlingImage,
                        Command = new[]
                        {
                            "bash", "-c", string.Join(" && ", buildCommands)
                        }
                    }
                })
            });
        }
    }
}

Building inside a Docker container

The preceding code uses the CDK bundling feature to build the Lambda function inside a Docker container. Bundling starts a new Docker container, copies the Lambda source code into the /asset-input directory of the container, and runs the specified commands, which write the package files under the /asset-output directory. The files in /asset-output are copied as assets to the stack’s cloud assembly directory. In a later stage, these files are zipped and uploaded to Amazon S3 as the CDK asset.

Building Lambda functions inside Docker containers is preferable to building them locally because it reduces the host machine’s dependencies, resulting in greater consistency and reliability in your build process.

Bundling requires the creation of a Docker container on your build machine. For this purpose, the Privileged = true setting has already been configured on the CodeBuild build environment.

Adding development stage

Create a new file in the CDK project at src/DotnetLambdaCdkPipeline/DotnetLambdaCdkPipelineStage.cs to hold your stage. This class will create the development stage for your pipeline.

using Amazon.CDK;
using Constructs;

namespace DotnetLambdaCdkPipeline
{
    public class DotnetLambdaCdkPipelineStage : Stage
    {
        internal DotnetLambdaCdkPipelineStage(Construct scope, string id, IStageProps props = null) : base(scope, id, props)
        {
            Stack lambdaStack = new SampleLambdaStack(this, "LambdaStack");
        }
    }
}

Edit src/DotnetLambdaCdkPipeline/DotnetLambdaCdkPipelineStack.cs to add the stage to your pipeline. Add the devStage line shown at the end of the constructor in the code below to your file.

using Amazon.CDK;
using Amazon.CDK.Pipelines;

namespace DotnetLambdaCdkPipeline
{
    public class DotnetLambdaCdkPipelineStack : Stack
    {
        internal DotnetLambdaCdkPipelineStack(Construct scope, string id, IStackProps props = null) : base(scope, id, props)
        {
            var repository = Repository.FromRepositoryName(this, "repository", "<CODECOMMIT-REPOSITORY-NAME>");

            // This construct creates a pipeline with 3 stages: Source, Build, and UpdatePipeline
            var pipeline = new CodePipeline(this, "pipeline", new CodePipelineProps
            {
                PipelineName = "LambdaPipeline",
                .
                .
                .
            });

            var devStage = pipeline.AddStage(new DotnetLambdaCdkPipelineStage(this, "Development"));
        }
    }
}

Next, build the solution, then commit and push the changes to the CodeCommit repo. This will trigger the CodePipeline to start.

When the pipeline runs, the UpdatePipeline stage detects the changes and updates the pipeline based on the code it finds there. After the UpdatePipeline stage completes, the pipeline is updated with additional stages.

Let’s observe the changes:

An Assets stage has been added. This stage uploads all of the assets you are using in your app to Amazon S3 (the S3 bucket created during bootstrapping) so that they can be used by deployment stages later in the pipeline. For example, the CloudFormation template used by the development stage includes references to these assets, which is why the assets are first moved to S3 and then referenced in later stages.

A Development stage with two actions has been added. The first action is to create the change set, and the second is to execute it.

Figure 3: CDK pipeline with development stage to deploy .NET Lambda function

After the Deploy stage has completed, you can find the newly-deployed Lambda function by visiting the Lambda console, selecting “Functions” from the left menu, and filtering the functions list with “LambdaStack”. Note the runtime is .NET.
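
Alternatively, you can locate it from the command line; this sketch simply filters function names on the stack name prefix and assumes your CLI credentials point at the dev account:

aws lambda list-functions \
  --query "Functions[?contains(FunctionName, 'LambdaStack')].[FunctionName, Runtime]" \
  --output table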

Running Unit Test cases in the CodePipeline

Next, you will add unit test cases to your Lambda function, and run them through the pipeline to generate a test report in CodeBuild.

To create a Unit Test project, do the following:

Right click on the solution, choose Add, then choose New Project.
In the New Project dialog box, choose the xUnit Test Project template, and then choose OK or Next.
For Project Name, enter SampleLambda.Tests, and then choose Create or Next.
Depending on your version of Visual Studio, you may be prompted to select the version of .NET to use. Choose .NET 6.0 (Long Term Support), then choose Create.
Right click on SampleLambda.Tests project, choose Add, then choose Project Reference. Select SampleLambda project, and then choose OK.

Next, edit the src/SampleLambda.Tests/UnitTest1.cs file to add a unit test. You can use the code below, which verifies that the Lambda function returns the input string as upper case.

using Xunit;

namespace SampleLambda.Tests
{
    public class UnitTest1
    {
        [Fact]
        public void TestSuccess()
        {
            var lambda = new SampleLambda.Function();

            var result = lambda.FunctionHandler("test string", context: null);

            Assert.Equal("TEST STRING", result);
        }
    }
}
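
For context, the Empty Function blueprint generates a handler along these lines, which is the behavior the test above exercises (shown as a sketch; your generated file may differ slightly):

using Amazon.Lambda.Core;

// Assembly attribute that tells Lambda which JSON serializer to use for input/output
[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.SystemTextJson.DefaultLambdaJsonSerializer))]

namespace SampleLambda
{
    public class Function
    {
        // Returns the input string converted to upper case
        public string FunctionHandler(string input, ILambdaContext context)
        {
            return input?.ToUpper();
        }
    }
}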

You can add pre-deployment or post-deployment actions to the stage by calling its AddPre() or AddPost() method. To execute the above test cases, we will use a pre-deployment action.

To add a pre-deployment action, we will edit the src/DotnetLambdaCdkPipeline/DotnetLambdaCdkPipelineStack.cs file in the CDK project, after we add code to generate test reports.

To run the unit test(s) and publish the test report in CodeBuild, we will construct a BuildSpec for our CodeBuild project. We also provide IAM policy statements to be attached to the CodeBuild service role granting it permissions to run the tests and create reports. Update the file by adding the new code (starting with “// Add this code for test reports”) below the devStage declaration you added earlier:

using Amazon.CDK;
using Amazon.CDK.AWS.CodeBuild;
using Amazon.CDK.AWS.IAM;
using Amazon.CDK.Pipelines;
using Constructs;
using System.Collections.Generic;

namespace DotnetLambdaCdkPipeline
{
    public class DotnetLambdaCdkPipelineStack : Stack
    {
        internal DotnetLambdaCdkPipelineStack(Construct scope, string id, IStackProps props = null) : base(scope, id, props)
        {
            // …
            // …
            // …
            var devStage = pipeline.AddStage(new DotnetLambdaCdkPipelineStage(this, "Development"));

            // Add this code for test reports
            var reportGroup = new ReportGroup(this, "TestReports", new ReportGroupProps
            {
                ReportGroupName = "TestReports"
            });

            // Policy statements for CodeBuild Project Role
            var policyProps = new PolicyStatementProps()
            {
                Actions = new string[] {
                    "codebuild:CreateReportGroup",
                    "codebuild:CreateReport",
                    "codebuild:UpdateReport",
                    "codebuild:BatchPutTestCases"
                },
                Effect = Effect.ALLOW,
                Resources = new string[] { reportGroup.ReportGroupArn }
            };

            // PartialBuildSpec in AWS CDK for C# can be created using Dictionary
            var reports = new Dictionary<string, object>()
            {
                {
                    "reports", new Dictionary<string, object>()
                    {
                        {
                            reportGroup.ReportGroupArn, new Dictionary<string, object>()
                            {
                                { "file-format", "VisualStudioTrx" },
                                { "files", "**/*" },
                                { "base-directory", "./testresults" }
                            }
                        }
                    }
                }
            };
            // End of new code block
        }
    }
}

Finally, add the CodeBuildStep as a pre-deployment action to the development stage with necessary CodeBuildStepProps to set up reports. Add this after the new code you added above.

devStage.AddPre(new Step[]
{
    new CodeBuildStep("Unit Test", new CodeBuildStepProps
    {
        Commands = new string[]
        {
            "dotnet test -c Release ./src/SampleLambda.Tests/SampleLambda.Tests.csproj --logger trx --results-directory ./testresults",
        },
        PrimaryOutputDirectory = "./testresults",
        PartialBuildSpec = BuildSpec.FromObject(reports),
        RolePolicyStatements = new PolicyStatement[] { new PolicyStatement(policyProps) },
        BuildEnvironment = new BuildEnvironment
        {
            BuildImage = LinuxBuildImage.AMAZON_LINUX_2_4,
            ComputeType = ComputeType.MEDIUM
        }
    })
});

Build the solution, then commit and push the changes to the repository. Pushing the changes triggers the pipeline, runs the test cases, and publishes the report to the CodeBuild console. To view the report, after the pipeline has completed, navigate to TestReports in CodeBuild’s Report Groups as shown below.

Figure 4: Test report in CodeBuild report group

Deploying to production environment with manual approval

CDK Pipelines makes it very easy to deploy additional stages with different accounts. You have to bootstrap the accounts and Regions you want to deploy to, and they must have a trust relationship added to the pipeline account.

To bootstrap an additional production environment into which AWS CDK applications will be deployed by the pipeline, run the below command, substituting in the AWS account ID for your production account, the region you will use for your production environment, the AWS CLI profile to use with the prod account, and the AWS account ID where the pipeline is already deployed (the account you bootstrapped at the start of this blog).

cdk bootstrap aws://<PROD-ACCOUNT-ID>/<PROD-REGION> \
    --profile <PROD-PROFILE> \
    --cloudformation-execution-policies arn:aws:iam::aws:policy/AdministratorAccess \
    --trust <PIPELINE-ACCOUNT-ID>

The --trust option indicates which other account should have permissions to deploy AWS CDK applications into this environment. For this option, specify the pipeline’s AWS account ID.

Use the code below to add a new stage for production deployment with manual approval. Add this code below the devStage.AddPre(…) code block you added in the previous section, and remember to replace the placeholders with the AWS account ID and Region of your prod environment.

var prodStage = pipeline.AddStage(new DotnetLambdaCdkPipelineStage(this, "Production", new StageProps
{
    Env = new Environment
    {
        Account = "<PROD-ACCOUNT-ID>",
        Region = "<PROD-REGION>"
    }
}), new AddStageOpts
{
    Pre = new[] { new ManualApprovalStep("PromoteToProd") }
});

To support deploying CDK applications to another account, the artifact buckets must be encrypted, so add a CrossAccountKeys property to the CodePipeline near the top of the pipeline stack file, and set the value to true (see the CrossAccountKeys line in the code snippet below). This creates a KMS key for the artifact bucket, allowing cross-account deployments.

var pipeline = new CodePipeline(this, "pipeline", new CodePipelineProps
{
    PipelineName = "LambdaPipeline",
    SelfMutation = true,
    CrossAccountKeys = true,
    EnableKeyRotation = true, // Enable KMS key rotation for the generated KMS keys

    // …
});

After you commit and push the changes to the repository, a new manual approval step called PromoteToProd is added to the Production stage of the pipeline. The pipeline pauses at this step and awaits manual approval as shown in the screenshot below.

Figure 5: Pipeline waiting for manual review

When you click the Review button, you are presented with the following dialog. From here, you can choose to approve or reject and add comments if needed.

Figure 6: Manual review approval dialog

Once you approve, the pipeline resumes, executes the remaining steps, and completes the deployment to the production environment.

Figure 7: Successful deployment to production environment

Clean up

To avoid incurring future charges, log in to the AWS console of the different accounts you used, go to the AWS CloudFormation console of the Region(s) where you chose to deploy, select the stacks created for this activity, and choose Delete. Alternatively, you can delete the CloudFormation stack(s) using the cdk destroy command. This will not delete the CDKToolkit stack that the bootstrap command created; if you want to delete that as well, you can do it from the AWS console.
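
For example, a sketch of removing the pipeline stack from your workstation (the stack and profile names match the examples above; stacks that the pipeline deployed into other accounts may still need to be deleted from their CloudFormation consoles):

cdk destroy DotnetLambdaCdkPipelineStack --profile <DEV-PROFILE>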

Conclusion

In this post, you learned how to use CDK Pipelines for automating the deployment process of .NET Lambda functions. An intuitive and flexible architecture makes it easy to set up a CI/CD pipeline that covers the entire application lifecycle, from build and test to deployment. With CDK Pipelines, you can streamline your development workflow, reduce errors, and ensure consistent and reliable deployments.
For more information on CDK Pipelines and all the ways it can be used, see the CDK Pipelines reference documentation.

About the authors:

Ankush Jain

Ankush Jain is a Cloud Consultant at AWS Professional Services based out of Pune, India. He currently focuses on helping customers migrate their .NET applications to AWS. He is passionate about cloud, with a keen interest in serverless technologies.

Sanjay Chaudhari

Sanjay Chaudhari is a Cloud Consultant with AWS Professional Services. He works with customers to migrate and modernize their Microsoft workloads to the AWS Cloud.

Monitoring Amazon DevOps Guru insights using Amazon Managed Grafana

As organizations operate day-to-day, having insights into their cloud infrastructure state can be crucial for the durability and availability of their systems. Industry research estimates[1] that downtime costs small businesses around $427 per minute, and medium to large businesses an average of $9,000 per minute. Amazon DevOps Guru customers want to monitor and generate alerts using a single dashboard. This allows them to reduce context switching between applications, providing them an opportunity to respond to operational issues faster.

DevOps Guru can integrate with Amazon Managed Grafana to create and display operational insights. Alerts can be created and communicated for any critical events captured by DevOps Guru and notifications can be sent to operation teams to respond to these events. The key telemetry data types of logs and metrics are parsed and filtered to provide the necessary insights into observability.

Furthermore, Amazon Managed Grafana provides plug-ins for popular open source databases, third-party ISV monitoring tools, and other cloud services. With Amazon Managed Grafana, you can easily visualize information from multiple AWS services, AWS accounts, and Regions in a single Grafana dashboard.

In this post, we will walk you through integrating the insights generated from DevOps Guru with Amazon Managed Grafana.

Solution Overview:

This architecture diagram shows the flow of the logs and metrics that will be utilized by Amazon Managed Grafana. Starting with DevOps Guru, Amazon EventBridge saves the insight event logs to an Amazon CloudWatch log group, while DevOps Guru service metrics go to CloudWatch Metrics; Amazon Managed Grafana then parses these logs and metrics to create new dashboards in Grafana.

Now we will walk you through how to do this and set up notifications to your operations team.

Prerequisites:

The following prerequisites are required for this walkthrough:

An AWS Account

DevOps Guru enabled on your account, monitoring either a CloudFormation stack or tagged resources.

Using Amazon CloudWatch Metrics

 

DevOps Guru sends service metrics to CloudWatch Metrics. We will use these to track metrics for insights and metrics for your DevOps Guru usage; the DevOps Guru service reports the metrics to the AWS/DevOps-Guru namespace in CloudWatch by default.
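
As an optional check, you can confirm these metrics are present using the AWS CLI:

aws cloudwatch list-metrics --namespace "AWS/DevOps-Guru"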

First, we will provision an Amazon Managed Grafana workspace and then create a Dashboard in the workspace that uses Amazon CloudWatch as a data source.

Setting up Amazon CloudWatch Metrics

Create Grafana Workspace
Navigate to Amazon Managed Grafana from AWS console, then click Create workspace

a. Select the Authentication mechanism

i. AWS IAM Identity Center (AWS SSO) or SAML v2 based Identity Providers

ii. Service Managed Permission or Customer Managed

iii. Choose Next

b. Under “Data sources and notification channels”, choose Amazon CloudWatch

c. Create the Service.

You can use this post for more information on how to create and configure the Grafana workspace with SAML based authentication.

Next, we will show you how to create a dashboard and parse the Logs and Metrics to display the DevOps Guru insights and recommendations.

2. Configure Amazon Managed Grafana

a. Add CloudWatch as a data source:
From the left bar navigation menu, hover over AWS and select Data sources.

b. From the Services dropdown select and configure CloudWatch.

3. Create a Dashboard

a. From the left navigation bar, click on add a new Panel.

b. You will see a demo panel.

c. In the demo panel – Click on Data source and select Amazon CloudWatch.

d. For this panel we will use CloudWatch metrics to display the number of insights.

e. From Namespace, select the AWS/DevOps-Guru namespace, Insights as the Metric name, and Average for Statistics, then click Apply.

f. This is our first panel. We can change the panel name from the right-side bar under Title. We will name this panel “Insights”.

g. From the top right menu, click save dashboard and give your new dashboard a name

Using Amazon CloudWatch Logs via Amazon EventBridge

For other insights beyond the service metrics, such as the number of insights for a specific service, or averages for a Region or a specific AWS account, we will need to parse the event logs. These logs first need to be sent to Amazon CloudWatch Logs. We will go over the details on how to set this up and how we can parse these logs in Amazon Managed Grafana using CloudWatch Logs Query Syntax. In this post, we will show a couple of examples. For more details, please check out this User Guide documentation. This is not done by default, so we will need to use Amazon EventBridge to pass these logs to CloudWatch.

DevOps Guru logs include other details that can be helpful when building dashboards, such as the Region, insight severity (High, Medium, or Low), associated resources, and the DevOps Guru dashboard URL, among other things. For more information, please check out this User Guide documentation.

EventBridge offers a serverless event bus that helps you receive, filter, transform, route, and deliver events. It provides one to many messaging solutions to support decoupled architectures, and it is easy to integrate with AWS Services and 3rd-party tools. Using Amazon EventBridge with DevOps Guru provides a solution that is easy to extend to create a ticketing system through integrations with ServiceNow, Jira, and other tools. It also makes it easy to set up alert systems through integrations with PagerDuty, Slack, and more.

 

Setting up Amazon CloudWatch Logs

Let’s dive in to creating the EventBridge rule and enhance our Grafana dashboard:

a. First head to Amazon EventBridge in the AWS console.

b. Click Create rule.

     Type in a rule Name and Description. You can leave the Event bus set to default and the Rule type set to Rule with an event pattern.

c. Select AWS events or EventBridge partner events.

    For Event pattern, change to Custom patterns (JSON editor) and use:

{"source": ["aws.devops-guru"]}

This filters for all events generated by DevOps Guru. You can use the same mechanism to filter specific message types, such as new insights or closed insights, to different channels. For this demonstration, we will capture all events.

d. Next, for Target, select AWS service.

    Then use CloudWatch log Group.

    For the Log Group, give your group a name, such as “devops-guru”.

e. Click Create rule.

f. Navigate back to Amazon Managed Grafana.
It’s time to add a couple of additional panels to our dashboard. Click Add panel.
    Then select Amazon CloudWatch, change from Metrics to CloudWatch Logs, and select the Log Group we created previously.

g. For the query use the following to get the number of closed insights:

fields @detail.messageType
| filter detail.messageType="CLOSED_INSIGHT"
| count(detail.messageType)

You’ll see the new dashboard get updated with “Data is missing a time field”.

You can either open the suggestions and select a gauge that makes sense;

Or choose from multiple visualization options.

Now we have 2 panels:

h. You can repeat the same process to create a third panel for new insights using this query:

fields @detail.messageType
| filter detail.messageType="NEW_INSIGHT"
| count(detail.messageType)

Now we have 3 panels:

Next, depending on the visualizations, you can work with the Logs and metrics data types to parse and filter the data.

i. For our fourth panel, we will add a direct link to the DevOps Guru dashboard in the AWS console.

Repeat the same process as demonstrated previously one more time with this query:

fields detail.messageType, detail.insightSeverity, detail.insightUrl
| filter detail.messageType="CLOSED_INSIGHT" or detail.messageType="NEW_INSIGHT"

    Switch to the table visualization when prompted on the panel.

This will give us a direct link to the DevOps Guru dashboard and help us get to the insight details and Recommendations.

Save your dashboard.

You can extend observability by sending notifications through alerts on dashboard panels that provide metrics. The alerts are triggered when a condition is met and are communicated through the Amazon SNS notification mechanism. This is our SNS notification channel setup.

The previously created notification channel is then used to communicate any alerts when a condition is met on the metrics being observed.

Cleanup

To avoid incurring future charges, delete the resources.

Navigate to EventBridge in the AWS console and delete the “devops-guru” rule created in the Setting up Amazon CloudWatch Logs section.
Navigate to CloudWatch Logs in the AWS console and delete the “devops-guru” log group created as a result of that rule.
Amazon Managed Grafana: Navigate to the Amazon Managed Grafana service and delete the Grafana workspace you created in step 1.

Conclusion

In this post, we have demonstrated how to successfully incorporate Amazon DevOps Guru insights into Amazon Managed Grafana and use Grafana as the observability tool. This allows operations teams to observe the state of their AWS resources and be notified through alarms when preset thresholds on DevOps Guru metrics and logs are crossed. You can expand on this to create other panels and dashboards specific to your needs. If you don’t have DevOps Guru, you can start monitoring your AWS applications with AWS DevOps Guru today using this link.

[1] https://www.atlassian.com/incident-management/kpis/cost-of-downtime

About the authors:

MJ Kubba

MJ Kubba is a Solutions Architect who enjoys working with public sector customers to build solutions that meet their business needs. MJ has over 15 years of experience designing and implementing software solutions. He has a keen passion for DevOps and cultural transformation.

David Ernst

David is a Sr. Specialist Solution Architect – DevOps, with 20+ years of experience in designing and implementing software solutions for various industries. David is an automation enthusiast and works with AWS customers to design, deploy, and manage their AWS workloads/architectures.

Sofia Kendall

Sofia Kendall is a Solutions Architect who helps small and medium businesses achieve their goals as they utilize the cloud. Sofia has a background in Software Engineering and enjoys working to make systems reliable, efficient, and scalable.

Introducing AWS Libcrypto for Rust, an Open Source Cryptographic Library for Rust

Today we are excited to announce the availability of AWS Libcrypto for Rust (aws-lc-rs), an open source cryptographic library for Rust software developers with FIPS cryptographic requirements. At our 2022 AWS re:Inforce talk we introduced our customers to AWS Libcrypto (AWS-LC), and our investment in and improvements to open source cryptography. Today we continue that mission by releasing aws-lc-rs, a performant cryptographic library for Linux (x86, x86-64, aarch64) and macOS (x86-64) platforms.

Rust developers increasingly need to deploy applications that meet US and Canadian government cryptographic requirements. We evaluated how to deliver FIPS validated cryptography in idiomatic and performant Rust, built around our AWS-LC offering. We found that the popular ring (v0.16) library fulfilled much of the cryptographic needs in the Rust community, but it did not meet the needs of developers with FIPS requirements. Our intention is to contribute a drop-in replacement for ring that provides FIPS support and is compatible with the ring API. Rust developers with prescribed cryptographic requirements can seamlessly integrate aws-lc-rs into their applications and deploy them into AWS Regions.

AWS-LC is the foundation of aws-lc-rs. AWS-LC is the general-purpose cryptographic library for the C programming language at AWS. It is a fork from Google’s BoringSSL, with features and performance enhancements developed by AWS, such as FIPS support, formal verification for validating implementation correctness, performance improvements on Arm processors for ChaCha20-Poly1305 and NIST P-256 algorithms, and improvements to ECDSA signature verification for NIST P-256 curves on x86 based platforms. aws-lc-rs leverages these AWS-LC optimizations to improve performance in Rust applications. AWS-LC has been submitted to an accredited lab for FIPS validation testing, and upon completion will be submitted to NIST for certification. Once NIST grants a validation certificate to AWS-LC, we will make an announcement to Rust developers on how to leverage the FIPS mode in aws-lc-rs.

We used rustls, a Rust library that provides TLS 1.2 and 1.3 protocol implementations to benchmark aws-lc-rs performance. We ran a set of benchmark scenarios on c7g.metal (Graviton3) and c6i.metal (x86-64) Amazon Elastic Compute Cloud (Amazon EC2) instance types. The graph below shows the improvement of a TLS client negotiating new connections using aws-lc-rs. aws-lc-rs with rustls significantly improves throughput in each of the algorithms tested, and for every hardware platform. We are excited to share aws-lc-rs and these cryptographic improvements with the Rust community today. We are continually evaluating our benchmarks for opportunities to improve aws-lc-rs.

Getting Started

Incorporating aws-lc-rs in your project is straightforward. Let’s look at how you can use aws-lc-rs in your Rust application by creating a SHA256 digest of the message “Hello Blog Readers!” An enhanced version of this digest example is available in the aws-lc-rs repository.

Install the Rust toolchain with rustup, if you do not already have it.
Initialize a new cargo package for your Rust project, if you don’t already have one:
$ cargo new --bin aws-lc-rs-example

Record aws-lc-rs as a dependency in the project Cargo file:
$ cargo add aws-lc-rs

Applications already using the ring (v0.16.x) API: If your application already leverages the ring API, you can easily test and benchmark your application against aws-lc-rs without changing your application’s use declarations:

$ cargo remove ring
$ cargo add --rename ring aws-lc-rs

Edit src/main.rs:

use aws_lc_rs::digest::{digest, SHA256};

fn main() {
    const MESSAGE: &[u8] = b"Hello Blog Readers!";

    let output = digest(&SHA256, MESSAGE);

    for v in output.as_ref() {
        print!("{:02x}", *v);
    }
    println!();
}

Compile and run the program:
$ cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.04s
Running `target/debug/aws-lc-rs-example`
c10843d459bf8e2fa6000d59a95c0ae57966bd296d9e90531c4ec7261460c6fb

Conclusion

Rust developers increasingly need to deploy applications that meet US and Canadian government cryptographic requirements. In this post, you learned how we are building aws-lc-rs in order to bring FIPS compliant cryptography to Rust applications. Together AWS-LC and aws-lc-rs bring performance improvements for Arm and x86-64 processor families for commonly used cryptographic algorithms. If you are interested in using or contributing to aws-lc-rs source code or documentation, they are publicly available under the terms of either the Apache Software License 2.0 or ISC License from our GitHub repository. We use GitHub Issues for managing feature requests or bug reports. You can follow aws-lc-rs on crates.io for notifications about new releases. If you discover a potential security issue in aws-lc-rs or AWS-LC, we ask that you notify AWS Security using our vulnerability reporting page.


AWS Now Supports Credentials-fetcher for gMSA on Amazon Linux 2023

In Q1 of 2023, AWS announced the release of the group Managed Service Account (gMSA) credentials-fetcher daemon, with initial support on Amazon Linux 2023, Fedora Linux 36, and Red Hat Enterprise Linux 9. The credentials-fetcher daemon, developed by AWS, is an open source project under the Apache 2.0 License. This release solves a 10-year, longstanding challenge affecting domain connected Linux machines. Until now, Linux users couldn’t use Microsoft Active Directory (Microsoft AD) gMSA and thus have missed out on the improved security and flexibility that gMSA offers over standard service accounts. With the release of the credentials-fetcher daemon, organizations now gain all of gMSA’s benefits without being tied to Windows based hosts.

In this blog post, we explain the use case for credentials-fetcher and give simple instructions for using an Active Directory domain joined Linux server with gMSA. We also demonstrate the interaction with other domain joined services such as Amazon Relational Database Service (Amazon RDS) for Microsoft SQL Server.  The new capabilities of credentials-fetcher pave the way for additional use cases, such as using a Linux host in Amazon Elastic Container Service (Amazon ECS) clusters with gMSA. AWS is committed to using the credentials-fetcher open source project in the AWS cloud, though users may choose to run the service elsewhere. The utility of the service is not limited to AWS. The credentials-fetcher daemon can be leveraged on any supported distribution of Linux and in any environment that meets the Microsoft Active Directory version requirement. This includes on-premise environments, hosted data centers, and other cloud providers.

Solution overview

Organizations running Windows workloads hosted in on-premises data centers use Microsoft AD to authenticate users and services to shared resources over the network. As these organizations migrate workloads into Windows-based environments on AWS and on other clouds, customers traditionally use the domain-join model to access Microsoft AD from Windows instances. In addition, organizations that use Windows containers to scale their applications and reduce their total cost of ownership (TCO) have used gMSAs for Active Directory access by providing Kerberos tickets for container-hosts.

As customers modernize their Windows and Microsoft SQL Server workloads to Linux-based platforms, they still need to authenticate the migrated applications through the organization’s existing Microsoft AD. Although customers can use the domain-join methodology to connect Linux instances to Microsoft AD, it requires a number of steps that traditionally include security limitations. The current method involves a sidecar architecture that fails to periodically rotate passwords, unlike gMSA on Windows containers, thus inducing a security risk of password exposure. Organizations with stringent security postures have not adopted this method on Linux containers and have been waiting for a “gMSA on Windows containers”-like experience on Linux containers.  Active Directory gMSAs have been technically infeasible for customers on Linux-based environments, until today.

A brief introduction to gMSA

Windows-based server infrastructure commonly uses Microsoft Active Directory to facilitate authentication and authorization between users, computers, and other computer network resources. Traditionally, enterprise applications running on Windows platforms use either manually managed accounts used as service accounts or Managed Service Accounts (MSA) for authentication and authorization. The use of manually managed service accounts brings with it the overhead of service account password management, including manually updating the password and updating the password on all servers. It also introduces increased security risks as these accounts typically have elevated privileges and are not tied to a specific user, which creates challenges for attributing activity when auditing the account. For this reason, password management of these accounts is critical.

In contrast, Managed Service Accounts don’t have any password management overhead; the passwords for these type of accounts are automatically rotated and updated on your servers. They are also limited to a single computer account, which means they can’t be used on more than one computer, and cannot be used for interactive logons. A Group Managed Service Account (gMSA) is a special type of service account which augments the functionality; its identity can be shared across multiple computers without needing to know the password. Computers should be part of a Microsoft Active Directory domain, which manages these service accounts to make use of them. Although Windows containers cannot join a domain like an instance, they can still use gMSA identity for authentication and authorization.

Credentials-fetcher’s potential scenarios

With the addition of the credentials-fetcher daemon, more organizations can use gMSA. This gives customers more options if they’re more familiar with Linux, they’re looking to save on licensing costs, and/or looking to improve their security posture. Customers can now associate Linux machines to a gMSA and take advantage of the authentication and authorization between members of that group managed security account. Environments hosted on domain joined, gMSA associated Linux machines running .NET applications or running in Linux containers can now use the gMSA to authenticate between their own domains and other services like Microsoft SQL Server.

Scenario 1: A Microsoft .NET application is running in Docker containers, with the hosts on a Microsoft Active Directory domain joined Amazon Elastic Compute Cloud (Amazon EC2) Linux server. The Linux application server is added as a member of the gMSA group. The gMSA account is granted permissions to the domain joined Microsoft SQL Server or Amazon RDS for Microsoft SQL Server database.

Scenario 2: A Microsoft .NET application is running in Docker containers and Microsoft SQL server running in its own Docker container, with the hosts on a Microsoft Active Directory domain joined Amazon EC2 Linux server.  The Linux host servers of the application containers and Microsoft SQL Server container are added as members of the gMSA group. The gMSA account is granted permissions to the Microsoft SQL Server instance database running in a container.

Scenario 3: A Microsoft .NET application is running on an Amazon Elastic Container Service (Amazon ECS) cluster, hosted on a Microsoft Active Directory domain. The Linux servers within the Amazon ECS cluster are added as members of the gMSA group. The gMSA account is granted permissions to the domain joined Microsoft SQL Server or Amazon RDS for Microsoft SQL Server database.

Here is a visualization of the featured use-case scenarios.

Figure 1 Different use-case scenarios with Credentials-fetcher

Implementing the environment

This section will walk you through the prerequisites, environment setup and the installation steps for the credentials-fetcher daemon’s use cases.

Prerequisites

You have properly installed and configured the AWS Command Line Interface (AWS CLI) and PowerShell core on your workstation. We’ve chosen to use the AWS CLI for these steps so that the end-to-end workflow can be demonstrated.
This blog post as of April 4th, 2023 requires an install of Fedora Linux 36 or newer and the latest Amazon Linux 2023 AMI.
This blog post references AWS Managed Microsoft Active Directory, but it will also work with other self-managed Microsoft Active Directory scenarios as long as the Linux machines are able to be domain joined.
You have an Amazon Relational Database Service (Amazon RDS) instance that is joined to the domain.
You have elevated administrative Active Directory credentials to configure instances to join a domain and create a Microsoft AD security group.
You have access to the credentials-fetcher GitHub package for installation of the latest daemon and updated instructions.

Environment Setup for gMSA on Linux use cases

Figure 2 Credentials-fetcher running in Fedora Linux Server.

All instructions assume the use of the Fedora Linux 36 distro, which was available at the time of writing. We plan to add gMSA support for additional Linux distributions in the future.

1.     Set up AWS Managed Microsoft Active Directory or Self-hosted Active Directory.

Active Directory setup: You will set up a domain join from the Linux instance to the AD domain. The Linux instance is part of the AD security group that has access to the gMSA account, as configured by the AD administrator.
AWS Managed Microsoft Active Directory can be deployed using this AWS CloudFormation template.

2.     Create a gMSA account as a Microsoft AD administrator.

Example: Replace ‘LinuxAppFarm’, ‘LinuxFarm01$’ and ‘CORP.EXAMPLE.COM’ with your own gMSA and domain names, respectively. Three Linux instances are displayed in this example: LinuxInstance01$, LinuxInstance02$ and LinuxInstance03$.

# Create the AD group
New-ADGroup -Name "LinuxAppFarm" -SamAccountName "LinuxAppFarm" -GroupScope DomainLocal

# Create the gMSA
New-ADServiceAccount -Name "gmsamachines" -DnsHostName "gmsamachines.CORP.EXAMPLE.COM" -ServicePrincipalNames "host/LinuxAppFarm", "host/LinuxAppFarm.CORP.EXAMPLE.COM" -PrincipalsAllowedToRetrieveManagedPassword "LinuxAppFarm"

# Add your Linux instances or containers to the AD group
Add-ADGroupMember -Identity "LinuxAppFarm" -Members "LinuxInstances01$", "LinuxInstances02$", "LinuxInstances03$", "MSSQLRDSIntance$"

3.     Verify and test the gMSA account.

PowerShell
# Test the gMSA account from the current computer
Test-ADServiceAccount gmsamachines

# Get the current computer's group membership
Get-ADComputer $env:LinuxInstances01 | Get-ADPrincipalGroupMembership | Select-Object DistinguishedName

# Get the groups allowed to retrieve the gMSA password (change "gmsamachines" to your own gMSA name)
(Get-ADServiceAccount gmsamachines -Properties PrincipalsAllowedToRetrieveManagedPassword).PrincipalsAllowedToRetrieveManagedPassword

Additional reference detailed instructions can be found in this guide to getting started with group managed service accounts.

4.     Create a credentialspec associated with a gMSA account:

Install powershell CredentialSpec module and create CredentialSpec

PowerShell
Install-Module CredentialSpec

# Replace 'LinuxAppFarm' with your own gMSA group name
New-CredentialSpec -AccountName LinuxAppFarm

You will find the credential spec at 'C:\ProgramData\Docker\CredentialSpecs\LinuxAppFarm_CredSpec.json'.

5.     Obtain and deploy the supported Fedora Linux 36 version or newer supported AMI (AWS Public Cloud download.)

6.     Manually join your Linux system to the Microsoft Active Directory domain using the following command:

#Install realmd and configure DNS resolver for the Active Directory domain
sudo dnf install realmd sssd oddjob oddjob-mkhomedir adcli krb5-workstation samba-common-tools -y
sudo systemctl stop systemd-resolved
sudo systemctl disable systemd-resolved
sudo unlink /etc/resolv.conf

#Add your DNS nameserver IP and domain name to the resolv.conf and save
sudo nano /etc/resolv.conf

nameserver 10.0.0.20
search corp.example.com

#Join the Linux server to the realm/domain (case-sensitive)

Replace the (upper-case) realm account and domain name indicated in bold with the UPN of a domain user and the FQDN of the domain. Remove the < and > in your final command.
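The exact join command depends on your environment. As a minimal sketch, assuming the corp.example.com domain from the resolv.conf example above and a placeholder administrator UPN (ADMIN@CORP.EXAMPLE.COM), the join could look like this:

#Confirm the domain is discoverable, then join it with an elevated domain account (placeholders shown)
sudo realm discover CORP.EXAMPLE.COM
sudo realm join -v -U 'ADMIN@CORP.EXAMPLE.COM' CORP.EXAMPLE.COM

#Verify the machine is now joined
sudo realm list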

Auto-join is not currently supported; it will become available once the Amazon Linux 2022 distro is updated with the new rpm.

Microsoft SQL Server and Amazon RDS for Microsoft SQL Server can be added for Kerberos database authentication.

Microsoft SQL and Amazon RDS for Microsoft SQL Server must be joined to the AWS Managed Microsoft AD Domain.

See instructions on how to connect Amazon RDS for Microsoft SQL Server to the Microsoft Active Directory domain.

For the highest recommended level of security, constrained Kerberos delegation should be applied to the gMSA accounts for any service access.

Set-ADAccountControl -Identity TestgMSA$ -TrustedForDelegation $false -TrustedToAuthForDelegation $false
Set-ADServiceAccount -Identity TestgMSA$ -Clear 'msDS-AllowedToDelegateTo'

Detailed instructions can be found here.

7.     Invoke the AddKerberosLease API with the credentialspec input as shown in the following command. This step is important to allow the credentials-fetcher daemon to make a connection to Microsoft Active Directory. The gMSA account is then used for authentication.
Use this command with Fedora Linux only (grpc_cli is not available on Amazon Linux):

#Replace the gMSA group name, NetBIOS name, and DNS names in the command (bold text)
grpc_cli call unix:/var/credentials-fetcher/socket/credentials_fetcher.sock AddKerberosLease "credspec_contents: '{\"CmsPlugins\":[\"ActiveDirectory\"],\"DomainJoinConfig\":{\"Sid\":\"S-1-5-21-1445507628-2856671781-3529916291\",\"MachineAccountName\":\"gmsamachines\",\"Guid\":\"af602f85-d754-4eea-9fa8-fd76810485f1\",\"DnsTreeName\":\"corp.example.com\",\"DnsName\":\"corp.example.com\",\"NetBiosName\":\"DEMOCORP\"},\"ActiveDirectoryConfig\":{\"GroupManagedServiceAccounts\":[{\"Name\":\"gmsamachines\",\"Scope\":\"corp.example.com\"},{\"Name\":\"gmsamachines\",\"Scope\":\"DEMOCORP\"}]}}'"

Response example: (Note the response for use with your Docker application container)
path to kerberos ticket : /var/credentials-fetcher/krbdir/726837743cc966c7b4da/WebApp01
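It can be convenient to keep the returned ticket path in a shell variable so it can be reused for the container bind mount later in Scenario 1. A minimal sketch, using the example path above:

#Store the ticket directory returned by AddKerberosLease for later use (example value shown)
KRB_TICKET_DIR=/var/credentials-fetcher/krbdir/726837743cc966c7b4da/WebApp01
echo "${KRB_TICKET_DIR}"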

8.     Invoke the DeleteKerberosLease API with the lease_id input as shown here. Set the unique identifier lease_id associated with the request. The deleted_kerberos_file_paths in the response are the paths of the Kerberos tickets that were deleted for the corresponding gMSA accounts.
Use this command from the Linux host:

#Delete Kerberos lease sample command
grpc_cli call unix:/var/credentials-fetcher/socket/credentials_fetcher.sock DeleteKerberosLease "lease_id: '${response_lease_id_from_add_kerberos_lease}'"
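As an optional sanity check (a minimal sketch; the directory layout is assumed from the example response above), confirm that the ticket directory for the deleted lease is gone:

#List the remaining Kerberos ticket directories managed by credentials-fetcher
ls -l /var/credentials-fetcher/krbdir/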

Installation of credentials-fetcher on supported Linux distros

In this basic use case for testing the credentials-fetcher rpm, the following architecture is assumed for the purposes of this blog.

An AWS Managed Microsoft Active Directory joined Linux Application Server.
An AWS Managed Microsoft Active Directory joined RDS for Microsoft SQL.
A gMSA account established for account credentials in AWS Managed Microsoft Active Directory.

Fedora Linux 36 Server setup:

Deploy the “Fedora cloud based image for AWS public cloud” located here to your AWS account.
Credentials-fetcher is packaged and included as part of the standard Fedora package repositories. Install the credentials-fetcher rpm by typing the command:

sudo dnf install credentials-fetcher -y
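Optionally, verify the installation and make sure the daemon is running and enabled at boot. This is a minimal sketch using standard rpm and systemd commands:

#Confirm the installed package version
rpm -q credentials-fetcher

#Start the daemon now, enable it at boot, and check its status
sudo systemctl enable --now credentials-fetcher
systemctl status credentials-fetcher --no-pager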

How to use credentials-fetcher per scenario

In these instructions, we will demonstrate the use of credentials-fetcher with an ASP.NET application and Amazon RDS for Microsoft SQL Server. A Microsoft SQL Server container scenario will also be demonstrated as an additional use case.

Scenario 1:  Using .NET Core container application on Linux with Amazon RDS for Microsoft SQL Server backend

Figure 3 Using .NET Core container application on Linux with Amazon RDS for Microsoft SQL Server backend

Once the environment prerequisites have been met, you can set up the Docker repository and install Docker in preparation for deploying a .NET Core application to a container on the Linux server or servers.

1.     Set up the repository.

Install the dnf-plugins-core package (which provides the commands to manage your DNF repositories) and set up the repository.

sudo dnf -y install dnf-plugins-core

sudo dnf config-manager \
    --add-repo \
    https://download.docker.com/linux/fedora/docker-ce.repo

2.     Install Docker Engine and verify credentials-fetcher is installed and started.

Install the latest version of Docker Engine, containerd, and Docker Compose and start the Docker daemon:

sudo dnf install docker-ce docker-ce-cli containerd.io docker-compose-plugin
sudo systemctl start docker

sudo dnf install credentials-fetcher
sudo systemctl start credentials-fetcher

Additional detailed instructions on how to install Docker Engine can be found here.

1.     Create a Kerberos ticket associated with the gMSA account, as described in step 7 of "Environment Setup for gMSA on Linux use cases".

Take a note of the response generated by the Kerberos ticket creation.

2.     Leverage the Dockerfile for the environment variables

FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build-image
WORKDIR /src
COPY *.csproj ./
RUN dotnet restore
COPY . ./
RUN dotnet publish -c Release -o out

FROM mcr.microsoft.com/dotnet/aspnet:5.0
WORKDIR /app
EXPOSE 80
COPY --from=build-image /src/out ./

# Install the Kerberos client utilities used to consume the gMSA ticket
RUN apt-get -qq update && \
    apt-get -yqq install krb5-user && \
    apt-get -yqq clean

# Point the Kerberos credential cache at the ticket provided via the bind mount
ENV KRB5CCNAME=/var/credentials-fetcher/krbdir/krb5cc

ENTRYPOINT ["dotnet", "WebApp.dll"]

Example ticket path (from the AddKerberosLease response): /var/credentials-fetcher/krbdir/726837743cc966c7b4da/WebApp01

3.     Build a Docker image for your application on the Linux server:

docker build -t <your_image_name> .

4.     Run Docker with a bind mount to the Kerberos ticket. Ensure the environment variable KRB5CCNAME (see the example Dockerfile above) points to the destination location of the bind mount inside the application container.

sudo docker run -p 80:80 -d -it --name webapp1 --mount type=bind,source=/var/credentials-fetcher/krbdir/726837743cc966c7b4da/WebApp01,target=/var/credentials-fetcher/krbdir {docker_image}
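To confirm the ticket is visible inside the running container, you can run klist through docker exec. This is a minimal sketch that assumes the container name webapp1 from the command above, the krb5-user tools installed by the example Dockerfile, and a credential cache file matching the KRB5CCNAME path:

#Show the Kerberos credential cache the application will use (KRB5CCNAME is set in the Dockerfile)
sudo docker exec webapp1 klist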

5.     Add Amazon RDS for Microsoft SQL Server to the gMSA group with the following commands:

#Add the AD-joined SQL RDS server to the gMSA account's allowed principals
Set-ADServiceAccount -Identity "gmsamachines" -PrincipalsAllowedToRetrieveManagedPassword "mssqlrdsname$"

6.     Download and install SQL Server Management Studio (SSMS) on a management Windows machine that is a member of the service account group to test the Amazon RDS for Microsoft SQL Server connection. The .NET application will then have access to the Amazon RDS for Microsoft SQL Server instance.

7.     Log in to the Amazon RDS for Microsoft SQL Server instance and apply the gMSA service account with the desired permissions required by the .NET application.

Scenario 2:  Using .NET Core container application on Linux with a Microsoft SQL Server Container

Figure 4 Using .NET Core container application on Linux with a Microsoft SQL Server Container.

As with Scenario 1, the same steps to install Docker on Linux and deploy your application will apply. The difference will be the deployment of Microsoft SQL Server in a container and the confirmation that the server operates as expected running on Linux and leveraging gMSA for authentication.

1.     As with the first scenario, install and run the credentials-fetcher on the Linux server or servers you are deploying your .NET application containers to.

2.     Deploy a Microsoft SQL Server 2022 container on one of the Linux servers in your gMSA group.

Leverage the Dockerfile example from Scenario 1 to set the KRB5CCNAME environment variable for the Microsoft SQL Server container.
Reference Scenario 1 for details on verifying the Dockerfile KRB5CCNAME setting.

Run the following command on your Linux server to install the latest Microsoft SQL Server 2022 Docker container available. Replace yourStrong(!)Password with a strong password in the command.

sudo docker run -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=yourStrong(!)Password" -p 1433:1433 -d --mount type=bind,source=/var/credentials-fetcher/krbdir/726837743cc966c7b4da/WebApp01,target=/var/credentials-fetcher/krbdir mcr.microsoft.com/mssql/server:2022-latest

Verify that the Microsoft SQL Server Docker container is running with the following command:

sudo docker ps
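Optionally, confirm that SQL Server accepts connections by running sqlcmd inside the container. This is a minimal sketch: replace <sql_container_id> with the container ID or name from docker ps, note that the sqlcmd path and flags (for example /opt/mssql-tools18/bin/sqlcmd and -C to trust the server certificate) vary by image version, and use the SA password you set above:

#Connect to the containerized SQL Server with the SA account to confirm it is up
sudo docker exec -it <sql_container_id> /opt/mssql-tools18/bin/sqlcmd -S localhost -U SA -P 'yourStrong(!)Password' -C -Q 'SELECT @@VERSION'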

Additional reference details for deploying a Microsoft SQL Server container on Linux can be found here.

3.     Download and install SQL Server Management Studio (SSMS) on a management Windows machine that is a member of the service account group to test the Microsoft SQL Server connection.

4.     Log in to the Microsoft SQL Server instance and apply the gMSA service account with the desired permissions required by the .NET application.

Conclusion

Linux containers have become a key modernization destination for customers running .NET workloads. gMSA for Linux containers will help organizations and the overall Microsoft administrator and developer community to access AD from applications and services hosted on Linux containers using the service account authentication model.

gMSA is a managed account that provides automatic password management, service principal name (SPN) management, and the ability to delegate management to administrators over multiple servers or instances. This unblocks a range of modernization use cases around identity using Microsoft AD, such as connecting .NET Core applications hosted on Linux containers with SQL Server authenticating over Microsoft AD, and securely opening up access to network resources from applications running with service accounts. Based on these capabilities and their usefulness to customers, credentials-fetcher is positioned to be extended into services such as Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS).

This service feature extends support for non-Windows container applications that require gMSA for Microsoft AD Authentication. AWS is dedicated to continuing development and support for the credentials-fetcher daemon open source project. We believe that open source is good for everyone and we are committed to bringing the value of open source to our customers, and the operational excellence of AWS to open source communities. Contributions and feedback are welcome.


Behind the Scenes on AWS Contributions to Cloud Native Open Source Projects

Amazon Elastic Kubernetes Service (Amazon EKS) is well known in the Kubernetes community. But few realize that AWS engineers are closely involved and contributing upstream to Kubernetes and to many more cloud native open source projects.

In the past year alone, AWS contributed significantly to containerd, Cortex, etcd, Fluentd, nerdctl, Notary, OpenTelemetry, Thanos, and Tinkerbell. We employ maintainers and contributors on these projects and we will contribute more to these and other projects in the coming year. Here’s a behind-the-scenes look at our contributions and why we’re investing in the open source projects we support. You can also meet many of our contributors in the AWS booth at KubeCon Europe in Amsterdam, April 18-21, 2023 and hear from them in our virtual Container Day event 9 a.m. – 4 p.m. CEST on April 18.

“Amazon EKS is committed to open source and we are spending a lot of our cycles now focused on contributing back to the community. Kubernetes is part of a community that’s bigger than AWS and so we’re continuing to be committed to maintaining and helping that community to be successful because without it, we wouldn’t exist, either,” said Barry Cooks, Vice President, Kubernetes, at AWS and a Cloud Native Computing Foundation (CNCF) governing board member.

AWS contributes to Kubernetes and Etcd

Today, AWS is heavily involved in open source, cloud native projects. Consider, for example, some of our recent key contributions to Kubernetes and etcd, the underlying data store for Kubernetes.

“We’re building the AWS cloud provider, contributing to CAPI (cluster API), and serve as part of the security response committee. We helped implement gzip optimization which improves the performance of Kubernetes clients,” said Nathan Taber who leads the product team for Kubernetes at AWS, in a keynote at KubeCon North America 2022. “With etcd we’re bringing our operational learnings from running just so much etcd at scale, back into the community.”

The AWS cloud provider for Kubernetes is the open source interface between a Kubernetes cluster and AWS service APIs. This project allows a Kubernetes cluster to provision, monitor, and remove AWS resources necessary for operation of the cluster.

As of Kubernetes 1.27, AWS has just finished a multi-year effort to migrate our legacy cloud provider out of tree to an external cloud provider. The cloud provider migration reduces binary bloat in the main kubernetes/kubernetes (k/k) repository and reduces dependency complexity and the surface area for security vulnerabilities.

AWS has also built a webhook framework that allows cloud providers to host webhooks in their cloud-controller-managers, which makes certain migration tasks easier. One use case for this is helping other cloud providers to migrate the persistent volume labeller admission controllers from the API server code, which is one of the last areas of cloud provider specific code that needs to be migrated out of core Kubernetes.

“We’ve included a lot of space in our planning for upstream open source work this year,” said Nick Turner, software developer on the AWS Kubernetes team and a chair in Kubernetes SIG-cloud-provider. “Expect us to keep up our contributions to the cloud provider and the load balancer controller as well as increase our investments in the AWS IAM authenticator for Kubernetes and KMS encryption provider.”

These and other Kubernetes contributions bring value to the entire Kubernetes community as well as to the EKS service and its customers.

Since KubeCon Detroit last fall, the EKS-etcd team has contributed numerous improvements to etcd. Chao Chen contributed to the effort to improve testing mechanisms for etcd by unifying the test frameworks used by etcd tests. Baoming Wang contributed an important metric to the Kubernetes API server code base which will help catch data corruption issues early. We’ve also worked on building a linearizability test suite, made various improvements to the core etcd database and its BoltDB backend, contributed to documentation, made Helm more resilient to etcd-side transient errors, and fixed an issue with the installation script for argo-cd-helmfile.

What’s driving AWS to contribute more to cloud native open source

Like most modern companies, AWS builds many of its services with open source components. There are several business and technical reasons we do this, which we’ve outlined in an article on The New Stack about why we invest in sustainable open source. We recognize that the success of our services depends on the success of those underlying open source projects.

Given that most of the open source projects that AWS supports underpin specific services, AWS tasks all engineers working on those services, regardless of their assigned sub-service teams, with contributing in any way they can to the upstream projects.

The result is a virtuous cycle that promotes mutually beneficial growth. As AWS services grow, so too do the open source projects upon which they are based because of AWS contributions and support. Conversely, as these open source projects grow from the contributions of other companies and developers, so do the benefits to the AWS services that depend upon them.

AWS contributions focus on performance and scale

AWS contributions to open source typically come as a practical matter in the form of bug fixes, code reviews, documentation, new features, or security enhancements. Like many developers working in the open source space, AWS engineers often work to address issues that arise in the course of their day jobs and then share the fixes with the rest of the open source community. Similarly, new features for an open source project are developed by AWS engineers to expand the project’s scale or performance which in turn increases the project’s usability, stability, and overall appeal.

Because AWS has a large number of Kubernetes clusters under management, it affords AWS a unique opportunity to test the limitations of open source software and build its edges stronger and further out from its initial core. So many of the contributions that our team members do for upstream Kubernetes, etcd, containerd, and other projects center on making sure that we provide insights to the upstream community on where things break down in scaling, production, and operational readiness.

The resulting insights provide value for the entire open source community as well as our own customers.

Take, for example, the compression feature that was meant to reduce lag but curiously acted as a latency expander. AWS engineer Shyam Jeedigunta was looking at the logs and metrics collected from thousands of production EKS clusters. He determined that Gzip compression is enabled inside the Kubernetes API server to reduce the demand on network bandwidth and to decrease latency. However, the compression was actually increasing the latency for large list requests made by clients to the Kubernetes API server. Shyam, who is also co-chair of the Kubernetes scalability special interest group (SIG), took a deep dive into the issue to investigate whether a particular compression level created the problem and, if so, whether the compression level could be reduced. Could Gzip compression be disabled entirely? What impact would that have on latency and network bandwidth?

Answers to questions like this one lead to contributions upstream in etcd and core Kubernetes from AWS service teams. Customers and others often report these kinds of issues to the project as well, but the nature of the problem isn’t clear until it’s viewed on 1,000 nodes and 200,000 objects of a certain kind. AWS engineers diagnose what’s going on, put together troubleshooting information, and collate information into proposals on how to fix the problem(s) to upstream to Kubernetes. AWS likes to spearhead fixing issues that arise from running the projects at scale.

Key AWS contributions

AWS contributes to many Kubernetes sub projects and SIGs. For example, Micah Hausler and Sri Saran Balaji Vellore Rajakumar serve on the Kubernetes Security Response Committee (SRC), Davanum Srinivas (Dims) chairs SIG-Architecture and SIG-k8s-infra, and Nick Turner is a chair in SIG-cloud-provider.  Key contributions have gone into projects including containerd, Cortex, cdk8s, CNI, nerdctl and Prometheus. Innovations have also been substantial and include TorchServe, improved ARM support through AWS Graviton, and the Virtual GPU plugin. However, this is not an exhaustive or complete list of AWS contributions and innovations in the cloud native community.

On containerd, for example, AWS employs two maintainers who contribute features and help ensure the project’s general health and security. Key contributions from AWS engineers to the containerd project include OpenTelemetry integration in the 1.7.0 release, improved tracing, and improved fuzzing integration.

“It’s been awesome to see the growth on the container runtime team here at AWS these past few years. I love to see the eagerness to learn not just *how* to contribute, but how to do it well and really benefit the broader community,” said Phil Estes, a principal engineer at AWS and a containerd maintainer.

Nerdctl, a Docker-compatible CLI for containerd and a containerd sub-project, is used by other open source projects such as Lima, Finch, and Rancher Desktop. AWS engineers significantly improved nerdctl’s compose support by adding 11 out of 13 missing compose commands. We enhanced nerdctl’s image signing/verification support by contributing cosign support for nerdctl compose, and notation support for nerdctl. And engineer Jin Dong recently became the first reviewer for the project from AWS.

AWS services are also standardizing on OpenTelemetry, a set of open source tools and standards for collecting metrics, logs, and traces to measure application performance. AWS Distro for OpenTelemetry (ADOT), OpenSearch, and CloudWatch are all building on OpenTelemetry and contribute back to the upstream project. All ADOT code is 100% open source and contributed upstream. Key contributions include adding functionality to upstream observability components such as the OpenTelemetry language SDKs, collectors, and agents.

“Amazon is the fourth largest contributor to OpenTelemetry with a dedicated maintainer and many contributors working on the project. A key contribution has been improving collector and metric stability, including improved Prometheus interoperability with OpenTelemetry,” said Taber.

A fourth example is Cortex where AWS is the top supporter of the project and employs three maintainers. As AWS runs this project at scale, engineers have the opportunity to identify and fix scaling cliffs before they become a problem for the rest of the community. Some of the key contributions are new features and performance improvements. Examples include partition compactor, Ring DynamoDB Multikey KV, out of order samples ingestion, snappy-block gRPC compression, ARM images, and Thanos PromQL engine integration.

We have also contributed bug fixes to Thanos, a tool for setting up highly available Prometheus instances with long term storage. Thanos is a CNCF incubating project which Cortex depends on. We participated in the development of the new Thanos PromQL engine and open sourced a tool that could use fuzzing for correctness testing which has already caught a few bugs.

AWS employs four maintainers on Tinkerbell, a cloud native open source bare metal provisioning engine for EKS Anywhere and a CNCF Sandbox project. Key contributions include organizing the project roadmap, VLAN support, a Kubernetes native backend, out-of-band management Kubernetes controller, Helm Chart deployment, and Cluster API provider updates.

“Our team has done a lot of work to update the Tinkerbell backend from Postgres to native Kubernetes,” said Taber.

AWS employs three maintainers in Notation, a sub project of Notary under the CNCF, and is the third largest code contributor to Notary. Notation enables the generation of cryptographic signatures for container images so users can verify that they come from a trusted source or process. AWS founded the sub project with other contributors to come up with specifications for signature format, generation, verification, and revocation. As part of this work we also defined a process for evaluating signature envelope formats like COSE ensuring that they met a high security bar before they were used in Notation.

AWS employees have either written or reviewed the majority of code contributions for the core Notation libraries and a CLI. AWS also employs a maintainer to Ratify so Kubernetes users can easily enable policies for signature verification with their existing admissions controllers. Similarly we also employ a maintainer to ORAS so signatures can easily be pushed to OCI registries. Notation enables users to define granular trust policies for defining which sources they want to trust, balance deployment safety and security needs, and flexibility on secure signing key storage options.

We have contributed to many other open source projects as well, including Crossplane, for which AWS added support for EKS IRSA in the China region and fixed Amazon Route 53 wildcard support, and Backstage, with AWS Proton and AWS Code Suite (AWS CodeBuild, AWS CodePipeline, and AWS CodeDeploy).

“We’re very excited about doing more development in the open, sharing that with our customers, and working directly in some cases with customers on their needs in open source projects and working together to make the community stronger in the Kubernetes space,” Cooks said.

AWS is open

We want to hear from you. AWS engineers are open to helping community members through collaboration and contribution opportunities. Tell us how we can help meet your needs.

AWS engineers, solutions architects, and product managers are hanging out on the Kubernetes community and the CNCF community Slack channels. Channels where you can reach out to us include the provider AWS channel and Karpenter channel, and the AWS controllers for Kubernetes channel on the Kubernetes Slack.

Find us and tell us what you’d like us to work on, or let us know about a particular issue you found in one of these upstream projects that you think our engineers can help move the needle on. Come find us and talk to us in the CNCF’s AWS Slack channel and join us for our virtual Container Day on April 18, before KubeCon EU.
