The history and future roadmap of the AWS CloudFormation Registry

AWS CloudFormation is an Infrastructure as Code (IaC) service that allows you to model your cloud resources in template files that can be authored or generated in a variety of languages. You can manage stacks that deploy those resources via the AWS Management Console, the AWS Command Line Interface (AWS CLI), or the API. CloudFormation helps customers to quickly and consistently deploy and manage cloud resources, but like all IaC tools, it faced challenges keeping up with the rapid pace of innovation of AWS services. In this post, we will review the history of the CloudFormation registry, which is the result of a strategy we developed to address scaling and standardization, as well as integration with other leading IaC tools and partner products. We will also give an update on the current state of CloudFormation resource coverage and review the future state, which has a goal of keeping CloudFormation and other IaC tools up to date with the latest AWS services and features.

History

The CloudFormation service was first announced in February of 2011, with sample templates that showed how to deploy common applications like blogs and wikis. At launch, CloudFormation supported 13 out of 15 available AWS services with 48 total resource types. At first, resource coverage was tightly coupled to the core CloudFormation engine, and all development on those resources was done by the CloudFormation team itself. Over the past decade, AWS has grown at a rapid pace, and there are currently 200+ services in total. A challenge over the years has been the coverage gap between what was possible for a customer to achieve using AWS services, and what was possible to define in a CloudFormation template.

It became obvious that we needed a change in strategy to scale resource development in a way that could keep up with the rapid pace of innovation set by hundreds of service teams delivering new features on a daily basis. Over the last decade, our pace of innovation has increased nearly 40-fold, with 80 significant new features launched in 2011 versus more than 3,000 in 2021. Since CloudFormation was a key adoption driver (or blocker) for new AWS services, those teams needed a way to create and manage their own resources. The goal was to enable day one support of new services at the time of launch with complete CloudFormation resource coverage.

In 2016, we launched an internal self-service platform that allowed service teams to control their own resources. This began to solve the scaling problems inherent in the prior model where the core CloudFormation team had to do all the work themselves. The benefits went beyond simply distributing developer effort, as the service teams have deep domain knowledge on their products, which allowed them to create more effective IaC components. However, as we developed resources on this model, we realized that additional design features were needed, such as standardization that could enable automatic support for features like drift detection and resource imports.

We embarked on a new project to address these concerns, with the goal of improving the internal developer experience as well as providing a public registry where customers could use the same programming model to define their own resource types. We realized that it wasn’t enough to simply make the new model available; we had to evangelize it with a training campaign, conduct engineering boot camps, build better tooling like dashboards and deployment pipeline templates, and produce comprehensive onboarding documentation. Most importantly, we made CloudFormation support a required item on the feature launch checklist for new services, a requirement that goes beyond documentation and is built into internal release tooling (exceptions to this requirement are rare as training and awareness around the registry have improved over time). This was a prime example of one of the maxims we repeat often at Amazon: good mechanisms are better than good intentions.

In 2019, we made this new functionality available to customers when we announced the CloudFormation registry, a capability that allowed developers to create and manage private resource types. We followed up in 2021 with the public registry where third parties, such as partners in the AWS Partner Network (APN), can publish extensions. The open source resource model that customers and partners use to publish third-party registry extensions is the same model used by AWS service teams to provide CloudFormation support for their features.

Once a service team onboards their resources to the new resource model and builds the expected Create, Read, Update, Delete, and List (CRUDL) handlers, managed experiences like drift detection and resource import are all supported with no additional development effort. One recent example of day-one CloudFormation support for a popular new feature was Lambda Function URLs, which offered a built-in HTTPS endpoint for single-function microservices. We also migrated the Amazon Relational Database Service (Amazon RDS) Database Instance resource (AWS::RDS::DBInstance) to the new resource model in September 2022, and within a month, Amazon RDS delivered support for Amazon Aurora Serverless v2 in CloudFormation. This accelerated delivery is possible because teams can now publish independently by taking advantage of the decentralized registry ownership model.

Current State

We are building out future innovations for the CloudFormation service on top of this new standardized resource model so that customers can benefit from a consistent implementation of event handlers. We built AWS Cloud Control API on top of this new resource model. Cloud Control API takes the Create-Read-Update-Delete-List (CRUDL) handlers written for the new resource model and makes them available as a consistent API for provisioning resources. APN partner products such as HashiCorp Terraform, Pulumi, and Red Hat Ansible use Cloud Control API to stay in sync with AWS service launches without recurring development effort.

Figure 1. Cloud Control API Resource Handler Diagram
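To illustrate how consistent that surface is, here is a minimal AWS CLI sketch that exercises the same CRUDL handlers through Cloud Control API; the resource type, identifier, and property values are placeholders chosen for illustration.

# List resources of a given type through the uniform Cloud Control API surface
$ aws cloudcontrol list-resources --type-name AWS::Logs::LogGroup

# Read a single resource; the identifier format depends on the resource type
$ aws cloudcontrol get-resource --type-name AWS::Logs::LogGroup --identifier my-example-log-group

# Create a resource by describing its desired state as a JSON document
$ aws cloudcontrol create-resource --type-name AWS::Logs::LogGroup \
    --desired-state '{"LogGroupName": "my-example-log-group", "RetentionInDays": 14}'

Because every supported resource type is reached through the same handful of verbs, a tool that integrates with Cloud Control API once picks up new resource types automatically.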

Besides third-party application support, the public registry can also be used by the developer community to create useful extensions on top of AWS services. A common solution to extending the capabilities of CloudFormation resources is to write a custom resource, which generally involves inline AWS Lambda function code that runs in response to CREATE, UPDATE, and DELETE signals during stack operations. Some of those use cases can now be solved by writing a registry extension resource type instead. For more information on custom resources and resource types, and the differences between the two, see Managing resources using AWS CloudFormation Resource Types.

CloudFormation Registry modules, which are building blocks authored in JSON or YAML, give customers a way to replace fragile copy-paste template reuse with template snippets that are published in the registry and consumed as if they were resource types. Best practices can be encapsulated and shared across an organization, which allows infrastructure developers to easily adhere to those best practices using modular components that abstract away the intricate details of resource configuration.
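As a quick sketch of how you might explore modules from the AWS CLI, the commands below list module extensions and inspect one of them; the module name shown is a hypothetical example rather than a published module.

# Browse publicly published modules in the registry
$ aws cloudformation list-types --type MODULE --visibility PUBLIC --max-results 20

# List module extensions registered privately in this account
$ aws cloudformation list-types --type MODULE --visibility PRIVATE

# Inspect the schema of a specific module (hypothetical name for illustration)
$ aws cloudformation describe-type --type MODULE --type-name "MyOrg::Compute::WebServer::MODULE"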

CloudFormation Registry hooks give security and compliance teams a vital tool to validate stack deployments before any resources are created, modified, or deleted. An infrastructure team can activate hooks in an account to ensure that stack deployments cannot avoid or suppress preventative controls implemented in hook handlers. Provisioning tools that are strictly client-side do not have this level of enforcement.
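As a sketch of how a platform team might turn a hook on account-wide, the commands below activate a hook from the registry and then set its configuration; the type name and publisher ID are placeholders, and the configuration document follows the published hook configuration schema.

# Activate a hook from the registry (type name and publisher ID are placeholders)
$ aws cloudformation activate-type --type HOOK \
    --type-name "ExampleOrg::Compliance::EncryptionHook" \
    --publisher-id "00000000000000000000000000000000"

# Configure the hook to evaluate all stacks and fail non-compliant operations
$ aws cloudformation set-type-configuration --type HOOK \
    --type-name "ExampleOrg::Compliance::EncryptionHook" \
    --configuration '{"CloudFormationConfiguration": {"HookConfiguration": {"TargetStacks": "ALL", "FailureMode": "FAIL", "Properties": {}}}}'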

A useful by-product of publishing a resource type to the public registry is that you get automatic support for the AWS Cloud Development Kit (CDK) via an experimental open source repository on GitHub called cdk-cloudformation. In large organizations it is typical to see a mix of CloudFormation deployments using declarative templates and deployments that make use of the CDK in languages like TypeScript and Python. By publishing reusable resource types to the registry, all of your developers can benefit from higher-level abstractions, regardless of the tool they choose to create and deploy their applications. (Note that the cdk-cloudformation project is still considered a developer preview and is subject to change.)

If you want to see if a given CloudFormation resource is on the new registry model or not, check if the provisioning type is either Fully Mutable or Immutable by invoking the DescribeType API and inspecting the ProvisioningType response element.

Here is a sample CLI command that gets a description for the AWS::Lambda::Function resource, which is on the new registry model.

$ aws cloudformation describe-type --type RESOURCE \
    --type-name AWS::Lambda::Function | grep ProvisioningType

"ProvisioningType": "FULLY_MUTABLE",

The difference between FULLY_MUTABLE and IMMUTABLE is the presence of the Update handler. FULLY_MUTABLE types include an update handler to process updates to the type during stack update operations, whereas IMMUTABLE types do not include an update handler, so the type can’t be updated and must instead be replaced during stack update operations. Legacy resource types will be NON_PROVISIONABLE.

Opportunities for improvement

As we continue to strive towards our ultimate goal of achieving full feature coverage and a complete migration away from the legacy resource model, we are constantly identifying opportunities for improvement. We are currently addressing feature gaps in supported resources, such as tagging support for EC2 VPC Endpoints and boosting coverage for resource types to support drift detection, resource import, and Cloud Control API. We have fully migrated more than 130 resources, and acknowledge that there are many left to go, and the migration has taken longer than we initially anticipated. Our top priority is to maintain the stability of existing stacks—we simply cannot break backwards compatibility in the interest of meeting a deadline, so we are being careful and deliberate. One of the big benefits of a server-side provisioning engine like CloudFormation is operational stability—no matter how long ago you deployed a stack, any future modifications to it will work without needing to worry about upgrading client libraries. We remain committed to streamlining the migration process for service teams and making it as easy and efficient as possible.

The developer experience for creating registry extensions has some rough edges, particularly for languages other than Java, which is the language of choice for AWS service teams building their resource types. It needs to be easier to author schemas, write handler functions, and test the code to make sure it performs as expected. We are devoting more resources to the maintenance of the CLI and plugins for Python, TypeScript, and Go. Our response times to issues and pull requests in these and other repositories in the aws-cloudformation GitHub organization have not been as fast as they should be, and we are making improvements. One example is the cloudformation-cli repository, where we have merged more than 30 pull requests since October of 2022.

To keep up with progress on resource coverage, check out the CloudFormation Coverage Roadmap, a GitHub project where we catalog all of the open issues to be resolved. You can submit bug reports and feature requests related to resource coverage in this repository and keep tabs on the status of open requests. One of the steps we took recently to improve responses to feature requests and bugs reported on GitHub is to create a system that converts GitHub issues into tickets in our internal issue tracker. These tickets go directly to the responsible service teams—an example is the Amazon RDS resource provider, which has hundreds of merged pull requests.

We have recently announced a new GitHub repository called community-registry-extensions where we are managing a namespace for public registry extensions. You can submit and discuss new ideas for extensions and contribute to any of the related projects. We handle the testing, validation, and deployment of all resources under the AwsCommunity:: namespace, which can be activated in any AWS account for use in your own templates.
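Activating one of these community extensions in your account is a single CLI call; in this sketch the type name and publisher ID are placeholders that you would replace with the values listed in the community-registry-extensions repository.

# Activate a community-published resource type for use in your own templates
$ aws cloudformation activate-type --type RESOURCE \
    --type-name "AwsCommunity::<Service>::<Resource>" \
    --publisher-id "<publisher-id-listed-in-the-repository>"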

To get started with the CloudFormation registry, visit the user guide, and then dive into the detailed developer guide for information on how to use the CloudFormation Command Line Interface (CFN-CLI) to write your own resource types, modules, and hooks.
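A typical development loop with the CFN-CLI looks roughly like the following; this sketch assumes the Python plugin, and the final step registers the type as a private extension in your account.

# Install the CloudFormation CLI and the Python language plugin
$ pip install cloudformation-cli cloudformation-cli-python-plugin

# Scaffold a new extension project (prompts for the type name and language)
$ cfn init

# Regenerate code from the resource schema after editing it
$ cfn generate

# Run the contract tests against your handlers (uses SAM CLI and Docker locally)
$ cfn test

# Package the handlers and register the type as a private extension
$ cfn submit --set-default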

We recently created a new Discord server dedicated to CloudFormation. Please join us to ask questions, discuss best practices, provide feedback, or just hang out! We look forward to seeing you there.

Conclusion

In this post, we hope you gained some insights into the history of the CloudFormation registry, and the design decisions that were made during our evolution towards a standardized, scalable model for resource development that can be shared by AWS service teams, customers, and APN partners. Some of the lessons that we learned along the way might be applicable to complex design initiatives at your own company. We hope to see you on Discord and GitHub as we build out a rich set of registry resources together!

About the authors:

Eric Beard

Eric is a Solutions Architect at Amazon Web Services in Seattle, Washington, where he leads the field specialist group for Infrastructure as Code. His technology career spans two decades, preceded by service in the United States Marine Corps as a Russian interpreter and arms control inspector.

Rahul Sharma

Rahul is a Senior Product Manager-Technical at Amazon Web Services with over two years of product management spanning AWS CloudFormation and AWS Cloud Control API.

Announcing General Availability of Amazon CodeCatalyst

We are pleased to announce that Amazon CodeCatalyst is now generally available. CodeCatalyst is a unified software development service that brings together everything teams need to get started planning, coding, building, testing, and deploying applications on AWS. CodeCatalyst was designed to make it easier for developers to spend more time developing application features and less time setting up project tools, creating and managing continuous integration and continuous delivery (CI/CD) pipelines, provisioning and configuring various development and deployment environments, and onboarding project collaborators. You can learn more and get started building in minutes on the AWS Free Tier at the CodeCatalyst website.

Launched in preview at AWS re:Invent in December 2022, CodeCatalyst provides an easy way for professional developers to build and deploy applications on AWS. We built CodeCatalyst based on feedback we received from customers looking for a more streamlined way to build using DevOps best practices. They want a complete software development service that lets them start new projects more quickly and gives them confidence that it will continue delivering a great long term experience throughout their application’s lifecycle.

Do more of what you love, and less of what you don’t

Starting a new project is an exciting time of imagining the possibilities: what can you build and how can you enable your end users to do something that wasn’t possible before? However, the joy of creating something new can also come with anxiety about all of the decisions to be made about tooling and integrations. Once your project is in production, managing tools and wrangling project collaborators can take your focus away from being creative and doing your best work. If you are spending too much time keeping brittle pipelines running and your teammates are constantly struggling with tooling, the day to day experience of building new features can start to feel less than joyful.

That is where CodeCatalyst comes in. It isn’t just about developer productivity – it is about helping developers and teams spend more time using the tools they are most comfortable with. Teams deliver better, more impactful outcomes to customers when they have more freedom to focus on their highest-value work and have to concern themselves less with activities that feel like roadblocks. Everything we do stems from that premise, and today’s launch marks a major milestone in helping to enable developers to have a better DevOps experience on AWS.

How CodeCatalyst delivers a great experience

There are four foundational elements of CodeCatalyst that are designed to help minimize distraction and maximize joy in the software development process: blueprints for quick project creation, actions-based CI/CD automation for managing day-to-day software lifecycle tasks, remote Dev Environments for a consistent build experience, and project and issue management for more streamlined team collaboration.

Blueprints get you started quickly. CodeCatalyst blueprints set up an application code repository (complete with a working sample app), define cloud infrastructure, and run pre-configured CI/CD workflows for your project. Blueprints bring together the elements that are necessary both to begin a new project and deploy it into production. Blueprints can help to significantly reduce the time it takes to set up a new project. They are built by AWS for many use cases, and you can configure them with the programming languages and frameworks that you need both for your application and the underlying infrastructure-as-code. When it comes to incorporating existing tools like Jira or GitHub, CodeCatalyst has extensions that you can use to integrate them into your projects from the beginning without a lot of extra effort. Learn more about blueprints.

“CodeCatalyst helps us spend more time refining our customers’ build, test, and deploy workflows instead of implementing the underlying toolchains,” said Sean Bratcher, CEO of Buildstr. “The tight integration with AWS CDK means that definitions for infrastructure, environments, and configs live alongside the applications themselves as first-class code. This helps reduce friction when integrating with customers’ broader deployment approach.”

Actions-based CI/CD workflows take the pain out of pipeline management. CI/CD workflows in CodeCatalyst run on flexible, managed infrastructure. When you create a project with a blueprint, it comes with a complete CI/CD pipeline composed of actions from the included actions library. You can modify these pipelines with an action from the library or you can use any GitHub Action directly in the project to edit existing pipelines or build new ones from scratch. CodeCatalyst makes composing these actions into pipelines easier: you can switch back and forth between a text-based editor for declaring which actions you want to use through YAML and a visual drag-and-drop pipeline editor. Updating CI/CD workflows with new capabilities is a matter of incorporating new actions. Having CodeCatalyst create pipelines for you, based on your intent, means that you get the benefits of CI/CD automation without the ongoing pain of maintaining disparate tools.
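To make the YAML side of that editing experience concrete, here is a small sketch of a workflow definition checked into a repository; the file path, trigger, and action identifier follow the CodeCatalyst workflow syntax, but treat the exact values as illustrative and check the workflow reference for the current schema.

# Sketch: add a minimal CodeCatalyst workflow definition to the repository
$ mkdir -p .codecatalyst/workflows
$ cat > .codecatalyst/workflows/hello-build.yaml <<'EOF'
Name: hello-build
SchemaVersion: "1.0"
Triggers:
  - Type: PUSH
    Branches:
      - main
Actions:
  Build:
    Identifier: aws/build@v1
    Inputs:
      Sources:
        - WorkflowSource
    Configuration:
      Steps:
        - Run: echo "Hello from CodeCatalyst"
EOF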

“We needed a streamlined way within AWS to rapidly iterate development of our Reading Partners Connects e-learning platform while maintaining the highest possible quality standards,” said Yaseer Khanani, Senior Product Manager at Reading Partners. “CodeCatalyst’s built-in CI/CD workflows make it easy to efficiently deploy code and conduct testing across a distributed team.”

Automated dev environments make consistency achievable. A big friction point for developers collaborating on a software project is getting everyone on the same set of dependencies and settings on their local machines, and ensuring that all other environments from test to staging to production are also consistent. To help address this, CodeCatalyst has Dev Environments that are hosted in the cloud. Dev Environments are defined using the devfile standard, ensuring that everyone working on a project gets a consistent and repeatable experience. Dev Environments connect to popular IDEs like AWS Cloud9, VS Code, and multiple JetBrains IDEs, giving you a local IDE feel while running in the cloud.

“Working closely with customers in the software developer education space, we value the reproducible and pre-configured environments Amazon CodeCatalyst provides for improving learning outcomes for new developers. CodeCatalyst allows you to personalize student experiences while providing facilitators with control over the entire experience,” said Tia Dubuisson, President of Belle Fleur Technologies.

Issue management and simplified team onboarding streamline collaboration. CodeCatalyst is designed to help provide the benefits of building in a unified software development service by making it easier to onboard and collaborate with teammates. It starts with the process of inviting new collaborators: you can invite people to work together on your project with their email address, bypassing the need for everyone to have an individual AWS account. Once they have access, collaborators can see the history and context of the project and can start contributing by creating a Dev Environment.

CodeCatalyst also has built-in issue management that is tied to your code repo, so that you can assign tasks such as code reviews and pull requests to teammates and help track progress using agile methodologies right in the service. As with the rest of CodeCatalyst, collaboration comes without the distraction of managing separate services with separate logins and disparate commercial agreements. Once you give a new teammate access, they can quickly start contributing.

New to CodeCatalyst since the Preview launch

Along with the announcement of general availability, we are excited to share a few new CodeCatalyst features. First, you can now create a new project from an existing GitHub repository. In addition, CodeCatalyst Dev Environments now support GitHub repositories, allowing you to work on code stored in GitHub.

Second, CodeCatalyst Dev Environments now support Amazon CodeWhisperer. CodeWhisperer is an artificial intelligence (AI) coding companion that generates real-time code suggestions in your integrated development environment (IDE) to help you more quickly build software. CodeWhisperer is currently supported in CodeCatalyst Dev Environments using AWS Cloud9 or Visual Studio Code.

Third, Amazon CodeCatalyst recently added support to run workflow actions using on-demand or pre-provisioned compute powered by AWS Graviton processors. AWS Graviton processors are designed by AWS to deliver the best price performance for your cloud workloads running in Amazon Elastic Compute Cloud (Amazon EC2). You can use workflow actions running on AWS Graviton processors to build applications that target the Arm architecture, create multi-architecture containers, and modernize legacy applications, all while reducing costs.

Finally, the library of CodeCatalyst blueprints is continuously growing. The CodeCatalyst preview release included blueprints for common workloads like single-page web applications, serverless applications, and many others. In addition, we have recently added blueprints for Static Websites with Hugo and Jekyll, as well as Intelligent Document Processing workflows.

Learn more about CodeCatalyst at Developer Innovation Day

Next Wednesday, April 26th, we are hosting Developer Innovation Day, a free 7-hour virtual event that is all about helping developers and teams learn to be productive and collaborate, from discovery to delivery to running software and building applications. You will discover how the breadth and depth of AWS tools, combined with the right practices, can unlock your team’s ability to find success and take opportunities from ideas to impact.

CodeCatalyst plays a big part in Developer Innovation Day, with five sessions designed to help you see real examples of how you can spend more time doing the work you love best! Get an overview of the service, see how to deploy a working static website in minutes, learn how to collaborate effectively with teammates, and more.

Try CodeCatalyst

Ready to try CodeCatalyst? You can get started on the AWS Free Tier today and quickly deploy a blueprint with working sample code. If you would like to learn more, you can read through a collection of DevOps blogs about CodeCatalyst or read the documentation. We can’t wait to see how you innovate with CodeCatalyst!

Enabling DevSecOps with Amazon CodeCatalyst

DevSecOps is the practice of integrating security testing at every stage of the software development process. Amazon CodeCatalyst includes tools that encourage collaboration between developers, security specialists, and operations teams to build software that is both efficient and secure. DevSecOps brings cultural transformation that makes security a shared responsibility for everyone who is building the software.

Introduction

In a prior post in this series, Maintaining Code Quality with Amazon CodeCatalyst Reports, I discussed how developers can quickly configure test cases, run unit tests, set up code coverage, and generate reports using CodeCatalyst’s workflow actions. This was done through the lens of Maxine, the main character of Gene Kim’s The Unicorn Project. In the story, Maxine meets Purna, the QA and Release Manager, and Shannon, a Security Engineer. Everyone has the same common goal: integrate security into every stage of the Software Development Lifecycle (SDLC) to ensure secure code deployments. The issue Maxine faces is that security testing is not automated, and the separation of responsibilities by role leads to project stagnation.

In this post, I will focus on how DevSecOps teams can use Amazon CodeCatalyst to easily integrate and automate security using CodeCatalyst workflows. I’ll start by checking for vulnerabilities using OWASP dependency checker and Mend SCA. Then, I’ll conduct Static Analysis (SA) of source code using Pylint. I will also outline how DevSecOps teams can influence the outcome of a build by defining success criteria for Software Composition Analysis (SCA) and Static Analysis actions in the workflow. Last, I’ll show you how to gain insights from CodeCatalyst reports and surface potential issues to development teams through CodeCatalyst Issues for faster remediation.

Prerequisites

If you would like to follow along with this walkthrough, you will need to:

Have an AWS Builder ID for signing in to CodeCatalyst.
Belong to a CodeCatalyst space and have the Space administrator role assigned to you in that space. For more information, see Creating a space in CodeCatalyst, Managing members of your space, and Space administrator role.
Have an AWS account associated with your space and have the IAM role in that account. For more information about the role and role policy, see Creating a CodeCatalyst service role.
Have a Mend Account (required for the optional Mend Section)

Walkthrough

To follow along, you can re-use a project you created previously, or you can refer to a previous post that walks through creating a project using the Modern Three-tier Web Application blueprint. Blueprints provide sample code and CI/CD workflows to help you get started easily across different combinations of programming languages and architectures. The back-end code for this project is written in Python and the front-end code is written in JavaScript.

Figure 1. Modern Three-tier Web Application architecture including a presentation, application and data layer

Once the project is deployed, CodeCatalyst opens the project overview. Select CI/CD → Workflows → ApplicationDeploymentPipeline to view the current workflow.

Figure 2. ApplicationDeploymentPipeline

Modern applications use a wide array of open-source dependencies to speed up feature development, but sometimes these dependencies have unknown exploits within them. As a DevSecOps engineer, I can easily edit this workflow to scan for those vulnerable dependencies to ensure I’m delivering secure code.

Software Composition Analysis (SCA)

Software composition analysis (SCA) is a practice in information technology and software engineering for analyzing custom-built software applications to detect embedded open-source components and determine whether they are up to date, contain security flaws, or carry licensing requirements. For this walkthrough, I’ll highlight two SCA methods:

I’ll use the open-source OWASP Dependency-Check tool to scan for vulnerable dependencies in my application. It does this by determining if there is a Common Platform Enumeration (CPE) identifier for a given dependency. If found, it will generate a report linking to the associated Common Vulnerabilities and Exposures (CVE) entries.
I’ll use CodeCatalyst’s Mend SCA Action to integrate Mend for SCA into my CI/CD workflow.

Note that developers can replace either of these with a tool of their choice so long as that tool outputs an SCA report format supported by CodeCatalyst.

Software Composition Analysis using OWASP Dependency Checker

To get started, I select Edit at the top-right of the workflows tab. By default, CodeCatalyst opens the YAML tab. I change to the Visual tab to visually edit the workflow and add a CodeCatalyst Action by selecting “+Actions” (1) and then “+” (2). Next select the Configuration (3) tab and edit the Action Name (4). Make sure to select the check mark after you’re done.

Figure 3. New Action Initial Configuration

Scroll down in the Configuration tab to Shell commands. Here, copy and paste the following command snippets, which run when the action is invoked.

#Set Source Repo Directory to variable
- Run: sourceRepositoryDirectory=$(pwd)
#Install Node Dependencies
- Run: cd web && npm install
#Install known vulnerable dependency (This is for Demonstrative Purposes Only)
- Run: npm install [email protected]
#Go to parent directory and download OWASP dependency-check CLI tool
- Run: cd .. && wget https://github.com/jeremylong/DependencyCheck/releases/download/v8.1.2/dependency-check-8.1.2-release.zip
#Unzip file
- Run: unzip dependency-check-8.1.2-release.zip
#Navigate to dependency-check script location
- Run: cd dependency-check/bin
#Execute dependency-check shell script. Outputs in SARIF format
- Run: ./dependency-check.sh --scan $sourceRepositoryDirectory/web -o $sourceRepositoryDirectory/web/vulnerabilities -f SARIF --disableYarnAudit

These commands will install the node dependencies, download the OWASP dependency-check tool, and run it to generate findings in a SARIF file. Note the third command, which installs a module with known vulnerabilities (This is for demonstrative purposes only).

On the Outputs (1) tab, I change the Report prefix (2) to owasp-frontend. Then I set the Success criteria (3) for Vulnerabilities to 0 – Critical (4). This configuration will stop the workflow if any critical vulnerabilities are found.

Figure 4: owasp-dependecy-check-frontend

It is a best practice to scan for vulnerable dependencies before deploying resources so I’ll set my owasp-dependency-check-frontend action as the first step in the workflow. Otherwise, I might accidentally deploy vulnerable code. To do this, I select the Build (1) action group and set the Depends on (2) dropdown to my owasp-dependency-check-frontend action. Now, my action will run before any resources are built and deployed to my AWS environment. To save my changes and run the workflow, I select Commit (3) and provide a commit message.

Figure 5: Setting OWASP as the First Workflow Action

Amazon CodeCatalyst shows me the state of the workflow run in real time. After the workflow completes, I see that the action has entered a failed state. If I were a QA Manager like Purna from The Unicorn Project, I would want to see why the action failed. On the left-hand navigation bar, I select Reports → owasp-frontend-web/vulnerabilities/dependency-check-report.sarif for more details.

Figure 6: SCA Report Overview

This report view provides metadata such as the workflow name, run ID, action name, repository, and the commit ID. I can also see the report status, a bar graph of vulnerabilities grouped by severity, the number of libraries scanned, and a Findings panel. I had set the success criteria for this report to 0 – Critical so it failed because 1 Critical vulnerability was found. If I select a specific finding ID, I can learn more about that specific finding and even view it on the National Vulnerability Database website.

Figure 7: Critical Vulnerability CVE Finding

Now I can raise this issue with the development team through the Issues board on the left-hand navigation panel. See this previous post to learn more about how teams can collaborate in CodeCatalyst.

Note: Let’s remove the [email protected] install command from the owasp-dependency-check-frontend action’s list of commands to allow the workflow to proceed and finish successfully.

Software Composition Analysis using Mend

Mend, formerly known as WhiteSource, is an application security company built to secure today’s digital world. Mend secures all aspects of software, providing automated remediation, prevention, and protection from problem to solution versus only detection and suggested fixes. Find more information about Mend here.

Mend Software Composition Analysis (SCA) can be run as an action within Amazon CodeCatalyst CI/CD workflows, making it easy for developers to perform open-source software vulnerability detection when building and deploying their software projects. This makes it easier for development teams to quickly build and deliver secure applications on AWS.

Getting started with CodeCatalyst and Mend is very easy. After logging in to my Mend Account, I need to create a new Mend Product named Amazon-CodeCatalyst and a Project named mythical-misfits.

Next, I navigate back to my existing workflow in CodeCatalyst and add a new action. However, this time I’ll select the Mend SCA action.

Figure 8: Mend Action

All I need to do now is go to the Configuration tab and set the following values:

Mend Project Name: mythical-misfits
Mend Product Name: Amazon-CodeCatalyst
Mend License Key: You can get the License Key from your Mend account in the CI/CD Integration section. You can get more information from here.

Figure 9: Mend Action Configuration

Then I commit the changes and return to Mend.

Figure 10: Mend Console

After successful execution, Mend will automatically update and show a report similar to the screenshot above. It contains useful information about this project like vulnerabilities, licenses, policy violations, etc. To learn more about the various capabilities of Mend SCA, see the documentation here.

Static Analysis (SA)

Static analysis, also called static code analysis, is a method of debugging that is done by examining the code without executing the program. The process provides an understanding of the code structure and can help ensure that the code adheres to industry standards. Static analysis is used in software engineering by software development and quality assurance teams.

Currently, my workflow does not do static analysis. As a DevSecOps engineer, I can add this as a step to the workflow. For this walkthrough, I’ll create an action that uses Pylint to scan my Python source code for Static Analysis. Note that you can also use other static analysis tools or a GitHub Action like SuperLinter, as covered in this previous post.

Static Analysis using Pylint

After navigating back to CI/CD → Workflows → ApplicationDeploymentPipeline and selecting Edit, I create a new test action. I change the action name to pylint and set the Configuration tab to run the following shell commands:

- Run: pip install pylint
- Run: pylint $PWD --recursive=y --output-format=json:pylint-report.json --exit-zero
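If you want to preview the findings before committing the workflow change, you can run the same scan on your machine; this is a quick local sketch that assumes Python and pip are installed.

# Reproduce the workflow's static analysis step locally
$ pip install pylint
$ pylint $PWD --recursive=y --output-format=json:pylint-report.json --exit-zero

# Skim the findings (each entry includes the message, symbol, path, and line)
$ python -m json.tool pylint-report.json | head -n 40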

On the Outputs tab, I change the Report prefix to pylint. Then I set the Success criteria for Static analysis as shown in the figure below:

Figure 11: Static Analysis Report Configuration

Because static analysis is typically run before anything is executed, the pylint or OWASP action should be the very first action in the workflow; for the sake of this blog, we will use pylint. I select the OWASP or Mend actions I created before, set the Depends on dropdown to my pylint action, and commit the changes. Once the workflow finishes, I can go to Reports > pylint-pylint-report.json for more details.

Figure 12: Pylint Static Analysis Report

The Report status is Failed because more than one bug of high severity or above was detected. On the Results tab, I can view each finding in greater detail, including the severity, the type of finding, the message from the linter, and the specific line where the error originates.

Cleanup

If you have been following along with this workflow, you should delete the resources you deployed so you do not continue to incur charges. First, delete the two stacks that AWS Cloud Development Kit (CDK) deployed using the AWS CloudFormation console in the AWS account you associated when you launched the blueprint. These stacks will have names like mysfitsXXXXXWebStack and mysfitsXXXXXAppStack. Second, delete the project from CodeCatalyst by navigating to Project settings and choosing Delete project.

Conclusion

In this post, I demonstrated how DevSecOps teams can easily integrate security into Amazon CodeCatalyst workflows to automate security testing by checking for vulnerabilities using OWASP dependency checker or Mend through Software Composition Analysis (SCA) of dependencies. I also outlined how DevSecOps teams can configure Static Analysis (SA) reports and use success criteria to influence the outcome of a workflow action.

Imtranur Rahman

Imtranur Rahman is an experienced Sr. Solutions Architect in the WWPS team with 14+ years of experience. Imtranur works with large AWS Global SI partners and helps them build their cloud strategy and drive broad adoption of Amazon’s cloud computing platform. Imtranur specializes in containers, DevSecOps, GitOps, microservices-based applications, hybrid application solutions, and application modernization, and loves innovating on behalf of his customers. He is highly customer obsessed and takes pride in providing the best solutions through his extensive expertise.

Wasay Mabood

Wasay is a Partner Solutions Architect based out of New York. He works primarily with AWS Partners on migration, training, and compliance efforts but also dabbles in web development. When he’s not working with customers, he enjoys window-shopping, lounging around at home, and experimenting with new ideas.

Publish Amazon DevOps Guru Insights to ServiceNow for Incident Management

Amazon DevOps Guru is a fully managed AIOps service that uses machine learning (ML) to quickly identify when applications are behaving outside of their normal operating patterns and generates insights from its findings. These insights generated by Amazon DevOps Guru can be used to alert on-call teams to react to anomalies for mission-critical workloads. Many customers already utilize incident management systems like ServiceNow to identify, analyze, and resolve critical incidents that could impact business operations. ServiceNow is an IT Service Management (ITSM) platform that enables enterprise organizations to improve operational efficiencies. Among its products is Incident Management, which provides a single-pane view and allows customers to restore services and resolve issues quickly.

This blog post will show you how to integrate Amazon DevOps Guru insights with ServiceNow to automatically create and manage Incidents. We will demonstrate how an insight generated by Amazon DevOps Guru for an anomaly can automatically create a ServiceNow Incident, update the incident when there are new anomalies or recommendations from Amazon DevOps Guru, and close the ServiceNow Incident once the insight is resolved by Amazon DevOps Guru.

Overview of solution

This solution uses a combination of event-driven architecture and serverless technologies to integrate DevOps Guru insights with ServiceNow. When an Amazon DevOps Guru insight is created, an Amazon EventBridge rule captures the insight as an event and routes it to an AWS Lambda function target. The Lambda function interacts with ServiceNow using a REST API to create, update, and close an incident for the corresponding DevOps Guru events captured by EventBridge.

The EventBridge rule can be customized to capture all DevOps Guru insights or narrowed down to specific insights. In this blog, we will capture all DevOps Guru insights and perform actions in ServiceNow for the following DevOps Guru events (an example rule is sketched after the list):

DevOps Guru New Insight Open
DevOps Guru New Anomaly Association
DevOps Guru Insight Severity Upgraded
DevOps Guru New Recommendation Created
DevOps Guru Insight Closed
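For reference, the sketch below shows what an equivalent EventBridge rule and target look like when created from the CLI; the deployed application creates this rule for you, and the rule name and Lambda function ARN here are placeholders.

# Match all Amazon DevOps Guru insight events (narrow the pattern with detail-type if needed)
$ aws events put-rule --name devops-guru-insights-to-servicenow \
    --event-pattern '{"source": ["aws.devops-guru"]}'

# Route matched events to the connector Lambda function (placeholder ARN)
$ aws events put-targets --rule devops-guru-insights-to-servicenow \
    --targets 'Id=servicenow-connector,Arn=arn:aws:lambda:us-east-1:111122223333:function:ServiceNowConnector'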

Figure 1: Amazon DevOps Guru Integration with ServiceNow using Amazon EventBridge and AWS Lambda

Solution Implementation Steps

Prerequisites

Before you deploy the solution and proceed with this walkthrough, you should have the following prerequisites:

Gather the hostname for your ServiceNow cloud instance. If you do not have a ServiceNow instance, you can request a developer instance through the ServiceNow Developer page.
Gather the credentials of a ServiceNow user who has permissions to make REST API calls to ServiceNow, specifically to the Table API. If you don’t have a user provisioned, you can create one by following the steps in Getting started with the REST API in the ServiceNow documentation.
Create a secret in Secrets Manager to store the ServiceNow credentials created in the previous step (see the example command after this list). You can choose any name for the secret, but it should have two key/value pairs: one for the username and the other for the password.
Enable DevOps Guru for your applications by following these steps or you can follow this blog to deploy a sample serverless application that can be used to generate DevOps Guru insights for anomalies detected in the application.
Install and set up SAM CLI – Install the SAM CLI

Download and set up Java. The version should match the runtime that you defined in the SAM template.yaml serverless function configuration – Install the Java SE Development Kit 11

Maven – Install Maven

Docker – Install Docker community edition
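As mentioned in the secret prerequisite above, the ServiceNow credentials can be stored with a single CLI call; the secret name and values in this sketch are placeholders.

# Store the ServiceNow user credentials as two key/value pairs in one secret
$ aws secretsmanager create-secret --name snow-connector-credentials \
    --secret-string '{"username": "servicenow-api-user", "password": "REPLACE_ME"}'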

You have two options to deploy this solution: deploy it from the AWS Serverless Application Repository, or build and deploy it from the command line interface (CLI).

Option 1: Deploy sample ServiceNow Connector App from AWS Serverless Repository

The DevOps Guru ServiceNow Connector application is available in the AWS Serverless Application Repository, which is a managed repository for serverless applications. The application is packaged with an AWS Serverless Application Model (SAM) template, a definition of the AWS resources used, and a link to the source code.

Follow the steps below to quickly deploy this serverless application in your AWS account:

1. Log in to the AWS Management Console of the account to which you plan to deploy this solution.
2. Go to the DevOps Guru ServiceNow Connector application in the AWS Serverless Application Repository and click Deploy.

Figure 2: Deploy solution through AWS Serverless Repository

3. The Lambda application deployment screen will be displayed, where you can enter the ServiceNow hostname (do not include the https prefix) and the secret name you created in the prerequisite steps. Click the Deploy button.

Figure 3: AWS Lambda Application Settings

4. After successful deployment, the AWS Lambda Application page will display the “Create complete” status for the serverlessrepo-DevOps-Guru-ServiceNow-Connector application. The CloudFormation template creates four resources:

A Lambda function, which has the logic to integrate with ServiceNow
An EventBridge rule for the DevOps Guru insights
A Lambda permission
An IAM role

5. Now you can skip Option 2 and follow the steps in the “Test the Solution” section to trigger some DevOps Guru insights and validate that the incidents are created and updated in ServiceNow.

Option 2: Build and Deploy sample ServiceNow Connector App using AWS SAM Command Line Interface

As you have seen above, you can directly deploy the sample serverless application from the Serverless Application Repository with one-click deployment. Alternatively, you can choose to clone the GitHub source repository and deploy using the SAM CLI from your terminal.

The Serverless Application Model Command Line Interface (SAM CLI) is an extension of the AWS CLI that adds functionality for building and testing serverless applications. The CLI provides commands that enable you to verify that AWS SAM template files are written according to the specification, invoke Lambda functions locally, step-through debug Lambda functions, package and deploy serverless applications to the AWS Cloud, and so on. For details about how to use the AWS SAM CLI, including the full AWS SAM CLI Command Reference, see AWS SAM reference – AWS Serverless Application Model.

Before you proceed, make sure you have completed the Prerequisites section in the beginning which should set up the AWS SAM CLI, Maven and Java on your local terminal. You also need to install and set up Docker to run your functions in an Amazon Linux environment that matches Lambda.

Follow the steps below to build and deploy this serverless application using AWS SAM CLI in your AWS account:

1. Clone the source code from the GitHub repo:

$ git clone https://github.com/aws-samples/amazon-devops-guru-connector-servicenow.git

2. Before you build the resources defined in the SAM template, you can use the validate command below, which runs cfn-lint validations on your SAM JSON/YAML template:

$ sam validate --lint --template template.yaml

3. Build the application with SAM CLI:

$ cd amazon-devops-guru-connector-servicenow
$ sam build

If everything is set up correctly, you should have a success message like shown below:

Build Succeeded

Built Artifacts : .aws-sam/build
Built Template : .aws-sam/build/template.yaml

Commands you can use next
=========================
[*] Validate SAM template: sam validate
[*] Invoke Function: sam local invoke
[*] Test Function in the Cloud: sam sync --stack-name {{stack-name}} --watch
[*] Deploy: sam deploy --guided

4. Deploy the application with SAM CLI:

$ sam deploy --guided

This command will package and deploy your application to AWS, with a series of prompts that you should respond to as shown below:

Stack Name: The name of the stack to deploy to CloudFormation. This should be unique to your account and region, and a good starting point would be something matching your project name – amazon-devops-guru-connector-servicenow

AWS Region: The AWS region you want to deploy your application to.

Parameter ServiceNowHost []: The ServiceNow host name/instance URL you set up. Example: dev92031.service-now.com

Parameter SecretName []: The secret name that you set up for ServiceNow credentials in the Prerequisites.

Confirm changes before deploy: If set to yes, any change sets will be shown to you before execution for manual review. If set to no, the AWS SAM CLI will automatically deploy application changes.

Allow SAM CLI IAM role creation: Many AWS SAM templates, including this example, create AWS IAM roles required for the AWS Lambda function(s) included to access AWS services. By default, these are scoped down to minimum required permissions. To deploy an AWS CloudFormation stack which creates or modifies IAM roles, the CAPABILITY_IAM value for capabilities must be provided. If permission isn’t provided through this prompt, to deploy this example you must explicitly pass --capabilities CAPABILITY_IAM to the sam deploy command.

Disable rollback [y/N]: If set to Y, preserves the state of previously provisioned resources when an operation fails.

Save arguments to configuration file (samconfig.toml): If set to yes, your choices will be saved to a configuration file inside the project, so that in the future you can just re-run sam deploy without parameters to deploy changes to your application.

After you enter your parameters, you should see something like the following if you chose to view and confirm change sets. Proceed by entering ‘Y’ to deploy the resources.

Initiating deployment
=====================
Uploading to amazon-devops-guru-connector-servicenow/46bb4841f8f37fd41d3f40f86f31c4d7.template 1918 / 1918 (100.00%)

Waiting for changeset to be created..
CloudFormation stack changeset
-------------------------------------------------------------------------------------------------
Operation                LogicalResourceId                ResourceType                Replacement
-------------------------------------------------------------------------------------------------
+ Add                    FunctionsDevOpsGuruPermission    AWS::Lambda::Permission     N/A
+ Add                    FunctionsDevOpsGuru              AWS::Events::Rule           N/A
+ Add                    FunctionsRole                    AWS::IAM::Role              N/A
+ Add                    Functions                        AWS::Lambda::Function       N/A
-------------------------------------------------------------------------------------------------

Changeset created successfully. arn:aws:cloudformation:us-east-1:123456789012:changeSet/samcli-deploy1669232233/7c97b7f5-369d-400d-89cd-ebabefaa0b57

Previewing CloudFormation changeset before deployment
======================================================
Deploy this changeset? [y/N]:

Once the deployment succeeds, you should be able to see the successful creation of your resources

CloudFormation events from stack operations (refresh every 0.5 seconds)
-------------------------------------------------------------------------------------------------
ResourceStatus        ResourceType                LogicalResourceId                         ResourceStatusReason
-------------------------------------------------------------------------------------------------
CREATE_IN_PROGRESS    AWS::CloudFormation::Stack  amazon-devops-guru-connector-servicenow   User Initiated
CREATE_IN_PROGRESS    AWS::IAM::Role              FunctionsRole                             -
CREATE_IN_PROGRESS    AWS::IAM::Role              FunctionsRole                             Resource creation Initiated
CREATE_COMPLETE       AWS::IAM::Role              FunctionsRole                             -
CREATE_IN_PROGRESS    AWS::Lambda::Function       Functions                                 -
CREATE_IN_PROGRESS    AWS::Lambda::Function       Functions                                 Resource creation Initiated
CREATE_COMPLETE       AWS::Lambda::Function       Functions                                 -
CREATE_IN_PROGRESS    AWS::Events::Rule           FunctionsDevOpsGuru                       -
CREATE_IN_PROGRESS    AWS::Events::Rule           FunctionsDevOpsGuru                       Resource creation Initiated
CREATE_COMPLETE       AWS::Events::Rule           FunctionsDevOpsGuru                       -
CREATE_IN_PROGRESS    AWS::Lambda::Permission     FunctionsDevOpsGuruPermission             -
CREATE_IN_PROGRESS    AWS::Lambda::Permission     FunctionsDevOpsGuruPermission             Resource creation Initiated
CREATE_COMPLETE       AWS::Lambda::Permission     FunctionsDevOpsGuruPermission             -
CREATE_COMPLETE       AWS::CloudFormation::Stack  amazon-devops-guru-connector-servicenow   -
-------------------------------------------------------------------------------------------------

Successfully created/updated stack – amazon-devops-guru-connector-servicenow in us-east-1

You can also use the below command to list the resources deployed by passing in the stack name.

$ sam list resources --stack-name amazon-devops-guru-connector-servicenow

You can also choose to test and debug your function locally with sample events using the SAM CLI local functionality. Test a single function by invoking it directly with a test event. An event is a JSON document that represents the input that the function receives from the event source. Refer to the Invoking Lambda functions locally – AWS Serverless Application Model documentation for more details.

Follow the steps below to test the Lambda function locally with the SAM CLI. You have to create an env.json file with the correct values for your ServiceNow host and the Secrets Manager secret name that was created in the previous step.

Make sure you have created the AWS Secrets Manager secret with the desired name as mentioned in the prerequisites, which should be used here for SECRET_NAME.
Create env.json as below, by replacing the values for SERVICE_NOW_HOST and SECRET_NAME with your real value. These will be set as the local Lambda execution environment variables.

{"Parameters": {"SERVICE_NOW_HOST": "SNOW_HOST", "SECRET_NAME": "SNOW_CREDS"}}

Run the command below to invoke the Lambda function locally with a sample DevOps Guru payload. Remember, for this to work you should have a Docker instance running and the secret created in your AWS account.

$ sam local invoke Functions --event Functions/src/test/Events/CreateIncident.json --env-vars Functions/src/test/Events/env.json

Once you are done with the above steps, move on to “Test the Solution” section below to trigger sample DevOps Guru insights and validate that the incidents are created and updated in ServiceNow.

Test the Solution

To test the solution, we will simulate a DevOps Guru insight; you can do so by following the steps in this blog. After an anomaly is detected in the application, DevOps Guru creates an insight as seen below.

Figure 4: DevOps Guru Insight created for anomalous behavior

For the DevOps Guru insight shown above, a corresponding incident is automatically created in ServiceNow as shown below. In addition to the incident creation, any new anomalies and recommendations from DevOps Guru are also associated with the incident.

Figure 5: Corresponding ServiceNow Incident is created for the DevOps Guru Insight

When the anomalous behavior that generated the DevOps Guru insight is resolved, DevOps Guru automatically closes the insight. The corresponding ServiceNow incident that was created for the insight is also closed, as seen below.

Figure 6: ServiceNow Incident created for DevOps Guru Insight is resolved due to insight closure

Cleaning up

To avoid incurring future charges, delete the resources.

To delete the sample application that you created, use the AWS CLI command below and pass the stack name you provided in the sam deploy step.

$ aws cloudformation delete-stack --stack-name amazon-devops-guru-connector-servicenow

You could also use the AWS CloudFormation Console to delete the stack:

Figure 7: AWS Stack Console with Delete action

Conclusion

This blog post showcased how DevOps Guru continuously monitors resources in a particular region of your AWS account, automatically detects operational issues, predicts impending resource exhaustion, details the likely cause, and recommends remediation actions. It described a custom solution using a serverless integration pattern with AWS Lambda and Amazon EventBridge that connects DevOps Guru insights to ServiceNow, a widely used ITSM and change management tool, thus streamlining service management governance and oversight of AWS services. This solution helps customers who use ServiceNow improve their operational efficiencies and receive customized insights and real-time incident alerts directly from DevOps Guru, providing a single pane of glass to restore services and systems quickly.

This solution was created to help customers who already use ServiceNow Incident Management, if you are already using Incident Manager from AWS Systems Manager, check out how that works with Amazon DevOps Guru here.

To learn more about Amazon DevOps Guru, join us for a free hands-on Immersion Day. Events are virtual and hosted at three global time zones. Register here: April 12th.

About the authors:

Abdullahi Olaoye

Abdullahi is a Senior Cloud Infrastructure Architect at AWS Professional Services where he works with enterprise customers to design and build cloud solutions that solve business challenges. When he’s not working, he enjoys travelling, watching documentaries and listening to history podcasts.

Sreenivas Ganesan

Sreenivas Ganesan is a Sr. DevOps Consultant at AWS experienced in architecting and delivering modernized DevOps solutions for enterprise customers in their journey to AWS Cloud, primarily focused on Infrastructure automation, Security and Compliance, Management and Governance, Provisioning and Orchestration. Outside of work, he enjoys watching new TV series, soccer and spending time with his family outdoors.

Mohan Udyavar

Mohan Udyavar is a Principal Technical Account Manager in the Enterprise Support organization of AWS advising customers in successfully migrating and operating their workloads on AWS. He is primarily focused on the Automotive industry providing prescriptive guidance to customers helping them improve the resilience and operational excellence posture of mission-critical applications. Outside of work, he loves cooking and working on tech projects with his son.

Integrating with GitHub Actions – Amazon CodeGuru in your DevSecOps Pipeline

Many organizations have adopted DevOps practices to streamline and automate software delivery and IT operations. A DevOps model can be adopted without sacrificing security by using automated compliance policies, fine-grained controls, and configuration management techniques. However, one of the key challenges customers face is analyzing code and detecting any vulnerabilities in the code pipeline due to a lack of access to the right tool. Amazon CodeGuru addresses this challenge by using machine learning and automated reasoning to identify critical issues and hard-to-find bugs during application development and deployment, thus improving code quality.

We discussed how you can build a CI/CD pipeline to deploy a web application in our previous post “Integrating with GitHub Actions – CI/CD pipeline to deploy a Web App to Amazon EC2”. In this post, we will use that pipeline to include security checks and integrate it with Amazon CodeGuru Reviewer to analyze and detect potential security vulnerabilities in the code before deploying it.

Amazon CodeGuru Reviewer helps you improve code security and provides recommendations based on common vulnerabilities (OWASP Top 10) and AWS security best practices. CodeGuru analyzes Java and Python code and provides recommendations for remediation. CodeGuru Reviewer detects a deviation from best practices when using AWS APIs and SDKs, and also identifies concurrency issues, resource leaks, security vulnerabilities and validates input parameters. For every workflow run, CodeGuru Reviewer’s GitHub Action copies your code and build artifacts into an S3 bucket and calls CodeGuru Reviewer APIs to analyze the artifacts and provide recommendations. Refer to the code detector library here for more information about CodeGuru Reviewer’s security and code quality detectors.

With GitHub Actions, developers can easily integrate CodeGuru Reviewer into their CI workflows, conducting code quality and security analysis. They can view CodeGuru Reviewer recommendations directly within the GitHub user interface to quickly identify and fix code issues and security vulnerabilities. Any pull request or push to the master branch will trigger a scan of the changed lines of code, and scheduled pipeline runs will trigger a full scan of the entire repository, ensuring comprehensive analysis and continuous improvement.

Solution overview

The solution comprises the following components:

GitHub Actions – Workflow Orchestration tool that will host the Pipeline.

AWS CodeDeploy – AWS service to manage deployment on Amazon EC2 Autoscaling Group.

AWS Auto Scaling – AWS service to help maintain application availability and elasticity by automatically adding or removing Amazon EC2 instances.

Amazon EC2 – Destination Compute server for the application deployment.

Amazon CodeGuru – AWS Service to detect security vulnerabilities and automate code reviews.

AWS CloudFormation – AWS infrastructure as code (IaC) service used to orchestrate the infrastructure creation on AWS.

AWS Identity and Access Management (IAM) OIDC identity provider – Federated authentication service to establish trust between GitHub and AWS to allow GitHub Actions to deploy on AWS without maintaining AWS Secrets and credentials.

Amazon Simple Storage Service (Amazon S3) – Amazon S3 to store deployment and code scan artifacts.

The following diagram illustrates the architecture:

Figure 1. Architecture Diagram of the proposed solution in the blog

1. Developer commits code changes from their local repository to the GitHub repository. In this post, the GitHub action is triggered manually, but this can be automated.
2. GitHub action triggers the build stage.
3. GitHub's OpenID Connect (OIDC) provider uses tokens to authenticate to AWS and access resources.
4. GitHub action uploads the deployment artifacts to Amazon S3.
5. GitHub action invokes Amazon CodeGuru.
6. The source code gets uploaded into an S3 bucket when the CodeGuru scan starts.
7. GitHub action invokes CodeDeploy.
8. CodeDeploy triggers the deployment to Amazon EC2 instances in an Autoscaling group.
9. CodeDeploy downloads the artifacts from Amazon S3 and deploys them to Amazon EC2 instances.

Prerequisites

This blog post is a continuation of our previous post – Integrating with GitHub Actions – CI/CD pipeline to deploy a Web App to Amazon EC2. You will need to set up your pipeline by following the instructions in that blog.

After completing the steps, you should have a local repository with the below directory structure, and one completed Actions run.

Figure 2. Directory structure

To enable automated deployment upon git push, you will need to make a change to your .github/workflows/deploy.yml file. Specifically, you can activate the automation by modifying the following lines in the deploy.yml file:

From:

workflow_dispatch: {}

To:

#workflow_dispatch: {}
push:
branches: [ main ]
pull_request:

Solution walkthrough

The following steps provide a high-level overview of the walkthrough:

Create an S3 bucket for the Amazon CodeGuru Reviewer.
Update the IAM role to include permissions for Amazon CodeGuru.
Associate the repository in Amazon CodeGuru.
Add Vulnerable code.
Update GitHub Actions Job to run the Amazon CodeGuru Scan.
Push the code to the repository.
Verify the pipeline.
Check the Amazon CodeGuru recommendations in the GitHub user interface.

1. Create an S3 bucket for the Amazon CodeGuru Reviewer

When you run a CodeGuru scan, your code is first uploaded to an S3 bucket in your AWS account.

Note that CodeGuru Reviewer expects the S3 bucket name to begin with codeguru-reviewer-.

You can create this bucket using the bucket policy outlined in this CloudFormation template (JSON or YAML) or by following these instructions.

2.  Update the IAM role to add permissions for Amazon CodeGuru

Locate the role created in the prerequisite section, named “CodeDeployRoleforGitHub”.
Next, create an inline policy by following these steps. Give it a name, such as “codegurupolicy”, and add the following permissions to the policy.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "codeguru-reviewer:ListRepositoryAssociations",
                "codeguru-reviewer:AssociateRepository",
                "codeguru-reviewer:DescribeRepositoryAssociation",
                "codeguru-reviewer:CreateCodeReview",
                "codeguru-reviewer:DescribeCodeReview",
                "codeguru-reviewer:ListRecommendations",
                "iam:CreateServiceLinkedRole"
            ],
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "s3:CreateBucket",
                "s3:GetBucket*",
                "s3:List*",
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::codeguru-reviewer-*",
                "arn:aws:s3:::codeguru-reviewer-*/*"
            ],
            "Effect": "Allow"
        }
    ]
}

3.  Associate the repository in Amazon CodeGuru

Follow the instructions here to associate your repo – https://docs.aws.amazon.com/codeguru/latest/reviewer-ug/create-github-association.html

Figure 3. Associate the repository

At this point, you will have completed your initial full analysis run. However, since this is a simple “helloWorld” program, you may not receive any recommendations. In the following steps, you will incorporate vulnerable code and trigger the analysis again, allowing CodeGuru to identify and provide recommendations for potential issues.

4.  Add Vulnerable code

Create a file named application.conf at /aws-codedeploy-github-actions-deployment/spring-boot-hello-world-example.

Add the following content to the application.conf file:

db.default.url="postgres://test-ojxarsxivjuyjc:ubKveYbvNjQ5a0CU8vK4YoVIhl@ec2-54-225-223-40.compute-1.amazonaws.com:5432/dcectn1pto16vi?ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory"
db.default.url=${?DATABASE_URL}
db.default.port="3000"
db.default.datasource.username="root"
db.default.datasource.password="testsk_live_454kjkj4545FD3434Srere7878"
db.default.jpa.generate-ddl="true"
db.default.jpa.hibernate.ddl-auto="create"

5. Update GitHub Actions Job to run Amazon CodeGuru Scan

You will need to add a new job definition to the GitHub Actions YAML file. This new section should be inserted between the Build and Deploy sections for optimal workflow.
Additionally, you will need to adjust the dependency in the deploy section to reflect the new flow: Build -> CodeScan -> Deploy.
Review the sample GitHub Actions code for running a security scan with Amazon CodeGuru Reviewer:

codescan:
  needs: build
  runs-on: ubuntu-latest
  permissions:
    id-token: write
    contents: read
    security-events: write

  steps:
    - name: Download an artifact
      uses: actions/[email protected]
      with:
        name: build-file

    - name: Configure AWS credentials
      id: iam-role
      continue-on-error: true
      uses: aws-actions/[email protected]
      with:
        role-to-assume: ${{ secrets.IAMROLE_GITHUB }}
        role-session-name: GitHub-Action-Role
        aws-region: ${{ env.AWS_REGION }}

    - uses: actions/[email protected]
      if: steps.iam-role.outcome == 'success'
      with:
        fetch-depth: 0

    - name: CodeGuru Reviewer
      uses: aws-actions/[email protected]
      if: ${{ always() }}
      continue-on-error: false
      with:
        s3_bucket: ${{ env.S3bucket_CodeGuru }}
        build_path: .

    - name: Store SARIF file
      if: steps.iam-role.outcome == 'success'
      uses: actions/[email protected]
      with:
        name: SARIF_recommendations
        path: ./codeguru-results.sarif.json

    - name: Upload review result
      uses: github/codeql-action/[email protected]
      with:
        sarif_file: codeguru-results.sarif.json

    - run: |
        echo "Check for critical vulnerability"
        count=$(cat codeguru-results.sarif.json | jq '.runs[].results[] | select(.level == "error") | .level' | wc -l)
        if (( $count > 0 )); then
          echo "There are $count critical findings, hence stopping the pipeline."
          exit 1
        fi

Refer to the complete file below. Note that you will need to replace the following environment variables with your own values:

S3bucket_CodeGuru
AWS_REGION
S3BUCKET

name: Build and Deploy

on:
  #workflow_dispatch: {}
  push:
    branches: [ main ]
  pull_request:

env:
  applicationfolder: spring-boot-hello-world-example
  AWS_REGION: us-east-1 # <replace this with your AWS region>
  S3BUCKET: <replace your bucket name here>
  S3bucket_CodeGuru: codeguru-reviewer-<replace bucket name here> # S3 bucket with "codeguru-reviewer-*" prefix

jobs:
  build:
    name: Build and Package
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/[email protected]
        name: Checkout Repository

      - uses: aws-actions/[email protected]
        with:
          role-to-assume: ${{ secrets.IAMROLE_GITHUB }}
          role-session-name: GitHub-Action-Role
          aws-region: ${{ env.AWS_REGION }}

      - name: Set up JDK 1.8
        uses: actions/[email protected]
        with:
          java-version: 1.8

      - name: chmod
        run: chmod -R +x ./.github

      - name: Build and Package Maven
        id: package
        working-directory: ${{ env.applicationfolder }}
        run: $GITHUB_WORKSPACE/.github/scripts/build.sh

      - name: Upload Artifact to s3
        working-directory: ${{ env.applicationfolder }}/target
        run: aws s3 cp *.war s3://${{ env.S3BUCKET }}/

      - name: Artifacts for codescan action
        uses: actions/[email protected]
        with:
          name: build-file
          path: ${{ env.applicationfolder }}/target/*.war

  codescan:
    needs: build
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
      security-events: write
    steps:
      - name: Download an artifact
        uses: actions/[email protected]
        with:
          name: build-file

      - name: Configure AWS credentials
        id: iam-role
        continue-on-error: true
        uses: aws-actions/[email protected]
        with:
          role-to-assume: ${{ secrets.IAMROLE_GITHUB }}
          role-session-name: GitHub-Action-Role
          aws-region: ${{ env.AWS_REGION }}

      - uses: actions/[email protected]
        if: steps.iam-role.outcome == 'success'
        with:
          fetch-depth: 0

      - name: CodeGuru Reviewer
        uses: aws-actions/[email protected]
        if: ${{ always() }}
        continue-on-error: false
        with:
          s3_bucket: ${{ env.S3bucket_CodeGuru }}
          build_path: .

      - name: Store SARIF file
        if: steps.iam-role.outcome == 'success'
        uses: actions/[email protected]
        with:
          name: SARIF_recommendations
          path: ./codeguru-results.sarif.json

      - name: Upload review result
        uses: github/codeql-action/[email protected]
        with:
          sarif_file: codeguru-results.sarif.json

      - run: |
          echo "Check for critical vulnerability"
          count=$(cat codeguru-results.sarif.json | jq '.runs[].results[] | select(.level == "error") | .level' | wc -l)
          if (( $count > 0 )); then
            echo "There are $count critical findings, hence stopping the pipeline."
            exit 1
          fi

  deploy:
    needs: codescan
    runs-on: ubuntu-latest
    environment: Dev
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/[email protected]
      - uses: aws-actions/[email protected]
        with:
          role-to-assume: ${{ secrets.IAMROLE_GITHUB }}
          role-session-name: GitHub-Action-Role
          aws-region: ${{ env.AWS_REGION }}
      - run: |
          echo "Deploying branch ${{ env.GITHUB_REF }} to ${{ github.event.inputs.environment }}"
          commit_hash=`git rev-parse HEAD`
          aws deploy create-deployment --application-name CodeDeployAppNameWithASG --deployment-group-name CodeDeployGroupName --github-location repository=$GITHUB_REPOSITORY,commitId=$commit_hash --ignore-application-stop-failures

6.  Push the code to the repository:

Remember to save all the files that you have modified.
To ensure that you are in your git repository folder, you can run the command:

git remote -v

The command should return the remote branch address, which should be similar to the following:

<user>@<machine> GitActionsDeploytoAWS % git remote -v
origin git@github.com:<username>/GitActionsDeploytoAWS.git (fetch)
origin git@github.com:<username>/GitActionsDeploytoAWS.git (push)

To push your code to the remote branch, run the following commands:

git add .
git commit -m "Adding Security Scan"
git push

Your code has been pushed to the repository and will trigger the workflow as per the configuration in GitHub Actions.

7.  Verify the pipeline

Your pipeline is set up to fail upon the detection of a critical vulnerability. You can also suppress recommendations from CodeGuru Reviewer if you think they are not relevant to your setup. In this example, because there are two critical vulnerabilities, the pipeline will not proceed to the next step.
To view the status of the pipeline, navigate to the Actions tab on your GitHub console. You can refer to the following image for guidance.

Figure 4. GitHub Actions pipeline

To view the details of the error, you can expand the “codescan” job in the GitHub Actions console. This will provide you with more information about the specific vulnerabilities that caused the pipeline to fail and help you to address them accordingly.

Figure 5. Codescan actions logs

8. Check the Amazon CodeGuru recommendations in the GitHub user interface

Once you have run the CodeGuru Reviewer Action, any security findings and recommendations will be displayed on the Security tab within the GitHub user interface. This will provide you with a clear and convenient way to view and address any issues that were identified during the analysis.

Figure 6. Security tab with results

Clean up

To avoid incurring future charges, you should clean up the resources that you created.

Empty the Amazon S3 bucket.
Delete the CloudFormation stack (CodeDeployStack) from the AWS console.

Delete the CodeGuru Reviewer Amazon S3 bucket.

Disassociate the GitHub repository in CodeGuru Reviewer.
Delete the GitHub Secret (‘IAMROLE_GITHUB’)

Go to the repository settings on the GitHub page.
Select Secrets under Actions.
Select IAMROLE_GITHUB, and delete it.

Conclusion

Amazon CodeGuru is a valuable tool for software development teams looking to improve the quality and efficiency of their code. With its advanced AI capabilities, CodeGuru automates the manual parts of code review and helps identify performance, cost, security, and maintainability issues. CodeGuru also integrates with popular development tools and provides customizable recommendations, making it easy to use within existing workflows. By using Amazon CodeGuru, teams can improve code quality, increase development speed, lower costs, and enhance security, ultimately leading to better software and a more successful overall development process.

In this post, we explained how to integrate Amazon CodeGuru Reviewer into your code build pipeline using GitHub actions. This integration serves as a quality gate by performing code analysis and identifying challenges in your code. Now you can access the CodeGuru Reviewer recommendations directly within the GitHub user interface for guidance on resolving identified issues.

About the author:

Mahesh Biradar

Mahesh Biradar is a Solutions Architect at AWS. He is a DevOps enthusiast and enjoys helping customers implement cost-effective architectures that scale.

Suresh Moolya

Suresh Moolya is a Senior Cloud Application Architect with Amazon Web Services. He works with customers to architect, design, and automate business software at scale on AWS cloud.

Shikhar Mishra

Shikhar is a Solutions Architect at Amazon Web Services. He is a cloud security enthusiast and enjoys helping customers design secure, reliable, and cost-effective solutions on AWS.

Improve collaboration between teams by using AWS CDK constructs

There are different ways to organize teams to deliver great software products. There are companies that give the end-to-end responsibility for a product to a single team, like Amazon’s Two-Pizza teams, and there are companies where multiple teams split the responsibility between infrastructure (or platform) teams and application development teams. This post provides guidance on how collaboration efficiency can be improved in the case of a split-team approach with the help of the AWS Cloud Development Kit (CDK).

The AWS CDK is an open-source software development framework to define your cloud application resources. You do this by using familiar programming languages like TypeScript, Python, Java, C# or Go. It allows you to mix code to define your application’s infrastructure, traditionally expressed through infrastructure as code tools like AWS CloudFormation or HashiCorp Terraform, with code to bundle, compile, and package your application.

This is great for autonomous teams with end-to-end responsibility, as it helps them to keep all code related to that product in a single place and a single programming language. With a single team, there is no need to keep application code in a separate repository from infrastructure code, but what about the split-team model?

Larger enterprises commonly split the responsibility between infrastructure (or platform) teams and application development teams. We’ll see how to use the AWS CDK to ensure team independence and agility even with multiple teams involved. We’ll have a look at the different responsibilities of the participating teams and their produced artifacts, and we’ll also discuss how to make the teams work together in a frictionless way.

This blog post assumes a basic level of knowledge of the AWS CDK and its concepts. Additionally, a high-level understanding of event-driven architectures is required.

Team Topologies

Let’s first have a quick look at the different team topologies and each team’s responsibilities.

One-Team Approach

In this blog post we will focus on the split-team approach described below. However, it’s still helpful to understand what we mean by the “One-Team” approach: a single team owns an application from end to end. This cross-functional team decides on its own which features to implement next, which technologies to use, and how to build and deploy the resulting infrastructure and application code. The team is responsible for the infrastructure, the application code, and the deployment and operation of the developed service.

If you’re interested in how to structure your AWS CDK application in such an environment, have a look at our colleague Alex Pulver’s blog post Recommended AWS CDK project structure for Python applications.

Split-Team Approach

In reality we see many customers who have separate teams for application development and infrastructure development and deployment.

Infrastructure Team

What I call the infrastructure team is also known as the platform or operations team. It configures, deploys, and operates the shared infrastructure which other teams consume to run their applications on. This can be things like an Amazon SQS queue, an Amazon Elastic Container Service (Amazon ECS) cluster as well as the CI/CD pipelines used to bring new versions of the applications into production.
It is the infrastructure team’s responsibility to get the application package developed by the Application Team deployed and running on AWS, as well as provide operational support for the application.

Application Team

Traditionally the application team just provides the application’s package (for example, a JAR file or an npm package) and it’s the infrastructure team’s responsibility to figure out how to deploy, configure, and run it on AWS. However, this traditional setup often leads to bottlenecks, as the infrastructure team will have to support many different applications developed by multiple teams. Additionally, the infrastructure team often has little knowledge of the internals of those applications. This often leads to solutions which are not optimized for the problem at hand: If the infrastructure team only offers a handful of options to run services on, the application team can’t use options optimized for their workload.

This is why we extend the traditional responsibilities of the application team in this blog post. The team provides the application and additionally the description of the infrastructure required to run the application. With “infrastructure required” we mean the AWS services used to run the application. This infrastructure description needs to be written in a format which can be consumed by the infrastructure team.

While we understand that this shift of responsibility adds additional tasks to the application team, we think that in the long term it is worth the effort. This can be the starting point to introduce DevOps concepts into the organization. However, the concepts described in this blog post are still valid even if you decide that you don’t want to add this responsibility to your application teams. The boundary of who is delivering what would then just move more into the direction of the infrastructure team.

To be successful with the given approach, the two teams need to agree on a common format on how to hand over the application, its infrastructure definition, and how to bring it to production. The AWS CDK with its concept of Constructs provides a perfect means for that.

Primer: AWS CDK Constructs

In this section we take a look at the concepts the AWS CDK provides for structuring our code base and how these concepts can be used to fit a CDK project into your team topology.

Constructs

Constructs are the basic building block of an AWS CDK application. An AWS CDK application is composed of multiple constructs which in the end define how and what is deployed by AWS CloudFormation.

The AWS CDK ships with constructs created to deploy AWS services. However, it is important to understand that you are not limited to the out-of-the-box constructs provided by the AWS CDK. The true power of the AWS CDK is the ability to create your own abstractions on top of the default constructs to build solutions for your specific requirements. To achieve this you write, publish, and consume your own custom constructs. They codify your specific requirements, create an additional level of abstraction, and allow other teams to consume and use your construct.

We will use a custom construct to separate the responsibilities between the application and the infrastructure team. The application team will release a construct which describes the infrastructure along with its configuration required to run the application code. The infrastructure team will consume this construct to deploy and operate the workload on AWS.

How to use the AWS CDK in a Split-Team Setup

Let’s now have a look at how we can use the AWS CDK to split the responsibilities between the application and infrastructure team. I’ll introduce a sample scenario and then illustrate what each team’s responsibility is within this scenario.

Scenario

Our fictitious application development team writes an AWS Lambda function which gets deployed to AWS. Messages in an Amazon SQS queue will invoke the function. Let’s say the function will process orders (whatever this means in detail is irrelevant for the example) and each order is represented by a message in the queue.

The application development team has full flexibility when it comes to creating the AWS Lambda function. They can decide which runtime to use or how much memory to configure. The SQS queue which the function will act upon is created by the infrastructure team. The application team does not have to know how the messages end up in the queue.

With that we can have a look at a sample implementation split between the teams.

Application Team

The application team is responsible for two distinct artifacts: the application code (for example, a Java jar file or an npm module) and the AWS CDK construct used to deploy the required infrastructure on AWS to run the application (an AWS Lambda Function along with its configuration).

The lifecycles of these artifacts differ: the application code changes more frequently than the infrastructure it runs in. That’s why we want to keep the artifacts separate. With that each of the artifacts can be released at its own pace and only if it was changed.

In order to achieve these separate lifecycles, it is important to note that a release of the application artifact needs to be completely independent from the release of the CDK construct. This differs from the standard CDK way of building and packaging application code within the CDK construct, and it fits our approach of separate teams.

But how will this be done in our example solution? The team will build and publish an application artifact which does not contain anything related to CDK.
When a CDK Stack with this construct is synthesized it will download the pre-built artifact with a given version number from AWS CodeArtifact and use it to create the input zip file for a Lambda function. There is no build of the application package happening during the CDK synth.

With the separation of construct and application code, we need to find a way to tell the CDK construct which specific version of the application code it should fetch from CodeArtifact. We will pass this information to the construct via a property of its constructor.

For dependencies on infrastructure outside of the responsibility of the application team, I follow the pattern of dependency injection. Those dependencies, for example a shared VPC or an Amazon SQS queue, are passed into the construct from the infrastructure team.

Let’s have a look at an example. We pass in the external dependency on an SQS Queue, along with details on the desired appPackageVersion and its CodeArtifact details:

import * as path from 'path';
import { Construct } from 'constructs';
import { aws_sqs, aws_lambda as lambda } from 'aws-cdk-lib';
import { SqsEventSource } from 'aws-cdk-lib/aws-lambda-event-sources';

export interface OrderProcessingAppConstructProps {
    queue: aws_sqs.Queue,
    appPackageVersion: string,
    codeArtifactDetails: {
        account: string,
        repository: string,
        domain: string
    }
}

export class OrderProcessingAppConstruct extends Construct {

    constructor(scope: Construct, id: string, props: OrderProcessingAppConstructProps) {
        super(scope, id);

        const lambdaFunction = new lambda.Function(this, 'OrderProcessingLambda', {
            code: lambda.Code.fromDockerBuild(path.join(__dirname, '..', 'bundling'), {
                buildArgs: {
                    'PACKAGE_VERSION' : props.appPackageVersion,
                    'CODE_ARTIFACT_ACCOUNT' : props.codeArtifactDetails.account,
                    'CODE_ARTIFACT_REPOSITORY' : props.codeArtifactDetails.repository,
                    'CODE_ARTIFACT_DOMAIN' : props.codeArtifactDetails.domain
                }
            }),
            runtime: lambda.Runtime.NODEJS_16_X,
            handler: 'node_modules/order-processing-app/dist/index.lambdaHandler'
        });
        const eventSource = new SqsEventSource(props.queue);
        lambdaFunction.addEventSource(eventSource);
    }
}

Note the code lambda.Code.fromDockerBuild(…): We use AWS CDK’s functionality to bundle the code of our Lambda function via a Docker build. The only things which happen inside of the provided Dockerfile are:

the login into the AWS CodeArtifact repository which holds the pre-built application code’s package
the download and installation of the application code’s artifact from AWS CodeArtifact (in this case via npm)

If you are interested in more details on how you can build, bundle and deploy your AWS CDK assets I highly recommend a blog post by my colleague Cory Hall: Building, bundling, and deploying applications with the AWS CDK. It goes into much more detail than what we are covering here.

Looking at the example Dockerfile we can see the two steps described above:

FROM public.ecr.aws/sam/build-nodejs16.x:latest

ARG PACKAGE_VERSION
ARG CODE_ARTIFACT_AWS_REGION
ARG CODE_ARTIFACT_ACCOUNT
ARG CODE_ARTIFACT_DOMAIN
ARG CODE_ARTIFACT_REPOSITORY

# Log in to the CodeArtifact repository that holds the pre-built application package
RUN aws codeartifact login --tool npm --repository $CODE_ARTIFACT_REPOSITORY --domain $CODE_ARTIFACT_DOMAIN --domain-owner $CODE_ARTIFACT_ACCOUNT --region $CODE_ARTIFACT_AWS_REGION
# Install the requested application version into the folder CDK mounts as the asset output
RUN npm install order-processing-app@$PACKAGE_VERSION --prefix /asset

Please note the following:

we use --prefix /asset with our npm install command. This tells npm to install the dependencies into the folder which CDK will mount into the container. All files which should go into the output of the docker build need to be placed here.
the aws codeartifact login command requires credentials with the appropriate permissions to proceed. In case you run this on, for example, AWS CodeBuild or inside of a CDK Pipeline, you need to make sure that the used role has the appropriate policies attached.

Infrastructure Team

The infrastructure team consumes the AWS CDK construct published by the application team. They own the AWS CDK Stack which composes the whole application. Possibly this will only be one of several Stacks owned by the Infrastructure team. Other Stacks might create shared infrastructure (like VPCs, networking) and other applications.

Within the stack for our application the infrastructure team consumes and instantiates the application team’s construct, passes any dependencies into it and then deploys the stack by whatever means they see fit (e.g. through AWS CodePipeline, GitHub Actions or any other form of continuous delivery/deployment).

The dependency on the application team’s construct is manifested in the package.json of the infrastructure team’s CDK app:

{
  "name": "order-processing-infra-app",
  ...
  "dependencies": {
    ...
    "order-app-construct" : "1.1.0",
    ...
  }
  ...
}

Within the created CDK Stack we see the dependency version for the application package as well as how the infrastructure team passes in additional information (e.g., the queue to use):

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { Queue } from 'aws-cdk-lib/aws-sqs';
import { OrderProcessingAppConstruct } from 'order-app-construct';

export class OrderProcessingInfraStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const orderProcessingQueue = new Queue(this, 'order-processing-queue');

    new OrderProcessingAppConstruct(this, 'order-processing-app', {
      appPackageVersion: "2.0.36",
      queue: orderProcessingQueue,
      codeArtifactDetails: { ... }
    });
  }
}

Propagating New Releases

We now have the responsibilities of each team sorted out along with the artifacts owned by each team. But how do we propagate a change done by the application team all the way to production? Or asked differently: how can we invoke the infrastructure team’s CI/CD pipeline with the updated artifact versions of the application team?

We will need to update the infrastructure team’s dependencies on the application team’s artifacts whenever a new version of either the application package or the AWS CDK construct is published. With the dependencies updated, we can then start the release pipeline.

One approach is to listen and react to events published by AWS CodeArtifact via Amazon EventBridge. On each release AWS CodeArtifact will publish an event to Amazon EventBridge. We can listen to that event, extract the version number of the new release from its payload, and start a workflow to update either our dependency on the CDK construct (e.g. in the package.json of our CDK application) or the appPackageVersion which the infrastructure team passes into the consumed construct. A sketch of such an EventBridge rule follows.
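
The following is a minimal sketch of how such a rule could be defined with the AWS CDK in TypeScript. The event pattern values and the Lambda function that starts the downstream update workflow are illustrative assumptions, not part of the original solution; check the CodeArtifact EventBridge documentation for the exact event format.

import { Construct } from 'constructs';
import { aws_events as events, aws_events_targets as targets, aws_lambda as lambda } from 'aws-cdk-lib';

// Sketch: react to new application package versions published to CodeArtifact.
// updateWorkflowFn is a hypothetical Lambda function that starts the
// infrastructure team's release workflow (e.g. by opening a pull request).
export class ReleaseTrigger extends Construct {
  constructor(scope: Construct, id: string, updateWorkflowFn: lambda.IFunction) {
    super(scope, id);

    const rule = new events.Rule(this, 'OnNewAppPackageVersion', {
      eventPattern: {
        source: ['aws.codeartifact'],
        detailType: ['CodeArtifact Package Version State Change'],
        detail: { packageName: ['order-processing-app'] },
      },
    });
    rule.addTarget(new targets.LambdaFunction(updateWorkflowFn));
  }
}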

Here’s how a release of a new app version flows through the system:

Figure 1 – A release of the application package triggers a change and deployment of the infrastructure team’s CDK Stack

The application team publishes a new app version into AWS CodeArtifact
CodeArtifact triggers an event on Amazon EventBridge
The infrastructure team listens to this event
The infrastructure team updates its CDK stack to include the latest appPackageVersion
The infrastructure team’s CDK Stack gets deployed

The release of a new version of the CDK construct is very similar:

Figure 2 – A release of the application team’s CDK construct triggers a change and deployment of the infrastructure team’s CDK Stack

The application team publishes a new CDK construct version into AWS CodeArtifact
CodeArtifact triggers an event on Amazon EventBridge
The infrastructure team listens to this event
The infrastructure team updates its dependency to the latest CDK construct
The infrastructure team’s CDK Stack gets deployed

We will not go into the details of what such a workflow could look like, because it is most likely highly custom for each team (think of the different tools used for code repositories and CI/CD). However, here are some ideas on how it can be accomplished:

Updating the CDK Construct dependency

To update the dependency version of the CDK construct, the infrastructure team’s package.json (or other files used for dependency tracking like pom.xml) needs to be updated. You can build automation to check out the source code and issue a command like npm install order-app-construct@NEW_VERSION (where order-app-construct is the construct package from the example above and NEW_VERSION is the value read from the EventBridge event payload). You then automatically create a pull request to incorporate this change into your main branch. For a sample of what this looks like, see the blog post Keeping up with your dependencies: building a feedback loop for shared libraries.

Updating the appPackageVersion

To update the appPackageVersion used inside of the infrastructure team’s CDK Stack you can either follow the same approach outlined above, or you can use CDK’s capability to read from an AWS Systems Manager (SSM) Parameter Store parameter. With that you wouldn’t put the value for appPackageVersion into source control, but rather read it from SSM Parameter Store. There is a how-to for this in the AWS CDK documentation: Get a value from the Systems Manager Parameter Store. You then start the infrastructure team’s pipeline based on the event of a change in the parameter.

To have a clear understanding of what is deployed at any given time and in order to see the used parameter value in CloudFormation I’d recommend using the option described at Reading Systems Manager values at synthesis time.
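
As a minimal sketch of the synthesis-time variant (the parameter name is a hypothetical choice for illustration), the stack shown earlier could read the version like this:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { StringParameter } from 'aws-cdk-lib/aws-ssm';

export class OrderProcessingInfraStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Resolved at synthesis time via a CDK context lookup, so the concrete
    // version value ends up in the generated CloudFormation template.
    const appPackageVersion = StringParameter.valueFromLookup(
      this,
      '/order-processing/appPackageVersion' // hypothetical parameter name
    );

    // ...pass appPackageVersion into OrderProcessingAppConstruct as shown above
  }
}

Note that context lookups like this require the stack to be configured with an explicit account and Region.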

Conclusion

You’ve seen how the AWS Cloud Development Kit and its Construct concept can help to ensure team independence and agility even though multiple teams (in our case an application development team and an infrastructure team) work together to bring a new version of an application into production. To do so, you have put the application team in charge of not only their application code, but also of the parts of the infrastructure they use to run their application on. This is still in line with the discussed split-team approach, as all shared infrastructure as well as the final deployment remains under the control of the infrastructure team and is only consumed by the application team’s construct.

About the Authors

As a Solutions Architect Jörg works with manufacturing customers in Germany. Before he joined AWS in 2019 he held various roles like Developer, DevOps Engineer and SRE. With that Jörg enjoys building and automating things and fell in love with the AWS Cloud Development Kit.

Mo joined AWS in 2020 as a Technical Account Manager, bringing with him 7 years of hands-on AWS DevOps experience and 6 years as a systems operations admin. He is a member of two Technical Field Communities in AWS (Cloud Operation and Builder Experience), focusing on supporting customers with CI/CD pipelines and AI for DevOps to ensure they have the right solutions that fit their business needs.

Journey to adopt Cloud-Native DevOps platform Series #2: Progressive delivery on Amazon EKS with Flagger and Gloo Edge Ingress Controller

In the last post, OfferUp modernized its DevOps platform with Amazon EKS and Flagger to accelerate time to market, we talked about hypergrowth and the technical challenges encountered by OfferUp in its existing DevOps platform. As a reminder, we presented how OfferUp modernized its DevOps platform with Amazon Elastic Kubernetes Service (Amazon EKS) and Flagger to gain developer velocity, automate faster deployments, and achieve a lower cost of ownership.

In this post, we discuss the technical steps to build a DevOps platform that enables the progressive deployment of microservices on Amazon EKS. Progressive delivery exposes a new version of the software incrementally to ingress traffic and continuously measures the success rate of the metrics before allowing all of the new traffic to reach the newer version of the software. Flagger is a Cloud Native Computing Foundation (CNCF) project that enables progressive canary delivery, along with blue/green and A/B testing, while measuring metrics like HTTP/gRPC request success rate and latency. Flagger shifts and routes traffic between app versions using a service mesh or an Ingress controller.

We leverage the Gloo Ingress Controller for traffic routing, Prometheus, Datadog, and Amazon CloudWatch for application metrics analysis, and Slack to send notifications. Flagger will post messages to Slack when a deployment has been initialized, when a new revision has been detected, and if the canary analysis failed or succeeded.

Prerequisite steps to build the modern DevOps platform

You need an AWS account and an AWS Identity and Access Management (IAM) user to build the DevOps platform. If you don’t have an AWS account with Administrator access, then create one now by clicking here. Create an IAM user and assign it an admin role. You can build this platform in any AWS Region; however, I will use the us-west-1 Region throughout this post. You can use a laptop (Mac or Windows) or an Amazon Elastic Compute Cloud (Amazon EC2) instance as a client machine to install all of the necessary software to build the GitOps platform. For this post, I launched an Amazon EC2 instance (with the Amazon Linux 2 AMI) as the client and installed all of the prerequisite software. You need the awscli, git, eksctl, kubectl, and helm applications to build the GitOps platform. Here are the prerequisite steps:

Create a named profile (eks-devops) with the config and credentials file:

aws configure --profile eks-devops

AWS Access Key ID [None]: xxxxxxxxxxxxxxxxxxxxxx

AWS Secret Access Key [None]: xxxxxxxxxxxxxxxxxx

Default region name [None]: us-west-1

Default output format [None]:

View and verify your current IAM profile:

export AWS_PROFILE=eks-devops

aws sts get-caller-identity

If the Amazon EC2 instance doesn’t have git preinstalled, then install git in your Amazon EC2 instance:

sudo yum update -y

sudo yum install git -y

Check git version

git version

Git clone the repo and download all of the prerequisite software in the home directory.

git clone https://github.com/aws-samples/aws-gloo-flux.git

Download all of the prerequisite software from install.sh which includes awscli, eksctl, kubectl, helm, and docker:

cd aws-gloo-flux/eks-flagger/

ls -lt

chmod 700 install.sh ecr-setup.sh

. install.sh

Check the version of the software installed:

aws --version

eksctl version

kubectl version -o json

helm version

docker --version

docker info

If docker info shows an error like “permission denied”, then reboot the Amazon EC2 instance or log in to the instance again.

Create an Amazon Elastic Container Registry (Amazon ECR) repository and push application images.

Amazon ECR is a fully managed container registry that makes it easy for developers to share and deploy container images and artifacts. The ecr-setup.sh script will create a new Amazon ECR repository and push the podinfo images (6.0.0, 6.0.1, 6.0.2, 6.1.0, 6.1.5 and 6.1.6) to it. Run the ecr-setup.sh script with two parameters, the ECR repository name (e.g. ps-flagger-repository) and the Region (e.g. us-west-1):

./ecr-setup.sh <ps-flagger-repository> <us-west-1>

You’ll see output like the following (truncated).

###########################################################

Successfully created ECR repository and pushed podinfo images to ECR #

Please note down the ECR repository URI          

xxxxxx.dkr.ecr.us-west-1.amazonaws.com/ps-flagger-repository                                                   

Technical steps to build the modern DevOps platform

This post shows you how to use the Gloo Edge ingress controller and Flagger to automate canary releases for progressive deployment on the Amazon EKS cluster. Flagger requires a Kubernetes cluster v1.16 or newer and Gloo Edge ingress 1.6.0 or newer. This post provides a step-by-step approach to install the Amazon EKS cluster with a managed node group, the Gloo Edge ingress controller, and Flagger for Gloo in the Amazon EKS cluster. Once the cluster, metrics infrastructure, and Flagger are installed, we can install the sample application itself. We’ll use the standard Podinfo application used in the Flagger project and the accompanying loadtester tool. The Flagger “podinfo” backend service will be called by Gloo’s “VirtualService”, which is the root routing object for the Gloo Gateway. A virtual service describes the set of routes to match for a set of domains. We’ll automate the canary promotion with the new image of the “podinfo” service, from version 6.0.0 to version 6.0.1. We’ll also create a scenario by injecting an error for automated canary rollback while deploying version 6.0.2.

Use myeks-cluster.yaml to create your Amazon EKS cluster with a managed nodegroup. The myeks-cluster.yaml deployment file has the “cluster name” value as ps-eks-66, the region value as us-west-1, availabilityZones as [us-west-1a, us-west-1b], the Kubernetes version as 1.24, and the nodegroup Amazon EC2 instance type as m5.2xlarge. You can change these values if you want to build the cluster in a different region or availability zone.

eksctl create cluster -f myeks-cluster.yaml

Check the Amazon EKS Cluster details:

kubectl cluster-info

kubectl version -o json

kubectl get nodes -o wide

kubectl get pods -A -o wide

Deploy the Metrics Server:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

kubectl get deployment metrics-server -n kube-system

Update the kubeconfig file to interact with your cluster:

# aws eks update-kubeconfig --name <ekscluster-name> --region <AWS_REGION>

kubectl config view

cat $HOME/.kube/config

Create a namespace “gloo-system” and install Gloo with its Helm chart. Gloo Edge is an Envoy-based Kubernetes-native ingress controller that facilitates and secures application traffic.

helm repo add gloo https://storage.googleapis.com/solo-public-helm

kubectl create ns gloo-system

helm upgrade -i gloo gloo/gloo --namespace gloo-system

Install Flagger and the Prometheus add-on in the same gloo-system namespace. Flagger is a Cloud Native Computing Foundation project and part of the Flux family of GitOps tools.

helm repo add flagger https://flagger.app

helm upgrade -i flagger flagger/flagger \
  --namespace gloo-system \
  --set prometheus.install=true \
  --set meshProvider=gloo

[Optional] If you’re using Datadog as a monitoring tool, then deploy Datadog agents as a DaemonSet using the Datadog Helm chart. Replace RELEASE_NAME and DATADOG_API_KEY accordingly. If you aren’t using Datadog, then skip this step. For this post, we leverage the Prometheus open-source monitoring tool.

helm repo add datadog https://helm.datadoghq.com

helm repo update

helm install <RELEASE_NAME> \
  --set datadog.apiKey=<DATADOG_API_KEY> datadog/datadog

Integrate Amazon EKS/ K8s Cluster with the Datadog Dashboard – go to the Datadog Console and add the Kubernetes integration.

[Optional] If you’re using the Slack communication tool and have admin access, then Flagger can be configured to send alerts to the Slack chat platform by integrating the Slack alerting system with Flagger. If you don’t have admin access in Slack, then skip this step.

helm upgrade -i flagger flagger/flagger \
  --set slack.url=https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK \
  --set slack.channel=general \
  --set slack.user=flagger \
  --set clusterName=<my-cluster>

Create a namespace “apps”; the application and the load testing service will be deployed into this namespace.

kubectl create ns apps

Create a deployment and a horizontal pod autoscaler for your custom application or service for which canary deployment will be done.

kubectl -n apps apply -k app

kubectl get deployment -A

kubectl get hpa -n apps

Deploy the load testing service to generate traffic during the canary analysis.

kubectl -n apps apply -k tester

kubectl get deployment -A

kubectl get svc -n apps

Use apps-vs.yaml to create a Gloo virtual service definition that references a route table that will be generated by Flagger.

kubectl apply -f ./apps-vs.yaml

kubectl get vs -n apps

[Optional] If you have your own domain name, then open apps-vs.yaml in vi editor and replace podinfo.example.com with your own domain name to run the app in that domain.

Use canary.yaml to create a canary custom resource. Review the service, analysis, and metrics sections of the canary.yaml file.

kubectl apply -f ./canary.yaml

After a couple of seconds, Flagger will create the canary objects. When the bootstrap finishes, Flagger will set the canary status to “Initialized”.

kubectl -n apps get canary podinfo

NAME      STATUS        WEIGHT   LASTTRANSITIONTIME

podinfo   Initialized   0        2023-xx-xxTxx:xx:xxZ

Gloo automatically creates an ELB. Once the load balancer is provisioned and health checks pass, we can find the sample application at the load balancer’s public address. Note down the ELB’s Public address:

kubectl get svc -n gloo-system --field-selector 'metadata.name==gateway-proxy' -o=jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}{"\n"}'

Validate that your application is running; you’ll see output with version 6.0.0.

curl <load balancer's public address> -H "Host:podinfo.example.com"

Trigger progressive deployments and monitor the status

You can trigger a canary deployment by updating the application container image from 6.0.0 to 6.0.1.

kubectl -n apps set image deployment/podinfo  podinfod=<ECR URI>:6.0.1

Flagger detects that the deployment revision changed and starts a new rollout.

kubectl -n apps describe canary/podinfo

Monitor all canaries; the status condition can have one of the following values: Initialized, Waiting, Progressing, Promoting, Finalizing, Succeeded, or Failed.

watch kubectl get canaries --all-namespaces

curl <load balancer's public address> -H "Host:podinfo.example.com"

Once the canary is completed, validate your application. You can see that the version of the application has changed from 6.0.0 to 6.0.1.

{
  "hostname": "podinfo-primary-658c9f9695-4pqbl",
  "version": "6.0.1",
  "revision": "",
  "color": "#34577c",
  "logo": "https://raw.githubusercontent.com/stefanprodan/podinfo/gh-pages/cuddle_clap.gif",
  "message": "greetings from podinfo v6.0.1",
}

[Optional] Open podinfo application from the laptop browser

Find out both of the IP addresses associated with the load balancer.

dig <load balancer's public address>

Open the /etc/hosts file on the laptop and add both of the load balancer's IPs to the hosts file.

sudo vi /etc/hosts

<Public IP address of LB Target node> podinfo.example.com

e.g.

xx.xx.xxx.xxx podinfo.example.com

xx.xx.xxx.xxx podinfo.example.com

Type “podinfo.example.com” in your browser and you’ll see the application in a form similar to this:

Figure 1: Greetings from podinfo v6.0.1

Automated rollback

While doing the canary analysis, you’ll generate HTTP 500 errors and high latency to check whether Flagger pauses and rolls back the faulted version. Flagger performs an automatic rollback in the case of failure.

Introduce another canary deployment with podinfo image version 6.0.2 and monitor the status of the canary.

kubectl -n apps set image deployment/podinfo podinfod=<ECR URI>:6.0.2

Generate HTTP 500 errors or a high-latency error from a separate terminal window.

Generate HTTP 500 errors:

watch curl -H 'Host:podinfo.example.com' <load balancer's public address>/status/500

Generate high latency:

watch curl -H 'Host:podinfo.example.com' <load balancer's public address>/delay/2

When the number of failed checks reaches the canary analysis threshold, the traffic is routed back to the primary, the canary is scaled to zero, and the rollout is marked as failed.

kubectl get canaries --all-namespaces

kubectl -n apps describe canary/podinfo

Cleanup

When you’re done experimenting, you can delete all of the resources created during this series to avoid any additional charges. Let’s walk through deleting all of the resources used.

Delete Flagger resources and apps namespace
kubectl delete canary podinfo -n apps

kubectl delete HorizontalPodAutoscaler podinfo -n apps

kubectl delete deployment podinfo -n apps

helm -n gloo-system delete flagger

helm -n gloo-system delete gloo

kubectl delete namespace apps

Delete Amazon EKS Cluster
After you’ve finished with the cluster and nodes that you created for this tutorial, you should clean up by deleting the cluster and nodes with the following command:

eksctl delete cluster --name <cluster name> --region <region code>

Delete Amazon ECR

aws ecr delete-repository --repository-name ps-flagger-repository --force

Conclusion

This post explained the process for setting up an Amazon EKS cluster and how to leverage Flagger for progressive deployments along with Prometheus and the Gloo Ingress Controller. You can enhance the deployments by integrating Flagger with Slack, Datadog, and webhook notifications for progressive deployments. Amazon EKS removes the undifferentiated heavy lifting of managing and updating the Kubernetes cluster. Managed node groups automate the provisioning and lifecycle management of worker nodes in an Amazon EKS cluster, which greatly simplifies operational activities such as new Kubernetes version deployments.

We encourage you to look into modernizing your DevOps platform from monolithic architecture to microservice-based architecture with Amazon EKS, and leverage Flagger with the right Ingress controller for secured and automated service releases.

Further Reading

Journey to adopt Cloud-Native DevOps platform Series #1: OfferUp modernized DevOps platform with Amazon EKS and Flagger to accelerate time to market

About the authors:

Purna Sanyal

Purna Sanyal is a technology enthusiast and an architect at AWS, helping digital native customers solve their business problems with successful adoption of cloud native architecture. He provides technical thought leadership, architecture guidance, and conducts PoCs to enable customers’ digital transformation. He is also passionate about building innovative solutions around Kubernetes, database, analytics, and machine learning.

Adding CDK Constructs to the AWS Analytics Reference Architecture

In 2021, we released the AWS Analytics Reference Architecture, a new end-to-end AWS Cloud Development Kit (AWS CDK) application example, as open source (docs are CC-BY-SA 4.0 International, sample code is MIT-0). It shows how our customers can use the available AWS products and features to implement well-architected analytics solutions. It also brings together AWS best practices for designing, implementing, and operating analytics solutions through different purpose-built patterns. Altogether, the AWS Analytics Reference Architecture answers common requirements and solves customer challenges.

In 2022, we extended the scope of this project with AWS CDK constructs to provide more granular and reusable examples. This project is now composed of:

Reusable core components exposed in an AWS CDK library currently available in TypeScript and Python. This library contains the AWS CDK constructs that can be used to quickly provision prepackaged analytics solutions.
Reference architectures consuming the reusable components in AWS CDK applications, and demonstrating end-to-end examples in a business context. Currently, only the AWS native reference architecture is available but others will follow.

In this blog post, we will first show how to consume the core library to quickly provision analytics solutions using CDK Constructs and experiment with AWS analytics products.

Building solutions with the Core Library

To illustrate how to use the core components, let’s see how we can quickly build a Data Lake, a central piece for most analytics projects. The storage layer is implemented with the DataLakeStorage CDK construct relying on Amazon Simple Storage Service (Amazon S3), a durable, scalable and cost-effective object storage service. The query layer is implemented with the AthenaDemoSetup construct using Amazon Athena, an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. With regard to the data catalog, it's implemented with the DataLakeCatalog construct using the AWS Glue Data Catalog.

Before getting started, please make sure to follow the instructions available here for setting up the prerequisites:

Install the necessary build dependencies
Bootstrap the AWS account
Initialize the CDK application.

This architecture diagram depicts the data lake building blocks we are going to deploy using the AWS Analytics Reference Architecture library. These are higher level constructs (commonly called L3 constructs) as they integrate several AWS services together in patterns.

To assemble these components, you can add this code snippet in your app.py file:

import aws_analytics_reference_architecture as ara
from aws_cdk import aws_glue as glue

# Create a new DataLakeStorage with Raw, Clean and Transform buckets
storage = ara.DataLakeStorage(scope=self, id="storage")

# Create a new DataLakeCatalog with Raw, Clean and Transform databases
catalog = ara.DataLakeCatalog(scope=self, id="catalog")

# Configure a new Athena Workgroup
athena_defaults = ara.AthenaDemoSetup(scope=self, id="demo_setup")

# Generate data from Customer TPC dataset
data_generator = ara.BatchReplayer(
    scope=self,
    id="customer-data",
    dataset=ara.PreparedDataset.RETAIL_1_GB_CUSTOMER,
    sink_object_key="customer",
    sink_bucket=storage.raw_bucket,
)

# Role with default permissions for any Glue service
glue_role = ara.GlueDemoRole.get_or_create(self)

# Crawler to create tables automatically
crawler = glue.CfnCrawler(
    self, id="ara-crawler", name="ara-crawler",
    role=glue_role.iam_role.role_arn, database_name="raw",
    targets={"s3Targets": [{"path": f"s3://{storage.raw_bucket.bucket_name}/{data_generator.sink_object_key}/"}]},
)

# Trigger to kick off the crawler
cfn_trigger = glue.CfnTrigger(
    self, id="MyCfnTrigger",
    actions=[{"crawlerName": crawler.name}],
    type="SCHEDULED", description="ara_crawler_trigger",
    name="min_based_trigger", schedule="cron(0/5 * * * ? *)", start_on_creation=True,
)

In addition to this library construct, the example also includes lower level constructs (commonly called L1 constructs) from the AWS CDK standard library. This shows that you can combine constructs from any CDK library interchangeably.

For use cases where customers have a need to adjust the default configurations in order to align with their organization specific requirements (e.g. data retention rules), the constructs can be changed through the class parameters as shown in this example:

storage = ara.DataLakeStorage(scope=self, id="storage", raw_archive_delay=180, clean_archive_delay=1095)

Finally, you can deploy the solution using the AWS CDK CLI from the root of the application with this command: cdk deploy. Once you deploy the solution, AWS CDK provisions the AWS resources included in the Constructs and you can log into your AWS account.

Go to the Athena console and start querying the data. The AthenaDemoSetup provides an Athena workgroup called “demo” that you can select to start querying the BatchReplayer data very quickly. Data is stored in the DataLakeStorage and registered in the DataLakeCatalog. Here is an example of an Athena query accessing the customer data from the BatchReplayer:

Accelerate the implementation

Earlier in the post we pointed out that the library simplifies and accelerates the development process. First, writing Python code is more appealing than writing CloudFormation markup, whether in JSON or YAML. Second, the CloudFormation template generated by the AWS CDK for the data lake example is 16 times more verbose than the Python script.

❯ cdk synth | wc -w
2483

❯ wc -w ara_demo/ara_demo_stack.py
154

Demonstrating end-to-end examples with reference architectures

The AWS native reference architecture is the first reference architecture available. It explains the journey of a fictitious company, MyStore Inc., as it implements its data platform solution with AWS products and services. Deploying the AWS native reference architecture demonstrates a fully working example of a data platform, from data ingestion to business analysis. AWS customers can learn from it, see analytics solutions in action, and play with a retail dataset and business analysis.

More reference architectures will be added to this project on GitHub later.

Business Story

The AWS native reference architecture simulates a retail company called MyStore Inc. that is building a new analytics platform on top of AWS products. This example shows how retail data can be ingested, processed, and analyzed in streaming and batch processes to provide business insights like sales analysis. The platform is built on top of the CDK constructs from the core library to minimize development effort and to inherit AWS best practices.

Here is the architecture deployed by the AWS native reference architecture:

The platform is implemented with purpose-built modules. They are decoupled and can be independently provisioned, but they still integrate with each other. Thanks to the core library constructs, MyStore’s analytics platform is composed of the following modules:

Data Lake foundations: This mandatory module (based on the DataLakeCatalog and DataLakeStorage core constructs) is the core of the analytics platform. It contains the data lake storage and the associated metadata for both batch and streaming data. The data lake is organized into multiple Amazon S3 buckets representing different versions of the data: (a) the raw layer contains the data coming from the data sources in their raw format, (b) the cleaned layer contains the raw data after it has been cleaned and parsed into a consumable schema, and (c) the curated layer contains refactored data based on business requirements.

Batch analytics: This module is in charge of ingesting and processing data from the Stores channel generated by legacy systems in batch mode. Data is then exposed to other modules for downstream consumption. The data preparation process leverages various features of AWS Glue, a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development via the Apache Spark framework. The orchestration of the preparation is handled with AWS Glue Workflows, which allow managing and monitoring the execution of Extract, Transform, and Load (ETL) activities involving multiple crawlers, jobs, and triggers. Metadata management is implemented via AWS Glue Crawlers, a serverless process that crawls data sources and sinks to extract metadata, including schemas, statistics, and partitions, and saves it in the AWS Glue Data Catalog.
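
As a rough illustration of this orchestration (this is a sketch, not the reference architecture’s actual code), the crawler from the earlier snippet could be grouped into an AWS Glue workflow using the same L1 constructs; the workflow and trigger names below are illustrative, and the snippet reuses the glue module and crawler variable defined earlier.

# Workflow that groups crawlers, jobs, and triggers so their runs can be
# managed and monitored as a single ETL execution (names are illustrative)
workflow = glue.CfnWorkflow(
    self,
    id="ara-workflow",
    name="ara-workflow",
    description="Orchestrates the batch ingestion and metadata extraction steps",
)

# An on-demand trigger attached to the workflow that starts the crawler created earlier
workflow_trigger = glue.CfnTrigger(
    self,
    id="WorkflowCrawlerTrigger",
    name="workflow_crawler_trigger",
    type="ON_DEMAND",
    workflow_name=workflow.name,
    actions=[{"crawlerName": crawler.name}],
)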

Streaming analytics: This module ingests and processes real-time data from the Web channel generated by cloud-native systems. The solution minimizes data analysis latency, but it also feeds the data lake for downstream consumption.

Data Warehouse: This module ingests data from the data lake to support reporting, dashboarding, and ad hoc querying capabilities. The module uses an Extract, Load, and Transform (ELT) process to transform the data coming from the Data Lake foundations module. Here are the steps that outline the data pipeline from the data lake into the data warehouse: 1. An AWS Glue Workflow reads CSV files from the Raw layer of the data lake and writes them to the Clean layer as Parquet files. 2. Stored procedures in Amazon Redshift’s stg_mystore schema extract data from the Clean layer of the data lake using Amazon Redshift Spectrum. 3. The stored procedures then transform and load the data into a star schema model.
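
To make steps 2 and 3 more concrete, here is a minimal, hypothetical sketch that calls such a stored procedure through the Amazon Redshift Data API with boto3. The stg_mystore schema comes from the description above, but the cluster identifier, database name, secret ARN, and procedure name are placeholders, not the actual names used by the reference architecture.

import boto3

redshift_data = boto3.client("redshift-data")

# Call a stored procedure in the stg_mystore schema that extracts data from the
# Clean layer (via Redshift Spectrum) and loads it into the star schema.
response = redshift_data.execute_statement(
    ClusterIdentifier="mystore-dwh",          # placeholder cluster identifier
    Database="mystore",                       # placeholder database name
    SecretArn="arn:aws:secretsmanager:...",   # placeholder credentials secret
    Sql="CALL stg_mystore.sp_load_sales();",  # placeholder procedure name
)

# The statement id can be used with describe_statement() to track completion
print(response["Id"])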

Data Visualization: This module provides dashboarding capabilities to business users, such as data analysts, on top of the Data Warehouse module, and it also provides data exploration on top of the Data Lake foundations module. It is implemented with Amazon QuickSight, a scalable, serverless, embeddable, and machine learning-powered business intelligence tool. Amazon QuickSight is connected to the data lake via Amazon Athena and to the data warehouse via Amazon Redshift using direct query mode, as opposed to the caching mode with SPICE.

Project Materials

The AWS native reference architecture provides both code and documentation about MyStore’s analytics platform:

Documentation is available on GitHub and comes in two different parts:

The high-level design describes the overall data platform implemented by MyStore and the different components involved. This is the recommended entry point to discover the solution.
The analytics solutions provide fine-grained answers to the challenges MyStore met during the project. These technical patterns can help you choose the right approach for common challenges in analytics.

The code is publicly available here and can be reused as an example for other analytics platform implementations. The code can be deployed in an AWS account by following the getting started guide.

Conclusion

In this blog post, we introduced new AWS CDK content available for customers and partners to easily implement AWS analytics solutions with the AWS Analytics Reference Architecture. The core library provides reusable building blocks with best practices to accelerate the development life cycle on AWS and the reference architecture demonstrates running examples with end-to-end integration in a business context.

Because of its reusable nature, this project will be the foundation for a lot of additional content. We plan to extend its technical scope with constructs and reference architectures for a data mesh. We’ll also expand the business scope with industry-focused examples. In a future blog post, we will go deeper into the constructs related to Amazon EMR Studio and Amazon EMR on EKS to demonstrate how customers can easily bootstrap an efficient data platform based on Amazon EMR Spark and notebooks.
