Right-size your Kubernetes Applications Using Open Source Goldilocks for Cost Optimization

In the last few years, as companies have modernized their business applications, many have moved to microservices-based architectures using containers on Kubernetes. A lot of the initial focus was on designing and building new cloud native architectures to support the applications. As environments have grown, we’ve seen a shift in focus to optimizing resource allocation and right-sizing workloads to reduce costs.

In this blog post we will share guidance on how to optimize resource allocation and right-size applications in Kubernetes environments using Goldilocks. We’ll walk through how to install Goldilocks as well as a sample application to view the suggested resource recommendations. This applies to all Kubernetes applications, including those running on Amazon Elastic Kubernetes Service (Amazon EKS), that are deployed with managed node groups, self-managed node groups, and AWS Fargate.

Right-sizing applications on Kubernetes

In Kubernetes, resource right-sizing is done through setting resource specifications in the application manifest. These settings directly impact:

Performance — Kubernetes applications running on the same node will arbitrarily compete for resources without proper resource specifications. This can adversely impact application performance.
Cost Optimization — Applications deployed with oversized resource specifications will result in increased costs and underutilized infrastructure.
Autoscaling — The Kubernetes Cluster Autoscaler and the Horizontal Pod Autoscaler require resource specifications to function.

The most common resource specifications in Kubernetes are for CPU and memory requests and limits.

Requests and Limits

Containerized applications are deployed on Kubernetes as Pods. CPU and memory requests and limits are an optional part of the Pod definition. CPU is specified in units of Kubernetes CPUs while memory is specified in bytes, usually as mebibytes (Mi).
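
For illustration, here is a minimal sketch of a Pod definition with both requests and limits set (the image and values are placeholders, not recommendations):

apiVersion: v1
kind: Pod
metadata:
  name: sample-app
spec:
  containers:
    - name: app
      image: nginx:latest
      resources:
        requests:
          cpu: 250m       # a quarter of one Kubernetes CPU
          memory: 128Mi   # mebibytes
        limits:
          cpu: 500m
          memory: 256Mi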

Requests and limits each serve different functions in Kubernetes and affect scheduling and resource enforcement differently.

Scheduling

The Kubernetes scheduler only considers requests when determining where to place Pods in your cluster. Acceptable nodes are those that have enough available resources to satisfy the Pod’s resource requests.  Limits are not considered by the scheduler.

Resource Enforcement

The container runtime on the node where your Pods are running is responsible for resource enforcement.  Both requests and limits are factors in ensuring applications have access to their required compute resources. Their effect on CPU and memory is different:

CPU — If no limits are specified, then each Pod on a node can use all the available CPU on the host. As soon as available CPU is exhausted, Pods are throttled using a Linux kernel primitive called cgroups. This is a resource-sharing primitive that ensures each Pod gets its fair share of CPU time. CPU requests determine that fair share and are weighted to give more CPU time to Pods with larger CPU requests. If a limit is specified, then CPU time will not exceed the specified limit.
Memory — Just like CPU, if no memory limits are specified, then each Pod can use all the available memory on the host. Unlike CPU, when memory is exhausted, there is no sharing mechanism. The Pod will either be terminated by the Linux Out-of-memory (OOM) killer or the kubelet will evict the Pod. The same process will happen if a Pod’s memory usage exceeds its limit.

Vertical Pod Autoscaler

So how do application owners choose the “right” values for their CPU and memory resource requests? An ideal solution is to load test the application in a development environment and measure resource usage using observability tooling. While that might make sense for your organization’s most critical applications, it’s likely not feasible for every containerized application deployed in your cluster.

Fortunately, there is a Kubernetes project that has a feature specifically designed to help provide resource recommendations — the Vertical Pod Autoscaler (VPA). VPA is a Kubernetes sub-project owned by the Autoscaling special interest group (SIG). It’s designed to automatically set Pod requests based on observed application performance. VPA collects resource usage using the Kubernetes Metrics Server by default but can be optionally configured to use Prometheus as a data source.

VPA has a recommendation engine that measures application performance and makes sizing recommendations. The VPA recommendation engine can be deployed stand-alone so VPA will not perform any autoscaling actions. It’s configured by creating a VerticalPodAutoscaler custom resource for each application and VPA updates the object’s status field with resource sizing recommendations.
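
As a sketch, a recommendation-only VerticalPodAutoscaler object looks something like this (the target Deployment name is a placeholder):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"   # recommend only; never evict or resize Pods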

Creating VerticalPodAutoscaler objects for every application in your cluster and trying to read and interpret the JSON results is challenging at scale. Goldilocks is an open source project that makes this easy.

Goldilocks

Goldilocks is an open source project from Fairwinds that is designed to help organizations get their Kubernetes application resource requests “just right”. It takes its name, very appropriately, from the well-known fairy tale Goldilocks and the Three Bears. Goldilocks builds on top of the Kubernetes Vertical Pod Autoscaler and provides:

A controller that automates the creation of VerticalPodAutoscaler objects for workloads in your cluster.
A dashboard that displays resource recommendations for all the monitored workloads.

The default configuration of Goldilocks is an opt-in model. You choose which workloads are monitored by adding the goldilocks.fairwinds.com/enabled: true label to a namespace.

Solution Overview

Let’s walk through how to install Goldilocks, including its dependencies Metrics Server and Vertical Pod Autoscaler. Then we’ll install a sample application to view the suggested resource recommendations. The diagram shown here illustrates all of the components on an Amazon EKS cluster and their interactions.

The Metrics Server collects resource metrics from the Kubelet running on worker nodes and exposes them through Metrics API for use by the Vertical Pod Autoscaler. The Goldilocks controller watches for namespaces with the goldilocks.fairwinds.com/enabled: true label and creates VerticalPodAutoscaler objects for each workload in those namespaces.

In this blog post, we will create a namespace called javajmx-sample and a Tomcat deployment. We will label this namespace in order to get recommendations from Goldilocks. As soon as we label the namespace, we will see a VPA object called goldilocks-tomcat-example created.

Prerequisites

You will need the following to complete the steps in this post:

AWS Command Line Interface (AWS CLI) version 2
kubectl
helm
If you don’t have an Amazon EKS cluster, you can create one using eksctl, as shown in the example below this list.
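
For example, the following single eksctl command creates a cluster with a managed node group (the cluster name and Region below are placeholders):

eksctl create cluster --name goldilocks-demo --region us-east-1 --managed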

Step 1: Deploying the Metrics Server

In this step, we will deploy the Metrics Server, which provides the resource metrics used by the Vertical Pod Autoscaler.

helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server

helm upgrade --install metrics-server metrics-server/metrics-server

Let’s verify the status of the metrics-server. Once it is successfully deployed, you should be able to see the resource utilization of the Pods within a minute or so:

kubectl top pods  -n kube-system

NAME                     CPU(cores)   MEMORY(bytes)  
aws-node-czlb8           2m           35Mi            
aws-node-fs22v           3m           35Mi            
aws-node-nl4js           2m           60Mi            
aws-node-vth4m           2m           59Mi            
coredns-d5b9bfc4-lbhb7   4m           13Mi            
coredns-d5b9bfc4-ngtf9   4m           14Mi            
kube-proxy-5gq76         1m           12Mi            
kube-proxy-mvp6g         1m           12Mi            
kube-proxy-vxpw9         1m           33Mi            
kube-proxy-zsfs4         1m           34Mi  

Step 2: Enable namespaces that need resource recommendations from Goldilocks

We will deploy sample workloads in the javajmx-sample namespace and get resource recommendations for the applications running in it. Let’s go ahead and create the namespace and label it:

kubectl create ns javajmx-sample
kubectl label ns javajmx-sample goldilocks.fairwinds.com/enabled=true

To ensure the label was applied successfully, run describe on the javajmx-sample namespace:

kubectl describe ns javajmx-sample

Name:         javajmx-sample
Labels:       goldilocks.fairwinds.com/enabled=true
              kubernetes.io/metadata.name=javajmx-sample
Annotations:  <none>
Status:       Active

No resource quota.

No LimitRange resource.

Step 3: Deploy Goldilocks

We will use a Helm chart to deploy Goldilocks. The deployment creates three objects:

goldilocks-controller: responsible for creating the VPA objects for workloads whose namespace is enabled for Goldilocks recommendations

goldilocks-vpa-recommender: responsible for providing the resource recommendations for the workloads

goldilocks-dashboard: summarizes the resource recommendations for the workloads and also provides the YAML manifest for implementing each recommendation

To deploy Goldilocks, run the following helm commands:

helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm upgrade --install goldilocks fairwinds-stable/goldilocks --namespace goldilocks --create-namespace --set vpa.enabled=true

Now, we will use kubectl to verify that the deployment was successful:

kubectl get pods -n goldilocks

NAME                                          READY   STATUS    RESTARTS   AGE
goldilocks-controller-7bc5788596-q752s        1/1     Running   0          18h
goldilocks-dashboard-7ffff8966b-dphmj         1/1     Running   0          18h
goldilocks-dashboard-7ffff8966b-s2dgf         1/1     Running   0          18h
goldilocks-vpa-recommender-5ddf6dcd66-njgt4   1/1     Running   0          18h

Step 4: Deploy the sample application

In this step, we will deploy a sample application in the javajmx-sample namespace to get recommendations from Goldilocks. The tomcat-example application is initially provisioned with CPU and memory requests of 100m and 180Mi respectively, and limits of 300m CPU and 300Mi memory.

kubectl apply -f https://raw.githubusercontent.com/aws-observability/aws-o11y-recipes/main/sandbox/javajmx/example/sample-javajmx-app.yaml

nht-admin:~/environment $ kubectl get pods -n javajmx-sample
NAME                              READY   STATUS    RESTARTS   AGE
tomcat-bad-traffic-generator      1/1     Running   0          127m
tomcat-example-5c874c8b8b-zt2tv   1/1     Running   0          127m
tomcat-traffic-generator          1/1     Running   0          127m

As mentioned earlier, Goldilocks creates VPAs for each deployment in a Goldilocks-enabled namespace. Using the kubectl command, we can verify that a VPA named goldilocks-tomcat-example was created in the javajmx-sample namespace:

nht-admin:~/environment $ kubectl get vpa -n javajmx-sample
NAME                        MODE   CPU   MEM         PROVIDED   AGE
goldilocks-tomcat-example   Off    15m   109814751   True       127m
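
The summary columns above come from the VPA’s status field. To inspect the full recommendation, including the lower and upper bounds, you can read the object directly:

kubectl get vpa goldilocks-tomcat-example -n javajmx-sample -o yaml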

Step 5: Review the Goldilocks recommendation dashboard

The goldilocks-dashboard service exposes the dashboard on port 8080, and we can access it to get the resource recommendations. Run this kubectl command to access the dashboard:

kubectl -n goldilocks port-forward svc/goldilocks-dashboard 8080:80

We can now open a browser to http://localhost:8080 to display the Goldilocks dashboard.

Let’s analyze the javajmx-sample namespace to see the recommendations provided by Goldilocks. We should be able to see the recommendations for the goldilocks-tomcat-example deployment.

Here the screen shows the request and limit recommendations for the javajmx-sample workload. The Current column under each Quality of Service (QoS) class shows the currently configured CPU and memory requests and limits. The Guaranteed and Burstable columns show the recommended CPU and memory requests and limits for the respective QoS class.

We can clearly see that we have overprovisioned the resources, and Goldilocks has made recommendations to optimize the CPU and memory requests. For the Guaranteed QoS class, the recommended CPU request and limit are both 15m, compared to the current settings of 100m and 300m. The recommended memory request and limit are both 105M, compared to the current settings of 180Mi and 300Mi.

Notice that the recommendations are available for two different Quality of Service (QoS) classes: Guaranteed and Burstable. Kubernetes provides different levels of Quality of Service to pods depending on what they request and what limits are set for them. Pods that need to stay up and perform consistently can request guaranteed resources, while pods with less exacting requirements can use resources with weaker or no guarantees.

Guaranteed QoS pods are considered top priority and are guaranteed not to be killed until they exceed their limits. If limits and optionally requests (not equal to 0) are set for all resources across all containers, and limits and requests are equal, then the pod is classified as Guaranteed.

Burstable QoS pods have some form of minimal resource guarantee, but can use more resources when available. Under system memory pressure, these containers are more likely to be killed once they exceed their requests, provided no Best-Effort pods exist. If requests and optionally limits (not equal to 0) are set for one or more resources across one or more containers, and they are not equal, then the pod is classified as Burstable.

To follow the recommended resource specification, customers can simply copy the respective manifest file for the QoS class they are interested in and deploy the workloads, which will then be right-sized and optimized.

For example, if we decide to apply the recommendations for the Guaranteed QoS, we could copy the YAML from the dashboard as shown here and apply it to the deployment object:
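
Based on the recommendations above, the copied snippet for the container’s resources block would look roughly like this:

resources:
  limits:
    cpu: 15m
    memory: 105Mi
  requests:
    cpu: 15m
    memory: 105Mi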

Let’s run the kubectl edit command on the deployment to apply the recommendations:

kubectl edit deployment tomcat-example -n javajmx-sample

The resources section in the container spec shows that we have successfully applied the recommended requests and limits for CPU and memory.

Once we apply the recommendations, the pods restart and come online with the updated resource configuration. Let’s verify this by running the kubectl describe command on the tomcat-example deployment:

kubectl describe deployment tomcat-example -n javajmx-sample

The output should look like the following:

Name:                   tomcat-example
Namespace:              javajmx-sample
CreationTimestamp:      Mon, 06 Feb 2023 17:41:38 +0000
Labels:                 <none>
Annotations:            deployment.kubernetes.io/revision: 2
Selector:               app=tomcat-example-pods
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:  app=tomcat-example-pods
  Containers:
   tomcat-example-pod:
    Image:       public.ecr.aws/u6p4l7a1/sample-java-jmx-app:latest
    Ports:       8080/TCP, 9404/TCP
    Host Ports:  0/TCP, 0/TCP
    Limits:
      cpu:     15m
      memory:  105Mi
    Requests:
      cpu:        15m
      memory:     105Mi
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>

Cleanup

To delete the deployments and sample workloads we created in the blog, execute the following commands:

helm delete metrics-server
helm delete goldilocks -n goldilocks
kubectl delete -f https://raw.githubusercontent.com/aws-observability/aws-o11y-recipes/main/sandbox/javajmx/example/sample-javajmx-app.yaml

Conclusion

This post demonstrated how Goldilocks can be used to efficiently right-size the resource requests for Kubernetes applications. Customers in modernization efforts often have minimal time to determine the resource requirements for their applications, which usually involves a complex process of reviewing monitoring dashboards. By adopting the recommendations from Goldilocks, customers can shorten the time to market for their applications and optimize their Amazon EKS costs.

Further reading

EKS Best practices
Blog: Using Prometheus to Avoid Disasters with Kubernetes CPU Limits

Goldilocks project


Develop a serverless application in Python using Amazon CodeWhisperer

While writing code to develop applications, developers must keep up with multiple programming languages, frameworks, software libraries, and popular cloud services from providers such as AWS. Even though developers can find code snippets in developer communities to learn from or repurpose, manually searching for snippets with an exact or even similar use case is a distracting and time-consuming process. They have to do all of this while making sure that they’re following the correct programming syntax and best coding practices.

Amazon CodeWhisperer, a machine learning (ML) powered coding assistant for developers, lets you overcome those challenges. Developers can simply write a comment that outlines a specific task in plain English, such as “upload a file to S3.” Based on this, CodeWhisperer automatically determines which cloud services and public libraries are best suited for the specified task, creates the specific code on the fly, and then recommends the generated code snippets directly in the IDE. And this isn’t about copy-pasting code from the web, but generating code based on the context of your file, such as which libraries and versions you have, as well as the existing code. Moreover, CodeWhisperer seamlessly integrates with your Visual Studio Code and JetBrains IDEs so that you can stay focused and never leave the development environment. At the time of this writing, CodeWhisperer supports Java, Python, JavaScript, C#, and TypeScript.

In this post, we’ll build a full-fledged, event-driven, serverless application for image recognition. With the aid of CodeWhisperer, you’ll write your own code that runs on top of AWS Lambda to interact with Amazon Rekognition, Amazon DynamoDB, Amazon Simple Notification Service (Amazon SNS), Amazon Simple Queue Service (Amazon SQS), Amazon Simple Storage Service (Amazon S3), and third-party HTTP APIs to perform image recognition. The users of the application can interact with it by either sending the URL of an image for processing, or by listing the images and the objects present on each image.

Solution overview

To make our application easier to digest, we’ll split it into three segments:

Image download – The user provides an image URL to the first API. A Lambda function downloads the image from the URL and stores it on an S3 bucket. Amazon S3 automatically sends a notification to an Amazon SNS topic informing that a new image is ready for processing. Amazon SNS then delivers the message to an Amazon SQS queue.

Image recognition – A second Lambda function handles the orchestration and processing of the image. It receives the message from the Amazon SQS queue, sends the image for Amazon Rekognition to process, stores the recognition results on a DynamoDB table, and sends a message with those results as JSON to a second Amazon SNS topic used in section three. A user can list the images and the objects present on each image by calling a second API which queries the DynamoDB table.

3rd-party integration – The last Lambda function reads the message from the second Amazon SQS queue. At this point, the Lambda function must deliver that message to a fictitious external e-mail server HTTP API that supports only XML payloads. Because of that, the Lambda function converts the JSON message to XML. Lastly, the function sends the XML object via HTTP POST to the e-mail server.

The following diagram depicts the architecture of our application:

Figure 1. Architecture diagram depicting the application architecture. It contains the service icons with the component explained on the text above.

Prerequisites

Before getting started, you must have the following prerequisites:

An AWS account and an Administrator user

Install and authenticate the AWS CLI. You can authenticate with an AWS Identity and Access Management (IAM) user or an AWS Security Token Service (AWS STS) token.
Install Python 3.7 or later.
Install Node Package Manager (npm).

Install the AWS CDK Toolkit.
Install the AWS Toolkit for VS Code or for JetBrains.

Install Git.

Configure environment

We already created the scaffolding for the application that we’ll build, which you can find in this Git repository. This application is represented by a CDK app that describes the infrastructure according to the architecture diagram above. However, the actual business logic of the application isn’t provided; you’ll implement it using CodeWhisperer. This means that we have already declared, using AWS CDK components, resources such as the API Gateway endpoints, the DynamoDB table, and the topics and queues. If you’re new to AWS CDK, then we encourage you to go through the CDK workshop later on.

Deploying AWS CDK apps into an AWS environment (a combination of an AWS account and region) requires that you provision resources that the AWS CDK needs to perform the deployment. These resources include an Amazon S3 bucket for storing files and IAM roles that grant permissions needed to perform deployments. The process of provisioning these initial resources is called bootstrapping. The required resources are defined in an AWS CloudFormation stack, called the bootstrap stack, which is usually named CDKToolkit. Like any CloudFormation stack, it appears in the CloudFormation console once it has been deployed.

After cloning the repository, let’s deploy the application (still without the business logic, which we’ll implement later on using CodeWhisperer). For this post, we’ll implement the application in Python. Therefore, make sure that you’re under the python directory. Then, use the cdk bootstrap command to bootstrap an AWS environment for AWS CDK. Replace {AWS_ACCOUNT_ID} and {AWS_REGION} with corresponding values first:

cdk bootstrap aws://{AWS_ACCOUNT_ID}/{AWS_REGION}

For more information about bootstrapping, refer to the documentation.

The last step to prepare your environment is to enable CodeWhisperer on your IDE. See Setting up CodeWhisperer for VS Code or Setting up Amazon CodeWhisperer for JetBrains to learn how to do that, depending on which IDE you’re using.

Image download

Let’s get started by implementing the first Lambda function, which is responsible for downloading an image from the provided URL and storing that image in an S3 bucket. Open the get_save_image.py file from the python/api/runtime/ directory. This file contains an empty Lambda function handler and the input parameters needed to integrate this Lambda function:

url is the URL of the input image provided by the user,

name is the name of the image provided by the user, and

S3_BUCKET is the S3 bucket name defined by our application infrastructure.

Write a comment in natural language that describes the required functionality, for example:

# Function to get a file from url

To trigger CodeWhisperer, hit the Enter key after entering the comment and wait for a code suggestion. If you want to manually trigger CodeWhisperer, then you can hit Option + C on macOS or Alt + C on Windows. You can browse through multiple suggestions (if available) with the arrow keys. Accept a code suggestion by pressing Tab. Discard a suggestion by pressing Esc or typing a character.

For more information on how to work with CodeWhisperer, see Working with CodeWhisperer in VS Code or Working with Amazon CodeWhisperer from JetBrains.

You should get a suggested implementation of a function that downloads a file using a specified URL. The following image shows an example of the code snippet that CodeWhisperer suggests:

Figure 2. Screenshot of the code generated by CodeWhisperer on VS Code. It has a function called get_file_from_url with the implementation suggestion to download a file using the requests lib.
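
Your result will differ, but the suggestion will be along these lines. This is a sketch, not CodeWhisperer’s exact output; the function name follows the screenshot described above:

import requests

# Function to get a file from url
def get_file_from_url(url):
    # Download the file and return its raw bytes
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.content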

Be aware that CodeWhisperer uses artificial intelligence (AI) to provide code recommendations, and that this is non-deterministic. The result you get in your IDE may be different from the one on the image above. If needed, fine-tune the code, as CodeWhisperer generates the core logic, but you might want to customize the details depending on your requirements.

Let’s try another action, this time to upload the image to an S3 bucket:

# Function to upload image to S3

As a result, CodeWhisperer generates a code snippet similar to the following one:

Figure 3. Screenshot of the code generated by CodeWhisperer on VS Code. It has a function called upload_image with the implementation suggestion to download a file using the requests lib and upload it to S3 using the S3 client.
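
Again as a sketch of what such a suggestion might look like (the function and variable names are illustrative, not CodeWhisperer’s exact output):

import boto3

s3_client = boto3.client("s3")

# Function to upload image to S3
def upload_image(bucket_name, key, image_bytes):
    # Store the image bytes under the given key in the target bucket
    s3_client.put_object(Bucket=bucket_name, Key=key, Body=image_bytes)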

Now that you have the functions with the functionalities to download an image from the web and upload it to an S3 bucket, you can wire up both functions in the Lambda handler function by calling each function with the correct inputs.

Image recognition

Now let’s implement the Lambda function responsible for sending the image to Amazon Rekognition for processing, storing the results in a DynamoDB table, and sending a message with those results as JSON to a second Amazon SNS topic. Open the image_recognition.py file from the python/recognition/runtime/ directory. This file contains an empty Lambda function handler and the input parameters needed to integrate this Lambda function:

queue_url is the URL of the Amazon SQS queue to which this Lambda function is subscribed,

table_name is the name of the DynamoDB table, and

topic_arn is the ARN of the Amazon SNS topic to which this Lambda function publishes.

Using CodeWhisperer, implement the business logic of the next Lambda function as you did in the previous section. For example, to detect the labels from an image using Amazon Rekognition, write the following comment:

# Detect labels from image with Rekognition

And as a result, CodeWhisperer should give you a code snippet similar to the one in the following image:

Figure 4. Screenshot of the code generated by CodeWhisperer on VS Code. It has a function called detect_labels with the implementation suggestion to use the Rekognition SDK to detect labels on the given image.
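
A sketch of what such a suggestion can look like (the label count and names are illustrative assumptions):

import boto3

rekognition_client = boto3.client("rekognition")

# Detect labels from image with Rekognition
def detect_labels(bucket_name, key):
    # Ask Rekognition to label an image already stored in S3
    response = rekognition_client.detect_labels(
        Image={"S3Object": {"Bucket": bucket_name, "Name": key}},
        MaxLabels=10,
    )
    return response["Labels"]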

You can continue generating the other functions that you need to fully implement the business logic of your Lambda function. Here are some examples that you can use:

# Save labels to DynamoDB

# Publish item to SNS

# Delete message from SQS

Following the same approach, open the list_images.py file from the python/recognition/runtime/ directory to implement the logic to list all of the labels from the DynamoDB table. As you did previously, type a comment in plain English:

# Function to list all items from a DynamoDB table

Other frequently used code

Interacting with AWS isn’t the only way that you can leverage CodeWhisperer. You can use it to implement repetitive tasks, such as creating unit tests and converting message formats, or to implement algorithms like sorting, string matching, and parsing. The last Lambda function that we’ll implement as part of this post converts a JSON payload received from Amazon SQS to XML. Then, we’ll POST this XML to an HTTP endpoint.

Open the send_email.py file from the python/integration/runtime/ directory. This file contains an empty Lambda function handler. An event is a JSON-formatted document that contains data for a Lambda function to process. Type a comment with your intent to get the code snippet:

# Transform json to xml

As CodeWhisperer uses the context of your files to generate code, depending on the imports that you have on your file, you’ll get an implementation such as the one in the following image:

Figure 5. Screenshot of the code generated by CodeWhisperer on VS Code. It has a function called json_to_xml with the implementation suggestion to transform JSON payload into XML payload.
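
For instance, with only the standard library imported, the suggestion could resemble this sketch (it assumes a flat JSON object; the element names are illustrative):

import json
import xml.etree.ElementTree as ET

# Transform json to xml
def json_to_xml(json_string):
    # Build one XML element per top-level JSON field
    data = json.loads(json_string)
    root = ET.Element("message")
    for field, value in data.items():
        child = ET.SubElement(root, field)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")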

Repeat the same process with a comment such as # Send XML string with HTTP POST to get the last function implementation. Note that the email server isn’t part of this implementation. You can mock it, or simply ignore this HTTP POST step. Lastly, wire up both functions in the Lambda handler function by calling each function with the correct inputs.

Deploy and test the application

To deploy the application, run the command cdk deploy --all. You should get a confirmation message, and after a few minutes your application will be up and running on your AWS account. As outputs, the APIStack and RekognitionStack will print the API Gateway endpoint URLs. It will look similar to this example:

Outputs:

APIStack.RESTAPIEndpoint01234567 = https://examp1eid0.execute-
api.{your-region}.amazonaws.com/prod/

The first endpoint expects two string parameters: url (the image file URL to download) and name (the target file name that will be stored on the S3 bucket). Use any image URL you like, but remember that you must encode an image URL before passing it as a query string parameter to escape the special characters. Use an online URL encoder of your choice for that. Then, use the curl command to invoke the API Gateway endpoint:

curl -X GET 'https://examp1eid0.execute-api.eu-east-2.amazonaws.com/prod?url={encoded-image-URL}&name={file-name}'

Replace {encoded-image-URL} and {file-name} with the corresponding values. Also, make sure that you use the correct API endpoint that you’ve noted from the AWS CDK deploy command output as mentioned above.

It will take a few seconds for the processing to happen in the background. Once it’s ready, see what has been stored in the DynamoDB table by invoking the List Images API (make sure that you use the correct URL from the output of your deployed AWS CDK stack):

curl -X GET 'https://examp1eid7.execute-api.eu-east-2.amazonaws.com/prod'

After you’re done, to avoid unexpected charges to your account, make sure that you clean up your AWS CDK stacks. Use the cdk destroy command to delete the stacks.

Conclusion

In this post, we’ve seen how to get a significant productivity boost with the help of ML. With that, as a developer, you can stay focused on your IDE and reduce the time that you spend searching online for code snippets that are relevant for your use case. Writing comments in natural language, you get context-based snippets to implement full-fledged applications. In addition, CodeWhisperer comes with a mechanism called reference tracker, which detects whether a code recommendation might be similar to particular CodeWhisperer training data. The reference tracker lets you easily find and review that reference code and see how it’s used in the context of another project. Lastly, CodeWhisperer provides the ability to run scans on your code (generated by CodeWhisperer as well as written by you) to detect security vulnerabilities.

During the preview period, CodeWhisperer is available to all developers across the world for free. Get started with the free preview on JetBrains, VS Code or AWS Cloud9.

About the author:

Rafael Ramos

Rafael is a Solutions Architect at AWS, where he helps ISVs on their journey to the cloud. He spent over 13 years working as a software developer, and is passionate about DevOps and serverless. Outside of work, he enjoys playing tabletop RPG, cooking and running marathons.

Caroline Gluck

Caroline is an AWS Cloud application architect based in New York City, where she helps customers design and build cloud native data science applications. Caroline is a builder at heart, with a passion for serverless architecture and machine learning. In her spare time, she enjoys traveling, cooking, and spending time with family and friends.

Jason Varghese

Jason is a Senior Solutions Architect at AWS guiding enterprise customers on their cloud migration and modernization journeys. He has served in multiple engineering leadership roles and has over 20 years of experience architecting, designing and building scalable software solutions. Jason holds a bachelor’s degree in computer engineering from the University of Oklahoma and an MBA from the University of Central Oklahoma.

Dmitry Balabanov

Dmitry is a Solutions Architect with AWS where he focuses on building reusable assets for customers across multiple industries. With over 15 years of experience in designing, building, and maintaining applications, he still loves learning new things. When not at work, he enjoys paragliding and mountain trekking.

Caching NextJS Apps with Serverless Redis using Upstash

The modern applications we build today are sophisticated. Every time a user loads a webpage, their browser needs to download a bulk of data in order to display that page. A website may consist of millions of data points and serve hundreds of API calls. For the data to move smoothly with zero delays between server and client, we can follow many strategies. We developers want our apps to deliver the best user experience possible, and to achieve this we can employ a variety of techniques.

There are a number of ways we can address this situation. The best optimization would be to apply techniques that reduce the latency of read/write operations on the database. One of the most popular ways to optimize our API calls is by implementing a caching mechanism.

What is Caching?

Caching is the process of storing copies of files in a cache, or temporary storage location so that they can be accessed more quickly. Technically, a cache is any temporary storage location for copies of files or data, but the term is often used in reference to Internet technologies.

By Cloudflare.com

The most common example of caching is the browser cache, which stores frequently accessed website resources locally so that the browser does not have to retrieve them over the network each time they are needed. Caching can ease the performance bottlenecks of our web applications. When dealing with heavy network traffic and large API calls, this technique can be one of the best options for performance optimization.

Redis: Caching in Server-side

When we talk about caching on servers, one of the pioneering caching databases is Redis. Redis (for REmote DIctionary Server) is an open-source NoSQL in-memory key-value data store. One of the best things about Redis is that we can persist data in the database, and it stays there until we delete or flush it manually. Because it is an in-memory database, its data access operations are faster than those of any disk-based database, which makes Redis a great choice for caching.

Redis can also be used as a primary database if needed. With the help of Redis, cached data can be accessed and re-accessed as many times as needed without running the database query again. Depending on the Redis cache setup, data can stay in memory for a few minutes, a few hours, or longer. We can even set an expiration time for our cache, which we will implement in our demo application.

Redis is able to handle huge amounts of data in real time, making use of its in-memory data storage capabilities to support highly responsive database constructs. Caching with Redis allows for fewer database accesses, which helps to reduce the amount of traffic and the number of instances required, even achieving sub-millisecond latency.

We will implement Redis in our Next application and see the performance gain we can achieve.

Let’s dive into it.

Initializing our Project

Before we begin, I assume you have Node installed on your machine so that you can follow along with the steps involved. We will use Next for our project because it helps us write front-end and back-end logic with no configuration needed. We will create a starter project with the following command:

$ npx create-next-app@latest --typescript

After the command, give the project the desired name. After everything is done and the project is created for us, we can add the dependencies we need for this demo application.

$ npm i ioredis @chakra-ui/core @emotion/core @emotion/styled emotion-theming
$ npm i --save-dev @types/node @types/ioredis

The commands above install all the dependencies we will deal with in this project. We will be making use of ioredis to communicate with our Redis database and styling things up with Chakra UI.

As we are using TypeScript for our project, we also need to install the type definitions for Node and ioredis, which we did in the second command as our local dev dependencies.

Setting up Redis with Upstash

We need to connect our application to Redis. You can run Redis locally and connect to it from your application, or use a Redis cloud instance. For this project demo, we will be using Upstash Redis.

Upstash is a serverless database service for Redis. With servers/instances, you pay per hour or a fixed price; with serverless, you pay per request. This means we are not charged when the database is not in use. Upstash configures and manages the database for you.

Head over to the official Upstash website and start with the free plan. For our demo purposes, we don’t need to pay. Visit the Upstash console after creating your new account and create a new Redis serverless database with Upstash.

You can find an example of the connection string used by ioredis in the Upstash dashboard. Copy the blue overlay URL. We will use this connection string to connect to the serverless Redis instance provided in the free tier by Upstash.

import Redis from "ioredis";
export const redisConnect = new Redis(process.env.REDIS_URL);

In the snippet above, we connected our app to the database. We can now use the Redis server instance provided by Upstash inside our app.
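
Note that for this to work, the connection string you copied must be available to the app as an environment variable, for example in a .env.local file at the project root (the value below is a placeholder, not a real credential):

REDIS_URL="rediss://default:your-password@your-instance.upstash.io:6379"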

Populating static data

The application we are building might not be an exact real-world use case, but we want to see the actual performance improvement that caching with Redis can bring to our application, and how it’s done.

Here we are making a Pokemon application where users can select from a list of Pokemon and choose to see the details of each one. We will implement caching for the visited Pokemon. In other words, if users visit the same Pokemon twice, they will receive the cached result.

Let’s populate some data inside of our Pokemon options.

export const getStaticProps: GetStaticProps = async () => {
  const res = await fetch(
    'https://pokeapi.co/api/v2/pokemon?limit=200&offset=200'
  );
  const { results }: GetPokemonResults = await res.json();

  return {
    props: {
      pokemons: results,
    },
  };
};

We are making a call to our endpoint to fetch all the names of Pokemon. GetStaticProps helps us fetch data at build time. The getStaticProps() function provides the props needed for the Home component to render pages that are generated at build time, not at runtime, and are static.

const Home: NextPage<{ pokemons: Pokemons[] }> = ({ pokemons }) => {
  const [selectedPokemon, setSelectedPokemon] = useState<string>('');
  const toast = useToast();
  const router = useRouter();

  const handelSelect = (e: any) => {
    setSelectedPokemon(e.target.value);
  };

  const searchPokemon = () => {
    if (selectedPokemon === '')
      return toast({
        title: 'No pokemon selected',
        description: 'You need to select a pokemon to search.',
        status: 'error',
        duration: 3000,
        isClosable: true,
      });
    router.push(`/details/${selectedPokemon}`);
  };

  return (
    <div className={styles.container}>
      <main className={styles.main}>
        <Box my="10">
          <FormControl>
            <Select
              id="country"
              placeholder={
                selectedPokemon ? selectedPokemon : 'Select a pokemon'
              }
              onChange={handelSelect}
            >
              {pokemons.map((pokemon, index) => {
                return <option key={index}>{pokemon.name}</option>;
              })}
            </Select>
            <Button
              colorScheme="teal"
              size="md"
              ml="3"
              onClick={searchPokemon}
            >
              Search
            </Button>
          </FormControl>
        </Box>
      </main>
    </div>
  );
};

We have successfully populated some static data inside our dropdown to select some Pokemon. Let’s implement a page redirect to a dynamic route when we select a Pokemon name and click the search button.

Adding dynamic page

Creating a dynamic page inside Next is simple, as it provides a file-based routing structure that we can leverage to add our dynamic routes. Let’s create a details page for our Pokemon.

const PokemonDetail: NextPage<{ info: PokemonDetailResults }> = ({ info }) => {
  return (
    <div>
      {/* map our data here */}
    </div>
  );
};

export const getServerSideProps: GetServerSideProps = async (context) => {
  const { id } = context.query;
  const name = id as string;
  const data = await fetch(`https://pokeapi.co/api/v2/pokemon/${name}`);
  const response: PokemonDetailResults = await data.json();

  return {
    props: {
      info: response,
    },
  };
};

We made use of getServerSideProps, meaning we are using the server-side rendering provided by Next, which pre-renders the page on each request using the data returned by getServerSideProps. This comes in handy when we want to fetch data that changes often and have the page updated to show the most current data. After receiving the data, we map over it to display it on the screen.

Until now we have not really implemented a caching mechanism in our project. Each time the user visits the page, we hit the API endpoint and send them back the data they requested. Let’s move ahead and implement caching in our application.

Caching data

To implement caching, we first want to read from our Redis database. As discussed, Redis stores its data as key-value pairs. We will check whether a key is stored in Redis and feed the client the respective data. To achieve this, we will create a function that checks Redis for the key the client is requesting.

export const fetchCache = async <T>(key: string, fetchData: () => Promise<T>) => {
  const cachedData = await getKey(key);
  if (cachedData) return cachedData;
  return setValue(key, fetchData);
};

When we know the client is requesting data they have not visited yet, we will serve them a copy of the data from the origin and, behind the scenes, store a copy inside our Redis database, so that we can serve the data fast through Redis on the next request.

We will write a function that takes in a key and, if the key exists in the database, returns the parsed value to the client.

const getKey = async <T>(key: string): Promise<T | null> => {
  const result = await redisConnect.get(key);
  if (result) return JSON.parse(result);
  return null;
};

We also need a function that takes in a key and sets the new value alongside that key inside our database, but only if we don’t already have that key stored in Redis.

const setValue = async <T>(key: string, fetchData: () => Promise<T>): Promise<T> => {
  const setValue = await fetchData();
  await redisConnect.set(key, JSON.stringify(setValue));
  return setValue;
};

Until now we have written everything we need to implement caching. All that’s left is to invoke the function in our dynamic pages. Inside [id].tsx we will make a minor tweak so that we make an API call only if we don’t have the requested key in Redis.

For this to happen, we will pass a function as a parameter to our fetchCache function.

export const getServerSideProps: GetServerSideProps = async (context) => {
  const { id } = context.query;
  const name = id as string;

  const fetchData = async () => {
    const data = await fetch(`https://pokeapi.co/api/v2/pokemon/${name}`);
    const response: PokemonDetailResults = await data.json();
    return response;
  };

  const cachedData = await fetchCache(name, fetchData);

  return {
    props: {
      info: cachedData,
    },
  };
};

We made some tweaks to the code we wrote before: we imported and used the fetchCache function inside the dynamic page. This function takes the fetch function as a parameter and does the key check accordingly.

Adding expiry

The expiration policy employed by a cache is another factor that helps determine how long a cached item is retained. The expiration policy is usually assigned to the object when it is added to the cache, and it can be customized according to the type of object being cached. A common strategy involves assigning an absolute expiration time to each object when it is added to the cache. Once that time passes, the item is removed from the cache.

Let’s also use the cache expiration feature of Redis in our application. To implement this, we just need to add a parameter to our fetchCache function.

const cachedData = await fetchCache(name, fetchData, 60 * 60 * 24);

return {
  props: {
    info: cachedData,
  },
};

export const fetchCache = async (key: string, fetchData: () => Promise<unknown>, expiresIn: number) => {
  const cachedData = await getKey(key);
  if (cachedData) return cachedData;
  return setValue(key, fetchData, expiresIn);
};

const setValue = async <T>(key: string, fetchData: () => Promise<T>, expiresIn: number): Promise<T> => {
  const setValue = await fetchData();
  await redisConnect.set(key, JSON.stringify(setValue), "EX", expiresIn);
  return setValue;
};

For each key stored in our Redis database, we have added an expiry time of one day. When the set amount of time elapses, Redis automatically removes the object from the cache so that it can be refreshed by calling the API again. This really helps when we want to feed the client fresh, updated data.

Performance testing

All of these efforts were for our app’s performance and optimization. Let’s take a look at our application’s performance.

This might not be a suitable performance test for a small application, but an app serving thousands of API calls with a big data set can see a big advantage.

I will make use of the perf_hooks module to measure the time it takes our Next lambda to complete an invocation. This module is not provided by Next; it’s imported from Node. With these APIs, you can measure the time it takes individual dependencies to load, how long your app takes to initially start, and even how long individual web service API calls take. This allows you to make more informed decisions about the efficiency of specific code blocks or even algorithms.

import { performance } from "perf_hooks";

const startPerfTimer = (): number => {
  return performance.now();
};

const endPerfTimer = (): number => {
  return performance.now();
};

const calculatePerformance = (startTime: number, endTime: number): void => {
  console.log(`Response took ${endTime - startTime} milliseconds`);
};

Creating a function for a single line of code may be overkill, but it means we can reuse these functions in our application whenever needed. We will add these function calls to our application, as shown in the sketch below, and see the results: even milliseconds of latency can impact our app’s overall performance.
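
As a sketch, wrapping the cached fetch in our dynamic page could look like this:

const startTime = startPerfTimer();
const cachedData = await fetchCache(name, fetchData, 60 * 60 * 24);
const endTime = endPerfTimer();
calculatePerformance(startTime, endTime);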

In the above screenshot, we can see the milliseconds of improvement in fetching the response. This may be a small improvement in the small application we have built, but it can be a huge time and performance boost, especially when working with large datasets.

Conclusion

Data-heavy applications do need caching operations to improve response time and even reduce the cost of data volume and bandwidth. With the help of Redis, we can cut out expensive database operations, third-party API calls, and server-to-server requests by keeping a copy of previous responses in our Redis instance.

There might be some cases where we need to delegate caching to other applications or microservices, or to any form of key-value storage system that allows us to store data and use it when we need it. We chose Redis since it is open source and very popular in the industry. Redis’s other cool features include data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, HyperLogLogs, and many more.

I highly recommend you visit the Redis documentation here to gain an in-depth understanding of the other features it provides out of the box. Now we can go forth and use Redis to cache frequently queried data in our applications and gain a considerable performance boost.

Please find the code repository here.

Happy coding!


Azure Functions consumption plan naming

This post is about how to follow naming conventions for Azure Functions Consumption plans. Unlike Azure App Services, when we create an Azure Function with the Consumption plan, Azure generates a plan name that you can’t modify. If you’re following certain naming conventions, the generated name will differ from what you’re using. Here is an example.

If you choose an App Service plan or a Premium plan, there is an option to create a new plan with a name of your choice.

I was following the naming convention from Microsoft Docs. You can find more details here and here.

There is no direct way to solve this problem, but there is a workaround: create an ARM template and deploy it instead of creating the Function App directly. In the Review + Create screen, click on the Download a template for automation link. We will get a screen like this.

In this screen, choose the Deploy option. It will redirect to the Custom deployment screen, where you will get an option to create or modify the Hosting Plan Name. This is what lets you configure your Function App Consumption plan name.
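
As a trimmed sketch, the hosting plan resource in the downloaded template looks roughly like this; supplying the hostingPlanName parameter during the custom deployment is what sets your desired name (the apiVersion and property details may differ in your template):

{
  "type": "Microsoft.Web/serverfarms",
  "apiVersion": "2021-03-01",
  "name": "[parameters('hostingPlanName')]",
  "location": "[resourceGroup().location]",
  "sku": {
    "name": "Y1",
    "tier": "Dynamic"
  }
}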

This way you can create Azure Functions Consumption plans with your desired name.

Here are some resources which will help you to learn more about Azure Resource naming.

Develop your naming and tagging strategy for Azure resources
Define your naming convention

Happy Programming 🙂

Bring your own functions to Azure Static Web Apps

Azure Static Web Apps APIs are supported by two possible configurations: managed functions and bring your own functions. This post is about using your existing functions in Azure Static Web Apps. Bring your own functions is only available in the Azure Static Web Apps Standard plan. By default, Azure Static Web Apps supports only HTTP-triggered functions. Bring your own functions is recommended for use cases where we want extra triggers, like a Cosmos DB or queue trigger, or where we need to manage the functions ourselves. Once we enable this feature, we can’t use the default functions support in Azure Static Web Apps.

As the first step, we need to modify the GitHub Actions workflow file and set the value of api_location to empty (see the snippet below). Next, go to the Static Web App configuration and select the Functions menu, which will display a screen like this. (Bring your own functions requires our Static Web App to be on the Standard plan; the Free plan doesn’t support this feature.)
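
The relevant fragment of the workflow file looks something like this trimmed sketch (the job name and paths will vary by project):

jobs:
  build_and_deploy_job:
    steps:
      - uses: Azure/static-web-apps-deploy@v1
        with:
          app_location: "/"
          api_location: ""          # empty: we bring our own functions instead
          output_location: "dist"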

Click on the Link to a Function App link, which will open a screen like this.

The function app dropdown will display all the function apps in the selected subscription. Once the function app is selected, we can click on the Link button to link the function to the static web app.

Now we can access the Azure Function through the /api/ endpoint. There are some constraints associated with this approach. You can find more details about this approach, its pros and cons, and other details here: Bring your own functions to Azure Static Web Apps

Happy Programming 🙂