Using Open Source Cedar to Write and Enforce Custom Authorization Policies

Cedar is an open source language and software development kit (SDK) for writing and enforcing authorization policies for your applications. You can use Cedar to control access to resources such as photos in a photo-sharing app, compute nodes in a micro-services cluster, or components in a workflow automation system. You specify fine-grained permissions as Cedar policies, and your application authorizes access requests by calling the Cedar SDK’s authorization engine. Cedar has a simple and expressive syntax that supports common authorization paradigms, including both role-based access control (RBAC) and attribute-based access control (ABAC). Because Cedar policies are separate from application code, they can be independently authored, analyzed, and audited, and even shared among multiple applications.

In this blog post, we introduce Cedar and the SDK using an example application, TinyTodo, whose users and teams can organize, track, and share their todo lists. We present examples of TinyTodo permissions as Cedar policies and how TinyTodo uses the Cedar authorization engine to ensure that only intended users are granted access. A more detailed version of this post is included with the TinyTodo code.

TinyTodo

TinyTodo allows individuals, called Users, and groups, called Teams, to organize, track, and share their todo lists. Users create Lists which they can populate with tasks. As tasks are completed, they can be checked off the list.

TinyTodo Permissions

We don’t want to allow TinyTodo users to see or make changes to just any task list. TinyTodo uses Cedar to control who has access to what. A List’s creator, called its owner, can share the list with other Users or Teams. Owners can share lists in two different modes: reader and editor. A reader can get details of a List and the tasks inside it. An editor can do those things as well, but may also add new tasks, as well as edit, (un)check, and remove existing tasks.

We specify and enforce these access permissions using Cedar. Here is one of TinyTodo’s Cedar policies.

// policy 1: A User can perform any action on a List they own
permit(principal, action, resource)
when {
    resource has owner && resource.owner == principal
};

This policy states that any principal (a TinyTodo User) can perform any action on any resource (a TinyTodo List) as long as the resource has an owner attribute that matches the requesting principal.

Here’s another TinyTodo Cedar policy.

// policy 2: A User can see a List if they are either a reader or editor
permit (
    principal,
    action == Action::"GetList",
    resource
)
when {
    principal in resource.readers || principal in resource.editors
};

This policy states that any principal can read the contents of a task list (Action::"GetList") so long as they are in either the list’s readers group or its editors group.

Cedar’s authorizer enforces default deny: A request is authorized only if a specific permit policy grants it.

The full set of policies can be found in the TinyTodo file policies.cedar (discussed below). To learn more about Cedar’s syntax and capabilities, check out the Cedar online tutorial at https://www.cedarpolicy.com/.

Building TinyTodo

To build TinyTodo you need to install Rust and Python3, and the Python3 requests module. Download and build the TinyTodo code by doing the following:

> git clone https://github.com/cedar-policy/tinytodo
…downloading messages here
> cd tinytodo
> cargo build
…build messages here

The cargo build command will automatically download and build the Cedar Rust packages cedar-policy-core, cedar-policy-validator, and others, from Rust’s standard package registry, crates.io, and build the TinyTodo server, tiny-todo-server. The TinyTodo CLI is a Python script, tinytodo.py, which interacts with the server. The basic architecture is shown in Figure 1.

Figure 1: TinyTodo application architecture

Running TinyTodo

Let’s run TinyTodo. To begin, we start the server, assume the identity of user andrew, create a new todo list called Cedar blog post, add two tasks to that list, and then complete one of the tasks.

> python -i tinytodo.py
>>> start_server()
TinyTodo server started on port 8080
>>> set_user(andrew)
User is now andrew
>>> get_lists()
No lists for andrew
>>> create_list("Cedar blog post")
Created list ID 0
>>> get_list(0)
=== Cedar blog post ===
List ID: 0
Owner: User::"andrew"
Tasks:
>>> create_task(0,"Draft the post")
Created task on list ID 0
>>> create_task(0,"Revise and polish")
Created task on list ID 0
>>> get_list(0)
=== Cedar blog post ===
List ID: 0
Owner: User::"andrew"
Tasks:
1. [ ] Draft the post
2. [ ] Revise and polish
>>> toggle_task(0,1)
Toggled task on list ID 0
>>> get_list(0)
=== Cedar blog post ===
List ID: 0
Owner: User::"andrew"
Tasks:
1. [X] Draft the post
2. [ ] Revise and polish

Figure 2: Users and Teams in TinyTodo

The get_list, create_task, and toggle_task commands are all authorized by the Cedar Policy 1 we saw above: since andrew is the owner of List ID 0, he is allowed to carry out any action on it.

Now, continuing as user andrew, we share the list with team interns as a reader. TinyTodo is configured so that the relationship between users and teams is as shown in Figure 2. We switch the user identity to aaron, list the tasks, and attempt to complete another task, but the attempt is denied because aaron is only allowed to view the list (since he’s a member of interns), not edit it. Finally, we switch to user kesha and attempt to view the list, but the attempt is not allowed (interns is a member of temp, but not the reverse).

>>> share_list(0,interns,read_only=True)
Shared list ID 0 with interns as reader
>>> set_user(aaron)
User is now aaron
>>> get_list(0)
=== Cedar blog post ===
List ID: 0
Owner: User::"andrew"
Tasks:
1. [X] Draft the post
2. [ ] Revise and polish
>>> toggle_task(0,2)
Access denied. User aaron is not authorized to Toggle Task on [0, 2]
>>> set_user(kesha)
User is now kesha
>>> get_list(0)
Access denied. User kesha is not authorized to Get List on [0]
>>> stop_server()
TinyTodo server stopped on port 8080

Here, aaron’s get_list command is authorized by the Cedar Policy 2 we saw above, since aaron is a member of the Team interns, which andrew made a reader of List 0. aaron’s toggle_task and kesha’s get_list commands are both denied because no specific policy exists that authorizes them.

Extending TinyTodo’s Policies with Administrator Privileges

We can change the policies with no updates to the application code because they are defined and maintained independently. To see this, add the following policy to the end of the policies.cedar file:

permit(
    principal in Team::"admin",
    action,
    resource in Application::"TinyTodo");

This policy states that any user who is a member of Team::"admin" is able to carry out any action on any List (all of which are part of the Application::"TinyTodo" group). Since user emina is defined to be a member of Team::"admin" (see Figure 2), if we restart TinyTodo to use this new policy, we can see emina is able to view and edit any list:

> python -i tinytodo.py
>>> start_server()
=== TinyTodo started on port 8080
>>> set_user(andrew)
User is now andrew
>>> create_list("Cedar blog post")
Created list ID 0
>>> set_user(emina)
User is now emina
>>> get_list(0)
=== Cedar blog post ===
List ID: 0
Owner: User::"andrew"
Tasks:
>>> delete_list(0)
List Deleted
>>> stop_server()
TinyTodo server stopped on port 8080

Enforcing access requests

When the TinyTodo server receives a command from the client, such as get_list or toggle_task, it checks to see if that command is allowed by invoking the Cedar authorization engine. To do so, it translates the command information into a Cedar request and passes it with relevant data to the Cedar authorization engine, which either allows or denies the request.

Here’s what that looks like in the server code, written in Rust. Each command has a corresponding handler, and that handler first calls the function self.is_authorized to authorize the request before continuing with the command logic. Here’s what that function looks like:

pub fn is_authorized(
    &self,
    principal: impl AsRef<EntityUid>,
    action: impl AsRef<EntityUid>,
    resource: impl AsRef<EntityUid>,
) -> Result<()> {
    let es = self.entities.as_entities();
    let q = Request::new(
        Some(principal.as_ref().clone().into()),
        Some(action.as_ref().clone().into()),
        Some(resource.as_ref().clone().into()),
        Context::empty(),
    );
    info!("is_authorized request: …");
    let resp = self.authorizer.is_authorized(&q, &self.policies, &es);
    info!("Auth response: {:?}", resp);
    match resp.decision() {
        Decision::Allow => Ok(()),
        Decision::Deny => Err(Error::AuthDenied(resp.diagnostics().clone())),
    }
}

The Cedar authorization engine is stored in the variable self.authorizer and is invoked via the call self.authorizer.is_authorized(&q, &self.policies, &es). The first argument is the access request &q — can the principal perform action on resource with an empty context? An example from our sample run above is whether User::"kesha" can perform action Action::"GetList" on resource List::"0". (The notation Type::"id" used here is that of a Cedar entity UID, which has Rust type cedar_policy::EntityUid in the code.) The second argument is the set of Cedar policies &self.policies the engine will consult when deciding the request; these were read in by the server when it started up. The last argument &es is the set of entities the engine will consider when consulting the policies. These are data objects that represent TinyTodo’s Users, Teams, and Lists, to which the policies may refer. The Cedar authorizer returns a decision: if Decision::Allow, the TinyTodo command can proceed; if Decision::Deny, the server returns that access is denied. The request and its outcome are logged by the calls to info!(...).
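For illustration, here is a minimal, hypothetical sketch of how such a request could be built on its own, assuming the same version of the cedar_policy crate that TinyTodo uses (where Request::new takes four arguments); the entity names come from the sample run above.

use std::str::FromStr;
use cedar_policy::{Context, EntityUid, Request};

fn main() {
    // Parse the Type::"id" notation from the sample run into entity UIDs.
    let principal = EntityUid::from_str(r#"User::"kesha""#).expect("valid principal UID");
    let action = EntityUid::from_str(r#"Action::"GetList""#).expect("valid action UID");
    let resource = EntityUid::from_str(r#"List::"0""#).expect("valid resource UID");

    // Assemble the same kind of request that is_authorized builds above.
    let _request = Request::new(Some(principal), Some(action), Some(resource), Context::empty());
    // The request would then be passed to Authorizer::is_authorized together with
    // the policy set and the entity data, as shown in the handler above.
}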

Learn More

We are just getting started with TinyTodo, and we have only seen some of what the Cedar SDK can do. You can find a full tutorial in TUTORIAL.md in the tinytodo source code directory which explains (1) the full set of TinyTodo Cedar policies; (2) information about TinyTodo’s Cedar data model, i.e., how TinyTodo stores information about users, teams, lists and tasks as Cedar entities; (3) how we specify the expected data model and structure of TinyTodo access requests as a Cedar schema, and use the Cedar SDK’s validator to ensure that policies conform to the schema; and (4) challenge problems for extending TinyTodo to be even more full featured.

Cedar and Open Source

Cedar is the authorization policy language used by customers of the Amazon Verified Permissions and AWS Verified Access managed services. With the release of the Cedar SDK on GitHub, we provide transparency into Cedar’s development, invite community contributions, and hope to build trust in Cedar’s security.

All of Cedar’s code is available at https://github.com/cedar-policy/. Check out the roadmap and issues list on the site to see where it is going and how you could contribute. We welcome submissions of issues and feature requests via GitHub issues. We built the core Cedar SDK components (for example, the authorizer) using a new process called verification-guided development in order to provide extra assurance that they are safe and secure. To contribute to these components, you can submit a “request for comments” and engage with the core team to get your change approved.

To learn more, feel free to submit questions, comments, and suggestions via the public Cedar Slack workspace, https://cedar-policy.slack.com. You can also complete the online Cedar tutorial and play with it via the language playground at https://www.cedarpolicy.com/.


Introducing AWS Libcrypto for Rust, an Open Source Cryptographic Library for Rust

Today we are excited to announce the availability of AWS Libcrypto for Rust (aws-lc-rs), an open source cryptographic library for Rust software developers with FIPS cryptographic requirements. At our 2022 AWS re:Inforce talk we introduced our customers to AWS Libcrypto (AWS-LC), and our investment in and improvements to open source cryptography. Today we continue that mission by releasing aws-lc-rs, a performant cryptographic library for Linux (x86, x86-64, aarch64) and macOS (x86-64) platforms.

Rust developers increasingly need to deploy applications that meet US and Canadian government cryptographic requirements. We evaluated how to deliver FIPS validated cryptography in idiomatic and performant Rust, built around our AWS-LC offering. We found that the popular ring (v0.16) library fulfilled much of the cryptographic needs in the Rust community, but it did not meet the needs of developers with FIPS requirements. Our intention is to contribute a drop-in replacement for ring that provides FIPS support and is compatible with the ring API. Rust developers with prescribed cryptographic requirements can seamlessly integrate aws-lc-rs into their applications and deploy them into AWS Regions.

AWS-LC is the foundation of aws-lc-rs. AWS-LC is the general-purpose cryptographic library for the C programming language at AWS. It is a fork from Google’s BoringSSL, with features and performance enhancements developed by AWS, such as FIPS support, formal verification for validating implementation correctness, performance improvements on Arm processors for ChaCha20-Poly1305 and NIST P-256 algorithms, and improvements to ECDSA signature verification for NIST P-256 curves on x86 based platforms. aws-lc-rs leverages these AWS-LC optimizations to improve performance in Rust applications. AWS-LC has been submitted to an accredited lab for FIPS validation testing, and upon completion will be submitted to NIST for certification. Once NIST grants a validation certificate to AWS-LC, we will make an announcement to Rust developers on how to leverage the FIPS mode in aws-lc-rs.

We used rustls, a Rust library that provides TLS 1.2 and 1.3 protocol implementations to benchmark aws-lc-rs performance. We ran a set of benchmark scenarios on c7g.metal (Graviton3) and c6i.metal (x86-64) Amazon Elastic Compute Cloud (Amazon EC2) instance types. The graph below shows the improvement of a TLS client negotiating new connections using aws-lc-rs. aws-lc-rs with rustls significantly improves throughput in each of the algorithms tested, and for every hardware platform. We are excited to share aws-lc-rs and these cryptographic improvements with the Rust community today. We are continually evaluating our benchmarks for opportunities to improve aws-lc-rs.

Getting Started

Incorporating aws-lc-rs in your project is straightforward. Let’s look at how you can use aws-lc-rs in your Rust application by creating a SHA256 digest of the message “Hello Blog Readers!” An enhanced version of this digest example is available in the aws-lc-rs repository.

Install the Rust toolchain with rustup, if you do not already have it.
Initialize a new cargo package for your Rust project, if you don’t already have one:
$ cargo new --bin aws-lc-rs-example

Record aws-lc-rs as a dependency in the project Cargo file:
$ cargo add aws-lc-rs

Applications already using the ring (v0.16.x) API: If your application already leverages the ring API, you can easily test and benchmark your application against aws-lc-rs without changing your application’s use declarations:

$ cargo remove ring
$ cargo add --rename ring aws-lc-rs
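After the rename, the dependency entry in your project’s Cargo.toml should look roughly like the following (the version number here is illustrative):

[dependencies]
# "ring" now resolves to the aws-lc-rs crate, so existing `use ring::...` paths keep compiling.
ring = { version = "1", package = "aws-lc-rs" }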

Edit src/main.rs:

use aws_lc_rs::digest::{digest, SHA256};

fn main() {
    const MESSAGE: &[u8] = b"Hello Blog Readers!";

    let output = digest(&SHA256, MESSAGE);

    for v in output.as_ref() {
        print!("{:02x}", *v);
    }
    println!();
}

Compile and run the program:
$ cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.04s
Running `target/debug/aws-lc-rs-example`
c10843d459bf8e2fa6000d59a95c0ae57966bd296d9e90531c4ec7261460c6fb
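As a quick sanity check, you can compute the digest of the same message with Python’s standard hashlib module; it should match the value printed above:

$ python3 -c 'import hashlib; print(hashlib.sha256(b"Hello Blog Readers!").hexdigest())'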

Conclusion

Rust developers increasingly need to deploy applications that meet US and Canadian government cryptographic requirements. In this post, you learned how we are building aws-lc-rs in order to bring FIPS compliant cryptography to Rust applications. Together AWS-LC and aws-lc-rs bring performance improvements for Arm and x86-64 processor families for commonly used cryptographic algorithms. If you are interested in using or contributing to aws-lc-rs source code or documentation, they are publicly available under the terms of either the Apache Software License 2.0 or ISC License from our GitHub repository. We use GitHub Issues for managing feature requests or bug reports. You can follow aws-lc-rs on crates.io for notifications about new releases. If you discover a potential security issue in aws-lc-rs or AWS-LC, we ask that you notify AWS Security using our vulnerability reporting page.


AWS Now Supports Credentials-fetcher for gMSA on Amazon Linux 2023

In Q1 of 2023, AWS announced the release of the group Managed Service Account (gMSA) credentials-fetcher daemon, with initial support on Amazon Linux 2023, Fedora Linux 36, and Red Hat Enterprise Linux 9. The credentials-fetcher daemon, developed by AWS, is an open source project under the Apache 2.0 License. This release solves a 10-year, longstanding challenge affecting domain connected Linux machines. Until now, Linux users couldn’t use Microsoft Active Directory (Microsoft AD) gMSA and thus have missed out on the improved security and flexibility that gMSA offers over standard service accounts. With the release of the credentials-fetcher daemon, organizations now gain all of gMSA’s benefits without being tied to Windows based hosts.

In this blog post, we explain the use case for credentials-fetcher and give simple instructions for using an Active Directory domain joined Linux server with gMSA. We also demonstrate the interaction with other domain joined services such as Amazon Relational Database Service (Amazon RDS) for Microsoft SQL Server. The new capabilities of credentials-fetcher pave the way for additional use cases, such as using a Linux host in Amazon Elastic Container Service (Amazon ECS) clusters with gMSA. AWS is committed to using the credentials-fetcher open source project in the AWS cloud, though users may choose to run the service elsewhere. The utility of the service is not limited to AWS. The credentials-fetcher daemon can be leveraged on any supported distribution of Linux and in any environment that meets the Microsoft Active Directory version requirement. This includes on-premises environments, hosted data centers, and other cloud providers.

Solution overview

Organizations running Windows workloads hosted in on-premises data centers use Microsoft AD to authenticate users and services to shared resources over the network. As these organizations migrate workloads into Windows-based environments on AWS and on other clouds, customers traditionally use the domain-join model to access Microsoft AD from Windows instances. In addition, organizations that use Windows containers to scale their applications and reduce their total cost of ownership (TCO) have used gMSAs for Active Directory access by providing Kerberos tickets for container-hosts.

As customers modernize their Windows and Microsoft SQL Server workloads to Linux-based platforms, they still need to authenticate the migrated applications through the organization’s existing Microsoft AD. Although customers can use the domain-join methodology to connect Linux instances to Microsoft AD, it requires a number of steps and has traditionally come with security limitations. The current method involves a sidecar architecture that fails to periodically rotate passwords, unlike gMSA on Windows containers, thus introducing a security risk of password exposure. Organizations with stringent security postures have not adopted this method on Linux containers and have been waiting for a “gMSA on Windows containers”-like experience on Linux containers. Active Directory gMSAs have been technically infeasible for customers on Linux-based environments, until today.

A brief introduction to gMSA

Windows-based server infrastructure commonly uses Microsoft Active Directory to facilitate authentication and authorization between users, computers, and other computer network resources. Traditionally, enterprise applications running on Windows platforms use either manually managed accounts used as service accounts or Managed Service Accounts (MSA) for authentication and authorization. The use of manually managed service accounts brings with it the overhead of service account password management, including manually updating the password and updating the password on all servers. It also introduces increased security risks as these accounts typically have elevated privileges and are not tied to a specific user, which creates challenges for attributing activity when auditing the account. For this reason, password management of these accounts is critical.

In contrast, Managed Service Accounts don’t have any password management overhead; the passwords for these types of accounts are automatically rotated and updated on your servers. They are also limited to a single computer account, which means they can’t be used on more than one computer, and cannot be used for interactive logons. A Group Managed Service Account (gMSA) is a special type of service account which augments the functionality; its identity can be shared across multiple computers without needing to know the password. Computers must be part of a Microsoft Active Directory domain, which manages these service accounts, in order to make use of them. Although Windows containers cannot join a domain like an instance, they can still use gMSA identity for authentication and authorization.

Credentials-fetcher’s potential scenarios

With the addition of the credentials-fetcher daemon, more organizations can use gMSA. This gives customers more options if they’re more familiar with Linux, looking to save on licensing costs, or looking to improve their security posture. Customers can now associate Linux machines to a gMSA and take advantage of the authentication and authorization between members of that group managed service account. Environments hosted on domain joined, gMSA associated Linux machines running .NET applications or running in Linux containers can now use the gMSA to authenticate between their own domains and other services like Microsoft SQL Server.

Scenario 1: A Microsoft .NET application is running in Docker containers, with the hosts on a Microsoft Active Directory domain joined Amazon Elastic Compute Cloud (Amazon EC2) Linux server. The Linux application server is added as a member of the gMSA group. The gMSA account is granted permissions to the domain joined Microsoft SQL Server or Amazon RDS for Microsoft SQL Server database.

Scenario 2: A Microsoft .NET application is running in Docker containers and Microsoft SQL Server is running in its own Docker container, with the hosts on a Microsoft Active Directory domain joined Amazon EC2 Linux server. The Linux host servers of the application containers and the Microsoft SQL Server container are added as members of the gMSA group. The gMSA account is granted permissions to the Microsoft SQL Server instance database running in a container.

Scenario 3: A Microsoft .NET application is running on an Amazon Elastic Container Service (Amazon ECS) cluster, hosted on a Microsoft Active Directory domain. The Linux servers within the Amazon ECS cluster are added as members of the gMSA group. The gMSA account is granted permissions to the domain joined Microsoft SQL Server or Amazon RDS for Microsoft SQL Server database.

Here is a visualization of the featured use-case scenarios.

Figure 1 Different use-case scenarios with Credentials-fetcher

Implementing the environment

This section will walk you through the prerequisites, environment setup and the installation steps for the credentials-fetcher daemon’s use cases.

Prerequisites

You have properly installed and configured the AWS Command Line Interface (AWS CLI) and PowerShell core on your workstation. We’ve chosen to use the AWS CLI for these steps so that the end-to-end workflow can be demonstrated.
This blog post as of April 4th, 2023 requires an install of Fedora Linux 36 or newer and the latest Amazon Linux 2023 AMI.
This blog post references AWS Managed Microsoft Active Directory, but it will also work with other self-managed Microsoft Active Directory scenarios as long as the Linux machines are able to be domain joined.
You have an Amazon Relational Database Service (Amazon RDS) instance that is joined to the domain.
You have elevated administrative Active Directory credentials to configure instances to join a domain and create a Microsoft AD security group.
You have access to the credentials-fetcher GitHub package for the installation of the latest daemon and updated instructions.

Environment Setup for gMSA on Linux use cases

Figure 2 Credentials-fetcher running in Fedora Linux Server.

All instructions assume the use of the Fedora Linux 36 distro, which was available at the time this blog was created. We plan to add gMSA support for additional Linux distributions in the future.

1.     Set up AWS Managed Microsoft Active Directory or Self-hosted Active Directory.

Active Directory setup: You will set up domain join from the Linux instance to the AD domain. The Linux instance is part of the AD security group that has access to the gMSA account, as configured by the AD administrator.
AWS Managed Microsoft Active Directory can be deployed using this AWS CloudFormation template.

2.     Create a gMSA account as a Microsoft AD administrator.

Example: Replace ‘LinuxAppFarm’, ‘LinuxFarm01$’ and ‘CORP.EXAMPLE.COM’ with your own gMSA and domain names, respectively. Three Linux instances are displayed in this example: LinuxInstance01$, LinuxInstance02$ and LinuxInstance03$.

# Create the AD group
New-ADGroup -Name "LinuxAppFarm" -SamAccountName "LinuxAppFarm" -GroupScope DomainLocal

# Create the gMSA
New-ADServiceAccount -Name "gmsamachines" -DnsHostName "gmsamachines.CORP.EXAMPLE.COM" -ServicePrincipalNames "host/LinuxAppFarm", "host/LinuxAppFarm.CORP.EXAMPLE.COM" -PrincipalsAllowedToRetrieveManagedPassword "LinuxAppFarm"

# Add your Linux instances or containers to the AD group
Add-ADGroupMember -Identity "LinuxAppFarm" -Members "LinuxInstances01$", "LinuxInstances02$", "LinuxInstances03$", "MSSQLRDSIntance$"

3.     Verify and test the gMSA account.

PowerShell
# Test that the current computer can use the gMSA
Test-ADServiceAccount gmsamachines

# Get the current computer's group membership
Get-ADComputer $env:LinuxInstances01 | Get-ADPrincipalGroupMembership | Select-Object DistinguishedName

# Get the groups allowed to retrieve the gMSA password. Change "gmsamachines" to your own gMSA name
(Get-ADServiceAccount gmsamachines -Properties PrincipalsAllowedToRetrieveManagedPassword).PrincipalsAllowedToRetrieveManagedPassword

Additional detailed instructions can be found in this guide to getting started with group managed service accounts.

4.     Create a credentialspec associated with a gMSA account:

Install the PowerShell CredentialSpec module and create a CredentialSpec

PowerShell
Install-Module CredentialSpec

New-CredentialSpec -AccountName LinuxAppFarm # Replace 'LinuxAppFarm' with your own gMSA group

You will find the credentialspec at 'C:\Program Data\Docker\Credentialspecs\LinuxAppFarm_CredSpec.json'.

5.     Obtain and deploy the supported Fedora Linux 36 version or newer supported AMI (AWS Public Cloud download).

6.     Manually join your Linux system to the Microsoft Active Directory domain using the following command:

#Install realmd and configure DNS resolver for the Active Directory domain
sudo dnf install realmd sssd oddjob oddjob-mkhomedir adcli krb5-workstation samba-common-tools -y
sudo systemctl stop systemd-resolved
sudo systemctl disable systemd-resolved
sudo unlink /etc/resolv.conf

#Add your DNS nameserver IP and domain name to the resolv.conf and save
sudo nano /etc/resolv.conf

nameserver 10.0.0.20
search corp.example.com

#Join the Linux server to the realm/domain (the realm name is case-sensitive)

Replace the (upper-case) realm account and domain name placeholders with the UPN of a domain user and the FQDN of the domain name. Remove < and > in your final command.
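The join command itself is not shown above; with the realmd tooling installed in the earlier step, a typical invocation looks like the following (the user and domain values are placeholders to replace):

#Join the domain; you will be prompted for the domain user's password
sudo realm join -U <ADMINUSER@CORP.EXAMPLE.COM> <CORP.EXAMPLE.COM>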

Auto-join is not currently supported until the Amazon Linux 2022 distro is updated with the new rpm.

Microsoft SQL Server and Amazon RDS for Microsoft SQL Server can be added for Kerberos database authentication.

Microsoft SQL and Amazon RDS for Microsoft SQL Server must be joined to the AWS Managed Microsoft AD Domain.

See instructions on how to connect Amazon RDS for Microsoft SQL Server to the Microsoft Active Directory domain.

For the highest recommended security, constrained Kerberos delegation for gMSA should be applied to the accounts for any service access.

Set-ADAccountControl -Identity <TestgMSA$> -TrustedForDelegation $false -TrustedToAuthForDelegation $false
Set-ADServiceAccount -Identity TestgMSA$ -Clear 'msDS-AllowedToDelegateTo'

Detailed instructions can be found here.

7.     Invoke the AddKerberosLease API with the credentialspec input as shown in the following command. This step is important to allow credentials-fetcher to make a connection to Microsoft Active Directory. The gMSA account is then used for authentication.
Use this command with Fedora Linux only: (grpc_cli is not available on Amazon Linux)

#Replace the gMSA group name, NetBIOS name, and DNS names in the command
grpc_cli call unix:/var/credentials-fetcher/socket/credentials_fetcher.sock AddKerberosLease "credspec_contents: '{\"CmsPlugins\":[\"ActiveDirectory\"],\"DomainJoinConfig\":{\"Sid\":\"S-1-5-21-1445507628-2856671781-3529916291\",\"MachineAccountName\":\"gmsamachines\",\"Guid\":\"af602f85-d754-4eea-9fa8-fd76810485f1\",\"DnsTreeName\":\"corp.example.com\",\"DnsName\":\"corp.example.com\",\"NetBiosName\":\"DEMOCORP\"},\"ActiveDirectoryConfig\":{\"GroupManagedServiceAccounts\":[{\"Name\":\"gmsamachines\",\"Scope\":\"corp.example.com\"},{\"Name\":\"gmsamachines\",\"Scope\":\"DEMOCORP\"}]}}'"

Response example: (Note the response for use with your Docker application container)
path to kerberos ticket : /var/credentials-fetcher/krbdir/726837743cc966c7b4da/WebApp01

8.     Invoke the DeleteKerberosLease API with the lease id input as shown here. Set the unique identifier lease_id associated with the request. The deleted_kerberos_file_paths are the paths of the deleted Kerberos tickets corresponding to the gMSA accounts.
Use this command from the Linux host:

#Delete Kerberos Lease sample command
grpc_cli call unix:/var/credentials-fetcher/socket/credentials_fetcher.sock DeleteKerberosLease "lease_id: '${response_lease_id_from_add_kerberos_lease}'"

Installation of credentials-fetcher on supported Linux distros

In this basic use case for testing the credentials-fetcher rpm, the following architecture is assumed for the purposes of this blog.

An AWS Managed Microsoft Active Directory joined Linux Application Server.
An AWS Managed Microsoft Active Directory joined Amazon RDS for Microsoft SQL Server instance.
gMSA account established for account credentials in an AWS Managed Microsoft Active Directory.

Fedora Linux 36 Server setup:

Deploy the “Fedora cloud based image for AWS public cloud” located here, to your AWS account.
Credentials-fetcher is packaged and included as part of the standard Fedora package repositories. Install the credentials-fetcher rpm by typing the command:

sudo dnf install credentials-fetcher -y

How to use credentials-fetcher per scenario

In these instructions, we will demonstrate the use of credentials-fetcher with an ASP.NET application and Amazon RDS for Microsoft SQL Server. A Microsoft SQL Server container scenario will also be demonstrated as an additional use case.

Scenario 1:  Using .NET Core container application on Linux with Amazon RDS for Microsoft SQL Server backend

Figure 3 Using .NET Core container application on Linux with Amazon RDS for Microsoft SQL Server backend

Once the environment prerequisites have been met, you can install Docker and a repository in preparation for deploying a .NET Core application to a container on the Linux server or servers.

1.     Set up the repository.

Install the dnf-plugins-core package (which provides the commands to manage your DNF repositories) and set up the repository.

sudo dnf -y install dnf-plugins-core

sudo dnf config-manager \
    --add-repo \
    https://download.docker.com/linux/fedora/docker-ce.repo

2.     Install Docker Engine and verify credentials-fetcher is installed and started.

Install the latest version of Docker Engine, containerd, and Docker Compose and start the Docker daemon:

sudo dnf install docker-ce docker-ce-cli containerd.io docker-compose-plugin
sudo systemctl start docker

sudo dnf install credentials-fetcher
sudo systemctl start credentials-fetcher

Additional detailed instructions on how to install Docker Engine can be found here.

1. Create a Kerberos ticket associated with the gMSA account, as described in step 7 of “Environment Setup for gMSA on Linux use cases”.

Take a note of the response generated by the Kerberos ticket creation.

2.     Leverage the Dockerfile for environment variables

FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build-image
WORKDIR /src
COPY *.csproj ./
RUN dotnet restore
COPY . ./
RUN dotnet publish -c Release -o out
FROM mcr.microsoft.com/dotnet/aspnet:5.0
WORKDIR /app
EXPOSE 80
COPY --from=build-image /src/out ./
RUN apt-get -qq update && \
    apt-get -yqq install krb5-user && \
    apt-get -yqq clean

ENV KRB5CCNAME=/var/credentials-fetcher/krbdir/krb5cc

ENTRYPOINT ["dotnet", "WebApp.dll"]

Example: /var/credentials-fetcher/krbdir/726837743cc966c7b4da/WebApp01

3.     Build a Docker image for your application on the Linux server:

docker build -t <your_image_name> .

4.     Run Docker with a bind mount to the Kerberos ticket, and ensure the environment variable KRB5CCNAME (see the example Dockerfile above) is pointing to the destination location of the bind mount inside the application container.

sudo docker run -p 80:80 -d -it --name webapp1 --mount type=bind,source=/var/credentials-fetcher/krbdir/726837743cc966c7b4da/WebApp01,target=/var/credentials-fetcher/krbdir {docker_image}

5.     Add Amazon RDS for Microsoft SQL Server to the gMSA group with the following commands:

#Add the AD-joined SQL RDS server to the gMSA account
Set-ADServiceAccount -Identity "gmsamachines" -PrincipalsAllowedToRetrieveManagedPassword "mssqlrdsname$"

6.     Download and install SQL Server Management Studio (SSMS) on a management Windows machine that is a member of the service account group to test the Amazon RDS for Microsoft SQL Server connections. The .NET application and the Amazon RDS for Microsoft SQL Server instance will then have access to each other.

7.     Log in to the Amazon RDS for Microsoft SQL Server instance and apply the gMSA service account with the desired permissions required by the .NET application.

Scenario 2:  Using .NET Core container application on Linux with a Microsoft SQL Server Container

Figure 4 Using .NET Core container application on Linux with a Microsoft SQL Server Container.

As with Scenario 1, the same steps to install Docker on Linux and deploy your application will apply. The difference will be the deployment of Microsoft SQL Server in a container and the confirmation that the server operates as expected running on Linux and leveraging gMSA for authentication.

1.     As with the first scenario, install and run the credentials-fetcher on the Linux server or servers you are deploying your .NET application containers to.

2.     Deploy a Microsoft SQL Server 2022 container on one of the Linux servers in your gMSA group.

Leverage the Dockerfile example in Scenario 1 to set the KRB5CCNAME environment variable for the Microsoft SQL Server container.
Reference Scenario 1 for details on verifying the Dockerfile KRB5CCNAME setting.

Run the following command on your Linux server to install the latest Microsoft SQL Server 2022 Docker container available. Replace yourStrong(!)Password with a strong password in the command.

sudo docker run -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=yourStrong(!)Password" -p 1433:1433 -d --mount type=bind,source=/var/credentials-fetcher/krbdir/726837743cc966c7b4da/WebApp01,target=/var/credentials-fetcher/krbdir mcr.microsoft.com/mssql/server:2022-latest

Verify that the Microsoft SQL Server Docker container is running with the following command:

sudo docker ps

Additional reference details for deploying a Microsoft SQL Server container on Linux can be found here.

3.     Download and install SQL Server Management Studio (SSMS) on a management Windows machine that is a member of the service account group to test the Microsoft SQL Server connection.

4.     Log in to the Microsoft SQL Server instance and apply the gMSA service account with the desired permissions required by the .NET application.

Conclusion

Linux containers have become a key modernization destination for customers running .NET workloads. gMSA for Linux containers will help organizations and the overall Microsoft administrator and developer community to access AD from applications and services hosted on Linux containers using the service account authentication model.

gMSA is a managed account that provides automatic password management, service principal name (SPN) management, and the ability to delegate management to administrators over multiple servers or instances. This unblocks a range of modernization use cases around identity using Microsoft AD, such as connecting .NET Core applications hosted on Linux containers with SQL Server authenticating over Microsoft AD, and securely opening up access to network resources from applications running with service accounts. Based on the capabilities and customer usefulness, credentials-fetcher is positioned to be extended into services such as Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS).

This service feature extends support for non-Windows container applications that require gMSA for Microsoft AD Authentication. AWS is dedicated to continuing development and support for the credentials-fetcher daemon open source project. We believe that open source is good for everyone and we are committed to bringing the value of open source to our customers, and the operational excellence of AWS to open source communities. Contributions and feedback are welcome.


Disaster Recovery When Using Crossplane for Infrastructure Provisioning on AWS

We would like to acknowledge the help and support Vikram Sethi, Isaac Mosquera, and Carlos Santana provided to make this blog post better. Thank you!

In our previous blog posts [1,2,3], we have discussed a growing trend towards our customers adopting GitOps and Kubernetes native tooling to provision infrastructure resources. AWS customers are choosing open source tools such as Argo CD and Flux CD for rollout and Crossplane or Amazon Controllers for Kubernetes (ACK) for creating infrastructure resources.

In our conversation with some of our customers and partners, including Autodesk, Deutsche Kreditbank, Nike, and Upbound, we identified that, while there is a significant body of work at AWS on how to utilize multi-region and multiple availability zone (multi-AZ) architectures to enable disaster recovery (DR), DR in the context of GitOps and Kubernetes-native infrastructure rollout has not been explored as extensively. To address this issue and to bring attention to some of the key considerations, we’ve worked with engineers and architects from these companies to come up with failure scenarios and related solutions when managing AWS resources with Crossplane.

In this blog post, we discuss different backup considerations when employing GitOps and the use of Kubernetes-native infrastructure provisioning tools on Amazon Elastic Kubernetes Service (Amazon EKS). To provide Kubernetes clusters with capabilities to backup and recover Kubernetes objects, we use Velero. Velero is a popular open source solution which allows you to backup and restore Kubernetes objects to external storage backends such as Amazon Simple Storage Service (Amazon S3). For brevity, in this writing we focus on using Velero for DR when using Crossplane as the infrastructure provisioning tool, but the approach should be equally applicable to ACK as well. In particular, we will answer questions related to the following scenarios:

What should you do if the Kubernetes control plane fails and a new cluster needs to be brought up to manage AWS resources?
How can you bring AWS resources into another region when using Crossplane?
What should you do if one of the resources managed by Crossplane fails?

In this post, we will not go into the details of how to configure individual AWS services to guard against failures. If you are interested in learning more about disaster recovery strategies for specific AWS services, please refer to strategies and documentation available at AWS Elastic Disaster Recovery.

Managed Entities and Failure Scenarios

Crossplane makes use of the Kubernetes Resource Model (KRM) to represent the resources it manages as instances of Custom Resource Definitions (CRDs). These custom resource instances are held within the same Kubernetes Cluster as the Crossplane Controllers.

For example, to create an Amazon S3 bucket, you deploy the Crossplane CRD of type Bucket to the cluster along with a corresponding Kubernetes object of kind Bucket that instantiates the CRD and represents the underlying Amazon S3 bucket. Therefore, for any infrastructure resource managed by Crossplane, there are two separate but interrelated entities that you need to look into when doing backups (a sketch of such a managed resource follows this list):

The underlying AWS resource; e.g., for the bucket discussed above, you must have a plan in place to back up its content.

The corresponding Kubernetes object and its managing Amazon EKS cluster that keep the actual state of the AWS resource in sync with its expected state as deployed to the Kubernetes cluster. This object is called the Managed Resource (MR).
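As a rough illustration of what such a managed resource looks like on the cluster, here is a minimal sketch of a Crossplane Bucket manifest; the API group and field names depend on the provider version you run, so treat them as placeholders:

apiVersion: s3.aws.upbound.io/v1beta1
kind: Bucket
metadata:
  name: example-bucket
  annotations:
    # The external name ties this Kubernetes object to the actual S3 bucket.
    crossplane.io/external-name: example-bucket
spec:
  forProvider:
    region: us-west-2
  providerConfigRef:
    name: default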

In addition, it is possible to group multiple MRs into one Kubernetes object. Such objects are called Composite Resources (CRs) and they are responsible for gluing sub resources together and are defined by Composition and Composite Resource Definition objects. CRs can also include other CRs as one of their sub resources. Furthermore, it is possible to create CRs that contain MRs from different providers. For example, it is possible to create CRs which contain MRs from AWS and Helm providers.

Therefore, it is critical for these objects to maintain their relationships with other objects when restoring them from backups.

In this blog post we particularly focus on failure recovery scenarios for the managing Kubernetes cluster, the Crossplane controller that runs in this cluster, and the individual Kubernetes resources representing infrastructure resources. For the AWS infrastructure resource managed by Crossplane, we discuss the traditional disaster recovery scenario where a region becomes completely unavailable.

Scenario 1: Kubernetes Cluster Failure

For the first scenario we are going to look into cases where the managing cluster running the Crossplane controller unexpectedly fails and the management cluster is disconnected from the actual infrastructure resources it manages. While this does not impact the data plane (e.g., your Amazon S3 bucket from the earlier example will continue to operate as it used to), it prevents any change to the infrastructure resources via the management cluster.

A common mechanism to recover from a failed cluster is to copy the state of the failed cluster into a new and healthy cluster. This can be done by regularly backing up the Kubernetes etcd datastore of an existing cluster, so that it can be restored into a new cluster if things fail. While there are several solutions for backing up resources in Kubernetes, Velero is one of the more popular ones. Velero allows you to safely backup and restore Kubernetes objects. This can be done for the entire set of resources, or it can be done selectively and by using Kubernetes resource and label selectors. The following figure depicts how an Amazon RDS resource provisioned by Crossplane in a now failed cluster (left) can be restored and managed by Velero into a new cluster with an existing Crossplane controller (right).

As part of the recovery process, it is important to ensure that Kubernetes resources restored by Velero get into a stable state when reconciled by the Crossplane controller on the healthy cluster. For example, consider a scenario where Crossplane resources are backed up from a Kubernetes cluster which has an older version of Crossplane’s AWS provider. When Velero restores these resources into a new Kubernetes cluster which has a newer version of Crossplane’s AWS providers, Velero replicates the same old copy of the resource in the new cluster. If for arbitrary reasons (e.g. regressions, bugs, etc.) the new Crossplane AWS provider acts on the restored resources differently than the old provider, then Velero and the new Crossplane AWS provider may get into a race overriding the state of the resource on the cluster, in which case the external resource could end up flapping between a desired state and a non-desired state, hence not becoming fully stable.

Similarly, if the manifests representing your Crossplane resources are applied to your Kubernetes cluster using GitOps or similar automation, you should make sure to disable it before restoring the backup. Otherwise, once your backup is restored, any Kubernetes resources it contains could be overwritten by their representations from git. This could cause your restore to become inconsistent or worse, you could introduce skew with the underlying cloud provider resources. By disabling your GitOps workflows or any automation delivering your Crossplane manifests, you can ensure that the only manifests applied are those restored from backup.

When backups are made with Velero, they should contain objects managed by Crossplane and its providers, including composite resources, their sub resources, and managed resources. For a successful recovery, it is important for backed up objects to include all the dependencies. This includes, e.g., ProviderConfig objects and corresponding secrets. ProviderConfig objects are responsible for providing configuration including credentials for providers to connect to service providers such as AWS. Without importing the full chain of object dependencies, migrated crossplane resources have no way of getting reconciled correctly. In addition, to ensure resources are in-sync with Git, they need to be imported into your GitOps tooling. This is specific to the GitOps tooling used, which can become complex to manage. We intend to cover the intricacies of synchronizing restored resources and the corresponding Git counterparts in another blog post.

In general, a regular backup process before a cluster failure involves the following steps:

Scale down the AWS provider deployment to zero to prevent this instance of the provider from attempting to reconcile AWS resources.
Back up the required resources.
Repeat at a reasonable interval.

When restoring into a new cluster:

Install Velero and configure it to use the backup data source.
Ensure your AWS Auth ConfigMap setup does not get overwritten by Velero after you initiate the restore process, or else you will lose access to the new cluster.
Restore the backed up resources to the new cluster.
Scale up the AWS provider deployment to one in the new cluster.

During the backup and restoration process Velero preserves owner references, resource states, and external names of managed resources. Therefore from here on, the Kubernetes reconciliation process should kick in and proceed as if there were no changes.
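As a concrete sketch, and assuming Velero’s standard CLI, the backup and restore steps above might look roughly like this (the deployment name and backup name are illustrative):

# Before taking a backup, pause reconciliation by scaling the provider down
kubectl -n crossplane-system scale deployment <aws-provider-deployment> --replicas=0

# Back up the cluster, including the cluster-scoped Crossplane objects
velero backup create crossplane-backup

# On the new cluster, restore the backup and then scale the provider back up
velero restore create --from-backup crossplane-backup
kubectl -n crossplane-system scale deployment <aws-provider-deployment> --replicas=1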

Scenario 2: Failure in Crossplane Controller Upgrades

Upgrading the Crossplane controller for a given provider should not require treating failures as a full case of disaster recovery. When you install a new version of a provider, a ProviderRevision object is created that owns the new version of managed resources and transfers ownership from the old providers. In case the new revision of the provider fails, recovery should be as easy as rolling back to the old revision of the provider where ownership of the previous revision is reinstated.

However, it could be that due to other complications (see here for an example), a rollback from a failed provider upgrade would not help with recovering from a failure. In this case, a full cluster migration might be the fastest solution to bring things back on track.

In case of cluster migration, you can safely use Velero, similar to how it was discussed previously to back up resources and restore them to a new cluster. The new cluster does not need to be in the same region either, since the region in which managed AWS resources reside does not change. An example of such a migration can be found here.

Scenario 3: Region Failure

Region failures result in more complex failure scenarios to recover from, given how Kubernetes objects are connected to external AWS service entities.

For example, you may have an EKS cluster with Crossplane in the us-west-2 region. Let us assume that this cluster manages AWS resources in us-west-2 and us-east-1 regions. One day, for an unforeseeable reason, the entire us-west-2 region disappears from the earth. What should your recovery strategy look like if you need to restore infrastructure to another region?

You may have a Velero backup of objects from the managing Crossplane cluster in us-west-2 and you can restore them for global AWS resources such as Route53 records and IAM roles, but you cannot simply restore regional resources to the new cluster in a different region. One reason for this is that all resources managed by the Crossplane AWS provider have “region” as a required configuration field (see the Amazon Virtual Private Cloud (Amazon VPC) example below).
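For example, a Crossplane-managed VPC manifest looks roughly like the following; the API group and field names are illustrative and depend on the provider version:

apiVersion: ec2.aws.upbound.io/v1beta1
kind: VPC
metadata:
  name: example-vpc
spec:
  forProvider:
    # The required region field pins this resource to a specific AWS region.
    region: us-west-2
    cidrBlock: 10.0.0.0/16
  providerConfigRef:
    name: default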

This field determines in which region Crossplane-managed AWS resources should be deployed. When you take a backup of the managed resources in the original cluster, these objects have the region specified in the resources. In our example, these resources may have the us-west-2 or us-east-1 region specified. Because of this, when you attempt to restore them into another cluster in another region, Crossplane will attempt to provision resources into the region that no longer exists. You will need to update your Kubernetes manifests to reflect region changes. Also, your stateful resources such as databases likely need to be restored from a snapshot. You may also have a hot replica in another region that your application can use. In either case, during the restoration process you still need a way to tell Crossplane which backup you want to restore from and update applications to point to the new database, preferably in an automated way.

You will also need to consider AWS regional differences and ensure compositions work in different regions. For example, the us-west-2 region has 4 availability zones, while the us-east-1 region has 6 availability zones. To accommodate this, you may need to create different compositions for each region to ensure availability requirements are met.

Because of these reasons, it’s likely you need to change manifests in your source of truth or use a mutation webhook to dynamically update the region field, instead of relying on a Kubernetes backup solution. You need to perform the restoration process using the continuous delivery (CD) tool of your choice. A rough process is depicted in the figure below and would look like this:

Create a new Kubernetes cluster with Crossplane in another region if one doesn’t exist.
Determine which region you want to restore to.
Determine which snapshots you want to restore from.
Make changes to your manifests to reflect information determined in the previous steps (preferably with automation such as a mutating webhook).
Let your CD tooling deploy resources.

Key Considerations

Based on the discussions here, we highly recommend that you start designing to guard against failure from the ground up to ensure uninterrupted operation of your applications.

In order to have a less hectic recovery process, here are some additional key considerations:

Automate the rollout of your infrastructure and application resources. This ensures that when disaster hits, you can more quickly respond by redirecting your rollout practices to an alternative solution.
Parameterize your deployments. Make regions, availability zones, instance types, replica counts, etc. configurable so you can change them to alternatives where necessary. This, combined with proper automation, allows you to quickly choose alternative targets for deploying infrastructure and application resources.
Make sure you can tag and specify which snapshot you want to restore for your backed up data stores. In Crossplane compositions, this can and needs to be defined when you are restoring your data into a new managing cluster.
Consider full migrations to alternative regions as a last resort. Failures happen a lot less frequently to an entire region. When recovering from failures, consider rollback options available natively by Crossplane. This involves rolling back to an older revision of the provider, or restoring a failed AWS service instance to remain in sync with the Crossplane counterpart.
Do not ignore full migrations to alternative regions. While rare, you would occasionally need to do a full migration to an entirely different region. Have it be part of your recovery planning. Do not ignore or postpone it!
Practice failure recovery scenarios a priori. We have seen examples like Chaos Monkey and other tools imitating a case of failure for deployments in order to prepare and practice a recovery process. Crossplane is no different. Ensure you practice failure recovery frequently to be prepared for an actual case of recovery when needed.

Conclusions

Managing AWS resources using a Kubernetes control plane has its own challenges when you need strategies to backup and restore resources. If you are simply migrating from one managing cluster to another, backup solutions such as Velero work nicely because no modifications to the source of truth are required.

If you are preparing for disaster recovery scenarios, things are more nuanced and may require modifications to your source of truth. Many AWS managed services offer easy ways to backup and restore your data to another region. Your Crossplane compositions need to be able to take advantage of these features according to your recovery objectives. Using templating and overlaying mechanisms, you can easily embed these objectives into Crossplane compositions. This means all AWS resources managed by Crossplane adhere to your recovery point objective. In the end, it is important to understand where you draw boundaries in your recovery process in what is possible via Kubernetes tooling and the control plane, and what needs to be taken care of out of band and using traditional failure recovery practices.

If you want to learn more about building shared services platforms (SSP) on EKS with Crossplane, you can schedule time with our experts.

References:

https://aws.amazon.com/blogs/opensource/comparing-aws-cloud-development-kit-and-aws-controllers-for-kubernetes/
https://aws.amazon.com/blogs/opensource/declarative-provisioning-of-aws-resources-with-spinnaker-and-crossplane/
https://aws.amazon.com/blogs/opensource/introducing-aws-blueprints-for-crossplane/


AWS Teams with OSTIF on Open Source Security Audits

We are excited to announce that AWS is sponsoring open source software security audits by the Open Source Technology Improvement Fund (OSTIF), a non-profit dedicated to securing open source. This funding is part of a broader initiative at Amazon Web Services (AWS) to support open source software supply chain security.

Last year, AWS committed to investing $10 million over three years alongside the Open Source Security Foundation (OpenSSF) to fund supply chain security. AWS will be directly funding $500,000 to OSTIF as a portion of our ongoing initiative with OpenSSF. OSTIF has played a critical role in open source supply chain security by providing security audits and reviews to projects through their work as a pre-existing partner of the OpenSSF. Their broad experience with auditing open source projects has already provided significant benefits. This month the group completed a significant security audit of Git that uncovered 35 issues, including two critical and one high-severity finding. In July, the group helped find and fix a critical vulnerability in sigstore, a new open source technology for signing and verifying software.

Many of the tools and services provided by AWS are built on open source software. Through our OSTIF sponsorship, we can proactively mitigate software supply chain risk further up the supply chain by improving the health and security of the foundational open source libraries that AWS and our customers rely on. Our investment helps support upstream security and provides customers and the broader open source community with more secure open source software.

Supporting open source supply chain security is akin to supporting electrical grid maintenance. We all need the grid to continue working, and to be in good repair, because nothing gets powered without it. The same is true of open source software. Virtually everything of importance in the modern IT world is built atop open source. We need open source software to be well maintained and secure.

We look forward to working with OSTIF and continuing to make investments in open source supply chain security.


iframe Security is So Strange

As I write, this is the attribute soup on an <iframe> on CodePen in order to dial in the right combination of security and usability:

<iframe
  sandbox="allow-downloads allow-forms allow-modals allow-pointer-lock allow-popups allow-presentation allow-same-origin allow-scripts allow-top-navigation-by-user-activation"
  allow="accelerometer; camera; encrypted-media; display-capture; geolocation; gyroscope; microphone; midi; clipboard-read; clipboard-write; web-share"
  allowpaymentrequest="true"
  allowfullscreen="true">
</iframe>

Wow! I think that’s an awful lot of detailed HTML to get right. If any little bit of that was wrong, we’d hear about it here at CodePen right away. It would break what users expect to work.

To compound this issue, the above code is just the output for Chrome-n-friends. Both Safari and Firefox need their own variations of this HTML to get it right. That puts us in UA-sniffing territory, which is never a particularly happy place.

Add extra attributes or values to this code and you might make annoying extra console noise, which is especially unwelcome in an app for developers. Skip them, and you cripple the app itself. We have no choice but to render user code in an <iframe>, for all the obvious cross-origin security it provides.

Compounding things again, all this code changes over time. New features arrive in browsers that require new iframe permissions. But there is no good place to follow all the changes, so the way we tend to find out is when a user graciously sends in a support request for something they think should work but doesn't.

The code itself is just… strange! Why is sandbox space-separated but allow is semicolon-separated? Why are sandbox and allow different attributes at all? Especially when they are both whitelists? Why are some features their own special attributes?

Just feels like an awful lot of weirdness for one isolated purpose.

I was just looking over our setup here at CodePen and refactoring it a bit, and decided to chuck the attributes into JSON and maintain them from there, so here's a copy of that in case it's useful to anyone else.
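A minimal sketch of that kind of JSON-driven setup might look like the following. This is not CodePen's actual config; the object shape and the iframeAttributes helper are illustrative only, but they show the basic idea of keeping the permission lists in one place and generating the attribute strings from them.

// Illustrative sketch only: not CodePen's actual configuration.
// Keep the permission lists in one JSON-like object and generate
// the attribute strings from it.
const iframePermissions = {
  // space-separated values for the sandbox attribute
  sandbox: [
    "allow-downloads", "allow-forms", "allow-modals", "allow-pointer-lock",
    "allow-popups", "allow-presentation", "allow-same-origin",
    "allow-scripts", "allow-top-navigation-by-user-activation",
  ],
  // semicolon-separated values for the allow attribute
  allow: [
    "accelerometer", "camera", "encrypted-media", "display-capture",
    "geolocation", "gyroscope", "microphone", "midi",
    "clipboard-read", "clipboard-write", "web-share",
  ],
};

// Build the attribute strings, keeping the two separator styles in one place.
function iframeAttributes(p: typeof iframePermissions): string {
  return [
    `sandbox="${p.sandbox.join(" ")}"`,
    `allow="${p.allow.join("; ")}"`,
    `allowpaymentrequest="true"`,
    `allowfullscreen="true"`,
  ].join("\n  ");
}

console.log(`<iframe\n  ${iframeAttributes(iframePermissions)}>\n</iframe>`);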


Problems with online user authentication when using self sovereign identity

With self sovereign identity (SSI), there is no standardized solution for online user authentication when using verifiable credentials and verifying the identity of the user. Every solution involves compromises and introduces new problems. To understand the problems, we need to understand how this works. The following diagram shows the verifiable credential (VC) relationship between the issuer, the holder (behind a software wallet), and the verifier. Trust between the actors, and whether you are required to authenticate the user behind the wallet, are key requirements for some systems. Verifiable credentials are not issued to a user but to a wallet representing the user.

Image src: Verifiable Credentials Data Model 1.0 specification 

Use case definition: The user must be authenticated and verified by the "verifier" application using the verifiable credential. Is the user described in the verifiable credential the same user presenting it, or a user or application allowed to use the VC on behalf of that person? So how can this be solved?

Solution 1: User Authentication is on the wallet.

The wallet application implements authentication of the user and binds the user to all credentials issued to the wallet through the agents and then sent to verifier applications. With BBS+ verifiable credentials, this is possible. The wallet is responsible for authenticating the user, but this is not standardized, and no two wallets do it the same way. If the wallet is responsible for user authentication, then applications only need to authorize the verifiable credentials; they do not authenticate the user behind the wallet who is represented in the verifiable credential connected to that wallet. The VC is invalid if a different wallet sends it. The verifier application therefore only validates that the sender of the VC has possession of the credential, nothing else, and trusts that the wallet authenticates the user correctly and prevents misuse. The verifier cannot validate whether the person or application using the credential is allowed to use it; it must trust that the wallet does this correctly.
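To make the point concrete, here is a minimal sketch of what the verifier actually checks in this solution. All types and the verifySignature helper are hypothetical placeholders, not a real SSI library API; the only thing established is possession.

// Minimal sketch of the possession-only check in Solution 1.
// Types and helpers are hypothetical placeholders, not a real SSI library.
interface VerifiablePresentation {
  holder: string;                              // DID of the wallet/holder
  credential: { subject: string; issuer: string };
  proof: { signedBy: string; signature: string };
}

// Placeholder for a real cryptographic check against the holder's DID key.
function verifySignature(vp: VerifiablePresentation): boolean {
  return vp.proof.signedBy === vp.holder;      // stand-in only
}

// The verifier can only establish possession: the presentation was signed by
// the wallet that holds the credential. Whether the human behind that wallet
// is the credential subject is something the wallet itself must guarantee.
function verifierAccepts(vp: VerifiablePresentation): boolean {
  const possessionProven = verifySignature(vp);
  // No user authentication happens here; the verifier trusts the wallet.
  return possessionProven;
}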

Problems with this solution:

Wallet software monopoly: If a state body pushes this solution, it effectively creates a monopoly for the producer of the wallet software. With existing wallets, the required authentication is specific to the wallet application, to how that requirement is defined, and to the hardware device used for the wallet. No standards exist for how a user is authenticated in the wallet or what level of initial user authentication is required. This could be improved by creating a standard, adopted by the state body, for how a wallet must authenticate its users. Then any wallet which fulfills the standard could be used for state-issued verifiable credentials.

Backup and recovery of wallets becomes complicated because the user is bound to the software wallet. If I lose my wallet, or would like to switch wallets, a safe, secure, standardized way of proving that the new wallet has authenticated the same person as the initial wallet (or a person of trust) would be required. All issued credentials would probably need to be re-issued. The user of the wallet and the wallet instance are tightly coupled.

Verifier authorization only, not authentication: The verifier does not authenticate the user behind the wallet; it just accepts that this was done correctly. This creates a closed system between the verifier and the wallet, even though it is distributed. The verifier is tightly coupled in this relationship if it blindly trusts verifiable credentials from wallets outside its system scope. If the verifier needs to verify the identity, then FIDO2, OIDC, OAuth2, PKI, or existing online verification flows could be used as a second factor.

Single point of failure: If the credential issuer's VCs can no longer be trusted, then all verifiers using the credentials need to revoke everything created from those VCs. This is not a problem if the verifier authenticates its users and identities directly.

Solution 2: Use the OIDC SIOP standard to authenticate when issuing and verifying the verifiable credentials

https://openid.net/specs/openid-connect-self-issued-v2-1_0-07.html

https://openid.net/specs/openid-connect-4-verifiable-presentations-1_0-08.html

A second way of solving user authentication in SSI is to use OpenID Connect and SIOP. The credential issuer runs its own OpenID Connect server with pre-registered users whose identities have been correctly verified. The credential issuer is responsible for identifying and authenticating the identity, i.e. the user plus the application. Each credential type that is issued requires a specific OpenID Connect client. When the user, using the SIOP agent from his or her wallet, tries to add a new verifiable credential using SIOP, the user is required to authenticate against the identity provider (IDP). The same approach can also be used when verifying credentials. With this, any wallet that supports the SIOP agent and the correct verifiable credential type can work. Strong authentication is not required on the wallet because it is part of the flows, and the user does not need to be bound to the wallet. However, if the verifier does not authenticate the user or application sending the verifiable credentials, then strong authentication would still be required on the wallet.
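As a rough illustration only, a simplified SIOP-style request from the issuer or verifier to the wallet might be constructed along these lines. The parameter names are loosely based on the drafts linked above and the values are placeholders; real deployments need signed request objects, client registration metadata, and presentation definitions.

// Illustrative only: a simplified self-issued OpenID (SIOP) style request.
// Parameter names loosely follow the drafts linked above.
import { randomUUID } from "crypto";

function buildSiopRequest(redirectUri: string): string {
  const params = new URLSearchParams({
    response_type: "id_token",
    scope: "openid",
    client_id: redirectUri,   // in SIOP the redirect URI commonly acts as the client_id
    redirect_uri: redirectUri,
    nonce: randomUUID(),      // binds the wallet's response to this request
  });
  // The wallet (self-issued OP) typically receives this via a custom scheme or QR code.
  return `openid://?${params.toString()}`;
}

console.log(buildSiopRequest("https://verifier.example.com/siop-callback"));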

Problems with this solution:

Requires an OIDC server: All credential issuers require an OpenID Connect server and a separate client per credential type.

Verifier authorization only, not authentication: The verifier still only gets proof of possession. Verifiers need to start a SIOP verification and must trust the OIDC server used for the client. It is the OIDC server that authenticates, not the verifier.

Single point of failure: If the credential issuer's VCs can no longer be trusted, then all verifiers using the credentials need to revoke everything created from those VCs. This is not a problem if the verifier authenticates its users directly.

Solution 3: Verifiers authenticate the user correctly before trusting the verifiable credentials sent from an unspecified wallet.

Another way of solving this is for all credential issuers and all verifiers to authenticate the user behind a verifiable credential using their own process. This avoids the single point of failure. Each sign-in would require an extra authentication step; for example a FIDO2 key, PKI authentication, or some OIDC flow can be used. SSI could be used as the first factor in the application authentication. This solution works really well, but many of the advantages of SSI are lost.
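A minimal sketch of this combination might look like the following, assuming hypothetical verifyPresentation and verifySecondFactor helpers (neither is a real library API). The point is that the verifier owns the second factor and therefore does not have to trust the wallet's non-standardized user authentication.

// Sketch of Solution 3: VC possession check plus an independent second factor.
// All helpers are hypothetical placeholders, not a real library API.
interface SignInRequest {
  presentation: unknown;          // the verifiable presentation from the wallet
  secondFactorAssertion: unknown; // e.g. a FIDO2/WebAuthn assertion or OIDC token
  userId: string;
}

async function verifyPresentation(presentation: unknown): Promise<boolean> {
  // placeholder for cryptographic VC/VP verification (possession only)
  return presentation !== undefined;
}

async function verifySecondFactor(userId: string, assertion: unknown): Promise<boolean> {
  // placeholder for a FIDO2, PKI, or OIDC check owned by the verifier itself
  return assertion !== undefined && userId.length > 0;
}

// The verifier only signs the user in when both checks pass.
async function signIn(req: SignInRequest): Promise<boolean> {
  const hasCredential = await verifyPresentation(req.presentation);
  const isSameUser = await verifySecondFactor(req.userId, req.secondFactorAssertion);
  return hasCredential && isSameUser;
}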

Problems with this solution:

All applications require their own authentication. This is more effort than in a closed system, but all applications need to do this anyway, so it is not really a disadvantage. If you control both the issuer and the verifier, then the verifier application could just authorize the verifiable credential.

SSI adds little value. Because a second authentication method is used everywhere, the system would also work without SSI, so why add SSI in the first place?

Summary

User authentication is not an easy problem to solve, and SSI at present does not solve it in a standard way. All existing solutions do something vendor specific or solve it within a closed system. As soon as interoperability is required, the problems begin. It is important to solve this in a way which does not require a vendor-specific solution or create a monopoly for one vendor. At present, SSI solutions still show very little convergence. We have different ledgers which only work with specific agents. We have different agents (SIOP, DIDComm V1 and V2) which are only supported by certain wallets. We have different verifiable credential standards which do not work together. We have no authentication standards for wallets and no standard for backup and recovery. It is still not clear how the trust registry for credential issuers will work; as an application verifier I need an easy way to validate the quality of a credential issuer, otherwise how can I know the credential was issued properly without doing my own security check? Guardianship will further complicate the user authentication process.

Links:

https://github.com/swiss-ssi-group

https://github.com/e-id-admin

https://openid.net/specs/openid-connect-self-issued-v2-1_0-07.html

https://openid.net/specs/openid-connect-4-verifiable-presentations-1_0-08.html

https://www.w3.org/TR/vc-data-model/