Taking the EF Core Azure Cosmos DB Provider for a Test Drive

The release of EF Core 6.0 is right on the horizon (as I write this). The team has been hard at work adding features. One area of focus is the Azure Cosmos DB experience. We received feedback that many developers would prefer to use the provider for Cosmos DB but are waiting for certain key features.

Planetary Docs

I built a reference app that uses Azure Cosmos DB with EF Core on Blazor Server. It includes search capability, cross-referenced entities, and an interface to create, read, and update. I recently upgraded to the latest EF Core 6.0 version and was able to simplify and remove quite a bit of code!

Feature overview

Here are some of the requested features that we added to the EF Core 6.0 Azure Cosmos DB provider.

Implicit ownership

EF Core was built as an object relational mapper. In relational databases, complex relationships are expressed by storing related entities in separate tables and referencing them with foreign keys. EF Core assumes non-primitive entity types encountered in a parent are expressed as foreign key relationships. The relationships are configured using HasMany or HasOne and the instances are assumed to exist independently with a configured relationship.

In document databases, the default behavior for entity types is to assume they are embedded documents owned by the parent. In other words, the complex type’s data exists within the context of the parent. In previous versions of EF Core, this behavior had to be configured explicitly for it to work with the Azure Cosmos DB provider. In EF Core 6.0, ownership is implicit. This saves configuration and ensures the behavior is consistent with NoSQL approaches from other providers.

For example, in Planetary Docs there are authors and tags. Each of these entities “owns” a list of summaries that hold the URLs and titles of related documents. This way, when a user asks, “What documents have tag X?” I only need to load one document to answer the question (I load tag X, then iterate its owned collection of titles). Using EF Core 5, I had to explicitly claim ownership:

tagModel.OwnsMany(t => t.Documents);
authorModel.OwnsMany(t => t.Documents);

In EF Core 6, the ownership is implicit so there is no need to configure the entities except to specify partition keys.
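As a rough illustration of what remains, here is a minimal sketch of an EF Core 6 model configuration that only specifies keys and partition keys. The type and property names (TagEntry, DocSummary, TagName, DocsContextSketch) are hypothetical stand-ins rather than the actual Planetary Docs code, and the sketch assumes the Microsoft.EntityFrameworkCore.Cosmos package is referenced.

```csharp
using System.Collections.Generic;
using Microsoft.EntityFrameworkCore;

// Hypothetical summary type that is embedded in its parent document.
public class DocSummary
{
    public string Uid { get; set; }
    public string Title { get; set; }
}

// Hypothetical tag entity that owns a list of summaries.
public class TagEntry
{
    public string TagName { get; set; }
    public List<DocSummary> Documents { get; set; } = new List<DocSummary>();
}

public class DocsContextSketch : DbContext
{
    public DocsContextSketch(DbContextOptions<DocsContextSketch> options)
        : base(options) { }

    public DbSet<TagEntry> Tags { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Use the tag name as the entity key.
        modelBuilder.Entity<TagEntry>()
            .HasKey(t => t.TagName);

        // Use the same property as the partition key.
        modelBuilder.Entity<TagEntry>()
            .HasPartitionKey(t => t.TagName);

        // Note what is absent: there is no OwnsMany(t => t.Documents) call.
        // With the EF Core 6 Cosmos DB provider, the summaries are treated
        // as owned (embedded) by convention.
    }
}
```

Choosing the same property for the key and the partition key is only a simplification for this sketch; your own partitioning strategy may differ.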

Support for primitive collections

In relational databases, primitive collections are often modeled by either promoting them to complex types or converting them to a serialized artifact to store in a single column. Consider a blog post that can have a list of tags. One common approach would be to create an entity that represents a tag:

public class Tag
{
    public int Id { get; set; }
    public string Text { get; set; }
}

The tag is then referenced:

public ICollection<Tag> Tags { get; set; }

The primitive is promoted to a complex type and stored in a separate table. An alternative is to collapse the tags into a single field that contains a comma-delimited list. This approach requires a value converter to marshal the list into the field for updates and decompose the field back into the list for reads. It also makes it difficult and expensive to answer questions like, “How many posts are tagged X?”

Using EF Core 5, I chose the single column approach. I serialized the list to JSON when writing and deserialized it when reading. This is the serialization code:

private static string ToJson<T>(T item) => JsonSerializer.Serialize(item);
private static T FromJson<T>(string json) => JsonSerializer.Deserialize<T>(json);

I configured EF Core to make the conversions:

docModel.Property(d => d.Tags)
    .HasConversion(
        t => ToJson(t),
        t => FromJson<List<string>>(t));

And the resulting document looked like this:

{
    "tags": "[\"one\", \"two\", \"three\"]"
}

With EF Core 6.0, I simply deleted the conversion code to take advantage of the built-in handling of primitive collections. This results in a document like this:

{
    "tags": [
        "one",
        "two",
        "three"
    ]
}

This is a schema change that Azure Cosmos DB has no problem handling. The C# code, on the other hand, will throw when a current model that expects tags as an array encounters a legacy record that stored tags as a single string. How do we handle this when EF Core doesn’t have the concept of NoSQL migrations?
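To make the mismatch concrete, here is a small standalone sketch that reproduces the same kind of failure outside of EF Core. It uses System.Text.Json purely for illustration (this is an assumption for the demo, not a statement about which serializer the provider uses internally), and DocSketch is a hypothetical stand-in for the document model.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical stand-in for the document model; only the tags matter here.
public class DocSketch
{
    public List<string> Tags { get; set; } = new List<string>();
}

public static class Program
{
    public static void Main()
    {
        // Match the lowercase "tags" JSON property to the Tags C# property.
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

        // Current shape: tags stored as a real JSON array.
        var current = JsonSerializer.Deserialize<DocSketch>(
            "{\"tags\": [\"one\", \"two\", \"three\"]}", options);
        Console.WriteLine(current.Tags.Count);

        // Legacy shape: tags stored as one string that happens to contain JSON.
        try
        {
            JsonSerializer.Deserialize<DocSketch>(
                "{\"tags\": \"[\\\"one\\\", \\\"two\\\", \\\"three\\\"]\"}", options);
        }
        catch (JsonException)
        {
            // A string token cannot be converted to List<string>.
            Console.WriteLine("legacy document failed to deserialize");
        }
    }
}
```

The first document loads with three tags, while the legacy document lands in the catch block because a JSON string cannot be bound to a list.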

Raw SQL

A popular request is to allow developers to write their own SQL for data access. This is exactly the feature I needed to handle my code migration. For the raw SQL to work, it must project to an existing model. It is exposed as an extension of the DbSet<T> for the entity.

In my case, it enabled an in-place migration. After updating the code, attempting to load a document would fail. The document had a single string property for “Tags” but the C# model expects an array, so the JSON serializer would throw an exception. To remedy this, I used a built-in Azure Cosmos DB function that parses a string into an array. Using a query, I project the entity to a document that matches the current schema and then save it back. This is the migration code:

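// Project each legacy document into the current shape; STRINGTOARRAY parses the serialized tags.
var docs = await Documents.FromSqlRaw(
    "select c.id, c.Uid, c.AuthorAlias, c.Description, c.Html, c.Markdown, c.PublishDate, c.Title, STRINGTOARRAY(c.Tags) as Tags from c").ToListAsync();
foreach (var doc in docs)
{
    // Mark the entity as modified so it is rewritten using the array-based schema.
    Entry(doc).State = EntityState.Modified;
}
// Persist the re-projected documents (the "save it back" step).
await SaveChangesAsync();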

This feature also lets developers craft complex queries that may not be supported by the LINQ provider.

Additional enhancements

In addition to what I already covered, these enhancements also made it in.

For many-to-many relationships, EF Core now implicitly uses the partition key on the join type.
You can configure time-to-live (TTL) for documents at the instance, type, and collection levels.
You can configure container facets such as throughput, size, etc. through EF Core APIs.
We now log diagnostic events specific to Cosmos DB including query cost.
We added support for the DISTINCT operator in queries.
The LINQ provider now translates certain methods such as string manipulation and mathematical operators to their native Cosmos DB counterparts.
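As an example of the TTL and container settings mentioned in this list, the following sketch configures a default time-to-live and autoscale throughput through the model builder. SessionNote, OwnerId, the "Notes" container name, and the specific values are all hypothetical, and the sketch assumes the Microsoft.EntityFrameworkCore.Cosmos 6.0 package. As I understand it, throughput settings apply when EF Core creates the container, so treat this as a starting point rather than a definitive recipe.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical entity used only to demonstrate container configuration.
public class SessionNote
{
    public string Id { get; set; }
    public string OwnerId { get; set; }
    public string Text { get; set; }
}

public class NotesContextSketch : DbContext
{
    public NotesContextSketch(DbContextOptions<NotesContextSketch> options)
        : base(options) { }

    public DbSet<SessionNote> Notes { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<SessionNote>()
            // Store these entities in a container named "Notes" (assumed name).
            .ToContainer("Notes")
            // Partition by the note's owner (assumed property).
            .HasPartitionKey(n => n.OwnerId)
            // Default TTL for the container, in seconds (here: one day).
            .HasDefaultTimeToLive(60 * 60 * 24)
            // Autoscale throughput with an assumed maximum of 4000 RU/s.
            .HasAutoscaleThroughput(4000);
    }
}
```

If you prefer a fixed provisioned value, HasManualThroughput can be used in place of HasAutoscaleThroughput.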

Summary

I’m excited about the changes coming and hope that you are, too. Are you using the Cosmos DB provider? Are you considering it now that we’ve added these features? Is there something critical you need that we missed? Let me know in the comments below. Thank you!

The post Taking the EF Core Azure Cosmos DB Provider for a Test Drive appeared first on .NET Blog.
