Protobuf In C# .NET – Part 4 – Performance Comparisons

This is a 4 part series on working with Protobuf in C# .NET. While you can start anywhere in the series, it’s always best to start at the beginning!

Part 1 – Getting Started
Part 2 – Serializing/Deserializing
Part 3 – Using Length Prefixes
Part 4 – Performance Comparisons

We’ve mentioned in previous posts that Protobuf (supposedly) outperforms many other data formats, namely JSON. And while we’ve alluded to the fact that it’s “fast” and “small”, we haven’t really jumped into the actual numbers.

This post will take a look across three different metrics :

File Size – So just how lightweight is Protobuf?
Serialization – How fast can we take a C# object and serialize it into Protobuf or JSON?
Deserialization – Given a Protobuf/JSON data format, how fast can we turn it into a C# object?

Let’s jump right in!

File Size Comparisons

Before looking at read/write performance, I wanted to compare how large the output actually is between Protobuf and JSON. I set up a really simple test that used the following model :

using System.Collections.Generic;
using ProtoBuf;

[ProtoContract]
public class Person
{
    [ProtoMember(1)]
    public string FirstName { get; set; }

    [ProtoMember(2)]
    public string LastName { get; set; }

    [ProtoMember(3)]
    public List<string> Emails { get; set; }
}

And I used the following code to create an object, and write it twice. Once with Protobuf and once with JSON :

var person = new Person
{
    FirstName = "Wade",
    LastName = "G",
    Emails = new List<string>
    {
        "[email protected]",
        "[email protected]"
    }
};

// Write the Protobuf version. A single object needs no length prefix.
using (var fileStream = File.Create("person.buf"))
{
    Serializer.Serialize(fileStream, person);
}

// Write the JSON version using JSON.NET.
var personString = JsonConvert.SerializeObject(person);
File.WriteAllText("person.json", personString);

The results were :

Format      File Size
Protobuf    46 bytes
JSON        85 bytes

So just by default, Protobuf is almost half the size. Obviously your mileage may vary depending on your data types and even your property names.

That last point is important because while Protobuf has other mechanisms keeping the size down, a big part of it is that each property is identified by its integer field number (the value passed to ProtoMember) rather than by its name in string form. To illustrate this, I modified the model to look like so :

[ProtoContract]
public class Person
{
    [ProtoMember(1)]
    [JsonProperty("1")]
    public string FirstName { get; set; }

    [ProtoMember(2)]
    [JsonProperty("2")]
    public string LastName { get; set; }

    [ProtoMember(3)]
    [JsonProperty("3")]
    public List<string> Emails { get; set; }
}

So now our JSON will be serialized with single digit names as well. When running this, our actual comparison table looks like so :

Format                        File Size
Protobuf                      46 bytes
JSON                          85 bytes
JSON With Digit Properties    65 bytes

So half of the size benefit of using Protobuf instantly disappears! The three property names shrink from 9, 8 and 6 characters down to 1 character each, which saves 20 of the 39 bytes that separated JSON from Protobuf. For now, I’m not going to use the single digit properties going forward because it’s not illustrative of what happens in the real world with JSON, but it’s an interesting little footnote that you can shrink your disk footprint with just this one simple hack that storage providers hate.
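To make the "integers instead of names" point a bit more concrete, here is a rough sketch of how a single string field is laid out in the Protobuf wire format: one tag byte built from the field number and the wire type (2 for length-delimited data such as strings), one length byte, then the UTF-8 text. The WireFormatSketch class and EncodeStringField method are names I made up for illustration, and the code assumes small field numbers and short strings so that the tag and the length each fit in one byte. It is not how you would write Protobuf in practice; the library does all of this for you.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class WireFormatSketch
{
    // Hand-encodes one string field: tag byte, length byte, UTF-8 payload.
    // Assumption: fieldNumber is below 16 and the payload is shorter than
    // 128 bytes, so neither value needs a multi-byte varint.
    public static byte[] EncodeStringField(int fieldNumber, string value)
    {
        const int lengthDelimitedWireType = 2;
        byte[] payload = Encoding.UTF8.GetBytes(value);

        var bytes = new List<byte>();
        bytes.Add((byte)((fieldNumber << 3) | lengthDelimitedWireType));
        bytes.Add((byte)payload.Length);
        bytes.AddRange(payload);
        return bytes.ToArray();
    }

    public static void Main()
    {
        // FirstName is field 1 in the Person model.
        byte[] protobufField = EncodeStringField(1, "Wade");

        // The same property as it appears inside a JSON document.
        string jsonField = "\"FirstName\":\"Wade\"";

        Console.WriteLine(BitConverter.ToString(protobufField));  // 0A-04-57-61-64-65
        Console.WriteLine(protobufField.Length);                  // 6
        Console.WriteLine(Encoding.UTF8.GetByteCount(jsonField)); // 18
    }
}
```

So for this one property, the field costs 2 bytes of overhead in Protobuf versus 14 bytes of quotes, colon and property name in JSON, and that overhead is paid again for every property on every object.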

So overall, Protobuf has JSON beat when it comes to file size. That’s no surprise, but what about actual performance when working with objects?

Serialization Performance

Next, let’s take a look at serializing performance. There are a couple of notes on the methodology behind this :

Because Protobuf serializes to bytes and JSON to strings, I wanted to leave them like that. i.e. I did not take the JSON string and convert it into bytes, as this would artificially create an overhead when there is no need.
I kept everything in memory (I did not write to a file etc)
I wanted to try and use *both* JSON.NET and Microsoft’s JSON Serializer. The latter is almost certainly going to be faster, but the former probably has more use cases out there in the wild.
For now, I’m just using the protobuf-net library for everything related to Protobuf
I used Protobuf as the “baseline”, so everything will be compared by how much slower (Or faster, you never know!) it is than Protobuf

With that in mind, here’s the benchmark using BenchmarkDotNet (Quick guide if you haven’t seen it before here : https://dotnetcoretutorials.com/2017/12/04/benchmarking-net-core-code-benchmarkdotnet/)

using System.Collections.Generic;
using System.IO;
using BenchmarkDotNet.Attributes;

public class ProtobufVsJSONSerializeBenchmark
{
    static Person person = new Person
    {
        FirstName = "Wade",
        LastName = "G",
        Emails = new List<string>
        {
            "[email protected]",
            "[email protected]"
        }
    };

    // Protobuf is the baseline that the JSON serializers are compared against.
    [Benchmark(Baseline = true)]
    public byte[] SerializeProtobuf()
    {
        using (var memoryStream = new MemoryStream())
        {
            ProtoBuf.Serializer.Serialize(memoryStream, person);
            return memoryStream.ToArray();
        }
    }

    [Benchmark]
    public string SerializeJsonMicrosoft()
    {
        return System.Text.Json.JsonSerializer.Serialize(person);
    }

    [Benchmark]
    public string SerializeJsonDotNet()
    {
        return Newtonsoft.Json.JsonConvert.SerializeObject(person);
    }
}

And the results?

Format            Average Time    Baseline Comparison
Protobuf          680ns           (baseline)
Microsoft JSON    743ns           9% Slower
JSON.NET          1599ns          135% Slower

So we can see that Protobuf is indeed faster, but not by a heck of a lot. And of course, I’m willing to bet a keen-eyed reader will drop a comment below and tell me how the benchmark could be improved to make Microsoft’s JSON serializer even faster.

Of course JSON.NET is slower, and that is to be expected, but again I’m surprised that Protobuf, while fast, isn’t *that* much faster. How about deserialization?
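One variation that might be worth trying, and this is my own assumption rather than something measured in this post, is that System.Text.Json can also write straight to UTF-8 bytes with JsonSerializer.SerializeToUtf8Bytes instead of building a string first. That would put the JSON side on the same "bytes out" footing as Protobuf. The sketch below only shows that both calls describe the same JSON document; the Utf8Sketch class and ProducesSameText method are hypothetical names for illustration, and it makes no claim about which call is faster.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<string> Emails { get; set; }
}

public static class Utf8Sketch
{
    // Serializes the same object to a string and to UTF-8 bytes,
    // then checks that both hold the same JSON text.
    public static bool ProducesSameText(Person person)
    {
        string asString = JsonSerializer.Serialize(person);
        byte[] asBytes = JsonSerializer.SerializeToUtf8Bytes(person);
        return Encoding.UTF8.GetString(asBytes) == asString;
    }

    public static void Main()
    {
        var person = new Person
        {
            FirstName = "Wade",
            LastName = "G",
            Emails = new List<string>()
        };

        Console.WriteLine(ProducesSameText(person)); // True
    }
}
```

If you wanted to include this in the benchmark, it would be one extra [Benchmark] method returning byte[] alongside the existing ones.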

Deserialization Performance

We’ve done serialization, so let’s take a look at the reverse – deserialization.

I do want to point out one thing before we even start, and that is that JSON.NET and Microsoft’s JSON library handle case sensitivity with JSON *very* differently. In fact, JSON.NET is case insensitive by default, and that is the *only* way it can run. Microsoft’s JSON library is case sensitive by default and must be switched to handle case insensitivity, at a huge cost. I have an entire article dedicated to the subject here : https://dotnetcoretutorials.com/2020/01/25/what-those-benchmarks-of-system-text-json-dont-mention/

In some ways, that somewhat invalidates our entire test (At least when comparing JSON.NET to Microsoft’s JSON), because it actually entirely depends on whether your JSON is in the exact casing you require (In most cases that’s going to be PascalCase), or if it’s in camelCase (In which case you take a performance hit). But for now, let’s push that aside and try our best to create a simple benchmark.
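If you haven’t run into the case sensitivity difference before, here is a minimal sketch of what it looks like with Microsoft’s library. It feeds camelCase JSON into a PascalCase model, first with the default options and then with PropertyNameCaseInsensitive switched on. The CaseSensitivitySketch class and ReadFirstName method are hypothetical names for illustration, and the cut-down Person class here only has the one property.

```csharp
using System;
using System.Text.Json;

public class Person
{
    public string FirstName { get; set; }
}

public static class CaseSensitivitySketch
{
    // Deserializes the JSON and returns FirstName, optionally asking
    // System.Text.Json to ignore casing when matching property names.
    public static string ReadFirstName(string json, bool caseInsensitive)
    {
        var options = new JsonSerializerOptions
        {
            PropertyNameCaseInsensitive = caseInsensitive
        };
        return JsonSerializer.Deserialize<Person>(json, options).FirstName;
    }

    public static void Main()
    {
        // camelCase JSON, while the C# property is PascalCase.
        string json = "{\"firstName\":\"Wade\"}";

        // Default behaviour: "firstName" does not match FirstName, so it stays null.
        Console.WriteLine(ReadFirstName(json, caseInsensitive: false) == null); // True

        // Opting in to case insensitive matching binds the value.
        Console.WriteLine(ReadFirstName(json, caseInsensitive: true)); // Wade
    }
}
```

The benchmark below sidesteps all of this because the JSON it deserializes was produced from the same model, so the casing already matches exactly.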

Other things to note :

Again, I want to work with the native representation of each data format. So Protobuf will be deserializing from a byte array, and JSON will be deserializing from a string
I *had* to create a memory stream for Protobuf. At least without making the test more complicated than it needed to be.

using System.Collections.Generic;
using System.IO;
using BenchmarkDotNet.Attributes;

// Note: Person must be declared public for these public members to compile.
public class ProtobufVsJSONDeserializeBenchmark
{
    public static Person person = new Person
    {
        FirstName = "Wade",
        LastName = "G",
        Emails = new List<string>
        {
            "[email protected]",
            "[email protected]"
        }
    };

    static byte[] PersonBytes;
    static string PersonString;

    // Serialize once up front so the benchmarks only measure deserialization.
    [GlobalSetup]
    public void Setup()
    {
        using (var memoryStream = new MemoryStream())
        {
            ProtoBuf.Serializer.Serialize(memoryStream, person);
            PersonBytes = memoryStream.ToArray();
        }

        PersonString = Newtonsoft.Json.JsonConvert.SerializeObject(person);
    }

    [Benchmark(Baseline = true)]
    public Person DeserializeProtobuf()
    {
        using (var memoryStream = new MemoryStream(PersonBytes))
        {
            return ProtoBuf.Serializer.Deserialize<Person>(memoryStream);
        }
    }

    [Benchmark]
    public Person DeserializeJsonMicrosoft()
    {
        return System.Text.Json.JsonSerializer.Deserialize<Person>(PersonString);
    }

    [Benchmark]
    public Person DeserializeJsonDotNet()
    {
        return Newtonsoft.Json.JsonConvert.DeserializeObject<Person>(PersonString);
    }
}

I know it’s a big bit of code to sift through but it’s all relatively simple. We are just deserializing back into a Person object. And the results?

Format            Average Time    Baseline Comparison
Protobuf          1.019us         (baseline)
Microsoft JSON    1.238us         21% Slower
JSON.NET          2.598us         155% Slower

So overall, Protobuf wins again and by a bigger margin this time than our Serialization effort (When it comes to percentage). But again, your mileage will vary heavily depending on what format your JSON is in.

Conclusion

The overall conclusion is that indeed, Protobuf is faster than JSON by a reasonable margin, or a huge margin if comparing it to JSON.NET. However, in some respects a big part of the difference is likely to lie in how JSON is always serialized as strings versus the direct byte serialization of Protobuf. But that’s just a hunch of mine.

When it comes to file size, Protobuf wins out again, *especially* when serializing full JSON property names. Obviously here we are talking about the difference between a few bytes, but if the ratio measured above (46 bytes versus 85 bytes) held at scale, 500GB of data in Protobuf would be somewhere over 900GB in JSON, so it definitely adds up.

That’s all I’m doing on Protobuf for a bit and I hope you’ve learnt something a bit new. Overall, just in my personal view, don’t get too dragged into the hype. Protobuf is great and it does what it says on the tin. But it’s just another data format, nothing to be afraid of!

The post Protobuf In C# .NET – Part 4 – Performance Comparisons appeared first on .NET Core Tutorials.
