I’ve recently been looking into different serialization options. While plenty of write-ups are already available (even for C#), I wanted to:
- Have one about C#
- Learn something new 😉
- Look into my particular data distribution / characteristics
- Understand not only performance, but also size impact
I’d say there are very few summaries that you should simply rely on. With some exceptions, serialization is generally pretty fast, so the choice you make also depends on the following considerations:
- Do you need it to be ‘human readable’?
- Is it enough to have a tool that presents the serialized data for you?
- Do you need tagged or untagged serialization? Untagged serialization can be noticeably faster and smaller.
What if even the field names aren’t preserved in the serialized form? Without a schema agreed upon ahead of time, some consumers of your data (e.g. data visualization tools) might not know how to interpret it.
- Do you care about maximum compactness of the data representation?
- How much do you really care about performance? If you are going to send the data ‘over the wire’, chances are that any of the serializers will be ‘fast enough’.
- Do you need cross-language support? Do you need code generation for those languages?
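To make the size cost of preserving field names concrete, here is a minimal sketch using only `BinaryWriter`. It is a toy layout, not the wire format of any serializer in this post (real tagged formats usually write small numeric field ids rather than names), and the `TaggedVsUntagged` class and its method names are hypothetical.

```csharp
using System.IO;
using System.Text;

public static class TaggedVsUntagged
{
    // Self-describing layout: every value is preceded by its field name.
    public static byte[] WriteTagged((string Name, string Value)[] fields)
    {
        using var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            foreach (var (name, value) in fields)
            {
                writer.Write(name);   // length-prefixed UTF-8 field name
                writer.Write(value);  // length-prefixed UTF-8 value
            }
        }
        return stream.ToArray();
    }

    // Schema-dependent layout: values only, in an order both sides agreed on.
    public static byte[] WriteUntagged((string Name, string Value)[] fields)
    {
        using var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            foreach (var (_, value) in fields)
            {
                writer.Write(value);
            }
        }
        return stream.ToArray();
    }
}
```

The untagged output is smaller by exactly the bytes spent on the names, but a reader needs the schema (here, the field order) to make sense of it.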
I want to determine the time and size impact of different serializers. I am more concerned about size, assuming that time will be about the same for the majority of serializers. The goal of this post is not to document the differences between serialization stacks (analysis of languages, APIs, etc.).
Serializers in the set
Almost all of the serializers in this list support some kind of RPC on their own; I’ll leave that part out of this analysis.
- Newtonsoft.Json – a good JSON serializer. It supports the ‘DataContract’ attributes from System.Runtime.Serialization, and it also supports BSON. (todo: BSON)
- Thrift – a fully fledged, cross-language, cross-platform ‘RPC’ library (service development), used (among others) by Twitter and Salesforce (also combined with Finagle). It has a language-agnostic data (and service) definition layer, which is then compiled to the language of your choice. Various serialization options are available.
- Avro – a data serialization system that also provides an RPC layer if needed. It relies on schemas (the schema is embedded with the message/data), but code does not have to be generated (unlike Thrift and Protobuf).
- Bond – Microsoft’s (cross-platform) serialization mechanism.
- ProtoBuf(fers) (GitHub) – very similar to Thrift, although Thrift is more RPC oriented. See “Protobuf vs Thrift” on Stack Overflow.
- MessagePack – can work without a schema.
- DataContractSerializer (XML) – the built-in .NET serializer.
- Binary formatter – the built-in binary formatter / serializer for .NET. You’d probably never use it in production, as it doesn’t really provide any backward compatibility in case of schema changes.
About the project
I used BenchmarkDotNet for the performance experiments. While it puts some constraints on code layout, it does more than time the code: it sets up the measurement properly (warm-up, repeated iterations) and also reports approximate allocations and garbage collections.
If a serializer cannot use a standard data type, I use AutoMapper to map from my original type to the type required by that serializer. Since some serializers don’t handle nulls, at some point I decided not to have nulls in my properties.
You can find the sources on GitHub.
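For readers who have not used it, a benchmark class might look roughly like the sketch below. It assumes the BenchmarkDotNet NuGet package is referenced and the project is built in Release mode. The `Page` type, its members and the test data are hypothetical stand-ins, not the code from the repository.

```csharp
using System.IO;
using System.Runtime.Serialization;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical payload type: a URL, response headers as text, and raw content.
[DataContract]
public class Page
{
    [DataMember] public string Url { get; set; } = "";
    [DataMember] public string ResponseHeaders { get; set; } = "";
    [DataMember] public byte[] Content { get; set; } = new byte[0];
}

[MemoryDiagnoser] // adds the Gen 0/1/2 and Allocated columns to the report
public class SerializerBenchmarks
{
    private Page _page = new Page();
    private DataContractSerializer _serializer = new DataContractSerializer(typeof(Page));

    [GlobalSetup] // runs once, outside the measured code
    public void Setup()
    {
        _page = new Page
        {
            Url = "http://example.com/",
            ResponseHeaders = "Content-Type: text/html",
            Content = new byte[64 * 1024],
        };
        _serializer = new DataContractSerializer(typeof(Page));
    }

    [Benchmark]
    public long Xml()
    {
        using var stream = new MemoryStream();
        _serializer.WriteObject(stream, _page);
        return stream.Length; // return a value so the work cannot be optimized away
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<SerializerBenchmarks>();
}
```

Creating the object and the serializer in the setup method keeps one-time costs out of the measured time, which matches the approach described in this post.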
The test data
I decided to test two scenarios. One is an object that contains 4 strings. The other one contains binary data (in my case the ‘binary’ is HTML). It is meant to represent content fetched by a web fetcher, so the data contains the Url (as a string), the Response Header (as text, since it’s ANSI), and the Content (as bytes), since the fetcher itself might not know what encoding to apply.
I generated two objects, which I use across all tests. Both objects are generated before the tests are performed. All serializer instances are also created before the test begins; I assume the one-time creation time is negligible (even if it’s one-time per type).
What I didn’t test is how these serializers handle nested objects, cycles, etc. While all of them work fine with nested objects, they differ in cycle handling, and some of them are configurable in that regard. Note that cycle (and reference) handling almost always has an additional performance impact, hence it was out of scope.
Size & compressibility
One of the goals is to save space. It turns out that you cannot do much better than untagged serialization that supports binary arrays as a first-class citizen. Of course, there is an open question of how inheritance is implemented (if supported), but that is outside the scope of this document.
Let’s take a look at the basic object (4 strings). The first column shows the uncompressed size; the second shows the size after compressing the data with DeflateStream. The third column shows the size of the object as a percentage of the largest one, and the fourth shows the size of the compressed object as a percentage of the largest uncompressed object.
As we can see, all of the ‘top of the line’ serializers produce very similar sized results (well, to be frank, with 4 strings there is not much rocket science in that).
| uncompressed bytes | compressed (optimal) | % of max | % of max compressed |
| uncompressed bytes | compressed (optimal) | % of max | % of max compressed | % of max without outliers | % of max without outliers compressed |
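The size columns described above can be computed along these lines. This is a minimal sketch using `DeflateStream` with `CompressionLevel.Optimal`. The `SizeReport` class and its method names are hypothetical, and the serialized payload is assumed to be available as a byte array.

```csharp
using System.IO;
using System.IO.Compression;

public static class SizeReport
{
    // Size in bytes of the payload after Deflate compression.
    public static long CompressedSize(byte[] payload)
    {
        using var output = new MemoryStream();
        // Dispose the DeflateStream before reading the length so that all
        // compressed data is flushed; leaveOpen keeps 'output' usable.
        using (var deflate = new DeflateStream(output, CompressionLevel.Optimal, leaveOpen: true))
        {
            deflate.Write(payload, 0, payload.Length);
        }
        return output.Length;
    }

    // Size expressed as a percentage of the largest size in the comparison.
    public static double PercentOfMax(long size, long max) => 100.0 * size / max;
}
```

For each serializer you would call `CompressedSize` on its output and pass both the raw and compressed sizes to `PercentOfMax`, using the largest uncompressed output as `max`.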
Since we’ve established that the differences are minor size-wise, let’s look at the performance results. Not all serializers were tested under the same conditions, but more on that a bit later.
On the large object, most serializers take on the order of 100-400 us. The built-in .NET serializers are the slowest (outside of that range), Newtonsoft.Json is slow as well, and the rest are not that far from each other.
The fastest configuration (Bond, as BondUnsafeSimpleReusedBuffer) reaches 16 us per serialization, which seems crazy. Note, however, that it didn’t allocate a single byte (or trigger a garbage collection), because I configured it to reuse the buffer. I didn’t do that with the other serializers, but if performance is important to you, you should consider using buffer pools to avoid unnecessary garbage collections. On the other hand, you still need to create your object, which involves memory operations, so the difference in the big picture might not be that noticeable. Having said that, the slowest BondUnsafeSimple variant, with additional buffer copying, is about as fast as Proto3 or Avro.
Surprisingly, Thrift is at the end of the peloton, and it looks like it allocates twice the required memory: while Avro, Bond and Proto3 allocate around 600 kB, Thrift and MessagePack allocate about 1.2 MB and are almost 2 times slower. This may well be because of how MemoryStream works: when it needs to expand, it doubles its allocation.
| Method | Mean | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|
| NewtonsoftJsonReusedSerializer | 1,456.6460 us | 4.5955 us | 324.4792 | 296.0938 | 292.7083 | 1.48 MB |
| NewtonsoftJsonGenericSerializer | 1,256.3022 us | 7.9291 us | 321.6146 | 242.9688 | 163.8021 | 1.69 MB |
| NewtonsoftJsonDataContract | 1,260.6177 us | 7.7955 us | 318.75 | 244.2708 | 161.9792 | 1.69 MB |
| Xml | 716.6923 us | 3.6916 us | 276.1719 | 250.651 | 249.8698 | 1.32 MB |
| DataContractJsonSerializer | 45,589.4428 us | 141.1303 us | 45.8333 | 45.8333 | 45.8333 | 4.57 MB |
| BondUnsafeCompact | 265.6632 us | 2.5722 us | 152.7344 | 137.5 | 136.849 | 857.03 kB |
| BondUnsafeSimple | 122.7984 us | 0.6369 us | 72.3307 | 56.7057 | 56.7057 | 382.2 kB |
| BondUnsafeSimpleCopied | 225.2874 us | 3.5498 us | 126.6276 | 111.1328 | 111.0026 | 698.52 kB |
| BondUnsafeCompactReused | 366.4184 us | 5.9772 us | 201.5625 | 185.8724 | 184.9609 | 1.17 MB |
| BondUnsafeCompactReusedCopied | 372.5610 us | 3.7343 us | 209.375 | 193.6849 | 192.7734 | 1.17 MB |
| BondUnsafeSimpleReused | 121.1955 us | 1.3657 us | 71.5169 | 55.9896 | 55.8919 | 382.13 kB |
| BondUnsafeSimpleReusedBuffer | 16.2607 us | 0.0823 us | – | – | – | 0 B |
| Proto3 | 213.8052 us | 2.3267 us | 108.0729 | 106.1198 | 106.1198 | 640.15 kB |
| BinaryFormatter | 415.9039 us | 5.1037 us | 197.526 | 195.5729 | 195.5729 | 1.27 MB |
| MessagePack | 381.4887 us | 4.9364 us | 190.3646 | 190.3646 | 190.3646 | 1.26 MB |
| Avro | 219.0960 us | 2.3580 us | 108.2031 | 107.2266 | 107.2266 | 637.4 kB |
| ThriftBinary | 379.5297 us | 3.1255 us | 192.3177 | 190.3646 | 190.3646 | 1.27 MB |
| ThriftCompact | 388.4297 us | 8.7005 us | 193.3594 | 191.5365 | 191.5365 | 1.27 MB |
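The MemoryStream growth behaviour can be observed in isolation, independently of any serializer. The sketch below records every capacity change while writing fixed-size chunks. The `MemoryStreamGrowth` class is hypothetical, and this only illustrates how the stream itself grows; it does not confirm that this is what happens inside the Thrift or MessagePack runs.

```csharp
using System.Collections.Generic;
using System.IO;

public static class MemoryStreamGrowth
{
    // Writes at least 'totalBytes' in chunks of 'chunkSize' and returns the
    // sequence of capacities the stream went through while growing.
    public static List<int> CapacitiesWhileWriting(int totalBytes, int chunkSize)
    {
        var capacities = new List<int>();
        var chunk = new byte[chunkSize];
        using var stream = new MemoryStream();
        int lastCapacity = stream.Capacity;

        for (int written = 0; written < totalBytes; written += chunkSize)
        {
            stream.Write(chunk, 0, chunk.Length);
            if (stream.Capacity != lastCapacity)
            {
                // Each change means a new, larger internal array was allocated
                // and the existing contents were copied into it.
                lastCapacity = stream.Capacity;
                capacities.Add(lastCapacity);
            }
        }
        return capacities;
    }
}
```

Printing the result of `CapacitiesWhileWriting(600_000, 4096)` shows how far the final capacity overshoots the payload. Keep in mind that calling `ToArray()` on the stream afterwards allocates yet another copy.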
On the smaller object, the performance was unmeasurable with the default BenchmarkDotNet settings (it was too fast). I might come back to these tests later.
What really matters in terms of performance
Based on the test results, I’d risk to say that allocations & garbage collections have the largest impact on perf. Most performance problems come from allocation of memory. If you can avoid additional allocations, you will notice (in some cases), 50% performance improvements. If your application is serialization heavy, using buffer pools can significantly enhance your performance. Keep in mind that it might not matter in your application. Chances are that your logic is way more time consuming than the serialization itself.
I briefly touched on this point in the previous section, but let’s compare BondUnsafeSimple, BondUnsafeSimpleCopied and BondUnsafeSimpleReusedBuffer. The first and second differ in that, in the second, the contents of Bond’s “OutputBuffer” are copied into a new array (the buffer has more capacity than the size of the serialized data). You probably won’t do that if you are saving the object to disk or sending it over the wire, but you can see that the copy operation basically doubles the memory allocation. Similarly, BondUnsafeSimpleReusedBuffer differs from BondUnsafeSimple in that it doesn’t even recreate the “OutputBuffer” for subsequent serializations. Once the buffer has grown to a certain size (and gets reused), no more reallocations are required. This proves (or at least hints!) that the majority of the time in serialization is spent in memory allocation and not in actually writing out the data (especially when we are talking about copying bytes into a stream).
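The same reuse-and-don’t-copy idea can be sketched for any serializer that writes to a `Stream`, using a single long-lived `MemoryStream`. This is an illustration of the technique only, not Bond’s `OutputBuffer` API, and the `ReusableBufferWriter` class is hypothetical.

```csharp
using System;
using System.IO;

public sealed class ReusableBufferWriter
{
    private readonly MemoryStream _buffer = new MemoryStream();

    public int Capacity => _buffer.Capacity;

    // Runs 'serialize' against the shared stream and returns a view over the
    // written bytes without copying them. The view is only valid until the
    // next call to Write.
    public ArraySegment<byte> Write(Action<Stream> serialize)
    {
        _buffer.SetLength(0); // resets length and position, keeps the capacity
        serialize(_buffer);
        return new ArraySegment<byte>(_buffer.GetBuffer(), 0, (int)_buffer.Length);
    }
}
```

After the first few calls the stream has grown to fit the largest payload, so later serializations allocate nothing for the buffer. Returning an `ArraySegment<byte>` instead of calling `ToArray()` avoids the extra copy discussed above, at the price of the caller having to consume the bytes before the next serialization.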
If you need to serialize data that contains binaries (as well as other properties), you have lots of choices. None of them involve a ‘human readable’ representation. Any of Avro, Thrift, MessagePack or Proto3 would do the trick. It seems that Avro, Proto3 and Bond might stand out. Proto3 is well established, and so is Bond (it is public knowledge that Bond is used in large-scale infrastructure at Microsoft). I will look later into whether I am doing something wrong with Thrift that would cause it to be ~70% slower (and use more memory) than the others.
What about deserialization?