.NET Performance Optimization: Time to Switch Serialization Protocols

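Single-machine performance has long been bounded by Moore's law, and with the rise of the mobile internet the limits of a single machine have become more and more obvious, holding back the whole industry. We cannot scale a system up indefinitely, but we can scale it out horizontally as a distributed system. That sounds great, but it also brings the problem this article is about: the more nodes a distributed system has, the more communication costs.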

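Network bandwidth keeps getting scarcer, so 10 Gbps NICs have become standard on our servers.
TCP/IP-based communication in the HTTP x.x era is inefficient, so we are about to adopt QUIC and HTTP/3.
Going through the socket protocol stack on the same machine is too slow, so we started using eBPF.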
....

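Our applications now spend too much time on network communication, and a large share of that time goes to serialization. Like most teams, we use JSON for the vast majority of our internal microservice communication. JSON has many advantages, first of all that it is human-friendly and can be read directly, but it also has a fatal weakness: serialization is slow and the serialized string is large.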

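On an earlier project I ran into exactly this choice. We had several hundred million rows to cache in Redis, each with several hundred fields. Stored as JSON, the data would take terabytes of memory, and deploying a cluster with primary/replica replication across multiple data centers multiplies that several times over, which was more than we could afford. So we went looking for a serialization format better than JSON.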

What are the options?

There are many serialization protocols available today, such as XML, JSON, Thrift, and Kryo. We picked the ones commonly used on the .NET platform for comparison:

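JSON: a lightweight data-interchange format. It stores and represents data in a text format that is completely independent of any programming language. Its concise, clear hierarchical structure makes it an ideal data-interchange language.
Protobuf: Protocol Buffers is a language-neutral, platform-neutral, extensible way of serializing structured data, usable for communication protocols, data storage, and more. It is similar to XML, but smaller, faster, and simpler.
MessagePack: an efficient binary serialization format. It lets you exchange data among multiple languages just like JSON, but it is faster and smaller. Small integers are encoded in a single byte, and a typical short string needs only one extra byte besides the string itself.
MemoryPack: an efficient binary serialization format designed specifically for C# by Yoshifumi Kawai. It takes advantage of many newer .NET features, it is code-first and works out of the box, and it also performs very well.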
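
To make the "one byte for a small integer, one extra byte for a short string" claim concrete, here is a minimal hand-rolled sketch of two rules from the MessagePack format specification: positive fixint and fixstr. The names MsgPackSketch, EncodeSmallInt, and EncodeShortString are hypothetical and exist only for this illustration; real code would use a library such as MessagePack-CSharp rather than encoding bytes by hand.

```csharp
using System;
using System.Text;

// Illustration only: hand-encodes two MessagePack rules (positive fixint and fixstr).
public static class MsgPackSketch
{
    // Positive fixint: integers 0..127 are stored as a single byte equal to the value.
    public static byte[] EncodeSmallInt(int value)
    {
        if (value < 0 || value > 127)
            throw new ArgumentOutOfRangeException(nameof(value), "Only 0..127 fit in a positive fixint.");
        return new[] { (byte)value };
    }

    // Fixstr: strings of up to 31 UTF-8 bytes get a one-byte header (0xA0 | length).
    public static byte[] EncodeShortString(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        if (utf8.Length > 31)
            throw new ArgumentException("Only strings of up to 31 UTF-8 bytes fit in a fixstr.", nameof(text));

        var result = new byte[utf8.Length + 1];
        result[0] = (byte)(0xA0 | utf8.Length);
        utf8.CopyTo(result, 1);
        return result;
    }
}
```

With this sketch, MsgPackSketch.EncodeSmallInt(42) is the single byte 0x2A, and MsgPackSketch.EncodeShortString("Hello") is six bytes: the header 0xA5 followed by the five UTF-8 bytes of the text. The same string in JSON needs seven bytes because of the two quotation marks.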

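Everything we picked is commonly used on .NET, and the last three in particular all claim to be very small and very fast. So let's see which one is actually the fastest and which produces the smallest output.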

Preparation

We prepared a DemoClass class with a few properties of different types plus an array of a sub-class, DemoSubClass. Ignore the attributes on top for now.

[MemoryPackable]
[MessagePackObject]
[ProtoContract]
public partial class DemoClass
{
    [Key(0)] [ProtoMember(1)] public int P1 { get; set; }
    [Key(1)] [ProtoMember(2)] public bool P2 { get; set; }
    [Key(2)] [ProtoMember(3)] public string P3 { get; set; } = null!;
    [Key(3)] [ProtoMember(4)] public double P4 { get; set; }
    [Key(4)] [ProtoMember(5)] public long P5 { get; set; }
    [Key(5)] [ProtoMember(6)] public DemoSubClass[] Subs { get; set; } = null!;
}

[MemoryPackable]
[MessagePackObject]
[ProtoContract]
public partial class DemoSubClass
{
    [Key(0)] [ProtoMember(1)] public int P1 { get; set; }
    [Key(1)] [ProtoMember(2)] public bool P2 { get; set; }
    [Key(2)] [ProtoMember(3)] public string P3 { get; set; } = null!;
    [Key(3)] [ProtoMember(4)] public double P4 { get; set; }
    [Key(4)] [ProtoMember(5)] public long P5 { get; set; }
}

System.Text.Json

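The reason for choosing it is simple: it is probably one of the fastest JSON serialization frameworks on .NET today. It is very easy to use and already built into the .NET BCL. Just reference the System.Text.Json namespace and call its static methods to serialize and deserialize.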

using System.Text.Json;

var obj = ....;

// Serialize
var json = JsonSerializer.Serialize(obj);

// Deserialize
var newObj = JsonSerializer.Deserialize<T>(json);
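
As a self-contained illustration of the calls above, the following sketch round-trips a small object and measures its UTF-8 size. The Point type and the JsonRoundTrip helper are hypothetical names introduced only for this example; the JsonSerializer methods are the ones built into the BCL.

```csharp
using System.Text.Json;

// Hypothetical DTO used only for this illustration.
public sealed class Point
{
    public int X { get; set; }
    public int Y { get; set; }
}

public static class JsonRoundTrip
{
    // Serialize to a JSON string using the default options.
    public static string ToJson(Point p) => JsonSerializer.Serialize(p);

    // Deserialize returns null for the JSON literal "null", so guard against it.
    public static Point FromJson(string json) =>
        JsonSerializer.Deserialize<Point>(json)
        ?? throw new JsonException("JSON payload was null.");

    // UTF-8 size of the payload, which is what actually goes over the wire or into a cache.
    public static int Utf8Size(Point p) => JsonSerializer.SerializeToUtf8Bytes(p).Length;
}
```

For new Point { X = 1, Y = 2 }, ToJson returns {"X":1,"Y":2}, which is 13 bytes in UTF-8. Most of those bytes are property names and punctuation rather than data, which is the overhead the binary formats below try to avoid.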

Google Protobuf

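This is the most commonly used Protobuf serialization framework on .NET. It is really a toolkit: from the toolkit plus *.proto files you can generate gRPC services or the serialization code for the corresponding entities. It is somewhat cumbersome to use, though.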

Using it requires two NuGet packages, as shown below:

<!-- Google.Protobuf: serialization and deserialization helper classes -->
<PackageReference Include="Google.Protobuf" Version="3.21.9" />
<!-- Grpc.Tools: generates the protobuf serialization/deserialization classes and gRPC services -->
<PackageReference Include="Grpc.Tools" Version="2.50.0">
  <PrivateAssets>all</PrivateAssets>
  <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
</PackageReference>

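Because it cannot work with plain C# objects directly, we also need to create a *.proto file whose layout matches the C# classes above, plus a DemoClassArrayProto message to make the later tests easier: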

syntax = "proto3";
option csharp_namespace = "DemoClassProto";
package DemoClassProto;

message DemoClassArrayProto
{
    repeated DemoClassProto DemoClass = 1;
}

message DemoClassProto
{
    int32 P1 = 1;
    bool P2 = 2;
    string P3 = 3;
    double P4 = 4;
    int64 P5 = 5;
    repeated DemoSubClassProto Subs = 6;
}

message DemoSubClassProto
{
    int32 P1 = 1;
    bool P2 = 2;
    string P3 = 3;
    double P4 = 4;
    int64 P5 = 5;
}
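
The field numbers in the .proto file above (P1 = 1, P2 = 2, and so on) end up on the wire as small numeric tags instead of property names. As a rough illustration of why the output is compact, the following hand-rolled sketch encodes a single varint field the way the Protocol Buffers encoding documentation describes it. VarintSketch and its methods are hypothetical names for this example only; in practice the generated classes and Google.Protobuf do this work.

```csharp
using System.Collections.Generic;

// Illustration only: hand-encodes one Protobuf varint field (wire type 0).
public static class VarintSketch
{
    // Base-128 varint: 7 payload bits per byte, high bit set while more bytes follow.
    public static byte[] EncodeVarint(ulong value)
    {
        var bytes = new List<byte>();
        while (value >= 0x80)
        {
            bytes.Add((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }
        bytes.Add((byte)value);
        return bytes.ToArray();
    }

    // A varint field is the tag (fieldNumber << 3 | wire type 0) followed by the value.
    public static byte[] EncodeVarintField(int fieldNumber, ulong value)
    {
        var bytes = new List<byte>(EncodeVarint((ulong)fieldNumber << 3));
        bytes.AddRange(EncodeVarint(value));
        return bytes.ToArray();
    }
}
```

VarintSketch.EncodeVarintField(1, 150) produces the three bytes 08 96 01, so a field like P1 with the value 150 costs three bytes on the wire, compared with the eight characters of "P1":150 in JSON.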

After that, add the following to the project file so that Grpc.Tools generates the corresponding C# classes at build time:

<ItemGroup>
  <Protobuf Include="*.proto" GrpcServices="Server" />
</ItemGroup>

Building the project then generates the C# classes in the obj directory.

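Finally, we can serialize and deserialize with the methods below. The generic type T must be an entity generated from the *.proto file that implements IMessage<T> (it really is rather cumbersome to use):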

using System.Runtime.CompilerServices;
using Google.Protobuf;

// Serialize
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static byte[] GoogleProtobufSerialize<T>(T origin) where T : IMessage<T>
{
    return origin.ToByteArray();
}

// Deserialize
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public DemoClassArrayProto GoogleProtobufDeserialize(byte[] bytes)
{
    return DemoClassArrayProto.Parser.ParseFrom(bytes);
}

protobuf-net

So is there a simpler way to use protobuf on .NET? Of course there is. We only need the following NuGet package:

<PackageReference Include="protobuf-net" Version="3.1.22" />

Then mark the C# class to be serialized with the ProtoContract attribute and each property to be serialized with the ProtoMember attribute, as shown below:

[ProtoContract]
public class DemoClass
{
    [ProtoMember(1)] public int P1 { get; set; }
    [ProtoMember(2)] public bool P2 { get; set; }
    [ProtoMember(3)] public string P3 { get; set; } = null!;
    [ProtoMember(4)] public double P4 { get; set; }
    [ProtoMember(5)] public long P5 { get; set; }
}

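Then we can serialize and deserialize directly with the static class the framework provides. Unfortunately it offers no method that returns a byte[] directly, so we have to go through a MemoryStream: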

using System.IO;
using System.Runtime.CompilerServices;
using ProtoBuf;

// Serialize
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static void ProtoBufDotNet<T>(T origin, Stream stream)
{
    Serializer.Serialize(stream, origin);
}

// Deserialize
public T ProtobufDotNet<T>(byte[] bytes)
{
    using var stream = new MemoryStream(bytes);
    return Serializer.Deserialize<T>(stream);
}
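
Since the serializer writes to a Stream, getting a byte[] out of it takes a small wrapper. The following is one possible sketch, assuming the protobuf-net package referenced above; ProtobufNetBytes and the Ping type are hypothetical names introduced for this example, not part of the article's benchmark code.

```csharp
using System.IO;
using ProtoBuf;

// Hypothetical contract type used only to exercise the helpers below.
[ProtoContract]
public class Ping
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Text { get; set; } = "";
}

public static class ProtobufNetBytes
{
    // Serialize into an in-memory stream, then copy the written bytes out.
    public static byte[] Serialize<T>(T origin)
    {
        using var stream = new MemoryStream();
        Serializer.Serialize(stream, origin);
        return stream.ToArray();
    }

    // Wrap the byte[] in a read-only stream for the deserializer.
    public static T Deserialize<T>(byte[] bytes)
    {
        using var stream = new MemoryStream(bytes);
        return Serializer.Deserialize<T>(stream);
    }
}
```

Note that MemoryStream.ToArray copies the buffer, so this convenience costs one extra allocation per call. That matters when comparing against libraries that return a byte[] directly.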

MessagePack

Here we use MessagePack-CSharp, implemented by Yoshifumi Kawai. Again, it only takes one NuGet package:

<PackageReference Include="MessagePack" Version="2.4.35" />

Then mark the class with the MessagePackObject attribute and each property to be serialized with the Key attribute:

[MessagePackObject]
public partial class DemoClass
{
    [Key(0)] public int P1 { get; set; }
    [Key(1)] public bool P2 { get; set; }
    [Key(2)] public string P3 { get; set; } = null!;
    [Key(3)] public double P4 { get; set; }
    [Key(4)] public long P5 { get; set; }
}

It is also very easy to use. Just call the static class that MessagePack provides:

using System.Runtime.CompilerServices;
using MessagePack;

// Serialize
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static byte[] MessagePack<T>(T origin)
{
    return global::MessagePack.MessagePackSerializer.Serialize(origin);
}

// Deserialize
public T MessagePack<T>(byte[] bytes)
{
    return global::MessagePack.MessagePackSerializer.Deserialize<T>(bytes);
}

It also ships with LZ4 compression, which is enabled simply by configuring the options. There are two compression modes, Lz4Block and Lz4BlockArray. Let's try it:

public static readonly MessagePackSerializerOptions MpLz4BOptions =
    MessagePackSerializerOptions.Standard.WithCompression(MessagePackCompression.Lz4Block);

// Serialize
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static byte[] MessagePackLz4Block<T>(T origin)
{
    return global::MessagePack.MessagePackSerializer.Serialize(origin, MpLz4BOptions);
}

// Deserialize
public T MessagePackLz4Block<T>(byte[] bytes)
{
    return global::MessagePack.MessagePackSerializer.Deserialize<T>(bytes, MpLz4BOptions);
}
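
The code above only shows Lz4Block. For completeness, the Lz4BlockArray mode can be wired up the same way; the sketch below assumes the MessagePack package referenced earlier, and MessagePackLz4Sketch and MpLz4BAOptions are hypothetical names chosen for this example.

```csharp
using MessagePack;

public static class MessagePackLz4Sketch
{
    // Same pattern as Lz4Block, but selecting the Lz4BlockArray compression mode.
    public static readonly MessagePackSerializerOptions MpLz4BAOptions =
        MessagePackSerializerOptions.Standard.WithCompression(MessagePackCompression.Lz4BlockArray);

    // Serialize with LZ4 block-array compression.
    public static byte[] Serialize<T>(T origin) =>
        MessagePackSerializer.Serialize(origin, MpLz4BAOptions);

    // The same options must be passed when deserializing.
    public static T Deserialize<T>(byte[] bytes) =>
        MessagePackSerializer.Deserialize<T>(bytes, MpLz4BAOptions);
}
```

As far as I understand the library's README, Lz4Block compresses the whole payload as a single LZ4 block, while Lz4BlockArray splits it into several smaller blocks, trading a slightly worse compression ratio for smaller buffers. Which one is the better fit depends on payload size, so it is worth measuring both.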

MemoryPack

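This is MemoryPack, also implemented by Yoshifumi Kawai, and again it only takes one NuGet package. Note, however, that it currently requires Visual Studio 2022 17.3 or later and the .NET 7 SDK, because MemoryPack's code generation depends on them: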

<PackageReference Include="MemoryPack" Version="1.4.4" />

It is probably the easiest to use of these binary serialization protocols. Just add the partial keyword to the class and mark it with the MemoryPackable attribute:

[MemoryPackable]
public partial class DemoClass
{
    public int P1 { get; set; }
    public bool P2 { get; set; }
    public string P3 { get; set; } = null!;
    public double P4 { get; set; }
    public long P5 { get; set; }
}

Serialization and deserialization are again static method calls:

// Serialize
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static byte[] MemoryPack<T>(T origin)
{
    return global::MemoryPack.MemoryPackSerializer.Serialize(origin);
}

// Deserialize
public T MemoryPack<T>(byte[] bytes)
{
    return global::MemoryPack.MemoryPackSerializer.Deserialize<T>(bytes)!;
}

It natively supports the Brotli compression algorithm, used as follows:

using MemoryPack;
using MemoryPack.Compression;

// Serialize
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static byte[] MemoryPackBrotli<T>(T origin)
{
    using var compressor = new BrotliCompressor();
    global::MemoryPack.MemoryPackSerializer.Serialize(compressor, origin);
    return compressor.ToArray();
}

// Deserialize
public T MemoryPackBrotli<T>(byte[] bytes)
{
    using var decompressor = new BrotliDecompressor();
    var decompressedBuffer = decompressor.Decompress(bytes);
    return MemoryPackSerializer.Deserialize<T>(decompressedBuffer)!;
}

Running the benchmark

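I used BenchmarkDotNet to build a test that serializes and deserializes 100,000 objects (the source is available at the GitHub link at the end). It compares serialization and deserialization performance as well as the size of the serialized output.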

public static class TestData
{
    //
    public static readonly DemoClass[] Origin = Enumerable.Range(0, 10000).Select(i =>
    {
        return new DemoClass
        {
            P1 = i,
            P2 = i % 2 == 0,
            P3 = $"Hello World {i}",
            P4 = i,
            P5 = i,
            Subs = new DemoSubClass[]
            {
                new() {P1 = i, P2 = i % 2 == 0, P3 = $"Hello World {i}", P4 = i, P5 = i,},
                new() {P1 = i, P2 = i % 2 == 0, P3 = $"Hello World {i}", P4 = i, P5 = i,},
                new() {P1 = i, P2 = i % 2 == 0, P3 = $"Hello World {i}", P4 = i, P5 = i,},
                new() {P1 = i, P2 = i % 2 == 0, P3 = $"Hello World {i}", P4 = i, P5 = i,},
            }
        };
    }).ToArray();

    public static readonly DemoClassProto.DemoClassArrayProto OriginProto;

    static TestData()
    {
        OriginProto = new DemoClassArrayProto();
        for (int i = 0; i < Origin.Length; i++)
        {
            OriginProto.DemoClass.Add(DemoClassProto.DemoClassProto.Parser.ParseJson(JsonSerializer.Serialize(Origin[i])));
        }
    }
}
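
The article's actual benchmark code lives in the linked repository. For readers who have not used BenchmarkDotNet before, the following is only a rough sketch of what one benchmark class of this kind might look like, limited to System.Text.Json so that it needs no package beyond BenchmarkDotNet itself. The Item type and JsonBenchmarkSketch are hypothetical names and are not taken from the article's source.

```csharp
using System.Linq;
using System.Text.Json;
using BenchmarkDotNet.Attributes;

// Hypothetical payload type for this sketch; the article's benchmark uses DemoClass.
public sealed class Item
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

[MemoryDiagnoser] // also report allocated bytes per operation
public class JsonBenchmarkSketch
{
    private Item[] _items = null!;
    private byte[] _payload = null!;

    // Runs once before the benchmarks, so data generation is not measured.
    [GlobalSetup]
    public void Setup()
    {
        _items = Enumerable.Range(0, 10_000)
            .Select(i => new Item { Id = i, Name = $"Hello World {i}" })
            .ToArray();
        _payload = JsonSerializer.SerializeToUtf8Bytes(_items);
    }

    // Returning the result keeps the JIT from optimizing the call away.
    [Benchmark]
    public byte[] Serialize() => JsonSerializer.SerializeToUtf8Bytes(_items);

    [Benchmark]
    public Item[]? Deserialize() => JsonSerializer.Deserialize<Item[]>(_payload);
}
```

A class like this is typically started from the program entry point with BenchmarkRunner.Run<JsonBenchmarkSketch>() from the BenchmarkDotNet.Running namespace, in a Release build. Adding one pair of methods per serializer to the same class lets BenchmarkDotNet report them side by side.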