C# protobuf-net: Dictionary of decimals: zeroes don't round-trip properly

Question

I've found a strange bug in protobuf-net around serializing and deserializing decimal zeroes. Has anyone found a good workaround for it, or is this actually a feature?

Given a Dictionary<string, decimal>, if I run the following in LINQPad:

void Main()
{
    {
        Dictionary<string, decimal> dict = new Dictionary<string, decimal>();
        dict.Add("one", 0.0000000m);
        DumpStreamed(dict);
    }

    {
        Dictionary<string, decimal> dict = new Dictionary<string, decimal>();
        dict.Add("one", 0m);
        DumpStreamed(dict);
    }
}

public static void DumpStreamed<T>(T val)
{
    using (var stream = new MemoryStream())
    {
        Console.Write("Stream1: ");
        ProtoBuf.Serializer.Serialize(stream, val);
        foreach (var by in stream.ToArray())
        {
            Console.Write(by);
        }

        Console.WriteLine();
        Console.Write("Stream2: ");
        stream.Position = 0;
        var item = ProtoBuf.Serializer.Deserialize<T>(stream);
        using(var stream2 = new MemoryStream())
        {
            ProtoBuf.Serializer.Serialize(stream2, item);
            foreach (var by in stream2.ToArray())
            {
                Console.Write(by);
            }

        }
    }

    Console.WriteLine();
    Console.WriteLine("----");
}

I get two different streams:

First serialization: 1091031111101011822414

Second serialization: 1071031111110101180

(0.0000000m is converted to 0 on deserialization.)

I traced this to the following line in ReadDecimal:

 if (low == 0 && high == 0) return decimal.Zero;

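Does anyone know why zero is normalized only during deserialization and not during serialization?

Or is there a workaround that either always normalizes or never normalizes decimal zeroes in a dictionary during serialization/deserialization?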

Answer 1

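Floating-point data types are really structures with several parts, including a base value and an exponent that scales it. The C# documentation for decimal says: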

The binary representation of a Decimal number consists of a 1-bit sign, a 96-bit integer number, and a scaling factor used to divide the integer number and specify what portion of it is a decimal fraction. The scaling factor is implicitly the number 10, raised to an exponent ranging from 0 to 28

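For example, you can represent 1234000 as

a base value of 1234000 x 10^0
a base value of 123400 x 10^1
a base value of 12340 x 10^2

and so on. (For decimal itself the scaling factor divides rather than multiplies, so the equivalent forms are 1234000 / 10^0, 12340000 / 10^1, 123400000 / 10^2.)

So this problem is not limited to zero. Most decimal values can be represented in more than one way. If you rely on the byte stream to check for equivalence, you will run into a lot of trouble. You really shouldn't do it, because equal values will be reported as different, and not only for zero.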
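To make the point about equal values with different encodings concrete, here is a minimal sketch. The class and method names (DecimalRepresentationDemo, SameValue, SameBits, ScaleOf) are made up for illustration; the only framework call it relies on is decimal.GetBits.

```csharp
using System;
using System.Linq;

public static class DecimalRepresentationDemo
{
    // Value equality: ignores how many trailing zeroes are stored.
    public static bool SameValue(decimal a, decimal b) => a == b;

    // Representation equality: compares the four 32-bit words returned by GetBits
    // (low, mid, high parts of the 96-bit integer, then the sign/scale flags).
    public static bool SameBits(decimal a, decimal b) =>
        decimal.GetBits(a).SequenceEqual(decimal.GetBits(b));

    // The scale (power of ten used as divisor) lives in bits 16-23 of the flags word.
    public static int ScaleOf(decimal value) => (decimal.GetBits(value)[3] >> 16) & 0xFF;
}
```

With these helpers, SameValue(0m, 0.0000000m) is true while SameBits(0m, 0.0000000m) is false, and the same holds for 1.5m versus 1.50m. A byte-level comparison of serialized output inherits exactly this difference.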

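As for normalizing at serialization time, I think that is an issue specific to protobuf-net. You could certainly write your own serializer that takes steps to normalize the data, though that may be tricky to get right. Another option is to convert the decimals to some custom class before storing them, or, odd as it may sound, to store them as their string representation.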
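The two options can be sketched roughly as follows. This is only an illustration, not something protobuf-net provides: DecimalWorkarounds and its methods are hypothetical helpers you would call yourself before serializing and after deserializing. The first collapses every zero to decimal.Zero ("always normalize"); the second keeps the exact scale by going through invariant-culture strings ("never normalize").

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class DecimalWorkarounds
{
    // Always normalize: any zero, whatever its scale, becomes decimal.Zero.
    public static Dictionary<string, decimal> NormalizeZeroes(Dictionary<string, decimal> source) =>
        source.ToDictionary(kv => kv.Key, kv => kv.Value == 0m ? decimal.Zero : kv.Value);

    // Never normalize: the string form keeps trailing zeroes ("0.0000000").
    public static Dictionary<string, string> ToStrings(Dictionary<string, decimal> source) =>
        source.ToDictionary(kv => kv.Key, kv => kv.Value.ToString(CultureInfo.InvariantCulture));

    // decimal.Parse restores the scale implied by the digits after the decimal point.
    public static Dictionary<string, decimal> FromStrings(Dictionary<string, string> source) =>
        source.ToDictionary(kv => kv.Key, kv => decimal.Parse(kv.Value, CultureInfo.InvariantCulture));
}
```

The string route costs more bytes on the wire, so it is mainly worth it when the exact scale matters to you.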

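If you want to play with some decimals and inspect the raw data, see the GetBits() method. Or you can use this extension method to look at the in-memory representation and see for yourself: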

public static unsafe string ToBinaryHex(this decimal This)
{
    byte* pb = (byte*)&This;
    var bytes = Enumerable.Range(0, 16).Select(i => (*(pb + i)).ToString("X2"));
    return string.Join("-", bytes);
}
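If compiling with unsafe code is not an option, a similar view can be had from GetBits alone. The sketch below is an assumption-laden alternative rather than an equivalent: ToBitsHex is a made-up name, and it prints the four GetBits words in the order low, mid, high, flags, which is not necessarily the byte order the pointer-based version shows.

```csharp
using System;
using System.Linq;

public static class DecimalBitsExtensions
{
    // Formats the four 32-bit words from decimal.GetBits as hex, in GetBits order:
    // low, mid, high (the 96-bit integer) and finally the sign/scale flags word.
    public static string ToBitsHex(this decimal value) =>
        string.Join("-", decimal.GetBits(value).Select(word => word.ToString("X8")));
}
```

For example, 0m.ToBitsHex() gives 00000000-00000000-00000000-00000000, while 0.0000000m.ToBitsHex() gives 00000000-00000000-00000000-00070000: the integer part is identical and only the scale in the flags word differs.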

Answer 2

Yes; the problem is this well-intentioned but potentially harmful line:

    if (low == 0 && high == 0) return decimal.Zero;

which neglects to check signScale. It should really be:

    if (low == 0 && high == 0 && signScale == 0) return decimal.Zero;

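I'll tweak this for the next release.

(Edit: I ended up removing that check entirely. The rest of the code is just some integer shifts etc., so having the "branch" is probably more expensive than not having it.)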
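To see why the check matters, here is a rough sketch of rebuilding a decimal from the three values named above. It is not protobuf-net's actual ReadDecimal: FromParts is a hypothetical helper, and the layout it assumes (low holds the lower 64 bits, high the upper 32 bits, signScale has the sign in bit 0 and the scale in bits 1-8) is an assumption made for illustration.

```csharp
using System;

public static class DecimalPartsSketch
{
    // Hypothetical reconstruction with no early return for zero, so the scale
    // carried in signScale survives even when low and high are both 0.
    public static decimal FromParts(ulong low, uint high, uint signScale)
    {
        unchecked
        {
            int lo = (int)(low & 0xFFFFFFFFUL);          // bits 0-31 of the integer
            int mid = (int)((low >> 32) & 0xFFFFFFFFUL); // bits 32-63
            int hi = (int)high;                          // bits 64-95
            bool isNegative = (signScale & 0x0001) == 0x0001;
            byte scale = (byte)((signScale & 0x01FE) >> 1);
            return new decimal(lo, mid, hi, isNegative, scale);
        }
    }
}
```

Under that assumed layout, FromParts(0, 0, 14) yields 0.0000000m (scale 7), whereas an early "return decimal.Zero" for low == 0 && high == 0 would discard the scale and produce plain 0.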
