Why does Dictionary need to implement IDeserializationCallback?

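I have a rough understanding of how ISerializable works: the formatter constructs a SerializationInfo and passes it to the object (say, a Dictionary), and when the formatter deserializes, it retrieves the SerializationInfo, calls the Dictionary's special protected constructor, and passes that object in. Below is some code for Dictionary (simplified by me):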

[Serializable]
public class Dictionary<TKey,TValue>: ...ISerializable, IDeserializationCallback  {
   
   private SerializationInfo m_siInfo; // Only used for deserialization
   
   // Special constructor to control deserialization
   protected Dictionary(SerializationInfo info, StreamingContext context) {
      // During deserialization, save the SerializationInfo for OnDeserialization
      m_siInfo = info;
   }

   // implements ISerializable for serialization purpose
   public virtual void GetObjectData(SerializationInfo info, StreamingContext context) {
      info.AddValue("Version", m_version);
      info.AddValue("Comparer", m_comparer, typeof(IEqualityComparer<TKey>));
      info.AddValue("HashSize", (m_buckets == null) ? 0 : m_buckets.Length);
      if (m_buckets != null) {
         KeyValuePair<TKey, TValue>[] array = new KeyValuePair<TKey, TValue>[Count];
         CopyTo(array, 0);
         info.AddValue("KeyValuePairs", array, typeof(KeyValuePair<TKey, TValue>[]));
      }
   }

   // Implements IDeserializationCallback: called after deserialization to restore
   // the dictionary's internal state to what it was before serialization
   public virtual void OnDeserialization(Object sender) {    
      if (m_siInfo == null) return; // Never set, return
      Int32 num = m_siInfo.GetInt32("Version");
      Int32 num2 = m_siInfo.GetInt32("HashSize");
      m_comparer = (IEqualityComparer<TKey>)
         m_siInfo.GetValue("Comparer", typeof(IEqualityComparer<TKey>));
      ...// reconstruct the Dictionary's internal states such as buckets, entries etc

   }
   ...
}

My question is: why does Dictionary need to implement IDeserializationCallback instead of just doing everything in the special constructor?

As canton7 said:

The dictionary's contents are serialized when the dictionary is serialized -- every key and value stored in the dictionary is serialized. The dictionary cannot calculate the hash codes for any of its contents until they've finished being deserialized. Therefore the calculation of the hash codes is deferred until OnDeserialization, by which point each of the keys and values in the dictionary have finished being deserialized, and it is safe to call methods on them

But here is a quote from the book CLR via C#:

When a formatter serializes an object graph, it looks at each object. If its type implements the ISerializable interface, then the formatter ignores all custom attributes and instead constructs a new System.Runtime.Serialization.SerializationInfo object. This object contains the actual set of values that should be serialized for the object.

So we can see that it is the SerializationInfo object that gets serialized; the Dictionary object itself is not serialized.

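When the formatter extracts a SerializationInfo object from the stream, it creates a new Dictionary object (by calling the FormatterServices.GetUninitializedObject method). Initially, all of this Dictionary object's fields are set to 0 or null.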
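To make the "all fields are 0 or null" point concrete, here is a minimal sketch. It assumes a modern .NET runtime where RuntimeHelpers.GetUninitializedObject is available; as far as I know it allocates the object the same way the FormatterServices method named above does. Sample and UninitializedDemo are hypothetical names used only for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type used only for this illustration.
public class Sample
{
    public int Count = 42;
    public string Label = "set by initializer";

    public Sample()
    {
        Console.WriteLine("constructor ran");
    }
}

public static class UninitializedDemo
{
    // Allocates a Sample without running its constructor or field initializers.
    public static Sample CreateBlank() =>
        (Sample)RuntimeHelpers.GetUninitializedObject(typeof(Sample));

    public static void Main()
    {
        Sample blank = CreateBlank();
        Console.WriteLine(blank.Count);          // 0
        Console.WriteLine(blank.Label == null);  // True
    }
}
```

Note that "constructor ran" is never printed: the object exists, but nothing has initialized it yet, which is the state a Dictionary is in when its special constructor is about to be called.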

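And since we get the SerializationInfo object back and it is passed to the special constructor, you have all the original keys/values from before serialization, so you could rebuild the Dictionary's internal state right there in the special constructor, like this: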

protected Dictionary(SerializationInfo info, StreamingContext context) {    
   if (info == null) return; // Never set, return
   Int32 num = info.GetInt32("Version");
   Int32 num2 = info.GetInt32("HashSize");
   m_comparer = (IEqualityComparer<TKey>)
      info.GetValue("Comparer", typeof(IEqualityComparer<TKey>));
   ...// reconstruct the Dictionary's internal states such as buckets, entries etc
}

Is my understanding correct?

I'm not sure where you got your code from, but if you look at the actual source:

protected Dictionary(SerializationInfo info, StreamingContext context)
{
    // We can't do anything with the keys and values until the entire graph has been deserialized
    // and we have a resonable estimate that GetHashCode is not going to fail.  For the time being,
    // we'll just cache this.  The graph is not valid until OnDeserialization has been called.
    HashHelpers.SerializationInfoTable.Add(this, info);
}

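From the comment, the problem is clear: the serialized data will contain the dictionary's entries, yes, but it also contains the serialized versions of those entries' keys. The dictionary needs to recalculate the hash code of each entry (answered in another of your many questions), and it cannot do that until each entry has been properly deserialized. Since the order of deserialization is not guaranteed (also answered in another of your questions), the dictionary needs to wait until the whole graph has been rebuilt before calculating any of those hash codes.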


Take a simple example:

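using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

public class Program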
{
    public static void Main()
    {
        var container = new Container() { Name = "Test" };
        container.Dict.Add(container, "Container");
        
        var formatter = new BinaryFormatter();
        var stream = new MemoryStream();
        formatter.Serialize(stream, container);
        
        stream.Position = 0;
        formatter.Deserialize(stream);
    }
}

[Serializable]
public class Container : ISerializable
{
    public string Name { get; set; }
    public MyDictionary Dict { get; }
    
    public Container()
    {
        Dict = new MyDictionary();
    }
    
    protected Container(SerializationInfo info, StreamingContext context)
    {
        Console.WriteLine("Container deserialized");

        Name = info.GetString("Name");
        Dict = (MyDictionary)info.GetValue("Dict", typeof(MyDictionary));
    }
    
    public virtual void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("Name", Name);
        info.AddValue("Dict", Dict);
    }
    
    public override bool Equals(object other) => (other as Container)?.Name == Name;
    public override int GetHashCode() => Name.GetHashCode();
}

[Serializable]
public class MyDictionary : Dictionary<object, object>
{
    public MyDictionary() { }

    protected MyDictionary(SerializationInfo info, StreamingContext context)
        : base(info, context)
    {
        Console.WriteLine("MyDictionary deserialized");
        
        // Look at the data which Dictionary wrote...
        var kvps = (KeyValuePair<object, object>[])info.GetValue("KeyValuePairs", typeof(KeyValuePair<object, object>[]));
        Console.WriteLine("Name is: " + ((Container)kvps[0].Key).Name);
    }
}

Run it here: DotNetFiddle.

(I subclassed Dictionary so that we can tell when its deserialization constructor is called.)

This prints:

MyDictionary deserialized
Name is:
Container deserialized

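Note that MyDictionary is deserialized before Container, even though MyDictionary contains the Container as one of its keys! You can also see that Container.Name is null when MyDictionary is deserialized; it only gets assigned later, at some point before OnDeserialization is called.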

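Because Container.Name is null at the time MyDictionary is constructed, and Container.GetHashCode depends on Container.Name, things would go wrong if the Dictionary tried to call GetHashCode on any of its items' keys at that point: here, Name.GetHashCode() would throw a NullReferenceException.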
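A related failure mode, as a minimal sketch: even if the early GetHashCode call did not throw, hashing a key before its state is complete would file the entry under the wrong hash code. The code below involves no serialization; it just simulates a key whose Name is filled in after it was added. NamedKey and StaleHashDemo are hypothetical names, and the length-based hash is chosen only to keep the result deterministic.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: its hash code depends on a mutable Name,
// similar in spirit to Container in the example above.
public class NamedKey
{
    public string Name { get; set; } = "";

    public override bool Equals(object other) => (other as NamedKey)?.Name == Name;

    // Deliberately simple and deterministic: the hash is the name's length.
    public override int GetHashCode() => Name.Length;
}

public static class StaleHashDemo
{
    // Adds the key while Name is still empty, fills Name in afterwards,
    // and reports whether the dictionary can still find the key.
    public static bool FindAfterLateAssignment()
    {
        var key = new NamedKey();                  // Name == "", hash code 0
        var dict = new Dictionary<object, object>();
        dict.Add(key, "value");                    // entry is stored with hash code 0

        key.Name = "Test";                         // hash code is now 4
        return dict.ContainsKey(key);              // searches for hash 4, finds nothing
    }

    public static void Main()
    {
        Console.WriteLine(FindAfterLateAssignment()); // False
    }
}
```

The same key instance can no longer be found, which is why it makes sense for Dictionary to hold on to the SerializationInfo and only hash its keys in OnDeserialization.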