XML serialization of big data (20 GB): OutOfMemoryException

I have a problem: I want to serialize a very large object to XML (about 20 GB), but I get an OutOfMemoryException.

Do you have any suggestions?

My code is as follows:

public static string Serialize(object obj)
{
    string retval = string.Empty;

    if (null!= obj)
    {
        StringBuilder sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, new XmlWriterSettings() { OmitXmlDeclaration = true }))
        {                    
            XmlSerializer serializer = new XmlSerializer(obj.GetType());

            // Omit the namespace declarations to keep the output simple.
            XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
            ns.Add("", "");

            // Pass ns explicitly; otherwise the empty namespace set is never applied.
            serializer.Serialize(writer, obj, ns);
        }

        retval = sb.ToString();
    }
    return retval;
}

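20 GB is never going to work as a string (via StringBuilder); even with <gcAllowVeryLargeObjects> enabled, the maximum theoretical length of a string is only a small fraction of that.

If you want to handle data this large, you need something like a file as the backing store (or, basically, any Stream that is not a MemoryStream).

I would also say that XML is a poor choice for large data. If you are not tied to XML, I strongly suggest looking at alternative tools (I am happy to advise if that is an option).

But for now: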

string path = "my.xml";
XmlWriterSettings settings = ...
using (XmlWriter writer = XmlWriter.Create(path, settings))
{
    // ...
}
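
To make the file-backed approach concrete, here is a minimal sketch of the question's method rewritten to write straight to a file instead of a StringBuilder. The class name FileXmlSerializer and the method name SerializeToFile are made up for illustration; the settings mirror the ones in the question. Note that this only removes the 20 GB string: the object graph itself still has to fit in memory.

```csharp
using System.Xml;
using System.Xml.Serialization;

public static class FileXmlSerializer
{
    // Hypothetical helper: serializes obj directly to the file at path,
    // so the XML text is never held in memory as one string.
    public static void SerializeToFile(object obj, string path)
    {
        if (obj == null)
        {
            return;
        }

        XmlWriterSettings settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        // Same empty-namespace trick as in the question.
        XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
        ns.Add("", "");

        using (XmlWriter writer = XmlWriter.Create(path, settings))
        {
            XmlSerializer serializer = new XmlSerializer(obj.GetType());
            serializer.Serialize(writer, obj, ns);
        }
    }
}
```

The caller then passes a path rather than expecting a string back, for example FileXmlSerializer.SerializeToFile(data, "my.xml").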

Or, if you are actually talking to a socket or similar:

Stream stream = ...
XmlWriterSettings settings = ...
using (XmlWriter writer = XmlWriter.Create(stream, settings))
{
    // ...
}

You probably have a List that you can process piece by piece:

public static void Serialize(List<MyClass> myClasses)
{
    if (myClasses == null)
    {
        return;
    }

    // Omit the namespace declarations to keep the output simple.
    XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
    ns.Add("", "");

    using (StreamWriter sWriter = new StreamWriter("filename", false))
    {
        foreach (MyClass myClass in myClasses)
        {
            // Serialize one item at a time so that only one item's XML is held in memory.
            StringBuilder sb = new StringBuilder();
            using (XmlWriter writer1 = XmlWriter.Create(sb, new XmlWriterSettings() { OmitXmlDeclaration = true }))
            {
                XmlSerializer serializer = new XmlSerializer(myClass.GetType());
                serializer.Serialize(writer1, myClass, ns);
            }

            // Each item is written as its own top-level element, so the file is a
            // sequence of XML fragments rather than a single well-formed document.
            sWriter.Write(sb.ToString());
        }
    }
}
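
If the output needs to be a single well-formed document, one possible variation is to open one XmlWriter on the file, write a wrapping root element by hand, and serialize each item into it. This is only a sketch under a few assumptions: the class and method names are hypothetical, "Items" is an arbitrary root element name, and every item is a non-null instance of exactly type T (derived types would need extra serializer configuration).

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Serialization;

public static class StreamingListSerializer
{
    // Hypothetical helper: writes <Items> ... </Items> to the file at path,
    // serializing one item at a time so no large string is ever built.
    public static void SerializeItems<T>(IEnumerable<T> items, string path)
    {
        // Create the serializer once; constructing it per item is expensive.
        XmlSerializer serializer = new XmlSerializer(typeof(T));

        XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
        ns.Add("", "");

        using (XmlWriter writer = XmlWriter.Create(path))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("Items"); // assumed root element name

            foreach (T item in items)
            {
                // Each call appends one child element to the open root element.
                serializer.Serialize(writer, item, ns);
            }

            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

Because the method takes an IEnumerable<T>, the items can also come from a lazy source (for example an iterator that reads records from a database), in which case the full list never needs to exist in memory either.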