Reading XML with records that have different columns in the same records array

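I need to parse an XML response in C# and load it into SQL. In brief, I know how to use XmlSerializer to parse XML, so that is not what I'm asking about. My concern is the structure of the XML I receive from the web request. Below is a subset of the XML I receive:
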
<apiXML>
<recordList>
<record id="31" >
    <administration_name>admin1</administration_name>
    <creator>Leekha, Mohit</creator>
    <object_category>painting</object_category>
    <object_number>1243</object_number>
    <id>31</id>
    <reproduction.reference>2458.jpg</reproduction.reference>
    <title lang="nl-NL" invariant="false">The Title1</title>
    <title lang="nl-NL" invariant="false">The Title</title>
    <title lang="nl-NL" invariant="false">Different Title</title>
</record>
<record id="32" >
    <administration_name>admin1</administration_name>
    <creator>Leekha, Mohit</creator>
    <object_category>painting</object_category>
    <object_number>AL1111</object_number>
    <id>32</id>
    <reproduction.reference>AL1111.jpg</reproduction.reference>
    <title lang="nl-NL" invariant="false">Health</title>
</record>
<record id="34" >
    <administration_name>admin2</administration_name>
    <creator>Leekha,Mohit</creator>
    <creator>System</creator>
    <object_category>earthenware</object_category>
    <object_category>ABC</object_category>
    <object_category>Remote</object_category>
    <object_number>Z.567 &amp; X-124</object_number>
    <id>34</id>
    <reproduction.reference>Z.567 &amp; X-124(1).jpg</reproduction.reference>
    <reproduction.reference>Z.567 &amp; X-124(2).jpg</reproduction.reference>
    <reproduction.reference>Z.567 &amp; X-124(3).jpg</reproduction.reference>
</record>
</recordList>
</apiXML>

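My problems:

  1. Some records have multiple data members with the same name. For example, record id 31 has 3 titles.
  2. The number of columns differs from record to record.

So what I'm asking for is advice on how to handle this scenario. Any suggestions are welcome.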

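You need a few supporting classes to deserialize the XML as-is, since you didn't specify any other requirements.

Your database will have tables for the record elements and for each of the collections inside them.

Serialization classes

These classes hold the in-memory representation of your XML. The root is the Api class.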

using System;
using System.Collections.Generic;
using System.Xml.Serialization;

[XmlRoot("apiXML")]
public class Api
{
     [XmlArray("recordList")]
     [XmlArrayItem("record", typeof(Record))]
     public List<Record> RecordList {get;set;}
}

[Serializable]
public class Record
{
    [XmlAttribute("id")]
    public int RecordId {get;set;}

    [XmlElement("id")]
    public int Id {get;set;}

    [XmlElement("administration_name")]
    public string AdministrationName {get;set;}

    [XmlElement("object_number")]
    public string ObjectNumber {get;set;}

    [XmlElement("creator")]
    public List<Creator> Creators {get;set;}

    [XmlElement("object_category")]
    public List<ObjectCategory> ObjectCategories {get;set;}

    [XmlElement("reproduction.reference")]
    public List<ReproductionReference> ReproductionReferences {get;set;}

    [XmlElement("title")]
    public List<Title> Titles {get;set;}
}

[Serializable]
public class Title:Child
{
    [XmlAttribute("invariant")]
    public bool Invariant {get;set;}

    [XmlAttribute("lang")]
    public string Culture {get;set;}

    [XmlText]
    public string Text {get;set;}
}

public class Child
{
    [XmlIgnore]
    public int ParentId {get;set;}
}

[Serializable]
public class Creator:Child
{
    [XmlText]
    public string Text {get;set;}
}

[Serializable]
public class ObjectCategory:Child
{
    [XmlText]
    public string Text {get;set;}
}

[Serializable]
public class ReproductionReference:Child
{
    [XmlText]
    public string Text {get;set;}
}

Deserialization

With the classes annotated correctly, deserializing the XML takes only a couple of lines:

var ser = new XmlSerializer(typeof(Api));
Api api;
using (var sr = new StringReader(xml)) // xml holds the response text; needs System.IO
{
    api = (Api) ser.Deserialize(sr);
}
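
To see how records of different shapes come out of the deserializer, a minimal self-contained sketch may help. It does not use the full classes above: `MiniApi`, `MiniRecord`, `Demo` and the inline XML are hypothetical, trimmed-down stand-ins made up for illustration. The null check is a defensive assumption for records in which an element never occurs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical, reduced version of the Api class.
[XmlRoot("apiXML")]
public class MiniApi
{
    [XmlArray("recordList")]
    [XmlArrayItem("record")]
    public List<MiniRecord> Records { get; set; }
}

// Hypothetical, reduced version of the Record class.
public class MiniRecord
{
    [XmlAttribute("id")]
    public int Id { get; set; }

    // Repeated <title> elements are collected into one list.
    [XmlElement("title")]
    public List<string> Titles { get; set; }
}

public static class Demo
{
    public static MiniApi Parse(string xml)
    {
        var ser = new XmlSerializer(typeof(MiniApi));
        using (var sr = new StringReader(xml))
        {
            return (MiniApi)ser.Deserialize(sr);
        }
    }

    public static void Main()
    {
        // Record 31 has two titles, record 34 has none and carries an
        // element (<creator>) that MiniRecord does not map at all.
        const string xml =
            "<apiXML><recordList>" +
            "<record id=\"31\"><title>The Title1</title><title>The Title</title></record>" +
            "<record id=\"34\"><creator>System</creator></record>" +
            "</recordList></apiXML>";

        foreach (var rec in Parse(xml).Records)
        {
            // Treat a missing collection as zero items.
            int count = rec.Titles == null ? 0 : rec.Titles.Count;
            Console.WriteLine("record {0}: {1} title(s)", rec.Id, count);
        }
    }
}
```

Elements that have no mapped property, such as `<creator>` here, are skipped by XmlSerializer by default, so a record with fewer or other columns does not make deserialization fail.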

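Processing and building the tables

In the variable api we now have the in-memory object graph, which you can project onto a relational database schema. For a normalized model you will need the following tables:

  • Record (id, [fields in the class])
  • Creator (id, ..)
  • Title (id, ...)
  • ObjectCategory (id, ...)
  • ObjectNumber (id, ...)
  • ReproductionReference (id, ...)

Between these tables you will need link tables, which all follow the same convention as the one between Record and Creator:

  • RecordCreator(RecordId, CreatorId)

I assume you know how to create those tables and how to set up a connection to the database.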

// use SqlDataAdapter.Fill to populate the dataset below
// sqlAdapter.Fill(ds);
var ds = new DataSet();
// this is here so you can test without a database
// test mocking code
var recTable = ds.Tables.Add("Record");
recTable.Columns.Add("Id");
recTable.Columns.Add("AdministrationName");
recTable.Columns.Add("ObjectNumber");

var creTable = ds.Tables.Add("Creator");
creTable.Columns.Add("Id", typeof(int)).AutoIncrement = true;
creTable.Columns.Add("Text");

var reccreTable = ds.Tables.Add("RecordCreator");
reccreTable.Columns.Add("RecordId");
reccreTable.Columns.Add("CreatorId");
// end mocking code

// copy object graph and build link tables
foreach(var record in api.RecordList)
{
    // each main record is created
    var rtRow = recTable.NewRow();
    rtRow["Id"] = record.Id;
    rtRow["AdministrationName"] = record.AdministrationName;
    rtRow["ObjectNumber"] = record.ObjectNumber;
    recTable.Rows.Add(rtRow);
    // handle each collection
    foreach(var creator in record.Creators)
    {
        DataRow creRow; // will hold our Creator row
        // first try to find if the Text is already there
        // (single quotes in the text must be doubled inside a Select filter)
        var foundRows = creTable.Select(String.Format("Text='{0}'", creator.Text.Replace("'", "''")));
        if (foundRows.Length < 1) 
        {
            // if not, add it to the Creator table
            creRow =  creTable.NewRow(); // Id is autoincrement!
            creRow["Text"] = creator.Text;
            creTable.Rows.Add(creRow);
        }
        else 
        {
            // otherwise, we found an existing one
            creRow = foundRows[0];
        }
        // link record and creator
        var reccreRow = reccreTable.NewRow();
        reccreRow["RecordId"] = record.Id;
        reccreRow["CreatorId"] = creRow["Id"];
        reccreTable.Rows.Add(reccreRow);
   } 

   // the other collections follow a similar pattern and are left to the reader
} 
// now call Update to write the changes to the db.
// SqlDataAdapter.Update(ds); 
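
Since every collection repeats the same find-or-add and link steps, one way to avoid copying the loop body is to pull those steps into small helpers. The following is only a sketch: `LookupTables`, `GetOrAddByText` and `Link` are hypothetical names, and it assumes each lookup table has an auto-increment "Id" column and a "Text" column, like the Creator table in the mock-up.

```csharp
using System;
using System.Data;

public static class LookupTables
{
    // Returns the row whose "Text" column equals text, adding it when missing.
    // Assumes an auto-increment "Id" column and a "Text" column on the table.
    public static DataRow GetOrAddByText(DataTable table, string text)
    {
        if (text == null) throw new ArgumentNullException("text");

        // Single quotes must be doubled inside a DataTable.Select filter.
        var filter = string.Format("Text = '{0}'", text.Replace("'", "''"));
        var found = table.Select(filter);
        if (found.Length > 0)
        {
            return found[0];
        }

        var row = table.NewRow(); // Id is filled in by AutoIncrement
        row["Text"] = text;
        table.Rows.Add(row);
        return row;
    }

    // Adds one row to a two-column link table such as RecordCreator.
    public static void Link(DataTable linkTable,
                            string parentColumn, object parentId,
                            string childColumn, object childId)
    {
        var row = linkTable.NewRow();
        row[parentColumn] = parentId;
        row[childColumn] = childId;
        linkTable.Rows.Add(row);
    }

    public static void Main()
    {
        var creators = new DataTable("Creator");
        creators.Columns.Add("Id", typeof(int)).AutoIncrement = true;
        creators.Columns.Add("Text");

        var links = new DataTable("RecordCreator");
        links.Columns.Add("RecordId");
        links.Columns.Add("CreatorId");

        // The same creator on two records ends up as one Creator row
        // and two link rows.
        foreach (var recordId in new[] { 31, 32 })
        {
            var creRow = GetOrAddByText(creators, "O'Brien, Pat");
            Link(links, "RecordId", recordId, "CreatorId", creRow["Id"]);
        }

        Console.WriteLine("{0} creator row(s), {1} link row(s)",
                          creators.Rows.Count, links.Rows.Count);
    }
}
```

With helpers like these, the loop for ObjectCategories, Titles or ReproductionReferences would shrink to one `GetOrAddByText` call and one `Link` call per item.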

That is the code and structure you need to store the XML in an RDBMS without losing information.