Generic Method to Flatten a Collection of Nested Objects into a DataTable?

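I have a list of objects, which in turn contain nested lists of further objects. I would like to flatten the object graph into a DataTable.

I found code that takes a collection of objects and maps them to a DataTable (referenced below), but it assumes that the properties are simple types that can be reliably converted to a string value.

I'm thinking this is only possible through recursion, but maybe there are better ways to do it.

Data Model

Suppose we have a List<Customer>: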

public class Item
{
    public string SKU { get; set; }
    public string Description { get; set; }
    public double Price { get; set; }
}

public class Order
{
    public string ID { get; set; }
    public List<Item> Items { get; set; }
}

public class Customer
{
    public string Name { get; set; }
    public string Email { get; set; }
    public List<Order> Orders { get; set; }
}

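I would like to completely flatten the collection into a single DataTable with the following DataColumns: CustomerName, CustomerEmail, OrderID, OrderItemSKU, OrderItemDescription, and OrderItemPrice.

Sample Implementation

Following is the sample implementation I found, but it only works if the object contains nothing but simple properties (such as primitives or strings) and no other nested objects. I added a comment in the function where I think recursion could be applied, but I'm not entirely sure it will work.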

public static DataTable CreateDataTableFromAnyCollection<T>(IEnumerable<T> list)
{
    Type type = typeof(T);
    var properties = type.GetProperties();

    DataTable dataTable = new DataTable();
    foreach (PropertyInfo info in properties)
    {
        dataTable.Columns.Add(new DataColumn(info.Name, Nullable.GetUnderlyingType(info.PropertyType) ?? info.PropertyType));
    }

    foreach (T entity in list)
    {
        object[] values = new object[properties.Length];
        for (int i = 0; i < properties.Length; i++)
        {
            values[i] = properties[i].GetValue(entity,null); // if not primitive pass a recursive call
        }

        dataTable.Rows.Add(values);
    }

    return dataTable;
}

In this scenario, there would be one row for each Item in each Order in each Customer object's Orders list.

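If you already know which columns you want in the DataTable, you don't need to generalize that much. The CreateDataTableFromAnyCollection<T>() method you found is very generalized (and generic): it creates a DataTable from any one-dimensional structure of a generic type. In fact, it's so generalized that it requires the use of Reflection!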

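Recursive calls are meant to break down a complex function, which is at odds with the point of a generic method. A method like this would therefore either have to be composed of non-generic sub-functions that are then called recursively, or use System.Reflection.MethodInfo.Invoke to recurse.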
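As a rough illustration of the second option, the sketch below recurses through a generic method by closing it over each nested property type with MethodInfo.MakeGenericMethod() and then calling Invoke(). The Person and Address classes and the CollectNames<T>() helper are hypothetical names made up for this example; they are not part of your model, and collections are deliberately not handled here.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical model, used only for this sketch
public class Address { public string City { get; set; } = ""; }
public class Person
{
    public string Name { get; set; } = "";
    public Address Home { get; set; } = new Address();
}

public static class GenericRecursionSketch
{
    // Collects "Prefix.Property" names for T, recursing into nested class
    // properties (other than string) through reflection.
    public static void CollectNames<T>(string prefix, List<string> names)
    {
        foreach (PropertyInfo property in typeof(T).GetProperties())
        {
            Type propertyType = property.PropertyType;
            if (propertyType.IsClass && propertyType != typeof(string))
            {
                // T is only known at runtime here, so close the generic method
                // over the property type and invoke it reflectively.
                MethodInfo self = typeof(GenericRecursionSketch)
                    .GetMethod(nameof(CollectNames))!
                    .MakeGenericMethod(propertyType);
                self.Invoke(null, new object[] { prefix + property.Name + ".", names });
            }
            else
            {
                names.Add(prefix + property.Name);
            }
        }
    }
}
```

Calling GenericRecursionSketch.CollectNames<Person>("", names) should leave names holding "Name" and "Home.City". The reflective call works, but it loses compile-time checking, which is part of why I'd avoid it when the columns are already known.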

Since you already know what you need, just be specific:

    public static DataTable CreateDataTableFromCustomerList(List<Customer> list) {
        DataTable dataTable = new DataTable();
        dataTable.Columns.AddRange(new DataColumn[] {
            new DataColumn("CustomerName"),
            new DataColumn("CustomerEmail"),
            new DataColumn("OrderID"),
            new DataColumn("OrderItemSKU"),
            new DataColumn("OrderItemDescription"),
            new DataColumn("OrderItemPrice"),
        });

        foreach (Customer customer in list) {
            object[] values = new object[6];
            // { customer.Name, customer.Email };
            values[0] = customer.Name;
            values[1] = customer.Email;
            foreach (Order order in customer.Orders) {
                values[2] = order.ID;
                foreach (Item item in order.Items) {
                    values[3] = item.SKU;
                    values[4] = item.Description;
                    values[5] = item.Price;
                    dataTable.Rows.Add(values);
                }
            }
        }
        return dataTable;
    }

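If you're only working with one type of model object (in this case, Customer), then I recommend the hard-coded approach shown above, as it dramatically simplifies the problem.

That said, if you truly need this to be a generic method because, for example, you will be handling a variety of different model objects (i.e., not exclusively Customer), then you can certainly extend your proposed CreateDataTableFromAnyCollection<T>() method to support recursion, though it's not a trivial task.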

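Approach

The recursive process isn't as straightforward as you might expect, because you're looping through a collection of objects yet only need to determine the definition of the DataTable once.

As such, it makes more sense to separate the recursive functionality into two distinct methods: one for establishing the schema, the other for populating the DataTable. I propose:

  1. EstablishDataTableFromType(), which dynamically establishes the schema of the DataTable based on a given Type (along with any nested types), and
  2. GetValuesFromObject(), which, for each (nested) object in the source list, adds the value of each property to a list of values that can subsequently be added to the DataTable.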

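Challenges

The above approach glosses over a number of challenges introduced when working with complex objects and collections. These include:

  1. How do we determine whether a property is a collection, and thus subject to recursion? We can use the Type.GetInterfaces() and Type.GetGenericTypeDefinition() methods to identify whether a type implements ICollection<>. I implement this in the private IsList() method below.

  2. If it is a collection, how do we determine what type the collection contains (e.g., Order, Item)? We can use Type.GetGenericArguments() to determine what the generic type argument for ICollection<> is. I implement this in the private GetListType() method below.

  3. How do we ensure all data is represented, given that every nested item requires an additional row? We need to establish a new record for every permutation in the object graph.

  4. What happens if you have two collections on a single object? My code assumes you want a permutation for each. So if you added Addresses to Customer, this might yield something like: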

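    Customer.Name  Customer.Orders.ID  Customer.Orders.Items.SKU  Customer.Addresses.PostalCode
    Bill Gates     0                   001                        98052
    Bill Gates     0                   002                        98052
    Bill Gates     0                   001                        98039
    Bill Gates     0                   002                        98039

  5. What happens if an object has two collections of the same Type? Your proposal infers that the DataColumn names should be delineated by Type, but that would introduce a naming conflict. To address that, I assume the property Name should be used as the delineator, not the property Type. E.g., in your sample model, the DataColumn will be Customer.Orders.Items.SKU, not Customer.Order.Item.SKU.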

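  6. How do we distinguish complex objects from "primitive" objects? Or, more accurately, from objects that can be reliably serialized to a meaningful value? Your question assumes that collection properties will contain complex objects and other properties will not, but that's not necessarily true. E.g., a property may point to a complex object or, conversely, to a collection that contains simple objects: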

    public class Order
    {
        public List<string> CouponCodes { get; set; } = new();
        public Address ShipTo { get; set; }
    }
    

    To address this, I rely on @julealgon's answer to How do I tell if a type is a "simple" type? i.e. holds a single value. I implement this in the private IsSimple() method below.

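Solution

The solution to this problem is considerably more involved than the sample code you referenced. I'll give a brief overview of each method below. In addition, I've included XML docs and some comments in the code. The code assumes using directives for System.Collections, System.ComponentModel, System.Data, and System.Linq, and that nullable reference types are enabled. If you have questions about any specific functionality, please ask and I'll provide further clarification.

EstablishDataTableFromType(): This method establishes a DataTable definition based on a given Type. Instead of simply looping through the properties, however, it recurses over any complex types it discovers, including those contained within collections.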

/// <summary>
///   Populates a <paramref name="dataTable"/> with <see cref="DataColumn"/>
///   definitions based on a given <paramref name="type"/>. Optionally prefixes
///   the <see cref="DataColumn"/> name with a <paramref name="prefix"/> to
///   handle nested types.
/// </summary>
/// <param name="type">
///   The <see cref="Type"/> to derive the <see cref="DataColumn"/> definitions
///   from, based on properties.
/// </param>
/// <param name="dataTable">
///   The <see cref="DataTable"/> to add the <see cref="DataColumn"/>s to.
/// </param>
/// <param name="prefix">
///   The prefix to prepend to the <see cref="DataColumn"/> name.
/// </param>

private static void EstablishDataTableFromType(Type type, DataTable dataTable, string prefix = "") {
    var properties = type.GetProperties();
    foreach (System.Reflection.PropertyInfo property in properties)
    {

        // Handle properties that can be meaningfully converted to a string
        if (IsSimple(property.PropertyType))
        {
            dataTable.Columns.Add(
                new DataColumn(
                    prefix + property.Name,
                    Nullable.GetUnderlyingType(property.PropertyType)?? property.PropertyType
                )
            );
        }

        // Handle collections
        else if (IsList(property.PropertyType))
        {
            // If the property is a generic list, detect the generic type used
            // for that list
            var listType = GetListType(property.PropertyType);
            // Recursively call this method in order to define columns for
            // nested types
            EstablishDataTableFromType(listType, dataTable, prefix + property.Name + ".");
        }

        // Handle complex properties
        else {
            EstablishDataTableFromType(property.PropertyType, dataTable, prefix + property.Name + ".");
        }
    }
}

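GetValuesFromObject(): This method takes a source Object and, for each property, adds the property's value to an object[]. If the Object contains an ICollection<> property, it recurses over that property, establishing an object[] for each permutation.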

/// <summary>
///   Populates a <paramref name="target"/> list with an array of
///   <see cref="object"/> instances representing the values of each property on a
///   <paramref name="source"/>.
/// </summary>
/// <remarks>
///   If the <paramref name="source"/> contains a nested <see cref="ICollection{T}"/>,
///   then this method will be called recursively, resulting in a new record for
///   every nested <paramref name="source"/> in that <see cref="ICollection{T}"/>.
/// </remarks>
/// <param name="type">
///   The expected <see cref="Type"/> of the <paramref name="source"/> object.
/// </param>
/// <param name="source">
///   The source <see cref="Object"/> from which to pull the property values.
/// </param>
/// <param name="target">
///   A <see cref="List{T}"/> to store the <paramref name="source"/> values in.
/// </param>
/// <param name="columnIndex">
///   The index associated with the property of the <paramref name="source"/>
///   object.
/// </param>

private static void GetValuesFromObject(Type type, Object? source, List<object?[]> target, ref int columnIndex)
{

    var properties          = type.GetProperties();

    // For each property, either write the value or recurse over the object values
    for (int i = 0; i < properties.Length; i++)
    {

        var property        = properties[i];
        var value           = source is null? null : property.GetValue(source, null);
        var baseIndex       = columnIndex;

        // If the property is a simple type, write its value to every instance of
        // the target object. If there are multiple objects, the value should be
        // written to every permutation
        if (IsSimple(property.PropertyType))
        {
            foreach (var row in target)
            {
                row[columnIndex] = value;
            }
            columnIndex++;
        }

        // If the property is a generic list, recurse over each instance of that
        // object. As part of this, establish copies of the objects in the target
        // storage to ensure that every a new permutation is created for every
        // nested object.
        else if (IsList(property.PropertyType))
        {
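            // Enumerate via the non-generic IEnumerable so that collections which
            // only implement ICollection<T> (e.g., HashSet<T>) aren't treated as null
            var list        = (value as IEnumerable)?.Cast<object>().ToList();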
            var collated    = new List<Object?[]>();

            // If the list is null or empty, rely on the type definition to insert 
            // null values into each DataColumn.
            if (list is null || list.Count == 0) {
                GetValuesFromObject(GetListType(property.PropertyType), null, collated, ref columnIndex);
                continue;
            }

            // Otherwise, for each item in the list, create a new row in the target 
            // list for its values.
            foreach (var item in list)
            {
                columnIndex = baseIndex;
                var values  = new List<Object?[]>();
                foreach (var baseItem in target)
                {
                    values.Add((object?[])baseItem.Clone());
                }
                GetValuesFromObject(item.GetType(), item, values, ref columnIndex);
                collated.AddRange(values);
            }

            // Finally, write each permutation of values to the target collection
            target.Clear();
            target.AddRange(collated);

        }

        // If the property is a complex type, recurse over it so that each of its
        // properties are written to the datatable.
        else
        {
            GetValuesFromObject(property.PropertyType, value, target, ref columnIndex);
        }

    }
}
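
To make the row-cloning step easier to follow in isolation, here is a minimal sketch with the reflection stripped out. ExpandRows() is a hypothetical helper written for this illustration, not part of the solution above; it takes a hard-coded column index instead of discovering properties, and it assumes nullable reference types are enabled.

```csharp
using System;
using System.Collections.Generic;

public static class RowExpansionSketch
{
    // For every value, clone every existing row and write the value into the
    // given column. The result holds rows.Count * values.Count rows.
    public static List<object?[]> ExpandRows(
        List<object?[]> rows, int columnIndex, IReadOnlyList<object?> values)
    {
        var collated = new List<object?[]>();
        foreach (var value in values)
        {
            foreach (var row in rows)
            {
                // Clone so that sibling permutations don't share one array
                var copy = (object?[])row.Clone();
                copy[columnIndex] = value;
                collated.Add(copy);
            }
        }
        return collated;
    }
}
```

Starting from a single row for one customer, expanding column 1 by two SKUs and then column 2 by two postal codes produces four rows, which is the same permutation behavior described under Challenges.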

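CreateDataTableFromAnyCollection(): The original method you provided obviously needs to be updated to call the EstablishDataTableFromType() and GetValuesFromObject() methods, thus supporting recursion, instead of simply looping over a flat list of properties. This is easy to do, though it does require a bit of scaffolding given how I've written the GetValuesFromObject() signature.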

/// <summary>
///   Given a <paramref name="list"/> of <typeparamref name="T"/> objects, will
///   return a <see cref="DataTable"/> with a <see cref="DataRow"/> representing
///   each instance of <typeparamref name="T"/>.
/// </summary>
/// <remarks>
///   If <typeparamref name="T"/> contains any nested <see cref="ICollection{T}"/>, the
///   schema will be flattened. As such, each instances of <typeparamref name=
///   "T"/> will have one record for every nested item in each <see cref=
///   "ICollection{T}"/>.
/// </remarks>
/// <typeparam name="T">
///   The <see cref="Type"/> that the source <paramref name="list"/> contains a
///   list of.
/// </typeparam>
/// <param name="list">
///   A list of <typeparamref name="T"/> instances to be added to the <see cref=
///   "DataTable"/>.
/// </param>
/// <returns>
///   A <see cref="DataTable"/> containing (at least) one <see cref="DataRow"/>
///   for each item in <paramref name="list"/>.
/// </returns>

public static DataTable CreateDataTableFromAnyCollection<T>(IEnumerable<T> list)
{

    var dataTable           = new DataTable();

    EstablishDataTableFromType(typeof(T), dataTable, typeof(T).Name + ".");

    foreach (T source in list)
    {
        var values          = new List<Object?[]>();
        var currentIndex    = 0;

        // Establish an initial array to store the values of the source object
        values.Add(new object[dataTable.Columns.Count]);

        // Assuming the source isn't null, retrieve its values and add them to the 
        // DataTable.
        if (source is not null)
        {
            GetValuesFromObject(source.GetType(), source, values, ref currentIndex);
        }

        // If the source object contains nested lists, then multiple permutations
        // of the source object will be returned.
        foreach (var value in values)
        {
            dataTable.Rows.Add(value);
        }

    }

    return dataTable;

}
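
One detail that makes the null and empty collection handling work: when a value passed to DataTable.Rows.Add() is null, the column is left at its default, which is DBNull.Value unless you've set a DefaultValue. The sketch below shows that in isolation; the NullCellSketch class and the two column names are only illustrative.

```csharp
using System;
using System.Data;

public static class NullCellSketch
{
    public static DataTable Build()
    {
        var table = new DataTable();
        table.Columns.Add(new DataColumn("Customer.Name", typeof(string)));
        table.Columns.Add(new DataColumn("Customer.Orders.Items.Price", typeof(double)));

        // The second value is null, as it would be for a customer with no orders
        table.Rows.Add(new object?[] { "Customer 0", null });
        return table;
    }
}
```

After Build(), table.Rows[0].IsNull(1) should report true, even though the column's DataType is double.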

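IsSimple(): A helper method to determine whether a property type can be reliably serialized to a meaningful string value. If it can't, the methods above recurse over it, writing each of its property values to its own DataColumn. This is based on @julealgon's answer to How do I tell if a type is a "simple" type? i.e. holds a single value.
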
/// <summary>
///   Determine if a given <see cref="Type"/> can be reliably converted to a single
///   <see cref="String"/> value in the <see cref="DataTable"/>.
/// </summary>
/// <param name="type">
///   The <see cref="Type"/> to determine if it is a simple type.
/// </param>
/// <returns>
///   Returns <c>true</c> if the <paramref name="type"/> can be reliably converted
///   to a meaningful <see cref="String"/> value.
/// </returns>

private static bool IsSimple(Type type) =>
  TypeDescriptor.GetConverter(type).CanConvertFrom(typeof(string));
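
If you want to see how this check classifies the types in your own model, a quick probe along these lines may help. It repeats the IsSimple() expression so that it stands alone; the IsSimpleSketch and ShipToAddress names are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical complex type, standing in for something like Address
public class ShipToAddress { public string PostalCode { get; set; } = ""; }

public static class IsSimpleSketch
{
    // Same check as IsSimple() above: does the type's converter accept a string?
    public static bool IsSimple(Type type) =>
        TypeDescriptor.GetConverter(type).CanConvertFrom(typeof(string));
}
```

With the default converters, I'd expect int, double, string, DateTime, and Guid to come back as simple, while List<string> and a plain class such as ShipToAddress do not, unless you register a custom TypeConverter for them.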

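IsList(): Here, I've added a simple helper method to determine whether the Type of a given property is a generic ICollection<>. It is used by both EstablishDataTableFromType() and GetValuesFromObject(). It relies on the IsGenericType property and the GetGenericTypeDefinition() method of Type. I used ICollection<> instead of IEnumerable<> because, e.g., String implements IEnumerable<> (and you don't want a new column for every character in a string!).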

/// <summary>
///   Simple helper function to determine if a given <paramref name="type"/> is a
///   generic <see cref="ICollection{T}"/>.
/// </summary>
/// <param name="type">
///   The <see cref="Type"/> to determine if it is an <see cref="ICollection{T}"/>.
/// </param>
/// <returns>
///   Returns <c>true</c> if the <paramref name="type"/> is a generic <see cref=
///   "ICollection{T}"/>.
/// </returns>

private static bool IsList(Type type) => type
    .GetInterfaces()
    .Any(i => i.IsGenericType && i.GetGenericTypeDefinition() == typeof(ICollection<>));

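GetListType(): Finally, I've added another simple helper method to determine the generic Type of a given generic ICollection<>. It is used by both EstablishDataTableFromType() and GetValuesFromObject(). This is very similar to the IsList() method above, except that it returns the specific Type rather than merely confirming that the property type implements the ICollection<> interface.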

/// <summary>
///   Simple helper function to determine the generic <paramref name="type"/> of
///   an <see cref="ICollection{T}"/>.
/// </summary>
/// <param name="type">
///   The <see cref="Type"/> implementing <see cref="ICollection{T}"/>.
/// </param>
/// <returns>
///   Returns the generic <see cref="Type"/> associated with the <see cref=
///   "ICollection{T}"/> implemented for the <paramref name="type"/>.
/// </returns>

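private static Type GetListType(Type type) => type
    .GetInterfaces()
    // First() fails fast with a clear exception if no ICollection<> is implemented,
    // instead of dereferencing a null result from FirstOrDefault()
    .First(i => i.IsGenericType && typeof(ICollection<>) == i.GetGenericTypeDefinition())
    .GetGenericArguments()
    .Last();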
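
To get a feel for which types these two helpers treat as collections, here is a small standalone probe. It copies both expressions into a hypothetical CollectionProbeSketch class (public, so they can be called from outside) purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CollectionProbeSketch
{
    // True when the type implements the generic ICollection<> interface
    public static bool IsList(Type type) => type
        .GetInterfaces()
        .Any(i => i.IsGenericType && i.GetGenericTypeDefinition() == typeof(ICollection<>));

    // Returns the T of the ICollection<T> that the type implements
    public static Type GetListType(Type type) => type
        .GetInterfaces()
        .First(i => i.IsGenericType && i.GetGenericTypeDefinition() == typeof(ICollection<>))
        .GetGenericArguments()
        .Last();
}
```

List<int>, int[], and HashSet<int> are all reported as lists, and GetListType(typeof(List<int>)) returns typeof(int); string is not, which is the point of choosing ICollection<>. One edge case to be aware of: Type.GetInterfaces() does not include the type itself, so a property declared as ICollection<T> (rather than List<T> or IList<T>) is not detected and would be treated as a complex object instead.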

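Validation

Here's a very simple test (written for xUnit) to validate basic functionality. It only confirms that the number of DataRow instances in the DataTable matches the expected number of permutations; it doesn't validate the actual data in each record, though I've separately verified that the data is correct: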

[Fact]
public void CreateDataTableFromAnyCollection() 
{
    
    // ARRANGE

    var customers           = new List<Customer>();

    // Create an object graph of Customer, Order, and Item instances, three per
    // collection 
    for (var i = 0; i < 3; i++) 
    {
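        // The model's list properties have no initializers, so create the lists
        // explicitly before adding to them
        var customer        = new Customer() {
            Email           = "Customer" + i + "@domain.tld",
            Name            = "Customer " + i,
            Orders          = new List<Order>()
        };
        for (var j = 0; j < 3; j++) 
        {
            var order = new Order() 
            {
                ID = i + "." + j,
                Items = new List<Item>()
            };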
            for (var k = 0; k < 3; k++) 
            {
                order.Items.Add(
                    new Item() 
                    {
                        Description = "Item " + k,
                        SKU = "0123-" + k,
                        Price = i + (k * .1)
                    }
                );
            }
            customer.Orders.Add(order);
        }
        customers.Add(customer);
    }

    // ACT
    var dataTable = ParentClass.CreateDataTableFromAnyCollection<Customer>(customers);

    // ASSERT
    Assert.Equal(27, dataTable.Rows.Count);

    // CLEANUP VALUES
    dataTable.Dispose();

}

Note: This assumes that your CreateDataTableFromAnyCollection() method is placed in a class called ParentClass; obviously, you'll need to adjust that based on your application's structure.

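Conclusion

This should give you a good idea of how to dynamically map an object graph to a flattened DataTable, while also addressing common scenarios you're likely to run into, such as properties that reference complex objects (e.g., the ShipTo example above) and null or empty collections. Obviously, your specific data model may introduce additional challenges that my implementation doesn't anticipate; in that case, this should still give you a solid foundation to build on.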