Splitting a collection into n parts does not give the desired resulting sequences

I'm trying to split a collection into a specific number of parts, and I looked for help in the solutions to this Stack Overflow question: Split a collection into `n` parts with LINQ?

This is my VB.Net translation of @Hasan Khan's solution:

''' <summary>
''' Splits an <see cref="IEnumerable(Of T)"/> into the specified amount of sequences.
''' </summary>
Public Shared Function SplitIntoParts(Of T)(ByVal col As IEnumerable(Of T),
                                            ByVal amount As Integer) As IEnumerable(Of IEnumerable(Of T))

    Dim i As Integer = 0

    Dim splits As IEnumerable(Of IEnumerable(Of T)) =
                 From item As T In col
                 Group item By item = Threading.Interlocked.Increment(i) Mod amount
                 Into Group
                 Select Group.AsEnumerable()

    Return splits


End Function

This is my VB.Net translation of @manu08's solution:

''' <summary>
''' Splits an <see cref="IEnumerable(Of T)"/> into the specified amount of sequences.
''' </summary>
Public Shared Function SplitIntoParts(Of T)(ByVal col As IEnumerable(Of T),
                                            ByVal amount As Integer) As IEnumerable(Of IEnumerable(Of T))

    Return col.Select(Function(item, index) New With {index, item}).
               GroupBy(Function(x) x.index Mod amount).
               Select(Function(x) x.Select(Function(y) y.item))

End Function

The problem is that both functions return the wrong result.

If I split a collection like this:

Dim mainCol As IEnumerable(Of Integer) = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

Dim splittedCols As IEnumerable(Of IEnumerable(Of Integer)) =
    SplitIntoParts(col:=mainCol, amount:=2)

both functions give this result:

1: { 1, 3, 5, 7, 9 }
2: { 2, 4, 6, 8, 10 }

instead of these sequences:

1: { 1, 2, 3, 4, 5 } 
2: { 6, 7, 8, 9, 10 }

What am I doing wrong?

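You're not doing anything wrong; the methods you're using simply don't preserve the ordering the way you want. Think about how Mod and GroupBy work together and you'll see why: the key `index Mod amount` cycles through 0, 1, ..., amount - 1, so consecutive items are dealt out to different groups in round-robin fashion and each group ends up with every amount-th item.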
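To make the difference concrete, here is a small C# sketch that groups the same ten numbers twice: once by index modulo the number of parts, as both translations in the question do, and once by index divided by the chunk size. The helper name `GroupByIndex` is made up for this illustration and is not part of any of the linked answers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingDemo
{
    // Hypothetical helper: groups items by a key computed from each item's zero-based position.
    public static List<List<T>> GroupByIndex<T>(IEnumerable<T> source, Func<int, int> keyFromIndex)
    {
        return source.Select((item, index) => new { item, index })
                     .GroupBy(x => keyFromIndex(x.index), x => x.item)
                     .Select(g => g.ToList())
                     .ToList();
    }

    public static void Main()
    {
        var data = Enumerable.Range(1, 10).ToList();
        const int amount = 2;
        int chunkSize = (data.Count + amount - 1) / amount; // integer ceiling of 10 / 2 = 5

        // index % amount gives keys 0, 1, 0, 1, ...: neighbours land in different groups.
        var interleaved = GroupByIndex(data, i => i % amount);

        // index / chunkSize gives keys 0, 0, 0, 0, 0, 1, 1, ...: neighbours stay together.
        var contiguous = GroupByIndex(data, i => i / chunkSize);

        foreach (var g in interleaved) Console.WriteLine(string.Join(", ", g));
        foreach (var g in contiguous) Console.WriteLine(string.Join(", ", g));
    }
}
```

The first two printed lines are the interleaved groups from the question (1, 3, 5, 7, 9 and 2, 4, 6, 8, 10); the last two are the contiguous ones (1, 2, 3, 4, 5 and 6, 7, 8, 9, 10). Grouping by integer division needs the item count up front, which is why the sketch materializes the data into a list first.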

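I recommend using Jon Skeet's answer, since it preserves the order of your collection (I took the liberty of translating it to VB.Net for you).

You just need to calculate the size of each partition beforehand, because it doesn't split the collection into n chunks but into chunks of length n: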

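Imports System.Collections.ObjectModel
Imports System.Runtime.CompilerServices

' Extension methods in VB.Net must be declared in a Module (so without Shared).
' The module name is arbitrary.
Public Module EnumerableExtensions

    <Extension>
    Public Iterator Function Partition(Of T)(source As IEnumerable(Of T), size As Integer) As IEnumerable(Of IEnumerable(Of T))
        Dim buffer As T() = Nothing
        Dim count As Integer = 0
        For Each item As T In source
            If buffer Is Nothing Then
                buffer = New T(size - 1) {}
            End If
            buffer(count) = item
            count += 1
            If count = size Then
                ' The buffer is full: emit it as one chunk and start a fresh one.
                Yield New ReadOnlyCollection(Of T)(buffer)
                buffer = Nothing
                count = 0
            End If
        Next
        If buffer IsNot Nothing Then
            ' Emit the last, shorter chunk.
            Array.Resize(buffer, count)
            Yield New ReadOnlyCollection(Of T)(buffer)
        End If
    End Function

End Module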

And to use it:

mainCol.Partition(CInt(Math.Ceiling(mainCol.Count() / 2)))

Feel free to hide the Partition(CInt(Math.Ceiling(...))) part in a new method.
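One way such a wrapper might look, sketched here in C#. Two assumptions are baked in: the project targets .NET 6 or later, so the built-in `Enumerable.Chunk` stands in for the hand-written Partition method, and the source exposes its count (`IReadOnlyCollection<T>`), so it is not enumerated an extra time just to count it. The method name `SplitIntoParts` is borrowed from the question; the class name is arbitrary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitExtensions
{
    // Hypothetical wrapper: the caller passes the number of parts,
    // and the chunk size is derived from it here.
    public static IEnumerable<T[]> SplitIntoParts<T>(this IReadOnlyCollection<T> source, int amount)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (amount <= 0) throw new ArgumentOutOfRangeException(nameof(amount));

        // Integer ceiling of Count / amount; at least 1 because Chunk rejects sizes below 1.
        int size = Math.Max(1, (source.Count + amount - 1) / amount);

        // Chunk yields consecutive arrays of `size` items; the last one may be shorter.
        return source.Chunk(size);
    }
}
```

With this in place, `new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }.SplitIntoParts(2)` yields the two parts 1 to 5 and 6 to 10. Because every chunk except the last has the same length, this approach can return fewer parts than requested: ten items split into 6 parts gives a chunk size of 2 and therefore only 5 parts.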

An inefficient solution (it iterates over the data too many times):

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var data = Enumerable.Range(1, 10);
        var result = data.Split(2);            
    }
}

static class Extensions
{
    public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> col, int amount)
    {
        var chunkSize = (int)Math.Ceiling((double)col.Count() / (double)amount);

        for (var i = 0; i < amount; ++i)
            yield return col.Skip(chunkSize * i).Take(chunkSize);
    }
}

Edit:

In VB.Net:

Public Shared Iterator Function SplitIntoParts(Of T)(ByVal col As IEnumerable(Of T),
                                                     ByVal amount As Integer) As IEnumerable(Of IEnumerable(Of T))

    Dim chunkSize As Integer = CInt(Math.Ceiling(CDbl(col.Count()) / CDbl(amount)))

    For i As Integer = 0 To amount - 1
        Yield col.Skip(chunkSize * i).Take(chunkSize)
    Next

End Function

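The MyExtensions class has two public Split methods:

  1. For ICollection: iterates through the collection only once, to split it.
  2. For IEnumerable: iterates through the enumerable twice, once to count the items and once to split them. Avoid it whenever possible (the first one is safe, and twice as fast).

Moreover, this algorithm tries to return exactly the specified number of collections; when there are fewer items than requested parts, it returns one single-item part per item.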

public static class MyExtensions
{
    // Works with ICollection - iterates through collection only once.
    public static IEnumerable<IEnumerable<T>> Split<T>(this ICollection<T> items, int count)
    {
        return Split(items, items.Count, count);
    }

    // Works with IEnumerable and iterates the items TWICE: first to count them, then to split them.
    public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> items, int count)
    {            
        // ReSharper disable PossibleMultipleEnumeration
        var itemsCount = items.Count();
        return Split(items, itemsCount, count);
        // ReSharper restore PossibleMultipleEnumeration
    }

    private static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> items, int itemsCount, int partsCount)
    {
        if (items == null)
            throw new ArgumentNullException("items");
        if (partsCount <= 0)
            throw new ArgumentOutOfRangeException("partsCount");

        var rem = itemsCount % partsCount;
        var min = itemsCount / partsCount;
        var max = rem != 0 ? min + 1 : min;

        var index = 0;
        var enumerator = items.GetEnumerator();

        while (index < itemsCount)
        {
            var size = 0 < rem-- ? max : min;
            yield return SplitPart(enumerator, size);
            index += size;
        }
    }

    private static IEnumerable<T> SplitPart<T>(IEnumerator<T> enumerator, int count)
    {
        for (var i = 0; i < count; i++)
        {
            if (!enumerator.MoveNext())
                break;
            yield return enumerator.Current;
        }            
    }
}

Sample program:

var items = new [] {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'};

for(var i = 1; i <= items.Length + 3; i++)
{
    Console.WriteLine("{0} part(s)", i);
    foreach (var part in items.Split(i))
        Console.WriteLine(string.Join(", ", part));
    Console.WriteLine();
}

...and the output of this program:

1 part(s)
a, b, c, d, e, f, g, h, i, j

2 part(s)
a, b, c, d, e
f, g, h, i, j

3 part(s)
a, b, c, d
e, f, g
h, i, j

4 part(s)
a, b, c
d, e, f
g, h
i, j

5 part(s)
a, b
c, d
e, f
g, h
i, j

6 part(s)
a, b
c, d
e, f
g, h
i
j

7 part(s)
a, b
c, d
e, f
g
h
i
j

8 part(s)
a, b
c, d
e
f
g
h
i
j

9 part(s)
a, b
c
d
e
f
g
h
i
j

10 part(s)
a
b
c
d
e
f
g
h
i
j

11 part(s) // Only 10 items in collection.
a
b
c
d
e
f
g
h
i
j

12 part(s) // Only 10 items in collection.
a
b
c
d
e
f
g
h
i
j

13 part(s)  // Only 10 items in collection.
a
b
c
d
e
f
g
h
i
j
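A note on the Split extension methods above: all parts read from one shared enumerator, so they are lazy and give the expected result only when each part is enumerated once, in order, as the sample program does. If independent, re-enumerable parts are preferred, a variant along the following lines could copy each part into its own list. This is only a sketch with the same part sizes as Split; the names `SplitBuffered` and `BufferedSplitExtensions` are hypothetical, and it assumes the source exposes its count through `IReadOnlyCollection<T>`.

```csharp
using System;
using System.Collections.Generic;

public static class BufferedSplitExtensions
{
    // Hypothetical variant: every part is copied into its own List<T>,
    // so parts can be enumerated in any order and more than once.
    public static IEnumerable<IReadOnlyList<T>> SplitBuffered<T>(this IReadOnlyCollection<T> items, int partsCount)
    {
        // Arguments are validated eagerly, before the lazy iterator is created.
        if (items == null) throw new ArgumentNullException(nameof(items));
        if (partsCount <= 0) throw new ArgumentOutOfRangeException(nameof(partsCount));
        return SplitBufferedIterator(items, partsCount);
    }

    private static IEnumerable<IReadOnlyList<T>> SplitBufferedIterator<T>(IReadOnlyCollection<T> items, int partsCount)
    {
        int min = items.Count / partsCount; // size of the smaller parts
        int rem = items.Count % partsCount; // this many leading parts get one extra item

        // The enumerator is disposed when iteration finishes or is abandoned.
        using (var enumerator = items.GetEnumerator())
        {
            for (int produced = 0; produced < items.Count; )
            {
                int size = rem-- > 0 ? min + 1 : min;
                var part = new List<T>(size);
                while (part.Count < size && enumerator.MoveNext())
                    part.Add(enumerator.Current);
                produced += size;
                yield return part;
            }
        }
    }
}
```

The trade-off is memory: each part is held in full while it is in use, whereas the lazy version never buffers more than the current item.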