Is there an easy way to do parallel aggregation with a non-commutative operation?

The .NET Framework makes parallel aggregation easy, but according to the documentation it only works for commutative operations, that is, operations where f(x, y) = f(y, x):

The .NET implementation of the Parallel Aggregation pattern also expects the operations to be commutative.

I want to aggregate string values by concatenation, which is a non-commutative operation. The sequential approach looks like this:

var result = string.Concat(sequence.Select(this.LongOperation));

So if this.LongOperation successively returns Hello, World, and !, the final result is HelloWorld!.

If I use parallel aggregation, the result could be HelloWorld!, but it could also be World!Hello, !HelloWorld, and so on.

A workaround is to do something like this:

var result = sequence
    .AsParallel()
    .Select((v, i) => new { Index = i, Value = v })
    .Select(c => new { Index = c.Index, Value = this.LongOperation(c.Value) })
    .OrderBy(c => c.Index)
    .Aggregate(seed: string.Empty, func: (prev, current) => prev + current);

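The drawback (unimportant in my particular case) is that the entire sequence is evaluated at the OrderBy step, rather than lazily as the aggregation consumes it. Another way to write the same thing is: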

var parts = sequence
    .AsParallel()
    .Select((v, i) => new { Index = i, Value = v })
    .Select(c => new { Index = c.Index, Value = this.LongOperation(c.Value) })
    .OrderBy(c => c.Index)
    .Select(c => c.Value);

var result = string.Concat(parts);

Should I do it this way, or is there a simpler way to do this?

You're looking for ParallelEnumerable.AsOrdered:

var result = sequence
    .AsParallel()
    .AsOrdered()
    .Aggregate(seed: string.Empty, func: (prev, current) => prev + current);

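The fact that you need to preserve ordering will have a performance impact on your query. Because the results have to be aggregated in order, you won't get the full benefit of parallelism, and in some cases performance may even be worse than with sequential iteration. That said, this will do what you need.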

For example, the following code will always produce "[35][14][22][6][7]", matching the order of the source array:

var result = new [] { 35, 14, 22, 6, 7 }
    .AsParallel()
    .AsOrdered()
    .Select(c => "[" + c + "]")
    .Aggregate(seed: string.Empty, func: (prev, current) => prev + current);

Console.WriteLine(result);
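Applied to the scenario in the question, a minimal sketch might look like the following. The class name OrderedConcatDemo, the helper ConcatInOrder, and the body of LongOperation (a short sleep that returns its input) are hypothetical stand-ins, since the real LongOperation isn't shown. The sketch only illustrates that AsOrdered lets the expensive Select run in parallel while string.Concat still sees the results in source order:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class OrderedConcatDemo
{
    // Hypothetical stand-in for the question's LongOperation:
    // simulates slow work and returns its input unchanged.
    private static string LongOperation(string value)
    {
        Thread.Sleep(50);
        return value;
    }

    // Maps LongOperation over the sequence with PLINQ while
    // preserving the source order of the results.
    public static string ConcatInOrder(IEnumerable<string> sequence)
    {
        var parts = sequence
            .AsParallel()
            .AsOrdered()                      // results keep the source order
            .Select(s => LongOperation(s));   // work may be spread across threads

        // string.Concat enumerates the ordered query one element at a time.
        return string.Concat(parts);
    }

    public static void Main()
    {
        // Prints "HelloWorld!"
        Console.WriteLine(ConcatInOrder(new[] { "Hello", "World", "!" }));
    }
}
```

This keeps the final concatenation sequential, which is cheap compared to LongOperation, so no manual index bookkeeping or OrderBy is needed.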

The Parallel Programming team has a good post on PLINQ ordering.