What is the best Ruby algorithm to divide an array into groups with a minimum and maximum length?

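I have a background job that needs to merge a large number of items together. I want to split it into multiple "sub jobs", each of which merges a subset of the data, followed by a final pass that merges the outputs of all the "sub jobs" together.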

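A naive approach is to split the data into groups of x elements. The problem is that the last group could be left with a remainder of just 1 element, which would make it a "noop". I want to find the optimal "x" such that the groups are roughly even and each group has a minimum and maximum number of elements (e.g. no fewer than 10 elements and no more than 20).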
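To make the remainder problem concrete, here is a minimal sketch of the naive split using `Enumerable#each_slice`. The helper name `naive_slice_sizes` and the sample counts are assumptions chosen for illustration only.

```ruby
# Hypothetical helper: group sizes produced by a naive fixed-size split.
def naive_slice_sizes(count, slice_size)
  (1..count).each_slice(slice_size).map(&:size)
end

p naive_slice_sizes(21, 20) # => [20, 1]
p naive_slice_sizes(61, 20) # => [20, 20, 20, 1]
```

In both cases the last group holds a single leftover element, which is the "noop" sub job described above.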

What is a good algorithm for this in Ruby?

Here is some sample output with a minimum of 10 and a maximum of 20. The numbers are the number of elements in each array.

<number of elements in input> => <subgroup 1>, <subgroup 2>, etc. 

5 => 5
10 => 10
15 => 15
20 => 20
21 => 10, 11
30 => 15, 15
40 => 20, 20
41 => 13, 14, 14
42 => 14, 14, 14
43 => 14, 14, 15
45 => 15, 15, 15
50 => 16, 17, 17
55 => 18, 18, 19
60 => 20, 20, 20
61 => 15, 15, 15, 16

Basically, I want to divide the array into roughly even groups, with a minimum and maximum number of elements in each group.

Here is my attempted solution:

class ArrayPartition
  def self.partition_lengths(length, minimum, maximum)
    if length <= maximum
      return [length]
    end

    group_size = maximum
    groups = []

    while groups.empty? || groups.last < minimum
      groups = []
      remaining = length
      while remaining > group_size
        groups << group_size
        remaining -= group_size
      end
      groups << remaining
      group_size -= 1
    end

    # Redistribute evenly
    avg_group_size = (length / groups.size.to_f).round
    groups = []
    remaining = length
    while remaining > maximum
      groups << avg_group_size
      remaining -= avg_group_size
    end
    groups << remaining

    groups.sort
  end
end

RSpec tests:

RSpec.describe ArrayPartition do
  it 'partitions an array into optimal groups with min and max elements' do
    expect(ArrayPartition.partition_lengths(5, 5, 10)).to eq [5]
    expect(ArrayPartition.partition_lengths(6, 5, 10)).to eq [6]
    expect(ArrayPartition.partition_lengths(7, 5, 10)).to eq [7]
    expect(ArrayPartition.partition_lengths(10, 5, 10)).to eq [10]
    expect(ArrayPartition.partition_lengths(11, 5, 10)).to eq [5, 6]
    expect(ArrayPartition.partition_lengths(12, 5, 10)).to eq [6, 6]
    expect(ArrayPartition.partition_lengths(13, 5, 10)).to eq [6, 7]
    expect(ArrayPartition.partition_lengths(16, 5, 10)).to eq [8, 8]
    expect(ArrayPartition.partition_lengths(20, 5, 10)).to eq [10, 10]
    expect(ArrayPartition.partition_lengths(21, 5, 10)).to eq [7, 7, 7]
    expect(ArrayPartition.partition_lengths(22, 5, 10)).to eq [7, 7, 8]

    expect(ArrayPartition.partition_lengths(5, 10, 20)).to eq [5]
    expect(ArrayPartition.partition_lengths(10, 10, 20)).to eq [10]
    expect(ArrayPartition.partition_lengths(15, 10, 20)).to eq [15]
    expect(ArrayPartition.partition_lengths(20, 10, 20)).to eq [20]
    expect(ArrayPartition.partition_lengths(21, 10, 20)).to eq [10, 11]
    expect(ArrayPartition.partition_lengths(30, 10, 20)).to eq [15, 15]
    expect(ArrayPartition.partition_lengths(40, 10, 20)).to eq [20, 20]
    expect(ArrayPartition.partition_lengths(41, 10, 20)).to eq [13, 14, 14]
    expect(ArrayPartition.partition_lengths(42, 10, 20)).to eq [14, 14, 14]
    expect(ArrayPartition.partition_lengths(43, 10, 20)).to eq [14, 14, 15]
    expect(ArrayPartition.partition_lengths(45, 10, 20)).to eq [15, 15, 15]
    expect(ArrayPartition.partition_lengths(50, 10, 20)).to eq [16, 17, 17]
    expect(ArrayPartition.partition_lengths(55, 10, 20)).to eq [18, 18, 19]
    expect(ArrayPartition.partition_lengths(60, 10, 20)).to eq [20, 20, 20]
    expect(ArrayPartition.partition_lengths(61, 10, 20)).to eq [15, 15, 15, 16]
  end
end

I would approach it like this:

# count of original items
count = 61

# max bucket size
max = 20

# decide buckets
groups = (count / max) + (count % max > 0 ? 1 : 0)

# this will be the final result
result = []

# create buckets
groups.times { result.push(0) }

# iterate over original items and distribute them in the buckets
count.times do |n|
  result[n % groups] += 1
end

p result

Given a count of 61, it prints [16, 15, 15, 15]. I have explained the purpose of each statement in the snippet itself.
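The result above only contains counts. To split the actual items, the counts can be applied as consecutive slice lengths. This is a minimal sketch; the helper name `slice_by_lengths` and the sample data are assumptions, not part of the snippet above.

```ruby
# Hypothetical helper: cut `items` into consecutive slices of the given lengths.
def slice_by_lengths(items, lengths)
  offset = 0
  lengths.map do |len|
    chunk = items[offset, len] # `len` elements starting at `offset`
    offset += len
    chunk
  end
end

items = (1..61).to_a
chunks = slice_by_lengths(items, [16, 15, 15, 15])
p chunks.map(&:size) # => [16, 15, 15, 15]
```

Each chunk could then be handed to its own sub job.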

A slightly different version:

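def divide(c, max = 20)
  # fewest groups that keep every group at or below max
  groups = (c.to_f / max).ceil
  # base size per group, and how many groups need one extra element
  min_count, extra = c.divmod(groups)

  [min_count + 1] * extra + [min_count] * (groups - extra)
end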