How to call the output(s) of a combination that add up to a certain number?

Question

I'm trying to see how many of the combinations returned below add up to at least 46. What code should I enter?

x=c(20,10,16,16,6,15)
L=combn(x,3)

L holds the 6-choose-3 combinations.

Thanks!

Answer 1

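I think a good way to solve this is to transpose the matrix of combinations you generated, sum each row, and then keep the rows that add up to at least 46.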

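library(dplyr)  # provides %>%, mutate() and filter()

# Transpose so that each row is one combination, then keep rows summing to at least 46
transposeL <- as.data.frame(t(L)) %>%
  mutate(sum = V1 + V2 + V3) %>%
  filter(sum >= 46)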

Answer 2

Try this:

x <- c(20,10,16,16,6,15)
sum(combn(x, 3, sum) >= 46)

Output:

[1] 6
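To spell out why the one-liner above works, here is a small step-by-step sketch in base R. The intermediate names `sums` and `at_least_46` are only illustrative. Passing a function as the third argument of `combn` applies it to each combination, so the result is a vector of sums rather than a matrix of combinations.

```r
x <- c(20, 10, 16, 16, 6, 15)

# One sum per combination: choose(6, 3) = 20 values
sums <- combn(x, 3, FUN = sum)
length(sums)

# Logical vector marking the combinations whose sum is at least 46
at_least_46 <- sums >= 46

# TRUE counts as 1, so summing the logical vector counts the matches
sum(at_least_46)
```

Because only the sums are kept, this answers "how many" but does not show which combinations qualify.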

Answer 3

Building on Joe Erinjeri's answer, we can handle any combination size and avoid writing out the variable names (V1, V2, V3, ...), like so:

library(tidyverse)
c(20,10,16,16,6,15) %>% 
  combn(3) %>% 
  t() %>% 
  as.data.frame() %>% 
  mutate(V_sum = rowSums(.)) %>%  
  filter(V_sum >= 46) %>% 
  nrow()

  [1] 6

Answer 4

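The RcppAlgos package* was built for this type of problem. The specific problem in the OP involves combinations of a multiset, for which most standard tools produce many duplicate entries (which I assume are not wanted).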

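For example, the naive approach has to generate all combinations first, sum the values, and then check every combination. Note that, for demonstration purposes, we are not taking advantage of the FUN = argument: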

x <- c(20,10,16,16,6,15)

funNaive <- function(x, m, tar) {
    all_combs <- t(combn(x, m))
    ind <- which(rowSums(all_combs) >= tar)
    all_combs[ind, ]
}

funNaive(x, 3, 46)
     [,1] [,2] [,3]
[1,]   20   10   16
[2,]   20   10   16   <- duplicate of the 1st row
[3,]   20   16   16
[4,]   20   16   15
[5,]   20   16   15   <- duplicate of the 4th row
[6,]   16   16   15

You will notice that the 1st and 2nd rows are identical, as are the 4th and 5th rows. There should be only 4 results in total for this problem.
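As a quick check of that count, the sketch below flags the repeated rows with base R's `duplicated()` and counts the distinct ones. The names `all_combs` and `hits` are illustrative. Comparing rows directly works for this particular `x`; in general the source vector should be sorted first so that equal combinations always appear in the same order.

```r
x <- c(20, 10, 16, 16, 6, 15)

# All 3-element combinations, one per row
all_combs <- t(combn(x, 3))

# Rows whose sum is at least 46 (the naive result)
hits <- all_combs[rowSums(all_combs) >= 46, , drop = FALSE]

# TRUE marks a row that repeats an earlier row
duplicated(hits)

# Number of distinct qualifying combinations
nrow(unique(hits))
```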

Here is a better approach using comboGeneral from RcppAlgos. Note the use of the freqs argument, which specifies how many times each element of the source vector is repeated:

funAlgos <- function(x, m, tar) {
    x <- sort(x)
    myFreq <- rle(x)$lengths
    myVals <- rle(x)$values
    
    RcppAlgos::comboGeneral(myVals, m,
                            freqs = myFreq,
                            constraintFun = "sum",
                            comparisonFun = ">=",
                            limitConstraints = tar)
}

funAlgos(x, 3, 46)
     [,1] [,2] [,3]
[1,]   20   16   16
[2,]   20   16   15
[3,]   20   16   10
[4,]   16   16   15
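To make the `freqs` input concrete, this short base R sketch shows what `rle()` returns for the sorted example vector. The name `runs` is illustrative and is not part of the answer's code.

```r
x <- c(20, 10, 16, 16, 6, 15)

# Run-length encoding of the sorted vector: distinct values and their counts
runs <- rle(sort(x))

runs$values   # distinct values: 6 10 15 16 20
runs$lengths  # how often each occurs: 1 1 1 2 1

# The two pieces together reconstruct the sorted multiset
rep(runs$values, runs$lengths)
```

Sorting first matters because `rle()` only merges adjacent equal values.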

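You can modify the base approach above to give the correct results. In this case we still cannot take advantage of the FUN = argument, because we need to be able to remove duplicate combinations: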

funNaiveCorrected <- function(x, m, tar) {
    x <- sort(x)
    all_combs <- t(combn(x, m))
    no_dupes <- all_combs[!duplicated(all_combs), ]
    ind <- which(rowSums(no_dupes) >= tar)
    no_dupes[ind, ]
}

funNaiveCorrected(x, 3, 46)
      [,1] [,2] [,3]
[1,]   20   10   16
[2,]   20   16   16
[3,]   20   16   15
[4,]   16   16   15

It must be stressed that we cannot simply apply unique to the source vector, because we would miss the combinations that contain repeated values.
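A small sketch of that pitfall, using only base R (the names `u_combs` and `u_hits` are illustrative): once the second 16 is dropped, every combination that needs both copies becomes impossible.

```r
x <- c(20, 10, 16, 16, 6, 15)

# Combinations drawn from the de-duplicated vector 20, 10, 16, 6, 15
u_combs <- t(combn(unique(x), 3))

# Keep rows with a sum of at least 46
u_hits <- u_combs[rowSums(u_combs) >= 46, , drop = FALSE]

# Only 2 rows survive instead of the expected 4
nrow(u_hits)
u_hits
```

The two missing results are (20, 16, 16) and (16, 16, 15), both of which use 16 twice.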

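Generating every combination and then removing the duplicates is fine for small problems, but for larger problems it quickly becomes a real bottleneck. Observe: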

set.seed(42)
big_x <- sort(sample(25, replace = TRUE))
system.time(algos <- funAlgos(big_x, 10, 175))
user  system elapsed 
   0       0       0

dim(algos)
[1] 1668   10

system.time(naive <- funNaiveCorrected(big_x, 10, 175))
  user  system elapsed 
17.161   0.276  17.434

dim(naive)
[1] 1668   10

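For even larger problems, the base approach will consume all available memory. Trying the example below with the base approach is not recommended (N.B. choose(50, 20) ~= 4.712921e+13):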

set.seed(1729)
huge_x <- sort(sample(50, replace = TRUE))
system.time(algos <- funAlgos(huge_x, 20, 800))
 user  system elapsed 
0.009   0.001   0.010

dim(algos)
[1] 13473    20
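For a rough sense of why the base approach is infeasible here, the back-of-the-envelope sketch below estimates the size of the full combination matrix. It assumes 4 bytes per element, since `sample()` returns integers, and ignores any further copies made by `t()` or `duplicated()`, so the real requirement would be higher.

```r
# Number of 20-element combinations of 50 items (rows the base approach would build)
n_combs <- choose(50, 20)
n_combs

# Assumption: an integer matrix with 20 columns at 4 bytes per element
bytes_needed <- n_combs * 20 * 4

# Same figure expressed in TiB
tib_needed <- bytes_needed / 2^40
tib_needed
```

The estimate comes out in the thousands of TiB, far beyond the memory of any ordinary machine.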

* I am the author of RcppAlgos.

syntax

combinations

r