Count in how many different groups I can find an element in R dplyr

Question

My data looks like this:

library(tidyverse)

df3 <- tibble(fruits=c("apple","banana","ananas","apple","ananas","apple","ananas"),
              position=c("135","135","135","136","137","138","138"), 
              counts = c(100,200,100,30,40,50,100))

df3
#> # A tibble: 7 × 3
#>   fruits position counts
#>   <chr>  <chr>     <dbl>
#> 1 apple  135         100
#> 2 banana 135         200
#> 3 ananas 135         100
#> 4 apple  136          30
#> 5 ananas 137          40
#> 6 apple  138          50
#> 7 ananas 138         100

Created on 2022-02-21 by the reprex package (v2.0.1)

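I want to group_by fruits and, for each fruit, find which positions it appears at and how many different positions that is. I would like my data to look like this: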

fruits    groups    n_groups     sum_count
apple  135,136,138      3            180
banana      135         1            200
ananas 135,137,138      3            240

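The groups column can be a list of characters. I don't care much about the structure.

Thank you for your time. Any guidance is appreciated.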

Answer 1

I don't fully understand what you want from your description, but you can get the data.frame you are after by grouping by fruits:

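# n() counts rows per fruit; here that equals the number of distinct
# positions because no fruit appears twice at the same position
df3 %>% 
  group_by(fruits) %>% 
  summarise(groups = list(position), n_groups = n(), counts = sum(counts))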

  fruits groups    n_groups counts
  <chr>  <list>       <int>  <dbl>
1 ananas <chr [3]>        3    240
2 apple  <chr [3]>        3    180
3 banana <chr [1]>        1    200
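If a plain comma-separated string is preferred over a list column, as in the desired output from the question, something along these lines might work. This is only a sketch: it rebuilds `df3` from the question so that it runs on its own, and it swaps in `n_distinct()` and `unique()` on the assumption that a repeated position should be counted once.

library(dplyr)
library(tibble)

# Same data as in the question
df3 <- tibble(
  fruits   = c("apple", "banana", "ananas", "apple", "ananas", "apple", "ananas"),
  position = c("135", "135", "135", "136", "137", "138", "138"),
  counts   = c(100, 200, 100, 30, 40, 50, 100)
)

res <- df3 %>%
  group_by(fruits) %>%
  summarise(
    groups    = paste(unique(position), collapse = ","),  # e.g. "135,136,138"
    n_groups  = n_distinct(position),                     # distinct positions per fruit
    sum_count = sum(counts)
  )

res

The rows come back sorted by fruits (ananas, apple, banana) rather than in order of first appearance.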

Answer 2

Please find below another possibility using data.table.

Reprex

Code

library(data.table)
library(tibble)

# setDT() converts df3 to a data.table by reference; .N is the row count per fruit
setDT(df3)[, .(groups = paste(position, collapse = ","),
               n_groups = .N,
               sum_count = sum(counts)),
           by = .(fruits)]

Output

#>    fruits      groups n_groups sum_count
#> 1:  apple 135,136,138        3       180
#> 2: banana         135        1       200
#> 3: ananas 135,137,138        3       240

Created on 2022-02-21 by the reprex package (v2.0.1)
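One caveat worth illustrating: `.N` counts rows, not distinct values, so the two would differ if a fruit showed up more than once at the same position. The sketch below assumes one extra, hypothetical row (apple at position 135 with counts 10) that is not in the original data, and uses data.table's `uniqueN()` for the distinct count.

library(data.table)

# Data from the question plus one hypothetical duplicate row (apple at 135),
# added only to show the difference; it is not part of the original data
dt <- data.table(
  fruits   = c("apple", "banana", "ananas", "apple", "ananas", "apple", "ananas", "apple"),
  position = c("135", "135", "135", "136", "137", "138", "138", "135"),
  counts   = c(100, 200, 100, 30, 40, 50, 100, 10)
)

res <- dt[, .(
  groups    = paste(unique(position), collapse = ","),
  n_rows    = .N,                 # rows per fruit (4 for apple)
  n_groups  = uniqueN(position),  # distinct positions per fruit (3 for apple)
  sum_count = sum(counts)
), by = fruits]

res

With the original, duplicate-free data both columns would agree, so the choice only matters if repeated fruit and position pairs are possible.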
