Aggregate by multiple columns, sum one column and keep other columns? Create new column based on aggregated values?

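I have a data frame of sales. I need to aggregate df by two columns, ProductID and Day, and sum the values of a different column, Amount, within each group so that it shows the total. I want to keep the other columns that can be grouped (same value across rows), which in this example is only Product. The last column, Store, will not be kept because its values can differ within a group. However, I need to add a column UniqueStores that counts the number of unique stores in each group with the same ProductID and Day. For example, the first group, with ID = 1 and Day = Monday, has one unique store, "N", so the value is 1.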

I tried to draft the table as text here but could not get the formatting right, so here is a picture of what it looks like before aggregation:

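I have tried aggregating with group_by + summarize and with df[sum, by], but they do not keep variables that are not given as grouping columns. Is there a workaround that does not require manually listing every column that should be kept?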

Thanks in advance, I hope I made myself clear.

Input:

df <- data.frame("ProductID" = c(1,1,1,1,2,2,2,2), "Day"=c("Monday","Monday", "Tuesday", "Tuesday","Wednesday", "Wednesday", "Friday", "Friday"), "Amount"=c(5,5,3,7,6,9,5,2), "Product"=c("Food","Food","Food","Food","Toys","Toys","Toys","Toys"), "Store"=c("N","N","W","N", "S","W", "S","S"))

We can do a grouped operation in dplyr, using summarise to get the sum of 'Amount' and n_distinct to count the distinct elements of 'Store':

library(dplyr)
df %>% 
  group_by(ProductID, Day, Product) %>%
  summarise(Amount = sum(Amount), 
       UniqueStores = n_distinct(Store), .groups = 'drop')
# A tibble: 4 x 5
#  ProductID Day       Product Amount UniqueStores
#      <dbl> <chr>     <chr>    <dbl>        <int>
#1         1 Monday    Food        10            1
#2         1 Tuesday   Food        10            2
#3         2 Friday    Toys         7            1
#4         2 Wednesday Toys        15            2

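If there are many columns and only some of them should be summarised while the rest are kept, one option is to mutate within the groups and then use distinct to keep the first row of each group: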

df %>% 
  group_by(ProductID, Day, Product) %>%
  mutate(Amount = sum(Amount), 
       UniqueStores = n_distinct(Store), .keep = 'all') %>%
  ungroup %>%
  distinct(ProductID, Day, Product, .keep_all = TRUE)
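If the goal is to avoid typing the "kept" columns at all, one possible approach is to detect them first: a column can be kept safely when it has exactly one distinct value in every (ProductID, Day) group. The following is only a sketch under that assumption. The helper names keys, summed and constant_cols are made up for illustration and do not come from the question.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  ProductID = c(1, 1, 1, 1, 2, 2, 2, 2),
  Day = c("Monday", "Monday", "Tuesday", "Tuesday",
          "Wednesday", "Wednesday", "Friday", "Friday"),
  Amount = c(5, 5, 3, 7, 6, 9, 5, 2),
  Product = c("Food", "Food", "Food", "Food", "Toys", "Toys", "Toys", "Toys"),
  Store = c("N", "N", "W", "N", "S", "W", "S", "S")
)

keys   <- c("ProductID", "Day")  # grouping columns (illustrative name)
summed <- "Amount"               # column to sum; never treated as a kept column

# Count distinct values per group for every remaining column, then keep
# the names of the columns that have exactly one value in all groups
constant_cols <- df %>%
  select(-all_of(summed)) %>%
  group_by(across(all_of(keys))) %>%
  summarise(across(everything(), n_distinct), .groups = "drop") %>%
  select(-all_of(keys)) %>%
  select(where(function(x) all(x == 1))) %>%
  names()

# Group by the keys plus the detected constant columns
result <- df %>%
  group_by(across(all_of(c(keys, constant_cols)))) %>%
  summarise(Amount = sum(Amount),
            UniqueStores = n_distinct(Store), .groups = "drop")

result
```

With the example data only Product is detected, so the result has the same columns as the summarise call above. Note that this scans the data twice, and that a column which happens to be constant in the current data (but is not constant in principle) would also be kept.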

With data.table:

library(data.table)

setDT(df)[, .(Amount = sum(Amount, na.rm = TRUE),
              UniqueStores = uniqueN(Store, na.rm = TRUE)), 
          by = .(ProductID, Day, Product)
          ]

Output:

   ProductID       Day Product Amount UniqueStores
1:         1    Monday    Food     10            1
2:         1   Tuesday    Food     10            2
3:         2 Wednesday    Toys     15            2
4:         2    Friday    Toys      7            1
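The rows above come out in order of first appearance (Wednesday before Friday), whereas the dplyr result is sorted by the grouping columns. If the sorted order is wanted from data.table as well, a small sketch using keyby instead of by could look like this. It works on a copy made with as.data.table, which is a choice made here so that df is not converted by reference as setDT does.

```r
library(data.table)

# Same data as in the question
df <- data.frame(
  ProductID = c(1, 1, 1, 1, 2, 2, 2, 2),
  Day = c("Monday", "Monday", "Tuesday", "Tuesday",
          "Wednesday", "Wednesday", "Friday", "Friday"),
  Amount = c(5, 5, 3, 7, 6, 9, 5, 2),
  Product = c("Food", "Food", "Food", "Food", "Toys", "Toys", "Toys", "Toys"),
  Store = c("N", "N", "W", "N", "S", "W", "S", "S")
)

# as.data.table() returns a copy, leaving df itself a plain data.frame
dt <- as.data.table(df)

# keyby groups like by, but also sorts the result by the grouping columns
res <- dt[, .(Amount = sum(Amount, na.rm = TRUE),
              UniqueStores = uniqueN(Store, na.rm = TRUE)),
          keyby = .(ProductID, Day, Product)]

res
```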