Creating comma-separated character values for each cell in a column for each ID using R

Question

I have two columns, ID and Product:

ID  Product
A   Clothing
B   Food
A   Food
A   Furniture
C   Food
B   Clothing

How can I use R to create a data frame in which the products for each ID are comma-separated, like this:

ID  Product
A   Clothing, Food, Furniture
B   Food, Clothing
C   Food

Answer 1

We can use one of the group-by functions. With data.table, we convert the 'data.frame' to a 'data.table' (setDT(df1)) and, grouped by 'ID', paste the elements of 'Product' together. toString is a wrapper for paste(., collapse=', ').

library(data.table)
setDT(df1)[,list(Product=toString(Product)), by = ID]
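As a small illustration of the toString remark above, the following base R sketch compares toString with an explicit paste call. The vector name products is hypothetical and simply stands in for the Product values of one ID group.

```r
# Hypothetical vector standing in for one ID group's Product values
products <- c("Clothing", "Food", "Furniture")

# toString() collapses the vector into a single comma-separated string
a <- toString(products)

# The explicit paste() call that toString() wraps for plain vectors
b <- paste(products, collapse = ", ")

# Both approaches should give the same string
print(a)
print(identical(a, b))
```

If a different separator is needed, calling paste with another collapse value is the more direct route.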

A similar option with dplyr is:

library(dplyr)
df1 %>%
   group_by(ID) %>%
   summarise(Product= toString(Product))

Or we can use aggregate from base R:

aggregate(Product~ID, df1, FUN=toString)
#    ID                   Product
#  1  A Clothing, Food, Furniture
#  2  B            Food, Clothing
#  3  C                      Food
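The answer never shows how df1 is built, so here is a minimal reproducible sketch for the aggregate approach. It assumes df1 holds the six rows from the question as character columns; the name res is only illustrative.

```r
# Rebuild the question's data; df1 is the name the answer assumes
df1 <- data.frame(
  ID      = c("A", "B", "A", "A", "C", "B"),
  Product = c("Clothing", "Food", "Food", "Furniture", "Food", "Clothing"),
  stringsAsFactors = FALSE  # keep both columns as character, not factor
)

# Collapse the Product values within each ID into one comma-separated string
res <- aggregate(Product ~ ID, data = df1, FUN = toString)
print(res)
```

Within each group the products keep the order in which they appear in df1, which is why B comes out as "Food, Clothing".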

