Convert multiple entries to a single row with a vector column in R
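I have a dataset in which each user is described by multiple rows. I am trying to reshape it so that each row represents one user.
Reproducible example: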
old_way <- data.frame("Day" = 1:10, "Purchase" = 20:29, "Name" = c("John", "John", "John", "Dora", "Dora", "Dora", "Dora", "Gerald", "Gerald", "Gerald"), stringsAsFactors = FALSE)
   Day Purchase   Name
1    1       20   John
2    2       21   John
3    3       22   John
4    4       23   Dora
5    5       24   Dora
6    6       25   Dora
7    7       26   Dora
8    8       27 Gerald
9    9       28 Gerald
10  10       29 Gerald
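The key difference is that each row is now one person, so when I review the records I can easily see on which days they made purchases and what those purchases were.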
desired_way <- data.frame("Name" = c("John","Dora","Gerald"), "Day" = c("1, 2, 3", "4, 5, 6, 7", "8, 9 ,10"), "Purchase" = c("20, 21, 22", "23, 24, 25, 26", "27, 28, 29"), "Last_Day" = c("3", "7", "10"), "Avg_Purchase" = c("21","25","28"))
    Name        Day       Purchase Last_Day Avg_Purchase
1   John    1, 2, 3     20, 21, 22        3           21
2   Dora 4, 5, 6, 7 23, 24, 25, 26        7           25
3 Gerald   8, 9 ,10     27, 28, 29       10           28
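How do I create a cell that encapsulates the information from the other rows? Does R support operations on such a cell, or do I need to compute the last day and the average when I create it?
Thanks in advance, everyone!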
After grouping by 'Name', it is better to store the values in a list column rather than in a single string.
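library(dplyr)
old_way %>%
  group_by(Name) %>%
  # scalar summaries first, while Day and Purchase are still plain vectors
  summarise(Last_Day = last(Day),
            Avg_Purchase = mean(Purchase),
            # then wrap each group's values in a list to get list columns
            Day = list(Day), Purchase = list(Purchase), .groups = 'drop')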
-output
# A tibble: 3 x 5
# Name Last_Day Avg_Purchase Day Purchase
# <chr> <int> <dbl> <list> <list>
#1 Dora 7 24.5 <int [4]> <int [4]>
#2 Gerald 10 28 <int [3]> <int [3]>
#3 John 3 21 <int [3]> <int [3]>
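On the second part of the question (whether R can operate on such a cell afterwards): a list column can be processed one cell at a time, so the summaries do not have to be computed when the cell is created. The following is only a sketch under that assumption. It rebuilds `old_way` from the question, and the names `by_user`, `N_Purchases` and `Day_Text` are made up for illustration.

library(dplyr)

old_way <- data.frame(
  Day = 1:10,
  Purchase = 20:29,
  Name = c("John", "John", "John", "Dora", "Dora", "Dora", "Dora",
           "Gerald", "Gerald", "Gerald"),
  stringsAsFactors = FALSE
)

# Keep only the list columns; nothing is precomputed at this point.
by_user <- old_way %>%
  group_by(Name) %>%
  summarise(Day = list(Day), Purchase = list(Purchase), .groups = "drop")

# Derive per-user values later by applying a function to each list cell.
by_user <- by_user %>%
  mutate(
    Last_Day     = vapply(Day, function(d) d[length(d)], integer(1)),
    Avg_Purchase = vapply(Purchase, mean, numeric(1)),
    N_Purchases  = lengths(Purchase),                    # hypothetical extra column
    Day_Text     = vapply(Day, toString, character(1))   # e.g. "4, 5, 6, 7"
  )

If the comma-separated strings from the question are still wanted for display, `toString()` (as in `Day_Text`) can produce them from the list cells, while the list columns themselves stay available for further calculations.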