How to aggregate a data.table on multiple dimensions
I have a data.table that I want to aggregate by multiple fields. Here is a simplified example of my data:
library(data.table)

# each record is the number of pages read
# by a student in a given day
pages_per_day <- data.table(
  student_id = c(1,1,1,2,2,2),
  week_of_semester = c(1,1,2,1,2,2),
  pages_read = c(8,6,4,7,8,7)
)
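I want to summarize this data by student_id and week, to show the average number of pages each student read in a given week of the semester. I tried the following: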
avg_weekly_pages_read <- grades[,list(
avg_pages = sum(pages_read) / .N,
by = c('student_id','week')
)]
This gave me a two-column data.table with the columns avg_pages and by.
I would like the table to look more like:
student_id, week, avg_pages
1,1,7
1,2,4
2,1,7
2,2,7.5
Any guidance is much appreciated.
You are looking for:
pages_per_day[, .(avg_pages = mean(pages_read)), by = .(student_id, week_of_semester)]
# student_id week_of_semester avg_pages
# 1: 1 1 7.0
# 2: 1 2 4.0
# 3: 2 1 7.0
# 4: 2 2 7.5
By the way, there is no need to reinvent the wheel: R has a built-in mean function, so sum(pages_read) / .N is not needed.
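To see why the attempt in the question returned the columns avg_pages and by, note that by = was written inside list(...), so it was evaluated as one more column of j rather than as the grouping argument of `[`. Here is a minimal sketch of the difference. It assumes the column is called week_of_semester as in the sample data (the question's code used week and a table named grades), and the names wrong and right are just illustrative.

```r
library(data.table)

pages_per_day <- data.table(
  student_id = c(1, 1, 1, 2, 2, 2),
  week_of_semester = c(1, 1, 2, 1, 2, 2),
  pages_read = c(8, 6, 4, 7, 8, 7)
)

# by = sits INSIDE list(): it becomes an ordinary column called "by",
# and the mean is taken over the whole table with no grouping
wrong <- pages_per_day[, list(
  avg_pages = sum(pages_read) / .N,
  by = c("student_id", "week_of_semester")
)]
names(wrong)

# by = sits OUTSIDE list(), as the third argument of `[`:
# the mean is computed once per (student_id, week_of_semester) pair
right <- pages_per_day[, list(avg_pages = mean(pages_read)),
                       by = c("student_id", "week_of_semester")]
right
```

Both by = c("student_id", "week_of_semester") and by = .(student_id, week_of_semester) are accepted here; the character-vector form can be handy when the grouping columns are held in a variable.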
aggregate(pages_read~student_id+week_of_semester,pages_per_day,mean)
#   student_id week_of_semester pages_read
# 1 1 1 7.0
# 2 2 1 7.0
# 3 1 2 4.0
# 4 2 2 7.5
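The two answers list the groups in a different row order, so it may help to check that they agree. The sketch below is one way to do that, under the assumption that sorting both results by student_id and then week_of_semester is acceptable; dt_res and base_res are illustrative names, and keyby is used in place of by only to get the data.table result sorted by the grouping columns.

```r
library(data.table)

pages_per_day <- data.table(
  student_id = c(1, 1, 1, 2, 2, 2),
  week_of_semester = c(1, 1, 2, 1, 2, 2),
  pages_read = c(8, 6, 4, 7, 8, 7)
)

# data.table version; keyby groups like by and also sorts the result
dt_res <- pages_per_day[, .(avg_pages = mean(pages_read)),
                        keyby = .(student_id, week_of_semester)]

# base R version; returns a plain data.frame
base_res <- aggregate(pages_read ~ student_id + week_of_semester,
                      pages_per_day, mean)

# put the base R rows in the same order before comparing
base_res <- base_res[order(base_res$student_id, base_res$week_of_semester), ]

# TRUE when both approaches give the same group means
isTRUE(all.equal(dt_res$avg_pages, base_res$pages_read))
```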