Calculating values across different subsets in R / dplyr
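I have a dataset with many values submitted by individual IDs, and these values are organized into subsets. For each ID, I want to calculate a value = mean of the ID's scores / mean of the subset's scores. I have tried many options using group_by(), summarize(), and spread(), but could not get it to work.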
library(dplyr)
df <- data.frame(stringsAsFactors=FALSE,
  Subset = c("A","B","C","D","A","B","C","D","A","B","C","D"),
  ID = c(1,2,3,4,5,3,1,5,2,3,4,1),
  score = c(123,42,564,234,123,345,6678,87,543,121,123,55))
averages <-
  df %>%
  group_by(Subset) %>%
  summarise(mean.subs = mean(score)) %>%
  ungroup() %>%
  group_by(ID) %>%
  summarise(mean.id = mean(score) / mean.subs)
Any help would be greatly appreciated.
I think you want to use mutate instead of summarize:
library(dplyr)
df %>%
  dplyr::group_by(Subset) %>%
  dplyr::mutate(mean.subs = mean(score)) %>%
  dplyr::ungroup() %>%
  dplyr::group_by(ID) %>%
  dplyr::mutate(mean.id = mean(score)) %>%
  dplyr::rowwise() %>%
  dplyr::mutate(value = mean.id / mean.subs)
   Subset    ID score mean.subs mean.id  value
   <chr>  <dbl> <dbl>     <dbl>   <dbl>  <dbl>
 1 A          1   123      263    2285. 8.69
 2 B          2    42      169.    292. 1.73
 3 C          3   564     2455     343. 0.140
 4 D          4   234      125.    178. 1.42
 5 A          5   123      263     105  0.399
 6 B          3   345      169.    343. 2.03
 7 C          1  6678     2455    2285. 0.931
 8 D          5    87      125.    105  0.838
 9 A          2   543      263     292. 1.11
10 B          3   121      169.    343. 2.03
11 C          4   123     2455     178. 0.0727
12 D          1    55      125.   2285. 18.2
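Note that rowwise is not strictly required here: / is vectorized, so the division is already carried out row by row. You may also want to add one more pipe to ungroup at the end to ungroup your output.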
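For reference, here is a minimal sketch of the same pipeline with the final ungroup added and rowwise left out. It assumes the df from the question; the name result is only illustrative. Because mean.id and mean.subs are ordinary columns by the last step, a plain vectorized division should give the same value column as above.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  Subset = c("A","B","C","D","A","B","C","D","A","B","C","D"),
  ID     = c(1,2,3,4,5,3,1,5,2,3,4,1),
  score  = c(123,42,564,234,123,345,6678,87,543,121,123,55),
  stringsAsFactors = FALSE
)

# `result` is an illustrative name, not from the original post
result <- df %>%
  group_by(Subset) %>%
  mutate(mean.subs = mean(score)) %>%  # per-subset mean, repeated on every row of the subset
  group_by(ID) %>%                     # group_by() replaces the previous grouping by default
  mutate(mean.id = mean(score)) %>%    # per-ID mean, repeated on every row of the ID
  ungroup() %>%                        # drop the grouping before the final step
  mutate(value = mean.id / mean.subs)  # vectorized division, one value per row

print(result)
```

mutate keeps all 12 rows, which is why the per-group means can be combined afterwards; summarise would collapse the data to one row per group and drop the score and ID columns.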