How to calculate a percentage of one variable based on the frequency of another in R
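I'm not sure how to phrase this, so I'll show what I want in code. I'm trying to determine how accurate certain people are on a task that is attempted many times, and there are a lot of cases.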
names <- c("James", "James", "James", "James", "James", "John", "John", "Fred")
outcome <- c("successful", "unsuccessful", "unsuccessful", "successful", "successful", "successful",
"unsuccessful", "unsuccessful")
accuracy <- c("60%", "60%", "60%", "60%", "60%", "50%", "50%", "0%")
df <- data.frame(names, outcome, accuracy)
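In the example above I entered the data by hand, but I'd like to know how to write code that counts how often the successful/unsuccessful outcomes occur for each name, and then fills the accuracy column with the percentage of that person's attempts that were successful.
I'm not sure where to start, and I'm hoping there's a simple solution I haven't thought of!
Thanks in advance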
Using ave.
df$accuracy <- NULL
df <- transform(df, accuracy=ave(outcome %in% "successful", names,
FUN=function(x) paste0(sum(x)/length(x)*100, "%")))
df
# names outcome accuracy
# 1 James successful 60%
# 2 James unsuccessful 60%
# 3 James unsuccessful 60%
# 4 James successful 60%
# 5 James successful 60%
# 6 John successful 50%
# 7 John unsuccessful 50%
# 8 Fred unsuccessful 0%
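If the percentage is needed for further calculation, it may be easier to keep it numeric and add the "%" sign only when printing. The following is a minimal sketch along those lines, using only base R; the column name accuracy_pct and the object per_person are my own choices, not part of the question.

```r
# Rebuild the example data from the question (without the hand-typed accuracy column)
names <- c("James", "James", "James", "James", "James", "John", "John", "Fred")
outcome <- c("successful", "unsuccessful", "unsuccessful", "successful", "successful",
             "successful", "unsuccessful", "unsuccessful")
df <- data.frame(names, outcome)

# Per-person share of successful rows, stored as a number between 0 and 100.
# The comparison gives TRUE/FALSE, as.numeric() turns that into 1/0,
# and the group mean of 1/0 values is the proportion of successes.
df$accuracy_pct <- 100 * ave(as.numeric(df$outcome == "successful"), df$names, FUN = mean)

# One value per person instead of one per attempt
per_person <- 100 * tapply(df$outcome == "successful", df$names, mean)

print(df)
print(per_person)
```

tapply returns one entry per name, so per_person is handy for a summary table, while accuracy_pct keeps the value repeated on every row as in the question.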
Does this help?
> library(dplyr)
> df %>% group_by(names) %>% mutate(accuracy = paste0(100 * sum(outcome == 'successful')/n(),'%'))
# A tibble: 8 x 3
# Groups: names [3]
names outcome accuracy
<chr> <chr> <chr>
1 James successful 60%
2 James unsuccessful 60%
3 James unsuccessful 60%
4 James successful 60%
5 James successful 60%
6 John successful 50%
7 John unsuccessful 50%
8 Fred unsuccessful 0%
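One thing to watch with paste0 is that shares such as 1/3 are pasted with a long run of decimals. A possible way around that, sketched here in base R with made-up data (the names Ann and Bob are hypothetical, chosen only to produce a non-round share), is to format the number with sprintf:

```r
# Hypothetical data chosen so that one share is not a round number (1/3)
names <- c("Ann", "Ann", "Ann", "Bob")
outcome <- c("successful", "unsuccessful", "unsuccessful", "successful")
df <- data.frame(names, outcome)

# ave() defaults to FUN = mean, giving the per-person share of successes
share <- ave(as.numeric(df$outcome == "successful"), df$names)

# Format with one decimal place; "%%" produces a literal percent sign
df$accuracy <- sprintf("%.1f%%", 100 * share)

print(df)
```

The number of decimals is a matter of taste; changing "%.1f" to "%.0f" would round to whole percentages.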