R: select rows based on an if/else statement
I can't subset my data set based on an if/else condition.
Here is an example of my data frame:
mydf<-data.frame(chemical=c("Cd","Cd","Cd","Cd","Pb","Pb"),species=c("a","a","a","a","b","d"),scores=c(0,1,2,3,0,0))
For each chemical and species I need to select: if any scores > 0, the row with the smallest score above 0; otherwise the row with a score of 0.
I can get the minimum score, but I can't seem to add the if/else statement successfully.
ddply(mydf,.(chemical,species),function(x) x[which.min(x$scores),])
The final table should look like this:
chemical species scores
1 Cd a 1
2 Pb b 0
3 Pb d 0
Here is a working solution that follows the OP's original logic; it may not be the most elegant code.
plyr
ddply(mydf,.(chemical,species),
function(x) x[if(any(x$scores != 0)) {which.min(replace(x$scores, x$scores == 0, NA))} else which(x$scores == 0),])
dplyr
mydf %>%
group_by(chemical, species) %>%
do(.[if(any(.$scores != 0)) {which.min(replace(.$scores, .$scores == 0, NA))} else which(.$scores == 0),])
The if/else logic unpacked
# If any of the values is not equal to 0
if(any(.$scores != 0))
# Find the index of the smallest value in a vector where 0 has been replaced by NA
{which.min(replace(.$scores, .$scores == 0, NA))}
# Else find the indices of the values equal to 0
else which(.$scores == 0)
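To see what that index logic returns, here is a minimal base R sketch that applies it to plain vectors. The helper name pick_index and the two example vectors are hypothetical and only serve as illustration; the vectors mirror the Cd/a and Pb/b groups from the question.

```r
# Hypothetical helper: return the index (or indices) to keep for one group
pick_index <- function(scores) {
  if (any(scores != 0)) {
    # Mask zeros with NA so that which.min() skips them
    which.min(replace(scores, scores == 0, NA))
  } else {
    # Every score is zero: keep all zero positions
    which(scores == 0)
  }
}

cd_a <- c(0, 1, 2, 3)  # scores for chemical "Cd", species "a"
pb_b <- 0              # scores for chemical "Pb", species "b"

pick_index(cd_a)  # 2, the position of the score 1
pick_index(pb_b)  # 1, the position of the only zero
```

Note that a group containing several zeros and nothing else keeps all of its zero rows, because which() returns every matching position.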
This should do what you want:
library(tidyverse)
mydf %>%
group_by(chemical, species) %>%
mutate(zero = max(scores) == 0) %>%
filter((scores == 0 & zero) | (scores > 0 & !zero)) %>%
arrange(chemical, species, scores) %>%
distinct(chemical, species, .keep_all = TRUE) %>%
select(-zero)
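The same sort-then-keep-first idea can be sketched in base R without a helper column. This is only an illustration under the assumption that scores are never negative; the variable names ordered and result are made up for the example.

```r
mydf <- data.frame(
  chemical = c("Cd", "Cd", "Cd", "Cd", "Pb", "Pb"),
  species  = c("a", "a", "a", "a", "b", "d"),
  scores   = c(0, 1, 2, 3, 0, 0)
)

# Sort within each chemical/species pair so that non-zero scores come first
# (FALSE sorts before TRUE), smallest first; zero rows sink to the end
ordered <- mydf[order(mydf$chemical, mydf$species,
                      mydf$scores == 0, mydf$scores), ]

# Keep only the first row of each chemical/species pair
result <- ordered[!duplicated(ordered[c("chemical", "species")]), ]
rownames(result) <- NULL
result
```

Because whole rows are kept, any additional columns in the data frame would survive the selection.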
mydf %>%
group_by(chemical, species) %>%
summarize(scores = ifelse(any(scores > 0), min(scores[scores>0]), 0))
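For readers without dplyr, a rough base R counterpart of this per-group summary can be written with aggregate(). The helper name min_positive_or_zero is hypothetical, and the sketch assumes that a group with no positive score should report 0.

```r
mydf <- data.frame(
  chemical = c("Cd", "Cd", "Cd", "Cd", "Pb", "Pb"),
  species  = c("a", "a", "a", "a", "b", "d"),
  scores   = c(0, 1, 2, 3, 0, 0)
)

# Hypothetical helper: smallest positive score, or 0 if there is none
min_positive_or_zero <- function(s) {
  if (any(s > 0)) min(s[s > 0]) else 0
}

# One summary value per chemical/species combination present in the data
result <- aggregate(scores ~ chemical + species, data = mydf,
                    FUN = min_positive_or_zero)
result <- result[order(result$chemical, result$species), ]
rownames(result) <- NULL
result
```

Like the summarize() version, this returns only the grouping columns and the summarised score, not the original rows.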
I don't know whether this is any faster, but for fun, you could also do this
mydf %>%
group_by(chemical, species) %>%
summarize(scores = min(scores[scores > 0 | all(scores == 0)]))