Replacing NA depending on the distribution type, by gender, for all variables at once in R
In an earlier question, I asked how to replace NA depending on the distribution type.
Lstat's solution works great:
library(dplyr)
data %>%
  group_by(sex) %>%
  mutate(
    emotion = ifelse(!is.na(emotion), emotion,
      ifelse(shapiro.test(emotion)$p.value > 0.05,
        mean(emotion, na.rm=TRUE), quantile(emotion, na.rm=TRUE, probs=0.5))),
    IQ = ifelse(!is.na(IQ), IQ,
      ifelse(shapiro.test(IQ)$p.value > 0.05,
        mean(IQ, na.rm=TRUE), quantile(IQ, na.rm=TRUE, probs=0.5)))
  )
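But what if I have 20 or more variables? How can I make this code apply to all variables at once? That is, I don't want to write a separate line for each one:
var1=ifelse
var2=ifelse
...
var20=ifelse
Here is the data: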
data=structure(list(sex = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), emotion = c(20L,
15L, 49L, NA, 34L, 35L, 54L, 45L), IQ = c(101L, 98L, 105L, NA,
123L, 120L, 115L, NA)), .Names = c("sex", "emotion", "IQ"), class = "data.frame", row.names = c(NA,
-8L))
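You can consider using dplyr::mutate_at to apply the same function to multiple columns.
Suppose you want to apply the same function to the emotion and IQ columns. The solution can then be written as: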
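library(dplyr)
# Within each sex group, keep observed values; fill NA with the group mean if
# shapiro.test() does not reject normality (p > 0.05), otherwise with the median.
# Note: funs() is deprecated as of dplyr 0.8.0.
data %>%
  group_by(sex) %>%
  mutate_at(vars(c("emotion", "IQ")),
    funs(ifelse(!is.na(.), ., ifelse(shapiro.test(.)$p.value > 0.05,
      mean(., na.rm=TRUE), quantile(., na.rm=TRUE, probs=0.5)))))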
# # A tibble: 8 x 3
# # Groups:   sex [2]
#     sex emotion    IQ
#   <int>   <dbl> <dbl>
# 1     1    20.0 101
# 2     1    15.0  98.0
# 3     1    49.0 105
# 4     1    28.0 101
# 5     2    34.0 123
# 6     2    35.0 120
# 7     2    54.0 115
# 8     2    45.0 119
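If you are on dplyr 1.0.0 or later, the same idea can probably be written with across(), which avoids listing the 20 columns by name. The following is only a sketch under some assumptions: impute_by_normality is a hypothetical helper name (not part of dplyr), every numeric column other than the grouping column should be imputed, and falling back to the median when shapiro.test() cannot be run (fewer than 3 observed values, or all values identical) is my own choice rather than something from the question.

```r
library(dplyr)

# Example data from the question.
data <- data.frame(
  sex     = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  emotion = c(20L, 15L, 49L, NA, 34L, 35L, 54L, 45L),
  IQ      = c(101L, 98L, 105L, NA, 123L, 120L, 115L, NA)
)

# Hypothetical helper: fill NA with the mean when the Shapiro-Wilk test does
# not reject normality at `alpha`, otherwise with the median.
impute_by_normality <- function(x, alpha = 0.05) {
  x <- as.numeric(x)
  if (!anyNA(x)) return(x)
  observed <- x[!is.na(x)]
  # shapiro.test() raises an error for fewer than 3 values or constant data;
  # in that case p stays NA and the median is used (an assumption of this sketch).
  p <- tryCatch(shapiro.test(observed)$p.value, error = function(e) NA_real_)
  fill <- if (!is.na(p) && p > alpha) mean(observed) else median(observed)
  x[is.na(x)] <- fill
  x
}

# across() skips the grouping column (sex) automatically inside a grouped mutate().
result <- data %>%
  group_by(sex) %>%
  mutate(across(where(is.numeric), impute_by_normality)) %>%
  ungroup()

print(result)
```

Because the helper runs the test once per group and column, it also avoids calling shapiro.test() on columns that have nothing to impute. If only some columns should be handled, where(is.numeric) can be swapped for an explicit selection such as c(emotion, IQ).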