Selecting rows with row means greater than the overall mean of a data frame

Question

Can we do this without dplyr? I want to select the rows whose row mean is greater than the overall mean of the data frame.

I tried the following, but it does not work.

tf12 <- apply(tf11, 2, function(x) filter(rowMeans(x) > mean(x)))

It gives the following error.

Error in rowMeans(x) : 'x' must be an array of at least two dimensions

Answer 1

We can unlist the data frame to compute the mean of all its values, and then compare that with rowMeans:

tf11[rowMeans(tf11) > mean(unlist(tf11)), ]

If the data frame contains NA values, use na.rm = TRUE in both mean and rowMeans.
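As a minimal sketch of the NA case: the data frame df_na below is made up for illustration and is not from the question. Without na.rm = TRUE the overall mean would be NA, so the comparison would not give a usable logical index.

```r
# Hypothetical data frame with one missing value (illustration only)
df_na <- data.frame(a = c(1, NA, 3, 4), b = c(5, 6, 7, 8))

# Overall mean of all non-missing values in the data frame
overall <- mean(unlist(df_na), na.rm = TRUE)

# Mean of each row, ignoring missing values
row_avg <- rowMeans(df_na, na.rm = TRUE)

# Keep the rows whose mean exceeds the overall mean
result <- df_na[row_avg > overall, ]
```

Here the second row is judged on its single non-missing value, which may or may not be what you want for rows that are mostly NA.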

Consider an example:

df <- data.frame(a = 1:10, b = 11:20)
df[rowMeans(df) > mean(unlist(df)), ]

#    a  b
#6   6 16
#7   7 17
#8   8 18
#9   9 19
#10 10 20
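As for why the original attempt fails, a likely reading is that apply with a margin of 2 passes each column to the function as a plain vector, and rowMeans needs at least two dimensions. Also, without dplyr loaded, filter resolves to stats::filter, which is a time-series filter rather than a row selector. The sketch below reuses the example df; the names col_has_no_dim, keep and subset_rows are only illustrative.

```r
df <- data.frame(a = 1:10, b = 11:20)

# With MARGIN = 2, each column reaches the function as a vector with no
# dim attribute, which is why rowMeans() complains inside apply()
col_has_no_dim <- apply(df, 2, function(x) is.null(dim(x)))

# Working on the whole data frame keeps both dimensions, so no apply() is needed
keep <- rowMeans(df) > mean(unlist(df))
subset_rows <- df[keep, ]
```

Because rowMeans is already vectorised over rows, the logical vector keep can index the data frame directly.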
