Get mean of a single variable with dplyr

Question

I'm working with the diamonds dataset and trying to find the mean price for each cut. I thought this would work:

diamonds_data %>%
  filter(Cut == 'Ideal') %>%
  mean(Price)

But I get the following warning message:

[1] NA
Warning message:
In mean.default(., diamonds_data, Price) :
  argument is not numeric or logical: returning NA

Answer 1

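You can't call mean directly on a data frame. The pipe passes the whole filtered data frame to mean as its first argument, which is not numeric, so the result is NA. To get the values of a single column as a vector, use pull to extract that column from the data frame first.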

diamonds_data %>% 
  filter(Cut == "Ideal") %>% 
  pull(Price) %>% 
  mean()
# [1] 3457.542
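
As a rough illustration of the point above, here is a minimal sketch on a made-up three-row tibble. The name toy and its values are hypothetical stand-ins for diamonds_data, not part of the question's data. It shows two ways to hand mean a numeric column rather than a data frame: summarise, which gives back a one-row tibble, and pull, which gives back a plain number.

```r
library(dplyr)

# Hypothetical toy data standing in for diamonds_data (assumed columns: Cut, Price)
toy <- tibble(
  Cut   = c("Ideal", "Ideal", "Fair"),
  Price = c(100, 300, 500)
)

# x %>% f(y) is evaluated as f(x, y), so toy %>% mean(Price) would call
# mean() with a data frame as its first argument, which is why it returns NA.

# Option 1: summarise() evaluates mean(Price) against the columns
# and returns a one-row tibble
ideal_summary <- toy %>%
  filter(Cut == "Ideal") %>%
  summarise(mean_price = mean(Price))

# Option 2: pull() extracts the column as a numeric vector,
# so mean() returns a single number
ideal_mean <- toy %>%
  filter(Cut == "Ideal") %>%
  pull(Price) %>%
  mean()
```

Which one to use mostly depends on whether the next step expects a data frame or a bare value.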

Answer 2

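To make your code work, try mean(.$price). Wrapping the call in braces stops %>% from also passing the data frame as the first argument to mean.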

diamonds %>%
  filter(cut == 'Ideal') %>%
  {mean(.$price)}

# [1] 3457.542

A better option is to calculate the mean price for every cut at once and assign the summary table to an object.

price <- diamonds %>%
  group_by(cut) %>%
  summarise(mean_price = mean(price))

# # A tibble: 5 x 2
#   cut       mean_price
#   <ord>          <dbl>
# 1 Fair           4359.
# 2 Good           3929.
# 3 Very Good      3982.
# 4 Premium        4584.
# 5 Ideal          3458.

When you need a particular value, extract it from the table.

price$mean_price[price$cut == "Ideal"]

# [1] 3457.542
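
If you would rather stay within dplyr for the lookup as well, the same extraction could be sketched with filter and pull. This assumes the diamonds dataset comes from ggplot2, and the name ideal_price is only illustrative.

```r
library(dplyr)
library(ggplot2)  # assumed source of the diamonds dataset

# Summary table of mean price per cut, as in the answer above
price <- diamonds %>%
  group_by(cut) %>%
  summarise(mean_price = mean(price))

# Keep the row for the cut of interest, then extract the value as a number
ideal_price <- price %>%
  filter(cut == "Ideal") %>%
  pull(mean_price)
```

This should give the same value as the base R indexing above; it is purely a matter of style.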
