Group by and summarise
I want to group by three variables and create new variables with summary functions.
My code:
Option 1
library(tidyverse)
library(dplyr)
example2<-example%>%
group_by(age_cohort,sex,city)%>%
summarise(rich=sum(rich),
middleclass=sum(middleclass),
poor=sum(poor),
population=count(id))
I don't understand the error:
Error in `summarise()`:
! Problem while computing `population = count(id)`.
i The error occurred in group 1: age_cohort = 1, sex = 0, city = 1.
Caused by error in `UseMethod()`:
! no applicable method for 'count' applied to an object of class "c('double', 'numeric')"
Run `rlang::last_error()` to see where the error occurred.
Option 2
example3<-example%>%
group_by(age_cohort,sex,city)%>%
summarise(rich=sum(rich),
middleclass=sum(middleclass),
poor=sum(poor),
population=n(id))
Error:
Error in `summarise()`:
! Problem while computing `population = n(id)`.
i The error occurred in group 1: age_cohort = 1, sex = 0, city = 1.
Caused by error in `n()`:
! unused argument (id)
Run `rlang::last_error()` to see where the error occurred.
Also, if I remove the 'population' variable, my code still fails.
New code
example<-example%>%
group_by(age_cohort,sex,city)%>%
summarise(rich=sum(rich),
middleclass=sum(middleclass),
poor=sum(poor))
Error:
Error in UseMethod("group_by") :
no applicable method for 'group_by' applied to an object of class "function"
Original data (example):
id sex city rich middleclass poor age_cohort
 1   0    1    1           0    0          1
 2   1    1    0           1    0          5
 3   1    2    0           0    1          2
 4   0    2    0           0    1          3
 5   1    3    0           0    1          4
 6   0    4    0           1    0          1
 7   0    6    0           1    0          1
 8   1    7    1           0    0          5
 9   0    3    1           0    0          5
10   1    7    0           1   NA          5
11   1    3    0           0    1          2
12   1    1    0           0    1          3
As akrun said, you need population=n().
library(dplyr)

# Rebuild the example data; the blank `poor` cell for id 10 is entered as NA
id <- 1:12
sex <- c(0,1,1,0,1,0,0,1,0,1,1,1)
city <- c(1,1,2,2,3,4,6,7,3,7,3,1)
rich <- c(1,0,0,0,0,0,0,1,1,0,0,0)
middleclass <- c(0,1,0,0,0,1,1,0,0,1,0,0)
poor <- c(0,0,1,1,1,0,0,0,0,NA,1,1)
age_cohort <- c(1,5,2,3,4,1,1,5,5,5,2,3)
example <- data.frame(id, sex, city, rich, middleclass, poor, age_cohort)

# n() takes no arguments and returns the number of rows in each group
example3 <- example %>%
  group_by(age_cohort, sex, city) %>%
  summarise(rich = sum(rich),
            middleclass = sum(middleclass),
            poor = sum(poor),
            population = n())
Output
> example
id sex city rich middleclass poor age_cohort
1 1 0 1 1 0 0 1
2 2 1 1 0 1 0 5
3 3 1 2 0 0 1 2
4 4 0 2 0 0 1 3
5 5 1 3 0 0 1 4
6 6 0 4 0 1 0 1
7 7 0 6 0 1 0 1
8 8 1 7 1 0 0 5
9 9 0 3 1 0 0 5
10 10 1 7 0 1 NA 5
11 11 1 3 0 0 1 2
12 12 1 1 0 0 1 3
> example3
# A tibble: 11 x 7
# Groups: age_cohort, sex [7]
age_cohort sex city rich middleclass poor population
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
1 1 0 1 1 0 0 1
2 1 0 4 0 1 0 1
3 1 0 6 0 1 0 1
4 2 1 2 0 0 1 1
5 2 1 3 0 0 1 1
6 3 0 2 0 0 1 1
7 3 1 1 0 0 1 1
8 4 1 3 0 0 1 1
9 5 0 3 1 0 0 1
10 5 1 1 0 1 0 1
11 5 1 7 1 1 NA 2
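The last row of the output above shows poor = NA because one of the two rows in that group has a missing value, and the header shows the result is still grouped by age_cohort and sex. If that is not what you want, a possible variation (a sketch only; the name example4 is made up here, and whether NA should be ignored depends on your data) is to pass na.rm = TRUE to sum() and .groups = "drop" to summarise():

```r
library(dplyr)

# Same example data as in the answer
example <- data.frame(
  id          = 1:12,
  sex         = c(0,1,1,0,1,0,0,1,0,1,1,1),
  city        = c(1,1,2,2,3,4,6,7,3,7,3,1),
  rich        = c(1,0,0,0,0,0,0,1,1,0,0,0),
  middleclass = c(0,1,0,0,0,1,1,0,0,1,0,0),
  poor        = c(0,0,1,1,1,0,0,0,0,NA,1,1),
  age_cohort  = c(1,5,2,3,4,1,1,5,5,5,2,3)
)

# example4 is a hypothetical name used only for this illustration
example4 <- example %>%
  group_by(age_cohort, sex, city) %>%
  summarise(rich = sum(rich),
            middleclass = sum(middleclass),
            poor = sum(poor, na.rm = TRUE),  # skip missing values when summing
            population = n(),                # rows per group
            .groups = "drop")                # return an ungrouped tibble
```

Note that with na.rm = TRUE a missing value is silently treated as zero in the sum, so only use it if that is an acceptable assumption for your analysis.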
Why the errors occur
As others pointed out in the comments:
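The first error occurs because count() works on a data frame and column names; it cannot be used as a summary function inside summarise(). A valid call is, for example, count(example, sex). You gave count() a numeric vector (an object of class "c('double', 'numeric')"), for which it has no method (no applicable method for 'count' applied to...).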
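To illustrate how count() is meant to be called, here is a small sketch using two columns of the example data. The names by_count and by_n are made up for this illustration. It shows that count() on a data frame gives the same counts as group_by() followed by summarise(n = n()):

```r
library(dplyr)

# Two columns of the example data are enough here
example <- data.frame(
  sex  = c(0,1,1,0,1,0,0,1,0,1,1,1),
  city = c(1,1,2,2,3,4,6,7,3,7,3,1)
)

# count() takes the data frame first, then the column(s) to count by
by_count <- count(example, sex)

# Long form of the same operation: group, then count rows with n()
by_n <- example %>%
  group_by(sex) %>%
  summarise(n = n())
```

Both results have one row per value of sex and a column n holding the number of rows.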
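The second error occurs because n() takes no arguments: it returns the number of rows in the current group (see ?context), and the groups are already defined by group_by(). This time you passed it an argument it does not accept, so R reports unused argument (id).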
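If the intent behind n(id) was to count ids rather than rows, the following sketch shows a few options side by side. The column names population, distinct_ids and non_missing, and the object per_city, are made up for this illustration; in the example data every id is unique and present, so all three give the same numbers:

```r
library(dplyr)

example <- data.frame(
  id   = 1:12,
  city = c(1,1,2,2,3,4,6,7,3,7,3,1)
)

per_city <- example %>%
  group_by(city) %>%
  summarise(
    population   = n(),             # number of rows in the current group
    distinct_ids = n_distinct(id),  # number of unique id values in the group
    non_missing  = sum(!is.na(id))  # number of rows whose id is not NA
  )
```

n() is the right choice when each row is one person; n_distinct(id) would matter only if the same id could appear on several rows.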
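The last error occurs because the object example had not been created in your environment before you called group_by(). example is also the name of a function in utils (see ?example), so if you have not created an object with that name, R assumes you mean the function example(). You then tried to group it, which R cannot do because group_by() works on data frames, not functions. It expected a data frame, but you gave it an object of class "function".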
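A quick way to check what a name refers to before piping it into group_by() is to look at its class. This is only a sketch of that check, using a tiny made-up data frame:

```r
# With no object called `example` in your workspace, the name resolves
# to the function utils::example()
class(utils::example)   # "function"

# Once you create the data frame, your object masks the function
example <- data.frame(id = 1:3, sex = c(0, 1, 1))
class(example)          # "data.frame"
is.data.frame(example)  # TRUE
```

Choosing a name that does not clash with an existing function (for instance survey instead of example) makes this kind of mistake easier to spot, because R then reports that the object was not found.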