R: Plot aggregated frequency depending on chosen category-criterion
I have a data set with one column "Exposure", one column "number of events", and several columns that label different category types.
Exposure <- c(10, 2.1, 2.8, 4.5, 21)
NoEvents <- c(1, 0, 2, 0, 0)
Cat1 <- as.factor(c("A", "A", "B", "A", "B"))
Cat2 <- as.factor(c("X", "Y", "Y", "Y", "X"))
Cat3 <- as.factor(c("u", "v", "u", "w", "w"))
dataTest <- data.frame(Exposure, NoEvents, Cat1, Cat2, Cat3)
dataTest
  Exposure NoEvents Cat1 Cat2 Cat3
1     10.0        1    A    X    u
2      2.1        0    A    Y    v
3      2.8        2    B    Y    u
4      4.5        0    A    Y    w
5     21.0        0    B    X    w
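I now want to (flexibly) calculate and plot the frequency (NoEvents/Exposure) aggregated by a chosen category type: Cat1, Cat2 or Cat3. For a fixed category column, e.g. Cat1, I can define a function as follows: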
freq_Cat <- function(data, Cat1){
  data_aggr <- aggregate(. ~ Cat1, data[, c("Exposure", "NoEvents", "Cat1")], sum)
  data_aggr[, "frequency"] <- data_aggr$NoEvents / data_aggr$Exposure
  return(data_aggr)
}
and then plot it with
library(ggplot2)
ggplot(freq_Cat(dataTest, Cat1), aes(x = Cat1, y = frequency)) +
  geom_bar(stat = "identity", fill = "dodgerblue", col = "black")
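I would like to make the function freq_Cat and the plot more flexible, so that I can choose which category type/column (Cat1, Cat2 or Cat3) to aggregate by, rather than copy-pasting the code and replacing Cat1 with another column.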
If you pass the category as a number, you can compute the name of the variable to use.
freq_Cat <- function(data, Cat){
  Var = paste("Cat", Cat, sep="")
  Form = formula(paste(". ~", Var))
  data_aggr <- aggregate(Form, data[, c("Exposure", "NoEvents", Var)], sum)
  data_aggr[, "frequency"] <- data_aggr$NoEvents / data_aggr$Exposure
  return(data_aggr)
}
But now, instead of calling it as before, you would use
freq_Cat(dataTest, Cat=1)
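As a quick check on what the numeric variant returns, here is a self-contained sketch that rebuilds the sample data from the question and the function above, then calls it for Cat1 and Cat2. The sums in the comments are worked out by hand from the five sample rows; the variable names `by_cat1` and `by_cat2` are only illustrative.

```r
# Sample data from the question.
dataTest <- data.frame(
  Exposure = c(10, 2.1, 2.8, 4.5, 21),
  NoEvents = c(1, 0, 2, 0, 0),
  Cat1 = factor(c("A", "A", "B", "A", "B")),
  Cat2 = factor(c("X", "Y", "Y", "Y", "X")),
  Cat3 = factor(c("u", "v", "u", "w", "w"))
)

# Numeric variant: Cat = 1 selects the column "Cat1", Cat = 2 selects "Cat2", ...
freq_Cat <- function(data, Cat) {
  Var <- paste("Cat", Cat, sep = "")
  Form <- formula(paste(". ~", Var))
  data_aggr <- aggregate(Form, data[, c("Exposure", "NoEvents", Var)], sum)
  data_aggr[, "frequency"] <- data_aggr$NoEvents / data_aggr$Exposure
  data_aggr
}

by_cat1 <- freq_Cat(dataTest, Cat = 1)
# Cat1 = A: Exposure 10 + 2.1 + 4.5 = 16.6, NoEvents 1, frequency 1 / 16.6
# Cat1 = B: Exposure 2.8 + 21      = 23.8, NoEvents 2, frequency 2 / 23.8
print(by_cat1)

by_cat2 <- freq_Cat(dataTest, Cat = 2)
# Cat2 = X: Exposure 10 + 21         = 31,  NoEvents 1, frequency 1 / 31
# Cat2 = Y: Exposure 2.1 + 2.8 + 4.5 = 9.4, NoEvents 2, frequency 2 / 9.4
print(by_cat2)
```

Note that this variant only works when the category columns share the prefix "Cat" followed by a number.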
Or, if you would rather refer to the category variable by name, you can use:
freq_Cat <- function(data, Cat){
  Form = formula(paste(". ~", Cat))
  data_aggr <- aggregate(Form, data[, c("Exposure", "NoEvents", Cat)], sum)
  data_aggr[, "frequency"] <- data_aggr$NoEvents / data_aggr$Exposure
  return(data_aggr)
}
and then call the function with the column name as a string:
freq_Cat(dataTest, Cat="Cat1")
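The question also asks for the plot to follow the chosen column. One possible way to do that, sketched here under assumptions, is to hand the same column name string to ggplot2 through the `.data` pronoun. The wrapper name `plot_freq_Cat` is hypothetical and not part of the original answer; the sketch reuses the name-based `freq_Cat` above and assumes a reasonably recent ggplot2 that supports `.data` inside `aes()`.

```r
# Sample data from the question.
dataTest <- data.frame(
  Exposure = c(10, 2.1, 2.8, 4.5, 21),
  NoEvents = c(1, 0, 2, 0, 0),
  Cat1 = factor(c("A", "A", "B", "A", "B")),
  Cat2 = factor(c("X", "Y", "Y", "Y", "X")),
  Cat3 = factor(c("u", "v", "u", "w", "w"))
)

# Name-based variant from the answer: Cat is a column name given as a string.
freq_Cat <- function(data, Cat) {
  Form <- formula(paste(". ~", Cat))
  data_aggr <- aggregate(Form, data[, c("Exposure", "NoEvents", Cat)], sum)
  data_aggr[, "frequency"] <- data_aggr$NoEvents / data_aggr$Exposure
  data_aggr
}

# Hypothetical wrapper (an assumption, not from the original answer):
# aggregate by the chosen column, then plot frequency per category level.
plot_freq_Cat <- function(data, Cat) {
  data_aggr <- freq_Cat(data, Cat)
  ggplot2::ggplot(data_aggr, ggplot2::aes(x = .data[[Cat]], y = frequency)) +
    ggplot2::geom_bar(stat = "identity", fill = "dodgerblue", col = "black") +
    ggplot2::labs(x = Cat)  # label the axis with the column name
}

# Only draw the plot if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- plot_freq_Cat(dataTest, "Cat2")
  print(p)
}
```

Switching to another category is then a matter of changing the string, e.g. `plot_freq_Cat(dataTest, "Cat3")`.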