How to make a beanplot for only a subset of the data
I have a data frame that looks like this:
    stream n rates     means column      value truevalue
1   Brooks 3   3.0 0.9629152      1 0.42707006 0.9440620
2  Siouxon 3   3.0 0.5831929      1 0.90503736 0.5858527
3 Speelyai 3   3.0 0.6199235      1 0.08554021 0.5839844
4   Brooks 4   7.5 0.9722707      1 1.43338843 0.9440620
5  Siouxon 4   7.5 0.5865031      1 0.50574543 0.5858527
6 Speelyai 4   7.5 0.6118634      1 0.32252396 0.5839844
7   Brooks 5  10.0 0.9637475      1 0.88984211 0.9440620
8  Siouxon 5  10.0 0.5804420      1 0.47501800 0.5858527
9 Speelyai 5  10.0 0.5959238      1 0.15079491 0.5839844
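and it continues for 56,000 rows. I am trying to make beanplots, and I want 3 different ones, one for each stream. I would rather not subset this data frame into 3 new, separate data frames. Is there a way to specify that you want the beanplot for stream=="Brooks"?
Here is what I have: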
beanplot(error~rates, data= result, col=c("orange", "black", "white", "red"), border ="pink", what=c(0,1,1,1), maxstripline=.05)
This works, but it makes one beanplot for all of the data. I tried this, to no avail:
beanplot(error~rates, data= result[stream=="Speelyai"], col=c("orange", "black", "white", "red"), border ="pink", what=c(0,1,1,1), maxstripline=.05)
Try this:
beanplot(error~rates, data= result[result$stream=="Speelyai", ], col=c("orange", "black", "white", "red"), border ="pink", what=c(0,1,1,1), maxstripline=.05)
I think this should do it:
beanplot(error~rates, data= result[result[,"stream"]=="Speelyai",], col=c("orange", "black", "white", "red"), border ="pink", what=c(0,1,1,1), maxstripline=.05)
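Both answers use the same idea: index the data frame with a logical row vector and leave the column slot after the comma empty. The original attempt fails because, for a plain data frame, a bare `stream` is not looked up inside `result`, and a single index without a comma selects columns rather than rows. Here is a minimal sketch on a made-up toy data frame. The `error` values are invented for illustration, `bw = "nrd0"` is chosen only because the toy sample is small, and the `beanplot` call runs only if the package is installed.

```r
# Toy stand-in for `result`; the values are invented for illustration only.
set.seed(1)
result <- data.frame(
  stream = rep(c("Brooks", "Siouxon", "Speelyai"), times = 20),
  rates  = rep(c(3.0, 7.5, 10.0), each = 3, length.out = 60),
  error  = rnorm(60),
  stringsAsFactors = FALSE
)

# Logical row index, then a comma, then nothing: keep all columns.
by_index <- result[result$stream == "Speelyai", ]

# subset() evaluates the condition inside the data frame, so no `result$`.
by_subset <- subset(result, stream == "Speelyai")

# Both approaches select the same rows.
stopifnot(isTRUE(all.equal(by_index, by_subset)))

# The subset is passed inline, so no extra data frame has to be kept around.
if (requireNamespace("beanplot", quietly = TRUE)) {
  beanplot::beanplot(error ~ rates,
                     data = subset(result, stream == "Speelyai"),
                     bw = "nrd0")
}
```

`subset()` is mostly a readability choice here; the explicit `result[result$stream == "Speelyai", ]` form does the same thing.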
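Or, if you want something more compact, try data.table. Its subsetting is more compact once you have it set up (you can skip setting the key at first; it will still be more compact, just a bit slower):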
# load package
library(data.table)
# convert to data.table, and set key for subsetting
result <- as.data.table(result)
setkey(result, stream)
# save your original plotting code (minus the data part) as an expression
original.plot <- expression(beanplot(error~rates, col=c("orange", "black", "white", "red"), border ="pink", what=c(0,1,1,1), maxstripline=.05))
# make the plot for this stream only
result["Speelyai", eval(original.plot)]
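As noted, setting the key is optional. The sketch below compares the unkeyed and keyed forms on invented toy data, assuming the data.table package is installed (the block does nothing if it is not). Inside `[...]`, a data.table looks up bare column names in the table itself, which is why a condition like `stream == "Speelyai"` works here even though it failed on the plain data frame.

```r
# Toy stand-in for `result`; the values are invented for illustration only.
set.seed(1)
toy <- data.frame(
  stream = rep(c("Brooks", "Siouxon", "Speelyai"), times = 20),
  rates  = rep(c(3.0, 7.5, 10.0), each = 3, length.out = 60),
  error  = rnorm(60),
  stringsAsFactors = FALSE
)

if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  dt <- as.data.table(toy)

  # No key: a bare column name in `i` is looked up inside the table.
  unkeyed <- dt[stream == "Speelyai"]

  # With a key: a character value in `i` is matched against the key column.
  setkey(dt, stream)
  keyed <- dt["Speelyai"]

  # Both forms pick out the same number of rows.
  stopifnot(nrow(unkeyed) == nrow(keyed))
}
```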
To make the plots for all 3 streams, you can do this:
par(mfrow=c(2,2)) # I'm doing 4 panels just so it's a square; 1 will be empty
result[c("Brooks","Siouxon","Speelyai"), eval(original.plot), by=c("stream")]
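It can take a while to get used to data.table, but its notation tends to be very convenient and very fast. It is especially handy for subsetting or for performing a task on multiple subsets.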
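For comparison, the same one-panel-per-stream layout can be sketched in base R without data.table, by splitting the data frame on `stream` and plotting each piece in a loop. The toy data below is invented for illustration, `bw = "nrd0"` is chosen only because the toy sample is small, and the plotting part runs only if the beanplot package is installed. `split()` does build the per-stream pieces, but they live in one list and none has to be created or named by hand.

```r
# Toy stand-in for `result`; the values are invented for illustration only.
set.seed(1)
result <- data.frame(
  stream = rep(c("Brooks", "Siouxon", "Speelyai"), times = 20),
  rates  = rep(c(3.0, 7.5, 10.0), each = 3, length.out = 60),
  error  = rnorm(60),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per stream.
pieces <- split(result, result$stream)

if (requireNamespace("beanplot", quietly = TRUE)) {
  old_par <- par(mfrow = c(2, 2))  # 2 x 2 grid; the fourth panel stays empty
  for (s in names(pieces)) {
    beanplot::beanplot(error ~ rates, data = pieces[[s]], bw = "nrd0")
    title(main = s)                # label each panel with its stream
  }
  par(old_par)                     # restore the previous layout
}
```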