Outliers for more than 6 sets of data
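I am trying to draw a box plot for my data. I wrote custom functions to change how the whiskers and the outliers are selected.
I was not getting any outlier points for my data, so after some investigation and outside help from some kind people, I found the problem.
My code: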
library(ggplot2)

# Box plot statistics: quartiles, with whiskers at 1.5 * IQR beyond the box
f <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75))
  iqr <- q[3] - q[1]
  r <- c(q[1] - 1.5 * iqr, q[1], q[2], q[3], q[3] + 1.5 * iqr)
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

# Outliers: values more than 1.5 * IQR outside the box
o <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75))
  iqr <- q[2] - q[1]
  subset(x, x < (q[1] - 1.5 * iqr) | x > (q[2] + 1.5 * iqr))
}
dt=read.table("C:/Users/msi161/Desktop/R/test.txt",header=TRUE,sep=",")
data2<-data.frame(x=dt$x,day=dt$day)
data3=data2[order(data2$day),]
data<-data.frame(x=data3$x,day=data3$day)
dev.new();ggplot(data, aes(day,x)) + stat_summary(fun.data=f, geom='boxplot')+stat_summary(fun.y =o, geom='point',col='red')
It works fine as long as I change the amount of data.
Works fine:
datadd=head(data,43*6)
dev.new();ggplot(datadd, aes(factor(day),x,fill=factor(day))) + stat_summary(fun.data=f, geom='boxplot')+stat_summary(fun.y =o, geom='point',size=1)
But from the 7th box plot onward (or with all of my data), I get the following error:
Problem:
datadd=head(data,43*7)
dev.new();ggplot(datadd, aes(factor(day),x,fill=factor(day))) + stat_summary(fun.data=f, geom='boxplot')+stat_summary(fun.y =o, geom='point',size=1)
Warning message:
Computation failed in `stat_summary()`:
arguments imply differing number of rows: 1, 0
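The wording of this message is the one base R's data.frame() produces when one column has a single value and another has length zero. A minimal sketch of that situation, assuming (not confirmed here) that stat_summary() combines each group's summary into a data frame in a similar way; group_id and no_outliers are hypothetical names used only for illustration:

```r
# Base R only: a length-1 column and a length-0 column cannot form a data frame.
no_outliers <- numeric(0)  # stands in for what o() returns when a group has no outliers

res <- tryCatch(
  data.frame(group_id = 1, y = no_outliers),
  error = function(e) conditionMessage(e)  # capture the error text instead of stopping
)
print(res)
```

If that assumption holds, the first six days would each contain at least one outlier, and the 7th day would be the first group for which o() returns an empty vector.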
The problem is related to my o function, so I corrected it as follows:
o <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75))
  iqr <- q[2] - q[1]
  pp <- subset(x, x < (q[1] - 1.5 * iqr) | x > (q[2] + 1.5 * iqr))
  if (length(pp) < 1) {
    pp <- c(1)
    return(pp)
  } else {
    return(NA)
  }
}
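Note that the two branches in this version look inverted: it returns 1 when a group has no outliers and NA when outliers exist, which would explain why no outlier points are drawn. A minimal sketch of one possible alternative, assuming the intent is to keep the 1.5 * IQR rule and only avoid the zero-length result; o_fixed is a hypothetical name, not part of the original code:

```r
# Hypothetical replacement for o(): same 1.5 * IQR rule,
# but it never returns a zero-length vector.
o_fixed <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  out <- x[x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr]
  # Return the outliers when there are any, otherwise a single NA placeholder
  if (length(out) < 1) NA_real_ else out
}

with_outlier <- o_fixed(c(1, 2, 3, 4, 100))  # 100 lies above the upper fence
without_outlier <- o_fixed(1:10)             # no value lies outside the fences
```

In the plot call this would be passed in place of o, for example stat_summary(fun.y = o_fixed, geom = 'point', size = 1). Groups that yield NA should draw no point, and ggplot2 may warn about rows removed for missing values.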