R: Loess regression produces a staircase-like graph, rather than a smooth curve, after the value 10

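What could be causing this? It always happens after the value 10.

A subset of the dataset around the region of interest, before and after applying the regression:

This is the ggplot2 call I use to generate the plot. The smoothing span used is 0.05.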

library(ggplot2)   # ggplot(), geom_line(), theme()
library(reshape2)  # melt() (assumed; the original post does not show which package was loaded)
library(dplyr)     # %>% and mutate(), used inside applyLoessSmooth()

dat <- read.csv("before_loess.csv", stringsAsFactors = FALSE)

smoothed.data <- applyLoessSmooth(dat, 0.05) # dat is the dataset before being smoothed

scan.plot.data <- melt(smoothed.data, id.vars = "sample.diameters", variable.name = 'series')

scan.plot <- ggplot(data = scan.plot.data, aes(sample.diameters, value)) +
  geom_line(aes(colour = series)) +
  xlab("Diameters (nm)") +
  ylab("Concentration (dN#/cm^2)") +
  theme(plot.title = element_text(hjust = 0.5))

The function used to apply the loess filter:

applyLoessSmooth <- function(raw.data, smoothing.span) {
  raw.data <- raw.data[complete.cases(raw.data),]

  ## response
  vars <- colnames(raw.data)
  ## covariate
  id <- 1:nrow(raw.data)
  ## define a loess filter function (fitting loess regression line)
  loess.filter <- function (x, given.data, span) loess(formula = as.formula(paste(x, "id", sep = "~")),
                                           data = given.data,
                                           degree = 1,
                                           span = span)$fitted 
  ## apply filter column-by-column
  loess.graph.data <- as.data.frame(lapply(vars, loess.filter, given.data = raw.data, span = smoothing.span),
                           col.names = colnames(raw.data))
  sample.rows <- length(loess.graph.data[1])
  loess.graph.data <- loess.graph.data %>% mutate("sample.diameters" = raw.data$sample.diameters[1:nrow(raw.data)])

}

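The first problem is simply that your data are rounded to three significant figures. Below 10, the x-axis values scan.plot.data$sample.diameters increase in increments of 0.01, which produces a smooth curve on the plot, but above 10 they increase in increments of 0.1, which shows up as visible steps.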
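
To see the rounding effect in isolation, here is a minimal sketch using made-up diameters (illustrative values only, not the contents of before_loess.csv). It rounds a fine grid to three significant figures with signif() and compares the spacing of the distinct values on either side of 10.

```r
# Hypothetical, evenly spaced diameters straddling 10 (illustrative only)
x <- seq(9.90, 10.30, by = 0.01)

# Round to three significant figures, as the real data appear to have been
x.rounded <- signif(x, 3)

# Distinct rounded values on each side of 10
below <- unique(x.rounded[x.rounded < 10])
above <- unique(x.rounded[x.rounded >= 10])

# Spacing between neighbouring distinct values
round(diff(below), 2)  # spacing of 0.01 below 10
round(diff(above), 2)  # spacing of 0.1 from 10 upwards
```

The same number of significant figures gives a grid that is ten times coarser once the values reach 10, which is where the steps begin.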

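The second problem is that you should be regressing against the values of sample.diameters, not against the row number id. I think this leads to multiple smoothed values for each distinct value of x, hence the steps. Here are some suggested minor modifications to your function...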

applyLoessSmooth <- function(raw.data, smoothing.span) {
  raw.data <- raw.data[complete.cases(raw.data),]    
  vars <- colnames(raw.data)
  vars <- vars[vars != "sample.diameters"] #you are regressing against this, so exclude it from vars
  loess.filter <- function (x, given.data, span) loess(
                    formula = as.formula(paste(x, "sample.diameters", sep = "~")), #not 'id'
                    data = given.data,
                    degree = 1,
                    span = span)$fitted 
  loess.graph.data <- as.data.frame(lapply(vars, loess.filter, given.data = raw.data, 
                                           span = smoothing.span),
                                    col.names = vars) #final argument edited
  loess.graph.data$sample.diameters <- raw.data$sample.diameters #simplified
  return(loess.graph.data)      
}

All of which seems to fix the problem...
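
To illustrate why the choice of regressor matters, here is a rough sketch on synthetic data (the x and y values, the noise level and the span of 0.1 are all invented for the demonstration, not taken from the question). Each x value is repeated several times, mimicking rounded diameters, and the same series is smoothed once against the row index and once against x.

```r
set.seed(1)

# Synthetic data: every x value appears 5 times (hypothetical values)
x <- rep(seq(10, 20, by = 0.1), each = 5)
y <- sin(x) + rnorm(length(x), sd = 0.05)
d <- data.frame(id = seq_along(x), x = x, y = y)

# Smooth against the row index (as in the question) and against x (as suggested)
fit.id <- loess(y ~ id, data = d, degree = 1, span = 0.1)$fitted
fit.x  <- loess(y ~ x,  data = d, degree = 1, span = 0.1)$fitted

# Count the distinct fitted values within each group of identical x
n.id <- tapply(fit.id, d$x, function(v) length(unique(v)))
n.x  <- tapply(fit.x,  d$x, function(v) length(unique(v)))

table(n.id)  # distinct fitted values per x when regressing on id
table(n.x)   # distinct fitted values per x when regressing on x
```

When the fit is made against x, rows that share an x value share a fitted value, so a line plot has a single y per x. When it is made against id, the rows within a tied group get different fitted values, which geom_line then draws as vertical jumps at that x.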

Of course, you could also just do this...

dat.melt <- melt(dat, id.vars = "sample.diameters", variable.name = 'series')
ggplot(data = dat.melt, aes(sample.diameters, value, colour=series)) +  
       geom_smooth(method="loess", span=0.05, se=FALSE)