raster::extract: Create a data frame and join attribute information using a buffer, but with problems from included NAs

Question

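I want to recover the attribute information of my SpatialPointsDataFrame (the df.pts.SPDF$status variable in my example) in my final data frame RES after using the extract() function on two rasters. I can't find a way to make the neighborhood coordinates (buffer = 6 around each point) carry the same status attribute as the original coordinates (df.pts.SPDF), and I also have a lot of problems with the NAs that get included. For the NAs I tried x <- lapply(list, function(x) x[!is.na(x)]), without success. In my example: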

library(raster)  
r <- raster(ncol=10, nrow=10, crs="+proj=utm +zone=1 +datum=WGS84", xmn=0, xmx=50, ymn=0, ymx=50)
s1 <- stack(lapply(1:4, function(i) setValues(r, runif(ncell(r)))))
r2 <- raster(ncol=10, nrow=10, crs="+proj=utm +zone=1 +datum=WGS84", xmn=0, xmx=100, ymn=0, ymx=100) # Larger raster, used to produce NAs
s2 <- stack(lapply(1:4, function(i) setValues(r2, runif(ncell(r2)))))
ras <- list(s1, s2)
pts <- data.frame(pts=sampleRandom(s2, 100, xy=TRUE)[,1:2], status=rep(c("control","treat"),5))
pts.sampling = SpatialPoints(cbind(pts$pts.x,pts$pts.y), proj4string=CRS("+proj=utm +zone=1 +datum=WGS84"))
df.pts.SPDF<- SpatialPointsDataFrame(pts.sampling, data = pts)

## Extract raster values within a distance of 6 around each point (buffer) and organize the results with the df.pts.SPDF$status information
# (the neighborhood coordinates within the buffer should get the same status attribute as the original coordinates in df.pts.SPDF)
RES <- NULL
for (i in 1:length(ras)) {
x <- extract(ras[[i]], df.pts.SPDF,buffer=6)
res<- data.frame(coordinates(pts.sampling),
                 df.pts.SPDF,
                 do.call("rbind", x))
RES<-rbind(RES,c(res))                 
}
#
Error in data.frame(coordinates(pts.sampling), df.pts.SPDF, do.call("rbind",  : 
  arguments imply differing number of rows: 100, 165

My desired output is:

#  coords.x1 coords.x2         x         y  ras    status layer.1   layer.2   layer.3   layer.4
#1 0.8824756 0.1675364 0.8824756 0.1675364   s1    control 0.2979335 0.8745829 0.4586767 0.4631793
#2 0.3197404 0.6779792 0.3197404 0.6779792   s1    treat   0.2979335 0.8745829 0.4586767 0.4631793
#3 0.1542464 0.5778322 0.1542464 0.5778322   s1    control 0.2979335 0.8745829 0.4586767 0.4631793
#4 0.6299502 0.3118177 0.6299502 0.3118177   s1    control 0.2979335 0.8745829 0.4586767 0.4631793
#5 0.4714429 0.1400559 0.4714429 0.1400559   s1    control 0.2979335 0.8745829 0.4586767 0.4631793
#6 0.4568768 0.6155193 0.4568768 0.6155193   s1    treat   0.2979335 0.8745829 0.4586767 0.4631793

Any ideas?

Answer 1

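I think your desired output may need to look different: the x and y coordinates shown above do not belong to your data. Still, here are two solutions: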

Solution 1: similar to your desired output:

# Helper that reshapes a vector into a matrix with one column per layer
c2m <- function(x){
  mtx <- matrix(x, nrow=length(x)/4, ncol=4, byrow = TRUE) # 4 is the number of layers in the raster stack
  return(mtx)
}

RES <- list() #you might need a list here
for (i in 1:length(ras)) {
  x <- raster::extract(ras[[i]], df.pts.SPDF, buffer=6)

  max.len <- max(sapply(x, length))
  x <- lapply(x, function(x) {c(x, rep(NA, max.len - length(x)))})
  xx <- lapply(x, function(x) c2m(x))
  res<- data.frame(coordinates(pts.sampling),
                   df.pts.SPDF,
                   do.call("rbind", xx))
  RES[[i]]<-res  #this is another change you need    
}

df.out <- plyr::ldply(RES, rbind) # ldply() comes from the plyr package
colnames(df.out) <- stringr::str_replace_all(colnames(df.out), pattern = "X", replacement = "layer.")

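Because you use a buffer, each x and y has 4 points, so the coordinates are duplicated in some rows. This means that if you later convert the result to a shapefile, you will have overlapping points (x and y repeat every 100 observations).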
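
The repetition comes from how data.frame() recycles shorter arguments when the longest argument has a whole multiple of their rows. A minimal base-R sketch with made-up values (pts_demo and vals_demo are hypothetical stand-ins, not the raster data above):

```r
# Hypothetical stand-ins: 3 points and a 6-row matrix of extracted values.
pts_demo <- data.frame(x = c(1, 2, 3),
                       status = c("control", "treat", "control"))
vals_demo <- matrix(seq_len(12), nrow = 6, ncol = 2)

# 6 is a whole multiple of 3, so the columns of pts_demo are recycled twice.
out_demo <- data.frame(pts_demo, vals_demo)

out_demo$x       # the x values cycle through the points: 1 2 3 1 2 3
names(out_demo)  # unnamed matrix columns are named X1, X2
```

Recycling pairs value row k with point ((k - 1) mod 3) + 1 in this toy case, so it is worth checking that the value rows are ordered the same way before trusting the joined attributes.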

Solution 2: put all values belonging to a unique x and y in one row:

RES <- list() #you might need a list here
for (i in 1:length(ras)) {
  x <- raster::extract(ras[[i]], df.pts.SPDF, buffer=6)

  max.len <- max(sapply(x, length))
  x <- lapply(x, function(x) {c(x, rep(NA, max.len - length(x)))})
  
  res<- data.frame(coordinates(pts.sampling),
                   df.pts.SPDF,
                   do.call("rbind", x))
  RES[[i]]<-res  #this is another change
}

df.out <- plyr::ldply(RES, rbind) # ldply() comes from the plyr package
colnames(df.out) <- stringr::str_replace_all(colnames(df.out), pattern = "V", replacement = "layer.")
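
If each buffered cell should stay tied to the status of its own point, one option is to repeat the attributes once per extracted cell instead of padding. This is only a sketch in base R. It assumes that extract() with a buffer returns, for each point, either a cells-by-layers matrix or a single NA. The objects x_demo and status_demo are hypothetical stand-ins for that output and for df.pts.SPDF$status:

```r
# Hypothetical stand-in for the list returned by extract(..., buffer = 6) on a
# 4-layer stack: one cells-by-layers matrix per point, or a lone NA when the
# point falls outside the raster (an assumption about the return shape).
x_demo <- list(
  matrix(1:8 / 10, nrow = 2, ncol = 4),
  NA,
  matrix(1:12 / 10, nrow = 3, ncol = 4)
)
status_demo <- c("control", "treat", "control")  # one attribute value per point
n_layers <- 4

# Give every element n_layers columns; a lone NA becomes a single all-NA row.
as_layer_matrix <- function(m) {
  if (is.matrix(m)) m else matrix(NA_real_, nrow = 1, ncol = n_layers)
}
mats <- lapply(x_demo, as_layer_matrix)
n_cells <- vapply(mats, nrow, integer(1))  # number of buffered cells per point

vals <- do.call(rbind, mats)
colnames(vals) <- paste0("layer.", seq_len(n_layers))

# Repeat each point's id and status once per extracted cell, so rows stay aligned.
long <- data.frame(
  point_id = rep(seq_along(mats), times = n_cells),
  status   = rep(status_demo, times = n_cells),
  vals
)

# Drop the rows in which every layer is NA (points outside the raster).
long_complete <- long[rowSums(!is.na(long[, colnames(vals)])) > 0, ]
```

With the real data, the same pattern would presumably apply to the x returned inside the loop, and point_id could be used to join back the coordinates or any other attribute.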
