How to divide a raster sample into three categories based on probabilities and set the values of these categories in R?

Question

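I apologize for the convoluted question. What I am trying to do is generate a raster in which a random 1% of the layer's pixels have a value and, of those valued pixels, a random 35% have the value 1, 55% have the value 2, and 10% have the value 3. The remaining pixels should take R's 'no data' marker (NA).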

Creating a raster in which 1% of the pixels share a uniform value is easy enough with the following code:

pixels <- raster(ext = extent(-120, -119, 49, 50), resolution = c(0.001, 0.001), crs = CRS("+proj=longlat +datum=WGS84"), vals = 1)

testing <- sampleRandom(pixels, size = as.integer(0.01*ncell(pixels)), asRaster = TRUE)

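However, I am not sure how to divide the valued pixels of testing into three categories and set the values of those categories as described above.

Is this possible, or is there another way to achieve what I am after?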

Answer 1

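It is not that convoluted.

The function f below takes the total number of cells N and returns a vector of length N in which 1% of the entries hold the values 1, 2 and 3 in the 35/55/10 proportions, with NA everywhere else.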

f <- function(N) {
    n <- round(N/100)  # 1% sample
    # create a vector with the values you want
    # (round() guards against rep() truncating a non-integer count)
    v <- c(rep(1, round(0.35*n)), rep(2, round(0.55*n)), rep(3, round(0.1*n)))
    # sample these values (that is, put them in random order)
    v <- sample(v)
    # create output vector 
    out <- rep(NA, N)
    # put the values in random places
    out[sample(N, length(v))] <- v
    out
}

library(raster)
# create a RasterLayer
r <- raster(ncol=100, nrow=100)
set.seed(10) # for reproducibility
values(r) <- f(ncell(r))

To show that it works

table(values(r))
# 1  2  3 
#35 55 10

An alternative is to pass probabilities to sample

set.seed(10) 
N <- ncell(r)
v <- sample(3, N/100, prob=c(0.35, 0.55, 0.1), replace=TRUE)
table(v)
# 1  2  3 
#30 67  3

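But because this draws each value according to the probabilities, the resulting proportions are not exact. In this case they look quite far off, but that is because the sample is small (only 100 values).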
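The effect of sample size on the realized proportions can be checked directly. The following is a minimal sketch in base R (the sample sizes and the seed are arbitrary choices for illustration, not from the question); it prints the observed percentages for increasingly large draws.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
p <- c(0.35, 0.55, 0.10)

# observed class percentages for a probabilistic draw of size n
observed_pct <- function(n, prob = p) {
  v <- sample(length(prob), n, prob = prob, replace = TRUE)
  100 * tabulate(v, nbins = length(prob)) / n
}

for (n in c(100, 10000, 1000000)) {
  cat(n, ":", round(observed_pct(n), 2), "\n")
}
```

The percentages should drift toward 35, 55 and 10 as n grows, but they will generally not match them exactly.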

Following your example, you could also take this route

library(raster)
r <- raster(ext=extent(-120, -119, 49, 50), resolution=c(0.001, 0.001), crs="+proj=longlat +datum=WGS84", vals = 1)
r <- sampleRandom(r, size = (0.01*ncell(r)), asRaster = TRUE)

sfun <- function(x) {
   i <- !is.na(x)
   x[i] <- sample(1:3, sum(i), prob=c(0.35, 0.55, 0.1), replace=TRUE)
   x
}

set.seed(101)
x <- calc(r, sfun)

Again, this is approximately right

tab <- table(values(x))
100 * tab / sum(tab)
#   1     2     3 
#35.45 54.62  9.93
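If exact proportions are needed on an already sampled layer, the two ideas above can be combined: build the labels with fixed counts and shuffle them over the non-NA cells. This is only a sketch under assumptions; assign_exact is a hypothetical helper name, it works on a plain vector of cell values, and any rounding remainder is put into the most probable class.

```r
# Hypothetical helper: fill the non-NA entries of x with classes
# 1..length(props) in (as near as possible) exact proportions.
assign_exact <- function(x, props = c(0.35, 0.55, 0.10)) {
  i <- which(!is.na(x))
  n <- length(i)
  counts <- round(props * n)
  # push any rounding remainder into the largest class so counts sum to n
  k <- which.max(props)
  counts[k] <- counts[k] + (n - sum(counts))
  labels <- rep(seq_along(props), counts)
  # shuffle the labels and write them into the non-NA positions
  x[i] <- labels[sample.int(n)]
  x
}

set.seed(42)  # arbitrary seed
x <- rep(NA_real_, 10000)
x[sample(10000, 100)] <- 1   # stand-in for the values of a 1% sampled layer
y <- assign_exact(x)
table(y)
```

With a RasterLayer r that fits in memory, something like values(r) <- assign_exact(values(r)) should apply the same idea, although that line is not tested here.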
