Match every row whose `region_ID=0` with the rows whose `region_ID=1` and calculate a certain distance

Question

I have a dataset that looks like this:

data <- data.table::setDT(structure(list(X = c(36, 37, 38, 39, 40, 41, 1, 2, 3, 4, 5, 6
), Y = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), region_ID = c(0, 
0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)), row.names = c(NA, -12L), class = c("data.table", 
"data.frame")))

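I want to match every row with region_ID=0 against the rows with region_ID=1 and compute

dist_to_r1 = sqrt((X - i.X)^2 + (Y - i.Y)^2)

where the i. prefix refers to the latter rows (those with region_ID=1). I would like to do this using data.table syntax.

I have been trying to do this with a left join, but could not get it to work.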

Answer 1

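Do you want a full join, so that each of the six rows in region 0 is joined to each of the six rows in region 1?

In that case, you just need to join on a constant key and set allow.cartesian = TRUE:

data[, id := 1][region_ID == 0][data[region_ID == 1], on = "id", allow.cartesian = TRUE][, dist_to_r1 := sqrt((X - i.X)^2 + (Y - i.Y)^2)][]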
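For readers who want to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the question's data with data.table() rather than the dput output, and the name `pairs` is only an illustrative choice, not something from the original post.

```r
library(data.table)

# Rebuild the example data from the question (12 rows, two regions)
data <- data.table(
  X = c(36:41, 1:6),
  Y = rep(1, 12),
  region_ID = rep(c(0, 1), each = 6)
)

# A constant key makes every region-0 row match every region-1 row
data[, id := 1]

# x = region 0, i = region 1; clashing columns from i get the "i." prefix
pairs <- data[region_ID == 0][data[region_ID == 1], on = "id", allow.cartesian = TRUE]
pairs[, dist_to_r1 := sqrt((X - i.X)^2 + (Y - i.Y)^2)]
```

Without allow.cartesian = TRUE the join is expected to stop with an error here, because the 6 x 6 = 36 result rows exceed the combined row count of the two inputs.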

Edit: the OP clarified that only the minimum distance to the points in region 0 is needed. In that case, we can do this:

data[, id := 1]
region0 = data[region_ID == 0]

# function that gets the minimum distance between two regions
get_min_dist <- function(region_a, region_b) {
  region_a[region_b, on = "id", allow.cartesian = TRUE][, min(sqrt((X - i.X)^2 + (Y - i.Y)^2))]
}

# apply the function above to every region
# note: .() is needed so that the result column is named min_dist_to_zero

data[,
     .(min_dist_to_zero = get_min_dist(
       region_a = region0,
       region_b = data[region_ID == .BY]
       )),
  by = region_ID]

Output:

   region_ID min_dist_to_zero
1:         0                0
2:         1               30
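As a sanity check on the value 30, the same minimum can be computed in base R without any join. This is only a sketch under the assumption that the two regions fit comfortably in memory as a full pairwise matrix; the names `r0`, `r1`, `dx`, `dy` and `min_dist` are illustrative and do not appear in the original answer.

```r
# Region 0 and region 1 coordinates from the question
r0 <- data.frame(X = 36:41, Y = 1)
r1 <- data.frame(X = 1:6, Y = 1)

# Pairwise coordinate differences: rows index region 0, columns index region 1
dx <- outer(r0$X, r1$X, "-")
dy <- outer(r0$Y, r1$Y, "-")

# Smallest Euclidean distance over all 36 pairs
min_dist <- min(sqrt(dx^2 + dy^2))
```

The closest pair is X = 36 in region 0 and X = 6 in region 1, both at Y = 1, which agrees with the second row of the output.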

r

data.table

tidyverse