迭代查找另一个数据集中点的 x 距离的所有点
iterate finding all points with x distance of points in another dataset
我有两组数据点(df1
和 df2
)。这个想法是找到(并计算)df2
中的所有点,这些点在 df1
中每个点的 x 半径距离内。让 x 从 500 m 开始,以 500 m 的增量增加到 20000 m。这个想法是使用循环或 purr 更有效地迭代和完成任务。
我试过这样的事情:
library(sf)
library(spData)
library(tidyverse)
# data set we can use for the example
df1 <- cycle_hire %>%
# transform for the buffer function
st_transform(., crs = st_crs(27700)) %>%
select(id_1 = "id")
df2 <- cycle_hire_osm %>%
# transform for the buffer function
st_transform(., crs = st_crs(27700)) %>%
select(id_2 = "osm_id")
# create first 500 m buffer for df1
df1_500 <- st_buffer(df1, dist = 500)
# get points that fall within the 500 m buffer
within_500 <- st_join(df1_500, df2, join = st_intersects) %>%
# for each point in df 1, count how many points from df2 are within 500 m
group_by(id_1) %>%
count() %>%
ungroup() %>%
# in my real data some rows return NA next to the id_1
# but this is still being counted as 1.
# I can deal with this later but I realize that in this toy example that this is a mistake.
# remove geometry because I want to join everything later
as_tibble() %>%
select(-geometry) %>%
# identify distance (will join later)
mutate(within = "500")
# now x increases by 500 m and we create a 1000 m buffer
df1_1000 <- st_buffer(df1, dist = 1000)
# get points that fall within the 1000 m buffer
within_1000 <- st_join(df1_1000, df2, join = st_intersects) %>%
# for each point in df 1, count how many points from df2 are within 1000 m
group_by(id_1) %>%
count() %>%
ungroup() %>%
# remove geometry
as_tibble() %>%
select(-geometry) %>%
# identify distance
mutate(within = "1000")
# and repeat...
# eventually I bind into one tibble
bind_all <- bind_rows(within_500, within_1000)
我们重复这个直到我们达到 20000。
我将如何迭代它以便它可以扩展到 20000 米或更多?我还希望在起始距离和增量方面具有一定的灵活性 - 即,可能从 200 米开始并增加我的 200 米增量。或者从 200 米开始,增加 200 直到我们达到 20000,然后我们开始增加 1000 直到 40000。显然我的代码效率很低,我不应该手动执行此操作。
任何帮助将不胜感激
500 米以内的查找:
df_example <- df1 %>%
mutate(within_500 = lengths(st_is_within_distance(x = .,
y = df2,
dist = 500)))
df_example
Simple feature collection with 742 features and 2 fields
geometry type: POINT
dimension: XY
bbox: xmin: 522502 ymin: 174408 xmax: 538733.2 ymax: 184421
projected CRS: OSGB 1936 / British National Grid
First 10 features:
id_1 geometry within_500
1 1 POINT (531203.5 182832.1) 12
2 2 POINT (525208.1 179391.9) 9
3 3 POINT (532985.8 182001.6) 8
4 4 POINT (530437.8 182912) 8
5 5 POINT (528051 178742) 2
6 6 POINT (528858.4 181542.9) 11
7 7 POINT (527159 183300.8) 3
8 8 POINT (527032.7 182634.6) 5
9 9 POINT (532205 180434.6) 7
10 10 POINT (532464.9 180284.3) 5
为了找到所有这些(为简洁起见,我做了 5000 米):
intervals <- seq(500, 5000, 500)
df_final <- map_dfr(intervals,
~ df1 %>%
mutate(within = .x,
n = lengths(st_is_within_distance(x = .,
y = df2,
dist = .x))) %>%
st_drop_geometry()
)
head(df_final)
id_1 within n
1 1 500 12
2 2 500 9
3 3 500 8
4 4 500 8
5 5 500 2
6 6 500 11
tail(df_final)
id_1 within n
7415 772 5000 205
7416 773 5000 400
7417 774 5000 113
7418 775 5000 130
7419 776 5000 114
7420 777 5000 120
我有两组数据点(df1
和 df2
)。这个想法是找到(并计算)df2
中的所有点,这些点在 df1
中每个点的 x 半径距离内。让 x 从 500 m 开始,以 500 m 的增量增加到 20000 m。这个想法是使用循环或 purr 更有效地迭代和完成任务。
我试过这样的事情:
library(sf)
library(spData)
library(tidyverse)
# data set we can use for the example
df1 <- cycle_hire %>%
# transform for the buffer function
st_transform(., crs = st_crs(27700)) %>%
select(id_1 = "id")
df2 <- cycle_hire_osm %>%
# transform for the buffer function
st_transform(., crs = st_crs(27700)) %>%
select(id_2 = "osm_id")
# create first 500 m buffer for df1
df1_500 <- st_buffer(df1, dist = 500)
# get points that fall within the 500 m buffer
within_500 <- st_join(df1_500, df2, join = st_intersects) %>%
# for each point in df 1, count how many points from df2 are within 500 m
group_by(id_1) %>%
count() %>%
ungroup() %>%
# in my real data some rows return NA next to the id_1
# but this is still being counted as 1.
# I can deal with this later but I realize that in this toy example that this is a mistake.
# remove geometry because I want to join everything later
as_tibble() %>%
select(-geometry) %>%
# identify distance (will join later)
mutate(within = "500")
# now x increases by 500 m and we create a 1000 m buffer
df1_1000 <- st_buffer(df1, dist = 1000)
# get points that fall within the 1000 m buffer
within_1000 <- st_join(df1_1000, df2, join = st_intersects) %>%
# for each point in df 1, count how many points from df2 are within 1000 m
group_by(id_1) %>%
count() %>%
ungroup() %>%
# remove geometry
as_tibble() %>%
select(-geometry) %>%
# identify distance
mutate(within = "1000")
# and repeat...
# eventually I bind into one tibble
bind_all <- bind_rows(within_500, within_1000)
我们重复这个直到我们达到 20000。
我将如何迭代它以便它可以扩展到 20000 米或更多?我还希望在起始距离和增量方面具有一定的灵活性 - 即,可能从 200 米开始并增加我的 200 米增量。或者从 200 米开始,增加 200 直到我们达到 20000,然后我们开始增加 1000 直到 40000。显然我的代码效率很低,我不应该手动执行此操作。
任何帮助将不胜感激
500 米以内的查找:
df_example <- df1 %>%
mutate(within_500 = lengths(st_is_within_distance(x = .,
y = df2,
dist = 500)))
df_example
Simple feature collection with 742 features and 2 fields
geometry type: POINT
dimension: XY
bbox: xmin: 522502 ymin: 174408 xmax: 538733.2 ymax: 184421
projected CRS: OSGB 1936 / British National Grid
First 10 features:
id_1 geometry within_500
1 1 POINT (531203.5 182832.1) 12
2 2 POINT (525208.1 179391.9) 9
3 3 POINT (532985.8 182001.6) 8
4 4 POINT (530437.8 182912) 8
5 5 POINT (528051 178742) 2
6 6 POINT (528858.4 181542.9) 11
7 7 POINT (527159 183300.8) 3
8 8 POINT (527032.7 182634.6) 5
9 9 POINT (532205 180434.6) 7
10 10 POINT (532464.9 180284.3) 5
为了找到所有这些(为简洁起见,我做了 5000 米):
intervals <- seq(500, 5000, 500)
df_final <- map_dfr(intervals,
~ df1 %>%
mutate(within = .x,
n = lengths(st_is_within_distance(x = .,
y = df2,
dist = .x))) %>%
st_drop_geometry()
)
head(df_final)
id_1 within n
1 1 500 12
2 2 500 9
3 3 500 8
4 4 500 8
5 5 500 2
6 6 500 11
tail(df_final)
id_1 within n
7415 772 5000 205
7416 773 5000 400
7417 774 5000 113
7418 775 5000 130
7419 776 5000 114
7420 777 5000 120