Checking and infilling the central pixel with 8 pixel neighbourhood mode in a large array in Python

Question

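I have a large binary array (500 x 700) in which I want to check for NaNs and fill each such central pixel with the mode of its eight surrounding pixels (if more than 4 of the surrounding pixels are 0 or 1). It is essentially a 3x3 sliding window search. Are there any tools/functions in xarray, scipy.ndimage or even numpy that can do this?

For example: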

arr = np.asarray([0,  1,  1,  1,  0,  1, 1, np.nan, 0,  1,  0,  1, 1,  1,  0,  1,  1, np.nan]).reshape(3,6)

Expected result:

arr[1,1] = 1
arr[-1,-1] = 1 (only 3 neighbours)

Any help would be much appreciated.

Thanks in advance.

Answer 1

You can implement your idea directly with numpy and scipy.stats.mode.

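First, find the positions of the NaN values by comparing the array with itself, since by definition a NaN float is not equal to itself. The np.where function returns all positions where this condition holds as a tuple of two index arrays, one for the rows and one for the columns.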
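As a small illustration of this step, here is a minimal sketch on a made-up 2x3 array (the array `a` is hypothetical and not from the question). It also checks that the self-comparison gives the same mask as np.isnan:

```python
import numpy as np

# Hypothetical example array with two NaN cells.
a = np.array([[0.0, np.nan, 1.0],
              [1.0, 0.0, np.nan]])

# NaN is the only float value that is not equal to itself.
mask = a != a
assert np.array_equal(mask, np.isnan(a))

# np.where returns a tuple of index arrays: one for rows, one for columns.
nan_rows, nan_cols = np.where(mask)
print(nan_rows)  # [0 1]
print(nan_cols)  # [1 2]
```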

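Then, for each position where a NaN was found, add 8 offsets to it to obtain its surrounding pixels. This can be done efficiently with delta arrays that list all possible row and column offsets of the neighbours.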
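To make the offset idea concrete, the following sketch lists the neighbours of one cell. The position (1, 1) is just an assumed example:

```python
import numpy as np

# Row and column offsets of the 8 neighbours around a cell.
delta_rows = np.array([-1, -1, -1, 0, 0, 1, 1, 1])
delta_cols = np.array([-1, 0, 1, -1, 1, -1, 0, 1])

# Hypothetical cell at row 1, column 1.
row, col = 1, 1
neighbour_rows = row + delta_rows
neighbour_cols = col + delta_cols

neighbours = list(zip(neighbour_rows.tolist(), neighbour_cols.tolist()))
print(neighbours)
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1), (2, 2)]
```

For a cell on the edge of the array, some of these indices fall outside the valid range. Negative indices would silently wrap around to the other side in numpy, which is why out-of-range neighbours have to be filtered out before indexing.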

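Finally, perform a within-boundary check, run the mode function on the valid neighbours that remain, and fill the resulting value into the NaN cell.

Here is the code for what I described above: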

import numpy as np
import scipy.stats

arr = np.asarray([
    0,  1,  1,  1,  0,  1,
    1, np.nan, 0,  1,  0,  1,
    1,  1,  0,  1,  1, np.nan
]).reshape(3, 6)

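# Row and column offsets of the 8 neighbours around a cell.
delta_rows = np.array([-1, -1, -1, 0, 0, 1, 1, 1])
delta_cols = np.array([-1, 0, 1, -1, 1, -1, 0, 1])

# NaN is not equal to itself, so this finds every NaN position.
nan_rows, nan_cols = np.where(arr != arr)
for nan_row, nan_col in zip(nan_rows, nan_cols):
    neighbour_rows = nan_row + delta_rows
    neighbour_cols = nan_col + delta_cols
    # Keep only the neighbours that lie inside the array.
    within_boundary = (
        (0 <= neighbour_rows) & (neighbour_rows < arr.shape[0]) &
        (0 <= neighbour_cols) & (neighbour_cols < arr.shape[1])
    )
    neighbour_rows = neighbour_rows[within_boundary]
    neighbour_cols = neighbour_cols[within_boundary]
    # np.ravel(...)[0] works whether .mode is a scalar (newer SciPy)
    # or a length-1 array (older SciPy).
    result = scipy.stats.mode(arr[neighbour_rows, neighbour_cols])
    arr[nan_row, nan_col] = np.ravel(result.mode)[0]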

Afterwards, we can see that every NaN value in arr has been filled with the mode of its surrounding cells:

>>> print(arr)
[[0. 1. 1. 1. 0. 1.]
 [1. 1. 0. 1. 0. 1.]
 [1. 1. 0. 1. 1. 1.]]
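
Since the question also asks about scipy.ndimage, here is a rough loop-free sketch of the same idea. It relies on the array being binary, so the mode can be found by counting the 0 and 1 neighbours with scipy.ndimage.convolve. The names kernel, n_ones, n_zeros and filled are my own, and leaving ties as NaN is a choice made for this sketch, not something stated in the question:

```python
import numpy as np
from scipy import ndimage

arr = np.asarray([
    0,  1,  1,  1,  0,  1,
    1, np.nan, 0,  1,  0,  1,
    1,  1,  0,  1,  1, np.nan
]).reshape(3, 6)

# 3x3 kernel that counts the 8 neighbours and ignores the centre.
kernel = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])

nan_mask = np.isnan(arr)
ones = (arr == 1).astype(int)   # NaN cells compare False, so they count as neither
zeros = (arr == 0).astype(int)

# Zero padding means neighbours outside the array are not counted.
n_ones = ndimage.convolve(ones, kernel, mode="constant", cval=0)
n_zeros = ndimage.convolve(zeros, kernel, mode="constant", cval=0)

filled = arr.copy()
filled[nan_mask & (n_ones > n_zeros)] = 1
filled[nan_mask & (n_zeros > n_ones)] = 0
# Cells with a tie between 0 and 1 are left as NaN (assumption of this sketch).

print(filled)
```

Unlike the loop, this version counts neighbours from the original array only, so NaN neighbours are ignored and earlier fills do not influence later ones. To require more than 4 agreeing neighbours, as mentioned in the question, the conditions could be changed to n_ones > 4 and n_zeros > 4.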
