Calculate the weighted mean of 3 arrays with NaNs in the data (Python)

Question

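I have 3 two-dimensional arrays representing geospatial data. Each array has shape (721, 1440), i.e. 721 latitude values and 1440 longitude values. I want to compute the weighted mean of these 3 arrays. Normally this is simple: sum(array*weight)/sum(weights). That works fine unless the data contains NaNs.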

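In my particular case, the weights should be 0.7 for arr1, 0.2 for arr2, and 0.1 for arr3. However, wherever there is a NaN, the mean obviously becomes NaN. In my case, the only array with NaNs is arr3.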

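What I want instead is for the weighted mean to use only the first two arrays wherever arr3 is NaN, which would be (arr1*0.7 + arr2*0.2)/0.9 at those cells. I tried to do this with xr.where(), but for some reason it blows up my RAM and crashes my kernel every time. Is there another way to accomplish this?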

Answer 1

You can use np.nansum() and np.isnan():

import numpy as np

# Dummy example
x = np.ones((5,5))
y = np.ones((5,5))*2
x[0,0] = np.nan

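# Stack the arrays along a new leading axis -> shape (2, 5, 5)
stack = np.stack((x, y))
# Per-cell weights: 0 wherever the value is NaN
# (broadcasting gives the same result as np.apply_along_axis, without a Python-level loop)
weight = np.array([0.2, 0.8])[:, None, None] * ~np.isnan(stack)
# Weighted mean, renormalized by the weights that were actually used
res = np.nansum(stack * weight, axis=0) / weight.sum(axis=0)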
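
Applied to the question's setup, the same idea might look like the sketch below. The helper name nan_weighted_mean is hypothetical, and random data stands in for the real (721, 1440) fields; the weights 0.7, 0.2, 0.1 are the ones from the question. Cells where every input is NaN are returned as NaN, which is an assumption about the desired behavior rather than something the question states.

```python
import numpy as np

def nan_weighted_mean(arrays, weights):
    """Weighted mean across same-shaped arrays, ignoring NaNs cell by cell."""
    stack = np.stack(arrays)                      # shape (n, lat, lon)
    w = np.asarray(weights, dtype=float)
    # Reshape weights to (n, 1, 1, ...) so they broadcast over the grid
    w = w.reshape((-1,) + (1,) * (stack.ndim - 1))
    w_eff = w * ~np.isnan(stack)                  # weight becomes 0 where the value is NaN
    total = w_eff.sum(axis=0)                     # sum of the weights actually used
    num = np.nansum(stack * w_eff, axis=0)
    # Where no input is valid the total weight is 0; return NaN there
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(total > 0, num / total, np.nan)

# Stand-in data with the shape from the question (random values, not real fields)
rng = np.random.default_rng(0)
arr1, arr2, arr3 = (rng.random((721, 1440)) for _ in range(3))
arr3[0, 0] = np.nan                               # only arr3 contains NaNs

res = nan_weighted_mean([arr1, arr2, arr3], [0.7, 0.2, 0.1])
```

At the cell where arr3 is NaN, res equals (arr1*0.7 + arr2*0.2)/0.9; everywhere else all three arrays contribute. The intermediate stack for three float64 arrays of this shape is roughly 25 MB, so memory should not be a concern with this approach.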

python

numpy

python-xarray