高斯过程回归（Kriging）与径向基函数插值

Question

我在平面图上对传感器之间的温度数据实施了 2 种类型的插值。由于我不是很精通我使用的包的底层过程和数学，我发现很难理解为什么它们通过 pcolormesh 的输出如此不同。

我用了scipy.interpolate.Rbf和sklearn.gaussian_process。

这些是图片。

RBF 示例看起来与在网络上找到的实现完全一样，但 GPR 示例显示这些长线而不是圆形。在 Scikit Learn 的 GPR 实现中，什么参数会调节这些形状？为什么即使 GPR 的温度结果发生轻微变化，它们的形状甚至有时颜色强度也会如此不同。

平面图上的9个传感器（点）均匀分布。

RBF 代码。

# Set X and Y Coordinates for each sensor (pixels)
    days_data['xCoordinate'] = days_data.nodeid.apply(lambda id: createXCoord(id))
    days_data['yCoordinate'] = days_data.nodeid.apply(lambda id: createYCoord(id))

    # Define location of "sensors" on the axes
    x = days_data.xCoordinate.to_list()
    y = days_data.yCoordinate.to_list()
    z = days_data.avgtemperature.to_list() #temperature

    # Use Gaussian function
    rbf_adj = Rbf(x, y, z, function = 'gaussian')

    # Set extent to which colour mesh stretches over
    # the underlying image
    x_fine = np.linspace(0, 1000, 81) #81 - num of samples
    y_fine = np.linspace(0, 700, 81)

    x_grid, y_grid = np.meshgrid(x_fine, y_fine)

    z_grid = rbf_adj(x_grid.ravel(), y_grid.ravel()).reshape(x_grid.shape)

    # Remove the colorbar created by the previous plot, if any
    # To avoid a new colorbar being plotted alongside the previous one each time a different date is selected
    try:
        cb = p.colorbar
        cb.remove()
    except:
        pass

    # plot the pcolor on the Axes. Use alpha to set the transparency
    p=ax.pcolor(x_fine, y_fine, z_grid, alpha=0.3)
    ax.invert_yaxis() #invert Y axis for X and Y to have same starting point

    # Add a colorbar for the pcolor field
    fig.colorbar(p,ax=ax)

GPR 代码

 # Define location of "sensors" on the axes
    x = days_data.xCoordinate.to_list()
    y = days_data.yCoordinate.to_list()
    z = days_data.avgtemperature.to_list() #temperature

    X = np.array([[a, b] for a, b in zip(x, y)])

    # Set extent to which colour mesh stretches over
    # the underlying image
    x_fine = np.linspace(0, 1000, 81) #81 - num of samples
    y_fine = np.linspace(0, 700, 82)
    X_fine = np.array([[a_fine, b_fine] for a_fine, b_fine in zip(x_fine, y_fine)])

    x_grid, y_grid = np.meshgrid(x_fine, y_fine)

    # Instantiate a Gaussian Process model
    kernel = C(1.0, (1e-3, 1e3)) * RBF(10, (1e-2, 1e2))
    gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9)

    # Fit to data using Maximum Likelihood Estimation of the parameters
    gp.fit(X, z)
    z_grid, sigma = gp.predict(X_fine, return_std=True)

    # Remove the colorbar created by the previous plot, if any
    # To avoid a new colorbar being plotted alongside the previous one each time a different date is selected
    try:
        cb = p.colorbar
        cb.remove()
    except:
        pass

    # plot the pcolor on the Axes. Use alpha to set the transparency
    p = ax.pcolor(x_grid, y_grid, np.meshgrid(z_grid, y_fine)[0], alpha=0.3)
    ax.invert_yaxis() #invert Y axis for X and Y to have same starting point

    # Add a colorbar for the pcolor field
    fig.colorbar(p,ax=ax)

Answer 1

我猜想高斯过程的尺度参数在x和y方向上差别很大，x尺度参数小，y尺度参数相对较大。这样，x 距离小的两个点相关性低，y 距离小的两个点相关性高：这会在温度曲线中创建垂直 "bands"。

我能够用以下数据模拟这个，OpenTURNS。你没有提供温度，所以我不得不从图形中猜测它们。

import numpy as np
import openturns as ot
coordinates = ot.Sample([[100.0,100.0],[500.0,100.0],[900.0,100.0], \
                         [100.0,350.0],[500.0,350.0],[900.0,350.0], \
                         [100.0,600.0],[500.0,600.0],[900.0,600.0]])
observations = ot.Sample([25.0,25.0,10.0,20.0,25.0,20.0,15.0,25.0,25.0],1)

# Extract coordinates
x = np.array(coordinates[:,0])
y = np.array(coordinates[:,1])

# Plot the data with a scatter plot and a color map
import matplotlib.pyplot as plt
fig = plt.figure()
plt.scatter(x, y, c=observations, cmap='viridis')
plt.colorbar()
plt.show()

这会产生：

可以使用以下脚本从中拟合克里格元模型。我使用了平方指数协方差模型。

def fitKriging(coordinates, observations, covarianceModel, basis):
    '''
    Fit the parameters of a kriging metamodel. 
    '''
    algo = ot.KrigingAlgorithm(coordinates, observations, covarianceModel, basis)
    algo.run()
    krigingResult = algo.getResult()
    krigingMetamodel = krigingResult.getMetaModel()
    return krigingResult, krigingMetamodel

inputDimension = 2
basis = ot.ConstantBasisFactory(inputDimension).build()
covarianceModel = ot.SquaredExponential([1.]*inputDimension, [1.0])
krigingResult, krigingMetamodel = fitKriging(coordinates, observations, covarianceModel, basis)

为了绘制这个克里格元模型的预测，我使用了以下基于 pcolor 函数的脚本。

def plotKrigingPredictions(krigingMetamodel):
    '''
    Plot the predictions of a kriging metamodel. 
    '''
    # Create the mesh of the box [0., 1000.] * [0., 700.]
    myInterval = ot.Interval([0., 0.], [1000., 700.])
    # Define the number of interval in each direction of the box
    nx = 20
    ny = 20
    myIndices = [nx-1, ny-1]
    myMesher = ot.IntervalMesher(myIndices)
    myMeshBox = myMesher.build(myInterval)

    # Predict
    vertices = myMeshBox.getVertices()
    predictions = krigingMetamodel(vertices)

    # Format for plot
    X = np.array(vertices[:,0]).reshape((ny,nx))
    Y = np.array(vertices[:,1]).reshape((ny,nx))
    predictions_array = np.array(predictions).reshape((ny,nx))

    # Plot
    plt.figure()
    plt.pcolor(X, Y, predictions_array)
    plt.colorbar()
    plt.show()
    return

plotKrigingPredictions(krigingMetamodel)

这会产生：

您可以看到与预测中相同的波段。

查看协方差模型解释了原因：

>>> covarianceModel = krigingResult.getCovarianceModel()
>>> print(covarianceModel)
SquaredExponential(scale=[0.167256,1.5929], amplitude=[6.75753])

x 尺度为 0.1672，比 y 尺度 1.593 小得多。这是因为我们使用了各向异性协方差模型，其中 x 尺度 theta 可以不同于 y 尺度 theta。

为了解决这个问题，我们可以使用各向同性协方差模型，其中 x 尺度保持等于 y 尺度。然而，仅将 x 和 y 标度设置为给定值并仅估计振幅参数 sigma 会更简单。

以下脚本将尺度设置为先前估计的平均值，并且仅估计 sigma。

scales = covarianceModel.getScale()
meanScale = (scales[0]+scales[1])/2.0
covarianceModel.setScale([meanScale]*2)
covarianceModel.setActiveParameter([2]) # Enable sigma (amplitude) only
# Learn amplitude only
krigingResult, krigingMetamodel = fitKriging(coordinates, observations, covarianceModel, basis)
covarianceModel = krigingResult.getCovarianceModel()
print("Covariance model=",covarianceModel)

这会打印：

Covariance model= SquaredExponential(scale=[0.880079,0.880079], amplitude=[6.1472])

最后，脚本如下：

plotKrigingPredictions(krigingMetamodel)

地块：

由于 theta 参数在两个方向上都相同，因此温度现在是球形的，正如我猜你所期望的那样。

高斯过程回归（Kriging）与径向基函数插值

Gaussian process regression (Kriging) vs Radial Basis Function interpolation

python

interpolation

scipy

scikit-learn