计算 2D 插值积分时出错。比较 numpy 数组

Question

我的优化任务涉及计算以下积分并找到 xl 和 xu 的最佳值：

迭代花费的时间太长，因此我决定通过计算所有可能值 xl 和 xu 的积分来加快迭代速度，然后在优化期间插入计算值。

我写了下面的函数：

def k_integrand(x, xl, xu):
    return((x**2)*mpmath.exp(x))/((xu - xl)*(mpmath.exp(x)-1)**2)
@np.vectorize   
def K(xl, xu):
    y, err = integrate.quad(k_integrand, xl, xu, args = (xl, xu))
    return y

和两个相同的数组 grid_xl 和 grid_xu 具有动态增量值。

当我运行我得到这个代码时：

K(grid_xl, grid_xu)
Traceback (most recent call last):

  File "<ipython-input-2-5b9df02f12b7>", line 1, in <module>
    K(grid_xl, grid_xu)

  File "C:/Users/909404/OneDrive/Работа/ZnS-FeS/Теплоемкость/Python/CD357/4 - Optimization CD357 interpolation.py", line 75, in K
    y, err = integrate.quad(k_integrand, xl, xu, args = (xl, xu))

  File "C:\Users9404\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py", line 323, in quad
    points)

  File "C:\Users9404\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py", line 372, in _quad
    if (b != Inf and a != -Inf):

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

我猜这是因为 xl 应该总是小于 xu。如果 xl>=xu?

有什么方法可以比较 xl 和 xu 和 return NaN

最后我想要这样的东西：

并且能够使用插值。

可能是我选错了方式？如果有任何帮助，我将不胜感激。

Answer 1

除非省略 np.vectorize 装饰器，否则我无法重现您的错误。设置重合的 xl/xu 值确实给了我一个 ZeroDivisionError。

无论如何，没有什么能阻止您在更高级别的函数中检查 xu 与 xl 的值。这样你就可以完全跳过无意义数据点的集成，并且 return np.nan 尽早：

import numpy as np
import mpmath
import scipy.integrate as integrate

def k_integrand(x, xl, xu):    
    return ((x**2)*mpmath.exp(x))/((xu - xl)*(mpmath.exp(x)-1)**2)

@np.vectorize   
def K(xl, xu):
    if xu <= xl:
        # don't even try to integrate
        return np.nan
    y, err = integrate.quad(k_integrand, xl, xu, args = (xl, xu))
    return y

grid_xl = np.linspace(0.1,1,10)        # shape (10,) ~ (1,10)
grid_xu = np.linspace(0.5,4,8)[:,None] # shape (8,1)

根据这些定义，我得到了（遵循 np.set_printoptions(linewidth=200) 以便于比较：

In [35]: K(grid_xl, grid_xu)
Out[35]: 
array([[0.99145351, 0.98925197, 0.98650808, 0.98322919,        nan,        nan,        nan,        nan,        nan,        nan],
       [0.97006703, 0.96656815, 0.96254363, 0.95800307, 0.95295785, 0.94742104, 0.94140733, 0.93493293, 0.9280154 ,        nan],
       [0.93730403, 0.93263063, 0.92745487, 0.92178832, 0.91564423, 0.90903747, 0.90198439, 0.89450271, 0.88661141, 0.87833062],
       [0.89565597, 0.88996696, 0.88380385, 0.87717991, 0.87010995, 0.8626103 , 0.85469862, 0.84639383, 0.83771595, 0.82868601],
       [0.84794429, 0.8414176 , 0.83444842, 0.82705134, 0.81924245, 0.81103915, 0.8024601 , 0.79352503, 0.7842547 , 0.77467065],
       [0.79692339, 0.78974   , 0.78214742, 0.77416128, 0.76579857, 0.75707746, 0.74801726, 0.73863822, 0.72896144, 0.71900874],
       [0.7449893 , 0.73732055, 0.7292762 , 0.72087263, 0.71212741, 0.70305921, 0.69368768, 0.68403329, 0.67411725, 0.66396132],
       [0.69402415, 0.68602325, 0.67767956, 0.66900991, 0.66003222, 0.65076537, 0.6412291 , 0.63144388, 0.62143077, 0.61121128]])

您可以看到这些值与您的链接图像完全一致。

现在，我有一个坏消息和一个好消息。坏消息是，虽然 np.vectorize 提供了围绕使用数组输入调用标量积分函数的语法糖，但与本机 for 循环相比，它实际上不会为您提供加速。好消息是，您可以将对 mpmath.exp 的调用替换为对 np.exp 的调用，并且您最终会更快地得到相同的结果：

def k_integrand_np(x, xl, xu):    
    return ((x**2)*np.exp(x))/((xu - xl)*(np.exp(x)-1)**2)

@np.vectorize   
def K_np(xl, xu):
    if xu <= xl:
        # don't even try to integrate
        return np.nan
    y, err = integrate.quad(k_integrand_np, xl, xu, args = (xl, xu))
    return y

有了这些定义

In [14]: res_mpmath = K(grid_xl, grid_xu)
    ...: res_np = K_np(grid_xl, grid_xu)
    ...: inds = ~np.isnan(res_mpmath)
    ...: 

In [15]: np.array_equal(res_mpmath[inds], res_np[inds])
Out[15]: True

In [16]: %timeit K(grid_xl, grid_xu)
107 ms ± 521 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [17]: %timeit K_np(grid_xl, grid_xu)
7.26 ms ± 157 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

所以这两种方法给出了相同的结果（完全正确！），但是 numpy 版本快了将近 15 倍。

Answer 2

使用低级回调函数进行集成

以下回答是对@Andras Deak 回答的评论，但需要评论。

scipy集成函数多次调用k_integrand_np函数，速度很慢。使用纯 Python 函数的替代方法是使用 low-level callback function. This function can be written directly in C or in Python using a compiler like Numba. The following is a slightly modified version of this

例子

import time
import numpy as np
import numba
import scipy.integrate as integrate
from scipy import LowLevelCallable
from numba import cfunc
from numba.types import intc, CPointer, float64


##wrapper for a function that takes 3 input values
def jit_integrand_function(integrand_function):
    jitted_function = numba.njit(integrand_function)
    
    @cfunc(float64(intc, CPointer(float64)))
    def wrapped(n, xx):
        return jitted_function(xx[0], xx[1],xx[2])
    return LowLevelCallable(wrapped.ctypes)

#your function to integrate
def k_integrand_np(x, xl, xu):
  return ((x**2)*np.exp(x))/((xu - xl)*(np.exp(x)-1)**2)

#compile integrand
k_integrand_nb=jit_integrand_function(k_integrand_np)

#now we can use the low-level callable
@np.vectorize
def K_nb(xl, xu):
    if xu <= xl:
        # don't even try to integrate
        return np.nan
    y, err = integrate.quad(k_integrand_nb, xl, xu, args = (xl, xu))
    return y

#for comparison
@np.vectorize
def K_np(xl, xu):
    if xu <= xl:
        # don't even try to integrate
        return np.nan
    y, err = integrate.quad(k_integrand_np, xl, xu, args = (xl, xu))
    return y

性能

#create some data
grid_xl = np.linspace(0.1,1,500)      
grid_xu = np.linspace(0.5,4,800)[:,None] 

t1=time.time()
res_nb = K_nb(grid_xl, grid_xu)
print(time.time()-t1)
t1=time.time()
res_np = K_np(grid_xl, grid_xu)
print(time.time()-t1)

inds = ~np.isnan(res_nb)
np.allclose(res_nb[inds], res_np[inds])

K_np: 24.58s
K_nb: 0.97s (25x speedup)
allclose: True

计算 2D 插值积分时出错。比较 numpy 数组

Error in calculating integral for 2D interpolation. Comparing numpy arrays

python

integration

interpolation

numpy

scipy

使用低级回调函数进行集成