Vectorisation of complex function

Question

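I am trying to use vectorisation to speed up some of my for loops. Unfortunately, the function inside the loop is more complex than simple arithmetic. I want to take each item of an array, operate on it with a function that takes several inputs, and put the result in the right place in a dictionary. To do this I have a function called increase_element: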

def increase_element(number, word, factor=0.05):
    print(2.*factor)
    return {'factor': 2.*factor, 'number': number, 'word': word}

What I want to achieve is to go from the array:

array([0., 0.1, 0.2])

to the array:

array([
       {'factor': 0., 'number': 5, 'word': 'hi'},
       {'factor': 0.2, 'number': 5, 'word': 'hi'},
       {'factor': 0.4, 'number': 5, 'word': 'hi'}
      ])

in an efficient way (i.e. without a for loop), because in practice the function increase_element takes a long time to run.

What I tried was to write a wrapper function that takes all the inputs as a single argument, and then map it over a numpy array:

import numpy as np

muls = np.linspace(0, 1, 11)

def increase_element(number, word, factor=0.05):
    print(2.*factor)
    return {'factor': 2.*factor, 'number': number, 'word': word}

def single_increase_element(inputs):
    return increase_element(inputs[0], inputs[1], factor=inputs[2])

single_array = np.array(list(map(lambda x: (5, 'hi', x), muls)))

np.array(list(map(single_increase_element, single_array)))

But when it tries to print 2.*factor, I get the following error:

TypeError: can't multiply sequence by non-int of type 'float'

Any suggestions would be much appreciated!

Answer 1

The factor in your print call is of type <class 'numpy.str_'>. The error occurs because you are trying to multiply a string by 2.0.
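To see where the string comes from, here is a minimal sketch (the two-row array is made up for illustration and is not the asker's full data). When np.array receives tuples that mix an int, a str and a float, it has no common numeric type to use, so it stores every element as a string:

```python
import numpy as np

# Mixed int/str/float tuples have no common numeric type, so NumPy
# falls back to a fixed-width Unicode string dtype for every element.
single_array = np.array([(5, 'hi', 0.1), (5, 'hi', 0.2)])
print(single_array.dtype.kind)     # 'U' means Unicode string
print(type(single_array[0][2]))    # <class 'numpy.str_'>

# Multiplying that string by a float raises the TypeError from the question.
try:
    2. * single_array[0][2]
    raised = False
except TypeError:
    raised = True
print(raised)
```

So by the time increase_element is called, factor is the string '0.1' rather than the float 0.1.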

As for the code, I suggest you explain what you are trying to achieve, because it is hard to follow.

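Edit: I have fixed your code below. The root cause was that the array was built without an object dtype, so NumPy converted the mixed (int, str, float) tuples to strings and the numbers were treated as strings. Note that the np.object alias was deprecated in NumPy 1.20 and removed in 1.24; the builtin object is equivalent.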

import numpy as np

muls = np.linspace(0, 1, 11)

def increase_element(number, word, factor=0.05):
    print(2.*factor)
    return {'factor': 2.*factor, 'number': number, 'word': word}

def single_increase_element(inputs):
    return increase_element(*inputs)  # fixed
    # return increase_element(inputs[0], inputs[1], factor=inputs[2]) # original

# single_array = np.array(list(map(lambda x: (5, 'hi', x), muls)))  # original
single_array = np.array(list(map(lambda x: (5, 'hi', x), muls)), dtype=object)  # fixed (np.object was removed in NumPy 1.24; use the builtin object)

np.array(list(map(single_increase_element, single_array)))
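If the intermediate array of tuples is not needed for anything else, the string conversion can be avoided by never building it. The following is only a sketch of that alternative, not part of the fix above: it calls increase_element directly for each factor and collects the dictionaries in an object array (the print call is left out to keep the output short, and the name results is made up here).

```python
import numpy as np

def increase_element(number, word, factor=0.05):
    return {'factor': 2.*factor, 'number': number, 'word': word}

muls = np.linspace(0, 1, 11)

# Pass the fixed arguments directly; only the factor varies per call.
# dtype=object makes the intent explicit: each element is a dict.
results = np.array(
    [increase_element(5, 'hi', factor=m) for m in muls],
    dtype=object,
)
print(results.shape)
print(results[1])
```

Keep in mind that this is still a Python-level loop, as is map, and np.vectorize is documented as a convenience wrapper that is essentially a for loop as well. None of them will make a slow increase_element run faster; a real speed-up would require rewriting the function body in terms of array operations or running the calls in parallel.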

python

numpy

vectorization