Python

Question

给定一个 templates 元组 (region, calc_3d_harmonics(region)) 列表，其中 calc_3d_harmonics 是某个函数 returns 每个区域的签名，我需要找到具有最小值的区域分数（实际分数无关紧要）。

区域的得分由 calc_harmonics_distance(calc_3d_harmonics(region),query_harmonics, radius) 给出，该函数计算给定半径的两个谐波特征之间的距离（query_harmonics 和半径是预先计算的）。

我目前的解决方案是：

query_harmonics = calc_3d_harmonics(query_region)
ref_region, score = min(templates, key=lambda t: calc_3d_harmonics_distance(t[1], query_harmonics, radius))

一位团队成员建议我改用以下内容：

query_harmonics = calc_3d_harmonics(query_region)
ref_region, score = min([(t[0], calc_harmonics_distance(t[1], query_harmonics, radius)) for t in templates], key=lambda x: x[1])

注意：calc_3d_harmonics 和 calc_harmonics_distance 都是很慢很重的函数。此外，score 可以替换为 _。

他声称他的建议可能会带来更好的运行时间（尽管这并不重要，因为谐波函数是主要操作）。如果 min(list, key=func) 创建一个密钥列表，那么我们的版本是等效的（我的版本更短），但如果它每次都计算密钥，他认为我的会更慢。

哪种方式更快？我认为必须有更好的（运行时方面）方法来做到这一点（也许使用 numpy？）并且想听听一些建议。

Answer 1

如有疑问，请勿猜测，profile it。

把你所有的代码放在后面，我们可以参考cPython实现。我们可以看到，min 函数使用了 min_max helper。在这个助手中，我们可以找到 key function is computed.

的位置

最小摘录为：

while (( item = PyIter_Next(it) )) {
    /* get the value from the key function */
    if (keyfunc != NULL) {
        val = PyObject_CallFunctionObjArgs(keyfunc, item, NULL);
        if (val == NULL)
            goto Fail_it_item;
    }
    /* no key function; the value is the item */
    else {
        val = item;
        Py_INCREF(val);
    }
    // comparision logic for min/max
}

源代码清楚地指出，键函数为排序的可迭代对象中的每个元素计算一次。另一方面，键函数结果在排序完成后被丢弃。因此，情况归结为您是否计划稍后重用关键函数值。

Answer 2

min(lst, key=func) 在 lst 的每个项目上调用 func 一次（这也适用于 max、list.sort 和 sorted).因此，如果 lst 包含重复项，则键函数会做不必要的工作，除非您使用记忆键函数。

为了说明，这里有几个关键函数，它们在调用时打印它们的 arg。 kf 是一个普通的键函数，kf_cached 使用默认的可变字典来做记忆。

def kf(n):
    print(' Key', n)
    return int(n)

def kf_cached(n, cache={}):
    if n in cache:
        print(' Cached', n)
        return cache[n]
    print(' Key', n)
    cache[n] = k = int(n)
    return k

a = '14142'

u = max(a, key=kf)
print('max', u, '\n')

u = max(a, key=kf_cached)
print('max', u)

输出

 Key 1
 Key 4
 Key 1
 Key 4
 Key 2
max 4 

 Key 1
 Key 4
 Cached 1
 Cached 4
 Key 2
max 4

Python - 使用 min(list, key=func) 如何提高效率的元组列表的最小值

Python - Minimum of list of tuples using min(list, key=func) how to improve efficiency

tuples

numpy

list

min