Python 3 nested comprehension

Is there a clever list/dictionary comprehension that produces the expected output below, given the following?

import numpy as np
freq_mat = np.random.randint(2, size=(4, 5))
tokens = ['a', 'b', 'c', 'd', 'e']
labels = ['X', 'S', 'Y', 'S']

For a freq_mat such as

array([[1, 0, 0, 1, 1],
       [0, 0, 0, 0, 1],
       [1, 0, 1, 1, 0],
       [0, 1, 0, 0, 0]])

the expected output should look like the following:

[({'a': True, 'b': False, 'c': False, 'd': True, 'e': True}, 'X'),
 ({'a': False, 'b': False, 'c': False, 'd': False, 'e': True}, 'S'),
 ({'a': True, 'b': False, 'c': True, 'd': True, 'e': False}, 'Y'),
 ({'a': False, 'b': True, 'c': False, 'd': False, 'e': False}, 'S')]

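You can collapse that code into a single comprehension. Note that each token has to be paired with the value in the same position of the row using zip; nested for clauses would overwrite every key with the last value in the row.

Code:

featureset = [
    ({key: val > 0 for key, val in zip(tokens, row)}, label)
    for row, label in zip(freq_mat, labels)]

Test Code:

import numpy as np

freq_mat = np.random.randint(2, size=(4, 5))
tokens = ['a', 'b', 'c', 'd', 'e']
labels = ['X', 'S', 'Y', 'S']

# Reference implementation with explicit loops
featureset2 = []
for row, label in zip(freq_mat, labels):
    d = dict()
    for key, val in zip(tokens, row):
        d[key] = val > 0
    featureset2.append((d, label))

# Same logic as a nested comprehension
featureset = [
    ({key: val > 0 for key, val in zip(tokens, row)}, label)
    for row, label in zip(freq_mat, labels)]

assert featureset == featureset2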

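As you noted in your updated post, your original code does not work correctly: it assigns the same value to every key in a given row, so each dictionary ends up all True or all False. The simplest correction to your original code is this: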

featureset = []
for row, label in zip(freq_mat, labels):
    d = dict()
    for key, val in zip(tokens, row):  # The critical bit
        d[key] = val > 0
    featureset.append((d, label))
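
To make the difference concrete, here is a minimal sketch that runs both variants on the first row of the example matrix from the question. The names nested and paired are only illustrative, and the bool() calls are an added assumption to get plain Python booleans instead of NumPy ones.

```python
import numpy as np

tokens = ['a', 'b', 'c', 'd', 'e']
row = np.array([1, 0, 0, 1, 1])  # first row of the example freq_mat

# Nested loops: every key is overwritten once per value in the row,
# so only the last value (1 here) survives for all keys.
nested = {}
for key in tokens:
    for val in row:
        nested[key] = bool(val > 0)

# zip pairs each key with the value in the same position.
paired = {key: bool(val > 0) for key, val in zip(tokens, row)}

print(nested)  # {'a': True, 'b': True, 'c': True, 'd': True, 'e': True}
print(paired)  # {'a': True, 'b': False, 'c': False, 'd': True, 'e': True}
```

The nested version depends only on the last element of the row, which is why each dictionary comes out uniformly True or uniformly False.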

A more compact version, which I think is still more readable than the single-comprehension approach:

featureset = []
for row, label in zip(freq_mat, labels):
    d = {key: val > 0 for key, val in zip(tokens, row)}
    featureset.append((d, label))

Or as a one-liner:

featureset = [({key: val > 0 for key, val in zip(tokens, row)}, label)
              for row, label in zip(freq_mat, labels)]

Personally, I would probably go with the second approach, as a compromise between brevity and readability. But that is of course up to you!
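
As a quick sanity check, the sketch below rebuilds the example matrix from the question and compares the result with the expected output. It is a variation rather than one of the approaches above: it does the > 0 comparison once for the whole array and uses .tolist() to get plain Python booleans, and the names flags and expected are only illustrative.

```python
import numpy as np

freq_mat = np.array([[1, 0, 0, 1, 1],
                     [0, 0, 0, 0, 1],
                     [1, 0, 1, 1, 0],
                     [0, 1, 0, 0, 0]])
tokens = ['a', 'b', 'c', 'd', 'e']
labels = ['X', 'S', 'Y', 'S']

# Compare the whole matrix at once, then convert to nested lists of bool.
flags = (freq_mat > 0).tolist()

# Pair each token with the flag in the same column, and each row with its label.
featureset = [(dict(zip(tokens, row)), label)
              for row, label in zip(flags, labels)]

expected = [
    ({'a': True, 'b': False, 'c': False, 'd': True, 'e': True}, 'X'),
    ({'a': False, 'b': False, 'c': False, 'd': False, 'e': True}, 'S'),
    ({'a': True, 'b': False, 'c': True, 'd': True, 'e': False}, 'Y'),
    ({'a': False, 'b': True, 'c': False, 'd': False, 'e': False}, 'S'),
]
assert featureset == expected
```

Whether this reads better than the explicit comprehension is a matter of taste; it mainly helps if plain bool values are needed, for example when the result is serialized.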