Generator not closing over data as expected

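Sorry if the title is poorly worded; I wasn't sure how to phrase it. I have a function that basically iterates over the second dimension of a two-dimensional iterable. Here is a simple reproduction: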

words = ['ACGT', 'TCGA']

def make_lists():
    for i in range(len(words[0])):
        iter_ = iter([word[i] for word in words])
        yield iter_

lists = list(make_lists())

for list_ in lists:
    print(list(list_))

Running this outputs:

['A', 'T']
['C', 'C']
['G', 'G']
['T', 'A']

I would like to yield generators rather than build a full list from words each time, in case words is very long, so I tried the following:

words = ['ACGT', 'TCGA']

def make_iterator():
    for i in range(len(words[0])):
        gen = (word[i] for word in words)
        yield gen

generators = list(make_iterator())

for gen in generators:
    print(list(gen))

However, running this outputs:

['T', 'A']
['T', 'A']
['T', 'A']
['T', 'A']

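I'm not sure exactly what is happening. I suspect it has something to do with the generator expressions not closing over their scope when they are yielded, so they all share it. If I create the generator in a separate function and yield that function's return value, it seems to work.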

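i is now a free variable for those generators, so by the time they are consumed they all use its last value, i.e. 3. Simply put, they know where to fetch i from, but they do not know the actual value i had when they were created. So, something like this: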

def make_iterator():
    for i in range(len(words[0])):
        gen = (word[i] for word in words)
        yield gen
    i = 0  # Modified the value of i 

will result in:

['A', 'T']
['A', 'T']
['A', 'T']
['A', 'T']

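Generator expressions are implemented as a function scope whose body runs lazily. A list comprehension, on the other hand, runs right away and can fetch the value of i during that very iteration. (Well, list comprehensions also have their own scope in Python 3, but the difference is that they are not lazy.)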
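The difference between lazy and eager evaluation can be seen in isolation with a minimal sketch. The names lazy, eager and lazy_result are made up for this illustration and are not part of the question's code:

```python
i = 0
lazy = (i for _ in range(2))    # body has not run yet; i is looked up on iteration
eager = [i for _ in range(2)]   # body runs immediately, while i is still 0

i = 99                          # rebind i before the generator is consumed

lazy_result = list(lazy)
print(lazy_result)  # [99, 99]
print(eager)        # [0, 0]
```

The generator only looks i up when it is iterated, so it sees the rebinding; the list was already built before i changed.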

A fix is to use an inner function that captures the actual value of i on each loop iteration through a default argument value:

words = ['ACGT', 'TCGA']

def make_iterator():
    for i in range(len(words[0])):
        # default argument value is calculated at the time of
        # function creation, hence for each generator it is going
        # to be the value at the time of that particular iteration  
        def inner(i=i):
            return (word[i] for word in words)
        yield inner()

generators = list(make_iterator())

for gen in generators:
    print(list(gen))
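The question mentions that building the generator in a separate function also seems to work. A rough sketch of that variant follows; the helper name column is hypothetical. It should behave the same way as the default-argument version, because i becomes a parameter of the helper and each call therefore gets its own binding:

```python
words = ['ACGT', 'TCGA']

def column(i):
    # i is a local parameter here, so every call has its own i
    # that later loop iterations cannot rebind
    return (word[i] for word in words)

def make_iterator():
    for i in range(len(words[0])):
        yield column(i)

generators = list(make_iterator())
result = [list(gen) for gen in generators]
print(result)
# [['A', 'T'], ['C', 'C'], ['G', 'G'], ['T', 'A']]
```

Which of the two forms to use is mostly a matter of taste: the default argument keeps everything in one function, while the helper makes the captured value explicit in a signature.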

You may also want to read: