Generator not closing over data as expected
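Sorry if the title is poorly worded, I'm not sure how to phrase it. I have a function that basically iterates over the second dimension of a two-dimensional iterable. Here is a simple reproduction: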
words = ['ACGT', 'TCGA']

def make_lists():
    for i in range(len(words[0])):
        iter_ = iter([word[i] for word in words])
        yield iter_

lists = list(make_lists())
for list_ in lists:
    print(list(list_))
Running this outputs:
['A', 'T']
['C', 'C']
['G', 'G']
['T', 'A']
I want to yield generators instead of having to evaluate words up front, in case words is very long, so I tried the following:
words = ['ACGT', 'TCGA']

def make_generators():
    for i in range(len(words[0])):
        gen = (word[i] for word in words)
        yield gen

generators = list(make_generators())
for gen in generators:
    print(list(gen))
However, running this outputs:
['T', 'A']
['T', 'A']
['T', 'A']
['T', 'A']
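I'm not sure exactly what is happening. I suspect it has something to do with the generator expression not closing over its scope when it is yielded, so all of the generators end up sharing the same state. If I create the generator in a separate function and yield that function's return value, it seems to work.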
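i is a free variable for those generators, and by the time they run they use its last value, i.e. 3. Put simply, they know where to fetch i from, but they do not know the value i had when they were created. So something like this: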
def make_iterator():
    for i in range(len(words[0])):
        gen = (word[i] for word in words)
        yield gen
    i = 0  # Modified the value of i
will result in:
['A', 'T']
['A', 'T']
['A', 'T']
['A', 'T']
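One way to check that the generator holds a reference to the variable rather than a copy of its value is to look at its code object. This is only a sketch using the standard gi_code and co_freevars attributes; the names first, free_vars and first_values are made up for illustration:

```python
words = ['ACGT', 'TCGA']

def make_iterator():
    for i in range(len(words[0])):
        yield (word[i] for word in words)

generators = list(make_iterator())
first = generators[0]

# i is listed as a free variable: the generator looks it up in
# make_iterator's scope each time its body runs.
free_vars = first.gi_code.co_freevars
print(free_vars)

# The loop has already finished, so even the first generator
# sees the last value of i.
first_values = list(first)
print(first_values)
```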
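Generator expressions are implemented as a function scope. A list comprehension, on the other hand, runs right away and can fetch the value of i during that very iteration. (List comprehensions are implemented as a function scope in Python 3 as well; the difference is that they are not lazy.)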
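The difference between the lazy and the eager form can be seen in a few lines. This is a small sketch, not part of the original code; demo, lazy and eager are hypothetical names:

```python
def demo():
    i = 0
    lazy = (i for _ in range(2))   # body has not run yet; i is looked up later
    eager = [i for _ in range(2)]  # body runs now, while i is still 0
    i = 3
    # The generator body only runs here, after i was rebound to 3.
    return list(lazy), eager

lazy_result, eager_result = demo()
print(lazy_result)   # [3, 3]
print(eager_result)  # [0, 0]
```

Both expressions read the same variable. Only the moment at which they read it differs.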
A fix is to use an inner function that captures the actual value of i in each loop iteration through a default argument value:
words = ['ACGT', 'TCGA']

def make_iterator():
    for i in range(len(words[0])):
        # default argument value is calculated at the time of
        # function creation, hence for each generator it is going
        # to be the value at the time of that particular iteration
        def inner(i=i):
            return (word[i] for word in words)
        yield inner()

generators = list(make_iterator())
for gen in generators:
    print(list(gen))
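The separate-function approach mentioned in the question should work for the same reason: a parameter is a fresh binding on every call. A minimal sketch, where column is a hypothetical helper name and result is only there to collect the output:

```python
words = ['ACGT', 'TCGA']

def column(i):
    # i is a parameter here, so each call gets its own binding
    # that later loop iterations cannot change.
    return (word[i] for word in words)

def make_iterator():
    for i in range(len(words[0])):
        yield column(i)

generators = list(make_iterator())
result = [list(gen) for gen in generators]
print(result)
```

Whether to use a helper function or a default argument is mostly a matter of taste. Both give each generator its own i.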
You may also want to read:
- What do (lambda) function closures capture?
- Python internals: Symbol tables, part 1