Does max() on a generator build a list-like object, or does it work more efficiently?
Suppose I have a directory containing files named 'filename_1', 'filename_2', and so on, and a generator models_paths that I use to find the newest number:
from pathlib import Path

mypath = 'my/path/filename'
models_paths = Path(mypath).parent.glob(Path(mypath).name + '*')
number_newest = max(int(str(file_path).split('_')[-1]) for file_path in models_paths)
I wonder whether max builds a list-like data structure or uses an algorithm like this:
number_newest = None
for file_path in models_paths:
    number_current = int(str(file_path).split('_')[-1])
    number_newest = number_current if number_newest is None else max(number_current, number_newest)
In other words, would I lose processing efficiency and/or memory efficiency if I wrote the following instead?
mypath = 'my/path/filename'
models_paths = Path(mypath).parent.glob(Path(mypath).name + '*')
models_paths = list(models_paths)
number_newest = max(int(str(file_path).split('_')[-1]) for file_path in models_paths)
max does not build a list.
This can be demonstrated clearly with a custom object:
class Thing:
    def __init__(self, x):
        self.x = x
        print(f'creating {x}')
    def __lt__(self, other):
        return self.x < other.x
    def __del__(self):
        print(f'destroying {self.x}')
    def __str__(self):
        return f'<{self.x}>'

print(max(Thing(i) for i in range(5)))
which gives:
creating 0
creating 1
destroying 0
creating 2
destroying 1
creating 3
destroying 2
creating 4
destroying 3
<4>
destroying 4
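As you can see, the __del__ method is called on each object as soon as it is no longer the one holding the maximum value (CPython's reference counting reclaims an object as soon as nothing refers to it). That would not happen if the objects were being appended to a list.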
Contrast this with:
print(max([Thing(i) for i in range(5)]))
which gives:
creating 0
creating 1
creating 2
creating 3
creating 4
destroying 3
destroying 2
destroying 1
destroying 0
<4>
destroying 4
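The memory side of the question can also be checked with a rough measurement. The following is only a sketch, not part of the original answer: it assumes CPython and uses the standard-library tracemalloc module to compare peak allocations for max over a generator expression versus max over a fully built list. The helper name peak_bytes, the size N, and the two wrapper functions are made up for illustration.

```python
import tracemalloc

N = 200_000  # arbitrary size chosen for illustration


def peak_bytes(func):
    """Run func() and return the peak number of bytes traced while it ran."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def with_generator():
    # Items are produced one at a time; max keeps only the current best.
    return max(i * 2 for i in range(N))


def with_list():
    # The whole list exists in memory before max starts scanning it.
    return max([i * 2 for i in range(N)])


peak_gen = peak_bytes(with_generator)
peak_list = peak_bytes(with_list)
print(f'generator peak: {peak_gen} bytes')
print(f'list peak:      {peak_list} bytes')
```

Exact numbers depend on the interpreter version and platform, but the list variant should show a much higher peak because all N items are alive at once.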
You can write a (less efficient) equivalent function and show that it does the same thing:
def mymax(things):
    empty = True
    for thing in things:
        if empty or (thing > maximum):  # parentheses for clarity only
            maximum = thing
            empty = False
    if empty:
        raise ValueError
    return maximum
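To check that mymax behaves like the built-in, one option is to record the lifecycle events in a list instead of printing them and compare the two traces. This is a sketch under assumptions: Tracked and lifecycle are hypothetical names introduced here, and the immediate __del__ calls rely on CPython's reference counting.

```python
class Tracked:
    """Hypothetical stand-in for Thing that logs events instead of printing."""

    def __init__(self, x, log):
        self.x = x
        self.log = log
        log.append(f'creating {x}')

    def __lt__(self, other):
        return self.x < other.x

    def __del__(self):
        self.log.append(f'destroying {self.x}')


def mymax(things):
    # Same logic as the function above: keep only the best item seen so far.
    empty = True
    for thing in things:
        if empty or (thing > maximum):
            maximum = thing
            empty = False
    if empty:
        raise ValueError
    return maximum


def lifecycle(max_func):
    """Return the creation/destruction trace produced by max_func on a generator."""
    log = []
    result = max_func(Tracked(i, log) for i in range(5))
    log.append(f'result {result.x}')
    del result  # drop the last reference so the winner is destroyed too
    return log


print(lifecycle(max) == lifecycle(mymax))
```

On CPython both traces interleave creation and destruction in the same order, which is consistent with max holding on to a single item at a time.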