Only using items from one list once in nested list comprehension

Question

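I'm trying to use a list comprehension to build a new list in which each item is a letter taken from list1, followed (after a colon) by the words from list2 that start with that letter. I managed to code this with nested for loops as follows: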

list1=["A","B"]
list2=["Apple","Banana","Balloon","Boxer","Crayons","Elephant"]

newlist=[]
for i in list1:
    newlist.append(i+":")
    for j in list2:
        if j[0]==i:
            newlist[-1]+=j+","

which produces the expected result: ['A:Apple,', 'B:Banana,Balloon,Boxer,']

Trying to do the same with a list comprehension, I came up with the following:

list1=["A","B"]
list2=["Apple","Banana","Balloon","Boxer","Crayons","Elephant"]

newlist=[i+":"+j+"," for i in list1 for j in list2 if i==j[0]]

which produces: ['A:Apple,', 'B:Banana,', 'B:Balloon,', 'B:Boxer,']

Here, every time a word starting with that letter is found, a new item is created in newlist, whereas my intention is to have a single item per letter.
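Unrolled into explicit loops, the comprehension above is roughly equivalent to the following sketch (the name unrolled is only used here for illustration). It shows that the append runs once per matching letter/word pair rather than once per letter:

```python
list1 = ["A", "B"]
list2 = ["Apple", "Banana", "Balloon", "Boxer", "Crayons", "Elephant"]

# Equivalent of:
# [i + ":" + j + "," for i in list1 for j in list2 if i == j[0]]
unrolled = []
for i in list1:
    for j in list2:
        if i == j[0]:
            # One append per matching (letter, word) pair,
            # so each matching word becomes its own item.
            unrolled.append(i + ":" + j + ",")

print(unrolled)  # ['A:Apple,', 'B:Banana,', 'B:Balloon,', 'B:Boxer,']
```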

Is there a way to edit the list comprehension code to get the same result as with the nested for loops?

Answer 1

All you need to do is remove the second for loop and replace it with a ','.join(matching_words) call in the place where you currently use j in the string concatenation:

newlist = ['{}:{}'.format(l, ','.join([w for w in list2 if w[0] == l])) for l in list1]
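Note that str.join only puts the separator between items, so this version produces no trailing commas, unlike the loop in the question. If the trailing commas are actually wanted, one possible sketch (the name newlist_trailing is made up for this illustration) appends the comma to each word and joins on the empty string:

```python
list1 = ["A", "B"]
list2 = ["Apple", "Banana", "Balloon", "Boxer", "Crayons", "Elephant"]

# Append "," to every matching word, then glue the pieces together
# with an empty separator so each word keeps its own trailing comma.
newlist_trailing = [
    letter + ":" + "".join(word + "," for word in list2 if word[0] == letter)
    for letter in list1
]

print(newlist_trailing)  # ['A:Apple,', 'B:Banana,Balloon,Boxer,']
```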

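This isn't very efficient, though; you loop over all the words in list2 for each letter. To do this efficiently, you'd be better off preprocessing the list into a dictionary: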

list2_map = {}
for word in list2:
    list2_map.setdefault(word[0], []).append(word)

newlist = ['{}:{}'.format(l, ','.join(list2_map.get(l, []))) for l in list1]

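The first loop builds a dictionary mapping each initial letter to a list of words, so that you can use those lists directly instead of using a nested list comprehension.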

Demo:

>>> list1 = ['A', 'B']
>>> list2 = ['Apple', 'Banana', 'Balloon', 'Boxer', 'Crayons', 'Elephant']
>>> list2_map = {}
>>> for word in list2:
...     list2_map.setdefault(word[0], []).append(word)
...
>>> ['{}:{}'.format(l, ','.join(list2_map.get(l, []))) for l in list1]
['A:Apple', 'B:Banana,Balloon,Boxer']
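The grouping step could also be written with collections.defaultdict instead of dict.setdefault. This is an alternative sketch, not what the answer itself uses, and the name groups is chosen here purely for illustration:

```python
from collections import defaultdict

list1 = ["A", "B"]
list2 = ["Apple", "Banana", "Balloon", "Boxer", "Crayons", "Elephant"]

# Missing keys start out as an empty list automatically.
groups = defaultdict(list)
for word in list2:
    groups[word[0]].append(word)

# .get() does not trigger the default factory, so looking up a letter
# that has no words does not insert an empty entry into the mapping.
newlist = ["{}:{}".format(letter, ",".join(groups.get(letter, [])))
           for letter in list1]

print(newlist)  # ['A:Apple', 'B:Banana,Balloon,Boxer']
```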

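The above algorithm passes over the words of list2 twice (once to build the map, once when joining them) and over list1 once, making it an O(N) linear algorithm (adding a single word to list2 or a single letter to list1 increases the running time by a constant amount). Your version loops over list2 once for every letter in list1, making it an O(NM) algorithm, so the running time grows with the product of the two list lengths whenever you add a letter or a word.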

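To put that into numbers: if you expanded list1 to cover all 26 ASCII uppercase letters and list2 to contain 1000 words, your approach (scanning all of list2 for words with a given letter) would take 26,000 steps. My version, including pre-building the map, takes only 2,026 steps. With list2 containing 1 million words, your version has to take 26 million steps; mine takes 2 million and 26.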
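These step counts can be checked with a rough counting sketch. The word list below is made up for illustration (1000 synthetic words whose first letters cycle through A to Z), and a "step" is taken to mean one visit to a word or one lookup of a letter:

```python
import string

letters = list(string.ascii_uppercase)  # the 26 ASCII uppercase letters
# Hypothetical data: 1000 synthetic words with cycling first letters.
words = [letters[n % 26] + "word" + str(n) for n in range(1000)]

# Nested approach: every letter scans every word.
nested_steps = sum(1 for letter in letters for word in words)

# Map-based approach: one pass over the words to build the map...
word_map = {}
build_steps = 0
for word in words:
    word_map.setdefault(word[0], []).append(word)
    build_steps += 1

# ...then one lookup per letter, plus one visit per word while joining.
lookup_steps = 0
join_steps = 0
for letter in letters:
    lookup_steps += 1
    join_steps += len(word_map.get(letter, []))

map_steps = build_steps + join_steps + lookup_steps
print(nested_steps, map_steps)  # 26000 2026
```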

Answer 2

list1=["A","B"]
list2=["Apple","Banana","Balloon","Boxer","Crayons","Elephant"]

res = [l1 + ':' + ','.join(l2 for l2 in list2 if l2.startswith(l1)) for l1 in list1]
print(res)

# ['A:Apple', 'B:Banana,Balloon,Boxer']

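But this seems rather complicated to read, so I'd suggest sticking with the nested loops. You could also create a generator to improve readability (if you find this version more readable):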

def f(list1, list2):
    for l1 in list1:
        val = ','.join(l2 for l2 in list2 if l2.startswith(l1))
        yield l1 + ':' + val

print(list(f(list1, list2)))

# ['A:Apple', 'B:Banana,Balloon,Boxer']
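One practical difference from indexing the first character: str.startswith returns False for an empty string, whereas word[0] raises IndexError. A small sketch, assuming a hypothetical input list that contains an empty string:

```python
list1 = ["A", "B"]
words = ["Apple", "", "Banana"]  # hypothetical input with an empty string

# startswith() simply returns False for the empty string.
safe = [l1 + ":" + ",".join(w for w in words if w.startswith(l1))
        for l1 in list1]
print(safe)  # ['A:Apple', 'B:Banana']

# Indexing the first character fails once the empty string is reached.
index_failed = False
try:
    [w for w in words if w[0] == "A"]
except IndexError:
    index_failed = True

print(index_failed)  # True
```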


Tags: python, for-loop, nested, list-comprehension, list