How could I make this sub-string finder more efficient and cleaner?

This is the first solution I came up with, and I really can't think of another one.

strings = [input("String: ") for i in range(int(input("How many strings?")))]
num = 0

for i, subA in enumerate(strings):
    for j,subB in enumerate(strings):
        if (i != j) and (subB in subA):
            num += 1
print("There are", num, "substrings")

strings = [input("String: ") for i in range(int(input("How many strings?")))]
num = 0

for i, subA in enumerate(strings[:-1]):
    for subB in strings[i+1:]:
        if subB in subA or subA in subB:
            num += 1
print("There are", num, "substrings")

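Now, I'm not sure I understand what you're trying to do, but this makes sure each string is compared with every other string only once. It's untested, so don't take my word for it.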
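For what it's worth, the same each-pair-once idea can also be written with itertools.combinations(), which yields every unordered pair exactly once. This is only a sketch: the function name count_pairs is made up for illustration, and a pair of identical strings is counted once here, whereas the loop in the question counts it twice.

```python
from itertools import combinations


def count_pairs(strings):
    """Count unordered pairs where one string is a substring of the other.

    count_pairs is a hypothetical helper name, not part of the original code.
    """
    return sum(a in b or b in a for a, b in combinations(strings, 2))


print(count_pairs(["a", "aa", "b"]))  # 1: only "a" is inside "aa"
```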

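I assume that, given an array of strings, you want to find out how many of them contain any of the other strings as a substring. Note that if that is the case, the question arises of how to handle identical strings in the list.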
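To make the ambiguity concrete, here is a small sketch of two readings of the task. Both function names are hypothetical. The first counts ordered pairs, which is what the code in the question does; the second counts how many strings contain at least one other entry of the list.

```python
def count_ordered_pairs(strings):
    """Count (i, j) pairs with i != j where strings[j] occurs in strings[i]."""
    return sum(
        sub in text
        for i, text in enumerate(strings)
        for j, sub in enumerate(strings)
        if i != j
    )


def count_containing_strings(strings):
    """Count strings that contain at least one *other* entry of the list."""
    return sum(
        any(sub in text for j, sub in enumerate(strings) if i != j)
        for i, text in enumerate(strings)
    )


words = ["a", "aa", "aaa"]
print(count_ordered_pairs(words))       # 3
print(count_containing_strings(words))  # 2

# Identical strings contain each other, so a duplicate pair counts twice.
print(count_ordered_pairs(["ab", "ab"]))  # 2
```

Which of the two numbers is wanted, and whether duplicates should be removed first, is something only the asker can decide.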

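In your implementation you are reinventing itertools.permutations(), which returns permutations of the elements with a given length. You can compare the list generated by the nested for loops (from your code sample) with the one generated by permutations():
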
from itertools import permutations

strings = ["a", "aa", "b"]
res = []
for i, subA in enumerate(strings):
    for j, subB in enumerate(strings):
        if i != j:
            res.append((subA, subB))
print("Nested loop:", res)
res = list(permutations(strings, 2))
print("permutations():", res)

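You need to check whether each element is a substring of another one, so you can iterate over the pairs returned by permutations() and test whether the first element contains the second (or vice versa). Let's do it with a simple list comprehension: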

from itertools import permutations

strings = ["a", "aa", "b"]
res = [a in b for a, b in permutations(strings, 2)]
# will return [True, False, False, False, False, False]

In Python, True is 1 and False is 0 (docs). So to count how many strings are substrings, we can pass a generator expression into sum().

from itertools import permutations

strings = ["a", "aa", "b"]
num = sum(a in b for a, b in permutations(strings, 2))
# will return 1
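The bool-as-int behaviour that makes this work can be checked directly. A quick sketch using only built-ins (the variable names are just for illustration):

```python
# bool is a subclass of int, so True and False take part in arithmetic.
flags = [True, False, True]
total = sum(flags)

print(isinstance(True, int))  # True
print(True + True)            # 2
print(total)                  # 2
```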

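You can also use itertools.starmap() to call operator.contains() on each pair returned by permutations(). Note that contains(a, b) is the same as b in a, not a in b; since permutations() yields every pair in both orders, the total count is the same either way.
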
from operator import contains
from itertools import permutations, starmap

strings = ["a", "aa", "b"]
num = sum(starmap(contains, permutations(strings, 2)))
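The argument order of operator.contains() is easy to get backwards, so here is a small check. It is a sketch on the same sample list as above; the variable names with_starmap and with_genexpr are only for illustration.

```python
from itertools import permutations, starmap
from operator import contains

# contains(container, item) mirrors `item in container`.
print(contains("aa", "a"))  # True, same as "a" in "aa"
print(contains("a", "aa"))  # False, same as "aa" in "a"

strings = ["a", "aa", "b"]
with_starmap = sum(starmap(contains, permutations(strings, 2)))
with_genexpr = sum(a in b for a, b in permutations(strings, 2))

# permutations() yields every ordered pair, so both directions are
# covered and the two totals agree.
print(with_starmap, with_genexpr)  # 1 1
```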

Here is an improved version of your code:

from operator import contains
from itertools import permutations, starmap

count = input("How many strings? ")
if count.isdecimal() and (count := int(count)):
    strings = []
    while count:
        item = input(f"String ({count} left): ")
        if item:  # skip empty strings
            strings.append(item)
            count -= 1
    num = sum(starmap(contains, permutations(strings, 2)))
    # "is" only for exactly one match, so zero reads "There are no substrings"
    print("There", "is" if num == 1 else "are", num or "no",
          "substring" + "s" * (num != 1))
else:
    print(f'"{count}" is not a valid positive number')

P.S. Some notes on performance.

Because of the way sum() processes its iterable, you can tweak the generator-expression code a little to make it run faster.

sum([1 for a, b in permutations(strings, 2) if a in b])

will be slightly faster than

sum(a in b for a, b in permutations(strings, 2))
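If you want to check the difference on your own machine, a rough timeit sketch could look like this. The sample data and the repeat count are arbitrary assumptions, and the timings depend on the Python version and the input, so treat the numbers as indicative only.

```python
from itertools import permutations
from timeit import timeit

# Arbitrary sample data: 200 short numeric strings.
strings = [str(n) for n in range(200)]


def with_list():
    # Builds a list with one 1 per matching pair, then sums it.
    return sum([1 for a, b in permutations(strings, 2) if a in b])


def with_generator():
    # Sums the booleans produced lazily by a generator expression.
    return sum(a in b for a, b in permutations(strings, 2))


# Both variants must agree on the count before timing means anything.
assert with_list() == with_generator()

print("list comprehension:  ", timeit(with_list, number=10))
print("generator expression:", timeit(with_generator, number=10))
```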

Why? Take a look at the following questions:

  1. ;
  2. .