Filter strings where there are n equal characters in a row

Question

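Is there a way to filter a list of strings so that only those containing, for example, 3 identical characters in a row are kept? I wrote a function that does this, but I'm curious whether there is a more pythonic, more efficient, or simpler way to do it.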

list_of_strings = []


def check_3_in_row(string):
    for ch in set(string):
        if ch*3 in string:
            return True
    return False

new_list = [x for x in list_of_strings if check_3_in_row(x)]

Edit: I just found a solution:

new_list = [x for x in set(keywords) if any(ch*3 in x for ch in x)]

But I'm not sure which is faster: a regular expression or this approach.
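One way to settle the speed question is to measure both approaches on your own data. The following is only a rough sketch using the standard `timeit` module; the sample list and the helper names `with_any` and `with_regex` are made up for illustration, and the results will depend on the length and content of the real strings.

```python
import re
import timeit

# Hypothetical sample data; real timings depend on your own strings.
sample = ["aaa", "dasdas", "aaafff", "afff", "abbbc"] * 1000

# Any character followed by two more copies of itself.
TRIPLE = re.compile(r"(.)\1{2}")


def with_any(strings):
    # The generator-based approach from the edit above.
    return [s for s in strings if any(ch * 3 in s for ch in s)]


def with_regex(strings):
    # The regular-expression approach, with the pattern compiled once.
    return [s for s in strings if TRIPLE.search(s)]


# Both approaches should select the same strings before speed is compared.
assert with_any(sample) == with_regex(sample)

for func in (with_any, with_regex):
    seconds = timeit.timeit(lambda: func(sample), number=20)
    print(f"{func.__name__}: {seconds:.3f}s")
```

Note that `.` does not match a newline by default, so the two versions can disagree on strings that contain repeated newlines.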

Answer 1

You can use a regular expression, like this:

>>> import re
>>> list_of_strings = ["aaa", "dasdas", "aaafff", "afff", "abbbc"]
>>> [x for x in list_of_strings if re.search(r'(.)\1{2}', x)]
['aaa', 'aaafff', 'afff', 'abbbc']

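Here, . matches any character (except a newline, by default), and it is captured in a group ((.)). We then check whether the same captured character occurs two more times: the backreference \1 refers to the first capturing group, and {2} means "twice".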
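Since the title asks about n characters rather than exactly 3, the same pattern can be built for any run length. This is a minimal sketch, not part of the original answer; the function name `has_run` and its `n` parameter are hypothetical.

```python
import re


def has_run(s, n=3):
    """Return True if s contains some character repeated n times in a row."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # (.) captures one character; \1{n-1} requires n-1 more copies of it.
    # re.DOTALL lets "." match newline characters as well.
    pattern = r"(.)\1{%d}" % (n - 1)
    return re.search(pattern, s, flags=re.DOTALL) is not None


list_of_strings = ["aaa", "dasdas", "aaafff", "afff", "abbbc"]
print([x for x in list_of_strings if has_run(x, 3)])
# ['aaa', 'aaafff', 'afff', 'abbbc']
```

If the function is called many times with the same n, compiling the pattern once with `re.compile` and reusing it would avoid rebuilding the pattern string on every call.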
