Words with some two-letter sequence repeated 3 times? "contentment" and "maintaining" are such words

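How many words contain some two-letter sequence repeated 3 times? For example, "contentment" and "maintaining" are such words, because "contentment" has the sequence "nt" repeated three times and "maintaining" has the sequence "in" repeated three times.

This is my code: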

 len([f for f in file if re.match(r'(.*?[a-z]{2}.*?){3}',f)])

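You can use

\b(?=\w*(\w{2})(?:\w*\1){2})\w+

See the regex demo.

Details

  • \b - a word boundary
  • (?=\w*(\w{2})(?:\w*\1){2}) - a positive lookahead: immediately to the right there must be 0+ word characters, then two word characters captured into Group 1, then two repetitions of any 0+ word characters followed by the same value as in Group 1 (the \1 backreference)
  • \w+ - consumes one or more word characters.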
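The role of the backreference described above can be sketched as follows. This is only an illustration, not part of the original answer: `repeated_pair`, `WITH_BACKREF` and `WITHOUT_BACKREF` are hypothetical names, and "banana" is an extra sample word chosen because it has three letter pairs but no pair that occurs three times.

```python
import re

# Pattern from the answer above, with the \1 backreference to Group 1.
WITH_BACKREF = re.compile(r'\b(?=\w*(\w{2})(?:\w*\1){2})\w+')
# Pattern from the question: [a-z]{2} may match a different pair each time.
WITHOUT_BACKREF = re.compile(r'(.*?[a-z]{2}.*?){3}')

def repeated_pair(word):
    """Return the pair captured in Group 1, or None if the word does not qualify."""
    m = WITH_BACKREF.search(word)
    return m.group(1) if m else None

for word in ["contentment", "maintaining", "banana"]:
    # Columns: word, captured pair, whether the question's pattern matches.
    print(word, repeated_pair(word), bool(WITHOUT_BACKREF.match(word)))
```

The question's pattern accepts "banana" because nothing forces the three pairs to be equal; the `\1` backreference is what enforces that.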

See the Python demo:

import re

text = "contentment and maintaining are such words"
# \1 refers back to the two characters captured by (\w{2})
print([x.group() for x in re.finditer(r'\b(?=\w*(\w{2})(?:\w*\1){2})\w+', text)])
# => ['contentment', 'maintaining']
print(len([x.group() for x in re.finditer(r'\b(?=\w*(\w{2})(?:\w*\1){2})\w+', text)]))
# => 2
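To get the count the question asks for from a word list, the same pattern could be applied line by line. The sketch below assumes that `file` in the question is an iterable yielding one word per line; `count_matching_words` is a hypothetical helper, and `io.StringIO` with made-up sample words stands in for the real file.

```python
import io
import re

PATTERN = re.compile(r'\b(?=\w*(\w{2})(?:\w*\1){2})\w+')

def count_matching_words(lines):
    """Count lines that contain a word with some pair occurring three times."""
    return sum(1 for line in lines if PATTERN.search(line.strip()))

# io.StringIO stands in for an opened word-list file (hypothetical data).
fake_file = io.StringIO("contentment\nmaintaining\nbanana\nwords\n")
print(count_matching_words(fake_file))
```

Compiling the pattern once avoids re-parsing it for every line, which may matter for a large word list.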

Here is a simple regex:

.*(\w{2}).*\1.*\1

It captures two letters in a group with (\w{2}); the same two letters must then appear two more times, which the two \1 backreferences require.

Here is a practical example:

import re

text = """
How many words contain some two-letter sequence repeated 3 times?
For example, "contentment" and "maintaining" are such words because
"contentment" has the sequence "nt" repeated three times and
"maintaining" has the sequence "in" repeated three times.
"""

def check(word):
    # re.match anchors at the start of the word; \1 repeats the captured pair
    return re.match(r".*(\w{2}).*\1.*\1", word)

def main():
    for word in text.split():
        if check(word):
            print(word)

main()
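As a quick sanity check, the two patterns from this thread (the lookahead version and the simpler one, both written with their `\1` backreferences) can be run side by side. This is a sketch only: `verdicts`, `LOOKAHEAD` and `SIMPLE` are hypothetical names, and the sample inputs are made up for illustration.

```python
import re

LOOKAHEAD = re.compile(r'\b(?=\w*(\w{2})(?:\w*\1){2})\w+')
SIMPLE = re.compile(r'.*(\w{2}).*\1.*\1')

def verdicts(text):
    """Return (lookahead_result, simple_result) as booleans for one input."""
    return bool(LOOKAHEAD.search(text)), bool(SIMPLE.match(text))

for sample in ["contentment", "maintaining", "banana", "words", "in-in-in"]:
    print(sample, verdicts(sample))
```

On plain words the two agree. They can differ once non-word characters are involved: `.*` in the simple pattern can span the hyphens in "in-in-in", while `\w*` in the lookahead version cannot, so only the simple pattern accepts that input.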