collections Counter - How to exclude the apostrophe "'" when counting the characters in a word

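I wrote a script to find the longest word in a file/dictionary. However, the English word list contains apostrophes ("'"), and I want to skip them when measuring word length.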

from collections import Counter
import time

words = open('english').read().splitlines()

time_before = time.time()

k = Counter(words)

longest = max(k, key=len)
print('The longest word in Dictionary is:', longest, 'has', len(longest), 'characters')

time_after = time.time()
time_taken = time_after - time_before
print('Longest word found in: ', time_taken)
print(".......................")

Replace the key you pass to max():

# len() cannot be applied to a generator, so count the matching letters with sum()
longest = max(k, key=lambda s: sum(1 for letter in s if letter != "'"))

You can of course adapt this solution to exclude more characters, or to count only specific ones, and so on.
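As a rough sketch of that kind of customization, the key function below counts only the characters that are not in a set of ignored characters. The names `IGNORED` and `effective_length` and the sample words are made up for illustration and do not come from the question.

```python
# Hypothetical helper: the names and sample words are illustrative only.
IGNORED = {"'", "-"}  # characters that should not count toward the length


def effective_length(word):
    """Return the number of characters in word that are not in IGNORED."""
    return sum(1 for ch in word if ch not in IGNORED)


words = ["don't", "mother-in-law's", "dictionary"]
longest = max(words, key=effective_length)
print(longest, effective_length(longest))
```

To count only letters instead, the condition could be swapped for `ch.isalpha()`.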

Hi, you can replace the single quote with an empty string and then count.

txt = "hi it's me"
x = txt.replace("'", "")
print(x)
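Applied to the original problem, the same replace() call can serve as the key for max(). This is a minimal sketch; the short word list is an assumption standing in for the lines of the dictionary file.

```python
# Sample list standing in for the lines of the dictionary file (illustrative).
words = ["it's", "we'll", "they're", "hello"]

# Compare words by their length with apostrophes removed.
longest = max(words, key=lambda w: len(w.replace("'", "")))
print(longest, len(longest.replace("'", "")))
```

Note that max() still returns the original word with its apostrophe; only the comparison ignores it.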

To exclude the apostrophe and everything after it from the word-length count:

max(k, key=lambda word: len(word.split("'")[0]))
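This key is stricter than merely skipping the apostrophe, and the two can pick different winners. The sketch below compares them on two made-up words; the variable names are illustrative only.

```python
# Two sample words chosen so that the keys disagree (illustrative data).
words = ["everybody'll", "dictionary"]

# Key 1: ignore only the apostrophe itself ("everybody'll" counts as 11).
skip_apostrophe = max(words, key=lambda w: len(w.replace("'", "")))

# Key 2: ignore the apostrophe and everything after it ("everybody'll" counts as 9).
drop_suffix = max(words, key=lambda w: len(w.split("'")[0]))

print(skip_apostrophe)
print(drop_suffix)
```

Which behavior is wanted depends on whether a suffix such as 's or 'll should count toward the length of the word.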

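For starters, there is no reason to use Counter here: you aren't actually counting anything (unless for some reason your dictionary file contains duplicate words).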
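To illustrate the point: iterating over a Counter yields each distinct word once, so max() over it gives the same result as max() over the plain list. A small sketch with made-up data:

```python
from collections import Counter

# Illustrative word list containing one duplicate.
words = ["apple", "banana", "apple", "kiwi"]
k = Counter(words)

# Iterating over a Counter yields its keys, i.e. each distinct word once.
distinct = list(k)

# The counts themselves play no role when the key is len.
longest_from_counter = max(k, key=len)
longest_from_list = max(words, key=len)
print(distinct, longest_from_counter, longest_from_list)
```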

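You should be able to filter the words with a comprehension (here, a generator expression) that checks whether each word contains an apostrophe: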

import time

with open('english') as f:
    words = f.read().splitlines()

time_before = time.time()

filtered = (word for word in words if "'" not in word)
longest = max(filtered, key=len)
print('The longest word in Dictionary is:', longest, 'has', len(longest), 'characters')

time_after = time.time()
time_taken = time_after - time_before
print('Longest word found in: ', time_taken)
print(".......................")