collections Counter - How to exclude the apostrophe "'" when counting the characters in a word

Question

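I wrote a script to find the longest word in a file/dictionary. However, English words often contain the apostrophe "'", and I want to skip it when measuring word length.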

from collections import Counter
import time

words = open('english').read().splitlines()

time_before = time.time()

k = Counter(words)

longest = max(k, key=len)
print('The longest word in Dictionary is:', longest, 'has' , len(longest), 'characters')

time_after = time.time()
time_taken = time_after - time_before
print ( 'Longest word found in: ' , time_taken)
print(".......................")

Answer 1

Replace the key you pass to max():

longest = max(k, key=lambda s: sum(1 for letter in s if letter != "'"))

You can of course customize this solution to exclude more characters, or to count only specific ones, and so on.
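As a rough sketch of that kind of customization, the key can be pulled out into a small helper that skips every character in a set. The names IGNORED and effective_length, the choice to also skip hyphens, and the sample word list are all made up for illustration:

```python
# Hypothetical set of characters to ignore when measuring a word.
IGNORED = {"'", "-"}

def effective_length(word, ignored=IGNORED):
    """Count the characters of word that are not in ignored."""
    return sum(1 for letter in word if letter not in ignored)

# Sample data standing in for the dictionary file.
words = ["can't", "well-being", "dictionary"]

longest = max(words, key=effective_length)
print(longest, effective_length(longest))
```

Using a set keeps the membership test cheap even if the list of ignored characters grows.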

Answer 2

Hi, you can replace the apostrophe with an empty string and then start counting.

txt = "hi it's me"
x = txt.replace("'", "")
print(x)
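To tie this back to the question, the same replace() call could be used inside the key for max(). This is only a sketch; the helper name length_without_apostrophes and the sample words are assumptions, not part of the original script:

```python
def length_without_apostrophes(word):
    """Length of word after removing every apostrophe."""
    return len(word.replace("'", ""))

# Sample data standing in for the dictionary file.
words = ["it's", "hello", "rock'n'roll"]

# The original word is returned; only the comparison ignores apostrophes.
longest = max(words, key=length_without_apostrophes)
print(longest, length_without_apostrophes(longest))
```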

Answer 3

To exclude the apostrophe and every character after it from the word-length count:

max(k, key=lambda word: len(word.split("'")[0]))
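Note that this key behaves differently from simply removing the apostrophe, because everything after the first apostrophe is dropped as well. A small comparison, using a made-up word list and hypothetical helper names:

```python
def prefix_length(word):
    """Length of the part before the first apostrophe."""
    return len(word.split("'")[0])

def stripped_length(word):
    """Length of the word with apostrophes removed."""
    return len(word.replace("'", ""))

words = ["o'clock", "cat"]

# "o'clock" counts as 1 character under prefix_length but 6 under stripped_length.
longest_by_prefix = max(words, key=prefix_length)
longest_by_stripped = max(words, key=stripped_length)
print(longest_by_prefix, longest_by_stripped)
```

Which of the two is appropriate depends on whether a suffix such as 's should count toward the length.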

Answer 4

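So for starters, there's no reason to use Counter here; you aren't actually counting anything (unless for some reason your dictionary file contains duplicate words).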

You should be able to filter the words with a comprehension that checks whether each word contains an apostrophe:

import time

with open('english') as f:
    words = f.read().splitlines()

time_before = time.time()

filtered = (word for word in words if "'" not in word)
longest = max(filtered, key=len)
print('The longest word in Dictionary is:', longest, 'has' , len(longest), 'characters')

time_after = time.time()
time_taken = time_after - time_before
print ( 'Longest word found in: ' , time_taken)
print(".......................")
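To illustrate the point about Counter, here is a small sketch with a made-up list that contains a duplicate. Counter only adds the occurrence counts; the longest word comes out the same either way:

```python
from collections import Counter

# Sample data with one duplicate entry.
words = ["apple", "banana", "apple", "kiwi"]

counts = Counter(words)

# Iterating a Counter yields its unique keys, so max() sees the same words.
longest_from_counter = max(counts, key=len)
longest_from_list = max(words, key=len)

print(longest_from_counter, longest_from_list, counts["apple"])
```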

python

counter

python-3.x