Words, sorted by frequency, in a book (.txt file)
I'm using:
from collections import Counter
wordlist = open('mybook.txt','r').read().split()
c = Counter(wordlist)
print(c)
# result :
# Counter({'the': 9530, 'to': 5004, 'a': 4203, 'and': 4202, 'was': 4197, 'of': 3912, 'I': 2852, 'that': 2574, ... })
to print all the words of a book, sorted by frequency.
How can I write this result to a .txt output file?
g = open('wordfreq.txt','w')
g.write(c) # here it fails
This is the desired output, wordfreq.txt:
the, 9530
to, 5004
a, 4203
and, 4202
was, 4197
...
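I think this may be the help you need: how to print the dictionary in the format you asked for. The first four lines are your original code.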
from collections import Counter
wordlist = open('so.py', 'r').read().split()
c = Counter(wordlist)
print(c)
outfile = open('output.txt', 'w')
for word, count in c.items():
    # build one "word, count" line per entry
    outline = word + ', ' + str(count) + '\n'
    outfile.write(outline)
outfile.close()
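As a side note on why the original g.write(c) fails: write() on a text file accepts only strings, and a Counter is not one, so each pair has to be formatted as text first. The snippet below is a minimal sketch of that point. It uses an in-memory io.StringIO buffer as a stand-in for the output file, and the sample sentence is made up for illustration.

```python
import io
from collections import Counter

# Made-up sample text, only for illustration.
counts = Counter("the cat and the hat".split())

buffer = io.StringIO()  # in-memory stand-in for the output file

try:
    buffer.write(counts)  # passing the Counter itself, as in g.write(c)
    write_failed = False
except TypeError:
    # write() only accepts str, so the Counter is rejected
    write_failed = True

# Formatting each (word, count) pair as a string first works.
for word, count in counts.items():
    buffer.write('{}, {}\n'.format(word, count))

output = buffer.getvalue()
```

Here output holds one "word, count" line per distinct word, in whatever order the Counter iterates, which is why the sorted variants are needed to get a frequency ranking.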
If you want to write them in sorted order, you can do this.
from collections import Counter
wordlist = open('so.py', 'r').read().split()
word_counts = Counter(wordlist)
write_file = open('wordfreq.txt', 'w')
# sort the (word, count) pairs by count, highest first
for w, c in sorted(word_counts.items(), key=lambda x: x[1], reverse=True):
    write_file.write('{w}, {c}\n'.format(w=w, c=c))
write_file.close()
I think this can be done a bit more simply. I also use a context manager (with) to close the file automatically.
from collections import Counter
with open('mybook.txt', 'r') as mybook:
    wordcounts = Counter(mybook.read().split())
with open('wordfreq.txt', 'w') as write_file:
    for item in wordcounts.most_common():
        print('{}, {}'.format(*item), file=write_file)
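For readers unfamiliar with most_common(): it returns a list of (word, count) pairs already ordered from the highest count down, which is why no explicit sorted() call is needed here. It also takes an optional n to keep only the top n entries. A small sketch, with a made-up sample sentence:

```python
from collections import Counter

# Made-up sample text, only for illustration.
word_counts = Counter("to be or not to be that is the question to".split())

# All pairs, highest count first.
all_pairs = word_counts.most_common()

# Only the two most frequent words.
top_two = word_counts.most_common(2)
```

In this sample, "to" occurs three times and "be" twice, so those are the two entries in top_two.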
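If the file is particularly large, you can avoid reading it all into memory at once by iterating over it line by line instead:
    wordcounts = Counter(x for line in mybook for x in line.split())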
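Putting the line-by-line version together with the writing step could look like the sketch below. The function name write_word_frequencies and the temporary demo files are hypothetical, chosen only to make the example self-contained; the counting and output format follow the answers above.

```python
import os
import tempfile
from collections import Counter


def write_word_frequencies(in_path, out_path):
    """Count whitespace-separated words and write 'word, count' lines.

    Hypothetical helper: the input is read one line at a time,
    so the whole book never has to sit in memory as a single string.
    """
    with open(in_path, 'r') as book:
        counts = Counter(word for line in book for word in line.split())
    with open(out_path, 'w') as out:
        # most_common() yields pairs from highest to lowest count
        for word, count in counts.most_common():
            out.write('{}, {}\n'.format(word, count))
    return counts


# Tiny demo on a temporary file standing in for mybook.txt.
tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, 'mybook.txt')
out_path = os.path.join(tmp_dir, 'wordfreq.txt')
with open(in_path, 'w') as f:
    f.write('the cat saw the dog\nthe dog ran\n')

counts = write_word_frequencies(in_path, out_path)
with open(out_path) as f:
    lines = f.read().splitlines()
```

Note that the Counter itself still holds every distinct word, so memory use grows with the vocabulary size rather than with the length of the file.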