How to read multiple NLTK corpus files and write them to a single text file in Python
I wrote the following code:
import nltk
Then:
file1 = nltk.corpus.gutenberg.words('shakespeare-caesar.txt')
file2 = nltk.corpus.gutenberg.words('shakespeare-hamlet.txt')
file3 = nltk.corpus.gutenberg.words('shakespeare-macbeth.txt')
This is the part where I try to write the contents to a single file:
filenames = [file1, file2, file3]
with open('result.txt', 'w') as outfile:  # want to store the contents of the 3 files in result.txt
    for fname in filenames:
        with open(fname) as infile:
            for line in infile:
                outfile.write(line)
This gives me the following error:
TypeError Traceback (most recent call last)
<ipython-input-9-917545c3c1ce> in <module>()
2 with open('result.txt', 'w') as outfile:
3 for fname in filenames:
----> 4 with open(fname) as infile:
5 for line in infile:
6 outfile.write(line)
TypeError: invalid file: ['[', 'The', 'Tragedie', 'of', 'Julius', 'Caesar', ...]
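As the last line of the error message shows, file1 and the others are not file names but lists of words, so open() cannot use them as paths. Instead of calling words(), you can pass the corpus file names to raw(), which returns each file's text as a single string, and merge the files into one like this: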
import nltk

filenames = [
    "shakespeare-caesar.txt",
    "shakespeare-hamlet.txt",
    "shakespeare-macbeth.txt"
]
with open("result.txt", "w") as f:
    for filename in filenames:
        # raw() returns the whole corpus file as one string
        f.write(nltk.corpus.gutenberg.raw(filename))
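If you want to reuse this, the same idea can be wrapped in a small helper. This is only a sketch: the name merge_corpus_files and its parameters are made up for illustration and are not part of NLTK. It takes any function that maps a file ID to its text, which is an assumption that lets you pass nltk.corpus.gutenberg.raw or a stand-in. It also writes UTF-8 explicitly and adds a newline between files, neither of which the snippet above does.

```python
from typing import Callable, Iterable


def merge_corpus_files(
    read_raw: Callable[[str], str],
    fileids: Iterable[str],
    out_path: str,
) -> int:
    """Write the text of each file ID to out_path; return the characters written.

    read_raw is any callable that returns a file's full text as a string,
    for example nltk.corpus.gutenberg.raw (hypothetical helper, not an NLTK API).
    """
    total = 0
    with open(out_path, "w", encoding="utf-8") as outfile:
        for fileid in fileids:
            text = read_raw(fileid)
            # Keep one file's last line from running into the next file's first line.
            if not text.endswith("\n"):
                text += "\n"
            total += outfile.write(text)
    return total
```

With NLTK installed and the Gutenberg corpus downloaded (nltk.download('gutenberg')), the call would look like merge_corpus_files(nltk.corpus.gutenberg.raw, filenames, "result.txt"). If you really do want the tokens from words() rather than the original text, you would have to join them yourself, for example with " ".join(file1), which loses the original line breaks and spacing.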