Writing out a list of phrases to a csv file
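Following on from an earlier question, I have written some Python code to count how often certain phrases occur across a large number of text files. The phrases are held in the `word_list` variable; three examples are listed, but there will be more. The code below requires me to take each element of the list and assign it to a string, which is then compared against each text file. However, the current code only writes the frequency of the last phrase in the list to the spreadsheet, rather than writing every frequency to its own column. Is this just an indentation problem, with writerow not placed correctly, or is there a logic flaw in my code? Also, is there any way to avoid the list-to-string assignment when comparing the phrases against the text files?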
import csv, glob, os

# file_path and path are assumed to be defined earlier
word_list = ['in the event of', 'frankly speaking', 'on the other hand']
S = {}
p = 0
k = 0

with open(file_path, 'w+', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(["Fohone-K"] + word_list)

    for filename in glob.glob(os.path.join(path, '*.txt')):
        if filename.endswith('.txt'):
            f = open(filename)
            Fohone_K = filename[8:]
            data = f.read()
            # new code section from scratch file
            l = len(word_list)
            for s in range(l):
                phrase = word_list[s]
                S = data.count((phrase))
                if S:
                    #k = k + 1
                    print("'{}' match".format(Fohone_K), S)
                else:
                    print("'{}' no match".format(Fohone_K))
                    print("\n")
            # for m in word_list:
            if S >= 0:
                print([Fohone_K] + [S])
                writer.writerow([Fohone_K] + [S])
The current output looks like this:
[image: current output]
whereas it needs to look like this:
[image: desired output]
You probably want something like this:
import csv, glob, os

word_list = ['in the event of', 'frankly speaking', 'on the other hand']
file_path = 'out.csv'
path = '.'

with open(file_path, 'w+', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(["Fohone-K"] + word_list)
    for filename in glob.glob(os.path.join(path, '*.txt')):
        if filename.endswith('.txt'):
            with open(filename) as f:
                postfix = filename[8:]
                content = f.read()
                matches = [content.count(phrase) for phrase in word_list]
                print(f"'{filename}' {'no ' if all(n == 0 for n in matches) else ''}match")
                writer.writerow([postfix] + matches)
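The key problem was that you were writing S on each row, and S only holds a single count (the one for the last phrase checked). This is solved by writing the whole list of matches instead.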
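To see the fix in isolation, here is a minimal sketch that writes to an in-memory buffer rather than real files. The helper names `count_phrases` and `write_counts`, the `"file"` header and the sample texts are all made up for illustration; only `csv.writer` and `str.count` are taken from the code above.

```python
import csv
import io

word_list = ['in the event of', 'frankly speaking', 'on the other hand']


def count_phrases(text, phrases):
    """Return one count per phrase, in the same order as `phrases`."""
    # Iterating over the list directly removes the need for the
    # index-based `phrase = word_list[s]` assignment.
    return [text.count(phrase) for phrase in phrases]


def write_counts(texts, phrases, out):
    """Write a header row, then one row per (name, text) pair, to `out`."""
    writer = csv.writer(out)
    writer.writerow(["file"] + phrases)  # hypothetical header label
    for name, text in texts.items():
        # The whole list of counts goes into a single row,
        # so every phrase column gets its own value.
        writer.writerow([name] + count_phrases(text, phrases))


# Hypothetical sample data standing in for the contents of the .txt files.
sample_texts = {
    "a.txt": "frankly speaking, in the event of rain we stay in. frankly speaking!",
    "b.txt": "nothing relevant here",
}

buffer = io.StringIO()
write_counts(sample_texts, word_list, buffer)
print(buffer.getvalue())
```

Note that `str.count` counts non-overlapping, case-sensitive occurrences, so "Frankly speaking" at the start of a sentence would not be counted unless the text is lowercased first.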