Looping through lines of text files in Python
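I have two text files. I want to read them line by line and check for matches: if a line matches, print it, otherwise do nothing. But the code below only checks the first line of the first file against all the lines of the file in the second for loop. I want to check every line of the first file against every line of the second. I'm not sure what I'm doing wrong.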
with open("changed_commands_from_default_value", "a") \
        as changed_commands_from_default_value, \
        open(command_file, "r") \
        as command_executed_file, \
        open("default_command_values", "r") \
        as default_command_values:
    for default_command in default_command_values:
        for command_executed in command_executed_file:
            only_command = command_executed.split()[0]
            only_default_command = default_command.split()[0]
            if only_command == only_default_command:
                if command_executed != default_command:
                    print(" > The default value " +
                          default_command.rstrip() + " is changed to " +
                          command_executed.rstrip())
                    changed_commands_from_default_value.write(
                        "The default value " + '"' + default_command + '"' +
                        "is changed to " + '"' + command_executed + '"')
My data looks like this:
File 1:
Data1 1
Data2 2
Data3 3
Data4 6
Data5 10
File 2:
Data1 4
Data2 4
Data3 6
....
I want output like this:
Data1 is changed from 1 to 4
Data2 is changed from 2 to 4 and so on...
Just put both reads in the same loop. A minimal working example for two files, named t1.in and t2.in, would be:
with open('t1.in', 'r') as f1:
    with open('t2.in', 'r') as f2:
        while True:
            l1, l2 = f1.readline(), f2.readline()  # read lines simultaneously
            # readline() returns '' at end of file; stop as soon as either
            # file is exhausted, as file line counts may differ
            if (not l1) or (not l2):
                break
            else:
                # process lines here
                pass
This example reads lines from both files at the same time. If one file has fewer lines than the other, it reads min(lines_of_file_1, lines_of_file_2) lines.
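As background on why the nested loops in the question stop after the first line: a file object is an iterator, so the inner loop consumes the whole second file during the first pass of the outer loop, and every later pass finds nothing left to read. A minimal sketch of that behaviour, using io.StringIO objects as in-memory stand-ins for the real files (the variable names here are illustrative, not from the question):

```python
import io

# In-memory stand-ins for the two files (values taken from the question's data).
default_values = io.StringIO("Data1 1\nData2 2\nData3 3\n")
executed = io.StringIO("Data1 4\nData2 4\nData3 6\n")

pairs_seen = []
for default_line in default_values:
    # The inner iterator is fully consumed on the first outer pass;
    # on later passes this loop body never runs.
    for executed_line in executed:
        pairs_seen.append((default_line.split()[0], executed_line.split()[0]))

# Collect which lines of the outer file were ever paired with anything.
outer_keys = sorted({outer for outer, _ in pairs_seen})
print(outer_keys)  # ['Data1']
```

Calling seek(0) on the inner file at the start of each outer pass would restore the full nested comparison, but it re-reads the file every time, which is why reading both files in one loop is usually preferable.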
To loop over two iterators "in parallel", use the built-in zip or, in Python 2, itertools.izip (the latter requires an import itertools at the start of the module, of course).
For example, change:
for default_command in default_command_values:
    for command_executed in command_executed_file:
into:
for default_command, command_executed in zip(
        default_command_values, command_executed_file):
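To make the zip version concrete, here is a small sketch run against the data from the question. It assumes every line has exactly two whitespace-separated fields; the helper name changed_pairs and the io.StringIO stand-ins for the files are illustrative only.

```python
import io

def changed_pairs(old_lines, new_lines):
    """Yield (key, old_value, new_value) for paired lines whose keys match but values differ."""
    # zip stops as soon as the shorter input is exhausted.
    for old_line, new_line in zip(old_lines, new_lines):
        old_key, old_value = old_line.split()
        new_key, new_value = new_line.split()
        if old_key == new_key and old_value != new_value:
            yield old_key, old_value, new_value

# In-memory stand-ins for File 1 and File 2 from the question.
file1 = io.StringIO("Data1 1\nData2 2\nData3 3\nData4 6\nData5 10\n")
file2 = io.StringIO("Data1 4\nData2 4\nData3 6\n")

changes = list(changed_pairs(file1, file2))
for key, old, new in changes:
    print(f"{key} is changed from {old} to {new}")
```

Because zip ends with the shorter input, Data4 and Data5 are never examined here. If unmatched trailing lines matter, itertools.zip_longest pads the shorter input with a fill value instead.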
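This assumes the two files really are "parallel", i.e. in 1-1 line-by-line correspondence. If that is not the case, the simplest approach (unless the files are too large to fit in memory) is to first read one file into a dict, then loop over the other file and check each line against the dict. So, for example: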
cmd2val = {}
with open("default_command_values", "r") as default_command_values:
    for default_command in default_command_values:
        cmd2val[default_command.split()[0]] = default_command.strip()
Then, separately:
with open(command_file, "r") as command_executed_file:
    for command_executed in command_executed_file:
        only_command = command_executed.split()[0]
        if only_command not in cmd2val: continue  # or whatever
        command_executed = command_executed.strip()
        if command_executed != cmd2val[only_command]:
            # etc, etc, for all output you desire in this case
            print(" > The default value " + cmd2val[only_command] +
                  " is changed to " + command_executed)
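Or vice versa: build the dict from whichever file you expect to be smaller, then use it to check the file you expect to be larger, line by line.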
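Putting the two steps together, a self-contained sketch of the dict approach might look like the following. The function names load_values and report_changes are hypothetical, the files are replaced by io.StringIO objects, and the second input is deliberately reordered (an assumption made for illustration, not the question's data) to show that line order no longer matters.

```python
import io

def load_values(lines):
    """Map each key to its value, skipping lines that are not 'key value' pairs."""
    values = {}
    for line in lines:
        fields = line.split()
        if len(fields) == 2:
            values[fields[0]] = fields[1]
    return values

def report_changes(default_lines, executed_lines):
    """Return one message per key whose value differs from the default."""
    defaults = load_values(default_lines)
    messages = []
    for line in executed_lines:
        fields = line.split()
        # Skip malformed lines and keys that have no default.
        if len(fields) != 2 or fields[0] not in defaults:
            continue
        key, new_value = fields
        old_value = defaults[key]
        if new_value != old_value:
            messages.append(f"{key} is changed from {old_value} to {new_value}")
    return messages

defaults_file = io.StringIO("Data1 1\nData2 2\nData3 3\nData4 6\nData5 10\n")
# Hypothetical second file: different order, one unchanged key, one unknown key.
executed_file = io.StringIO("Data3 6\nData1 4\nData4 6\nData9 7\n")

messages = report_changes(defaults_file, executed_file)
for message in messages:
    print(message)
```

Only one file is held in memory; the other is streamed, so the memory cost is determined by the file used to build the dict.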
Here is an implementation:
#!/usr/bin/env python3
"""Match data in two files. Print the changes in the matched values.

Usage: %(prog)s <old-file> <new-file>
"""
import sys

if len(sys.argv) != 3:
    sys.exit(__doc__ % dict(prog=sys.argv[0]))
old_filename, new_filename = sys.argv[1:]

# read old file
data = {}
with open(old_filename) as file:
    for line in file:
        try:
            key, value = line.split()
            data[key] = int(value)
        except ValueError:
            pass  # ignore non-key-value lines

# compare with the new file
with open(new_filename) as file:
    for line in file:
        columns = line.split()
        if len(columns) == 2 and columns[0] in data:
            try:
                new_value = int(columns[1])
            except ValueError:
                continue  # ignore invalid lines
            else:  # matching line
                value = data[columns[0]]
                if value != new_value:  # but values differ
                    print('{key} is changed from {value} to {new_value}'.format(
                        key=columns[0], value=value, new_value=new_value))
Output (for the input from the question):
Data1 is changed from 1 to 4
Data2 is changed from 2 to 4
Data3 is changed from 3 to 6