Removing list elements starting with an octothorpe in Python omits some elements
This is the text file (pathProtocol.txt):
# 2018-09-30
# incubator at 33C
# sample compartment at 32C
# hold sample in for 30sec before
# measurement
samp-2
reps-3
#Temp Humid Press
# 24 42 980
# background
MilliQ_MilliQ_0 000-005
Q_Prp_62mM 006-011
Q_Ah6_62mM 012-017
Q_Eth_62mM 018-023
Q_AcA_62mM 024-029
Q_Imd_62mM 030-035
# background
MilliQ_MilliQ_0 036-041
# 24 43 977
I am parsing it and trying to remove the blank lines and the lines starting with #.
Following this answer, my code starts with:
# PROTOCOL PARSING
# READ FILE & EXCLUDE BLANKS
import os

with open(os.path.join(*pathProtocol)) as f:
    # Strip trailing whitespace (including the newline) from every line
    content = (line.rstrip() for line in f)
    # Non-blank lines in a list
    content = list(line for line in content if line)
#
print(content)
print(type(content))
print(len(content))
print('')
which produces the desired output:
['# 2018-09-30', '# incubator at 33C', '# sample compartment at 32C', '# hold sample in for 30sec before', '# measurement', 'samp-2', 'reps-3', '#Temp Humid\tPress', '# 24\t42\t980', '# background', 'MilliQ_MilliQ_0\t000-005', 'Q_Prp_62mM\t006-011', 'Q_Ah6_62mM\t012-017', 'Q_Eth_62mM\t018-023', 'Q_AcA_62mM\t024-029', 'Q_Imd_62mM\t030-035', '# background', 'MilliQ_MilliQ_0\t036-041', '# 24\t43\t977']
<class 'list'>
19
The interesting part starts when I try to delete the lines starting with an octothorpe from the list created above (this answer):
# DELETE COMMENTS
for i, line in enumerate(content):
    print(str(i), line, '\tfirst char:', line[0])
    if line.startswith('#'):
        content.remove(line)
    #
#
print(content)
I get the following output:
0 # 2018-09-30 first char: #
1 # sample compartment at 32C first char: #
2 # measurement first char: #
3 reps-3 first char: r
4 #Temp Humid Press first char: #
5 # background first char: #
6 Q_Prp_62mM 006-011 first char: Q
7 Q_Ah6_62mM 012-017 first char: Q
8 Q_Eth_62mM 018-023 first char: Q
9 Q_AcA_62mM 024-029 first char: Q
10 Q_Imd_62mM 030-035 first char: Q
11 # background first char: #
12 # 24 43 977 first char: #
['# incubator at 33C', '# hold sample in for 30sec before', 'samp-2', 'reps-3', '# 24\t42\t980', 'MilliQ_MilliQ_0\t000-005', 'Q_Prp_62mM\t006-011', 'Q_Ah6_62mM\t012-017', 'Q_Eth_62mM\t018-023', 'Q_AcA_62mM\t024-029', 'Q_Imd_62mM\t030-035', 'MilliQ_MilliQ_0\t036-041']
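Some items are missing from the line-by-line listing, yet they show up in the list printed afterwards. As a Python beginner, I just can't understand what is happening in the line-by-line output. What mistake am I making? How do I correctly remove the lines starting with #?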
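You are modifying content while iterating over it, which generally does not work: each removal shifts the remaining elements one position to the left while the loop index keeps advancing, so the element right after every removed line is skipped.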
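To make the skipping visible, here is a minimal sketch of the same loop on a made-up five-element list (the list values and the visited name are only for illustration, they are not from the question's file):

```python
# Hypothetical miniature of the list from the question.
content = ['# a', '# b', 'samp-2', '# c', '# d']

visited = []  # records which (index, line) pairs the loop actually sees
for i, line in enumerate(content):
    visited.append((i, line))
    if line.startswith('#'):
        # Removing shifts every later element one slot to the left,
        # but the loop's internal index still advances by one,
        # so the element that slid into the current slot is never visited.
        content.remove(line)

print(visited)  # [(0, '# a'), (1, 'samp-2'), (2, '# c')]
print(content)  # ['# b', 'samp-2', '# d']
```

'# b' and '# d' are never looked at, so they survive, which is the same pattern as '# incubator at 33C' surviving in the question's output.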
Instead, iterate over a copy of content, i.e. change enumerate(content) to enumerate(content[:]).
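As a rough sketch of that fix, using a shortened sample of the list from the question (the names via_copy and via_filter are made up here), together with a list comprehension as a common alternative that avoids mutating the list at all:

```python
# Shortened sample of the parsed lines from the question.
content = ['# 2018-09-30', '# incubator at 33C', 'samp-2', 'reps-3',
           '# background', 'MilliQ_MilliQ_0\t000-005', '# background']

# Option 1: loop over a shallow copy (via_copy[:]) and remove from the original.
via_copy = list(content)
for line in via_copy[:]:
    if line.startswith('#'):
        via_copy.remove(line)  # removes the first element equal to line

# Option 2 (an alternative, not part of the answer above):
# build a new list that keeps only the non-comment lines.
via_filter = [line for line in content if not line.startswith('#')]

print(via_copy)
print(via_filter)
```

Both approaches should leave only the three data lines. The comprehension is usually preferred because remove has to search the list on every call.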