Parse a file using Python RE and insert a line break between the characters found
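I have a file in which I need to insert a line break every time I find these characters: }{
It occurs over a thousand times, so I need to be able to go through every line of the file (about 4k lines).
My regex skills are poor; this is what I tried: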
import ast
import re

order_item = re.compile(r"\}\{", re.I)
# creates the output file
f = open("orderProductsStructuredDataFinal001", "w", encoding='utf-8')
# opens the file with the unstructured data and uses it as an iterator
with open('orderProductsStructuredData3') as inf:
    for line in inf:
        order = [ast.literal_eval(op) for op in re.findall(order_item, line)]
        # ta-da! Now do something with the order
        f.write("\n".join(str(x) for x in order))
# closes the file
f.close()
Edit:
with open('orderProdcutsStructuredData3') as inf, open("orderProdcutsStructuredDataFinal001", "w", encoding='utf-8') as f:
    for line in inf:
        line = line.replace('}{', '}\n{')
        f.write(line)
You're almost there:
with open('orderProductsStructuredData3') as inf:
    for line in inf:
        line = line.replace('}{', '}\n{')
        f.write(line)
Note that this particular task does not need REs at all, just a plain string replacement.
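To illustrate that point, here is a minimal sketch on a made-up sample string. The sample data and the helper name `split_records` are assumptions for illustration only; the real file contents are not shown in the question. It compares plain `str.replace` with the equivalent `re.sub` call, which should produce the same text:

```python
import re


def split_records(text):
    """Insert a newline between every adjacent '}{' pair using plain string replacement."""
    return text.replace('}{', '}\n{')


# Hypothetical sample line, not taken from the actual file.
sample = "{'id': 1}{'id': 2}{'id': 3}"

plain = split_records(sample)

# Regex version, shown for comparison only; both braces are escaped in the pattern.
with_re = re.sub(r'\}\{', '}\n{', sample)

print(plain)
```

Since the pattern is a fixed two-character string, the regex buys nothing here, and `str.replace` is the simpler choice.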
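As a side note, just nest it inside another with for the file you are writing; there is no reason to switch to an explicit open and close for it! That is: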
with open("orderProductsStructuredDataFinal001", "w", encoding='utf-8') as f:
    with open('orderProductsStructuredData3') as inf:
        for line in inf:
            line = line.replace('}{', '}\n{')
            f.write(line)
You could put both open calls in a single with to similar effect, but in this case they are so long that I think it would hurt readability.
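If the line length is the only objection, one possible workaround is to pass the file names in as parameters so the single with stays short. This is only a sketch: the function name `insert_breaks` and its parameter names are hypothetical, and reading the input as UTF-8 is an assumption (the question only specifies the encoding for the output file).

```python
def insert_breaks(src_path, dst_path):
    """Copy src_path to dst_path, putting a newline between every '}{' pair."""
    # Both files are managed by one with statement, so both are closed on exit,
    # even if an exception is raised while copying.
    # Assumption: the input is UTF-8, like the output.
    with open(src_path, encoding='utf-8') as inf, open(dst_path, 'w', encoding='utf-8') as f:
        for line in inf:
            f.write(line.replace('}{', '}\n{'))
```

With the file names from the question, the call would be `insert_breaks('orderProductsStructuredData3', 'orderProductsStructuredDataFinal001')`.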