Parse a file using Python re, and insert a line break between the characters found

I have a file in which I need to insert a line break every time I find these characters: }{

This occurs over a thousand times, so I need to be able to go through every line of the file (around 4k lines).

My regex skills are poor. This is what I have tried:

import ast
import re

order_item = re.compile(r"\}\{", re.I)

# creates the file
f = open("orderProductsStructuredDataFinal001", "w", encoding='utf-8')
# opens the file with the unstructured data and uses it as an iterator
with open('orderProductsStructuredData3') as inf:
    for line in inf:
        order = [ast.literal_eval(op) for op in re.findall(order_item, line)]
        # ta-da! Now do something with the order
        f.write("\n".join(str(x) for x in order))
# closes the file
f.close()

Edit:

with open('orderProductsStructuredData3') as inf, open("orderProductsStructuredDataFinal001", "w", encoding='utf-8') as f:
    for line in inf:
        line = line.replace('}{', '}\n{')
        f.write(line)

You're almost there:

with open('orderProductsStructuredData3') as inf:
    for line in inf:
        line = line.replace('}{', '}\n{')
        f.write(line)

Note that this particular task does not need a regular expression; a plain string replacement is enough.
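To illustrate why the regex is unnecessary here, the following is a minimal sketch comparing the two approaches on an in-memory string. The function names and the sample data are hypothetical and are not taken from the question; the sketch only assumes that the records are joined by a literal }{.

```python
import re


def split_with_replace(text: str) -> str:
    # Plain string replacement: every literal "}{" becomes "}\n{".
    return text.replace("}{", "}\n{")


def split_with_regex(text: str) -> str:
    # Regex equivalent; both braces are escaped so they match literally.
    return re.sub(r"\}\{", "}\n{", text)


# Hypothetical sample data, three records stuck together on one line.
sample = "{'a': 1}{'b': 2}{'c': 3}"
print(split_with_replace(sample))
```

Both functions should produce the same result on input like this, so the regex version adds an import and escaping rules without changing the output.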

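As a side note, just nest that loop inside another with for the file you are writing; there is no reason to switch to a manual open and close for it. That is: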

with open("orderProductsStructuredDataFinal001", "w", encoding='utf-8') as f:
    with open('orderProductsStructuredData3') as inf:
        for line in inf:
            line = line.replace('}{', '}\n{')
            f.write(line)

You could put both open calls in a single with to similar effect; but in this case they are so long that I think it would hurt readability.
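For comparison, here is a rough sketch of the single-with form. Wrapping it in a function with short parameter names is one way to keep the line length manageable. The function name insert_line_breaks and its parameters are hypothetical, and reading the input as UTF-8 is an assumption (the original code opens the input with the default encoding).

```python
def insert_line_breaks(src_path, dst_path):
    # Both files are opened in one with statement and closed together,
    # even if an error occurs part-way through.
    # Assumption: the input is UTF-8, like the output.
    with open(src_path, encoding="utf-8") as inf, open(dst_path, "w", encoding="utf-8") as f:
        for line in inf:
            # Insert a newline between every adjacent "}{" pair.
            f.write(line.replace("}{", "}\n{"))
```

With the file names from the question, the call would be insert_line_breaks('orderProductsStructuredData3', 'orderProductsStructuredDataFinal001').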