How to use Python to convert less-than and greater-than signs to entities within a parent tag?

I'm struggling to write working code that converts < and > to \&lt; and \&gt; inside the <sep> parent tag. The original text looks like this:

<xml>
<body>
<month>
<sep>Hello world!<p>This is 
september!</p> Hello world!<b>And today's Firday!</b></sep>
</month>
<month>
<sep><i>This is October!<i></sep>
</month>
</body>
</xml>

The result should be:

<xml>
<body>
<month>
<sep>Hello world!\&lt;p\&gt;This is 
september!\&lt;/p\&gt; Hello world!\&lt;b\&gt;And today's Firday!\&lt;/b\&gt;</sep>
</month>
<month>
<sep>\&lt;i\&gt;This is October!\&lt;i\&gt;</sep>
</month>
</body>
</xml>

So far, my code looks like this:

text1 = re.findall(r"<sep>((.|\n)*?)<\/sep>", f.read())
text2 = re.sub(r"<(.*?)>", r"\&lt;"+r"\1"+"\&gt;", text1)

But how do I put the converted text back into the original file? Thanks!

sample = """<xml>
<body>
<month>
<sep>Hello world!<p>This is 
september!</p> Hello world!<b>And today's Firday!</b></sep>
</month>
<month>
<sep><i>This is October!<i></sep>
</month>
</body>
</xml>"""

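import re

def encode_text(in_txt):
  # Strings are immutable, so rebinding out_txt never alters in_txt.
  out_txt = in_txt
  # findall returns (body, last_char) tuples because the pattern has two groups.
  matches = re.findall(r"<sep>((.|\n)*?)<\/sep>", in_txt)
  for (txt, _) in matches:
    # \1 re-inserts whatever was captured between < and >.
    out_txt = out_txt.replace(txt, re.sub(r"<(.*?)>", r"\&lt;\1\&gt;", txt), 1)
  return out_txt

def decode_text(in_txt):
  out_txt = in_txt
  matches = re.findall(r"<sep>((.|\n)*?)<\/sep>", in_txt)
  for (txt, _) in matches:
    # Reverse step: turn \&lt;...\&gt; back into <...>.
    out_txt = out_txt.replace(txt, re.sub(r"\\&lt;(.*?)\\&gt;", r"<\1>", txt), 1)
  return out_txt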

result_encoded = encode_text(sample)
result_decoded = decode_text(result_encoded)

print(result_encoded) prints:

<xml>
<body>
<month>
<sep>Hello world!\&lt;p\&gt;This is 
september!\&lt;/p\&gt; Hello world!\&lt;b\&gt;And today's Firday!\&lt;/b\&gt;</sep>
</month>
<month>
<sep>\&lt;i\&gt;This is October!\&lt;i\&gt;</sep>
</month>
</body>
</xml>

print(result_decoded) prints:

<xml>
<body>
<month>
<sep>Hello world!<p>This is 
september!</p> Hello world!<b>And today's Firday!</b></sep>
</month>
<month>
<sep><i>This is October!<i></sep>
</month>
</body>
</xml>

Also, note that:

result_decoded == sample
Out[87]: True
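
As for putting the converted text back into the file: one possible sketch, not the only way, is to do the substitution in a single pass with a replacement function and then overwrite the file. The names encode_sep and encode_file below are made up for illustration, and the code assumes the file is UTF-8 and small enough to read into memory at once.

```python
import re
from pathlib import Path

# Matches one <sep>...</sep> element; DOTALL lets "." span line breaks.
SEP_RE = re.compile(r"(<sep>)(.*?)(</sep>)", re.DOTALL)
# Matches a single <...> on one line, same idea as the pattern above.
TAG_RE = re.compile(r"<(.*?)>")


def encode_sep(text):
    # Rewrite each <sep> body in place, leaving everything outside untouched.
    def _encode(match):
        body = TAG_RE.sub(
            lambda m: r"\&lt;" + m.group(1) + r"\&gt;", match.group(2)
        )
        return match.group(1) + body + match.group(3)

    return SEP_RE.sub(_encode, text)


def encode_file(path):
    # Hypothetical helper: read the file, encode it, write it back in place.
    # Assumes UTF-8 content.
    p = Path(path)
    p.write_text(encode_sep(p.read_text(encoding="utf-8")), encoding="utf-8")
```

Because the replacement function only ever sees the matched <sep> element, this avoids the str.replace lookup, which could hit the wrong spot if the same text also appears outside a <sep> tag.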