Processing data from XML tags in Python

Question

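I am trying to extract data from an XML document using Python.

The tool I am currently trying, and which seems to be a stable choice, is lxml.

The problem is that the tutorials and questions I have come across all assume the XML document is formatted like this: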

<note> 
   <to>Tove</to> 
   <from>Jani</from> 
   <heading>Reminder</heading> 
   <body>Don't forget me this weekend!</body> 
</note>

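That is, with the values held as text between the opening and closing tags.

However, the document I am trying to extract from has its values inside the tag elements themselves, like this: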

<note> 
   <to id="16" name="Tove"/>
   <from id="341" name="Jani"/>
   <heading id="1" name="Reminder"/> 
   <body id="2" name="Don't forget me this weekend!"/> 
</note>

This is how I am trying to do it in lxml:

xml_file = lxml.etree.parse("test.xml")

notes = xml_file.xpath("//note")

for note in notes:
    note_id = note.find("id").text
    print note_id

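This just returns "None".

I have since found out that .text is what gets the data from inside an XML tag, but I cannot find out how to get the data out of elements like the ones shown above.

Can anyone point me in the right direction?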

Answer 1

To access attributes, use attrib:

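import lxml.etree

xml_file = lxml.etree.parse("test.xml")

notes = xml_file.xpath("//note")

for note in notes:
    # Iterating over an element yields its children; each child exposes
    # its attributes through the dict-like .attrib property.
    print([dict(child.attrib) for child in note])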

Further reading: http://lxml.de/tutorial.html#elements-carry-attributes-as-a-dict
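
To pull out individual values rather than whole attribute dicts, something like the following may help. It is only a sketch: it parses the question's sample document from an inline string (with the missing "=" in the from element restored) instead of test.xml, and the helper name read_children is made up for illustration. Note that find("id") in the question looks for a child element named id, which does not exist in this document; id and name are attributes of the child elements.

```python
from lxml import etree

# Sample document from the question, inlined so the sketch is self-contained.
XML = b"""<note>
   <to id="16" name="Tove"/>
   <from id="341" name="Jani"/>
   <heading id="1" name="Reminder"/>
   <body id="2" name="Don't forget me this weekend!"/>
</note>"""

root = etree.fromstring(XML)  # root is the <note> element


def read_children(note):
    """Hypothetical helper: return (tag, id, name) for each child of a note."""
    rows = []
    for child in note:
        # .get() returns None when the attribute is absent, instead of raising.
        rows.append((child.tag, child.get("id"), child.get("name")))
    return rows


rows = read_children(root)
for tag, id_, name in rows:
    print(tag, id_, name)

# XPath can also select attribute values directly with the @ syntax.
to_ids = root.xpath("//note/to/@id")
print(to_ids)
```

Using .get() rather than indexing attrib directly avoids a KeyError when an element lacks the attribute, which can be convenient if the real document is less regular than the sample.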
