Python lxml: how to fetch XML tag names with xpath selector?
I am trying to parse the following XML with Python and lxml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/bind9.xsl"?>
<isc version="1.0">
<bind>
<statistics version="2.2">
<memory>
<summary>
<TotalUse>1232952256
</TotalUse>
<InUse>835252452
</InUse>
<BlockSize>598212608
</BlockSize>
<ContextSize>52670016
</ContextSize>
<Lost>0
</Lost>
</summary>
</memory>
</statistics>
</bind>
</isc>
The goal is to extract the tag name and text of every element under bind/statistics/memory/summary, producing the following mapping:
TotalUse: 1232952256
InUse: 835252452
BlockSize: 598212608
ContextSize: 52670016
Lost: 0
I have managed to extract the element values, but I can't figure out the XPath expression that returns the element tag names.
Sample script:
from lxml import etree as et

def main():
    xmlfile = "bind982.xml"
    location = "bind/statistics/memory/summary/*"
    label_selector = "??????"  ## what to put here...?
    value_selector = "text()"

    with open(xmlfile, "r") as data:
        xmldata = et.parse(data)

    etree = xmldata.getroot()
    statlist = etree.xpath(location)
    for stat in statlist:
        label = stat.xpath(label_selector)[0]
        value = stat.xpath(value_selector)[0]
        print("{0}: {1}".format(label, value))

if __name__ == '__main__':
    main()
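I know I could use stat.tag instead of stat.xpath(), but the script must be generic enough to handle other parts of the XML where the label selector is different.
Which XPath selector returns an element's tag name?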
I don't think you need XPath for either value: element nodes have tag and text attributes, so you can, for example, use a list comprehension:
[(element.tag, element.text) for element in etree.xpath(location)]
Or, if you really want to use XPath:
result = [(element.xpath('name()'), element.xpath('string()')) for element in etree.xpath(location)]
You could of course also build a list of dictionaries:
result = [{ element.tag : element.text } for element in etree.xpath(location)]
or
result = [{ element.xpath('name()') : element.xpath('string()') } for element in etree.xpath(location)]
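To see that the two approaches above give the same labels and values, here is a minimal sketch that parses a shortened copy of the question's XML from an inline byte string instead of a file. The XML constant, the variable names by_attr and by_xpath, and the use of normalize-space() to trim the trailing newline inside each element are my own choices for illustration, not part of the original answer.

```python
from lxml import etree

# Shortened copy of the XML from the question, inlined for the example.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<isc version="1.0">
  <bind>
    <statistics version="2.2">
      <memory>
        <summary>
          <TotalUse>1232952256
          </TotalUse>
          <Lost>0
          </Lost>
        </summary>
      </memory>
    </statistics>
  </bind>
</isc>
"""

root = etree.fromstring(XML)
location = "bind/statistics/memory/summary/*"

# Variant 1: plain element attributes, stripping surrounding whitespace.
by_attr = {el.tag: el.text.strip() for el in root.xpath(location)}

# Variant 2: XPath string functions evaluated relative to each element.
# name() and normalize-space() both return a string, not a list.
by_xpath = {
    el.xpath("name()"): el.xpath("normalize-space()")
    for el in root.xpath(location)
}

print(by_attr)
print(by_xpath)
```

One caveat worth knowing: for documents that use XML namespaces the two variants stop agreeing, because element.tag includes the namespace URI in braces while name() returns the prefixed name. The sample XML here has no namespaces, so this does not matter for the question as asked.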
Simply use XPath's name() function, and drop the [0] index, because name() returns a string rather than a list.
from lxml import etree as et

def main():
    xmlfile = "ExtractXPathTagName.xml"
    location = "bind/statistics/memory/summary/*"
    label_selector = "name()"  # XPath function that returns the element's tag name
    value_selector = "text()"

    with open(xmlfile, "r") as data:
        xmldata = et.parse(data)

    etree = xmldata.getroot()
    statlist = etree.xpath(location)
    for stat in statlist:
        label = stat.xpath(label_selector)     # a string, so no [0]
        value = stat.xpath(value_selector)[0]  # a list of text nodes
        # strip() removes the trailing newline inside each element
        print("{0}: {1}".format(label, value).strip())

if __name__ == '__main__':
    main()
Output:
TotalUse: 1232952256
InUse: 835252452
BlockSize: 598212608
ContextSize: 52670016
Lost: 0
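Since the question asks for a script that stays generic when the label selector changes, the difference between selectors that return a string (such as name()) and selectors that return a list (such as text() or @attr) can be hidden in a small helper. The following is only a sketch under my own assumptions: the function names evaluate and extract, the inline sample data, and the attribute-based counters document are hypothetical and do not come from the question. It also assumes the selectors yield text, attributes or strings rather than whole elements.

```python
from lxml import etree


def evaluate(node, selector):
    """Evaluate an XPath selector and return a single value.

    lxml returns a list for node-set results (text(), @attr, ...) and a
    plain string, float or bool for XPath functions such as name().
    """
    result = node.xpath(selector)
    if isinstance(result, list):
        return result[0] if result else None
    return result


def extract(root, location, label_selector, value_selector):
    """Map label -> value for every node matched by location."""
    pairs = {}
    for node in root.xpath(location):
        label = evaluate(node, label_selector)
        value = evaluate(node, value_selector)
        pairs[str(label)] = "" if value is None else str(value).strip()
    return pairs


# Labels taken from the tag name, as in the question.
summary = etree.fromstring(
    b"<summary><TotalUse>1232952256\n</TotalUse><Lost>0\n</Lost></summary>"
)
tag_labels = extract(summary, "*", "name()", "text()")

# Hypothetical other part of the XML where the label lives in an attribute.
counters = etree.fromstring(
    b'<counters><counter name="hits">7</counter></counters>'
)
attr_labels = extract(counters, "counter", "@name", "text()")

print(tag_labels)
print(attr_labels)
```

With this shape, the calling code no longer needs to know whether a given selector needs the [0] index or not.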