Get all children of specific node in Python
I have the following example.xml structure:
<ParentOne>
    <SiblingOneA>This is Sibling One A</SiblingOneA>
    <SiblingTwoA>
        <ChildOneA>Value of child one A</ChildOneA>
        <ChildTwoA>Value of child two A</ChildTwoA>
    </SiblingTwoA>
</ParentOne>
<ParentTwo>
    <SiblingOneA>This is a different value for Sibling one A</SiblingOneA>
    <SiblingTwoA>
        <ChildOneA>This is a different value for Child one A</ChildOneA>
        <ChildTwoA>This is a different value for Child Two A</ChildTwoA>
    </SiblingTwoA>
</ParentTwo>
<ParentThree>
    <SiblingOneA>A final value for Sibling one A</SiblingOneA>
    <SiblingTwoA>
        <ChildOneA>A final value for Child one A</ChildOneA>
        <ChildTwoA>A final value for Child one A</ChildTwoA>
    </SiblingTwoA>
</ParentThree>
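My main requirement is to iterate over every node and, whenever the current node is "SiblingOneA", check whether the immediately following sibling is "SiblingTwoA". If it is, the code should retrieve all of that sibling's child nodes (both the elements themselves and the values inside them).
This is my code so far: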
import os
from lxml import etree

XMLDoc = etree.parse('example.xml')
rootXMLElement = XMLDoc.getroot()
tree = etree.parse('example.xml')

for Node in XMLDoc.xpath('//*'):
    # The last step of the element's path is its tag name
    if os.path.basename(XMLDoc.getpath(Node)) == "SiblingOneA":
        if Node.getnext() is not None:
            if Node.getnext().tag == "SiblingTwoA":
                # RETRIEVE ALL THE CHILDREN ELEMENTS OF THAT SPECIFIC SiblingTwoA NODE AND THEIR VALUES
                pass
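As you may have inferred from the code above, I do not know what to put in place of the comment to retrieve all the child elements and values of the "SiblingTwoA" node. In addition, the code should not return the children of every SiblingTwoA node in the whole tree, only those of the one in question (that is, the element returned by Node.getnext()). You will also notice that many of the elements have the same names but different values.
Edit:
I have been able to retrieve the children of the relevant element using Node.getnext().getchildren().
However, this returns the information as lists of element objects, such as: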
[<Element ChildOneA at 0x101a95870>, <Element ChildTwoA at 0x101a958c0>]
[<Element ChildOneA at 0x101a95a50>, <Element ChildTwoA at 0x101a95aa0>]
[<Element ChildOneA at 0x101a95c30>, <Element ChildTwoA at 0x101a95c80>]
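How do I retrieve the actual values inside the elements?
For example, for the first iteration, the output I want looks something like: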
ChildOneA = Value of child one A
ChildTwoA = Value of child two A
If you want to produce a flat list (['Value of child one A', 'Value of child two A', 'This is a different value for Child one A', 'This is a different value for Child Two A', 'A final value for Child one A', 'A final value for Child one A']), you can use the following, where doc is the parsed document (XMLDoc in the question's code):
[child.xpath('string()') for sibling in doc.xpath('//SiblingTwoA[preceding-sibling::*[1][self::SiblingOneA]]') for child in sibling.xpath('*')]
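The flat-list expression can be tried out with a small self-contained script. This is only a sketch: the sample XML from the question has no single root element, so it is wrapped here in a hypothetical <Root> element (an assumption, not part of the original file) to make it parseable, and the names XML, SIBLING_XPATH and flat are illustrative.

```python
from lxml import etree

# Assumption: the question's fragments wrapped in a hypothetical <Root>
# element so that the sample is well-formed XML.
XML = b"""<Root>
<ParentOne>
  <SiblingOneA>This is Sibling One A</SiblingOneA>
  <SiblingTwoA>
    <ChildOneA>Value of child one A</ChildOneA>
    <ChildTwoA>Value of child two A</ChildTwoA>
  </SiblingTwoA>
</ParentOne>
<ParentTwo>
  <SiblingOneA>This is a different value for Sibling one A</SiblingOneA>
  <SiblingTwoA>
    <ChildOneA>This is a different value for Child one A</ChildOneA>
    <ChildTwoA>This is a different value for Child Two A</ChildTwoA>
  </SiblingTwoA>
</ParentTwo>
<ParentThree>
  <SiblingOneA>A final value for Sibling one A</SiblingOneA>
  <SiblingTwoA>
    <ChildOneA>A final value for Child one A</ChildOneA>
    <ChildTwoA>A final value for Child one A</ChildTwoA>
  </SiblingTwoA>
</ParentThree>
</Root>"""

doc = etree.fromstring(XML)

# Select every SiblingTwoA whose nearest preceding element sibling
# is a SiblingOneA.
SIBLING_XPATH = '//SiblingTwoA[preceding-sibling::*[1][self::SiblingOneA]]'

# string() evaluates to the text content of each child element.
flat = [
    child.xpath('string()')
    for sibling in doc.xpath(SIBLING_XPATH)
    for child in sibling.xpath('*')
]
print(flat)
```

The predicate preceding-sibling::*[1] looks only at element siblings, so the whitespace between the tags does not interfere with the match.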
To produce a nested list ([['Value of child one A', 'Value of child two A'], ['This is a different value for Child one A', 'This is a different value for Child Two A'], ['A final value for Child one A', 'A final value for Child one A']]), you can use
[[child.xpath('string()') for child in sibling.xpath('*')] for sibling in doc.xpath('//SiblingTwoA[preceding-sibling::*[1][self::SiblingOneA]]')]
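For the "ChildOneA = Value of child one A" style of output asked for in the question, a loop over getnext() that reads each child's tag and text may be enough. The following is a minimal sketch, not the answer's own method: the <Root> wrapper and the helper name collect_child_pairs are assumptions made for illustration.

```python
from lxml import etree

# Assumption: the question's fragments wrapped in a hypothetical <Root>
# element so that the sample is well-formed XML.
XML = b"""<Root>
<ParentOne>
  <SiblingOneA>This is Sibling One A</SiblingOneA>
  <SiblingTwoA>
    <ChildOneA>Value of child one A</ChildOneA>
    <ChildTwoA>Value of child two A</ChildTwoA>
  </SiblingTwoA>
</ParentOne>
<ParentTwo>
  <SiblingOneA>This is a different value for Sibling one A</SiblingOneA>
  <SiblingTwoA>
    <ChildOneA>This is a different value for Child one A</ChildOneA>
    <ChildTwoA>This is a different value for Child Two A</ChildTwoA>
  </SiblingTwoA>
</ParentTwo>
<ParentThree>
  <SiblingOneA>A final value for Sibling one A</SiblingOneA>
  <SiblingTwoA>
    <ChildOneA>A final value for Child one A</ChildOneA>
    <ChildTwoA>A final value for Child one A</ChildTwoA>
  </SiblingTwoA>
</ParentThree>
</Root>"""


def collect_child_pairs(root):
    """Hypothetical helper: for each SiblingOneA directly followed by a
    SiblingTwoA, collect that SiblingTwoA's children as (tag, text) pairs."""
    groups = []
    for node in root.iter('SiblingOneA'):
        following = node.getnext()
        if following is not None and following.tag == 'SiblingTwoA':
            # Iterating an element yields its direct children.
            groups.append([(child.tag, child.text) for child in following])
    return groups


root = etree.fromstring(XML)
pairs = collect_child_pairs(root)

for group in pairs:
    for tag, text in group:
        print(f"{tag} = {text}")
```

Because only node.getnext() is inspected, each group holds the children of the one SiblingTwoA in question rather than of every SiblingTwoA in the tree. Note that child.text returns only the text before any nested element, whereas the XPath string() used above concatenates all descendant text; for the leaf elements in this sample the two give the same result.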