Parse XML with Python

I am trying to parse an XML document with Python. Here is my code:

from xml.dom import minidom

xmldoc = minidom.parse("aula.xml")

hosts = xmldoc.getElementsByTagName("host")

for host in hosts:
    address = host.getElementByTag("address")
    ip = address.attributes["addr"]
    IP = ip.value
    print("IP:%s"%(IP))

Running it gives me:

Traceback (most recent call last):
  File "simple.py", line 8, in <module>
    address = host.getElementByTag("address")
AttributeError: Element instance has no attribute 'getElementByTag'

The XML file:

<?xml version="1.0"?>
<!DOCTYPE nmaprun>
<?xml-stylesheet href="file:///usr/local/bin/../share/nmap/nmap.xsl" type="text/xsl"?>
<!-- Nmap 6.47 scan initiated Thu Feb 26 11:38:24 2015 as: nmap -oX aula.xml x.x.x.x/26 -->
<nmaprun scanner="nmap" args="nmap -oX aula.xml x.x.x.x/26" start="1424947104" startstr="Thu Feb 26 11:38:24 2015" version="6.47" xmloutputversion="1.04">
  <scaninfo type="connect" protocol="tcp" numservices="1000" services="a lot of numbers"/>
  <verbose level="0"/>
  <debugging level="0"/>
  <host starttime="1424947104" endtime="1424947111"><status state="up" reason="conn-refused" reason_ttl="0"/>
    <address addr="x.x.x.x" addrtype="ipv4"/>
    <hostnames>
    </hostnames>
    <ports>
        <extraports state="closed" count="998">
            <extrareasons reason="conn-refused" count="998"/>
        </extraports>
        <port protocol="tcp" portid="22"><state state="open" reason="syn-ack" reason_ttl="0"/><service name="ssh" method="table" conf="3"/></port>
        <port protocol="tcp" portid="111"><state state="open" reason="syn-ack" reason_ttl="0"/><service name="rpcbind" method="table" conf="3"/></port>
    </ports>
    <times srtt="1280" rttvar="264" to="100000"/>
</host>
</nmaprun>

A host can contain more than one address element, so use

address = host.getElementsByTagName("address")[0]

to get the first address.
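As a minimal sketch of the indexing approach above: the snippet below parses a small inline sample instead of aula.xml so that it runs on its own. The SAMPLE string, the documentation-range addresses in it, and the first_addrs name are assumptions made for illustration. It also guards against hosts that have no address element, since indexing an empty result with [0] would raise IndexError.

```python
from xml.dom import minidom

# Hypothetical sample, trimmed down from the nmap output in the question.
SAMPLE = """<nmaprun>
  <host>
    <address addr="192.0.2.10" addrtype="ipv4"/>
    <address addr="00:00:5E:00:53:01" addrtype="mac"/>
  </host>
</nmaprun>"""

xmldoc = minidom.parseString(SAMPLE)

first_addrs = []
for host in xmldoc.getElementsByTagName("host"):
    # getElementsByTagName returns a list-like NodeList, possibly empty.
    addresses = host.getElementsByTagName("address")
    if addresses:
        # Take only the first match and read its addr attribute as a string.
        first_addrs.append(addresses[0].getAttribute("addr"))

print(first_addrs)
```

getAttribute("addr") returns the attribute value directly as a string, which is equivalent here to attributes["addr"].value.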

The method you are looking for is getElementsByTagName, not getElementByTag. Its return value is a list of the matching elements, so you have to iterate over it. For example:

from xml.dom import minidom

xmldoc = minidom.parse("aula.xml")

hosts = xmldoc.getElementsByTagName("host")

for host in hosts:
    addresses = host.getElementsByTagName("address")
    for address in addresses:
        ip = address.attributes["addr"]
        IP = ip.value
        print("IP:%s"%(IP))
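
  1. There is no getElementByTag method. Use the getElementsByTagName() method to get the list of address elements.
  2. Check that the list is not empty, then read the value of the addr attribute from the address element.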

Demo:

from xml.dom import minidom

p = '/home/vivek/Desktop/Work/input_minidom.xml'
xmldoc = minidom.parse(p)

hosts = xmldoc.getElementsByTagName("host")

for host in hosts:
    address = host.getElementsByTagName("address")
    if address:
        ip = address[0].attributes["addr"]
        IP = ip.value
        print("IP:%s"%(IP))

Output:

$ python task4.py 
IP:x.x.x.x
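
Each address element in the question's XML also carries an addrtype attribute, so the same loop can be narrowed to a single address type. The following is only a sketch under assumptions: the host_addresses helper, its addrtype parameter, and the SAMPLE data are hypothetical names introduced for illustration, and the input is parsed from a string so the example is self-contained.

```python
from xml.dom import minidom


def host_addresses(xml_text, addrtype="ipv4"):
    """Return one list of addr values per <host>, keeping only the given addrtype."""
    xmldoc = minidom.parseString(xml_text)
    result = []
    for host in xmldoc.getElementsByTagName("host"):
        addrs = [
            node.getAttribute("addr")
            for node in host.getElementsByTagName("address")
            if node.getAttribute("addrtype") == addrtype
        ]
        # Hosts without a matching address contribute an empty list.
        result.append(addrs)
    return result


# Hypothetical sample: one host with two addresses, one host with none.
SAMPLE = """<nmaprun>
  <host>
    <address addr="192.0.2.10" addrtype="ipv4"/>
    <address addr="00:00:5E:00:53:01" addrtype="mac"/>
  </host>
  <host>
    <hostnames/>
  </host>
</nmaprun>"""

print(host_addresses(SAMPLE))
print(host_addresses(SAMPLE, addrtype="mac"))
```

Returning one list per host keeps the result aligned with the order of the host elements, so a host without a matching address is still visible as an empty list rather than being silently skipped.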