XML not returning correct child tags/data in Python

Question

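Hello, I am making a request call that returns order data from an online store. My problem is that once I pass the data to the root variable, the iter method does not return the correct results: for example, it shows multiple tags with the same name instead of one, and it does not show the data inside the tags.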

I thought this was because the XML was not formatted properly, so I formatted it by saving it to a file using pretty_print, but that did not fix the error.

How do I fix this? Thanks in advance.

Code:

import requests, xml.etree.ElementTree as ET, lxml.etree as etree

url="http://publicapi.ekmpowershop24.com/v1.1/publicapi.asmx"
headers = {'content-type': 'application/soap+xml'}
body = """<?xml version="1.0" encoding="utf-8"?>
<soap12:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap12="http://www.w3.org/2003/05/soap-envelope">
  <soap12:Body>
    <GetOrders xmlns="http://publicapi.ekmpowershop.com/">
      <GetOrdersRequest>
        <APIKey>my_api_key</APIKey>
        <FromDate>01/07/2018</FromDate>
        <ToDate>04/07/2018</ToDate>
      </GetOrdersRequest>
    </GetOrders>
  </soap12:Body>
</soap12:Envelope>"""

#send request to ekm
r = requests.post(url,data=body,headers=headers)

#save output to file
file = open("C:/Users/Mark/Desktop/test.xml", "w")
file.write(r.text)
file.close()

#take the file and format the xml
x = etree.parse("C:/Users/Mark/Desktop/test.xml")
newString = etree.tostring(x, pretty_print=True)
file = open("C:/Users/Mark/Desktop/test.xml", "w")
file.write(newString.decode('utf-8'))
file.close()

#parse the file to get the roots
tree = ET.parse("C:/Users/Mark/Desktop/test.xml")
root = tree.getroot()

#access elements names in the data
for child in root.iter('*'):
    print(child.tag)

#show orders elements attributes
tree = ET.parse("C:/Users/Mark/Desktop/test.xml")
root = tree.getroot()
for order in root.iter('{http://publicapi.ekmpowershop.com/}Order'):
    out = {}
    for child in order:
        if child.tag in ('OrderID'):
            out[child.tag] = child.text
    print(out)

Element output:

{http://publicapi.ekmpowershop.com/}Orders
{http://publicapi.ekmpowershop.com/}Order
{http://publicapi.ekmpowershop.com/}OrderID
{http://publicapi.ekmpowershop.com/}OrderNumber
{http://publicapi.ekmpowershop.com/}CustomerID
{http://publicapi.ekmpowershop.com/}CustomerUserID
{http://publicapi.ekmpowershop.com/}Order
{http://publicapi.ekmpowershop.com/}OrderID
{http://publicapi.ekmpowershop.com/}OrderNumber
{http://publicapi.ekmpowershop.com/}CustomerID
{http://publicapi.ekmpowershop.com/}CustomerUserID

Order output:

{http://publicapi.ekmpowershop.com/}Order {}
{http://publicapi.ekmpowershop.com/}Order {}

Structure of the XML after formatting:

 <soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Body>
    <GetOrdersResponse xmlns="http://publicapi.ekmpowershop.com/">
      <GetOrdersResult>
        <Status>Success</Status>
        <Errors/>
        <Date>2018-07-10T13:47:00.1682029+01:00</Date>
        <TotalOrders>10</TotalOrders>
        <TotalCost>100</TotalCost>
        <Orders>
          <Order>
            <OrderID>100</OrderID>
            <OrderNumber>102/040718/67</OrderNumber>
            <CustomerID>6910</CustomerID>
            <CustomerUserID>204</CustomerUserID>
            <FirstName>TestFirst</FirstName>
            <LastName>TestLast</LastName>
            <CompanyName>Test Company</CompanyName>
            <EmailAddress>test@Test.com</EmailAddress>
            <OrderStatus>Dispatched</OrderStatus>
            <OrderStatusColour>#00CC00</OrderStatusColour>
            <TotalCost>85.8</TotalCost>
            <OrderDate>10/07/2018 14:30:43</OrderDate>
            <OrderDateISO>2018-07-10T14:30:43</OrderDateISO>
            <AbandonedOrder>false</AbandonedOrder>
            <EkmStatus>SUCCESS</EkmStatus>
          </Order>
        </Orders>
        <Currency>GBP</Currency>
      </GetOrdersResult>
    </GetOrdersResponse>
  </soap:Body>
</soap:Envelope>

Answer 1

You need to take the namespace into account when checking the tags.

>>> # Include the namespace part of the tag in the tag values that we check.
>>> tags = ('{http://publicapi.ekmpowershop.com/}OrderID', '{http://publicapi.ekmpowershop.com/}OrderNumber')
>>> for order in root.iter('{http://publicapi.ekmpowershop.com/}Order'):
...     out = {}
...     for child in order:
...         if child.tag in tags:
...             out[child.tag] = child.text
...     print(out)
... 
{'{http://publicapi.ekmpowershop.com/}OrderID': '100', '{http://publicapi.ekmpowershop.com/}OrderNumber': '102/040718/67'}
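As a side note on why the loop in the question printed empty dictionaries, here is a minimal sketch. The `SAMPLE` string is a trimmed-down stand-in for the real response, and the variable names are only illustrative. It shows two things at once: ElementTree stores the namespace URI inside `child.tag`, and `('OrderID')` without a trailing comma is a plain string rather than a tuple, so `in` performs a substring test.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the response shown in the question (assumption).
SAMPLE = """<Orders xmlns="http://publicapi.ekmpowershop.com/">
  <Order><OrderID>100</OrderID></Order>
</Orders>"""

root = ET.fromstring(SAMPLE)
order_id = root[0][0]  # Orders -> Order -> OrderID

# ElementTree keeps the namespace URI in the tag, wrapped in braces.
print(order_id.tag)  # {http://publicapi.ekmpowershop.com/}OrderID

# ('OrderID') is just the string 'OrderID', so this asks whether the long
# qualified tag is a substring of 'OrderID', which it is not.
matched_original = order_id.tag in ('OrderID')

# A one-element tuple needs a trailing comma, and the entry has to be the
# fully qualified tag.
matched_fixed = order_id.tag in ('{http://publicapi.ekmpowershop.com/}OrderID',)

print(matched_original, matched_fixed)  # False True
```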

If you don't want the namespace prefixes in the output, you can strip them by keeping only the part of the tag that follows the } character.

>>> for order in root.iter('{http://publicapi.ekmpowershop.com/}Order'):
...     out = {}
...     for child in order:
...         if child.tag in tags:
...             out[child.tag[child.tag.index('}')+1:]] = child.text
...     print(out)
... 
{'OrderID': '100', 'OrderNumber': '102/040718/67'}
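Another option, sketched here under some assumptions, is to pass a prefix-to-URI mapping to `findall` and `findtext` so the long URI is written only once. The prefix `ekm`, the helper name `extract_orders`, and the abbreviated `RESPONSE` string are illustrative choices, not part of the original code. Parsing the text directly with `ET.fromstring` also makes the save-to-file and pretty-print steps unnecessary, since whitespace formatting does not affect how the tree is parsed.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the response from the question, used instead of a live request.
RESPONSE = """<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <GetOrdersResponse xmlns="http://publicapi.ekmpowershop.com/">
      <GetOrdersResult>
        <Orders>
          <Order>
            <OrderID>100</OrderID>
            <OrderNumber>102/040718/67</OrderNumber>
          </Order>
        </Orders>
      </GetOrdersResult>
    </GetOrdersResponse>
  </soap:Body>
</soap:Envelope>"""

# "ekm" is an arbitrary local alias; only the URI has to match the document.
NS = {"ekm": "http://publicapi.ekmpowershop.com/"}


def extract_orders(xml_text, fields=("OrderID", "OrderNumber")):
    """Return one dict per <Order>, keyed by the unqualified field names."""
    root = ET.fromstring(xml_text)
    orders = []
    # ".//ekm:Order" finds Order elements at any depth below the root.
    for order in root.findall(".//ekm:Order", NS):
        orders.append(
            {name: order.findtext(f"ekm:{name}", namespaces=NS) for name in fields}
        )
    return orders


orders = extract_orders(RESPONSE)
print(orders)  # [{'OrderID': '100', 'OrderNumber': '102/040718/67'}]
```

With the live request from the question, the same helper could presumably be called as `extract_orders(r.content)`, which hands the raw bytes to the parser so it can apply the encoding declared in the XML header.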
