PHP - parsing XML which has namespace elements

I have read other posts and solutions, but they don't work for me - or perhaps I don't understand them well enough.

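I have an HP network scanner and a Perl script that interacts with it through a series of transactions so that I can start a scan. I am working on porting it directly to PHP, which is a better fit for the server I want to run it on. Some of the transactions work and some don't. This is about one that doesn't.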

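I took the XML from one of the queries, but it does not parse successfully (or this is the part I don't understand well). I am running PHP version 7.1.12, in case that is relevant.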

My test output is as follows:

> php xmltest.php
SimpleXMLElement Object
(
)
object(SimpleXMLElement)#1 (0) {
}
>

If the XML is simpler (without the namespace information, I think), then print_r() is very verbose.

Here is the complete test script, with some actual data to process:

<?php

error_reporting( E_ALL );
ini_set('display_errors', 1);

$test_1 = <<<EOM
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope 
    xmlns:SOAP-ENV="http://www.w3.org/2003/05/soap-envelope"
    xmlns:SOAP-ENC="http://www.w3.org/2003/05/soap-encoding"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing"
    xmlns:wst="http://schemas.xmlsoap.org/ws/2004/09/transfer"
    xmlns:mex="http://schemas.xmlsoap.org/ws/2004/09/mex"
    xmlns:wsdp="http://schemas.xmlsoap.org/ws/2006/02/devprof"
    xmlns:PNPX="http://schemas.microsoft.com/windows/pnpx/2005/10"
    xmlns:UNS1="http://www.microsoft.com/windows/test/testdevice/11/2005"
    xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0"
    xmlns:wprt="http://schemas.microsoft.com/windows/2006/08/wdp/print"
    xmlns:wscn="http://schemas.microsoft.com/windows/2006/08/wdp/scan">
    <SOAP-ENV:Header>
        <wsa:To>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</wsa:To>
        <wsa:Action>http://schemas.xmlsoap.org/ws/2004/09/transfer/GetResponse</wsa:Action>
        <wsa:MessageID>urn:uuid:fec6e42d-5356-1f69-9c3a-001f2927cf33</wsa:MessageID>
        <wsa:RelatesTo>urn:uuid:704ccde5-6861-415d-bd65-31dd9d7a8b98</wsa:RelatesTo>
    </SOAP-ENV:Header>
    <SOAP-ENV:Body>
        <mex:Metadata>
            <mex:MetadataSection Dialect="http://schemas.xmlsoap.org/ws/2006/02/devprof/ThisDevice">
                <wsdp:ThisDevice>
                    <wsdp:FriendlyName xml:lang="en">Printer (HP Color LaserJet CM1312nfi MFP)</wsdp:FriendlyName>
                    <wsdp:FirmwareVersion>20140625</wsdp:FirmwareVersion>
                    <wsdp:SerialNumber>CNB885H665</wsdp:SerialNumber>
                </wsdp:ThisDevice>
            </mex:MetadataSection>
            <mex:MetadataSection Dialect="http://schemas.xmlsoap.org/ws/2006/02/devprof/ThisModel">
                <wsdp:ThisModel>
                    <wsdp:Manufacturer xml:lang="en">HP</wsdp:Manufacturer>
                    <wsdp:ManufacturerUrl>http://www.hp.com/</wsdp:ManufacturerUrl>
                    <wsdp:ModelName xml:lang="en">HP Color LaserJet CM1312nfi MFP</wsdp:ModelName>
                    <wsdp:ModelNumber>CM1312nfi MFP</wsdp:ModelNumber>
                    <wsdp:PresentationUrl>http://192.168.1.20:80/</wsdp:PresentationUrl>
                    <PNPX:DeviceCategory>Printers</PNPX:DeviceCategory>
                </wsdp:ThisModel>
            </mex:MetadataSection>
            <mex:MetadataSection Dialect="http://schemas.xmlsoap.org/ws/2006/02/devprof/Relationship">
                <wsdp:Relationship Type="http://schemas.xmlsoap.org/ws/2006/02/devprof/host">
                    <wsdp:Hosted>
                        <wsa:EndpointReference>
                            <wsa:Address>http://192.168.1.20:3910/</wsa:Address>
                            <wsa:ReferenceProperties>
                                <UNS1:ServiceIdentifier>uri:prn</UNS1:ServiceIdentifier>
                            </wsa:ReferenceProperties>
                        </wsa:EndpointReference>
                        <wsdp:Types>wprt:PrinterServiceType</wsdp:Types>
                        <wsdp:ServiceId>uri:1cd4F16e-7c8a-a7a0-3797-00145a8827ce</wsdp:ServiceId>
                        <PNPX:CompatibleId>http://schemas.microsoft.com/windows/2006/08/wdp/print/PrinterServiceType</PNPX:CompatibleId>
                    </wsdp:Hosted>
                </wsdp:Relationship>
            </mex:MetadataSection>
        </mex:Metadata>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
EOM;

$myxml1 = simplexml_load_string($test_1);
print_r($myxml1);
var_dump($myxml1);
exit;
?>

I want to extract a few of the parameters in there. One example is:

<wsa:Address>http://192.168.1.20:3910/</wsa:Address>

Can you help me fill the gap in my knowledge about how to access this parameter?

Thanks!

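First off, SOAP and namespaces just make parsing XML harder than it has to be. I have never parsed XML whose namespaces actually made it easier to understand or brought any benefit. I fully understand why namespaces exist, but it just means jumping through a few extra hoops to get at the data. The trick with namespaces is that you have to "enter into" the namespace branch by asking for the children that belong to that namespace.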

<?php

error_reporting( E_ALL );
ini_set('display_errors', 1);

$str = <<<EOM
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope 
    xmlns:SOAP-ENV="http://www.w3.org/2003/05/soap-envelope"
    xmlns:SOAP-ENC="http://www.w3.org/2003/05/soap-encoding"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing"
    xmlns:wst="http://schemas.xmlsoap.org/ws/2004/09/transfer"
    xmlns:mex="http://schemas.xmlsoap.org/ws/2004/09/mex"
    xmlns:wsdp="http://schemas.xmlsoap.org/ws/2006/02/devprof"
    xmlns:PNPX="http://schemas.microsoft.com/windows/pnpx/2005/10"
    xmlns:UNS1="http://www.microsoft.com/windows/test/testdevice/11/2005"
    xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0"
    xmlns:wprt="http://schemas.microsoft.com/windows/2006/08/wdp/print"
    xmlns:wscn="http://schemas.microsoft.com/windows/2006/08/wdp/scan">
    <SOAP-ENV:Header>
        <wsa:To>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</wsa:To>
        <wsa:Action>http://schemas.xmlsoap.org/ws/2004/09/transfer/GetResponse</wsa:Action>
        <wsa:MessageID>urn:uuid:fec6e42d-5356-1f69-9c3a-001f2927cf33</wsa:MessageID>
        <wsa:RelatesTo>urn:uuid:704ccde5-6861-415d-bd65-31dd9d7a8b98</wsa:RelatesTo>
    </SOAP-ENV:Header>
    <SOAP-ENV:Body>
        <mex:Metadata>
            <mex:MetadataSection Dialect="http://schemas.xmlsoap.org/ws/2006/02/devprof/ThisDevice">
                <wsdp:ThisDevice>
                    <wsdp:FriendlyName xml:lang="en">Printer (HP Color LaserJet CM1312nfi MFP)</wsdp:FriendlyName>
                    <wsdp:FirmwareVersion>20140625</wsdp:FirmwareVersion>
                    <wsdp:SerialNumber>CNB885H665</wsdp:SerialNumber>
                </wsdp:ThisDevice>
            </mex:MetadataSection>
            <mex:MetadataSection Dialect="http://schemas.xmlsoap.org/ws/2006/02/devprof/ThisModel">
                <wsdp:ThisModel>
                    <wsdp:Manufacturer xml:lang="en">HP</wsdp:Manufacturer>
                    <wsdp:ManufacturerUrl>http://www.hp.com/</wsdp:ManufacturerUrl>
                    <wsdp:ModelName xml:lang="en">HP Color LaserJet CM1312nfi MFP</wsdp:ModelName>
                    <wsdp:ModelNumber>CM1312nfi MFP</wsdp:ModelNumber>
                    <wsdp:PresentationUrl>http://192.168.1.20:80/</wsdp:PresentationUrl>
                    <PNPX:DeviceCategory>Printers</PNPX:DeviceCategory>
                </wsdp:ThisModel>
            </mex:MetadataSection>
            <mex:MetadataSection Dialect="http://schemas.xmlsoap.org/ws/2006/02/devprof/Relationship">
                <wsdp:Relationship Type="http://schemas.xmlsoap.org/ws/2006/02/devprof/host">
                    <wsdp:Hosted>
                        <wsa:EndpointReference>
                            <wsa:Address>http://192.168.1.20:3910/</wsa:Address>
                            <wsa:ReferenceProperties>
                                <UNS1:ServiceIdentifier>uri:prn</UNS1:ServiceIdentifier>
                            </wsa:ReferenceProperties>
                        </wsa:EndpointReference>
                        <wsdp:Types>wprt:PrinterServiceType</wsdp:Types>
                        <wsdp:ServiceId>uri:1cd4F16e-7c8a-a7a0-3797-00145a8827ce</wsdp:ServiceId>
                        <PNPX:CompatibleId>http://schemas.microsoft.com/windows/2006/08/wdp/print/PrinterServiceType</PNPX:CompatibleId>
                    </wsdp:Hosted>
                </wsdp:Relationship>
            </mex:MetadataSection>
        </mex:Metadata>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
EOM;

$xml = simplexml_load_string($str);

$namespaces = $xml->getNamespaces(true);

// Here we are saying that we want the Body node in the SOAP-ENV namespace
$body = $xml->children( $namespaces['SOAP-ENV'] )->Body;

// Inside that Body node, we want to get into the mex namespace
$mex = $body->children( $namespaces['mex'] );

// We want every MetadataSection inside the mex:Metadata node
$metadataSections = $mex->Metadata->MetadataSection;

// Loop through each of the MetadataSections
foreach( $metadataSections as $meta )
{
    // Get inside the wsdp namespace
    $wsdp = $meta->children( $namespaces['wsdp'] );

    // Check if there is a Hosted node inside a Relationship node
    if( isset( $wsdp->Relationship->Hosted ) )
    {
        // Get the wsa namespace inside the Hosted node
        $wsa = $wsdp->Relationship->Hosted->children( $namespaces['wsa'] );

        // If there is an Address inside the EndpointReference node
        if( isset( $wsa->EndpointReference->Address ) )
        {
            // Then output it
            echo $wsa->EndpointReference->Address;
        }
    }
}
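
To make the "enter into the namespace" step easier to see, here is a minimal sketch on a made-up two-namespace document. The prefixes `a` and `b`, their `urn:example:*` URIs, and the element names are assumptions for illustration only; they are not part of the scanner response. It also shows that the root has no children visible without a namespace, which appears to be why print_r() in the question prints an empty object.

```php
<?php
// Hypothetical, trimmed-down document: prefixes, URIs and element names are made up.
$doc = <<<XML
<a:root xmlns:a="urn:example:a" xmlns:b="urn:example:b">
    <a:item>
        <b:value>42</b:value>
    </a:item>
</a:root>
XML;

$root = simplexml_load_string($doc);

// Without arguments, children() does not see prefixed elements such as <a:item>.
$plainCount = count($root->children());

// Passing the namespace URI "enters" that namespace.
$item = $root->children('urn:example:a')->item;

// <b:value> lives in a different namespace, so switch again.
$value = (string) $item->children('urn:example:b')->value;

echo "plain children: $plainCount", PHP_EOL;
echo "value: $value", PHP_EOL;
```

Each call to children() only switches the namespace for that one level, which is why the longer example above repeats it for SOAP-ENV, mex, wsdp and wsa.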

As an extremely simple example - if you only want the wsa:Address element...

$myxml1 = simplexml_load_string($test_1);
$myxml1->registerXPathNamespace("wsa", "http://schemas.xmlsoap.org/ws/2004/08/addressing");
echo "wsa:Address=".(string)$myxml1->xpath("//wsa:Address")[0];

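This just makes sure the wsa namespace is registered on the document and available to the XPath expression. The XPath expression then simply says: fetch the wsa:Address element from anywhere in the document. Because xpath() returns a list of all matches (even when there is only one), [0] is used to take the first item. This outputs...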

wsa:Address=http://192.168.1.20:3910/
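
Indexing with [0] assumes there is at least one match. The following is a minimal sketch of a more defensive variant; `firstXPathText` is a hypothetical helper written for this example (it is not part of SimpleXML), and the sample document is a trimmed-down stand-in for the real response that reuses the same wsa namespace URI.

```php
<?php
// Hypothetical helper (not part of SimpleXML): returns the text of the first
// node matching $path, or null when nothing matches.
function firstXPathText(SimpleXMLElement $xml, string $path): ?string
{
    $matches = $xml->xpath($path);
    // xpath() gives an empty array when nothing matches and a non-array value on error.
    if (!is_array($matches) || count($matches) === 0) {
        return null;
    }
    return (string) $matches[0];
}

// Trimmed-down stand-in for the scanner response.
$sample = <<<XML
<root xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">
    <wsa:Address>http://192.168.1.20:3910/</wsa:Address>
</root>
XML;

$xml = simplexml_load_string($sample);
$xml->registerXPathNamespace('wsa', 'http://schemas.xmlsoap.org/ws/2004/08/addressing');

$address = firstXPathText($xml, '//wsa:Address');
$missing = firstXPathText($xml, '//wsa:NoSuchElement');

echo 'wsa:Address=', $address, PHP_EOL;
var_dump($missing);
```

Returning null for a missing element lets the caller decide how to react when one of the scanner's responses lacks the expected node, instead of failing on an undefined offset.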

If you need more of the data around (for example) the <wsdp:Hosted> element, you can do something like...

$myxml1 = simplexml_load_string($test_1);
$myxml1->registerXPathNamespace("wsdp", "http://schemas.xmlsoap.org/ws/2006/02/devprof");
$hosted = $myxml1->xpath("//wsdp:Hosted")[0];
$hostedWSA = $hosted->children("wsa", true);
echo "wsa:Address=".(string)$hostedWSA->EndpointReference->Address.PHP_EOL;
$hostedWSDP = $hosted->children("wsdp", true);
echo "wsdp:Types=".(string)$hostedWSDP->Types.PHP_EOL;

So this first fetches the right element and then works with the various child nodes in the different namespaces inside that node.
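
One caveat with children("wsa", true): it matches on the prefix used in the document, and a sender is free to bind a different prefix to the same namespace URI. The sketch below illustrates this with an assumed sample in which the addressing namespace is bound to a made-up prefix `addr`; the lookup by prefix comes back empty, while the lookup by URI still works.

```php
<?php
// Same addressing namespace URI as in the question, but bound to a made-up prefix "addr".
$sample = <<<XML
<root xmlns:addr="http://schemas.xmlsoap.org/ws/2004/08/addressing">
    <addr:Address>http://192.168.1.20:3910/</addr:Address>
</root>
XML;

$xml = simplexml_load_string($sample);

// Lookup by prefix: finds nothing, because this document does not use "wsa".
$byPrefix = (string) $xml->children('wsa', true)->Address;

// Lookup by URI: works regardless of the prefix the sender chose.
$byUri = (string) $xml->children('http://schemas.xmlsoap.org/ws/2004/08/addressing')->Address;

echo 'by prefix: [', $byPrefix, ']', PHP_EOL;
echo 'by URI:    [', $byUri, ']', PHP_EOL;
```

If the device always sends the same prefixes, the shorter prefix form is fine; passing the URI is simply the safer choice when that cannot be guaranteed.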