Why isn't Nokogiri's xpath working as expected?
I'm using Nokogiri to parse a SOAP response, but for some reason the xpath and css methods fail to find any tag other than the <soap:Body> tag.
The XML I'm parsing is:
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<AuthenticationResponse xmlns="http://tempuri.org/">
<AuthenticationResult>
<SessionID>clinTQYART6qxeQ%k^Am&Sd5Co3</SessionID>
<RequestStatus>1</RequestStatus>
<RequestMessage>Success</RequestMessage>
</AuthenticationResult>
</AuthenticationResponse>
</soap:Body>
</soap:Envelope>
If I inspect the parsed XML in a debugger, I see:
=> #(Document:0x3fce3c4dd95c {
name = "document",
children = [
#(Element:0x3fce385b04dc {
name = "Envelope",
namespace = #(Namespace:0x3fce385b04b4 { prefix = "soap", href = "http://schemas.xmlsoap.org/soap/envelope/" }),
children = [
#(Element:0x3fce385e509c {
name = "Body",
namespace = #(Namespace:0x3fce385b04b4 { prefix = "soap", href = "http://schemas.xmlsoap.org/soap/envelope/" }),
children = [
#(Element:0x3fce385e4c64 {
name = "AuthenticationResponse",
namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }),
children = [
#(Element:0x3fce385e48a4 {
name = "AuthenticationResult",
namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }),
children = [
#(Element:0x3fce385e44f8 { name = "SessionID", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "clinTQYART6qxeQ%k^Am&Sd5Co3")] }),
#(Element:0x3fce39dcff7c { name = "RequestStatus", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "1")] }),
#(Element:0x3fce39dcfa2c { name = "RequestMessage", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "Success")] })]
})]
})]
})]
})]
})
So far, so good.
But xml.xpath("//SessionID") returns [],
whereas xml.xpath("//soap:Body")[0] returns
=> #(Element:0x3fce385e509c {
name = "Body",
namespace = #(Namespace:0x3fce385b04b4 { prefix = "soap", href = "http://schemas.xmlsoap.org/soap/envelope/" }),
children = [
#(Element:0x3fce385e4c64 {
name = "AuthenticationResponse",
namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }),
children = [
#(Element:0x3fce385e48a4 {
name = "AuthenticationResult",
namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }),
children = [
#(Element:0x3fce385e44f8 { name = "SessionID", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "clinTQYART6qxeQ%k^Am&Sd5Co3")] }),
#(Element:0x3fce39dcff7c { name = "RequestStatus", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "1")] }),
#(Element:0x3fce39dcfa2c { name = "RequestMessage", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "Success")] })]
})]
})]
})
and xml.xpath("//soap:Body")[0].children[0].children[0].children[0] returns
=> #(Element:0x3fce385e44f8 { name = "SessionID", namespace = #(Namespace:0x3fce385e4c14 { href = "http://tempuri.org/" }), children = [ #(Text "clinTQYART6qxeQ%k^Am&Sd5Co3")] })
So xml.xpath("//soap:Body")[0].children[0].children[0].children[0].content gives me the correct ID string.
Why, then, doesn't xml.xpath("//SessionID") work?
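That's because SessionID is in the namespace http://tempuri.org/. It inherits the default namespace declared on <AuthenticationResponse>, and an unprefixed name in an XPath expression only matches elements that are in no namespace.
Try something like this (untested):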
xml.xpath("//x:SessionID", {"x" => "http://tempuri.org/"})
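Not a direct answer to your question, but if you want to parse SOAP you'd be better off using the savon gem rather than nokogiri. It's designed to handle all the complexities of SOAP.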