xpath: getting text node value without using text()

Question

I would like to know what the purpose of text() is in XPath. Suppose I have an XML document:

<book category="COOKING">
  <title lang="en">Everyday Italian</title>
  <author>Giada De Laurentiis</author>
  <year>2005</year>
  <price>30.00</price>
</book>

<book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

and I need to find the prices of the books. I can use either /bookstore/book/price[text()] or /bookstore/book/price

Both give me the same result. So why use text() at all?

Answer 1

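In this particular case there is no reason to use text(), and text() is often overused by XPath beginners.

The text() node test does have valid use cases: those where you want to target text nodes specifically.

For example, suppose some of the books have an empty price, and you want to select only the non-empty ones: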

<book category="COOKING">
  <title lang="en">Everyday Italian</title>
  <price>30.00</price>
</book>

<book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <price></price>
</book>

<book category="CHILDREN">
  <title lang="en">Narnia</title>
  <price>29.99</price>
</book>

/bookstore/book/price would return three elements, whereas /bookstore/book/price[text()] would return two.

Or sometimes you may want only an element's own text nodes rather than its entire content:
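The difference can be checked with a minimal Python sketch. Python's standard-library xml.etree.ElementTree does not support the text() node test, so the helper has_text_child below is a hypothetical stand-in that emulates the [text()] predicate. The <bookstore> wrapper element is an assumption taken from the XPath expressions.

```python
import xml.etree.ElementTree as ET

# The <bookstore> root is an assumption taken from the XPath expressions.
XML = """<bookstore>
<book category="COOKING">
  <title lang="en">Everyday Italian</title>
  <price>30.00</price>
</book>
<book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <price></price>
</book>
<book category="CHILDREN">
  <title lang="en">Narnia</title>
  <price>29.99</price>
</book>
</bookstore>"""

root = ET.fromstring(XML)


def has_text_child(elem):
    """Emulate the XPath predicate [text()]: true if elem has a text node child."""
    # ElementTree stores child text nodes as elem.text plus each child's tail.
    return bool(elem.text) or any(child.tail for child in elem)


# Counterpart of /bookstore/book/price (root is already <bookstore>).
all_prices = root.findall("book/price")
# Counterpart of /bookstore/book/price[text()].
prices_with_text = [p for p in all_prices if has_text_child(p)]

print(len(all_prices), len(prices_with_text))
print([p.text for p in prices_with_text])
```

The empty <price></price> element is still selected by the plain path; it only drops out once the predicate requires a text node child.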

<book category="CHILDREN">
  Harry Potter
  <author>J. K. Rowling</author>
  <price>29.99</price>
</book>

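In this case, /bookstore/book would yield one element whose string value (after whitespace normalization) is Harry Potter J. K. Rowling 29.99, whereas /bookstore/book/text() would yield a set of three text nodes: the first contains Harry Potter plus surrounding whitespace, and the other two are whitespace only.

As Michael Kay points out in the comments, text() can be useful when dealing with mixed content (where text nodes sit alongside elements, as in the second example above). It is rare to need text() with non-mixed content.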
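As a rough illustration of the mixed-content case, the sketch below again uses Python's xml.etree.ElementTree, which has no text() node test. The string value is approximated with itertext(), and the child text nodes are collected from .text and the children's .tail; both are emulations, not an actual XPath evaluation.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring("""<book category="CHILDREN">
  Harry Potter
  <author>J. K. Rowling</author>
  <price>29.99</price>
</book>""")

# String value of the element: all descendant text nodes, concatenated.
string_value = "".join(book.itertext())

# Emulation of book/text(): only the text nodes that are direct children.
child_text_nodes = [t for t in [book.text] + [child.tail for child in book] if t]

print(repr(string_value))
print([repr(t) for t in child_text_nodes])
```

Only the first of the three child text nodes carries visible characters; the other two are the line breaks and indentation between the child elements.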

Answer 2

No, the XPath expression

/bookstore/book/price

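does not return string values. It returns the element nodes named "price". But the XPath environment or engine you are using (we don't know which one) automatically outputs the string values of those elements.

This happens in many cases, for example when XPath is used together with XSLT, as in an xsl:value-of instruction: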

<xsl:value-of select="price"/>
^^^^^^^^^^^^^^^^^^^^^^     ^^^   XSLT
                      ^^^^^      XPath

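The XPath expression inside the XSLT above returns an element node, but xsl:value-of outputs only the string value of that element.

In some cases the string value of the price element differs from price/text(), because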
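The same split between selecting a node and taking its string value shows up outside XSLT. The sketch below assumes Python's xml.etree.ElementTree as the host environment and a trimmed-down document; the "".join(node.itertext()) step is this sketch's way of computing a string value, roughly comparable to what xsl:value-of does implicitly.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<bookstore>"
    "<book><price>30.00</price></book>"
    "<book><price>29.99</price></book>"
    "</bookstore>"
)

# The path selects element nodes, not strings.
nodes = root.findall("book/price")
tags = [node.tag for node in nodes]

# Converting to a string value is a separate, explicit step here.
string_values = ["".join(node.itertext()) for node in nodes]

print(nodes)          # element objects
print(string_values)  # their string values
```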

the string value of an element is the concatenation of all its descendant text nodes

and

price/text() returns all immediate child text nodes of the price element
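To make the two definitions concrete, here is a small sketch with hypothetical markup: the <currency> child element is invented for illustration and does not appear in the question. As before, xml.etree.ElementTree is used to emulate both notions, since it does not evaluate text() itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the <currency> child is invented for illustration.
price = ET.fromstring("<price>29.99 <currency>USD</currency></price>")

# String value: concatenation of all descendant text nodes.
string_value = "".join(price.itertext())

# Emulation of price/text(): immediate child text nodes only.
child_text_nodes = [t for t in [price.text] + [c.tail for c in price] if t]

print(repr(string_value))
print(child_text_nodes)
```

Here the text inside <currency> contributes to the string value of price but is not among its immediate child text nodes.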
