Chained XPath axes with trailing `[1]` or `[last()]` predicates

Question

This question is specifically about using XPath in XSLT 2.0 with Saxon.

XPaths ending in `[1]`

For XPaths like

following-sibling::foo[1]
descendant::bar[1]

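I take it for granted that Saxon does not traverse the entire axis but stops as soon as it finds the first matching node. That is crucial in cases such as: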

following-sibling::foo[some:expensivePredicate(.)][1]

I assume the same holds for XPaths like this:

(following-sibling::foo/descendant::bar)[1]

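That is, Saxon will not build the entire set of nodes matching following-sibling::foo/descendant::bar before selecting the first node in that set. Instead, even for chained axes, it will stop at the first matching node.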

XPaths ending in `[last()]`

Now it gets interesting. When going "backwards" in the tree, I assume that XPaths like

preceding-sibling::foo[1]

work just as efficiently as their following-sibling counterparts. But what happens when axes are chained, for example

(preceding-sibling::foo/descendant::bar)[last()]

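Since we need to use [last()] here instead of [1]:

Will Saxon build the entire node-set and count its members in order to obtain the numeric value of last()?
Or will it be smart and stop iterating the preceding-sibling axis as soon as it finds a matching descendant?
Or will it be even smarter and iterate the descendant axis in reverse to find the last descendant more efficiently?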

Answer 1

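Saxon has a variety of strategies for evaluating last(). When it is used as a predicate, meaning [position()=last()], it is generally translated into an internal function [isLast()], which can be evaluated with a single-item lookahead. (So in your example of (preceding-sibling::foo/descendant::bar)[last()], it does not build the node-set in memory; it reads the nodes one by one and, when it reaches the end, returns the last one it found.)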
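To make the single-item lookahead idea concrete, here is a minimal language-neutral sketch in Python. It is not Saxon's actual code: the names `with_is_last`, `select_last`, and `bar_descendants` are hypothetical, and a generator stands in for a lazily evaluated node sequence.

```python
from typing import Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def with_is_last(items: Iterable[T]) -> Iterator[Tuple[T, bool]]:
    """Yield (item, is_last) pairs while holding only one item of lookahead."""
    it = iter(items)
    try:
        current = next(it)
    except StopIteration:
        return  # empty input: nothing to yield
    for upcoming in it:
        yield current, False  # another item follows, so current is not last
        current = upcoming
    yield current, True  # input exhausted: current is the last item


def select_last(items: Iterable[T]) -> List[T]:
    """Emulate expr[last()]: keep only the item flagged as last."""
    return [item for item, is_last in with_is_last(items) if is_last]


def bar_descendants() -> Iterator[str]:
    # Hypothetical stand-in for a lazily produced sequence of nodes.
    for name in ("bar1", "bar2", "bar3"):
        yield name


result = select_last(bar_descendants())
```

The whole sequence is still read from start to end, as the answer says, but at most two items are held at any moment.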

In other cases, particularly when it is used in an XSLT match pattern, Saxon will convert child::x[last()] to child::x[not(following-sibling::x)].

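When none of these approaches works, Saxon has for many years had two strategies for evaluating last(), depending on the expression it is applied to: (a) sometimes it would evaluate the expression twice, the first time to count the nodes and the second time to return them; (b) in other cases it would read all the nodes into memory. We recently hit a case where strategy (a) fails (see https://saxonica.plan.io/issues/3122), so we now always use (b).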
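The two fallback strategies can be sketched roughly as follows, again in Python and purely as an illustration. The function names and the `nodes` factory are hypothetical, and nothing here reflects how Saxon implements either strategy internally.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def last_by_two_passes(make_items: Callable[[], Iterable[T]]) -> List[T]:
    """Strategy (a): evaluate twice, once to count and once to select."""
    count = sum(1 for _ in make_items())  # first pass: count the items
    return [
        item
        for position, item in enumerate(make_items(), start=1)
        if position == count  # second pass: keep position() = last()
    ]


def last_by_buffering(make_items: Callable[[], Iterable[T]]) -> List[T]:
    """Strategy (b): read everything into memory, then take the final item."""
    buffered = list(make_items())
    return buffered[-1:]  # empty list when there are no items


def nodes() -> Iterator[str]:
    # Hypothetical expression that yields a fresh sequence on every call.
    return iter(["a", "b", "c"])


two_pass_result = last_by_two_passes(nodes)
buffered_result = last_by_buffering(nodes)
```

In this sketch, (a) keeps memory use constant but does the work twice and depends on both evaluations producing the same sequence, while (b) evaluates once at the cost of memory proportional to the sequence length.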

Expressions using last() can be expensive and should be avoided where possible. For example, the classic "insert a separator between adjacent items", which is often written as

xx
if (position() != last()) sep

is better written as

if (position() != 1) sep
xx

That is, rather than inserting the separator after every item except the last, insert it before every item except the first. Or use string-join, or xsl:value-of/@separator.
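The same idea carries over to any streaming setting. As a rough sketch in Python (the helper `emit_with_separator` and the sample `words` generator are hypothetical), testing `position != 1` needs no knowledge of how many items remain, so the input can be consumed lazily:

```python
from typing import Iterable, Iterator


def emit_with_separator(items: Iterable[str], sep: str) -> Iterator[str]:
    """Yield items with sep between them, testing only position != 1."""
    for position, item in enumerate(items, start=1):
        if position != 1:
            yield sep  # separator goes before every item except the first
        yield item


def words() -> Iterator[str]:
    # Hypothetical lazily produced input.
    yield from ("a", "b", "c")


joined = "".join(emit_with_separator(words(), ", "))
same = ", ".join(words())  # built-in analogue of the string-join approach
```

Both `joined` and `same` hold the items separated by `", "`; the second form plays the role that string-join plays in XPath.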


xpath

saxon

xslt-2.0
