XPath to return string concatenation split by HTML tag

How can I use an XPath expression to return string values that contain the concatenated text of each element?

<div>
This text node (1) should be returned.
<em>And the value of this element.</em>
And this.
</div>

<div>
This text node (2) should be returned.
And this.
</div>

<div>
This text node (3) should be returned.
<em>And the value of this element.</em>
And this.
</div>

The returned value should be an array of strings, one per div element:

"This text node (1) should be returned. And the value of this element. And this."
"This text node (2) should be returned. And this."
"This text node (3) should be returned. And the value of this element. And this."

Is this possible in a single XPath expression?

XPath 1.0

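This is not possible with pure XPath 1.0: normalize-space() takes a single string there, and a node-set argument is converted using only its first node. Instead, select the div elements: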

//div

Then, in the language hosting the XPath library call, apply space normalization to the string value of each div element.
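As a minimal sketch of that host-language step, here is one way it could look in Python using only the standard library. The `<root>` wrapper element, the `SAMPLE` constant, and the function name `normalized_div_strings` are assumptions made for illustration; they are not part of the question. Note also that `str.split()` treats a few more characters as whitespace than XPath's normalize-space() does, so this is an approximation that happens to agree for the sample input.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper: the three divs from the question are placed inside
# a single <root> element so the markup parses as one well-formed document.
SAMPLE = """<root>
<div>
This text node (1) should be returned.
<em>And the value of this element.</em>
And this.
</div>
<div>
This text node (2) should be returned.
And this.
</div>
<div>
This text node (3) should be returned.
<em>And the value of this element.</em>
And this.
</div>
</root>"""


def normalized_div_strings(markup):
    """Return the space-normalized string value of every div in the markup."""
    root = ET.fromstring(markup)
    results = []
    # ".//div" selects all descendant div elements, like //div in XPath 1.0.
    for div in root.findall(".//div"):
        # The string value of an element is the concatenation of all of its
        # descendant text nodes in document order.
        string_value = "".join(div.itertext())
        # Trim the ends and collapse internal whitespace runs to one space.
        results.append(" ".join(string_value.split()))
    return results


if __name__ == "__main__":
    for line in normalized_div_strings(SAMPLE):
        print(line)
```

The XPath expression only selects the nodes; the concatenation and normalization happen per element in the host language, which is the part XPath 1.0 cannot express for more than one node at a time.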

XPath 2.0

This XPath 2.0 expression,

//div/normalize-space()

will return the normalized string values of all div elements in the document:

This text node (1) should be returned. And the value of this element. And this.
This text node (2) should be returned. And this.
This text node (3) should be returned. And the value of this element. And this.

as requested.