XQuery: count the number of characters and paragraphs in an ancestor (div) node up to the point where a child node occurs

Question

XQuery newbie here again. I have the following XML:

<div type="section" n="1">
<p>Lorem ipsum dolor sit amet, <rs type="xyz">consectetur</rs> adipiscing <placeName ref="#PLACE1">elit</placeName>, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p>

<p>Duis aute irure <rs type="xyz">dolor</rs> in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt <rs type="place" ref="#PLACE2">mollit anim</rs> id est <rs type="xyz">laborum</rs>.</p>
</div>

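I want to create a unique ID for each "place" (rs type=place and placeName) based on its position in the text. To do that, I want to retrieve the following information for each place:

the paragraph number within the "div type=section" node
the number of characters from the start of the paragraph to the start of the child node (rs type=place or placeName).

Taking the example above, I expect these results: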

<placeName ref="#PLACE1">elit</placeName>

Paragraph: 1
Character count: 51 ("Lorem ipsum dolor sit amet, consectetur adipiscing ")

<rs type="place" ref="#PLACE2">mollit anim</rs>

Paragraph: 2
Character count: 186 ("Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt ")

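I must be missing something very simple, but I cannot figure out how to compute this particular character count in XQuery. I know that preceding-sibling::text() and following-sibling::text() let me count up to the previous or following node. Is there something similar that reaches from a given node back to the start of an ancestor? Any help or pointers are much appreciated.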

Answer 1

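If you know that the placeName and rs elements will be children of the p element, you can select preceding-sibling::node(), string-join them, and then compute the string-length. So, in XQuery 3.1 with the arrow operator (which I hope is supported):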

(//placeName | //rs[@type = 'place'])
!
(ancestor::p[1]/(., preceding-sibling::p) => count() ||
 ' : '
 ||
 preceding-sibling::node() => string-join() => string-length()
)

https://xqueryfiddle.liberty-development.net/bFDb2BK/1
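
As a self-contained check of the child-element case, here is a minimal sketch that inlines the question's sample (with the paragraph end tags written as </p>) instead of querying a stored document. The variable name $div and the boundary-space declaration are my own additions, and an XQuery 3.1 processor is assumed.

```xquery
xquery version "3.1";

(: Assumption: keep whitespace-only text nodes in the inline sample, so that
   offsets do not shift when two inline elements are separated only by a space. :)
declare boundary-space preserve;

let $div :=
  <div type="section" n="1">
    <p>Lorem ipsum dolor sit amet, <rs type="xyz">consectetur</rs> adipiscing <placeName ref="#PLACE1">elit</placeName>, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p>
    <p>Duis aute irure <rs type="xyz">dolor</rs> in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt <rs type="place" ref="#PLACE2">mollit anim</rs> id est <rs type="xyz">laborum</rs>.</p>
  </div>
(: $div is an element, not a document node, so the paths start at $div
   rather than with a leading // :)
for $place in ($div//placeName | $div//rs[@type = 'place'])
return
  (: paragraph number, then the characters preceding the place in its parent :)
  count($place/ancestor::p[1]/(., preceding-sibling::p))
  || ' : '
  || string-length(string-join($place/preceding-sibling::node()))
```

For the sample this should return the two strings 1 : 51 and 2 : 186, which match the values expected in the question.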

For the more complex case, where your elements are descendants of p rather than direct children, I think the expression

(preceding::text() intersect ancestor::p[1]//text())  => string-join() => string-length()

gives the value you want (https://xqueryfiddle.liberty-development.net/bFDb2BK/4), though I am not sure how well it performs.

If the arrow and simple map operators are not supported, or you prefer FLWOR expressions, then use

for $place in (//placeName | //rs[@type = 'place'])
return ($place/ancestor::p[1]/count((., preceding-sibling::p)) || ' : ' || string-length(string-join($place/preceding-sibling::node())))

for the simple child-element case, or

for $place in (//placeName | //rs[@type = 'place'])
return (
    $place/ancestor::p[1]/count((., preceding-sibling::p)) 
    || ' : ' || string-length(string-join($place/preceding-sibling::node()))
    || ' : ' || string-length(string-join($place/(preceding::text() intersect ancestor::p[1]//text())))
)

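for the descendant case (well, it is really a comparison of the two approaches; the last subexpression is the one that should work for the descendant case). An alternative to intersect is the << operator: string-length(string-join($place/ancestor::p[1]//text()[. << $place])).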
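
Since the goal in the question is a unique ID per place, the << variant could be combined with the paragraph number roughly as follows. This is only a sketch under assumptions: the ID pattern place-{section}-{paragraph}-{offset}, the shortened sample text, and the hi wrapper element (added to make the place a descendant rather than a child of p) are all hypothetical and not taken from the question.

```xquery
xquery version "3.1";

(: Assumption: preserve whitespace-only text nodes in the inline sample. :)
declare boundary-space preserve;

let $div :=
  <div type="section" n="1">
    <p>One <rs type="xyz">two</rs> three <hi><placeName ref="#PLACE1">four</placeName></hi>.</p>
  </div>
for $place in ($div//placeName | $div//rs[@type = 'place'])
let $p := $place/ancestor::p[1]
(: paragraph number inside the section :)
let $para := count($p/preceding-sibling::p) + 1
(: text nodes of the paragraph that come before the place in document order :)
let $offset := string-length(string-join($p//text()[. << $place]))
return
  'place-'
  || $place/ancestor::div[@type = 'section'][1]/@n
  || '-' || $para
  || '-' || $offset
```

With this sample the text before the place is "One two three " (14 characters), so the query should return place-1-1-14. The ID is only as stable as the text itself: any edit before a place changes its offset.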

Tags: xquery, count, exist-db