XSLT 2.0 tokenize() regex - get substrings between each second occurrence of space

Question

I have an XML file that contains the following:

<RL>
    <coordinates>7.53 -6.53 8.53 1.23 7.51 7021.13</coordinates>
</RL>

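There can be any number of coordinates, but always an even number. Basically, I want to split the string at every second space, so that my plain-text output would be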

coordinateTuple:7.53 -6.53
coordinateTuple:8.53 1.23
coordinateTuple:7.51 7021.13

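I tried to research this before asking, and I think I should use the tokenize() function, but I can't get the regular expression right. My current code is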

<xsl:for-each select="fn:tokenize(RL/coordinates,'\s.*?(\s)')">
    <xsl:text>coordinateTuple:</xsl:text>
    <xsl:value-of select="." />
    <xsl:text>&#xa;</xsl:text>
</xsl:for-each>

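I thought the pattern would match every second occurrence of a space (according to this, it should), which would make the second space the delimiter for the tokenize() function. However, the actual output seems to skip every second coordinate, while still giving me the last one: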

coordinateTuple:7.53 
coordinateTuple:8.53
coordinateTuple:7.51 7021.13

Any help is greatly appreciated :)

Answer 1

I would use xsl:analyze-string:

<xsl:template match="coordinates">
    <xsl:analyze-string select="." regex="(\S+)\s+(\S+)">
        <xsl:matching-substring>
            <xsl:value-of select="concat('CoordinateTuple:', regex-group(1), ' ', regex-group(2), '&#10;')"/>
        </xsl:matching-substring>
    </xsl:analyze-string>
</xsl:template>
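
To try the template above on its own, it needs a surrounding stylesheet. The following is a minimal sketch rather than part of the original answer: the `xsl:output` and `xsl:strip-space` declarations are my assumptions, added so that only the tuples reach the plain-text output and the indentation inside `RL` is not copied through by the built-in templates.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Assumption: plain-text output, as the question asks for -->
    <xsl:output method="text"/>

    <!-- Assumption: drop whitespace-only text nodes (e.g. indentation inside RL) -->
    <xsl:strip-space elements="*"/>

    <!-- Each regex match is one pair: two runs of non-whitespace separated by whitespace -->
    <xsl:template match="coordinates">
        <xsl:analyze-string select="." regex="(\S+)\s+(\S+)">
            <xsl:matching-substring>
                <xsl:value-of select="concat('CoordinateTuple:', regex-group(1), ' ', regex-group(2), '&#10;')"/>
            </xsl:matching-substring>
            <!-- No xsl:non-matching-substring: the whitespace between pairs is discarded -->
        </xsl:analyze-string>
    </xsl:template>

</xsl:stylesheet>
```

Run against the sample document from the question with an XSLT 2.0 processor, this should print one `CoordinateTuple:` line per pair, for example `CoordinateTuple:7.53 -6.53` for the first one.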

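I'm not sure whether there is a way to tokenize on only every second whitespace sequence, but you can certainly tokenize on every whitespace sequence, then process the first, third, fifth, ... items and pick up the second, fourth, sixth, ... items along the way: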

    <xsl:variable name="numbers" select="tokenize(., '\s+')"/>
    <xsl:for-each select="$numbers[position() mod 2 = 1]">
        <xsl:variable name="pos" select="position()"/>
        <xsl:value-of select="concat('CoordinateTuple:', ., ' ', $numbers[$pos * 2], '&#10;')"/>
    </xsl:for-each>
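
As for why the attempt in the question drops values: tokenize() discards everything the separator pattern matches, and `\s.*?(\s)` matches a space, the coordinate after it, and the following space. The second coordinate of each pair is therefore consumed as part of the separator, and the last pair survives only because no space follows it. If splitting with tokenize() is still preferred, one possible workaround is to rewrite the space after each complete pair as a different delimiter first. This is only a sketch, and it assumes the character `|` never occurs in the coordinate data:

```xml
<xsl:template match="coordinates">
    <!-- Step 1: normalize-space() collapses whitespace runs into single spaces.
         Step 2: replace() turns the space that follows each complete pair into '|'
                 (assumption: '|' does not appear in the data).
         Step 3: tokenize() splits on '|', so every token is one whole pair. -->
    <xsl:for-each select="tokenize(replace(normalize-space(.), '(\S+ \S+) ', '$1|'), '\|')">
        <xsl:value-of select="concat('CoordinateTuple:', ., '&#10;')"/>
    </xsl:for-each>
</xsl:template>
```

For the sample input, the intermediate string is `7.53 -6.53|8.53 1.23|7.51 7021.13`, which tokenizes into the three pairs.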

Answer 2

I think I would tokenize on whitespace and then regroup:

<xsl:for-each-group select="tokenize(., '\s+')" 
                    group-adjacent="(position() - 1) idiv 2">
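  <!-- Join the two items of each group and end the line, matching the requested output -->
  <xsl:value-of select="concat('CoordinateTuple:', string-join(current-group(), ' '), '&#10;')"/>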
</xsl:for-each-group>
