How to get elements with a unique structure using xsl:key
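Please suggest how to use xsl:key to avoid listing duplicate elements. (I get the result with a variable-based method, but it is not an efficient approach.)
In my input, 'Ref' is the main element and it has several descendants. Only those 'Ref' elements whose structure (element names only, not content) is unique should be listed. For example, if two 'Ref' elements contain the same elements and differ only in their text content, only the first 'Ref' should be displayed. In the given input, ignore the 'au' and 'ed' elements and anything that has them as an ancestor.
Input XML: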
<article>
<Ref id="ref1">
<RefText>
<authors><au><snm>Kishan</snm><fnm>TR</fnm></au><au><snm>Rudramuni</snm><fnm>TP</fnm></au></authors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2016</Year><vol>1</vol>
<fpage>12</fpage><lpage>14</lpage>
</RefText></Ref><!-- should be listed -->
<Ref id="ref2">
<RefText>
<authors><au><snm>Rudramuni</snm><fnm>TP</fnm></au></authors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2017</Year><vol>2</vol>
<fpage>22</fpage><lpage>24</lpage>
</RefText></Ref><!-- should not be listed in the output XML: it has the same element types as ref1 (authors, artTitle, and so on) -->
<Ref id="ref3">
<RefText>
<authors><au><snm>Likhith</snm><fnm>MD</fnm></au></authors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2017</Year><fpage>22</fpage><lpage>24</lpage>
</RefText></Ref><!-- should be listed, because 'vol' is missing here, so its structure is unique with respect to the previous Refs -->
<Ref id="ref4">
<RefText>
<authors><au><snm>Kowshik</snm><fnm>MD</fnm></au></authors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2017</Year><fpage>22</fpage>
</RefText></Ref><!-- should be listed, because 'lpage' is missing -->
<Ref id="ref5">
<RefText>
<editors><au><snm>Dhyan</snm><fnm>MD</fnm></au></editors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2017</Year><fpage>22</fpage>
</RefText></Ref><!-- should be listed, because it has 'editors' instead of 'authors' -->
<Ref id="ref6">
<RefText>
<editors><ed><snm>Kishan</snm><fnm>TR</fnm></ed></editors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2017</Year>
</RefText></Ref><!-- should be listed -->
<Ref id="ref7">
<RefText>
<editors><ed><snm>Vivan</snm><fnm>S</fnm></ed></editors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2017</Year>
</RefText></Ref><!-- should not be listed: ref6 and ref7 have the same element types -->
<Ref id="ref8">
<RefText><editors><au><snm>Dhyan</snm><fnm>MD</fnm></au><au><snm>Dhyan</snm><fnm>MD</fnm></au></editors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2017</Year><fpage>22</fpage>
</RefText></Ref><!-- should not be listed, because ref5 and ref8 have the same elements -->
</article>
XSLT 2.0:
Here I used variables to store the descendant names of the preceding Ref elements.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:template match="@*|node()">
<xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:template>
<xsl:template match="article">
<article>
<xsl:for-each select="descendant::Ref">
<xsl:variable name="varPrev">
<xsl:for-each select="preceding::Ref">
<a>
<xsl:text>|</xsl:text>
<xsl:for-each select="descendant::*[not(ancestor-or-self::au) and not(ancestor-or-self::ed)]">
<xsl:value-of select="name()"/>
</xsl:for-each>
<xsl:text>|</xsl:text>
</a>
</xsl:for-each>
</xsl:variable>
<xsl:variable name="varPresent">
<a>
<xsl:text>|</xsl:text>
<xsl:for-each select="descendant::*[not(ancestor-or-self::au) and not(ancestor-or-self::ed)]">
<xsl:value-of select="name()"/>
</xsl:for-each>
<xsl:text>|</xsl:text>
</a>
</xsl:variable>
<xsl:if test="not(contains($varPrev, $varPresent))">
<xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:if>
</xsl:for-each>
</article>
</xsl:template>
<!--xsl:key name="keyRef" match="Ref" use="descendant::*"/>
<xsl:template match="article">
<xsl:for-each select="descendant::Ref">
<xsl:if test="count('keyRef', ./name())=1">
<xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:if>
</xsl:for-each>
</xsl:template-->
</xsl:stylesheet>
Desired result:
<article>
<Ref id="ref1">
<RefText>
<authors><au><snm>Kishan</snm><fnm>TR</fnm></au><au><snm>Rudramuni</snm><fnm>TP</fnm></au></authors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2016</Year><vol>1</vol>
<fpage>12</fpage><lpage>14</lpage>
</RefText></Ref>
<Ref id="ref3">
<RefText>
<authors><au><snm>Likhith</snm><fnm>MD</fnm></au></authors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2017</Year><fpage>22</fpage><lpage>24</lpage>
</RefText></Ref>
<Ref id="ref4">
<RefText>
<authors><au><snm>Kowshik</snm><fnm>MD</fnm></au></authors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2017</Year><fpage>22</fpage>
</RefText></Ref>
<Ref id="ref5">
<RefText><editors><au><snm>Dhyan</snm><fnm>MD</fnm></au></editors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2017</Year><fpage>22</fpage>
</RefText></Ref>
<Ref id="ref6">
<RefText>
<editors><ed><snm>Kishan</snm><fnm>TR</fnm></ed></editors>
<artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
<Year>2017</Year>
</RefText></Ref>
</article>
Here is an attempt that computes a key in much the same way as your string comparison:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:mf="http://example.com/mf" exclude-result-prefixes="mf xs">
<xsl:function name="mf:fingerprint" as="xs:string">
<xsl:param name="input-element" as="element()"/>
<xsl:value-of select="for $d in $input-element/descendant::*[not(ancestor-or-self::au) and not(ancestor-or-self::ed)] return node-name($d)" separator="|"/>
</xsl:function>
<xsl:key name="group" match="Ref" use="mf:fingerprint(.)"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Ref[not(. is key('group', mf:fingerprint(.))[1])]"/>
</xsl:transform>
As far as I can tell, it does the job at http://xsltransform.net/bwdwsC, but I am not sure whether string concatenation of the names is sufficient for all kinds of input.
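One case where a flat list of names could be ambiguous is nesting: the children `<a><b/></a><c/>` and `<a/><b><c/></b>` both produce `a|b|c`. The following is only a sketch of one possible way around that, not part of the tested stylesheet above. It assumes a hypothetical function `mf:path-fingerprint` that records each kept descendant as a slash-separated path below the Ref, which gives `a|a/b|c` and `a|b|b/c` for those two cases. The key name `group-by-path` is likewise made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf" exclude-result-prefixes="mf xs">

  <!-- Hypothetical variant of mf:fingerprint: every kept descendant contributes
       its path below the input element, so nesting becomes part of the key. -->
  <xsl:function name="mf:path-fingerprint" as="xs:string">
    <xsl:param name="input-element" as="element()"/>
    <!-- [. >> $input-element] keeps only those ancestors that come after the
         input element in document order, i.e. its own descendants. -->
    <xsl:sequence select="string-join(
        for $d in $input-element/descendant::*[not(ancestor-or-self::au) and not(ancestor-or-self::ed)]
        return string-join($d/ancestor-or-self::*[. >> $input-element]/name(), '/'),
        '|')"/>
  </xsl:function>

  <!-- Group Ref elements by their path-based fingerprint. -->
  <xsl:key name="group-by-path" match="Ref" use="mf:path-fingerprint(.)"/>

  <!-- Identity transform. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop every Ref that is not the first one in its group. -->
  <xsl:template match="Ref[not(. is key('group-by-path', mf:path-fingerprint(.))[1])]"/>

</xsl:transform>
```

For the sample input in the question, where all compared elements sit directly under RefText, this should select the same Ref elements as the name-only fingerprint. The difference would only show up for inputs with deeper nesting.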
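I suggest the following approach:
- remove the descendants of authors and editors, as well as all text nodes;
- compare the remaining nodes using deep-equal().
Here is a simplified proof-of-concept: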
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/article">
<xsl:variable name="first-pass">
<xsl:apply-templates mode="first-pass"/>
</xsl:variable>
<xsl:copy>
<xsl:for-each select="$first-pass/Ref[not(some $ref in preceding-sibling::Ref satisfies deep-equal(RefText, $ref/RefText))]">
<Ref id="{@id}"/>
</xsl:for-each>
</xsl:copy>
</xsl:template>
<!-- identity transform -->
<xsl:template match="@*|node()" mode="#all">
<xsl:copy>
<xsl:apply-templates select="@*|node()" mode="#current"/>
</xsl:copy>
</xsl:template>
<xsl:template match="authors | editors" mode="first-pass">
<xsl:copy/>
</xsl:template>
<xsl:template match="text()" mode="first-pass" priority="0"/>
</xsl:stylesheet>
Result
<?xml version="1.0" encoding="UTF-8"?>
<article>
<Ref id="ref1"/>
<Ref id="ref3"/>
<Ref id="ref4"/>
<Ref id="ref5"/>
<Ref id="ref6"/>
</article>
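The proof-of-concept above emits only an empty Ref carrying the id, whereas the desired result in the question contains the complete Ref elements. A possible extension, given here as a sketch only, looks up the original element by its id and copies it. It assumes that every Ref has a unique id attribute. The names `source` and `ref-by-id` are introduced here for illustration and do not appear in the original stylesheet.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Assumption: the source document root, kept so that the original Ref
       elements can be reached while iterating over the temporary tree. -->
  <xsl:variable name="source" select="/"/>

  <!-- Assumption: @id is unique per Ref. -->
  <xsl:key name="ref-by-id" match="Ref" use="@id"/>

  <xsl:template match="/article">
    <!-- First pass: reduce each Ref to its bare element structure. -->
    <xsl:variable name="first-pass">
      <xsl:apply-templates mode="first-pass"/>
    </xsl:variable>
    <xsl:copy>
      <!-- Keep a reduced Ref only if no earlier reduced Ref is deep-equal to it. -->
      <xsl:for-each select="$first-pass/Ref[not(some $ref in preceding-sibling::Ref satisfies deep-equal(RefText, $ref/RefText))]">
        <!-- Copy the full original Ref from the source document. -->
        <xsl:copy-of select="key('ref-by-id', @id, $source)"/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

  <!-- identity transform -->
  <xsl:template match="@*|node()" mode="#all">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="#current"/>
    </xsl:copy>
  </xsl:template>

  <!-- Keep authors and editors as empty elements; their content is ignored. -->
  <xsl:template match="authors | editors" mode="first-pass">
    <xsl:copy/>
  </xsl:template>

  <!-- Text content plays no part in the structural comparison. -->
  <xsl:template match="text()" mode="first-pass" priority="0"/>
</xsl:stylesheet>
```

The three-argument form of key() restricts the lookup to the source document, which is needed here because the context node inside the xsl:for-each belongs to the temporary tree held in the first-pass variable.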