Removing duplicate elements using distinct-values and XSLT 2.0
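I am trying to solve a problem where I want to remove duplicate values from a series of elements.
I have been at it for a while. The code below looks like something I thought would work, but I am getting an error:
XPTY0020: Leading '/' cannot select the root node of the tree containing the context item: the context item is not a node
XSLT: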
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">
    <xsl:strip-space elements="*"/>
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="/">
        <xsl:for-each select="distinct-values(/tobject/tobject.subject/@tobject.subject.refnum)">
            <xsl:copy-of select="."/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
XML:
<?xml version="1.0" encoding="UTF-8"?>
<tobject tobject.type="Utenriks">
<tobject.property tobject.property.type="Nyheter"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000" tobject.subject.matter="olje og energi"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000" tobject.subject.matter="valg"/>
<tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000" tobject.subject.type="kriminalitet og rettsvesen"/>
<tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000" tobject.subject.type="fritid"/>
</tobject>
Desired result:
<?xml version="1.0" encoding="UTF-8"?>
<tobject tobject.type="Utenriks">
<tobject.property tobject.property.type="Nyheter"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000" tobject.subject.matter="olje og energi"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000" tobject.subject.matter="valg"/>
<tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000" tobject.subject.type="kriminalitet og rettsvesen"/>
<tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000" tobject.subject.type="fritid"/>
</tobject>
the code below sort of looks like something I thought would work, but
I am getting an error:
XPTY0020: Leading '/' cannot select the root node of the tree
containing the context item: the context item is not a node
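This error cannot be reproduced by running your code; see: http://xsltransform.net/gWvjQfa
However, the result of distinct-values() is a sequence of values, not nodes. Your expected result, removing duplicate elements, is easier to achieve using grouping: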
XSLT 2.0
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>
    <xsl:template match="/tobject">
        <xsl:copy>
            <xsl:copy-of select="@* | tobject.property"/>
            <xsl:for-each-group select="tobject.subject" group-by="@tobject.subject.refnum">
                <xsl:copy-of select="current-group()[1]"/>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
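For comparison, here is a minimal sketch of how the original distinct-values() idea could be made to work. It assumes the same input document. The variable name `subjects` is my own choice and does not come from the question. The point it illustrates: inside the `xsl:for-each`, the context item is an atomic value, so any path that starts with `/` (or any relative path) has no tree to navigate, which is the situation XPTY0020 describes. Binding the nodes to a variable beforehand avoids that. Note that the XPath specification leaves the order of the values returned by distinct-values() implementation-dependent, so the output order here is not guaranteed to match document order on every processor.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="/tobject">
        <!-- Hypothetical helper variable: capture the nodes while the
             context item is still a node -->
        <xsl:variable name="subjects" select="tobject.subject"/>
        <xsl:copy>
            <xsl:copy-of select="@* | tobject.property"/>
            <xsl:for-each select="distinct-values($subjects/@tobject.subject.refnum)">
                <!-- Here "." is an atomic value, not a node, so we look the
                     element up through $subjects; current() is that value -->
                <xsl:copy-of select="$subjects[@tobject.subject.refnum = current()][1]"/>
            </xsl:for-each>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

The grouping version is still the simpler choice, because `xsl:for-each-group` hands you the nodes of each group directly and processes groups in order of first appearance.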
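I. A shorter solution that is pure XSLT 1.0 and does not name elements unnecessarily.
Moreover, it is no less efficient than the XSLT 2.0 solution that uses <xsl:for-each-group>, because here we use the Muenchian method for grouping: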
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>
    <xsl:key name="kOS" match="tobject.subject" use="@tobject.subject.refnum"/>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match=
    "tobject.subject[generate-id() != generate-id(key('kOS', @tobject.subject.refnum)[1])]"/>
</xsl:stylesheet>
When this transformation is applied to the provided XML document:
<tobject tobject.type="Utenriks">
<tobject.property tobject.property.type="Nyheter"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000" tobject.subject.matter="olje og energi"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000" tobject.subject.matter="valg"/>
<tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000" tobject.subject.type="kriminalitet og rettsvesen"/>
<tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000" tobject.subject.type="fritid"/>
</tobject>
the wanted, correct result is produced:
<tobject tobject.type="Utenriks">
<tobject.property tobject.property.type="Nyheter"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000" tobject.subject.matter="olje og energi"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000" tobject.subject.matter="valg"/>
<tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000" tobject.subject.type="kriminalitet og rettsvesen"/>
<tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000" tobject.subject.type="fritid"/>
</tobject>
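As a side note, the "is this node the first in its group?" test of the Muenchian method is also commonly written with count() over a union instead of generate-id(). The sketch below is the same stylesheet with only that predicate changed; it is a variant for illustration, not something the solution above requires. The union of a node with the first node of its key group contains one node when they are the same node and two when they differ.

```xslt
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:key name="kOS" match="tobject.subject" use="@tobject.subject.refnum"/>

    <!-- Identity rule: copy everything as-is -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Suppress every tobject.subject that is not the first node in its
         key group: the union then holds two distinct nodes -->
    <xsl:template match=
    "tobject.subject[count(. | key('kOS', @tobject.subject.refnum)[1]) = 2]"/>
</xsl:stylesheet>
```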
II. A one-liner XPath 2.0 expression that selects the wanted unique elements (one from each group):
$vElems[index-of($vElems/@tobject.subject.refnum, @tobject.subject.refnum)[1]]
Here $vElems must be defined as:
/*/tobject.subject
When this XPath 2.0 expression is evaluated on the provided XML document, the wanted sequence of elements is selected:
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000"
tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000"
tobject.subject.matter="olje og energi"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000"
tobject.subject.type="politikk"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000"
tobject.subject.matter="valg"/>
<tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000"
tobject.subject.type="kriminalitet og rettsvesen"/>
<tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000"
tobject.subject.type="fritid"/>
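To show how the expression could be used in practice, here is a minimal sketch of an XSLT 2.0 stylesheet that binds `$vElems` as a global variable and copies the selected elements. The surrounding template is my own scaffolding, assumed for illustration. The predicate works because index-of() returns the positions at which an element's refnum occurs, `[1]` takes the first such position, and a numeric predicate keeps an item only when its own position equals that number. The sketch assumes every `tobject.subject` carries the `tobject.subject.refnum` attribute, as in the sample document.

```xslt
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- $vElems as defined in the answer -->
    <xsl:variable name="vElems" select="/*/tobject.subject"/>

    <xsl:template match="/*">
        <xsl:copy>
            <xsl:copy-of select="@* | tobject.property"/>
            <!-- Keep an element only if its position in $vElems is the
                 first position at which its refnum value occurs -->
            <xsl:copy-of select=
            "$vElems[index-of($vElems/@tobject.subject.refnum, @tobject.subject.refnum)[1]]"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```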