Split a P element into several P elements based on BR elements
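I am trying to split a single P element, which contains multiple SPAN and BR elements, into separate P elements at each BR element.
Here is the sample input XML structure: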
<P>
<SPAN CLASS="BYLINE">by john doe</SPAN>
<SPAN CLASS="TEXT">
<BR/>
</SPAN>
<SPAN CLASS="EMAIL">john.doe@email.com</SPAN>
<SPAN CLASS="TEXT">
<BR/>
</SPAN>
<SPAN CLASS="TEXT">Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. </SPAN>
<SPAN CLASS="BOLD">This sentence is bold. </SPAN>
<SPAN CLASS="TEXT">It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. </SPAN>
<SPAN CLASS="ITALIC">This sentence is in italics. </SPAN>
<SPAN CLASS="TEXT">It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
<BR/>
</SPAN>
<SPAN CLASS="BOLD">BOLD SUBTITLE HERE</SPAN>
<SPAN CLASS="TEXT">
<BR/>Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.</SPAN>
<SPAN CLASS="ITALIC">
<BR/>ITALIC SUB-TITLE</SPAN>
<SPAN CLASS="TEXT">
<BR/>Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.<BR/>
</SPAN>
</P>
The output XML I would like to see is:
<P>
<SPAN CLASS="BYLINE">by john doe</SPAN>
</P>
<P>
<SPAN CLASS="EMAIL">john.doe@email.com</SPAN>
</P>
<P>
<SPAN CLASS="TEXT">Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. </SPAN>
<SPAN CLASS="BOLD">This sentence is bold. </SPAN>
<SPAN CLASS="TEXT">It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. </SPAN>
<SPAN CLASS="ITALIC">This sentence is in italics. </SPAN>
<SPAN CLASS="TEXT">It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</SPAN>
</P>
<P>
<SPAN CLASS="BOLD">BOLD SUBTITLE HERE</SPAN>
</P>
<P>
<SPAN CLASS="TEXT">Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.</SPAN>
</P>
<P>
<SPAN CLASS="ITALIC">ITALIC SUB-TITLE</SPAN>
</P>
<P>
<SPAN CLASS="TEXT">Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.</SPAN>
</P>
<P>
<SPAN CLASS="TEXT"></SPAN>
</P>
Is this possible?
I tried using xsl:key and grouping, but could not get it to work.
Any suggestions are much appreciated. Thanks.
If you are using XSLT 2.0, it looks like you could use xsl:for-each-group together with group-ending-with:
<xsl:for-each-group select="SPAN" group-ending-with="*[BR]">
You would then use the current-group() function to get all the SPAN elements you want to group into a P:
<P>
<xsl:apply-templates select="current-group()" />
</P>
You will also need templates to stop the BR tags, and the SPAN tags that contain only a BR tag, from being output.
Try this XSLT:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml" indent="yes" />

  <!-- Split each P into groups of SPANs, closing a group at any SPAN that contains a BR -->
  <xsl:template match="P">
    <xsl:for-each-group select="SPAN" group-ending-with="*[BR]">
      <P>
        <xsl:apply-templates select="current-group()" />
      </P>
    </xsl:for-each-group>
  </xsl:template>

  <!-- Drop SPANs that hold nothing but a BR, and drop the BR elements themselves -->
  <xsl:template match="SPAN[BR][not(normalize-space())]" />
  <xsl:template match="BR" />

  <!-- Identity template: copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
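This does not give you quite the output you need, because <SPAN CLASS="BOLD">BOLD SUBTITLE HERE</SPAN> is combined with the span that follows it rather than being placed in its own P tag, but I am not sure why the logic is different there.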
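The merged subtitle comes from where the BR sits: in the sample input it is the first child of the SPAN that follows BOLD SUBTITLE HERE, so group-ending-with="*[BR]" only closes the group after that following SPAN. One possible way around this, offered as an untested sketch rather than a definitive fix, is to group the child nodes of the SPAN elements instead of the SPAN elements themselves, and then rebuild a SPAN for each run of nodes that share a parent. It assumes every BR is a direct child of a SPAN. It also skips groups that contain no text, so the trailing empty P in the desired output is not reproduced, and whitespace inside the copied text nodes is left as it is.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="P">
    <!-- Group the children of all SPANs (text nodes and BRs), ending a group at each BR -->
    <xsl:for-each-group select="SPAN/node()" group-ending-with="BR">
      <!-- Keep only the non-BR nodes that carry some text -->
      <xsl:variable name="content"
                    select="current-group()[not(self::BR)][normalize-space()]"/>
      <xsl:if test="exists($content)">
        <P>
          <!-- Rebuild one SPAN per run of nodes that came from the same original SPAN -->
          <xsl:for-each-group select="$content" group-adjacent="generate-id(..)">
            <SPAN>
              <!-- Context node is the first node of the run, so .. is its original SPAN -->
              <xsl:copy-of select="../@*"/>
              <xsl:copy-of select="current-group()"/>
            </SPAN>
          </xsl:for-each-group>
        </P>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>

  <!-- Identity template: copy everything outside P unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because the grouping happens on the nodes inside the SPAN elements, a BR at the start, middle, or end of a SPAN splits the content at the same point, and each rebuilt SPAN copies the attributes (such as CLASS) of the SPAN its text came from.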
For more interesting ways to use xsl:for-each-group in XSLT 2.0, see http://www.xml.com/pub/a/2003/11/05/tr.html.