I am trying to strip tags from some XML that looks like this:
<vocabularyModel>
<conceptDomain name="ActAccountType">
<annotations>
<documentation>
<definition>
<text>
<p>
<b>Description: </b>more txt here </p>
<p>
<i>Examples: </i>
</p>
<p/>
<ul>
<li>
<p>Patient billing accounts</p>
</li>
<li>
<p>Cost center</p>
</li>
<li>
<p>Cash</p>
</li>
</ul>
</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
<conceptDomain name="ActAdjudicationInformationCode">
<annotations>
<documentation>
<definition>
<text>
<p>long text.</p>
<p>long text.</p>
<p>long text.</p>
<p>long text.</p>
</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
<conceptDomain name="ActAdjudicationType">
<annotations>
<documentation>
<definition>
<text>
<p>
<b>Description: </b>more text.</p>
<p>
<i>Examples: </i>
</p>
<p/>
<ul>
<li>
<p>adjudicated with adjustments</p>
</li>
<li>
<p>adjudicated as refused</p>
</li>
<li>
<p>adjudicated as submitted</p>
</li>
</ul>
</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
All of the child tags under text should be stripped, so the desired XML and text would look like this:
<vocabularyModel>
<conceptDomain name="ActAccountType">
<annotations>
<documentation>
<definition>
<text>
Description: more txt here
Examples:
Patient billing accounts
Cost center
Cash
</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
<conceptDomain name="ActAdjudicationInformationCode">
<annotations>
<documentation>
<definition>
<text>
long text.
long text.
long text.
long text.
</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
<conceptDomain name="ActAdjudicationReason">
<annotations>
<documentation>
<definition>
<text>
long text.
long text.
long text.
long text.
</text>
</definition>
</documentation>
</annotations>
<specializesDomain name="ActReason"/>
</conceptDomain>
<conceptDomain name="ActAdjudicationType">
<annotations>
<documentation>
<definition>
<text>
Description: more text.
Examples:
adjudicated with adjustments
adjudicated as refused
adjudicated as submitted
</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
I tried the following, which I found elsewhere on this site and modified:
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="p | b | li | ul | i">
<xsl:apply-templates/>
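</xsl:template>
But this does not strip any elements, even when I restrict the match to a single element. I have also tried several variations of the following: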
<xsl:output method="xml" indent="yes"/>
<xsl:template name="strip-tags">
<xsl:param name="html"/>
<xsl:choose>
<xsl:when test="contains($html, '&lt;')">
<xsl:value-of select="substring-before($html, '&lt;')"/>
<xsl:call-template name="strip-tags">
<xsl:with-param name="html" select="substring-after($html, '&gt;')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$html"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="definition">
<xsl:call-template name="strip-tags">
<xsl:with-param name="html" select="text"/>
</xsl:call-template>
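</xsl:template>
If I omit the identity transform, it strips all of the tags; otherwise it just copies the content of the original XML. Any help would be appreciated. -scott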
Posted on 2013-06-06 04:44:04
The output of the first stylesheet shown (if you add the missing xsl:stylesheet element) is
<vocabularyModel>
<conceptDomain name="ActAccountType">
<annotations>
<documentation>
<definition>
<text>Description: more txt here Examples: Patient billing accountsCost centerCash</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
<conceptDomain name="ActAdjudicationInformationCode">
<annotations>
<documentation>
<definition>
<text>long text.long text.long text.long text.</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
<conceptDomain name="ActAdjudicationType">
<annotations>
<documentation>
<definition>
<text>Description: more text.Examples: adjudicated with adjustmentsadjudicated as refusedadjudicated as submitted</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
</vocabularyModel>
That seems to be what you want. Perhaps your real input is in a namespace?
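If the real input does declare a default namespace, unprefixed names such as p or ul in an XSLT 1.0 match pattern only match elements in no namespace, so the unwrapping template would never fire and the identity template would copy everything unchanged. A minimal sketch of the first stylesheet adapted for that case follows. The namespace URI urn:example:vocab and the prefix v are placeholders, not taken from the question; substitute whatever xmlns value appears on the root element of the actual file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: "urn:example:vocab" is a placeholder namespace URI.
     Replace it with the xmlns value declared in the real input. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:v="urn:example:vocab">

  <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity transform: copy every node and attribute unless a more
       specific template below matches. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Unwrap the inline markup: the element itself is dropped, but its
       children (ultimately the text nodes) are still processed. The v:
       prefix makes the names match elements in the placeholder namespace. -->
  <xsl:template match="v:p | v:b | v:li | v:ul | v:i">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

The only change from the original attempt is the bound prefix on the element names in the second template. The identity template needs no change because node() matches elements regardless of namespace.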
https://stackoverflow.com/questions/16948640