I have the following XML:
<data>
<peptides>
<peptide>
<accession>111</accession>
<sequence>AAA</sequence>
<score>4000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>AAA</sequence>
<score>6000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>AAA</sequence>
<score>5000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>BBB</sequence>
<score>5000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>BBB</sequence>
<score>1000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>BBB</sequence>
<score>8000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>BBB</sequence>
<score>5000</score>
</peptide>
<peptide>
<accession>222</accession>
<sequence>CCC</sequence>
<score>5000</score>
</peptide>
<peptide>
<accession>222</accession>
<sequence>CCC</sequence>
<score>9000</score>
</peptide>
<peptide>
<accession>222</accession>
<sequence>CCC</sequence>
<score>2000</score>
</peptide>
</peptides>
</data>
Using the XSLT below, I can get the peptides with accession "111", with the redundancy in sequence removed. That gives me this XML:
<root>
<peptide>
<accession>111</accession>
<sequence>AAA</sequence>
<score>4000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>BBB</sequence>
<score>5000</score>
</peptide>
</root>
This is the XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
  <xsl:key name="byAcc" match="/data/peptides/peptide" use="accession" />
  <xsl:key name="byAccSeq" match="/data/peptides/peptide" use="concat(accession, '|', sequence)"/>
  <xsl:template match="/">
    <root>
      <xsl:apply-templates select="key('byAcc','111')
          [
            generate-id()
            =
            generate-id(key('byAccSeq', concat(accession, '|', sequence))[1])
          ]">
        <xsl:sort select="sequence" data-type="text"/>
        <xsl:sort select="score" data-type="number"/>
      </xsl:apply-templates>
    </root>
  </xsl:template>
  <xsl:template match="/data/peptides/peptide">
    <xsl:copy-of select="."/>
  </xsl:template>
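</xsl:stylesheet>
and a live example is here.
The problem is that, out of all the redundant peptides, the node that gets selected is simply the first one that appears in the original XML.
Among all redundant peptides (that is, those with the same accession and sequence), I need to select the one with the highest score.
So the desired XML would be the following: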
<root>
<peptide>
<accession>111</accession>
<sequence>AAA</sequence>
<score>6000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>BBB</sequence>
<score>8000</score>
</peptide>
</root>
If anything is unclear, please let me know and I will edit the question. Thank you very much.
Gerard
Posted on 2011-12-19 21:20:50
This transformation:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:key name="byAcc" match="/data/peptides/peptide" use="accession" />
  <xsl:key name="byAccSeq" match="/data/peptides/peptide" use="concat(accession, '|', sequence)"/>
  <xsl:template match="/">
    <root>
      <xsl:apply-templates select=
          "key('byAcc','111')
          [
            generate-id()
            =
            generate-id(key('byAccSeq', concat(accession, '|', sequence))[1])
          ]">
        <xsl:sort select="sequence" data-type="text"/>
        <xsl:sort select="score" data-type="number"/>
      </xsl:apply-templates>
    </root>
  </xsl:template>
  <xsl:template match="/data/peptides/peptide">
    <xsl:for-each select=
        "key('byAccSeq', concat(accession, '|', sequence))">
      <xsl:sort select="score" data-type="number" order="descending"/>
      <xsl:if test="position() = 1">
        <xsl:copy-of select="."/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
when applied to the provided XML document:
<data>
<peptides>
<peptide>
<accession>111</accession>
<sequence>AAA</sequence>
<score>4000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>AAA</sequence>
<score>6000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>AAA</sequence>
<score>5000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>BBB</sequence>
<score>5000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>BBB</sequence>
<score>1000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>BBB</sequence>
<score>8000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>BBB</sequence>
<score>5000</score>
</peptide>
<peptide>
<accession>222</accession>
<sequence>CCC</sequence>
<score>5000</score>
</peptide>
<peptide>
<accession>222</accession>
<sequence>CCC</sequence>
<score>9000</score>
</peptide>
<peptide>
<accession>222</accession>
<sequence>CCC</sequence>
<score>2000</score>
</peptide>
</peptides>
</data>
produces the wanted, correct result:
<root>
<peptide>
<accession>111</accession>
<sequence>AAA</sequence>
<score>6000</score>
</peptide>
<peptide>
<accession>111</accession>
<sequence>BBB</sequence>
<score>8000</score>
</peptide>
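</root>
Explanation:
The only change is in the template that matches peptide. It uses key() to retrieve all peptide elements that share the current peptide's accession and sequence, then sorts them by score in descending order to find the one with the maximum score among them. Only the first such element is output.
https://stackoverflow.com/questions/8561370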
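The sort-and-take-first idiom described in the explanation can also be tried in isolation. The following is a minimal sketch, not part of the original answer, and it assumes the same /data/peptides/peptide input structure as the document above. XPath 1.0 has no max() function, so the maximum is found by sorting in descending order and keeping only the node at position 1. Here it prints the single highest score in the whole document as plain text.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Illustrative sketch: output the highest score as plain text -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Visit every peptide, highest numeric score first -->
    <xsl:for-each select="/data/peptides/peptide">
      <xsl:sort select="score" data-type="number" order="descending"/>
      <!-- position() refers to the sorted order, so 1 is the maximum -->
      <xsl:if test="position() = 1">
        <xsl:value-of select="score"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the XML document above, this should print 9000. Note that data-type="number" matters: with the default text comparison, scores with different numbers of digits would be ordered as strings rather than as numbers.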