Parsing XML in Python: getting all child node values within a selected node

Stack Overflow user
Asked 2018-02-01 10:37:43
Answers: 1, Views: 259, Followers: 0, Votes: 0

I am trying to parse the following XML in Python:

<abstract>
    <title>Abstract</title>
    <p>Amphinomids, more commonly known as fireworms, are a basal lineage of marine annelids characterized by the presence of defensive dorsal calcareous chaetae, which break off upon contact. It has long been hypothesized that amphinomids are venomous and use the chaetae to inject a toxic substance. However, studies investigating fireworm venom from a morphological or molecular perspective are scarce and no venom gland has been identified to date, nor any toxin characterized at the molecular level. To investigate this question, we analyzed the transcriptomes of three species of fireworms—
        <italic>Eurythoe complanata</italic>
        , 
        <italic>Hermodice carunculata</italic>
        , and 
        <italic>Paramphinome jeffreysii</italic>
        —following a venomics approach to identify putative venom compounds. Our venomics pipeline involved de novo transcriptome assembly, open reading frame, and signal sequence prediction, followed by three different homology search strategies: BLAST, HMMER sequence, and HMMER domain. Following this pipeline, we identified 34 clusters of orthologous genes, representing 13 known toxin classes that have been repeatedly recruited into animal venoms. Specifically, the three species share a similar toxin profile with C-type lectins, peptidases, metalloproteinases, spider toxins, and CAP proteins found among the most highly expressed toxin homologs. Despite their great diversity, the putative toxins identified are predominantly involved in three major biological processes: hemostasis, inflammatory response, and allergic reactions, all of which are commonly disrupted after fireworm stings. Although the putative fireworm toxins identified here need to be further validated, our results strongly suggest that fireworms are venomous animals that use a complex mixture of toxins for defense against predators.
    </p>
</abstract>

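I am trying to retrieve all of the text inside the <abstract> node, including the text of its child nodes. I can iterate over the nodes and get their text, but the iteration stops at the 'deepest node':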

import xml.etree.ElementTree as ET

resXML = ET.fromstring(response)
abstract = resXML.find(".//abstract").iter()
for section in abstract:
    print(section.text)  # .text only holds the text before the element's first child

> Abstract 
> Amphinomids, more commonly known as fireworms, are a basal
> lineage of marine annelids characterized by the presence of defensive
> dorsal calcareous chaetae, which break off upon contact. It has long
> been hypothesized that amphinomids are venomous and use the chaetae to
> inject a toxic substance. However, studies investigating fireworm
> venom from a morphological or molecular perspective are scarce and no
> venom gland has been identified to date, nor any toxin characterized
> at the molecular level. To investigate this question, we analyzed the
> transcriptomes of three species of fireworms— 
> Eurythoe complanata
> Hermodice carunculata 
> Paramphinome jeffreysii

Clearly my approach is not quite right. I do not get the commas between the italicized species names, or the rest of the paragraph: '-following a venomics...'

How can I iterate over all of the nodes under the selected node?


1 Answer

Stack Overflow user

Accepted answer

Posted 2018-02-01 10:52:24

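In the ElementTree model, a text node that follows an element is stored as that element's tail, not as part of the parent element's text. So in addition to section.text, you also need to look at section.tail: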

>>> for section in abstract:
...     print(section.text.strip())
...     if section.tail:
...         print(section.tail.strip())
... 

Abstract

Amphinomids, more commonly known as fireworms, are a basal lineage of marine annelids characterized by the presence of defensive dorsal calcareous chaetae, which break off upon contact. It has long been hypothesized that amphinomids are venomous and use the chaetae to inject a toxic substance. However, studies investigating fireworm venom from a morphological or molecular perspective are scarce and no venom gland has been identified to date, nor any toxin characterized at the molecular level. To investigate this question, we analyzed the transcriptomes of three species of fireworms—

Eurythoe complanata
,
Hermodice carunculata
, and
Paramphinome jeffreysii
—following a venomics approach to identify putative venom compounds. Our venomics pipeline involved de novo transcriptome assembly, open reading frame, and signal sequence prediction, followed by three different homology search strategies: BLAST, HMMER sequence, and HMMER domain. Following this pipeline, we identified 34 clusters of orthologous genes, representing 13 known toxin classes that have been repeatedly recruited into animal venoms. Specifically, the three species share a similar toxin profile with C-type lectins, peptidases, metalloproteinases, spider toxins, and CAP proteins found among the most highly expressed toxin homologs. Despite their great diversity, the putative toxins identified are predominantly involved in three major biological processes: hemostasis, inflammatory response, and allergic reactions, all of which are commonly disrupted after fireworm stings. Although the putative fireworm toxins identified here need to be further validated, our results strongly suggest that fireworms are venomous animals that use a complex mixture of toxins for defense against predators.
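
If the goal is a single string rather than one printed line per fragment, a minimal sketch using the standard library's `Element.itertext()` may help. It walks an element's text and the tails of its descendants in document order. The `SAMPLE` document and the `all_text` helper below are hypothetical stand-ins, not part of the original question; the whitespace normalization is an assumption about the desired output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the XML in the question.
SAMPLE = (
    "<article><abstract><title>Abstract</title> "
    "<p>three species of fireworms: <italic>Eurythoe complanata</italic>, "
    "<italic>Hermodice carunculata</italic>, and "
    "<italic>Paramphinome jeffreysii</italic> were analyzed.</p>"
    "</abstract></article>"
)


def all_text(element):
    """Join every text and tail fragment inside `element`, in document order.

    Runs of whitespace (for example, indentation from pretty-printed XML)
    are collapsed to single spaces.
    """
    return " ".join("".join(element.itertext()).split())


root = ET.fromstring(SAMPLE)
abstract = root.find(".//abstract")

# .text is the text before an element's first child;
# .tail is the text that follows the element's closing tag.
first_italic = abstract.find(".//italic")
print(repr(first_italic.text))
print(repr(first_italic.tail))

# The whole abstract as one string, including the text between <italic> tags.
print(all_text(abstract))
```

Note that `itertext()` includes the tails of nested elements but not the tail of the element it is called on, so text that follows the closing `</abstract>` tag would not be picked up.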
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/48560703
