
Extract all text between two (bold) <b> tags using XPath

Stack Overflow user
Asked on 2020-03-06 13:53:26
Answers: 1, Views: 108, Followers: 0, Votes: 2

This is my HTML element:

<div class="abstract-content selected" id="en-abstract">
    <p>
        <b>Introduction.</b> 
         Against the backdrop of increasing resistance to conventional antibiotics, bacteriocins represent an attractive alternative, given their potent activity, novel modes of action and perceived lack of issues with resistance.
        <b>Aim.</b>
         In this study, the nature of the antibacterial activity of a clinical isolate of 
        <i>Streptococcus gallolyticus</i>
         was investigated.
        <b>Methods.</b>
         Optimization of the production of an inhibitor from strain AB39 was performed using different broth media and supplements. Purification was carried out using size exclusion, ion exchange and HPLC. Gel diffusion agar overlay, MS/MS, 
        <i>de novo</i>
         peptide sequencing and genome mining were used in a proteogenomics approach to facilitate identification of the genetic basis for production of the inhibitor.
        <b>Results.</b>
         Strain AB39 was identified as representing 
        <i>Streptococcus gallolyticus</i>
         subsp. 
        <i>pasteurianus</i>
         and the successful production and purification of the AB39 peptide, named nisin P, with a mass of 3133.78 Da, was achieved using BHI broth with 10 % serum. Nisin P showed antibacterial activity towards clinical isolates of drug-resistant bacteria, including methicillin-resistant 
        <i>Staphylococcus aureus</i>
         , vancomycin-resistant 
        <i>Enterococcus</i>
         and penicillin-resistant 
        <i>Streptococcus pneumoniae</i>
         . In addition, the peptide exhibited significant stability towards high temperature, wide pH and certain proteolytic enzymes and displayed very low toxicity towards sheep red blood cells and Vero cells.
        <b>Conclusion.</b>
         To the best of our knowledge, this study represents the first production, purification and characterization of nisin P. Further study of nisin P may reveal its potential for treating or preventing infections caused by antibiotic-resistant Gram-positive bacteria, or those evading vaccination regimens.
    </p>
</div>

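Here I want to extract the "headings" from the <b> tags, and the corresponding values from the text below them.

Example: "Aim.": In this study, the nature of the antibacterial activity of a clinical isolate of Streptococcus gallolyticus was investigated.

Is there any way to achieve this with XPath? Note: I am using Scrapy to extract things.

I used

response.xpath("//p//text()[normalize-space()]"), which gives all the heading values as separate chunks: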

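Against the backdrop of increasing resistance to conventional antibiotics, bacteriocins represent an attractive alternative, given their potent activity, novel modes of action and perceived lack of issues with resistance. In this study, the nature of the antibacterial activity of a clinical isolate of ', u' was investigated. Optimization of the production of an inhibitor from strain AB39 was performed using different broth media and supplements. Purification was carried out using size exclusion, ion exchange and HPLC. Gel diffusion agar overlay, MS/MS, ', u' peptide sequencing and genome mining were used in a proteogenomics approach to facilitate identification of the genetic basis for production of the inhibitor. u' Strain AB39 was identified as representing ', u' subsp. and the successful production and purification of the AB39 peptide, named nisin P, with a mass of 3133.78 Da, was achieved using BHI broth with 10 % serum. Nisin P showed antibacterial activity towards clinical isolates of drug-resistant bacteria, including methicillin-resistant ', u' , vancomycin-resistant ', u' and penicillin-resistant . In addition, the peptide exhibited significant stability towards high temperature, wide pH and certain proteolytic enzymes and displayed very low toxicity towards sheep red blood cells and Vero cells. To the best of our knowledge, this study represents the first production, purification and characterization of nisin P. Further study of nisin P may reveal its potential for treating or preventing infections caused by antibiotic-resistant Gram-positive bacteria, or those evading vaccination regimens.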

Any guidance would be helpful.


1 Answer

Stack Overflow user

Accepted answer

Answered on 2020-03-06 15:06:51

The fastest way is probably to grab everything with string(//p) and split it using text-manipulation commands.
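As a rough sketch of that "grab everything, then split" idea, the snippet below uses only the Python standard library. The shortened `HTML` sample and the helper name `split_sections` are assumptions made for illustration; inside a Scrapy callback the full text would presumably come from evaluating `string(//p)` on the response instead of from ElementTree.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the abstract in the question (assumed sample data).
HTML = """<div id="en-abstract"><p>
<b>Introduction.</b> Bacteriocins represent an attractive alternative.
<b>Aim.</b> The activity of <i>Streptococcus gallolyticus</i> was investigated.
<b>Conclusion.</b> Further study of nisin P may reveal its potential.
</p></div>"""


def split_sections(html):
    """Map each <b> heading to the text that follows it (hypothetical helper)."""
    p = ET.fromstring(html).find(".//p")
    full_text = "".join(p.itertext())        # same idea as string(//p)
    headings = [b.text for b in p.iter("b")]
    sections = {}
    pos = 0
    for i, heading in enumerate(headings):
        # The section starts right after its heading ...
        start = full_text.index(heading, pos) + len(heading)
        # ... and ends where the next heading begins (or at the end of the text).
        if i + 1 < len(headings):
            end = full_text.index(headings[i + 1], start)
        else:
            end = len(full_text)
        # Collapse runs of whitespace, like normalize-space() would.
        sections[heading] = " ".join(full_text[start:end].split())
        pos = end
    return sections


if __name__ == "__main__":
    for heading, body in split_sections(HTML).items():
        print(heading, "->", body)
```

Searching from the previous position onward keeps a heading word that also occurs inside an earlier section from being matched too soon.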

With XPath, you can try the following.

Get all the headings (returns 5 elements):

//b/text()

Get the corresponding descriptions (including the text inside the italic tags) with the following XPaths (returns 5*1 elements):

normalize-space(substring-before(substring-after(string(//p),//b[.="Introduction."]),//b[.="Aim."]))
normalize-space(substring-before(substring-after(string(//p),//b[.="Aim."]),//b[.="Methods."]))
normalize-space(substring-before(substring-after(string(//p),//b[.="Methods."]),//b[.="Results."]))
normalize-space(substring-before(substring-after(string(//p),//b[.="Results."]),//b[.="Conclusion."]))
normalize-space(substring-after(string(//p),//b[.="Conclusion."]))

If you don't know the text of the tags in advance, you can use positional indexes (//b[1], //b[2], ...). Use count(//b) to find the maximum index.
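A minimal sketch of how the positional variant could be generated once the number of headings is known. The helper name `build_section_xpaths` is hypothetical, and the count is assumed to have been obtained beforehand by evaluating count(//b). The sketch writes the index as `(//b)[i]` rather than `//b[i]`: the two agree here because every <b> shares the same parent, but the parenthesized form indexes in document order even when the headings sit under different parents.

```python
def build_section_xpaths(n_headings):
    """Return one XPath 1.0 expression per heading (hypothetical helper).

    Each expression extracts the text between heading i and heading i + 1;
    the last one takes everything after the final heading.
    """
    expressions = []
    for i in range(1, n_headings + 1):
        if i < n_headings:
            expr = (
                "normalize-space(substring-before("
                f"substring-after(string(//p),(//b)[{i}]),(//b)[{i + 1}]))"
            )
        else:
            expr = f"normalize-space(substring-after(string(//p),(//b)[{i}]))"
        expressions.append(expr)
    return expressions


if __name__ == "__main__":
    # 5 is the heading count from the question's HTML.
    for expr in build_section_xpaths(5):
        print(expr)
```

Each generated string can then be evaluated against the page the same way as the hand-written expressions above.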

Edit: alternative XPaths:

normalize-space(//text()[preceding::b="Introduction." and following::b="Aim."])
normalize-space(//text()[preceding::b="Aim." and following::b="Methods."])
normalize-space(//text()[preceding::b="Methods." and following::b="Results."])
normalize-space(//text()[preceding::b="Results." and following::b="Conclusion."])
normalize-space(//text()[preceding::b="Conclusion."])
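One caveat with these alternatives: in XPath 1.0, normalize-space() applied to a node-set only uses the first node, so a section that contains <i> elements would be cut off at the first italic word. A possible workaround is to collect every node between two headings and join them afterwards. The sketch below does that with the standard library only; the shortened `HTML` sample and the helper name `sections_between_bold` are assumptions for illustration, not part of the original answer.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the abstract in the question (assumed sample data).
HTML = """<div id="en-abstract"><p>
<b>Introduction.</b> Bacteriocins represent an attractive alternative.
<b>Aim.</b> The activity of <i>Streptococcus gallolyticus</i> was investigated.
<b>Conclusion.</b> Further study of nisin P may reveal its potential.
</p></div>"""


def sections_between_bold(html):
    """Group the text that follows each <b> up to the next <b> (hypothetical helper)."""
    p = ET.fromstring(html).find(".//p")
    parts = {}
    current = None
    for child in p:
        if child.tag == "b":
            # A new heading starts a new section; its tail is the first text chunk.
            current = child.text
            parts[current] = [child.tail or ""]
        elif current is not None:
            # Any other element (e.g. <i>) contributes its own text and its tail.
            parts[current].append("".join(child.itertext()))
            parts[current].append(child.tail or "")
    # Join the chunks and collapse whitespace, like normalize-space() would.
    return {h: " ".join("".join(chunks).split()) for h, chunks in parts.items()}


if __name__ == "__main__":
    for heading, body in sections_between_bold(HTML).items():
        print(heading, "->", body)
```

Unlike the string-splitting approach, this one never searches for the heading text inside the body, so a heading word that also appears in a description cannot confuse it.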
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/60565504
