I wrote some Java code to parse an XML file that I need to use, and I have run into a problem. My dataset is AIMed, which looks like this:
<passage>
<text>
Isolation of human delta-catenin and its binding specificity with presenilin 1.
We screened proteins for interaction with presenilin (PS) 1, and cloned the full-length cDNA of human delta-catenin, which encoded 1225 amino acids.
Yeast two-hybrid assay, GST binding assay and immunoprecipitation demonstrated that delta-catenin interacted with a hydrophilic loop region in the endoproteolytic C-terminal fragment of PS1, but not with that of PS-2.
These results suggest that PS1 and PS2 partly differ in function.
PS1 loop fragment containing the pathogenic mutation retained the binding ability.
We also found another armadillo-protein, p0071, interacted with PS1.
</text>
<annotation id="T1">
<infon key="file">ann</infon>
<infon key="type">protein</infon>
<location offset="19" length="13"></location>
<text>delta-catenin</text>
</annotation>
<relation id="R3">
<infon key="relation type">Interaction</infon>
<infon key="file">ann</infon>
<infon key="type">Relation</infon>
<node refid="T5" role="Arg1"></node>
<node refid="T6" role="Arg2"></node>
</relation>
</passage>

I am using SAXParser, and my code (for the text tag) is as follows:
else if (bText)
{
    System.out.println("Text: "
            + new String(ch, start, length));
    bText = false;
}

But it only prints two of the sentences. How can I fix this?
Posted on 2015-12-08 17:41:46
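Iterate over the nodes in the NodeList until you find the one you want, cast it to an Element (the text element in this case), and call element.getTextContent(). Have a look at the Node interface; I believe it also returns the text of the node's descendants, if there are any. Note that NodeList and Element belong to the DOM API (org.w3c.dom), so this means parsing the file with a DOM parser rather than with SAX.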
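
As a rough sketch of the suggestion above, the code below reads the passage text with the DOM API. It also shows a SAX variant, because a likely cause of the truncated output is that a SAX parser may deliver the character data of one element in several characters() calls, and the original handler resets bText after the first call. The class name PassageText, the method names domText and saxText, and the shortened sample XML are made up for illustration; only the JDK's standard javax.xml.parsers, org.w3c.dom and org.xml.sax classes are used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PassageText {

    // DOM approach: walk the direct children of <passage>, cast the
    // <text> node to Element and read its full text content.
    static String domText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE
                    && "text".equals(node.getNodeName())) {
                return ((Element) node).getTextContent();
            }
        }
        return null; // no <text> child found
    }

    // SAX approach: append every characters() chunk to a buffer and
    // only use the result once the closing </text> tag is reached.
    static String saxText(String xml) throws Exception {
        final StringBuilder buffer = new StringBuilder();
        final String[] result = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;             // nesting level of the current element
            private boolean inPassageText = false;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                depth++;
                // depth 2 = direct child of <passage>; this skips the
                // <text> elements nested inside <annotation>.
                if (depth == 2 && "text".equals(qName)) {
                    inPassageText = true;
                    buffer.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inPassageText) {
                    buffer.append(ch, start, length); // may run several times
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (inPassageText && depth == 2) {
                    result[0] = buffer.toString();
                    inPassageText = false;
                }
                depth--;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        // Shortened, hypothetical sample in the same shape as the AIMed passage.
        String xml = "<passage><text>\nFirst sentence.\nSecond sentence.\n"
                + "Third sentence.\n</text><annotation id=\"T1\">"
                + "<text>delta-catenin</text></annotation></passage>";
        System.out.println("DOM: " + domText(xml));
        System.out.println("SAX: " + saxText(xml));
    }
}
```

Both methods should return the whole content of the passage-level text element, including the line breaks between sentences. With SAX, the important part is to do the printing in endElement rather than in characters().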
https://stackoverflow.com/questions/34151979