Given this HTML:
<table width="100%" cellpadding="0" cellspacing="0" border="0">
<tr>
<td width="27%" align="left" valign="top">
<span class="param">Text0</span> 23<br />
<span class="param">Text1</span> 173<br />
<span class="param">Text2</span> 54<br />
<span class="param">Text3</span> 2<br /><br />
</td>
<td width="27%" align="left" valign="top">
<span class="param">Text4</span><br />
one <br />
two <br />
three <br />
</td>
<td width="46%" align="left" valign="top">
<span class="param">Text5</span><br />
one -<br />
two -<br />
three -<br />
</td>
</tr>
</table>
I can get the values of Text0 through Text3 by changing get(0) to get(3) in the parsing code, but I cannot get the values of Text4 and Text5:
Document doc = Jsoup.connect("text.html").get();
Element param = doc.select("span[class=param]").get(0);
Node node = param.nextSibling();
System.out.println(node.toString());
How can I get the values of Text4 and Text5? get(4) and get(5) currently return <br>, but I need to get "one, two, three".
Right now I use the following code:
Document doc = Jsoup.connect("text.html").get();
Elements params = doc.select("span[class=param]");
int i;
for (i=0; i<6; i++) {
Element param = params.get(i);
Node node = param.nextSibling();
System.out.println(node.toString());
}
This prints:
23
173
54
2
<br>
<br>
I need:
23
173
54
2
one two three
one two three
Crazy code answer:
Document doc = Jsoup.connect("text.html").get();
Elements params = doc.select("span[class=param]");
int i;
for (i=0; i<4; i++) {
Element param = params.get(i);
Node node = param.nextSibling();
System.out.println(node.toString());
}
for (i=4; i<5; i++){
Element apar = params.get(i);
Node apan = apar.nextSibling();
System.out.println("apar: "+apan.nextSibling().toString());
System.out.println("apar: "+apan.nextSibling().nextSibling().nextSibling().toString());
System.out.println("apar: "+apan.nextSibling().nextSibling().nextSibling().nextSibling().nextSibling().toString());
//System.out.println(apan.nextSibling().toString());
}
for (i=5; i<6; i++){
Element vih = params.get(i);
Node vihn = vih.nextSibling();
System.out.println("vih: "+vihn.nextSibling().toString());
System.out.println("vih: "+vihn.nextSibling().nextSibling().nextSibling().toString());
System.out.println("vih: "+vihn.nextSibling().nextSibling().nextSibling().nextSibling().nextSibling().toString());
//System.out.println(apan.nextSibling().toString());
}
}
This crazy(?) code prints what I want.
Posted on 2016-03-26 00:02:16
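When you call doc.select("span[class=param]"), you get back a list of elements (an Elements object), not a single Element. You need to iterate over that list to process each <span> element. In your code you only ever get one of them, because you call Element param = doc.select("span[class=param]").get(0);
Document doc = Jsoup.connect("text.html").get();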
Elements params = doc.select("span[class=param]");
for(Element element: params){
//Will print out the text contained within the <span>...</span>
System.out.println(element.ownText());
}
params = doc.select("td");
for(Element element: params){
//Will print out the text contained in all children nodes of <td> nodes, that are text nodes
System.out.println(element.ownText());
//System.out.println(element.text());
}
The code above will print:
Text0
Text1
Text2
Text3
Text4
Text5
23 173 54 2
one two three
one - two - three -
That should be enough to get you where you are going. Good luck!
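To get one value per label rather than one line per <td>, a possible approach is to walk the siblings that follow each span.param and join the text nodes until the next label (or the end of the cell) is reached. The following is only a minimal sketch under some assumptions: the class ParamValues and the helper valuesAfterParams are hypothetical names introduced here, the HTML is passed in as a string and parsed with Jsoup.parse instead of being fetched with Jsoup.connect, and the sample markup in main is a shortened version of the table from the question.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class ParamValues {

    // Hypothetical helper (not part of jsoup): for each span.param, join the
    // text nodes that follow it, skipping <br> elements, until the next
    // span.param or the end of the parent element.
    public static List<String> valuesAfterParams(String html) {
        Document doc = Jsoup.parse(html);
        List<String> values = new ArrayList<>();
        for (Element param : doc.select("span.param")) {
            StringBuilder sb = new StringBuilder();
            for (Node n = param.nextSibling(); n != null; n = n.nextSibling()) {
                if (n instanceof Element && ((Element) n).hasClass("param")) {
                    break; // reached the next label, stop collecting
                }
                if (n instanceof TextNode) {
                    String text = ((TextNode) n).text().trim();
                    if (!text.isEmpty()) {
                        if (sb.length() > 0) {
                            sb.append(' ');
                        }
                        sb.append(text);
                    }
                }
            }
            values.add(sb.toString());
        }
        return values;
    }

    public static void main(String[] args) {
        // Shortened sample markup, assumed to mirror the table in the question.
        String html = "<table><tr>"
                + "<td><span class=\"param\">Text0</span> 23<br />"
                + "<span class=\"param\">Text1</span> 173<br /><br /></td>"
                + "<td><span class=\"param\">Text4</span><br />"
                + "one <br />two <br />three <br /></td>"
                + "</tr></table>";
        for (String value : valuesAfterParams(html)) {
            System.out.println(value);
        }
    }
}
```

Because the loop stops at the next span.param, it does not matter how many <br> elements or text lines follow a label, which avoids the hard-coded nextSibling() chains from the question.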
https://stackoverflow.com/questions/36223347