Hi everyone, I'm a beginner. I can already use the jsoup library to read HTML from a web page.
This is my basic HTML:
<ul id="nav" class="sf-menu">
  <li class="level0 nav-3 level-top parent">
    <a href="mylink.html" class="level-top"><span>ABCD_MAIN CAT</span></a>
    <ul class="level0">
      <li id="level1nav-3-1first">
        <a class="arrow" href="mylink.html">SUB CAT</a>
        <ul>
          <li><span><a href="mylink.html">SUB TO SUB CAT1</span></a></li>
          <li><span><a href="mylink.html">SUB TO SUB CAT2</span></a></li>
        </ul>
      </li>
      <li class="level1 nav-3-1 first">
        <a href="mylink.html"><span>SUB CAT(HERE NO SUB TO SUB CAT)</span></a>
      </li>
      <li>
        <a href="mylink.html" class="see-all"><span>SUB CAT(HERE NO SUB TO SUB CAT)</span></a>
      </li>
    </ul>
  </li>
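</ul>
From this, I need to read every category (the main "cat") with its link, each sub-category with its link, and each sub-sub-category with its link.
How can I do that?
Please help.
Thanks in advance...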
Posted on 2015-12-29 13:37:27
You can parse the content along these lines:
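import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

String webpageContent = "..."; // placeholder: your HTML page as a string
Document doc = Jsoup.parseBodyFragment(webpageContent);
Elements liTags = doc.select("li"); // selects every <li> element, at any depth
for (Element liTag : liTags) {
    // Pull what you need out of each <li>, for example with
    // liTag.attr("class"), liTag.html(), or liTag.outerHtml()
}
For the other properties of the Element class, see this link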
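Selecting every li returns the entries as a flat list, so the nesting (category, sub-category, sub-sub-category) is lost. One way to keep it is to walk the nested ul/li structure recursively. The following is only a sketch under some assumptions: the class name `MenuWalker` and its methods are hypothetical, the menu is assumed to be the `ul` with id `nav` as in the question, each `li` is assumed to carry its own link before any nested `ul`, and `selectFirst` is assumed to be available (it is present in reasonably recent jsoup releases).

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class MenuWalker {

    /** Returns one "depth | text | href" line per menu entry, in document order. */
    public static List<String> walk(String html) {
        Document doc = Jsoup.parseBodyFragment(html);
        List<String> out = new ArrayList<>();
        // Assumption: the menu root is <ul id="nav">, as in the question.
        Element root = doc.selectFirst("ul#nav");
        if (root != null) {
            collect(root, 0, out);
        }
        return out;
    }

    private static void collect(Element ul, int depth, List<String> out) {
        // Only direct children, so each level is handled exactly once.
        for (Element li : ul.children()) {
            if (!li.tagName().equals("li")) {
                continue;
            }
            // First link inside this <li> in document order. Assumption: the
            // entry's own link comes before any nested <ul>.
            Element link = li.selectFirst("a[href]");
            if (link != null) {
                out.add(depth + " | " + link.text() + " | " + link.attr("href"));
            }
            // Recurse into nested lists one level deeper.
            for (Element child : li.children()) {
                if (child.tagName().equals("ul")) {
                    collect(child, depth + 1, out);
                }
            }
        }
    }
}
```

Calling `MenuWalker.walk(webpageContent)` should then give depth 0 for the main category, depth 1 for its sub-categories, and depth 2 for the sub-sub-categories, each with the link text and its `href`. If relative links need to be resolved, the document would have to be parsed with a base URI first.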
https://stackoverflow.com/questions/34504549