I've made a lot of progress over the past few hours, but I've finally hit a wall.
Here is my code:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$info = curl_exec($ch);
$html = new DOMDocument();
$html->loadHTML($info);
$xpath = new DOMXPath($html);
$texts = $xpath->query("//div[@class='summary-gems']/ul/li");
$imgs = $xpath->query("//div[@class='summary-gems']/ul/li");
for ($i = 0; $i < $texts->length; $i++) {
$gems[$i]['text'] = $texts->item($i)->nodeValue;
$gems[$i]['img'] = $imgs->getAttribute('href');
echo $gems[$i]['img'];
die;
}
Here is what the XHTML currently looks like:
<div class="summary-gems">
<ul>
<li>
<span class="value">5</span>
<span class="times">x</span>
<span class="icon">
<span class="icon-socket socket-2">
<a href="/wow/en/item/52207" class="gem">
<img src="http://us.battle.net/wow-assets/static/images/icons/18/inv_misc_cutgemsuperior6.jpg" alt="" />
<span class="frame"></span>
</a></span></span>
<a href="/wow/en/item/52207" class="name color-q3">Brilliant Inferno Ruby</a>
<span class="clear">
<!-- -->
</span>
</li>
<li>
<span class="value">3</span>
<span class="times">x</span>
<span class="icon">
<span class="icon-socket socket-10">
<a href="/wow/en/item/52236" class="gem">
<img src="http://us.battle.net/wow-assets/static/images/icons/18/inv_misc_cutgemsuperior3.jpg" alt="" />
<span class="frame"></span>
</a></span></span>
<a href="/wow/en/item/52236" class="name color-q3">Purified Demonseye</a>
<span class="clear">
<!-- -->
</span>
</li>
<li>
<span class="value">3</span>
<span class="times">x</span>
<span class="icon">
<span class="icon-socket socket-6">
<a href="/wow/en/item/68356" class="gem">
<img src="http://us.battle.net/wow-assets/static/images/icons/18/inv_misc_cutgemsuperior4.jpg" alt="" />
<span class="frame"></span>
</a></span></span>
<a href="/wow/en/item/68356" class="name color-q3">Willful Ember Topaz</a>
<span class="clear">
<!-- -->
</span>
</li>
<li>
<span class="value">1</span>
<span class="times">x</span>
<span class="icon">
<span class="icon-socket socket-1">
<a href="/wow/en/item/52298" class="gem">
<img src="http://us.battle.net/wow-assets/static/images/icons/18/inv_misc_metagem_b.jpg" alt="" />
<span class="frame"></span>
</a></span></span>
<a href="/wow/en/item/52298" class="name color-q3">Destructive Shadowspirit Diamond</a>
<span class="clear">
<!-- -->
</span>
</li>
</ul>
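</div>
When I read the "text" part, I get only the plain text inside each matched node (there are 4 in this case). What I would like, if possible, is all of the XHTML. If that isn't possible, then what I need for each node is the "image source" and the "hyperlink class 'gem'". I'm a bit confused about how to get anything other than a node's plain text here.
Any help would be much appreciated! Let me know if you have any questions.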
Posted on 2011-03-15 17:08:47
The XPath for the links is
//div[@class='summary-gems']/ul//a[@class='gem']
and you can access the attribute with
(string)$simplexmlelement['href']
Do the same for <img src="..">.
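As a rough illustration of the attribute access described here, the following sketch runs that XPath with SimpleXML against a trimmed copy of the markup from the question. The `$markup`, `$root`, `$link`, and `$gems` names are placeholders chosen for this example, and it assumes the input is well-formed XML; for a real HTML page it may be necessary to parse with DOMDocument first and convert with `simplexml_import_dom()`.

```php
<?php
// Trimmed sample of the markup from the question (assumed well-formed).
// In practice this string would come from the cURL response.
$markup = <<<XML
<div class="summary-gems">
<ul>
<li>
<span class="icon"><span class="icon-socket socket-2">
<a href="/wow/en/item/52207" class="gem">
<img src="http://us.battle.net/wow-assets/static/images/icons/18/inv_misc_cutgemsuperior6.jpg" alt="" />
<span class="frame"></span>
</a></span></span>
<a href="/wow/en/item/52207" class="name color-q3">Brilliant Inferno Ruby</a>
</li>
</ul>
</div>
XML;

$root = simplexml_load_string($markup);

$gems = [];
// Select only the links carrying class="gem" inside the summary list.
foreach ($root->xpath("//div[@class='summary-gems']/ul//a[@class='gem']") as $link) {
    $gems[] = [
        // Array-style access reads an attribute; the cast yields a plain string.
        'href' => (string) $link['href'],
        // The <img> is a direct child of the link, so the same syntax applies.
        'src'  => (string) $link->img['src'],
    ];
}

print_r($gems);
```

Casting to `string` matters because SimpleXML returns `SimpleXMLElement` objects for attributes rather than plain strings.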
To get the full XML of an element, use $simplexmlelement->asXML().
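Since the code in the question uses DOMDocument and DOMXPath rather than SimpleXML, the same idea can probably be expressed there as well. The sketch below is one possible approach, not necessarily what the answerer had in mind: `DOMDocument::saveXML()` accepts a node and serializes it, much as `asXML()` does, and `getAttribute()` is called on each individual element instead of on the node list. The sample markup is trimmed from the question, and the variable names are placeholders.

```php
<?php
// Trimmed sample of the markup from the question; in practice this would be
// the string returned by curl_exec().
$markup = <<<HTML
<div class="summary-gems">
<ul>
<li>
<span class="value">5</span>
<span class="icon"><span class="icon-socket socket-2">
<a href="/wow/en/item/52207" class="gem">
<img src="http://us.battle.net/wow-assets/static/images/icons/18/inv_misc_cutgemsuperior6.jpg" alt="" />
</a></span></span>
<a href="/wow/en/item/52207" class="name color-q3">Brilliant Inferno Ruby</a>
</li>
</ul>
</div>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);

$gems = [];
foreach ($xpath->query("//div[@class='summary-gems']/ul/li") as $li) {
    // Relative queries (leading ".") search only inside the current <li>.
    $link = $xpath->query(".//a[@class='gem']", $li)->item(0);
    $img  = $xpath->query(".//img", $li)->item(0);

    $gems[] = [
        // Passing a node to saveXML() serializes that node and its children.
        'xhtml' => $doc->saveXML($li),
        'href'  => $link ? $link->getAttribute('href') : null,
        'src'   => $img ? $img->getAttribute('src') : null,
    ];
}

print_r($gems);
```

The null checks guard against list items that lack a gem link or an image, which the sample markup does not show but a live page might contain.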
https://stackoverflow.com/questions/5308333