I'm interested in trying web scraping. However, when I run the code below, I get: ( ! ) Fatal error: Call to a member function innertext() on a non-object
include_once('simple_html_dom.php');
set_time_limit(300);
$url = "http://www.flickr.com/photos/terriek/galleries/72157622371738280/";
echo $url;
$ch = curl_init();
echo $ch;
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, TRUE);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
$result = curl_exec ($ch);
//echo $result;
curl_close($ch);
$html = new simple_html_dom();
echo $html;
$html->load($result);
$exts = array('jpg', 'jpeg', 'png', 'gif');
foreach($html->find('img') as $element) { // error with this line
    $path_parts = pathinfo($element->src);
    // if condition
    $ch = curl_init($element->src);
    $fp = fopen("imgs/".$path_parts['basename'], "wb");
    curl_setopt($ch, CURLOPT_FILE, $fp);
    echo curl_exec($ch);
    curl_close($ch);
    fclose($fp);
}
Answered 2014-10-18 16:00:04
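This may be because the URL you pass to cURL inside the loop is not a complete URL.
Try:
echo $element->src;
inside your loop and make sure it prints a full URL. If it prints a relative URL, prepend your $url to it before making the cURL request.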
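The suggestion above can be sketched as a small helper that turns an img src into an absolute URL before it is handed to cURL. The function name absolutize_url is hypothetical (it is not part of simple_html_dom), and the sketch is deliberately simplified: it handles absolute, protocol-relative, root-relative and path-relative values, but does not collapse "./" or "../" segments.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): make an <img> src absolute.
// Simplified sketch: "./" and "../" segments are NOT normalised.
function absolutize_url(string $base, string $src): string
{
    // Already absolute, e.g. "https://example.com/a.jpg"
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $src)) {
        return $src;
    }

    $parts  = parse_url($base);
    $scheme = $parts['scheme'] ?? 'http';
    $host   = $parts['host'] ?? '';
    $port   = isset($parts['port']) ? ':' . $parts['port'] : '';
    $origin = $scheme . '://' . $host . $port;

    // Protocol-relative, e.g. "//farm3.staticflickr.com/a.jpg"
    if (strpos($src, '//') === 0) {
        return $scheme . ':' . $src;
    }

    // Root-relative, e.g. "/images/a.jpg"
    if (strpos($src, '/') === 0) {
        return $origin . $src;
    }

    // Path-relative: resolve against the directory part of the base path
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $src;
}
```

Inside the loop this would be used as curl_init(absolutize_url($url, $element->src)) instead of curl_init($element->src). Note that simply prepending $url only gives the right result for path-relative values; root-relative ones need the scheme and host only.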
Answered 2014-10-18 22:44:51
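The problem is with the main URL: when you open it in a browser, you'll see that it is redirected to the secure protocol, so updating it to https should make your code work: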
$url = "https://www.flickr.com/photos/terriek/galleries/72157622371738280/";
https://stackoverflow.com/questions/26436723
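One way to check whether a redirect (or a failed fetch) is really what is happening is to look at what cURL got back before passing it to the parser. The sketch below is only a diagnostic aid under stated assumptions: fetch_page and is_usable_response are hypothetical names, not part of cURL or simple_html_dom, and treating only 2xx responses with a non-empty body as usable is an assumption made for illustration.

```php
<?php
// Hypothetical helper: fetch a page and report what cURL actually received.
function fetch_page(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects such as http -> https
    $body = curl_exec($ch);                         // false on failure

    return [
        'body'   => $body,
        'status' => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE),        // final HTTP status
        'final'  => (string) curl_getinfo($ch, CURLINFO_EFFECTIVE_URL), // URL after redirects
        'error'  => curl_error($ch),                                    // empty string if none
    ];
}

// Hypothetical guard: only parse the body if the fetch looks healthy
// (assumption: a non-empty string body with a 2xx status).
function is_usable_response($body, int $status): bool
{
    return is_string($body) && $body !== '' && $status >= 200 && $status < 300;
}
```

For example, call $page = fetch_page($url); and only run $html->load($page['body']) when is_usable_response($page['body'], $page['status']) is true. Otherwise, printing $page['status'], $page['final'] and $page['error'] shows whether the request was redirected or failed.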