I am using the mechanize library to simulate a browser and fetch HTML, as shown below, but I keep getting an error...
The failing code:
import mechanize

post_url = "http://www.stackoverflow.com/"
browser = mechanize.Browser()
browser.set_handle_robots(False)
browser.addheaders = [('User-agent', 'Firefox')]
html = browser.open(post_url).read().decode('UTF-8')
Error:
Traceback (most recent call last):
  File "C:\test.py", line 1538, in <module>
    periodically(180, -60, +60, getData)
  File "C:\test.py", line 262, in periodically
    s.run()
  File "C:\Python27\lib\sched.py", line 117, in run
    action(*argument)
  File "C:\test.py", line 1241, in getData
    html = browser.open(post_url).read().decode('UTF-8')
  File "build\bdist.win32\egg\mechanize\_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "build\bdist.win32\egg\mechanize\_mechanize.py", line 255, in _mech_open
    raise response
httperror_seek_wrapper: HTTP Error 500: Internal Server Error
Does anyone know how to fix or avoid this error?
Answered 2013-07-16 20:40:54
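HTTP error 500 means "Internal Server Error".
I guess the sample code you posted does not itself trigger the error, right?
Two possible causes:
I don't think this has anything to do with the mechanize library.
Edit: if you don't care about the cause of the error and just want to catch the exception, you can use: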
try:
    html = browser.open(post_url).read().decode('UTF-8')
except mechanize.HTTPError as e:
    # handle HTTP errors explicitly by status code
    if int(e.code) == 500:
        # do nothing; you may want to set "html" to an empty string here
        pass
    else:
        raise  # any other HTTP error code: re-raise the exception
Answered 2013-07-17 16:05:38
You can't fix it; you can only double-check that the data you are parsing is correct.
To work around the problem, use try/except:
import mechanize
from urllib2 import HTTPError

try:
    post_url = "http://www.stackoverflow.com/"
    browser = mechanize.Browser()
    browser.set_handle_robots(False)
    browser.addheaders = [('User-agent', 'Firefox')]
    html = browser.open(post_url).read().decode('UTF-8')
except HTTPError as e:
    # the server answered with an error status such as 500
    print("Got error code %s" % e.code)
https://stackoverflow.com/questions/17676677
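Since the fetch in the question runs periodically, the try/except pattern from the answers could be wrapped in a small helper so that a 500 response skips one cycle instead of ending the script. This is only a sketch under assumptions: fetch_html and failing_opener are hypothetical names, not part of mechanize, and the code assumes that the exception raised by browser.open is the standard library's HTTPError, as the second answer's import suggests. The fake opener replaces a real server so the example runs without network access.

```python
import io

try:  # Python 3
    from urllib.error import HTTPError
except ImportError:  # Python 2, as in the traceback above
    from urllib2 import HTTPError


def fetch_html(open_url, url, ignored_codes=(500,)):
    """Return the decoded page, or None if the server answers with an ignored status.

    open_url is any callable that takes a URL and returns an object with
    .read(); browser.open of a mechanize.Browser is assumed to fit.
    """
    try:
        return open_url(url).read().decode('UTF-8')
    except HTTPError as e:
        if e.code in ignored_codes:
            return None  # caller decides what to do when there is no page
        raise  # any other HTTP status is still reported


def failing_opener(code, reason):
    """Hypothetical stand-in for a server that always answers with `code`."""
    def open_url(url):
        raise HTTPError(url, code, reason, {}, io.BytesIO(b""))
    return open_url


html = fetch_html(failing_opener(500, "Internal Server Error"),
                  "http://example.invalid/")
print(html)  # None
```

With the browser object from the question, the call would presumably look like fetch_html(browser.open, post_url), and getData would check for None before parsing.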