I want to use HTTPNtlmAuthHandler with mechanize.Browser(). I have HTTPNtlmAuthHandler working with urllib2 and with mechanize.urlopen(), but when I try to use it with Browser() it does not work.
Here is the code I use for urlopen:
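import mechanize
from ntlm import HTTPNtlmAuthHandler  # provided by the python-ntlm package

# url, baseurl, user and password are defined elsewhere in the script
passman = mechanize.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, user, password)
# NTLM handler that looks up credentials in the password manager
auth_NTLM = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman)
opener = mechanize.build_opener(auth_NTLM)
mechanize.install_opener(opener)
mechanize.urlopen(baseurl)
Traceback, as requested: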
harrisony@lithium:~$ python sitefoo.py
now running mechanize.urlopen
<addinfourl at 169181868 whose fp = <httplib.HTTPResponse instance at 0xa15858c>>
now running mechanize.Browser then br.open
Traceback (most recent call last):
  File "sitescreaper.py", line 21, in <module>
    br.open(baseurl)
  File "/usr/lib/python2.6/dist-packages/mechanize/_mechanize.py", line 209, in open
    return self._mech_open(url, data, timeout=timeout)
  File "/usr/lib/python2.6/dist-packages/mechanize/_mechanize.py", line 261, in _mech_open
    raise response
mechanize._response.httperror_seek_wrapper: HTTP Error 401: Unauthorized

Posted on 2011-02-25 22:48:38
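There may be a better option, but the only way I could get this to work was to remove the HTTPRobotRulesProcessor handler, which somehow prevents HTTPNtlmAuthHandler from being called.
Note: the code below also happens to remove ProxyHandler in order to bypass the proxy server; drop that part if it does not apply to you.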
import mechanize
from ntlm import HTTPNtlmAuthHandler  # provided by the python-ntlm package

passman = mechanize.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, baseurl, user, password)
auth_NTLM = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman)
browser = mechanize.Browser()
browser.add_handler(auth_NTLM)

# Keep every handler except the proxy handler and the robots.txt processor
handlersToKeep = []
for handler in browser.handlers:
    if not isinstance(handler, (mechanize._auth.ProxyHandler,
                                mechanize._urllib2_support.HTTPRobotRulesProcessor)):
        handlersToKeep.append(handler)

browser.handlers = handlersToKeep
browser.open(url)

https://stackoverflow.com/questions/2153095
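
The handler-filtering step in the answer can be looked at in isolation. The following is only a minimal sketch: the `Fake...` classes and the `drop_handlers` helper are hypothetical names invented for illustration, standing in for the real mechanize handler types so the logic can run without mechanize or python-ntlm installed.

```python
# Hypothetical stand-ins for the handler types named in the answer.
# They are NOT mechanize classes; they exist only to exercise the filter.
class FakeProxyHandler(object):
    pass


class FakeRobotRulesProcessor(object):
    pass


class FakeRedirectHandler(object):
    pass


class FakeNtlmAuthHandler(object):
    pass


def drop_handlers(handlers, unwanted_types):
    """Return a new list without instances of any type in unwanted_types."""
    unwanted = tuple(unwanted_types)
    return [h for h in handlers if not isinstance(h, unwanted)]


# A pretend handler chain, in the order a browser object might hold it.
handlers = [
    FakeProxyHandler(),
    FakeRedirectHandler(),
    FakeRobotRulesProcessor(),
    FakeNtlmAuthHandler(),
]

kept = drop_handlers(handlers, (FakeProxyHandler, FakeRobotRulesProcessor))
kept_names = [type(h).__name__ for h in kept]
print(kept_names)
```

With a real browser object, the same idea would presumably be applied as in the answer, by assigning the filtered list back to `browser.handlers` and passing the actual mechanize handler classes instead of the stand-ins.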