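Hi everyone. I'm new to Python. I want to fetch data from our school's website, and before doing that I need to log in automatically. This is our school's website: "http://ams.bhsfic.com". I also tried to capture the real URL used for logging in, but when I open that URL it returns a "404". Here is the code: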
import urllib
import urllib2
import cookielib

class Login:
    def __init__(self):
        self.loginUrl = 'http://ams.bhsfic.com/system/login/doLogin'
        self.cookies = cookielib.CookieJar()
        self.postdata = urllib.urlencode({
            'emall': '20160612',
            'userPwd': 'At121212'
        })
        self.opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(self.cookies))

    def getPage(self):
        request = urllib2.Request(
            url=self.loginUrl,
            data=self.postdata)
        result = self.opener.open(request)
        print result.read().decode('gbk')

login = Login()
login.getPage()

bug:
Connected to pydev debugger (build 172.3544.46)
Traceback (most recent call last):
  File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1599, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1026, in run
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "/Users/mac/PycharmProjects/crawler/crawler", line 23, in <module>
    sdu.getPage()
  File "/Users/mac/PycharmProjects/crawler/crawler", line 18, in getPage
    result = self.opener.open(request)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 437, in open
    response = meth(req, response)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 550, in http_response
    'http', request, response, code, msg, hdrs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 475, in error
    return self._call_chain(*args)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 558, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found

Thanks.
Posted on 2017-08-15 22:18:29
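You are getting a 404 error because the page http://ams.bhsfic.com/system/login/doLogin does not exist. Instead, try sending the POST request to http://ams.bhsfic.com/system/login, which does work. Note that if your site uses CSRF tokens (or an equivalent), you will have to handle them as well.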
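A minimal sketch of this suggestion, written for Python 3 (where urllib.request and http.cookiejar replace urllib2 and cookielib). It fetches the login page first, copies any hidden form fields (a common place for a CSRF token) into the POST data, and reports an HTTP error instead of crashing. The field names 'emall' and 'userPwd' and the 'gbk' encoding come from the question; that the site keeps its token in a hidden input is an assumption, and the helper names are made up for illustration.

```python
# Python 3 sketch; not tested against the real site.
import http.cookiejar
import urllib.error
import urllib.parse
import urllib.request
from html.parser import HTMLParser

LOGIN_URL = 'http://ams.bhsfic.com/system/login'  # URL suggested in the answer


class HiddenInputParser(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> fields."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get('type') or '').lower() == 'hidden'
        if tag == 'input' and is_hidden and attrs.get('name'):
            self.hidden[attrs['name']] = attrs.get('value') or ''


def hidden_fields(html):
    """Return the hidden form fields found in an HTML string."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.hidden


def build_login_request(url, fields):
    """Build a request; passing data makes urllib send it as a POST."""
    data = urllib.parse.urlencode(fields).encode('ascii')
    return urllib.request.Request(url, data=data)


def login(user, password):
    """Try to log in; return the response body, or None on an HTTP error."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    try:
        # GET first so the session cookie and any hidden token are collected.
        page = opener.open(LOGIN_URL).read().decode('gbk', errors='replace')
        fields = hidden_fields(page)  # assumption: token is a hidden input
        fields.update({'emall': user, 'userPwd': password})
        response = opener.open(build_login_request(LOGIN_URL, fields))
        return response.read().decode('gbk', errors='replace')
    except urllib.error.HTTPError as err:
        print('Login request failed:', err.code, err.reason)
        return None
```

Calling login('20160612', 'At121212') would then either return the page served after the POST or print the status code. Whether that page means the login succeeded still has to be checked against what the site actually returns.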
Alternatively, you can try Selenium. It simulates real user interaction, so you can "select" the username/password fields and then click the "Login" button.
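A rough sketch of the Selenium route, assuming the selenium package and a Chrome driver are installed. The locators are guesses: the input names 'emall' and 'userPwd' are taken from the POST fields in the question, and the submit-button selector is purely hypothetical, so all three need to be checked against the real page. The function names are made up for illustration.

```python
LOGIN_URL = 'http://ams.bhsfic.com/system/login'  # URL suggested in the answer


def fill_login_form(driver, user, password,
                    user_locator=('name', 'emall'),
                    password_locator=('name', 'userPwd'),
                    submit_locator=('css selector', 'button[type="submit"]')):
    """Type the credentials and click the login button.

    Each locator is a (strategy, value) pair; 'name' and 'css selector'
    are the string values of Selenium's By.NAME and By.CSS_SELECTOR.
    The default locators are assumptions about the page's markup.
    """
    driver.find_element(*user_locator).send_keys(user)
    driver.find_element(*password_locator).send_keys(password)
    driver.find_element(*submit_locator).click()


def selenium_login(user, password):
    """Open a browser, log in, and return the resulting page source."""
    # Imported here so the helper above can be used without selenium.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(LOGIN_URL)
        fill_login_form(driver, user, password)
        # The page may still be loading at this point; an explicit wait
        # would probably be needed on a real site.
        return driver.page_source
    finally:
        driver.quit()
```

Keeping the form-filling step separate from browser start-up makes it easy to swap in the right locators once the login page's HTML has been inspected.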
https://stackoverflow.com/questions/45694661