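Any ideas why I can't log in? I have been trying the same approach to log in to both Facebook and LinkedIn, without success. I am using the latest version of Scrapy. As a test I try to access "messages", but I know the login isn't working because I get redirected back to the login page... The same happens on LinkedIn.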
import logging

import scrapy
from scrapy.http import FormRequest, Request

from linkedIn.items import LinkedinItem
# from spider.settings import JsonWriterPipeline


class MySpider(scrapy.Spider):
    # A plain Spider is used here: CrawlSpider reserves parse() for its own
    # rule handling, so overriding it on a CrawlSpider is discouraged.
    name = 'fb'
    allowed_domains = ['facebook.com']
    start_urls = ['https://login.facebook.com/login.php']

    def parse(self, response):
        # Fill in and submit the login form found on the start page.
        return [FormRequest.from_response(response,
                                          formname='login_form',
                                          formdata={'email': 'my_email@example.com',
                                                    'pass': 'test!'},
                                          callback=self.after_login)]

    def after_login(self, response):
        # Check that the login succeeded before going on.
        if "the password you entered is incorrect" in response.text:
            self.log("\n\n\n\nLogin failed\n\n\n\n", level=logging.ERROR)
            return
        else:
            self.log("\n\n\n Login was successful!!!\n\n\n")
            self.log(response.text)
            return Request(url="https://facebook.com/messages",
                           callback=self.parse_items)

    def parse_items(self, response):
        titles = response.xpath("//title")
        items = []
        for title in titles:
            item = LinkedinItem()
            # Read the text of the current <title> node only.
            item['friendName'] = title.xpath("text()").extract()
            # item['numberOffriends'] = title.xpath("some path here").extract().pop()
            items.append(item)
        return items

Posted on 2015-08-25 07:30:26
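Both Facebook and LinkedIn use CSRF tokens. You must first fetch the page that contains the login form, then parse the HTML to extract the CSRF token, and finally send a POST request with the username/password and the CSRF token.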
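As a rough illustration of those three steps, here is a minimal standard-library sketch that pulls every hidden input out of a login page and merges it with the credentials into a POST payload. The sample HTML and the field name `csrf_token` are hypothetical placeholders, not the real markup of Facebook or LinkedIn, and `HiddenInputParser` and `build_login_payload` are names made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Hidden inputs are where CSRF tokens are usually carried.
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_login_payload(login_html, email, password):
    """Merge the hidden form fields with the credentials."""
    parser = HiddenInputParser()
    parser.feed(login_html)
    payload = dict(parser.fields)
    payload["email"] = email
    payload["pass"] = password
    return payload


# Hypothetical login page; a real site will use different field names.
SAMPLE_LOGIN_HTML = """
<form id="login_form" action="/login.php" method="post">
  <input type="hidden" name="csrf_token" value="abc123">
  <input type="text" name="email">
  <input type="password" name="pass">
</form>
"""

payload = build_login_payload(SAMPLE_LOGIN_HTML, "my_email@example.com", "test!")
body = urlencode(payload)  # form-encoded body for the POST request
```

In a Scrapy spider the same idea means requesting the login page first and passing the extracted values through `formdata`. Note that `FormRequest.from_response` already pre-fills the hidden inputs of the form it selects, so it is worth checking that it picks the right form and that the token is present in the HTML rather than injected by JavaScript.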
https://stackoverflow.com/questions/32163377