
Scraping through a proxy with a downloader middleware

Stack Overflow user
Asked 2016-06-04 03:32:53
Answers: 1 · Views: 525 · Followers: 0 · Votes: 0

I'm new to Scrapy and am trying to build my own downloader middleware so that I can scrape through a proxy. I get this error:

Traceback (most recent call last):
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/twisted/internet/defer.py", line 1128, in _inlineCallbacks
    result = g.send(result)
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/crawler.py", line 90, in crawl
    six.reraise(*exc_info)
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/crawler.py", line 72, in crawl
    self.engine = self._create_engine()
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/crawler.py", line 97, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/core/engine.py", line 68, in __init__
    self.downloader = downloader_cls(crawler)
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/core/downloader/__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/middleware.py", line 58, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/utils/misc.py", line 44, in load_object
    mod = import_module(module)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named downloaders.downloader_middlewares.proxy_connect

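The error means Scrapy cannot find my middleware. I'm not sure whether that is because I set the path incorrectly or because something is wrong in the middleware itself.

This is my project structure: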

/chisel
    __init__.py
    pipelines.py
    items.py
    settings.py
    /downloaders
        __init__.py
        /downloader_middlewares
            __init__.py
        proxy_connect.py
        /resources
          config.json
    /spiders
        __init__.py
        craiglist_spider.py
        /spider_middlewares
            __init__.py
        /resources
          craigslist.json
scrapy.cfg

In my settings.py:

DOWNLOADER_MIDDLEWARES = {
    'downloaders.downloader_middlewares.proxy_connect.ProxyConnect': 100,
    'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110
}

1 Answer

Stack Overflow user

Accepted answer

Posted 2016-06-04 04:43:50

According to the documentation, the path should include the project name (for example, 'myproject.middlewares.CustomDownloaderMiddleware'). In your case, I think it should be:

'chisel.downloaders.downloader_middlewares.proxy_connect.ProxyConnect': 100
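To see why the project prefix matters, here is a minimal sketch that rebuilds the package layout in a throwaway directory and tries both dotted paths with `importlib.import_module`, which is the call that raises at the bottom of the traceback in the question. It makes some assumptions: it is written for Python 3 (the question uses Python 2.7), it assumes proxy_connect.py sits inside downloader_middlewares as the dotted path in the settings implies, and the helpers `make_package` and `can_import` are hypothetical names made up for this illustration.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, dotted_path):
    """Create nested packages (each with an __init__.py) under root.

    Returns the directory of the innermost package.
    """
    current = root
    for part in dotted_path.split("."):
        current = os.path.join(current, part)
        os.makedirs(current, exist_ok=True)
        # An empty __init__.py marks the directory as a package.
        open(os.path.join(current, "__init__.py"), "a").close()
    return current


def can_import(module_path):
    """Return True if the dotted module path can be imported."""
    try:
        importlib.import_module(module_path)
        return True
    except ImportError:
        return False


# Throwaway copy of the layout (assumption: proxy_connect.py lives in
# chisel/downloaders/downloader_middlewares/).
project_root = tempfile.mkdtemp()
package_dir = make_package(project_root, "chisel.downloaders.downloader_middlewares")
with open(os.path.join(package_dir, "proxy_connect.py"), "w") as handle:
    handle.write("class ProxyConnect(object):\n    pass\n")

# Only the directory that CONTAINS the chisel package goes on sys.path,
# mirroring a run started from the folder that holds scrapy.cfg.
sys.path.insert(0, project_root)
importlib.invalidate_caches()

short_path = "downloaders.downloader_middlewares.proxy_connect"
full_path = "chisel.downloaders.downloader_middlewares.proxy_connect"

print(short_path, "->", can_import(short_path))
print(full_path, "->", can_import(full_path))
```

Only the directory containing the chisel package is on `sys.path`, so `downloaders` is not importable as a top-level name and the dotted path has to start at `chisel`.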
Votes: 1

Original page content provided by Stack Overflow; translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/37626063
