I'm new to Scrapy and I'm trying to build my own downloader middleware so I can crawl the web through a proxy. I'm getting this error:
Traceback (most recent call last):
File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/twisted/internet/defer.py", line 1128, in _inlineCallbacks
result = g.send(result)
File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/crawler.py", line 90, in crawl
six.reraise(*exc_info)
File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/crawler.py", line 72, in crawl
self.engine = self._create_engine()
File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/crawler.py", line 97, in _create_engine
return ExecutionEngine(self, lambda _: self.stop())
File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/core/engine.py", line 68, in __init__
self.downloader = downloader_cls(crawler)
File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/core/downloader/__init__.py", line 88, in __init__
self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/middleware.py", line 58, in from_crawler
return cls.from_settings(crawler.settings, crawler)
File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/middleware.py", line 34, in from_settings
mwcls = load_object(clspath)
File "/Users/bli1/Development/projects/hinwin/chisel/lib/python2.7/site-packages/scrapy/utils/misc.py", line 44, in load_object
mod = import_module(module)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
ImportError: No module named downloaders.downloader_middlewares.proxy_connect
The error happens because Scrapy can't find my middleware. I'm not sure whether it's because I haven't set the path correctly or because something is wrong with the middleware itself.
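For context, the `load_object` call at the bottom of the traceback just splits the dotted path into a module part and an attribute part, then imports the module. A rough sketch of that behaviour (a simplification, not Scrapy's exact source):

```python
from importlib import import_module

def load_object(path):
    # Split 'pkg.module.Name' into module path and attribute name,
    # mirroring what scrapy.utils.misc.load_object does.
    module_path, _, name = path.rpartition('.')
    # This is the line that raises ImportError when the dotted path
    # in DOWNLOADER_MIDDLEWARES doesn't match a real package layout.
    mod = import_module(module_path)
    return getattr(mod, name)
```

So the string in `DOWNLOADER_MIDDLEWARES` must be importable exactly as written, from wherever Scrapy is running.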
Here is my project structure:
/chisel
    __init__.py
    pipelines.py
    items.py
    settings.py
    /downloaders
        __init__.py
        /downloader_middlewares
            __init__.py
            proxy_connect.py
            /resources
                config.json
    /spiders
        __init__.py
        craiglist_spider.py
        /spider_middlewares
            __init__.py
        /resources
            craigslist.json
scrapy.cfg

In my settings.py I have:
DOWNLOADER_MIDDLEWARES = {
'downloaders.downloader_middlewares.proxy_connect.ProxyConnect': 100,
'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110
}
Posted on 2016-06-04 04:43:50
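For reference, a proxy middleware of this kind only needs a `process_request` hook that sets `request.meta['proxy']`, which the stock `HttpProxyMiddleware` (priority 110 above) then honours. The asker's actual `proxy_connect.py` isn't shown, so this is an assumed minimal sketch, including the hypothetical default proxy URL:

```python
# Hypothetical sketch of proxy_connect.py -- the real file's contents
# are not shown in the question, so this is an assumed implementation.

class ProxyConnect(object):
    """Downloader middleware that routes every request through one proxy."""

    def __init__(self, proxy='http://127.0.0.1:8888'):
        # In practice the proxy URL would come from settings or config.json.
        self.proxy = proxy

    def process_request(self, request, spider):
        # Scrapy's HttpProxyMiddleware picks up request.meta['proxy']
        # and tunnels the request through that address.
        request.meta['proxy'] = self.proxy
        # Returning None lets Scrapy continue to the next middleware.
        return None
```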
According to the documentation, the path should include the project package ('myproject.middlewares.CustomDownloaderMiddleware'), so in your case I think it should be:
'chisel.downloaders.downloader_middlewares.proxy_connect.ProxyConnect': 100
https://stackoverflow.com/questions/37626063
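Putting that together, the corrected setting would look like this (priorities kept as in the question; the class path now starts at the `chisel` package so it mirrors the directory layout on disk):

```python
# settings.py -- middleware paths must be importable dotted paths
# starting at the project package, here 'chisel'.
DOWNLOADER_MIDDLEWARES = {
    'chisel.downloaders.downloader_middlewares.proxy_connect.ProxyConnect': 100,
    'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110,
}
```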