I want to set the following in my parent spider class, since it should be the same for every child spider. How can I do that?
scrapy crawl spiderX -a full >> FEED_URI = /xx/spiderX_full
scrapy crawl spiderX -a quick >> FEED_URI = /xx/spiderX_quick
This is what I have so far:
    @classmethod
    def update_settings(cls, settings):
        # Assumes `from os import path` at module level and a FEED_DIR setting
        settings_dict = cls.custom_settings or {}
        feed_uri = path.join(settings.get('FEED_DIR'), '%s' % cls.name)
        settings_dict['FEED_URI'] = feed_uri
        settings.setdict(settings_dict, priority='spider')
How can I access the quick/full arguments from this function? I tried this:
    def __new__(cls, full=False, quick=False, *a, **kw):
        cls.full = full
        cls.quick = quick
        return super(MyCrawlSpider, cls).__new__(cls, *a, **kw)
But apparently update_settings runs before this, so the arguments are not yet available there.
Posted on 2015-10-05 10:03:25
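Try using the -s option, which sets a setting from the command line and takes precedence over the spider-level value: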
scrapy crawl spiderX -s FEED_URI=s3://mybucket/path/to/export.csv
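If the goal is to keep the setting in the parent class, one possible route is a URI template. Scrapy's feed export documentation describes %(name)s and other %(attr)s placeholders in the feed URI as being filled from the spider's attributes when the feed is created, and arguments passed as -a NAME=VALUE become spider attributes by default. Below is a minimal sketch that only mimics that substitution with plain Python string formatting; it does not import Scrapy. The argument name mode, the FakeSpider class, and the expand_feed_uri helper are hypothetical names chosen for illustration.

```python
import posixpath

# Hypothetical template; the /xx directory comes from the question.
# In a real project this string would be the FEED_URI value.
FEED_URI_TEMPLATE = posixpath.join("/xx", "%(name)s_%(mode)s")


class FakeSpider:
    """Stand-in for a spider started with `-a mode=full` (not a Scrapy class)."""

    def __init__(self, name, mode):
        self.name = name
        self.mode = mode


def expand_feed_uri(template, spider):
    # Plain %-formatting over the spider's attributes, imitating the
    # placeholder substitution described in the feed export docs.
    return template % vars(spider)


print(expand_feed_uri(FEED_URI_TEMPLATE, FakeSpider("spiderX", "full")))
print(expand_feed_uri(FEED_URI_TEMPLATE, FakeSpider("spiderX", "quick")))
```

Under these assumptions, the parent class could set FEED_URI to that template in custom_settings and the crawl would be started with scrapy crawl spiderX -a mode=full. Newer Scrapy releases replace FEED_URI with the FEEDS setting, so the exact setting name depends on the installed version.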
https://stackoverflow.com/questions/32913864