每当我在start_urls变量中使用captions and transcription的链接时,它都会在标题和转录变量中给出caption的价格,并再次在这两个变量中给出transcription的价格。为什么以及如何解决这个问题?
import scrapy
from .. items import FetchingItem
class SiteFetching(scrapy.Spider):
name = 'Site'
start_urls = ['https://www.rev.com/freelancers/captions',
'https://www.rev.com/freelancers/transcription']
def parse(self, response):
items = FetchingItem()
Transcription_price = response.css('#middle-benefit .mt1::text').extract()
Caption_price = response.css('#middle-benefit .mt1::text').extract()
items['Transcription_price'] = Transcription_price
items['Caption_price'] = Caption_price
yield items发布于 2019-05-08 13:54:50
我怀疑您需要另一种类结构,顺序:
import scrapy
from .. items import FetchingItem
class SiteFetching(scrapy.Spider):
name = 'Site'
start_urls = ['https://www.rev.com/freelancers/captions']
def parse(self, response):
items = FetchingItem()
items['Caption_price'] = response.css('#middle-benefit .mt1::text').extract()
yield Request('https://www.rev.com/freelancers/transcription', self.parse_transcription, meta={'items': items})
def parse_transcription(self, response):
items = response.meta['items']
items['Transcription_price'] = response.css('#middle-benefit .mt1::text').extract()
yield itemshttps://stackoverflow.com/questions/56030966
复制相似问题