文章/答案/技术大牛

发布

社区首页 >问答首页 >导出为CSV格式不正确的刮伤

问导出为CSV格式不正确的刮伤
EN

Stack Overflow用户

提问于 2015-07-17 06:01:32

回答 1查看 1.6K关注 0票数 3

我正在尝试打印一个CSV文件后，使用线划线，但格式有点奇怪，因为它不是打印自上而下，而是打印它在刮完第1页后，然后所有的第2页在一列。我附加了piplines.py和csv输出的一行(相当大)。那么，我应该如何从一页中一次打印一次，而不是一次一次地打印列智慧。

pipline.py

# -*- coding: utf-8 -*-

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html

from scrapy import signals
from scrapy.contrib.exporter import CsvItemExporter

class CSVPipeline(object):

    def __init__(self):
        self.files = {}

    @classmethod
    def from_crawler(cls, crawler):
        pipeline = cls()
        crawler.signals.connect(pipeline.spider_opened, signals.spider_opened)
        crawler.signals.connect(pipeline.spider_closed, signals.spider_closed)
        return pipeline


    def spider_opened(self, spider):
        file = open('%s_items.csv' % spider.name, 'w+b')
        self.files[spider] = file
        self.exporter = CsvItemExporter(file)
        self.exporter.fields_to_export = ['names','stars','subjects','reviews']
        self.exporter.start_exporting()

    def spider_closed(self, spider):
        self.exporter.finish_exporting()
        file = self.files.pop(spider)
        file.close()


    def process_item(self, item, spider):
        self.exporter.export_item(item)
        return item

和output.csv

names   stars   subjects
Vivek0388,NikhilVashisth,DocSharad,Abhimanyu_swarup,Suresh N,kaushalhkapadia,JyotiMallick,Nitin T,mhdMumbai,SunilTukrel(COLUMN 2)   5 of 5 stars,4 of 5 stars,1 of 5 stars,5 of 5 stars,3 of 5 stars,4 of 5 stars,5 of 5 stars,5 of 5 stars,4 of 5 stars,4 of 5 stars(COLUMN 3) Best Stay,Awesome View... Nice Experience!,Highly mismanaged and dishonest.,A Wonderful Experience,Good place with average front office,Honeymoon,Awesome Resort,Amazing,ooty's beauty!!,Good stay and food

应该是这样的

Vivek0388      5 of 5
NikhilVashisth 5 of 5
DocSharad      5 of 5
...so on

编辑：

items = [{'reviews:':"",'subjects:':"",'names:':"",'stars:':""} for k in range(1000)]
if(sites and len(sites) > 0):
    for site in sites:
        i+=1
        items[i]['names'] = item['names']
        items[i]['stars'] = item['stars']
        items[i]['subjects'] = item['subjects']
        items[i]['reviews'] = item['reviews']
        yield Request(url="http://tripadvisor.in" + site, callback=self.parse)
    for k in  range(1000):
        yield items[k]

python

csv

web-scraping

scrapy

export-to-csv

回答 1

Stack Overflow用户

回答已采纳

发布于 2015-07-19 18:05:29

算出来了，csv把它拉链，然后循环它并写行。一旦你读了这些文档，这就不那么复杂了。

import csv
import itertools

class CSVPipeline(object):

   def __init__(self):
      self.csvwriter = csv.writer(open('items.csv', 'wb'), delimiter=',')
      self.csvwriter.writerow(['names','starts','subjects','reviews'])

   def process_item(self, item, ampa):

      rows = zip(item['names'],item['stars'],item['subjects'],item['reviews'])


      for row in rows:
         self.csvwriter.writerow(row)

      return item

票数 3

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/31469263

复制

相似问题

问导出为CSV格式不正确的刮伤
EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问导出为CSV格式不正确的刮伤EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问导出为CSV格式不正确的刮伤
EN