
Converting Amazon data to CSV format in Python

Stack Overflow user
Asked 2014-04-28 10:12:39
Answers: 2 | Views: 1.4K | Followers: 0 | Votes: 0

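In short, I asked a similar question before, but it has not been resolved.

I have an Amazon review dataset and want to convert it to CSV format in Python. The raw data I have looks like this: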

product/productId: B00032K32A
product/title: Cruiser Accessories 21330 Slim Rim, Chrome License Plate Frame
product/price: 4.99
review/userId: A2O41UFL8HAQWV
review/profileName: Nick Nefsik
review/helpfulness: 4/4
review/score: 5.0
review/time: 1239667200
review/summary: It's slim, alright!
review/text: Similar to another review, I also found that this frame is more of a overlay to a   license plate (sits on top of the plate), as opposed to securing the plate underneath it, if that makes sense.It *just* covers the edges of my AZ plate, which is fine, but I sure wouldn't want it to be any smaller around its outside perimeter. I also ordered the chrome covers for the screws (Cruiser Accessories 82030 Screw Covers, Chrome) that I was already using, and, altogether, it looks great, and is exactly the look I was going for.

product/productId: B00032K32A
product/title: Cruiser Accessories 21330 Slim Rim, Chrome License Plate Frame
product/price: 4.99
review/userId: A3V7H58BH72AYT
review/profileName: Illustratedman
review/helpfulness: 6/7
review/score: 5.0
review/time: 1199145600
review/summary: Nice...
review/text: I first purchased these for my new 2008 Honda Accord EX-L to complement the chrome on the car and though they looked nice I eventually ordered the 20130 version of the Cruiser chrome frame for the wider border.

The result should look like this:

product/productId, product/title, product/price, review/userId, review/profileName, review/helpfulness, review/score, review/time, review/summary, review/text
B00032K32A, Cruiser Accessories 21330 Slim Rim, Chrome License Plate Frame, 4.99, A2O41UFL8HAQWV, Nick Nefsik, 4/4, 5.0, 1239667200, It's slim, alright!, Similar to another review, I also found that this frame is more of a overlay to a   license plate (sits on top of the plate), as opposed to securing the plate underneath it, if that makes sense.It *just* covers the edges of my AZ plate, which is fine, but I sure wouldn't want it to be any smaller around its outside perimeter. I also ordered the chrome covers for the screws (Cruiser Accessories 82030 Screw Covers, Chrome) that I was already using, and, altogether, it looks great, and is exactly the look I was going for.
B00032K32A, Cruiser Accessories 21330 Slim Rim, Chrome License Plate Frame, 4.99, A3V7H58BH72AYT, Illustratedman, 6/7, 5.0, 1199145600, Nice..., I first purchased these for my new 2008 Honda Accord EX-L to complement the chrome on the car and though they looked nice I eventually ordered the 20130 version of the Cruiser chrome frame for the wider border.

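I have quite a lot of data (over 300 MB) in the same format as above, so I want to write the result to a file rather than print it.

I am new to Python and have tried several different approaches without success. Does anyone have a good idea for converting the raw data into CSV format?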
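
One detail of the desired output is worth checking before writing any converter: the title and summary fields contain commas, so joining values with plain commas gives rows that do not parse back into the original fields. The sketch below uses three values from the first sample record to compare a plain join with Python's standard `csv` module, which quotes such fields. The variable names are illustrative only.

```python
import csv
import io

# Three values from the first sample record; the title contains a comma.
row = [
    "B00032K32A",
    "Cruiser Accessories 21330 Slim Rim, Chrome License Plate Frame",
    "4.99",
]

# Plain join: the comma inside the title looks like a field separator.
naive = ",".join(row)

# csv.writer wraps the title in double quotes so it stays one field.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
quoted = buffer.getvalue()

# Read both versions back and compare the number of columns.
naive_columns = next(csv.reader([naive]))
quoted_columns = next(csv.reader(io.StringIO(quoted)))

print(len(naive_columns))   # 4
print(len(quoted_columns))  # 3
```

With quoting, the file will not look exactly like the sample output above (the title appears inside double quotes), but spreadsheet programs and CSV parsers should read it back as the intended columns.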


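2 Answers

Stack Overflow user

Answered 2014-04-28 10:36:24

I'm not sure why the solution provided for your earlier question did not work, but here is another example of the same kind of code. Note that you should change INPUT_FILE_NAME and OUTPUT_FILE_NAME to suit your needs.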

import csv

INPUT_FILE_NAME = "Input.txt"
OUTPUT_FILE_NAME = "Output.csv"

header = [
    "product/productId",
    "product/title",
    "product/price",
    "review/userId",
    "review/profileName",
    "review/helpfulness",
    "review/score",
    "review/time",
    "review/summary",
    "review/text"]

# Python 3: newline="" lets the csv module control the line endings
with open(INPUT_FILE_NAME) as f, open(OUTPUT_FILE_NAME, "w", newline="") as outfile:
    # csv.writer quotes fields that contain commas (titles, review text)
    writer = csv.writer(outfile)

    # Write header
    writer.writerow(header)

    currentLine = []
    for line in f:
        line = line.strip()
        if line == "":
            # A blank line ends the current record
            if currentLine:
                writer.writerow(currentLine)
            currentLine = []
            continue
        parts = line.split(":", 1)
        # Keep only the value and drop the space after the colon
        currentLine.append(parts[1].strip())

    # Write the last record if the file does not end with a blank line
    if currentLine:
        writer.writerow(currentLine)
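
The loop above places values by position, so a record with a missing field would shift every later column. As a rough sketch of one alternative (not the answerer's method), the values can be keyed by field name and written with `csv.DictWriter` from the standard library. The helper names `read_records` and `convert` are made up for this example, and it assumes every field fits on one line; lines that do not start with a known field name are skipped.

```python
import csv
import io

FIELDS = [
    "product/productId", "product/title", "product/price",
    "review/userId", "review/profileName", "review/helpfulness",
    "review/score", "review/time", "review/summary", "review/text",
]


def read_records(lines):
    """Yield one dict per record; records are separated by blank lines."""
    record = {}
    for line in lines:
        line = line.strip()
        if not line:
            if record:
                yield record
                record = {}
            continue
        # Split on the first colon only; review text may contain colons.
        key, sep, value = line.partition(":")
        if sep and key in FIELDS:
            record[key] = value.strip()
    if record:  # last record when the input has no trailing blank line
        yield record


def convert(lines, out):
    """Write the records to out as CSV, one row at a time; return the row count."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval="")
    writer.writeheader()
    count = 0
    for record in read_records(lines):
        writer.writerow(record)  # missing fields are written as empty cells
        count += 1
    return count


# Small in-memory demonstration using part of the sample data.
sample = """product/productId: B00032K32A
product/title: Cruiser Accessories 21330 Slim Rim, Chrome License Plate Frame
review/summary: It's slim, alright!

product/productId: B00032K32A
review/summary: Nice...
"""
out = io.StringIO()
print(convert(io.StringIO(sample), out))  # 2
print(out.getvalue())

# For real files (file names as in the answer; the encoding is a guess):
# with open("Input.txt", encoding="utf-8") as src, \
#         open("Output.csv", "w", newline="", encoding="utf-8") as dst:
#     convert(src, dst)
```

Because the input is read line by line and each row is written as soon as its record is complete, memory use should stay small even for the 300 MB file mentioned in the question.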
Votes: 0

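Stack Overflow user

Answered 2014-04-28 10:42:41

Here is a simple solution. Create a class to represent a record in the data file. Then iterate over each line of the data file, mapping each line to an attribute on your record object. Then call a method on the object to format the record in the desired CSV format.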

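class Record:
    """One review record; only two fields are mapped in this example."""

    def __init__(self):
        self.productId = ''
        self.title = ''

    def toCSV(self):
        # Quote the title because it can contain commas
        return self.productId + ',"' + self.title.replace('"', '""') + '"\n'


def value(line):
    # Return the text after the first colon, without surrounding whitespace
    return line.split(':', 1)[1].strip()


csv_text = ''  # not named csv, to avoid shadowing the standard csv module
record = Record()

with open('amazon.txt') as file:
    for line in file:
        if line.strip() == '':
            # A blank line ends the current record
            if record.productId:
                csv_text += record.toCSV()
            record = Record()
        elif line.startswith('product/productId:'):
            record.productId = value(line)
        elif line.startswith('product/title:'):
            record.title = value(line)
        # add conditions for other fields in the datafile

# Emit the last record if the file does not end with a blank line
if record.productId:
    csv_text += record.toCSV()

print(csv_text)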
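
The class-based idea can be extended to all ten fields without writing one condition per field. The following is only a sketch under a few assumptions: it uses a `dataclass` whose attribute names are the part of each key after the slash (so `review/score` maps to `score`), the method `set_from_line` and the function `records_to_csv` are hypothetical names, and the header row uses the short attribute names rather than the full keys from the question.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class Record:
    productId: str = ""
    title: str = ""
    price: str = ""
    userId: str = ""
    profileName: str = ""
    helpfulness: str = ""
    score: str = ""
    time: str = ""
    summary: str = ""
    text: str = ""

    def set_from_line(self, line):
        """Store 'prefix/name: value' on the attribute called name."""
        key, sep, value = line.partition(":")
        name = key.rsplit("/", 1)[-1].strip()
        if sep and name in ATTRIBUTES:
            setattr(self, name, value.strip())
            return True
        return False  # not a recognised field line


# Attribute names in declaration order; also used as the CSV header.
ATTRIBUTES = [f.name for f in fields(Record)]


def records_to_csv(lines, out):
    """Map each line onto a Record and write one CSV row per record."""
    writer = csv.writer(out)
    writer.writerow(ATTRIBUTES)
    record, seen = Record(), False
    for line in lines:
        if not line.strip():
            # A blank line ends the current record
            if seen:
                writer.writerow(astuple(record))
            record, seen = Record(), False
        elif record.set_from_line(line):
            seen = True
    if seen:  # last record when the input has no trailing blank line
        writer.writerow(astuple(record))


# Small in-memory demonstration using part of the sample data.
sample = """product/productId: B00032K32A
product/title: Cruiser Accessories 21330 Slim Rim, Chrome License Plate Frame
review/score: 5.0

product/productId: B00032K32A
review/summary: Nice...
"""
out = io.StringIO()
records_to_csv(io.StringIO(sample), out)
print(out.getvalue())
```

To write to disk instead of printing, pass a file opened with `open(path, "w", newline="")` as `out`; rows are then written one at a time rather than accumulated in a single string.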
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/23331480
