I've run into a problem with my JSON:
The first problem was SyntaxError: Non-ASCII character '\xe2' in file, so I added # -*- coding: utf-8 -*- at the top of the file.
Then the problem became loading the JSON with x = json.loads(x): ValueError: Expecting , delimiter: line 3 column 52 (char 57). I referred to this stackoverflow solution, so I added an r in front of the JSON:
x = r"""[
{ my validated json... }
]"""
But then I got a TypeError: sequence item 3: expected string or Unicode, NoneType found. I think the r is what is throwing it off?
The JSON looks like the following:
[
{
"brief": "Brief 1",
"description": "Description 1",
"photos": [
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example.jpg?0101010101010",
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example2.jpg?0101010101010",
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example3.jpg?0101010101010"
],
"price": "145",
"tags": [
"tag1",
"tag2",
"tag3"
],
"title": "Title 1"
},
{
"brief": "Brief 2",
"description": "Description 2",
"photos": [
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example4.jpg?0101010101010",
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example5.jpg?0101010101010"
],
"price": "150",
"tags": [
"tag4",
"tag5",
"tag6",
"tag7",
"tag8"
],
"title": "Title 2"
},{
"brief": "blah blah 5'0\" to 5'4\"",
"buyerPickup": true,
"condition": "Good",
"coverShipping": false,
"description": "blah blah 5'0\" to 5'4\". blah blah.Size L/20”\n 5’8-5’11\n29lbs\n3x7 speed\n\n \r\n\r\n",
"photos": [
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-010101.jpeg?11111",
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-020202?111111"
],
"price": "240",
"tags": [
"tag2",
"5'0\"-5'4\""
],
"title": "blah blah 17\" Frame",
"front": "https://firebasestorage.googleapis.com/v0/b/example.appspot.com/o/Images%2F0007891113.jpg?alt=media&token=111-11-11-11-111"
}
]
Current code:
# -*- coding: utf-8 -*-
import csv
import json
x = """[
{
"brief": "Brief 1",
"description": "Description 1",
"photos": [
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example.jpg?0101010101010",
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example2.jpg?0101010101010",
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example3.jpg?0101010101010"
],
"price": "145",
"tags": [
"tag1",
"tag2",
"tag3"
],
"title": "Title 1"
},
{
"brief": "Brief 2",
"description": "Description 2",
"photos": [
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example4.jpg?0101010101010",
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example5.jpg?0101010101010"
],
"price": "150",
"tags": [
"tag4",
"tag5",
"tag6",
"tag7",
"tag8"
],
"title": "Title 2"
},{
"brief": "blah blah 5'0\" to 5'4\"",
"buyerPickup": true,
"condition": "Good",
"coverShipping": false,
"description": "blah blah 5'0\" to 5'4\". blah blah.Size L/20”\n 5’8-5’11\n29lbs\n3x7 speed\n\n \r\n\r\n",
"photos": [
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-010101.jpeg?11111",
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-020202?111111"
],
"price": "240",
"tags": [
"tag2",
"5'0\"-5'4\""
],
"title": "blah blah 17\" Frame",
"front": "https://firebasestorage.googleapis.com/v0/b/example.appspot.com/o/Images%2F0007891113.jpg?alt=media&token=111-11-11-11-111"
}
]"""
x = json.loads(x)
f = csv.writer(open("example.csv", "wb+"))
f.writerow(["Handle","Title","Body (HTML)", "Vendor","Type","Tags","Published","Option1 Name","Option1 Value","Variant Inventory Qty","Variant Inventory Policy","Variant Fulfillment Service","Variant Price","Variant Requires Shipping","Variant Taxable","Image Src"])
for x in x:
    allTags = "\"" + ','.join(x["tags"]) + "\""
    images = x["photos"]
    f.writerow([x["title"],
                x["title"],
                x["description"],
                "Vendor Name",
                "Widget",
                allTags,
                "TRUE",
                "Title",
                "Default Title",
                "1",
                "deny",
                "manual",
                x["price"],
                "TRUE",
                "TRUE",
                images.pop(0) if images else None])
    while images:
        f.writerow([x["title"],None,None,None,None,None,None,None,None,None,None,None,None,None,None,images.pop(0)])
Error message. The full traceback I'm seeing:
Traceback (most recent call last):
  File "runnit2.py", line 976, in <module>
    allTags = "\"" + ','.join(x["tags"]) + "\""
TypeError: sequence item 3: expected string or Unicode, NoneType found
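Update: I found that the data, specifically [x["title"], x["title"], x["description"], contains some characters the code doesn't like: 'ascii' codec can't encode character u'\u201d' in position 9: ordinal not in range(128). I did a quick fix with x["description"].encode('utf-8'), but that basically wiped out everything in that cell. Is there a better way that doesn't remove all the offending characters?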
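Answered 2018-07-10 08:56:14
Based on the sample data you posted, I assume that the object at index 1 of the posted JSON has a null at index 3 of its tags value, i.e. where tag7 is: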
"tags": [
"tag4",
"tag5",
"tag6",
"tag7",
"tag8"
],
To get rid of the TypeError raised by the null, you just need to check for null values and replace any that exist, as shown below.
x["tags"] = ["" if i is None else i for i in x["tags"]]
allTags = "\"" + ','.join(x["tags"]) + "\""
Here I have assigned an empty string to replace each null value.
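A quick check of this replacement on a made-up tag list may help. The list below is an assumption (the posted sample shows tag7 where the real data apparently has null), and the filter(None, ...) variant is included only to show how its output differs:

```python
# Hypothetical tag list with a null (None after json.loads) at index 3.
tags = ["tag4", "tag5", "tag6", None, "tag8"]

# Replacing None with "" keeps the slot, so the joined text has an empty entry.
replaced = ["" if t is None else t for t in tags]
all_tags_replaced = '"' + ",".join(replaced) + '"'

# filter(None, ...) drops every falsy item: None, but also "" if present.
all_tags_filtered = '"' + ",".join(filter(None, tags)) + '"'

print(all_tags_replaced)  # "tag4,tag5,tag6,,tag8"
print(all_tags_filtered)  # "tag4,tag5,tag6,tag8"
```

Which one fits depends on whether an empty tag between two commas is acceptable in the target CSV column.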
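Alternatively, you can remove all falsy elements by passing None as the function to filter().
allTags = "\"" + ','.join(filter(None, x["tags"])) + "\""
Note: add the r prefix to the JSON string (x = r"""[...]""") and fix the indentation issue in the for loop.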
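To illustrate why the r prefix matters, here is a minimal sketch using a single object cut down from the posted data. In a normal string literal Python consumes the backslash of \" itself, so json.loads receives a bare quote that ends the JSON string early. The variable names are only for illustration:

```python
import json

# Normal string: Python turns \" into a bare " before json.loads sees it,
# so the JSON string value ends early and parsing fails.
plain = """{"title": "blah blah 17\" Frame"}"""

# Raw string: the backslash is preserved and the JSON parser handles the escape.
raw = r"""{"title": "blah blah 17\" Frame"}"""

try:
    json.loads(plain)
    plain_ok = True
except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
    plain_ok = False
    print("plain string failed:", err)

parsed = json.loads(raw)
print(parsed["title"])  # blah blah 17" Frame
```

This would explain the original "Expecting , delimiter" error, and it suggests the later TypeError comes from the data rather than from the r prefix.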
Answered 2018-07-11 19:21:11
Use a raw string, and when opening the file set the encoding to utf-8 in normal (non-binary) mode. For Python 3.6 that is sufficient.
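A minimal Python 3 sketch of that advice, assuming a trimmed-down record and a temporary output path (both are stand-ins, not the asker's real data or file). The newline="" argument is not mentioned above; it is what the csv module documentation recommends for files passed to csv.writer:

```python
import csv
import json
import os
import tempfile

# Raw string so that \" and \n reach json.loads unchanged.
raw = r"""[{"title": "blah blah 17\" Frame",
            "description": "Size L/20”\n 5’8-5’11",
            "tags": ["tag2", "5'0\"-5'4\""]}]"""
items = json.loads(raw)

# Hypothetical output location; any writable path would do.
out_path = os.path.join(tempfile.mkdtemp(), "example.csv")

# Text mode with an explicit encoding, so curly quotes are written as UTF-8.
with open(out_path, "w", encoding="utf-8", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow(["Title", "Body (HTML)", "Tags"])
    for item in items:
        writer.writerow([item["title"], item["description"], ",".join(item["tags"])])

# Read the file back to confirm the non-ASCII characters survived.
with open(out_path, encoding="utf-8", newline="") as fh:
    rows = list(csv.reader(fh))
print(rows[1])
```

No per-cell .encode('utf-8') call is needed here, because the file object does the encoding.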
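On Python 2.7, when working with unicode content, you should use codecs.open('example.csv', 'w', encoding='utf-8') instead of the regular open(). Also, the csv module on Python 2.7 does not support unicode out of the box, so I suggest switching to unicodecsv or following the guidelines in this answer.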
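Answered 2018-07-09 09:26:34
Open the file with w instead; if you must use wb, use the functions below. You need to add r in front of the whole text to handle the special symbols.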
import csv
import json
x = r"""[
{
"brief": "Brief 1",
"description": "Description 1",
"photos": [
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example.jpg?0101010101010",
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example2.jpg?0101010101010",
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example3.jpg?0101010101010"
],
"price": "145",
"tags": [
"tag1",
"tag2",
"tag3"
],
"title": "Title 1"
},
{
"brief": "Brief 2",
"description": "Description 2",
"photos": [
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example4.jpg?0101010101010",
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example5.jpg?0101010101010"
],
"price": "150",
"tags": [
"tag4",
"tag5",
"tag6",
"tag7",
"tag8"
],
"title": "Title 2"
},{
"brief": "blah blah 5'0\" to 5'4\"",
"buyerPickup": true,
"condition": "Good",
"coverShipping": false,
"description": "blah blah 5'0\" to 5'4\". blah blah.Size L/20”\n 5’8-5’11\n29lbs\n3x7 speed\n\n \r\n\r\n",
"photos": [
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-010101.jpeg?11111",
"https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-020202?111111"
],
"price": "240",
"tags": [
"tag2",
"5'0\"-5'4\""
],
"title": "blah blah 17\" Frame",
"front": "https://firebasestorage.googleapis.com/v0/b/example.appspot.com/o/Images%2F0007891113.jpg?alt=media&token=111-11-11-11-111"
}
]"""
x = json.loads(x)
def to_str(bytes_or_str):
    # Return text: decode bytes as UTF-8, pass str through unchanged.
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value
def to_bytes(bytes_or_str):
    # Return bytes: encode str as UTF-8, pass bytes through unchanged.
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value
f = csv.writer(open("example.csv", "w+"))
writeList = ["Handle", "Title", "Body (HTML)", "Vendor", "Type", "Tags", "Published", "Option1 Name", "Option1 Value",
"Variant Inventory Qty", "Variant Inventory Policy", "Variant Fulfillment Service", "Variant Price",
"Variant Requires Shipping", "Variant Taxable", "Image Src"]
newList = []
for item in writeList:
    newList.append(to_bytes(item))
f.writerow(newList)
for x in x:
    allTags = r"\"" + ','.join(x["tags"]) + r"\""
    images = x["photos"]
    f.writerow([x["title"],
                x["title"],
                x["description"],
                "Vendor Name",
                "Widget",
                allTags,
                "TRUE",
                "Title",
                "Default Title",
                "1",
                "deny",
                "manual",
                x["price"],
                "TRUE",
                "TRUE",
                images.pop(0) if images else None])
    while images:
        f.writerow([x["title"], None, None, None, None, None, None, None, None, None, None, None, None, None, None,
                    images.pop(0)])
https://stackoverflow.com/questions/51216219