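I am trying to save some data to an Excel sheet and get a Unicode decoding error for one particular piece of data, which has the following form:
work_info['title'] = Darimān-i afsaradgī : rāhnamā-yi kāmil bira-yi hamah-ʼi khānvādahʹhā

The code that causes the error is:
data.write(b + book + accumulated_books + 2, 43, work_info['title'])
wb.save('/Users/apple/Downloads/WC Scrape_trialfortwo.csv')

The error is:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 5: ordinal not in range(128)

I have tried several different encoding/decoding techniques, but none has worked so far. Any suggestions would be greatly appreciated.
Thanks!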

Answered 2017-12-19 10:33:52
It looks like you are using Python 2, and Python 2's unicode/bytes handling is the cause of the problem.
>>> from xlwt import Workbook
>>> s = 'Darimān-i afsaradgī : rāhnamā-yi kāmil bira-yi hamah-ʼi khānvādahʹhā'
>>> wb = Workbook()
>>> ws = wb.add_sheet('test')
>>> ws.write(1, 0, s)
>>> wb.save('test.xls')
Traceback (most recent call last):
...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 5: ordinal not in range(128)

xlwt assumes that s is an ASCII-encoded byte string and tries to decode it to unicode, which fails:
>>> s.decode('ascii')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 5: ordinal not in range(128)

In fact, s is encoded as UTF-8:
>>> s.decode('utf-8')
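u'Darim\u0101n-i afsaradg\u012b : r\u0101hnam\u0101-yi k\u0101mil bira-yi hamah-\u02bci kh\u0101nv\u0101dah\u02b9h\u0101'

The simplest solution is probably to create the workbook with encoding='utf-8', so that byte strings are treated as UTF-8 rather than ASCII: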
>>> wb = Workbook(encoding='utf-8')
>>> ws = wb.add_sheet('test')
>>> ws.write(1, 0, s)
>>> wb.save('test.xls')

If you need a more fine-grained approach, you can explicitly decode the string to unicode before writing it to the sheet:
>>> wb = Workbook()
>>> ws = wb.add_sheet('test')
>>> ws.write(1, 0, s.decode('utf-8'))
>>> wb.save('test.xls')

https://stackoverflow.com/questions/47876570
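
If titles arrive sometimes as byte strings and sometimes as unicode, the explicit-decode approach can be wrapped in a small helper. The following is only a sketch: `to_text` is a hypothetical name (it is not part of xlwt), and it assumes the incoming bytes really are UTF-8, as they are for the title in the question. It also shows where "position 5" in the error message comes from.

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding byte strings with `encoding`.

    Hypothetical helper for illustration; assumes byte strings are UTF-8.
    """
    # On Python 2, `bytes` is an alias of `str`, so this check covers both
    # Python 2 byte strings and Python 3 bytes objects.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# UTF-8 bytes for the beginning of the title from the question.
raw_title = b"Darim\xc4\x81n-i afsaradg\xc4\xab"

try:
    raw_title.decode("ascii")
except UnicodeDecodeError as exc:
    # exc.start is the offset of the first byte that ASCII cannot decode.
    failing_position = exc.start

title = to_text(raw_title)
print(failing_position)
print(repr(title))
```

With such a helper, the write call from the question could become something like `data.write(row, 43, to_text(work_info['title']))`, where `row` stands in for the row expression used there. Values that are already unicode pass through unchanged.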