Q: Wrong apostrophe when reading from a .txt file

Stack Overflow user

Asked 2021-01-15 12:03:26

2 answers · 492 views · 0 followers · 0 votes

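I have a problem reading lines from a .txt file. My file contains sentences with words like

didn’t, couldn’t, hadn’t

and so on. The problem is that wherever the file has

’

I get something like this:

â€™

So the words I read look like hadnâ€™t instead of hadn’t.

My input: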

Love at First Sight

One <adjective> afternoon, I was walking by the <place> when
accidentally I bumped into a <adjective> boy.
At first I blushed and apologized for bumping into him, but when he flashed his
<adjective> smile I just couldn’t help falling in love. His
<adjective> voice telling me that it was ok sounded like music to my ears.
I could have stayed there staring at him for <period_of_time>.
He had <adjective> <color> eyes and <adjective>
<color> hair. I thought he was perfect for me. Before I noticed,
<number> <period_of_time> had passed by after I apologized,
and I hadn’t said anything else since!
That’s when I noticed that he was looking at me
<adverb>. I didn’t know what to say, so I just <past_verb>.
I noticed him giving me a strange look when he started walking to his
<noun>. I looked back at him <number> more time(s), but he was already out of sight.
It wasn’t love after all

Expected output: the same as the input file.

My code:

f = open('loveatfirstsight.txt','r')
for i in f.readlines():
    print(i)

My OS: Windows 10

python

character-encoding

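2 Answers

Stack Overflow user

Accepted answer

Answered 2021-01-15 12:21:43

The file is encoded in UTF-8, but you are reading it as if it were (I'm guessing) windows-1252 (or some other Windows-specific encoding). The apostrophe character in that file is not the typical ASCII "typewriter apostrophe" (' U+0027 APOSTROPHE) but the "typographer's apostrophe" (’ U+2019 RIGHT SINGLE QUOTATION MARK), which lies outside the Basic Latin ("ASCII") block, so the mismatched encoding garbles the character.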

>>> 'hadn’t'.encode('utf-8').decode('cp1252')
'hadnâ€™t'
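To make the mechanism concrete, here is a small sketch that breaks the apostrophe into its UTF-8 bytes and decodes each byte separately as cp1252. The variable names are illustrative only, and cp1252 is the answer's guess at the encoding actually in use.

```python
# U+2019 RIGHT SINGLE QUOTATION MARK, the apostrophe used in the file
apostrophe = '\u2019'

# UTF-8 stores this one character as three bytes
utf8_bytes = apostrophe.encode('utf-8')
print(utf8_bytes)  # b'\xe2\x80\x99'

# cp1252 is a single-byte encoding, so each byte becomes its own character
pieces = [bytes([b]).decode('cp1252') for b in utf8_bytes]
print(pieces)  # ['â', '€', '™']
```

One character turning into exactly three is a typical sign of UTF-8 text being decoded with a single-byte Windows code page.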

To fix this, pass the correct encoding to the open function via its encoding parameter.

f = open('loveatfirstsight.txt', 'r', encoding='utf-8')
for i in f.readlines():
    print(i)
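The effect of the encoding argument can be checked without the original file. The sketch below writes a one-line stand-in file to a temporary directory (the path, the sample line, and the helper read_lines are assumptions made for illustration) and reads it back both ways, using a with block so the file is closed automatically.

```python
import os
import tempfile

# Illustrative stand-in for loveatfirstsight.txt: one UTF-8 line with U+2019
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('and I hadn\u2019t said anything else since!\n')

def read_lines(path, encoding):
    """Return the file's lines, decoded with the given encoding."""
    with open(path, 'r', encoding=encoding) as f:
        return [line.rstrip('\n') for line in f]

good = read_lines(path, 'utf-8')   # matches how the file was written
bad = read_lines(path, 'cp1252')   # simulates the mismatched default

print(good[0])
print(bad[0])
```

Iterating over the file object directly, as read_lines does, yields the same lines as readlines() without building the whole list first.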

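As help(open) explains:

In text mode, if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding. (For reading and writing raw bytes use binary mode and leave encoding unspecified.)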
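To see which encoding applies on a given machine, the function named in that help text can be called directly. This is only a quick check; the result depends on the platform, the regional settings, and the Python version, so no particular value is assumed here.

```python
import locale

# The encoding that open() falls back to in text mode when encoding= is omitted,
# according to the help(open) text quoted above
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)
```

If this prints something other than a UTF-8 name, a UTF-8 file opened without an explicit encoding will be decoded with that other codec.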

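Votes: 4

Stack Overflow user

Answered 2021-01-15 12:11:01

This sounds like an encoding problem. The text file is stored in UTF-8 and contains curly quotes. Either you are reading it with the wrong encoding (probably Latin-1), or you are writing it out somewhere (perhaps the Windows console?) that does not expect UTF-8.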
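One way to tell those two cases apart is to look at the string in an ASCII-only form, which the console cannot distort. A minimal sketch follows, with the helper name describe chosen purely for illustration and cp1252 standing in for whatever wrong codec was used.

```python
def describe(text):
    """Return an ASCII-only representation of text using escape sequences."""
    return ascii(text)

correct = 'hadn\u2019t'                              # decoded properly
mangled = correct.encode('utf-8').decode('cp1252')   # decoded with the wrong codec

print(describe(correct))  # 'hadn\u2019t'
print(describe(mangled))  # 'hadn\xe2\u20ac\u2122t'
```

If a line read from the file shows a single \u2019, the read was fine and the output side is the likelier culprit; three escapes in its place mean the damage happened while reading.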

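You will get better answers if you edit the question to include more detail on exactly how the data is stored, read, and processed, including the system you are on and the Python version you are using.

Votes: 1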

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/65735782
