Q: Convert a Chinese ascii string to a Chinese string

Stack Overflow user

Asked 2016-03-03 04:51:59

1 answer, 1.3K views, 0 followers, 1 vote

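I tried using the sys module to set the default encoding in order to convert the string, but it didn't work.

The string is:

`\xd2\xe6\xc3\xf1\xba\xcb\xd0\xc4\xd4\xf6\xb3\xa4\xbb\xec\xba\xcf`

In Chinese it means 益民核心增长混合. But how do I convert it into a Chinese string?

I tried this: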

>>> string = '\xd2\xe6\xc3\xf1\xba\xcb\xd0\xc4\xd4\xf6\xb3\xa4\xbb\xec\xba\xcf'
>>> print string.decode("gbk")
益民核心增长混合  # as you can see, this prints the right answer
>>> new_str = string.decode("gbk")
>>> new_str
u'\u76ca\u6c11\u6838\u5fc3\u589e\u957f\u6df7\u5408' # this looks like a different encoding
>>> another = u"益民核心增长混合"
>>> another
u'\u76ca\u6c11\u6838\u5fc3\u589e\u957f\u6df7\u5408' # same as new_str

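So I'm confused by this: why does print string.decode("gbk") show the Chinese text, while evaluating new_str in my Python console shows what looks like a different encoding?

My OS is Windows 10 and my Python version is 2.7. Thanks a lot!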

python

encoding

ascii

windows-10

1 Answer

Stack Overflow user

Answered 2016-03-03 05:03:37

You are doing it correctly.

In this case, new_str is actually a unicode string, as indicated by the u prefix.

>>> new_str
u'\u76ca\u6c11\u6838\u5fc3\u589e\u957f\u6df7\u5408' # this looks like a different encoding

When you decode a GBK-encoded string, you get a unicode string. Each character of that string is a unicode code point.

>>> u'\u76ca'
u'\u76ca'
>>> print u'\u76ca'
益
>>> import unicodedata
>>> unicodedata.name(u'\u76ca')
'CJK UNIFIED IDEOGRAPH-76CA'

>>> print new_str
益民核心增长混合
>>> print repr(new_str)
u'\u76ca\u6c11\u6838\u5fc3\u589e\u957f\u6df7\u5408'
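
The difference between the GBK bytes and the decoded code points can be made concrete with a small sketch. It is written so that it should run on Python 2.7 and on Python 3; the variable names `raw`, `text`, and `code_points` are illustrative and not part of the original question.

```python
from __future__ import print_function

# GBK-encoded bytes from the question: 16 bytes, two per Chinese character.
raw = b'\xd2\xe6\xc3\xf1\xba\xcb\xd0\xc4\xd4\xf6\xb3\xa4\xbb\xec\xba\xcf'

# Decoding turns the bytes into text (a unicode object on Python 2).
text = raw.decode('gbk')

# One code point per character, written the way the \uXXXX escapes spell them.
code_points = ['U+%04X' % ord(ch) for ch in text]

print(len(raw), len(text))  # byte count versus character count
print(code_points)
```

The byte string has 16 elements while the decoded text has 8, and each of those 8 characters corresponds to one of the `\uXXXX` escapes shown by the interpreter.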

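This is simply how Python displays unicode strings in the interpreter: it uses repr to show them. When you print the string, however, Python converts it to the encoding of your terminal (sys.stdout.encoding), which is why the string is displayed as you expect.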
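
As a rough sketch of those two display paths, the snippet below builds the escaped form that the Python 2 prompt echoes and the bytes that print would send to the terminal. The names `escaped`, `terminal_encoding`, and `sent_to_terminal` are illustrative, the fallback to UTF-8 when no terminal encoding is available is an assumption of this sketch, and the `'replace'` error handler is used only so the example cannot fail on a terminal that lacks these characters.

```python
from __future__ import print_function
import sys

new_str = b'\xd2\xe6\xc3\xf1\xba\xcb\xd0\xc4\xd4\xf6\xb3\xa4\xbb\xec\xba\xcf'.decode('gbk')

# Path 1: the escaped, ASCII-only spelling of the text.
# On Python 2, repr(new_str) is this text wrapped in u'...'.
escaped = new_str.encode('unicode_escape').decode('ascii')
if sys.version_info[0] == 2:
    assert repr(new_str) == "u'" + escaped + "'"

# Path 2: roughly what print does, encoding the text for the terminal.
# sys.stdout.encoding can be None (for example when output is piped),
# so this sketch falls back to UTF-8 in that case.
terminal_encoding = getattr(sys.stdout, 'encoding', None) or 'utf-8'
sent_to_terminal = new_str.encode(terminal_encoding, 'replace')

print(escaped)            # the backslash escapes, as plain ASCII
print(terminal_encoding)  # e.g. a code page name on a Windows console
```

Both values are derived from the same `new_str`; only the way it is rendered differs.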

So it is not a different encoding of the string; it is just the way Python displays the string in the interpreter.
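
One way to check this is to compare the decoded value with the literal from the question and then encode it back to bytes. This is a minimal sketch that should run on Python 2.7 and Python 3; `gbk_bytes`, `text`, `literal`, and `utf8_bytes` are illustrative names, and UTF-8 is used only as an example of a second encoding.

```python
from __future__ import print_function

gbk_bytes = b'\xd2\xe6\xc3\xf1\xba\xcb\xd0\xc4\xd4\xf6\xb3\xa4\xbb\xec\xba\xcf'
text = gbk_bytes.decode('gbk')

# The decoded value equals the unicode literal typed in the question.
literal = u'\u76ca\u6c11\u6838\u5fc3\u589e\u957f\u6df7\u5408'
same_text = (text == literal)

# An encoding only comes into play when the text is turned back into bytes.
round_trip = text.encode('gbk')     # the original 16 bytes again
utf8_bytes = text.encode('utf-8')   # 24 bytes: three per character

print(same_text, round_trip == gbk_bytes, len(utf8_bytes))
```

The text itself is the same object in every case; GBK and UTF-8 are two different byte representations of it.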

Votes: 1

The original content of this page is provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/35763467
