With a GET request, I pull JSON from the Google geocode API:
import urllib, urllib2
url = "http://maps.googleapis.com/maps/api/geocode/json"
params = {'address': 'ivory coast', 'sensor': 'false'}
request = urllib2.Request(url + "?" + urllib.urlencode(params))
response = urllib2.urlopen(request)
st = response.read()
The result looks like this:
{
"results" : [
{
"address_components" : [
{
"long_name" : "Côte d'Ivoire",
"short_name" : "CI",
"types" : [ "country", "political" ]
}
],
"formatted_address" : "Côte d'Ivoire",
"geometry" : { ... # rest snipped
As you can see, the country name has some encoding issues. I tried to guess the encoding like this:
import chardet
encoding = chardet.detect(st)
print "String is encoded in {0} (with {1}% confidence).".format(encoding['encoding'], encoding['confidence']*100)
It returns:
String is encoded in GB2312 (with 99.0% confidence).
What I want to know is how I can convert this into a dictionary with an encoding in which ô (o with circumflex) is displayed properly.
I tried:
st = st.decode(encoding['encoding']).encode('utf-8')
But what I get is:
{
"results" : [
{
"address_components" : [
{
"long_name" : "C么te d'Ivoire",
"short_name" : "CI",
"types" : [ "country", "political" ]
}
],
"formatted_address" : "C么te d'Ivoire",
"geometry" : { ... # rest snipped
Answered 2012-12-20 02:08:08
Google API results are always encoded in UTF-8; you can even read this off manually from the HTTP Content-Type header of the response.
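As a minimal sketch of that idea, the snippet below reads the charset from a Content-Type header value, decodes the body with it, and only then parses the JSON. It is written for Python 3 (the question uses Python 2), and the names charset_from_content_type, raw_body, and content_type are hypothetical stand-ins rather than anything from the thread; the header value is an assumed example, not a captured response.

```python
import json
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Return the charset parameter of a Content-Type header value.

    Falls back to `default` when the header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset(default)


# Hypothetical stand-ins for response.read() and the Content-Type header.
raw_body = '{"long_name": "Côte d\'Ivoire", "short_name": "CI"}'.encode("utf-8")
content_type = "application/json; charset=UTF-8"

# Decode with the charset the server declares instead of guessing it.
charset = charset_from_content_type(content_type)
data = json.loads(raw_body.decode(charset))
print(data["long_name"])

# Decoding the same bytes as GB2312 reproduces the garbling from the question.
mis_decoded = raw_body.decode("gb2312")
print(mis_decoded)
```

With a live response from urllib.request.urlopen in Python 3, response.headers is itself an email.message.Message subclass, so response.headers.get_content_charset() should give the same value without the helper.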

Answered 2012-12-20 02:14:43
Once you have (correctly) decoded it, do not re-encode it; json works just fine with unicode.
>>> json.loads(u"[\"C\xf4te d'Ivoire\"]")
[u"C\xf4te d'Ivoire"]
https://stackoverflow.com/questions/13958196