Q: Caching a large non-unicode dictionary in a database?

Stack Overflow user

Asked 2012-06-06 18:35:09

Answers: 2 · Views: 209 · Followers: 0 · Votes: 3

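I have a large dictionary (it outputs as a 366 MB string, a ~383764153-line text file) that I want to store in a database for quick access, and to skip the computation time involved in populating the dictionary.

My dictionary consists of a dictionary of filename/content pairs. A small subset: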

{
    'Reuters/19960916': {
        '54826newsML': '<?xml version="1.0"
encoding="iso-8859-1" ?>\r\n<newsitem itemid="54826" id="root"
date="1996-09-16" xml:lang="en">\r\n<title>USA: RESEARCH ALERT -
Crestar Financial cut.</title>\r\n<headline>RESEARCH ALERT - Crestar
Financial cut.</headline>\r\n<text>\n<p>-- Salomon Brothers analyst
Carole Berger said she cut her rating on Crestar Financial Corp to
hold from buy, at the same time lowering her 1997 earnings per share
view to $5.40 from $5.85.</p>\n<p>-- Crestar said it would buy
Citizens Bancorp in a $774 million stock swap.</p>\n<p>-- Crestar
shares were down 2-1/2 at 58-7/8. Citizens Bancorp soared 14-5/8 to
46-7/8.</p>\n</text>\r\n<copyright>(c) Reuters Limited',
        '55964newsML': '<?xml version="1.0" encoding="iso-8859-1"
?>\r\n<newsitem itemid="55964" id="root" date="1996-09-16"
xml:lang="en">\r\n<title>USA: Nebraska cattle sales thin at
$114/dressed-feedlot.</title>\r\n'
    }
}

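I thought MongoDB would be a good fit, but it seems to require that both keys and values be Unicode, and since I get the filenames from namelist() on ZipFile, they are not guaranteed to be Unicode.

How would you suggest I serialize this dictionary into a database?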

python

mongodb

dictionary

data-dictionary

database

2 Answers

Stack Overflow user

Answered 2012-06-06 18:59:23

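pymongo does not require strings to be unicode. It actually sends ASCII strings as-is and encodes unicode strings to UTF-8. When you retrieve data from pymongo, you always get unicode back. See http://api.mongodb.org/python/2.0/tutorial.html#a-note-on-unicode-strings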
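To make the byte-level claim concrete, here is a minimal sketch in Python 3 syntax (the answer dates from the Python 2 era, where `str` was a byte string). It does not talk to MongoDB at all; the helper `to_utf8` and the accented sample value are hypothetical and only illustrate why ASCII data can pass through unchanged while other text has to be encoded to UTF-8.

```python
def to_utf8(text):
    """Encode a text string to UTF-8 bytes (hypothetical helper)."""
    return text.encode("utf-8")


ascii_name = "54826newsML"   # sample key taken from the question
accented = "caf\u00e9"       # hypothetical non-ASCII sample

# Pure ASCII text has identical bytes in ASCII and in UTF-8.
assert to_utf8(ascii_name) == ascii_name.encode("ascii")

# A non-ASCII character becomes a multi-byte sequence; decoding restores it.
encoded = to_utf8(accented)
assert len(encoded) == 5
assert encoded.decode("utf-8") == accented
```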

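If your input contains "international" byte strings with high-order bytes (such as ab\xC3cd), you need to convert those strings to unicode or encode them as UTF-8. Here is a simple recursive converter that handles arbitrarily nested dicts: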

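def unicode_all(s):
    # Python 2 code: recursively convert dict keys and all values to unicode.
    # Note: unicode() decodes byte strings with the default ASCII codec, so
    # non-ASCII bytes raise UnicodeDecodeError unless decoded explicitly first.
    if isinstance(s, dict):
        return dict((unicode(k), unicode_all(v)) for k, v in s.items())
    if isinstance(s, list):
        return [unicode_all(v) for v in s]
    return unicode(s)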
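For readers on Python 3, where `unicode` no longer exists, the same idea might look roughly like the sketch below. The function name `decode_all` and the `encoding` parameter are not from the answer. The default of `iso-8859-1` is an assumption based on the XML declaration in the question's sample, and filenames coming out of a zip archive may well use a different encoding.

```python
def decode_all(obj, encoding="iso-8859-1"):
    """Recursively decode byte strings in nested dicts and lists.

    The default encoding is an assumption, not something the answer states.
    """
    if isinstance(obj, dict):
        return {decode_all(k, encoding): decode_all(v, encoding)
                for k, v in obj.items()}
    if isinstance(obj, list):
        return [decode_all(v, encoding) for v in obj]
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    return obj  # already text (or some other type): leave unchanged


# The byte string ab\xC3cd is the example from the answer above.
raw = {b"Reuters/19960916": {b"54826newsML": b"ab\xc3cd"}}
clean = decode_all(raw)
assert clean == {"Reuters/19960916": {"54826newsML": "ab\u00c3cd"}}
```

Decoding with an explicit codec avoids relying on any default: every byte value is valid in ISO-8859-1, so this particular choice never raises, although it yields the wrong characters if the data was actually written in another encoding.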

Votes: 5

Stack Overflow user

Answered 2012-06-06 18:40:46

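If you have the memory (and you evidently do, since you populated the dictionary in the first place), use cPickle. Or, if you want something that needs less memory but is slower, use shelve.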
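A minimal sketch of both options, written for Python 3, where `cPickle` has been folded into `pickle` (the C implementation is used automatically when available). The file names, the temporary directory, and the tiny stand-in dictionary are all hypothetical.

```python
import os
import pickle
import shelve
import tempfile

# Tiny stand-in for the real dictionary (hypothetical content).
data = {"Reuters/19960916": {"54826newsML": "<newsitem/>"}}

workdir = tempfile.mkdtemp()

# pickle: the whole dictionary is written and read back in one piece,
# so loading it needs enough memory for the entire structure.
pickle_path = os.path.join(workdir, "cache.pkl")
with open(pickle_path, "wb") as fh:
    pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)
with open(pickle_path, "rb") as fh:
    restored = pickle.load(fh)

# shelve: a persistent dict-like store; each value is unpickled only
# when its key is accessed. Keys must be text strings.
shelf_path = os.path.join(workdir, "cache_shelf")
with shelve.open(shelf_path) as shelf:
    for folder, files in data.items():
        shelf[folder] = files
with shelve.open(shelf_path) as shelf:
    one_folder = shelf["Reuters/19960916"]
```

Storing one shelf entry per top-level folder is just one possible layout; finer-grained keys would reduce how much is loaded per lookup.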

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/10920082
