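I'm new to programming in Python. I have a large CSV file (about 5k items). It has two columns, and I need to get counts of the data. The best way to explain what I need is to show you a few rows of the CSV: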
Name column           OPTIONALDATA5 column
Collaborative Desk    Broward
Collaborative Desk    Broward
Academic Desk         Broward
Academic Desk         Broward
Academic Desk         Broward
Academic Desk         Broward
Collaborative Desk    Broward
Collaborative Desk    Broward
Collaborative Desk    Broward
Collaborative Desk    Broward
Broward               Broward
Alachua               Alachua
Collaborative Desk    Alachua
Collaborative Desk    Alachua
Collaborative Desk    Alachua
Collaborative Desk    Alachua
Collaborative Desk    Alachua

For the example above, I just want a result like this:
Broward:
    Collaborative Desk - 6
    Academic Desk - 4
    Broward - 1
Alachua:
    Collaborative Desk - 5
    Alachua - 1

Perhaps also a total, and then move on to the next library in the spreadsheet.
I started writing the code, but I'd like to know whether there is a better way to do this.

Answered 2014-04-10 02:27:01
import csv
from collections import defaultdict, Counter

# Group the names (column 0) by library (column 1); the file is tab-delimited.
data = defaultdict(list)
with open('data') as input_file:
    csv_reader = csv.reader(input_file, delimiter='\t')
    for row in csv_reader:
        data[row[1]].append(row[0])

data will now contain:
{'Alachua': ['Alachua', 'Collaborative Desk', 'Collaborative Desk', 'Collaborative Desk', 'Collaborative Desk', 'Collaborative Desk'],
'Broward': ['Collaborative Desk', 'Collaborative Desk', 'Academic Desk', 'Academic Desk', 'Academic Desk', 'Academic Desk', 'Collaborative Desk', 'Collaborative Desk', 'Collaborative Desk', 'Collaborative Desk', 'Broward']}

You can loop over each key's list of values and tally the counts yourself, or use Python's Counter class, like this:
for k, v in data.items():
    print(k)            # the library name
    print(Counter(v))   # how often each name occurs for that library

This will print:
Alachua
Counter({'Collaborative Desk': 5, 'Alachua': 1})
Broward
Counter({'Collaborative Desk': 6, 'Academic Desk': 4, 'Broward': 1})

Answered 2014-04-10 02:43:14
This also works (assuming your file is tab-delimited):
import csv
import collections

# Nested counts: results[library][name] -> number of rows
results = collections.defaultdict(lambda: collections.defaultdict(int))

with open('sample.csv', 'r') as f_in:
    rdr = csv.reader(f_in, delimiter='\t')
    next(rdr)  # skip the header row
    for row in rdr:
        results[row[1]][row[0]] += 1

for k, v in results.items():
    print("%s" % k)
    for k2, v2 in v.items():
        print("  %s - %s" % (k2, v2))

Output:
Alachua
  Alachua - 1
  Collaborative Desk - 5
Broward
  Collaborative Desk - 6
  Academic Desk - 4
  Broward - 1

https://stackoverflow.com/questions/22970506
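
The question also asks for a total per library, which neither answer prints. A minimal Python 3 sketch of one way to get it is to keep a Counter per library and sum its values. The helper name `count_by_library`, the inline sample rows, and the assumption of a tab-delimited file with one header row are illustrative and do not come from the original answers.

```python
import csv
import io
from collections import Counter, defaultdict


def count_by_library(lines, delimiter="\t"):
    """Return {library: Counter({name: count})} for delimited rows.

    Assumes one header row, then rows of (name, library).
    """
    reader = csv.reader(lines, delimiter=delimiter)
    next(reader, None)  # skip the header row; does nothing on empty input
    counts = defaultdict(Counter)
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        name, library = row[0], row[1]
        counts[library][name] += 1
    return counts


# Small inline sample standing in for the real file (assumed layout).
sample = io.StringIO(
    "Name\tOPTIONALDATA5\n"
    "Collaborative Desk\tBroward\n"
    "Academic Desk\tBroward\n"
    "Collaborative Desk\tBroward\n"
    "Broward\tBroward\n"
    "Alachua\tAlachua\n"
    "Collaborative Desk\tAlachua\n"
)

counts = count_by_library(sample)
for library in sorted(counts):
    print("%s:" % library)
    # most_common() lists the names from highest to lowest count
    for name, n in counts[library].most_common():
        print("  %s - %d" % (name, n))
    print("  Total - %d" % sum(counts[library].values()))
```

For a real file, an open file object can be passed in place of the sample, for example `with open('sample.csv', newline='') as f: counts = count_by_library(f)`.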