I have a CSV dataset imported with the Pandas read_csv function, and when I run .head() I get the following table output:
LSOA code Crime type
0 E01006687 Burglary
1 E01007229 Anti-social behaviour
2 E01007229 Anti-social behaviour
3 E01007229 Anti-social behaviour
4 E01007229 Burglary
5 E01007229 Other theft
6 E01007229 Other theft
7 E01007229 Shoplifting
8 E01007229 Theft from the person
9 E01007230 Anti-social behaviour
10 E01007230 Anti-social behaviour
11 E01007230 Anti-social behaviour
12 E01007230 Anti-social behaviour
13 E01007230 Anti-social behaviour
14 E01007230 Anti-social behaviour
15 E01007230 Anti-social behaviour
16 E01007230 Anti-social behaviour
17 E01007230 Anti-social behaviour
18 E01007230 Anti-social behaviour
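19 E01007230 Anti-social behaviour
This table has over 33,000 rows. What I need to do is get all the unique values of 'LSOA code' (there are 207 of them), and then, for each 'LSOA code', a count of the occurrences of each 'Crime type' (there are about 30 of those), followed by the total number of crimes for each LSOA code.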
I would like an output table of the following kind, with 'LSOA code' as the index column:
LSOA code | Burglary | Anti-social Behavior | Bicycle Theft | Assault ... | Total
E01000067 | 32 | 21 | 8 | 43 ... | 1023
E01000043 | 98 | 65 | 5 | 73 ... | 2308
E01000237 | 38 | 34 | 12 | 92 ... | 897
E01000038 | 82 | 28 | 3 | 18 ... | 2147
and so on.
I have managed to get the LSOA codes into a DataFrame and show the total number of crimes in each LSOA using the following:
WirralCrimes = Crimes['LSOA code'].value_counts()
CrimeDF = pd.DataFrame(pd.Series(WirralCrimes))
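CrimeDF.columns = ["Count"]
...but I can't work out how to get each crime type into its own column, with the number of occurrences of each crime type summed up.
Can anyone point me in the right direction on how to do this?
Many thanks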
Posted on 2015-11-16 02:47:29
If your data looks something like the following, this should work:
import pandas as pd
df = pd.DataFrame({'LSOA code': ['E01006687','E01007229','E01007229','E01007229','E01007229','E01007229','E01007229','E01007229','E01007230','E01007230'],
                   'Crime type': ['Burglary','Anti-social behaviour','Anti-social behaviour','Anti-social behaviour','Burglary','Other theft','Other theft','Shoplifting','Theft from the person','Anti-social behaviour']})
df['count'] = 1  # each row is one recorded crime
# One row per LSOA code, one column per crime type; combinations that never occur become 0
table = pd.pivot_table(df, index='LSOA code', columns='Crime type', values='count', aggfunc='sum', fill_value=0)
table['total'] = table.sum(axis=1)
https://stackoverflow.com/questions/33721434
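As a possible alternative to the helper 'count' column, the same kind of table can probably be built with pd.crosstab, which counts co-occurrences of two columns directly. This is only a sketch and not part of the original answer: the variable names crimes and counts and the handful of sample rows are made up for illustration, and in practice the DataFrame would come from read_csv as in the question.

```python
import pandas as pd

# Hypothetical sample rows in the same shape as the question's data
crimes = pd.DataFrame({
    "LSOA code": ["E01006687", "E01007229", "E01007229", "E01007229", "E01007230"],
    "Crime type": ["Burglary", "Anti-social behaviour", "Burglary",
                   "Other theft", "Anti-social behaviour"],
})

# Rows are LSOA codes, columns are crime types, cells are occurrence counts
# (combinations that never occur are filled with 0)
counts = pd.crosstab(crimes["LSOA code"], crimes["Crime type"])

# Per-LSOA total across all crime types
counts["Total"] = counts.sum(axis=1)

print(counts)
```

The result is indexed by 'LSOA code', so a single area can be looked up with counts.loc["E01007229"].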