Finding the most common values with Pandas GroupBy and value_counts

Stack Overflow user
Asked on 2018-05-29 20:52:36
Answers: 2, Views: 2.4K, Followers: 0, Votes: 4

I am working with two columns of a table.

+-------------+--------------------------------------------------------------+
|  Area Name  |                       Code Description                       |
+-------------+--------------------------------------------------------------+
| N Hollywood | VIOLATION OF RESTRAINING ORDER                               |
| N Hollywood | CRIMINAL THREATS - NO WEAPON DISPLAYED                       |
| N Hollywood | CRIMINAL THREATS - NO WEAPON DISPLAYED                       |
| N Hollywood | ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT               |
| Southeast   | ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT               |
| West Valley | CRIMINAL THREATS - NO WEAPON DISPLAYED                       |
| West Valley | CRIMINAL THREATS - NO WEAPON DISPLAYED                       |
| 77th Street | RAPE, FORCIBLE                                               |
| Foothill    | CRM AGNST CHLD (13 OR UNDER) (14-15 & SUSP 10 YRS OLDER)0060 |
| N Hollywood | VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS) 0114 |
+-------------+--------------------------------------------------------------+
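To try the snippets in this thread, the table above can be loaded into a DataFrame. This is only a minimal sketch: the rows are copied from the table, and the name `df` is assumed to match the variable used in the question's code.

```python
import pandas as pd

# Rows copied from the table above; `df` mirrors the name used in the question.
rows = [
    ("N Hollywood", "VIOLATION OF RESTRAINING ORDER"),
    ("N Hollywood", "CRIMINAL THREATS - NO WEAPON DISPLAYED"),
    ("N Hollywood", "CRIMINAL THREATS - NO WEAPON DISPLAYED"),
    ("N Hollywood", "ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT"),
    ("Southeast", "ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT"),
    ("West Valley", "CRIMINAL THREATS - NO WEAPON DISPLAYED"),
    ("West Valley", "CRIMINAL THREATS - NO WEAPON DISPLAYED"),
    ("77th Street", "RAPE, FORCIBLE"),
    ("Foothill", "CRM AGNST CHLD (13 OR UNDER) (14-15 & SUSP 10 YRS OLDER)0060"),
    ("N Hollywood", "VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS) 0114"),
]
df = pd.DataFrame(rows, columns=["Area Name", "Code Description"])
```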

I use groupby and value_counts to count the Code Description values for each Area Name.

df.groupby(['Area Name'])['Code Description'].value_counts()

Is there a way to view only the top n values for each Area Name? If I append .nlargest(3) to the code above, it returns results for only one Area Name.

+---------------------------------------------------------------------------------+
| Wilshire     SHOPLIFTING-GRAND THEFT ($950.01 & OVER)                         7 |
+---------------------------------------------------------------------------------+

2 Answers

Stack Overflow user

Accepted answer

Answered on 2018-05-29 21:20:27

Use apply to run value_counts within each group, then keep the top rows of each group with head:

df.groupby('Area Name')['Code Description'].apply(lambda x: x.value_counts().head(3))

Output:

Area Name                                                                
77th Street  RAPE, FORCIBLE                                                  1
Foothill     CRM AGNST CHLD (13 OR UNDER) (14-15 & SUSP 10 YRS OLDER)0060    1
N Hollywood  CRIMINAL THREATS - NO WEAPON DISPLAYED                          2
             VIOLATION OF RESTRAINING ORDER                                  1
             ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT                  1
Southeast    ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT                  1
West Valley  CRIMINAL THREATS - NO WEAPON DISPLAYED                          2
Name: Code Description, dtype: int64
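The same idea can be wrapped in a small helper so that n is a parameter. This is a sketch only: the function name `top_n_per_group` and the `demo` frame are made up for illustration and do not come from the original answer.

```python
import pandas as pd

def top_n_per_group(frame, group_col, value_col, n=3):
    """Return the n most frequent values of value_col within each group_col."""
    return frame.groupby(group_col)[value_col].apply(
        # value_counts sorts by count (descending), so head(n) keeps the top n.
        lambda s: s.value_counts().head(n)
    )

# Small made-up frame for illustration (not the asker's data).
demo = pd.DataFrame({
    "Area Name": ["A", "A", "A", "A", "B", "B", "B"],
    "Code Description": ["x", "x", "y", "z", "y", "y", "x"],
})

# Most frequent description per area: "x" for A (2 rows), "y" for B (2 rows).
top1 = top_n_per_group(demo, "Area Name", "Code Description", n=1)
```

When several values tie for the last place, which of them is kept depends on the order value_counts returns, so results with ties should not be relied on.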
Votes: 4

Stack Overflow user

Answered on 2018-05-29 21:23:10

You can use a double groupby:

s = df.groupby('Area Name')['Code Description'].value_counts()
res = s.groupby('Area Name').nlargest(3).reset_index(level=1, drop=True)

print(res)

Area Name    Code Description                                            
77th Street  RAPE, FORCIBLE                                                  1
Foothill     CRM AGNST CHLD (13 OR UNDER) (14-15 & SUSP 10 YRS OLDER)0060    1
N Hollywood  CRIMINAL THREATS - NO WEAPON DISPLAYED                          2
             ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT                  1
             VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS) 0114    1
Southeast    ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT                  1
West Valley  CRIMINAL THREATS - NO WEAPON DISPLAYED                          2
Name: Code Description, dtype: int64
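As a possible variation (not part of the original answer), the per-group counts returned by value_counts are already sorted in descending order within each group, so taking head(n) per group should select the same rows as nlargest(n) when there are no ties. The `demo` frame below is made up for illustration.

```python
import pandas as pd

# Small made-up frame for illustration (not the asker's data).
demo = pd.DataFrame({
    "Area Name": ["A", "A", "A", "B", "B", "B"],
    "Code Description": ["x", "x", "y", "y", "y", "x"],
})

counts = demo.groupby("Area Name")["Code Description"].value_counts()

# Variant 1: the double groupby from this answer. nlargest prepends the group
# key, so the duplicated "Area Name" level is dropped afterwards.
via_nlargest = (
    counts.groupby(level="Area Name").nlargest(1).reset_index(level=1, drop=True)
)

# Variant 2: rely on the per-group descending order and keep the first row
# of each group; the original two-level index is preserved.
via_head = counts.groupby(level="Area Name").head(1)
```

Variant 2 avoids the extra index level, at the cost of depending on the default sort order of value_counts.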
Votes: 2
Original page content provided by Stack Overflow; the Chinese translation was supported by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/50592762
