I am using Python 3 and pandas with a dataset like the following (toy dataset):
data
location importance agent count
0 London Low chatbot 2
1 NYC Medium chatbot 1
2 London High human 3
3 London Low human 4
4 NYC High human 1
5 NYC Medium chatbot 2
6 Melbourne Low chatbot 3
7 Melbourne Low human 4
8 Melbourne High human 5
9 NYC High chatbot 5
10 Melbourne Low human 3
11 Melbourne Low human 1
12 Melbourne High chatbot 5
13 Washington Medium chatbot 7
14 Washington Medium human 8
15 Washington High chatbot 5
16 Melbourne Medium chatbot 4
17 Washington Medium chatbot 5
18 Melbourne High human 3
19 Washington Low chatbot 2
The pandas crosstab looks like this:
pd.crosstab(data['location'], data['importance'])
importance High Low Medium
location
London 1 2 0
Melbourne 3 4 1
NYC 2 0 2
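Washington 1 1 3
The problem is to sum the three columns 'High', 'Low', and 'Medium' and keep only the crosstab rows whose sum is >= 4. For this example, London should be excluded, because its row sum is 3, which is < 4.
Help?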
Posted on 2021-02-04 10:22:33
You can sum the values in each row, compare the sums against 4, and filter the rows with boolean indexing.
df1 = pd.crosstab(data['location'], data['importance'])
df = df1[df1.sum(axis=1).ge(4)]
This works the same as:
df = df1[df1.sum(axis=1) >= 4]
https://stackoverflow.com/questions/66043623
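For anyone who wants to run the answer end to end, here is a minimal sketch that rebuilds the toy dataset from the question and applies the row-sum filter. It assumes only the `location` and `importance` columns are needed, since the crosstab ignores the rest; the names `row_totals` and `mask` are just illustrative and do not appear in the original answer.

```python
import pandas as pd

# Toy dataset from the question, reduced to the two columns the crosstab uses.
data = pd.DataFrame({
    "location": [
        "London", "NYC", "London", "London", "NYC",
        "NYC", "Melbourne", "Melbourne", "Melbourne", "NYC",
        "Melbourne", "Melbourne", "Melbourne", "Washington", "Washington",
        "Washington", "Melbourne", "Washington", "Melbourne", "Washington",
    ],
    "importance": [
        "Low", "Medium", "High", "Low", "High",
        "Medium", "Low", "Low", "High", "High",
        "Low", "Low", "High", "Medium", "Medium",
        "High", "Medium", "Medium", "High", "Low",
    ],
})

# Count occurrences of each importance level per location.
df1 = pd.crosstab(data["location"], data["importance"])

# Sum across the High/Low/Medium columns, one total per location (axis=1).
row_totals = df1.sum(axis=1)

# Boolean mask: True for locations whose total is at least 4.
mask = row_totals >= 4  # equivalent to row_totals.ge(4)

# Keep only the rows where the mask is True.
df = df1[mask]
print(df)
```

With this data the row totals are London 3, Melbourne 8, NYC 4, and Washington 5, so only London is dropped. NYC stays because the comparison is inclusive (>= 4).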