I'm new to the pandas library and need some help. I have two columns like this:
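Test Result    Risk Rating
Fail           Low
Pass           Medium
Skip           High
Pass           Low
Fail           Medium
Pass           High
Skip           Low
Fail           Medium
Fail           High

I need the total number of Fail, Pass, and Skip values in the "Test Result" column, and I was able to do that. However, I also need the number of rows where "Test Result" is Fail and "Risk Rating" is Low, likewise the number of Fail rows rated Medium, and so on. My final result should be: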
Fail (Low Risk Rating) = 1
Fail (Medium Risk Rating) = 2
Fail (High Risk Rating) = 1
Pass (Low Risk Rating) = 1
Pass (Medium Risk Rating) = 1
Pass (High Risk Rating) = 1
Skip (Low Risk Rating) = 1
Skip (Medium Risk Rating) = 0
Skip (High Risk Rating) = 1

How can I do this? Any help would be greatly appreciated.
Posted on 2016-10-28 11:21:27
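The snippets in this answer operate on a DataFrame named df that holds the question's two columns. A minimal sketch for reproducing them, assuming both columns are plain strings with exactly the names shown in the question, might look like this:

```python
import pandas as pd

# Sample data copied row by row from the question.
# Assumption: both columns hold plain strings.
df = pd.DataFrame({
    "Test Result": ["Fail", "Pass", "Skip", "Pass", "Fail",
                    "Pass", "Skip", "Fail", "Fail"],
    "Risk Rating": ["Low", "Medium", "High", "Low", "Medium",
                    "High", "Low", "Medium", "High"],
})
print(df)
```

Each snippet that follows reassigns df, so re-create it from this block before trying the next one.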
I think you need to group by both columns with groupby and aggregate with size:
df = df.groupby(['Test Result', 'Risk Rating']).size().reset_index(name='counts')
print (df)
  Test Result Risk Rating  counts
0        Fail        High       1
1        Fail         Low       1
2        Fail      Medium       2
3        Pass        High       1
4        Pass         Low       1
5        Pass      Medium       1
6        Skip        High       1
7        Skip         Low       1

A pivot-style table built with unstack may be a better fit:
df = df.groupby(['Test Result', 'Risk Rating']).size().unstack(fill_value=0)
print (df)
Risk Rating  High  Low  Medium
Test Result
Fail            1    1       2
Pass            1    1       1
Skip            1    1       0

Alternatively, crosstab gives the same table, though it is a slower solution:
df = pd.crosstab(df['Test Result'], df['Risk Rating'])
print (df)
Risk Rating  High  Low  Medium
Test Result
Fail            1    1       2
Pass            1    1       1
Skip            1    1       0

If you also need the combinations with a count of 0 in long format, add stack:
# Parentheses let the method chain span several lines.
df = (df.groupby(['Test Result', 'Risk Rating'])
        .size()
        .unstack(fill_value=0)
        .stack()
        .reset_index(name='counts'))
print (df)
  Test Result Risk Rating  counts
0        Fail        High       1
1        Fail         Low       1
2        Fail      Medium       2
3        Pass        High       1
4        Pass         Low       1
5        Pass      Medium       1
6        Skip        High       1
7        Skip         Low       1
8        Skip      Medium       0

https://stackoverflow.com/questions/40303957
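The outputs above sort the ratings alphabetically (High, Low, Medium), while the question lists them as Low, Medium, High and wants one line per combination. A possible sketch of that last step follows; the names risk_order, table, and lines are illustrative choices rather than part of the original answer, and the sample data is rebuilt so the block runs on its own:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Test Result": ["Fail", "Pass", "Skip", "Pass", "Fail",
                    "Pass", "Skip", "Fail", "Fail"],
    "Risk Rating": ["Low", "Medium", "High", "Low", "Medium",
                    "High", "Low", "Medium", "High"],
})

# Assumed display order; the default sort would give High, Low, Medium.
risk_order = ["Low", "Medium", "High"]

# Cross-tabulate, then reindex the columns into the desired order.
# fill_value=0 covers a rating that never occurs in the data.
table = (pd.crosstab(df["Test Result"], df["Risk Rating"])
           .reindex(columns=risk_order, fill_value=0))

# One line per (result, rating) pair, in the format used in the question.
lines = [
    f"{result} ({rating} Risk Rating) = {table.loc[result, rating]}"
    for result in table.index
    for rating in risk_order
]
print("\n".join(lines))
```

Reindexing the columns is one way to control the order; converting "Risk Rating" to an ordered categorical before counting would be another.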