I have an IPL dataset that looks like this:
df.head(10):
   toss_winner                  winner
0  Royal Challengers Bangalore  Sunrisers Hyderabad
1  Rising Pune Supergiant       Rising Pune Supergiant
2  Kolkata Knight Riders        Kolkata Knight Riders
3  Kings XI Punjab              Kings XI Punjab
4  Royal Challengers Bangalore  Royal Challengers Bangalore
5  Sunrisers Hyderabad          Sunrisers Hyderabad
6  Mumbai Indians               Mumbai Indians
7  Royal Challengers Bangalore  Kings XI Punjab
8  Rising Pune Supergiant       Delhi Daredevils
9  Mumbai Indians               Mumbai Indians
I want to group my data by team, counting how many times each team won the toss and how many times it went on to win the match after winning the toss.
For example, the expected output is:
team                         total_toss_win  win_on_toss_win
Royal Challengers Bangalore  3               1
Rising Pune Supergiant       2               1
Kolkata Knight Riders        1               1
Kings XI Punjab              1               1   (2 wins overall, but lost the toss before the second win)
and so on....
I tried variations of groupby and aggregation, but none of them seem to work.
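To make the question reproducible, the ten rows shown above can be typed into a DataFrame by hand. This is only a sketch built from the printed `df.head(10)`; the full IPL dataset presumably has more rows and columns than these two.

```python
import pandas as pd

# The ten rows printed by df.head(10) in the question, entered manually.
df = pd.DataFrame(
    {
        "toss_winner": [
            "Royal Challengers Bangalore",
            "Rising Pune Supergiant",
            "Kolkata Knight Riders",
            "Kings XI Punjab",
            "Royal Challengers Bangalore",
            "Sunrisers Hyderabad",
            "Mumbai Indians",
            "Royal Challengers Bangalore",
            "Rising Pune Supergiant",
            "Mumbai Indians",
        ],
        "winner": [
            "Sunrisers Hyderabad",
            "Rising Pune Supergiant",
            "Kolkata Knight Riders",
            "Kings XI Punjab",
            "Royal Challengers Bangalore",
            "Sunrisers Hyderabad",
            "Mumbai Indians",
            "Kings XI Punjab",
            "Delhi Daredevils",
            "Mumbai Indians",
        ],
    }
)
print(df.head(10))
```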
Posted on 2020-06-21 02:17:13
Try melt first, then groupby with value_counts and unstack:
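To see what each step of this approach contributes, here is a small sketch on three rows taken from the question's sample (rows 0, 3 and 7). The names `long_df` and `counts` are introduced only for this illustration.

```python
import pandas as pd

# Three rows from the question's sample (rows 0, 3 and 7).
df = pd.DataFrame(
    {
        "toss_winner": [
            "Royal Challengers Bangalore",
            "Kings XI Punjab",
            "Royal Challengers Bangalore",
        ],
        "winner": [
            "Sunrisers Hyderabad",
            "Kings XI Punjab",
            "Kings XI Punjab",
        ],
    }
)

# melt stacks both columns into long form: 'variable' holds the original
# column name and 'value' holds the team name.
long_df = pd.melt(df)
print(long_df)

# Counting 'variable' per team and unstacking gives one column per role;
# teams missing from a role get NaN, which fillna(0) turns into 0.
counts = (
    long_df.groupby("value")["variable"]
    .value_counts()
    .unstack("variable")
    .fillna(0)
)
print(counts)
```

Note that the `winner` column produced this way counts every match a team won, regardless of the toss. In this sample Kings XI Punjab gets 2 there even though only one of those wins followed a toss win, so it is not quite the `win_on_toss_win` figure the question asks for.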
import pandas as pd
# melt to long form, then count each team's appearances per original column
s = pd.melt(df).groupby('value')['variable'].value_counts().unstack('variable')\
      .fillna(0)
print(s)
variable                     toss_winner  winner
value
Delhi Daredevils             0.0          1.0
Kings XI Punjab              1.0          2.0
Kolkata Knight Riders        1.0          1.0
Mumbai Indians               2.0          2.0
Rising Pune Supergiant       2.0          1.0
Royal Challengers Bangalore  3.0          1.0
Sunrisers Hyderabad          1.0          2.0
Posted on 2020-06-21 02:24:16
Here is a simple approach that makes each step easy to follow:
import pandas as pd
# number of times each team won the toss
a = df.groupby("toss_winner").size()
# number of times each team won the match after winning the toss
b = df.query("toss_winner == winner").groupby(["toss_winner"]).size()
# output
f = pd.concat([a, b], axis=1).reset_index().rename(columns={0: 'total_toss_win', 1: 'win_on_toss_win'})
print(f)
   toss_winner                  total_toss_win  win_on_toss_win
0  Kings XI Punjab              1               1
1  Kolkata Knight Riders        1               1
2  Mumbai Indians               2               2
3  Rising Pune Supergiant       2               1
4  Royal Challengers Bangalore  3               1
5  Sunrisers Hyderabad          1               1
https://stackoverflow.com/questions/62489773
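As a follow-up to the second answer, the two counts can also be computed in a single groupby with named aggregation. This is an alternative sketch, not the method either answer uses. The helper column `won_after_toss` and the variable `out` are names made up for this example, and the DataFrame is retyped from the question's ten rows.

```python
import pandas as pd

# Sample rows retyped from the question's df.head(10).
df = pd.DataFrame(
    {
        "toss_winner": [
            "Royal Challengers Bangalore", "Rising Pune Supergiant",
            "Kolkata Knight Riders", "Kings XI Punjab",
            "Royal Challengers Bangalore", "Sunrisers Hyderabad",
            "Mumbai Indians", "Royal Challengers Bangalore",
            "Rising Pune Supergiant", "Mumbai Indians",
        ],
        "winner": [
            "Sunrisers Hyderabad", "Rising Pune Supergiant",
            "Kolkata Knight Riders", "Kings XI Punjab",
            "Royal Challengers Bangalore", "Sunrisers Hyderabad",
            "Mumbai Indians", "Kings XI Punjab",
            "Delhi Daredevils", "Mumbai Indians",
        ],
    }
)

out = (
    # Hypothetical helper column: True where the toss winner also won the match.
    df.assign(won_after_toss=df["toss_winner"].eq(df["winner"]))
    .groupby("toss_winner")
    .agg(
        total_toss_win=("won_after_toss", "size"),   # rows per toss winner
        win_on_toss_win=("won_after_toss", "sum"),   # True values count as 1
    )
    .rename_axis("team")
    .reset_index()
)
print(out)
```

One design difference worth noting: because both counts come from the same groups, a team that won the toss but never won a match afterwards should show 0 in `win_on_toss_win`, whereas concatenating two separately grouped Series would leave NaN for such a team. No team in these ten rows falls into that case.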