I have a dataset that looks like this:
      New_ID    loanid  RPC  RPC_PERIOD   PhoneNumber
0  1282908.0  10321436    0           0  9.055100e+10
1  1282908.0  10321436    0           0  9.059893e+10
2  1282908.0  10321436    0           0  9.570575e+12
3  1282908.0  10321436    0           0  9.057456e+10
4  1282908.0  10321436    0           0  9.570551e+12
The variable RPC is binary (1, 0).
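I would like to group the data by "New_ID" and add new columns with the total count of RPC, the sum of RPC (i.e. the number of rows where RPC = 1), and the ratio of those two values.
I tried: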
df['picked_up'] = df.groupby(by='New_ID')['RPC'].sum()
df['tries'] = df.groupby(by='New_ID')['RPC'].count()
df['ratio'] = df['picked_up'] / df['tries']
I would appreciate your help.
Posted on 2022-05-11 18:57:23
This will work:
df['sum'] = df.groupby('New_ID')['RPC'].transform('sum')
df['total'] = df.groupby('New_ID')['RPC'].transform('count')
df['ratio'] = df.groupby('New_ID')['RPC'].transform('mean')
https://stackoverflow.com/questions/72202462
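For reference, here is a minimal self-contained sketch of why the attempt in the question does not line up while transform does. The sample values and the second New_ID are made up for illustration and are not from the question; the column names picked_up and tries are taken from the question's attempt.

```python
import pandas as pd

# Hypothetical sample data (not from the question): two New_ID groups.
df = pd.DataFrame({
    "New_ID": [1282908.0, 1282908.0, 1282908.0, 1282909.0, 1282909.0],
    "RPC": [1, 0, 0, 1, 1],
})

grouped = df.groupby("New_ID")["RPC"]

# A plain aggregate has one row per group and is indexed by New_ID, so
# assigning it to a column aligns on df's 0..n-1 row labels and gives NaN
# wherever the labels do not match.
per_group = grouped.sum()

# transform broadcasts each group's result back onto every row of the group.
df["picked_up"] = grouped.transform("sum")    # number of rows with RPC == 1
df["tries"] = grouped.transform("count")      # number of non-null RPC rows
df["ratio"] = df["picked_up"] / df["tries"]   # for 0/1 data, equals transform("mean")

print(df)
```

Because RPC only takes the values 0 and 1, the group mean is the same as sum divided by count, which is why the answer can compute the ratio with a single transform('mean').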