我有一个数据集,我想在其中:
包括其计数。
数据
ID location type box status
aa NY no box55 hey
aa NY no box55 hi
aa NY yes box66 hello
aa NY yes box66 goodbye
aa CA no box11 hey
aa CA no box11 hi
aa CA yes box11 hello
aa CA yes box11 goodbye
aa CA no box86 hey
aa CA no box86 hi
aa CA yes box86 hello
aa CA yes box99 goodbye
aa CA no box99 hey
aa CA no box99 hi 所需
location box count box hey hi hello goodbye
NY 2 box55 1 1 0 0
NY 2 box66 0 0 1 1
CA 3 box11 1 1 1 1
CA 3 box86 1 1 1 0
CA 3 box99 1 1 0 1 正在做什么
df['box count'] = df.groupby(['location','box'])['box'].size()
t = pd.get_dummies(df, prefix_sep='', prefix='',
columns=['status']).groupby(['box', 'location'],
as_index=False).sum().assign(count=df.groupby(['box', 'location'],
as_index=False)['status'].size()['size'])如有任何建议,将不胜感激。
发布于 2022-11-22 23:34:47
尝试制作两个数据文件:第一个用.groupby(),第二个用pd.crosstab。然后只需pd.concat:
df1 = df.groupby(["location", "box"]).agg(**{"box count": ("box", "size")})
df2 = pd.crosstab([df["location"], df["box"]], df["status"])
df_out = pd.concat([df1, df2], axis=1)
print(df_out.reset_index())指纹:
location box box count goodbye hello hey hi
0 CA box11 4 1 1 1 1
1 CA box86 3 0 1 1 1
2 CA box99 3 1 0 1 1
3 NY box55 2 0 0 1 1
4 NY box66 2 1 1 0 0编辑:
m = df.groupby(["location"])["box"].nunique()
df1 = df.groupby(["location", "box"]).agg(
**{
"box count": (
"location",
lambda x: m[x.iat[0]],
)
}
)
df2 = pd.crosstab([df["location"], df["box"]], df["status"])
df_out = pd.concat([df1, df2], axis=1)
print(df_out.reset_index())指纹:
location box box count goodbye hello hey hi
0 CA box11 3 1 1 1 1
1 CA box86 3 0 1 1 1
2 CA box99 3 1 0 1 1
3 NY box55 2 0 0 1 1
4 NY box66 2 1 1 0 0https://stackoverflow.com/questions/74539860
复制相似问题