I have a data file that looks like this:
UnitID Sector Start_Date Status
1 SE1 2018-02-26 Closed
1 SE1 2019-03-27 Active
2 SE1 2017-02-26 Closed
2 SE1 2018-02-26 Closed
2 SE1 2019-02-26 Active
3 SE1 NaT Not_in_contract
4 SE1 NaT Not_in_contract
5 SE2 2017-02-26 Closed
5 SE2 2018-02-26 Closed
5 SE2 2019-02-26 Active
6 SE2 2018-02-26 Closed
6 SE2 2019-02-26 Active
7 SE2 2018-02-26 Closed
7 SE2 2018-07-15 Closed
8 SE2 NaT Not_in_contract
9 SE2 NaT Not_in_contract
10 SE2 2019-05-22 Active
11 SE2 2019-06-24 Active
From the data above, I want to build the following DataFrame:
Sector Number_of_unique_units Number_of_Active_units
SE1 4 2
SE2 7 4
Posted on 2020-02-14 08:05:46
Use GroupBy.agg with DataFrameGroupBy.nunique to count the unique units, plus a custom lambda function that sums a boolean mask to count the Active rows:
df1 = (df.groupby('Sector')
         .agg(Number_of_unique_units=('UnitID', 'nunique'),
              Number_of_Active_units=('Status', lambda x: x.eq('Active').sum()))
         .reset_index())
print(df1)
Sector Number_of_unique_units Number_of_Active_units
0 SE1 4 2
1 SE2 7 4
https://stackoverflow.com/questions/60222215
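For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch. It assumes the sample data from the question is typed in by hand as a list of tuples (the `rows` name and the use of `None` for the missing dates are my own choices, not part of the original post), and it relies on named aggregation, which needs pandas 0.25 or later.

```python
import pandas as pd

# Sample data copied from the question; None stands in for the NaT dates.
rows = [
    (1, "SE1", "2018-02-26", "Closed"),
    (1, "SE1", "2019-03-27", "Active"),
    (2, "SE1", "2017-02-26", "Closed"),
    (2, "SE1", "2018-02-26", "Closed"),
    (2, "SE1", "2019-02-26", "Active"),
    (3, "SE1", None, "Not_in_contract"),
    (4, "SE1", None, "Not_in_contract"),
    (5, "SE2", "2017-02-26", "Closed"),
    (5, "SE2", "2018-02-26", "Closed"),
    (5, "SE2", "2019-02-26", "Active"),
    (6, "SE2", "2018-02-26", "Closed"),
    (6, "SE2", "2019-02-26", "Active"),
    (7, "SE2", "2018-02-26", "Closed"),
    (7, "SE2", "2018-07-15", "Closed"),
    (8, "SE2", None, "Not_in_contract"),
    (9, "SE2", None, "Not_in_contract"),
    (10, "SE2", "2019-05-22", "Active"),
    (11, "SE2", "2019-06-24", "Active"),
]
df = pd.DataFrame(rows, columns=["UnitID", "Sector", "Start_Date", "Status"])
df["Start_Date"] = pd.to_datetime(df["Start_Date"])  # None becomes NaT

df1 = (
    df.groupby("Sector")
      .agg(
          # Distinct UnitID values per sector
          Number_of_unique_units=("UnitID", "nunique"),
          # True counts as 1, so summing the mask counts the Active rows
          Number_of_Active_units=("Status", lambda s: s.eq("Active").sum()),
      )
      .reset_index()
)
print(df1)
```

Note that the lambda counts Active rows, not distinct units. In this sample no unit has more than one Active row, so the two numbers coincide; if a unit could be Active more than once, you would probably want to filter to the Active rows first and apply `nunique` to `UnitID` instead.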