Can anyone tell me the correct way to filter (and fill NaN) based on another column here? Many thanks.
Related link: How to fill dataframe's empty/nan cell with conditional column mean
df
ID Name Industry Expenses
1 Treslam Financial Services 734545
2 Rednimdox Construction nan
3 Lamtone IT Services 567678
4 Stripfind Financial Services nan
5 Openjocon Construction 8678957
6 Villadox Construction 5675676
7 Sumzoomit Construction 231244
8 Abcd Construction nan
9 Stripfind Financial Services nan
df_mean_expenses = (df.groupby(['Industry'], as_index = False)['Expenses']).mean()
df_mean_expenses
Industry Expenses
0 Construction 554433.11
1 Financial Services 2362818.48
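2 IT Services 149153.46
To replace the NaN Expenses values for Construction with the Construction mean (in df_mean_expenses), I tried two approaches:
1. ... returns the error: Item wrong length 500 instead of 3!
2.
df['Expenses'][np.isnan(df['Expenses'])][df['Industry'] == 'Construction'] = df_mean_expenses.loc[df_mean_expenses['Industry'] == 'Construction', 'Construction'].values
This runs, but does not add the values to df.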
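A likely reason the second attempt leaves df unchanged is chained indexing: each `[...]` step can return a temporary copy, so the assignment never reaches df itself. A minimal sketch of the usual alternative, one combined boolean mask passed to a single `.loc` call, is below. It assumes the nine sample rows shown above stand in for the full data, so the mean is computed from those rows only and will not match the 554433.11 in df_mean_expenses.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question (the full data has more rows).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "Name": ["Treslam", "Rednimdox", "Lamtone", "Stripfind", "Openjocon",
             "Villadox", "Sumzoomit", "Abcd", "Stripfind"],
    "Industry": ["Financial Services", "Construction", "IT Services",
                 "Financial Services", "Construction", "Construction",
                 "Construction", "Construction", "Financial Services"],
    "Expenses": [734545, np.nan, 567678, np.nan, 8678957, 5675676,
                 231244, np.nan, np.nan],
})

# Mean of the non-NaN Construction rows in this sample (NaN is skipped).
construction_mean = df.loc[df["Industry"] == "Construction", "Expenses"].mean()

# Combine both conditions into one mask and assign through a single .loc,
# so the write targets df directly instead of an intermediate copy.
mask = df["Expenses"].isna() & (df["Industry"] == "Construction")
df.loc[mask, "Expenses"] = construction_mean
```

After this runs, only the Construction rows that were NaN are filled; the Financial Services rows stay NaN, as in the expected output.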
Expected output:
df
ID Name Industry Expenses
1 Treslam Financial Services 734545
2 Rednimdox Construction 554433.11
3 Lamtone IT Services 567678
4 Stripfind Financial Services nan
5 Openjocon Construction 8678957
6 Villadox Construction 5675676
7 Sumzoomit Construction 231244
8 Abcd Construction 554433.11
9 Stripfind Financial Services nan
Posted on 2020-07-28 18:42:36
Try using transform:
df_mean_expenses = df.groupby('Industry')['Expenses'].transform('mean')
df['Expenses'] = df['Expenses'].fillna(df_mean_expenses[df['Industry']=='Construction'])
https://stackoverflow.com/questions/63140765
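To see why the transform approach fills only the Construction rows: `transform('mean')` returns one group mean per row, aligned to the original index, and `fillna` with a Series fills only the index labels present in that Series. A rough sketch follows, where `fill_with_industry_mean` is a hypothetical helper name (not part of pandas) and the small frame reuses a few Expenses values from the question.

```python
import numpy as np
import pandas as pd

def fill_with_industry_mean(df, industry=None):
    """Hypothetical helper: return a copy with NaN Expenses filled by group mean.

    If `industry` is given, only rows of that industry are filled.
    """
    out = df.copy()
    # One mean per row, aligned to the original index (NaN values are skipped).
    group_mean = out.groupby("Industry")["Expenses"].transform("mean")
    if industry is not None:
        # fillna aligns on index, so rows missing from this subset stay NaN.
        group_mean = group_mean[out["Industry"] == industry]
    out["Expenses"] = out["Expenses"].fillna(group_mean)
    return out

df = pd.DataFrame({
    "Industry": ["Financial Services", "Construction", "Financial Services",
                 "Construction", "Construction"],
    "Expenses": [734545, np.nan, np.nan, 8678957, 5675676],
})

only_construction = fill_with_industry_mean(df, "Construction")
all_industries = fill_with_industry_mean(df)
```

Dropping the industry filter, as in the second call, fills every NaN with the mean of its own industry.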