This is my data:
Column IV Source
RRD 5.795765 Personal_Demographics
RRD 5.795765 Cust360_Agreement
RRD 5.792729 External_Data
WO 4.361066 Cust360_Asset
Rating 3.600918 Personal_Demographics

My expected result:
Column IV Source
RRD 5.795765 Personal_Demographics
WO 4.361066 Cust360_Asset
Rating 3.600918 Personal_Demographics

What I tried:
inds = df.groupby(['Column'])['IV'].transform(max) == df['IV']

But the result is:
Column IV Source
RRD 5.795765 Personal_Demographics
RRD 5.795765 Cust360_Agreement
WO 4.361066 Cust360_Asset
Rating 3.600918 Personal_Demographics

The first group (RRD) has two rows with the same IV value, but I need only one of them in the output, like:
Column IV Source
RRD 5.795765 Personal_Demographics
WO 4.361066 Cust360_Asset
Rating 3.600918 Personal_Demographics

Regards
Posted on 2021-05-10 10:34:08
Try drop_duplicates + sort_values:
out = df.sort_values('IV',ascending=False).drop_duplicates('Column')
Out[121]:
Column IV Source
0 RRD 5.795765 Personal_Demographics
3 WO 4.361066 Cust360_Asset
4 Rating 3.600918 Personal_Demographics

If you prefer groupby:
df.sort_values('IV',ascending=False).groupby(['Column']).head(1)

https://stackoverflow.com/questions/67464082
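For anyone who wants to reproduce this end to end, here is a minimal sketch. It assumes the frame from the question is rebuilt by hand as below. The `kind="mergesort"` argument and the `idxmax` variant are my additions, not part of the answer above. A stable sort should keep tied rows in their original order, so the first-listed row of each tie is the one that survives.

```python
import pandas as pd

# Data rebuilt by hand from the question (assumed to match the asker's frame).
df = pd.DataFrame({
    "Column": ["RRD", "RRD", "RRD", "WO", "Rating"],
    "IV": [5.795765, 5.795765, 5.792729, 4.361066, 3.600918],
    "Source": ["Personal_Demographics", "Cust360_Agreement", "External_Data",
               "Cust360_Asset", "Personal_Demographics"],
})

# The mask from the question keeps every row tied for the group maximum,
# which is why both RRD rows with IV 5.795765 show up.
mask = df.groupby("Column")["IV"].transform("max") == df["IV"]
tied = df[mask]

# Sort descending with a stable algorithm, then keep the first row per Column.
out = (
    df.sort_values("IV", ascending=False, kind="mergesort")
      .drop_duplicates("Column")
)

# Alternative without sorting: idxmax gives the first index label of the
# maximum within each group; sort=False keeps groups in order of appearance.
alt = df.loc[df.groupby("Column", sort=False)["IV"].idxmax()]

print(out)
print(alt)
```

Both `out` and `alt` should contain one row per `Column`. If a different row should win a tie, sorting on a second key such as `Source` before dropping duplicates would make that choice explicit.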