Following up on:
How to use pandas to rename rows when they are the same in column A?
My data is:

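When rows with the same value in the Hospital column have different values in the GeneralRepresentation column, I want to use pandas to rename the hospital. If rows with the same Hospital value also share the same GeneralRepresentation value, the hospital should not be renamed. For hospitals that have no GeneralRepresentation, the hospital name should stay unchanged.
The result I want is: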

When I use Beny's code from "How to use pandas to rename rows when they are the same in column A?":
g = df.groupby('Hospital')['GeneralRepresentation']
s1 = g.transform(lambda x: x.factorize()[0] + 1).astype(str)
s2 = g.transform('nunique')
df['Hospital'] = np.where(s2 == 1, df['Hospital'], df['Hospital'] + '_' + s1)
the result is as follows:

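But what I want is for the hospital name to stay unchanged when the hospital has no GeneralRepresentation, as in the second picture. How can I modify this code to meet my requirement?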
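The behavior described above can be reproduced with a small frame. This is only a sketch: the sample data is an assumption, reconstructed from the output printed in the answer further down, because the original screenshots are not available. The name `renamed` is likewise introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, reconstructed from the output shown in the answer.
df = pd.DataFrame({
    "Hospital": ["a", "b", "b", "c", "c", "d", "t"],
    "GeneralRepresentation": ["a", "b", "c", "d", "e", np.nan, np.nan],
})

# The code from the question, unchanged apart from spacing.
g = df.groupby("Hospital")["GeneralRepresentation"]
s1 = g.transform(lambda x: x.factorize()[0] + 1).astype(str)
s2 = g.transform("nunique")

# Keep the result in a separate variable so the original frame is not modified.
renamed = np.where(s2 == 1, df["Hospital"], df["Hospital"] + "_" + s1)
print(list(renamed))
```

With this assumed data, the last two hospitals come out as d_0 and t_0 instead of d and t, which is the unwanted renaming.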
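Answered 2021-12-30 09:11:30
The problem is the missing values: factorize encodes a missing value as -1, so adding 1 gives 0 for the last 2 rows. In my solution, NaN is replaced with an empty string before the groupby to prevent this.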
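As a small illustration of that point, the sketch below looks at a single group whose values are all missing. The Series `missing` is a hypothetical stand-in for such a group, not data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical group in which every GeneralRepresentation is missing.
missing = pd.Series([np.nan, np.nan], dtype="object")

# factorize assigns the sentinel code -1 to missing values,
# and nunique skips them by default, so the group has 0 unique values.
codes_missing = missing.factorize()[0]
print(codes_missing + 1, missing.nunique())

# After filling with "", the empty string is an ordinary value:
# it gets code 0 and counts as one unique value.
filled = missing.fillna("")
codes_filled = filled.factorize()[0]
print(codes_filled + 1, filled.nunique())
```

Because the filled group has exactly one unique value, the `s2 == 1` test is true for it and the hospital name is left as it was.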
g = df.fillna({'GeneralRepresentation': ''}).groupby('Hospital')['GeneralRepresentation']
s1 = g.transform(lambda x: x.factorize()[0] + 1).astype(str)
s2 = g.transform('nunique')
df['Hospital'] = np.where(s2 == 1, df['Hospital'], df['Hospital'] + '_' + s1)
print(df)
  Hospital GeneralRepresentation
0        a                     a
1      b_1                     b
2      b_2                     c
3      c_1                     d
4      c_2                     e
5        d                   NaN
6        t                   NaN

Answered 2021-12-30 09:18:42
Use np.select(list of conditions, list of choices, default)
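For readers unfamiliar with np.select, here is a minimal sketch of how it picks values. The array and labels are made up for illustration and are unrelated to the hospital data.

```python
import numpy as np

# Hypothetical values to classify.
values = np.array([1, 5, 10, 20])

# Conditions are checked in order; the first one that is True wins.
conditions = [values < 3, values < 12]
choices = ["small", "medium"]

# Elements matching no condition receive the default.
labels = np.select(conditions, choices, default="large")
print(labels)
```

The value 1 satisfies both conditions but is labeled by the first, so the order of the condition list matters.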
a = ~(df['GeneralRepresentation'].str.contains(r'\w'))
b = (df['GeneralRepresentation'].str.contains(r'\w')) & (df['Hospital'].duplicated(keep=False)) & (df['GeneralRepresentation'].duplicated(keep=False))
df['Hospital'] = np.select([a, b], [df['Hospital'] + '_' + (df.groupby('Hospital').cumcount() + 1).astype(str), ''], df['Hospital'])
https://stackoverflow.com/questions/70529415