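When a value in the column named 'Keyword' is duplicated, I'm trying to overwrite the adjacent value in the column named 'Group'.
For example, because the string 'commercial office cleaning services' is duplicated, I'd like to overwrite the adjacent 'Group' value with 'commercial cleaning services'.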
Sample data

Desired output

Minimal reproducible example
import pandas as pd
data = [
["commercial cleaning services", "commercial cleaning services"],
["commercial office cleaning services", "commercial cleaning services"],
["janitorial cleaning services", "commercial cleaning services"],
["commercial office services", "commercial cleaning"],
]
df = pd.DataFrame(data, columns=["Keyword", "Group"])
print(df)

I'm new to pandas and don't know where to start. I've hit a dead end after Googling and searching Stack Overflow.
Posted on 2022-11-26 15:26:22
IIUC, use duplicated with mask and ffill:
# Is the keyword duplicated?
m = df['Keyword'].duplicated()
# Blank out 'Group' on the duplicated rows, then forward-fill from the row above
df['Group'] = df['Group'].mask(m).ffill()

Output:
print(df)
                               Keyword                         Group
0         commercial cleaning services  commercial cleaning services
1  commercial office cleaning services  commercial cleaning services
2         janitorial cleaning services  commercial cleaning services
3  commercial office cleaning services  commercial cleaning services

Note: the reproducible example does not match the image of the input.
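
To make the individual steps easier to follow, here is a minimal sketch that runs the same duplicated/mask/ffill chain one step at a time. The input is an assumption: the last row's keyword is changed to repeat row 1 so that it matches the output shown above (the question's posted example has no duplicated keyword), and its original 'Group' value is borrowed from the posted example. The groupby line at the end is not part of the answer; it is only a possible alternative for when a duplicate should take the group of its own first occurrence rather than of the row directly above it.

```python
import pandas as pd

# Assumed input: row 3 repeats the keyword from row 1. This differs from the
# question's posted example, whose last keyword is not a duplicate.
data = [
    ["commercial cleaning services", "commercial cleaning services"],
    ["commercial office cleaning services", "commercial cleaning services"],
    ["janitorial cleaning services", "commercial cleaning services"],
    ["commercial office cleaning services", "commercial cleaning"],
]
df = pd.DataFrame(data, columns=["Keyword", "Group"])

# Step 1: True for every repeat of a keyword after its first occurrence
m = df["Keyword"].duplicated()
print(m.tolist())

# Step 2: mask() replaces 'Group' with a missing value wherever m is True
masked = df["Group"].mask(m)
print(masked.isna().tolist())

# Alternative (an assumption, not from the answer): take the 'Group' of each
# keyword's first occurrence instead of the value from the previous row
alt = df.groupby("Keyword")["Group"].transform("first")

# Step 3: ffill() copies the last non-missing value downward
df["Group"] = masked.ffill()
print(df)
```

Because ffill copies from the row directly above, the result only matches the first occurrence's group when that happens to be the most recent non-duplicated value, as it is here. If the rows could be ordered differently, the groupby variant may be the safer choice.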
https://stackoverflow.com/questions/74583388