I have the following data:
import pandas as pd

df1 = pd.DataFrame({
    'sentence': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B'],
    'entity': ['Stay home', 'Stay home', 'WAY', 'WAY', 'Stay home', 'Go outside', 'Go outside', 'purpose'],
    'token': ['Severe weather', 'raining', 'smt', 'SMT0', 'Windy', 'Sunny', 'Good weather', 'smt'],
})
  sentence      entity           token
0        A   Stay home  Severe weather
1        A   Stay home         raining
2        A         WAY             smt
3        A         WAY            SMT0
4        A   Stay home           Windy
5        B  Go outside           Sunny
6        B  Go outside    Good weather
7        B     purpose             smt

I want to group by the values in the sentence column and create new columns for WAY and purpose when they appear in the entity column.
Expected output:
  sentence      entity                           token        Way  Purpose
0        A   Stay home  Severe weather, raining, Windy  smt, SMT0      NaN
1        B  Go outside             Sunny, Good weather        NaN      smt

Posted on 2022-11-22 12:27:20
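Use Series.isin to build a mask for boolean indexing. Rows that match the mask are reshaped with DataFrame.pivot_table, so each listed entity becomes its own column. Rows that do not match (selected by inverting the mask with ~) are grouped and their tokens aggregated with join. Finally, DataFrame.join attaches the pivoted columns to the grouped rows: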
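vals = ['WAY', 'purpose']
# True for rows whose entity should become its own column
m = df1['entity'].isin(vals)
# matching rows: one column per entity, tokens joined per sentence
df2 = df1[m].pivot_table(index='sentence', columns='entity', values='token', aggfunc=','.join)
# remaining rows: tokens joined per (sentence, entity) pair
df3 = df1[~m].groupby(['sentence', 'entity'])['token'].agg(', '.join).reset_index()
# match the sentence column of df3 against the index of df2
df = df3.join(df2, on='sentence')
print(df)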
  sentence      entity                           token       WAY  purpose
0        A   Stay home  Severe weather, raining, Windy  smt,SMT0      NaN
1        B  Go outside             Sunny, Good weather       NaN      smt

https://stackoverflow.com/questions/74532513
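
The question refers to the entities as Way and Purpose, while the values stored in df1 are 'WAY' and 'purpose', and Series.isin compares strings exactly. The following is a minimal sketch that wraps the same isin / pivot_table / groupby / join steps in a function and matches entity names case-insensitively. The function name pivot_selected_entities, its sep parameter, and the case-insensitive matching are assumptions made for illustration; they are not part of the answer.

```python
import pandas as pd


def pivot_selected_entities(df, vals, sep=", "):
    """Hypothetical helper: entities listed in `vals` become columns,
    every other entity stays as a row, and tokens are joined per group."""
    # Assumption: compare entity names case-insensitively.
    wanted = {v.lower() for v in vals}
    mask = df["entity"].str.lower().isin(wanted)

    # Matching rows: one column per entity, tokens joined per sentence.
    wide = df[mask].pivot_table(
        index="sentence", columns="entity", values="token", aggfunc=sep.join
    )
    # Non-matching rows: tokens joined per (sentence, entity) pair.
    rest = (
        df[~mask]
        .groupby(["sentence", "entity"])["token"]
        .agg(sep.join)
        .reset_index()
    )
    # on="sentence" matches the column in `rest` against the index of `wide`.
    return rest.join(wide, on="sentence")


df1 = pd.DataFrame({
    "sentence": ["A", "A", "A", "A", "A", "B", "B", "B"],
    "entity": ["Stay home", "Stay home", "WAY", "WAY", "Stay home",
               "Go outside", "Go outside", "purpose"],
    "token": ["Severe weather", "raining", "smt", "SMT0", "Windy",
              "Sunny", "Good weather", "smt"],
})

result = pivot_selected_entities(df1, ["Way", "Purpose"])
print(result)
```

The new columns keep the spelling found in the data (WAY and purpose), because pivot_table takes its column labels from the entity values. If a sentence had more than one non-matching entity, the pivoted values would repeat on each of its rows, since the join is keyed on sentence alone.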