
Create new columns grouped by the values of another column

Stack Overflow user
Asked on 2022-11-22 12:18:18
1 answer, 27 views, 0 followers, 1 vote

I have the following data:

import pandas as pd

df1 = pd.DataFrame({
    'sentence': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B'],
    'entity': ['Stay home', 'Stay home', 'WAY', 'WAY', 'Stay home',
               'Go outside', 'Go outside', 'purpose'],
    'token': ['Severe weather', 'raining', 'smt', 'SMT0', 'Windy',
              'Sunny', 'Good weather', 'smt'],
})


    sentence        entity      token
0   A               Stay home   Severe weather
1   A               Stay home   raining
2   A               WAY         smt
3   A               WAY         SMT0
4   A               Stay home   Windy
5   B               Go outside  Sunny
6   B               Go outside  Good weather
7   B               purpose     smt

I want to group by the values of sentence and create new columns whenever WAY or purpose appears in the entity column.

Expected output:

   sentence entity      token                          WAY       purpose
0   A        Stay home  Severe weather, raining, Windy smt, SMT0 NaN
1   B        Go outside Sunny, Good weather            NaN       smt

1 Answer

Stack Overflow user

Accepted answer

Posted on 2022-11-22 12:27:20

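Build a mask with Series.isin and use boolean indexing to split the rows. For the rows that do not match (mask inverted with ~), group and aggregate token with join. For the rows that match the list, use DataFrame.pivot_table to turn each entity into its own column, then attach that result with DataFrame.join: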

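# entity values that should become their own columns
vals = ['WAY', 'purpose']

# True for rows whose entity is one of vals
m = df1['entity'].isin(vals)

# matching rows: one column per entity, tokens joined per sentence
df2 = df1[m].pivot_table(index='sentence', columns='entity', values='token', aggfunc=','.join)
# remaining rows: tokens joined per (sentence, entity)
df3 = df1[~m].groupby(['sentence', 'entity'])['token'].agg(', '.join).reset_index()

# attach the pivoted columns by matching on sentence
df = df3.join(df2, on='sentence')
print(df)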
  sentence      entity                           token       WAY purpose
0        A   Stay home  Severe weather, raining, Windy  smt,SMT0     NaN
1        B  Go outside             Sunny, Good weather       NaN     smt
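
As a rough sketch, the same steps can be wrapped in a reusable function. The name spread_entities and the sep parameter are hypothetical and not part of the original answer. The reindex call is an added assumption, so that every value in vals gets a column even when it never occurs in the data. The sketch also assumes at least one row matches vals.

```python
import pandas as pd

df1 = pd.DataFrame({
    'sentence': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B'],
    'entity': ['Stay home', 'Stay home', 'WAY', 'WAY', 'Stay home',
               'Go outside', 'Go outside', 'purpose'],
    'token': ['Severe weather', 'raining', 'smt', 'SMT0', 'Windy',
              'Sunny', 'Good weather', 'smt'],
})


def spread_entities(df, vals, sep=', '):
    """Hypothetical helper: move the entities listed in vals into columns."""
    # True for rows whose entity should become a column
    m = df['entity'].isin(vals)

    # Matching rows: one column per entity, tokens joined per sentence
    wide = df[m].pivot_table(index='sentence', columns='entity',
                             values='token', aggfunc=sep.join)
    # Assumption: keep the order of vals and add NaN columns for absent values
    wide = wide.reindex(columns=vals)

    # Remaining rows: tokens joined per (sentence, entity)
    rest = (df[~m].groupby(['sentence', 'entity'])['token']
                  .agg(sep.join)
                  .reset_index())

    # Attach the pivoted columns by matching on sentence
    return rest.join(wide, on='sentence')


result = spread_entities(df1, ['WAY', 'purpose'])
print(result)
```

Because one separator is used for both aggregations here, the WAY cell for sentence A comes out as "smt, SMT0", which matches the spacing in the expected output of the question.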
Votes: 1
Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/74532513
