
Group by and apply a string from one entry to the whole group

Stack Overflow user
Asked 2018-07-24 10:34:17
Answers: 1 · Views: 34 · Followers: 0 · Votes: 1

I need to apply a string to a whole group, based on the one non-null value in that group. For example:

ID    name    surname  prsn_id
 A    john      smith  prsn_01
 A    john      smith      NaN
 A    john      smith      NaN
 A    john      smith      NaN
 B    mary      jane   prsn_02
 B    mary      jane       NaN
 B    mary      jane       NaN
 B    mary      jane       NaN
 B    mary      jane       NaN
 B    mary      jane       NaN
 B    mary      jane       NaN
 C    Barry   willis   prsn_03
 C    Barry   willis       NaN
 C    Barry   willis       NaN
 C    Barry   willis       NaN
 C    Barry   willis       NaN

The output should be:

ID    name    surname  prsn_id
 A    john      smith  prsn_01
 A    john      smith  prsn_01
 A    john      smith  prsn_01
 A    john      smith  prsn_01
 B    mary      jane   prsn_02
 B    mary      jane   prsn_02
 B    mary      jane   prsn_02
 B    mary      jane   prsn_02
 B    mary      jane   prsn_02
 B    mary      jane   prsn_02
 B    mary      jane   prsn_02
 C    Barry   willis   prsn_03
 C    Barry   willis   prsn_03
 C    Barry   willis   prsn_03
 C    Barry   willis   prsn_03
 C    Barry   willis   prsn_03

Or:

ID    name    surname  prsn_id    prsn_id_2
 A    john      smith  prsn_01          NaN
 A    john      smith      NaN      prsn_01
 A    john      smith      NaN      prsn_01
 A    john      smith      NaN      prsn_01
 B    mary      jane   prsn_02          NaN
 B    mary      jane       NaN      prsn_02
 B    mary      jane       NaN      prsn_02
 B    mary      jane       NaN      prsn_02
 B    mary      jane       NaN      prsn_02
 B    mary      jane       NaN      prsn_02
 B    mary      jane       NaN      prsn_02
 C    Barry   willis   prsn_03          NaN
 C    Barry   willis       NaN      prsn_03
 C    Barry   willis       NaN      prsn_03
 C    Barry   willis       NaN      prsn_03
 C    Barry   willis       NaN      prsn_03

I tried:

df['prsn_id_2'] = (df
                 .groupby(['ID', 'name', 'surname'])['prsn_id']
                 .fillna(method='ffill'))

This may work, but it takes a very long time, so it is not practical going forward. I need another solution that is vectorized and relatively fast.
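For reference, the attempt above can be reproduced on a small frame. This is only a sketch: the sample data is rebuilt by hand from the table in the question, and it calls the group-wise `ffill()` method directly, since newer pandas releases warn about passing `method=` to `fillna`. Note that a forward fill only gives the desired result when the non-null value is the first row of each group, as it is here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame, rebuilt by hand from the table in the question.
df = pd.DataFrame({
    "ID": ["A"] * 4 + ["B"] * 7 + ["C"] * 5,
    "name": ["john"] * 4 + ["mary"] * 7 + ["Barry"] * 5,
    "surname": ["smith"] * 4 + ["jane"] * 7 + ["willis"] * 5,
    "prsn_id": (["prsn_01"] + [np.nan] * 3
                + ["prsn_02"] + [np.nan] * 6
                + ["prsn_03"] + [np.nan] * 4),
})

# Forward-fill prsn_id within each (ID, name, surname) group.
# The result is aligned to df's original index, so it can be assigned directly.
df["prsn_id_2"] = df.groupby(["ID", "name", "surname"])["prsn_id"].ffill()
```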


1 Answer

Stack Overflow user

Accepted answer

Posted 2018-07-24 10:39:40

Use dropna to remove the rows where prsn_id is NaN, then left-join the result back onto the original rows with merge:

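# Keep only the rows that carry a prsn_id (one per group in this data)
df1 = df.dropna(subset=['prsn_id'])
# If a group may hold several non-null rows, keep one row per group instead:
# df1 = df.dropna(subset=['prsn_id']).drop_duplicates(['ID', 'name', 'surname'])
# Drop the sparse column, then left-join the per-group value back onto every row
df = df.drop('prsn_id', axis=1).merge(df1, on=['ID', 'name', 'surname'], how='left')
print(df)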
   ID   name surname  prsn_id
0   A   john   smith  prsn_01
1   A   john   smith  prsn_01
2   A   john   smith  prsn_01
3   A   john   smith  prsn_01
4   B   mary    jane  prsn_02
5   B   mary    jane  prsn_02
6   B   mary    jane  prsn_02
7   B   mary    jane  prsn_02
8   B   mary    jane  prsn_02
9   B   mary    jane  prsn_02
10  B   mary    jane  prsn_02
11  C  Barry  willis  prsn_03
12  C  Barry  willis  prsn_03
13  C  Barry  willis  prsn_03
14  C  Barry  willis  prsn_03
15  C  Barry  willis  prsn_03

Details:

print (df1)
   ID   name surname  prsn_id
0   A   john   smith  prsn_01
4   B   mary    jane  prsn_02
11  C  Barry  willis  prsn_03
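The dropna-then-merge steps can be sketched end to end as follows. The sample frame is rebuilt by hand from the question's table, and the names `keys`, `lookup`, and `filled` are illustrative rather than taken from the answer. The sketch applies the optional `drop_duplicates` step, which is a conservative choice: without it, a group with two non-null rows would multiply its rows in the join.

```python
import numpy as np
import pandas as pd

keys = ["ID", "name", "surname"]

# Hypothetical sample frame, rebuilt by hand from the table in the question.
df = pd.DataFrame({
    "ID": ["A"] * 4 + ["B"] * 7 + ["C"] * 5,
    "name": ["john"] * 4 + ["mary"] * 7 + ["Barry"] * 5,
    "surname": ["smith"] * 4 + ["jane"] * 7 + ["willis"] * 5,
    "prsn_id": (["prsn_01"] + [np.nan] * 3
                + ["prsn_02"] + [np.nan] * 6
                + ["prsn_03"] + [np.nan] * 4),
})

# One row per group holding its non-null prsn_id; drop_duplicates guards
# against a group that carries more than one filled row.
lookup = df.dropna(subset=["prsn_id"]).drop_duplicates(keys)

# The left join keeps every original row and attaches the group's prsn_id.
filled = df.drop(columns="prsn_id").merge(lookup, on=keys, how="left")
print(filled)
```

Because the lookup table has one row per group, the join does a single keyed match per row instead of a per-group Python-level fill, which is where the speed-up is expected to come from on larger frames.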
Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/51496429
