I have this data:
d = {
    'Primary area': [
        'Biological Sciences A',
        'Cultures and Cultural Production',
        'Mathematics'
    ],
    'Discipline': [
        'Biochemistry and Molecular Biology',
        'Philosophy',
        'Pure Mathematics'
    ]
}
import pandas as pd
df = pd.DataFrame(data=d)
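                           Discipline                      Primary area
0  Biochemistry and Molecular Biology             Biological Sciences A
1                          Philosophy  Cultures and Cultural Production
2                    Pure Mathematics                       Mathematics

I want to get a new column, "Mydisciplines", that takes an item from either "Discipline" or "Primary area" depending on the row. I have a list of words: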
Mydisciplines = ['Biological Sciences A', 'Mathematics', 'Philosophy']

I want to use this list to filter the two columns and then merge what remains, like this:
                           Discipline          Mydisciplines                      Primary area
0  Biochemistry and Molecular Biology  Biological Sciences A             Biological Sciences A
1                          Philosophy             Philosophy  Cultures and Cultural Production
2                    Pure Mathematics            Mathematics                       Mathematics

I tried a few things, but I could not put together code that does what I want. I am completely lost as to how to approach this.
Posted on 2018-04-13 13:01:02
I think you need extract, with all the values of Mydisciplines joined by | for regex OR, and \b for word boundaries:
Mydisciplines = ['Biological Sciences A', 'Mathematics', 'Philosophy']
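# wrap the alternatives in one group so that \b applies to every alternative,
# not only to the first and the last one
pat = r'\b({})\b'.format('|'.join(Mydisciplines))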
#join columns together
s = df['Discipline'] + ' ' + df['Primary area']
df['Mydisciplines'] = s.str.extract(pat, expand=False)
print (df)
                           Discipline                      Primary area  \
0  Biochemistry and Molecular Biology             Biological Sciences A
1                          Philosophy  Cultures and Cultural Production
2                    Pure Mathematics                       Mathematics

           Mydisciplines
0  Biological Sciences A
1             Philosophy
2            Mathematics

Posted on 2018-04-13 12:26:39
Here is what I did.
Create a list of all the disciplines that belong to a given primary area:
Biological_A = df[df["Primary area"] == 'Biological Sciences A'].Discipline.unique()
Mathematics = df[df["Primary area"] == 'Mathematics'].Discipline.unique()

Then replace the values in the Discipline column that appear in those lists:
for x in Biological_A:
    df.replace({'Discipline': {x: 'Biological Sciences A'}}, regex=True, inplace=True)
for x in Mathematics:
    df.replace({'Discipline': {x: 'Mathematics'}}, regex=True, inplace=True)

Repeat this for the other primary areas as needed.
This code turns

                           Discipline                      Primary area
0  Biochemistry and Molecular Biology             Biological Sciences A
1                          Philosophy  Cultures and Cultural Production
2                    Pure Mathematics                       Mathematics

into
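              Discipline                      Primary area
0  Biological Sciences A             Biological Sciences A
1             Philosophy  Cultures and Cultural Production
2       Pure Mathematics                       Mathematics

This does not really answer the question, since it does not create a new column, but it is what I actually needed, despite how the question was worded.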
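If a new column is wanted after all, as the question asks, one possible sketch (not taken from either answer) keeps each value only where it exactly matches an entry of Mydisciplines, then prefers the Discipline match and falls back to Primary area. The names from_discipline and from_area are made up for illustration. Exact matching is an assumption here: unlike the regex approach, it will not find a listed term inside a longer string.

```python
import pandas as pd

df = pd.DataFrame({
    'Primary area': [
        'Biological Sciences A',
        'Cultures and Cultural Production',
        'Mathematics',
    ],
    'Discipline': [
        'Biochemistry and Molecular Biology',
        'Philosophy',
        'Pure Mathematics',
    ],
})
Mydisciplines = ['Biological Sciences A', 'Mathematics', 'Philosophy']

# Keep a value only where it appears in the list; everything else becomes NaN.
# (from_discipline / from_area are hypothetical helper names.)
from_discipline = df['Discipline'].where(df['Discipline'].isin(Mydisciplines))
from_area = df['Primary area'].where(df['Primary area'].isin(Mydisciplines))

# Prefer the Discipline match; fall back to the Primary area match.
df['Mydisciplines'] = from_discipline.fillna(from_area)

print(df['Mydisciplines'].tolist())
# ['Biological Sciences A', 'Philosophy', 'Mathematics']
```

Rows where neither column matches the list end up as NaN in the new column, and the original Discipline column is left untouched.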
https://stackoverflow.com/questions/49799642