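I know that two dataframes can be merged with code like this:
left2.merge(right2, left_on='keyLeft', right_on='keyRight', how='inner')
The merge process is explained in detail in this link.
However, I have a dataframe whose text looks like this: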
Dataframe1
61 ability produce required documents based trans...
237 ability setup track definable error sources
440 ability log discrpeancies received vs shipped ...
1786 training education cover supply chain principl...
1931 system show estimated cost make reslotting mov...
Dataframe2
KeyWords
0 ability
1 require
2 document
3 base
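4 transportation
Now I want to join the dataframes, and I expect rows to be duplicated, because a word from Dataframe2 can appear in multiple rows of Dataframe1.
When I use a simple merge, every value in the Questions column comes back as NaN: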
input_file = Dataframe2.merge(Dataframe1, left_on='KeyWords', right_on='Questions', how = 'left')
KeyWords Questions
0 ability NaN
1 require NaN
2 document NaN
3 base NaN
4 transportation NaN
How can I join them so that I get the values? Thanks.
My expected output should look like this:
KeyWords Questions
ability ability produce required documents based trans...
ability ability setup track definable error sources
ability ability log discrpeancies received vs shipped ...
Posted on 2019-11-11 19:14:56
One way is to use pandas.Series.str.contains (in this snippet the text column of df1 is named 'Qs'):
df2['Questions'] = df2['KeyWords'].apply(lambda x: df1['Qs'][df1['Qs'].str.contains(x)].index.tolist())
print(df2)
Output:
KeyWords Questions
0 ability [61, 237, 440]
1 require [61]
2 document [61]
3 base [61]
4 transportation []
Posted on 2019-11-11 19:22:09
We can use Series.str.split + DataFrame.explode and then DataFrame.merge (DataFrame.explode requires pandas 0.25 or later):
df1['list_words']=df1['Questions'].str.split(' ')
new_df1=df1.explode('list_words')
df_merge=df2.merge(new_df1,left_on='KeyWords',right_on='list_words',how='inner').drop('list_words',axis=1)
print(df_merge)
Output:
KeyWords Questions
0 ability ability produce required documents based trans...
1 ability ability setup track definable error sources
2 ability ability log discrpeancies received vs shipped ...
https://stackoverflow.com/questions/58800125
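The two answers differ in how they match. The split + explode merge only pairs a keyword with an identical whole word, so in its output only 'ability' finds rows. The str.contains version matches substrings ('require' inside 'required') but returns index lists rather than duplicated rows. A minimal sketch that combines the two, giving one row per keyword/question pair with substring matching, might look like the following. The helper name match_keywords and the toy question strings are assumptions made for illustration, because the real texts are truncated in the post.

```python
import pandas as pd

# Toy data modelled on the question. The full question texts are truncated
# in the post, so these strings are made-up stand-ins.
df1 = pd.DataFrame(
    {
        "Questions": [
            "ability produce required documents based transportation rules",
            "ability setup track definable error sources",
            "training education cover supply chain principles",
        ]
    },
    index=[61, 237, 1786],
)
df2 = pd.DataFrame(
    {"KeyWords": ["ability", "require", "document", "base", "transportation"]}
)


def match_keywords(keywords, questions):
    """Return one row per (keyword, question) pair where the keyword
    occurs as a plain substring of the question text."""
    frames = []
    for kw in keywords["KeyWords"]:
        # regex=False treats the keyword as literal text, not as a pattern
        mask = questions["Questions"].str.contains(kw, regex=False)
        hits = questions.loc[mask, ["Questions"]].assign(KeyWords=kw)
        frames.append(hits[["KeyWords", "Questions"]])
    return pd.concat(frames)


result = match_keywords(df2, df1)
print(result)
```

The original row labels of df1 are kept as the index of the result, so each match can be traced back to its question. Substring matching also produces hits such as 'base' inside 'based', which may or may not be wanted. The loop scans every question once per keyword, so for large inputs the split + explode merge is likely to scale better.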