My data is:
import pandas as pd
a = pd.DataFrame({'sentences': ['i am here', 'bye bye', 'go back home quickly']})
When I use split, I can turn each string into a list of individual words:
a.loc[:, 'sentences1'] = a.loc[:, 'sentences'].astype(str).str.split(' ')
The result is:
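              sentences                 sentences1
0             i am here              [i, am, here]
1               bye bye                 [bye, bye]
2  go back home quickly  [go, back, home, quickly]
Now I want to combine the lists in the 'sentences1' column into a single list and then remove the duplicates, so that the result looks like:
[i, am, here, bye, go, back, home, quickly]
How can I do this?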
Answered on 2019-08-07 03:36:11
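You can use itertools.chain.from_iterable to flatten the list of lists, together with dict.fromkeys to drop the duplicates while preserving order:
import itertools
# Dedupe after flattening, so a word repeated in different sentences is also dropped.
# Plain dicts keep insertion order on Python 3.7+.
[*dict.fromkeys(itertools.chain.from_iterable(i.split() for i in a.sentences))]
Or use OrderedDict, which guarantees order on older Python versions as well:
from collections import OrderedDict
[*OrderedDict.fromkeys(itertools.chain.from_iterable(i.split() for i in a.sentences))]
Output:
['i', 'am', 'here', 'bye', 'go', 'back', 'home', 'quickly']
https://stackoverflow.com/questions/57386595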
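One thing worth checking with this kind of approach is whether duplicates are removed per sentence or across the whole column. Below is a minimal sketch, assuming the goal is a single order-preserving list of unique words over all rows. The helper name `unique_words` and the extra fourth sentence are made up for illustration and are not part of the original question. The last lines show what is probably the closest pandas-only equivalent, starting from the already-split `sentences1` column.

```python
import itertools

import pandas as pd


def unique_words(sentences):
    """Return every distinct word across all sentences, in first-seen order."""
    # Flatten first, then dedupe, so a word that appears in two different
    # sentences is kept only once. dict preserves insertion order (Python 3.7+).
    words = itertools.chain.from_iterable(str(s).split() for s in sentences)
    return list(dict.fromkeys(words))


# Hypothetical data: 'i', 'am' and 'home' each appear in more than one row.
a = pd.DataFrame(
    {'sentences': ['i am here', 'bye bye', 'go back home quickly', 'i am home']}
)

print(unique_words(a['sentences']))
# ['i', 'am', 'here', 'bye', 'go', 'back', 'home', 'quickly']

# Same idea with pandas only, starting from the split column.
# Series.explode needs pandas 0.25 or newer; unique() keeps first-seen order.
a['sentences1'] = a['sentences'].astype(str).str.split(' ')
print(a['sentences1'].explode().unique().tolist())
# ['i', 'am', 'here', 'bye', 'go', 'back', 'home', 'quickly']
```

If the words only need to be unique within each sentence, the dedupe step would go inside the loop instead; for the sample data in the question both variants give the same list.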