This is my first dataset, stored in a file named data 1.txt:
Keyword:Category
ccn:fintech
credit:fintech
smart:fintech

This is my second dataset, stored in a file named data 2.txt:
Keyword:Category
mcm:mcm
switching:switching
pul-sim:pulsa
transfer:transfer
debit sms:money transfer

What I want to get is:
Keyword    Category_all
mcm        mcm
switching  switching
pul-sim    pulsa
transfer   transfer
debit sms  money transfer
ccn        fintech
credit     fintech
smart      fintech

What I did is:
import pandas as pd

with open('entity_dict.txt') as f:  # bank.txt
    content = f.readlines()
content = [x.strip() for x in content]

def ambil(inp):
    # Return every keyword found in the description, or 'other' if none match
    try:
        out = []
        for x in content:
            if x in inp:
                out.append(x)
        if len(out) == 0:
            return 'other'
        else:
            output = ' '.join(out)
            return output
    except TypeError:  # non-string description, e.g. NaN
        return 'other'

frame_institution['Keyword'] = frame_institution['description'].apply(ambil)
fintech = pd.read_csv('bank.txt', sep=":")
frame_Keyword = pd.merge(frame_institution, fintech, on='Keyword')

Then, for data 2.txt, the code is:
with open('entity_dict2.txt') as f:
    content2 = f.readlines()
content2 = [x.strip() for x in content2]

def ambil2(inp):
    # Same matching as ambil, but against the second keyword list
    try:
        out = []
        for x in content2:
            if x in inp:
                out.append(x)
        if len(out) == 0:
            return 'other'
        else:
            output = ' '.join(out)
            return output
    except TypeError:  # non-string description, e.g. NaN
        return 'other'

frame_institution['Keyword2'] = frame_institution['description'].apply(ambil2)
fintech2 = pd.read_csv('bank2.txt', sep=":")
frame_Keyword2 = pd.merge(frame_institution, fintech, on='Keyword')
frame_Keyword2 = pd.merge(frame_Keyword2, fintech2, on='Keyword2')

Then I filter on some of the keywords:
frame_Keyword2[frame_Keyword2['category_all'] == 'pulsa']

The actual result is:
Keyword    Category_all
mcm        mcm
switching  switching
ccn        fintech
credit     fintech
smart      fintech

But 'pulsa', 'transfer' and 'money transfer' do not appear in Category_all. I think there must be a better way to solve this.
Posted on 2019-01-16 16:17:25
Just try using merge:
DataFrame 1:
>>> df1
  Keyword Category
0     ccn  fintech
1  credit  fintech
2   smart  fintech

DataFrame 2:
>>> df2
     Keyword        Category
0        mcm             mcm
1  switching       switching
2    pul-sim           pulsa
3   transfer        transfer
4  debit sms  money transfer

The result, with an outer merge:
>>> pd.merge(df1, df2, how='outer')
     Keyword        Category
0        ccn         fintech
1     credit         fintech
2      smart         fintech
3        mcm             mcm
4  switching       switching
5    pul-sim           pulsa
6   transfer        transfer
7  debit sms  money transfer

Other solutions, added below for posterity in case someone lands here with a similar question:
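Using the DataFrame.append() method (deprecated in pandas 1.4 and removed in pandas 2.0, so prefer pd.concat() on current versions):
df1.append(df2, ignore_index=True)

Using pd.concat():
pd.concat([df1, df2], ignore_index=True)

Or create a list of frames, then concatenate:
frames = [df1, df2]
pd.concat(frames, ignore_index=True)

https://stackoverflow.com/questions/54212718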
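For anyone who wants to go from the two colon-separated files straight to the table the question asks for, here is a minimal sketch. It assumes both files share the Keyword:Category header shown in the question. The in-memory buffers data1 and data2 stand in for the real files, and the names lookup and pulsa_rows are made up for this illustration.

```python
import io

import pandas as pd

# Stand-ins for the two files from the question (assumption: in practice
# you would pass the real file paths to pd.read_csv instead).
data1 = io.StringIO(
    "Keyword:Category\nccn:fintech\ncredit:fintech\nsmart:fintech\n"
)
data2 = io.StringIO(
    "Keyword:Category\nmcm:mcm\nswitching:switching\npul-sim:pulsa\n"
    "transfer:transfer\ndebit sms:money transfer\n"
)

df1 = pd.read_csv(data1, sep=":")
df2 = pd.read_csv(data2, sep=":")

# Stack the second table above the first, matching the row order of the
# desired output, and rename the column to Category_all.
lookup = (
    pd.concat([df2, df1], ignore_index=True)
    .rename(columns={"Category": "Category_all"})
)

# Filtering on a category from the second file now returns a row.
pulsa_rows = lookup[lookup["Category_all"] == "pulsa"]

print(lookup)
print(pulsa_rows)
```

Because the tables are stacked rather than inner-joined, no keyword is dropped for lacking a match in the other file, which is probably why 'pulsa', 'transfer' and 'money transfer' were missing in the question's result.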