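Hi, I need to count how many times each medication is taken by a patient per day. Patients take several different medications each day, and the doses differ as well. The initial data looks like this: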
import numpy as np
import pandas as pd

df_data = {'med1': ['Prednisolone', 'Prednisolone', 'Folic acid', 'Folic acid', 'Prednisolone', 'Enbrel', 'Prednisolone'],
           'med2': [np.nan, np.nan, 'Folic acid', 'Folic acid', np.nan, 'Methotrexate pill', np.nan],
           'med3': [np.nan, np.nan, 'Prednisolone', 'Prednisolone', np.nan, 'Prednisolone', np.nan]}
df_data = pd.DataFrame(df_data)
df_data
   med1          med2               med3
------------------------------------------------
0  Prednisolone  NaN                NaN
1  Prednisolone  NaN                NaN
2  Folic acid    Folic acid         Prednisolone
3  Folic acid    Folic acid         Prednisolone
4  Prednisolone  NaN                NaN
5  Enbrel        Methotrexate pill  Prednisolone
6  Prednisolone  NaN                NaN

What I would like is a new column for each medication holding the number of times it appears in that row. I would like it to look like this:
   med1          med2               med3          Prednisolone  Folic acid  Enbrel  Methotrexate pill
-----------------------------------------------------------------------------------------------------
0  Prednisolone  NaN                NaN           1             0           0       0
1  Prednisolone  NaN                NaN           1             0           0       0
2  Folic acid    Folic acid         Prednisolone  1             2           0       0
3  Folic acid    Folic acid         Prednisolone  1             2           0       0
4  Prednisolone  NaN                NaN           1             0           0       0
5  Enbrel        Methotrexate pill  Prednisolone  1             0           1       1
6  Prednisolone  NaN                NaN           1             0           0       0

I don't know how to do this. One-hot encode each column and then sum? Are there any simpler suggestions?
Posted on 2020-07-09 05:16:58
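We can use stack + str.get_dummies, then sum the indicator columns per original row:
# stack() yields one long Series indexed by (row, column);
# grouping on index level 0 adds the indicators back up per row
s = df_data.stack().str.get_dummies().groupby(level=0).sum()
s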
   Enbrel  Folic acid  Methotrexate pill  Prednisolone
0       0           0                  0             1
1       0           0                  0             1
2       0           2                  0             1
3       0           2                  0             1
4       0           0                  0             1
5       1           0                  1             1
6       0           0                  0             1
df_data = df_data.join(s)

https://stackoverflow.com/questions/62803677
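
For anyone who wants to see what each step of the one-liner is doing, here is a minimal end-to-end sketch that rebuilds the sample frame from the question and splits the chain into named steps. The intermediate names (`long_form`, `dummies`, `counts`, `result`) are only for illustration and are not part of the original answer. It uses `groupby(level=0).sum()` to aggregate per row, because the `level` argument of `sum` is not available in recent pandas releases.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df_data = pd.DataFrame({
    'med1': ['Prednisolone', 'Prednisolone', 'Folic acid', 'Folic acid',
             'Prednisolone', 'Enbrel', 'Prednisolone'],
    'med2': [np.nan, np.nan, 'Folic acid', 'Folic acid',
             np.nan, 'Methotrexate pill', np.nan],
    'med3': [np.nan, np.nan, 'Prednisolone', 'Prednisolone',
             np.nan, 'Prednisolone', np.nan],
})

# Step 1: reshape the three medication columns into one long Series
# whose index is (original row label, original column name).
long_form = df_data.stack()

# Step 2: build one 0/1 indicator column per distinct medication name.
# Missing values, if any survive the reshape, produce all-zero rows.
dummies = long_form.str.get_dummies()

# Step 3: add the indicators back up per original row (index level 0),
# which turns them into per-row counts.
counts = dummies.groupby(level=0).sum()

# Step 4: attach the count columns to the original frame.
result = df_data.join(counts)
print(result)
```

The count columns come out in alphabetical order. If the order shown in the question matters, something like `counts[['Prednisolone', 'Folic acid', 'Enbrel', 'Methotrexate pill']]` before the join should rearrange them.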