I have the following pandas DataFrame.
       epi_week state loc_type    disease  cases  incidence
21835    200011    WY    STATE      MUMPS      2       0.40
21836    197501    WY    STATE      POLIO      3       0.76
21837    199607    WY    STATE  HEPATITIS      3       0.61
21838    197116    WY    STATE      MUMPS      6       1.73
21839    200048    WY    STATE  HEPATITIS      6       1.21

I am trying to replace each disease with a unique integer. For example, 'MUMPS'==1, 'POLIO'==2, and so on. The final DataFrame should look like this:
       epi_week state loc_type  disease  cases  incidence
21835    200011    WY    STATE        1      2       0.40
21836    197501    WY    STATE        2      3       0.76
21837    199607    WY    STATE        3      3       0.61
21838    197116    WY    STATE        1      6       1.73
21839    200048    WY    STATE        3      6       1.21

I am using the following code:
# creating a dictionary
disease_dic = {'MUMPS': 1, 'POLIO': 2, 'MEASLES': 3, 'RUBELLA': 4,
               'PERTUSSIS': 5, 'HEPATITIS A': 6, 'SMALLPOX': 7,
               'DIPHTHERIA': 8}
data.disease = [disease_dic[item] for item in data.disease]

I get the following error:
KeyError                                  Traceback (most recent call last)
<ipython-input-115-52394901c90d> in <module>()
----> 1 cdc.disease = [disease_dic[item2] for item2 in cdc.disease]
KeyError: 1

Can anyone help me solve this problem? Thanks.
Answered 2018-04-24 13:25:23
Use apply.
Example:
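import pandas as pd

disease_dic = {'MUMPS': 1, 'POLIO': 2, 'MEASLES': 3, 'RUBELLA': 4,
               'PERTUSSIS': 5, 'HEPATITIS A': 6, 'SMALLPOX': 7,
               'DIPHTHERIA': 8}
# list() turns the keys view into a plain list, which also works on Python 3
df = pd.DataFrame({"disease": list(disease_dic.keys())})
# dict.get returns None for a name missing from the dictionary instead of raising KeyError
print(df["disease"].apply(lambda x: disease_dic.get(x)))

Output (row order follows the dictionary's iteration order):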
0 4
1 2
2 1
3 8
4 3
5 5
6 7
7 6
Name: disease, dtype: int64

https://stackoverflow.com/questions/50002738
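
A note on the error in the question: `KeyError: 1` means the key being looked up was the integer 1, which suggests (this is an assumption, the thread does not confirm it) that the column had already been converted by an earlier run of the same cell. A minimal sketch of an alternative using `Series.map`, writing to a new column so the original names are kept and re-running does no harm. The small `data` frame and the `disease_code` column name are illustrative stand-ins, not the asker's actual data.

```python
import pandas as pd

disease_dic = {'MUMPS': 1, 'POLIO': 2, 'MEASLES': 3, 'RUBELLA': 4,
               'PERTUSSIS': 5, 'HEPATITIS A': 6, 'SMALLPOX': 7,
               'DIPHTHERIA': 8}

# Stand-in for the question's DataFrame (disease and cases copied from the sample rows)
data = pd.DataFrame({
    "disease": ["MUMPS", "POLIO", "HEPATITIS", "MUMPS", "HEPATITIS"],
    "cases": [2, 3, 3, 6, 6],
})

# Map names to codes in a NEW column (hypothetical name) so the source column is untouched
data["disease_code"] = data["disease"].map(disease_dic)

# Names absent from the dictionary become NaN instead of raising KeyError
unmapped = data.loc[data["disease_code"].isna(), "disease"].unique().tolist()

print(data)
print("Not in dictionary:", unmapped)
```

In this sample, 'HEPATITIS' has no entry in the dictionary (which only has 'HEPATITIS A'), so those rows come back as NaN and the code column is float. Checking `unmapped` before overwriting anything makes such mismatches visible.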