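I have a dataset with both categorical and numerical columns. I want to convert the categorical data to numbers, mapping each category to a specific numeric value. For example, the "Education" column contains Highschool, Undergraduate, Graduate, PHD, and so on. I would really appreciate it if someone could give me code that maps each category to an arbitrary numeric value.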
Posted on 2021-04-09 09:20:50
import pandas as pd
df = pd.DataFrame(["Highschool", "Undergraduate","Highschool" ,"Graduate", "PHD", "Graduate", "Graduate","Undergraduate"],columns = ["Education"])
df_transformed = pd.get_dummies(df)
df_transformed.head()
Output:
   Education_Graduate  Education_Highschool  Education_PHD  Education_Undergraduate
0                   0                     1              0                        0
1                   0                     0              0                        1
2                   0                     1              0                        0
3                   1                     0              0                        0
4                   0                     0              1                        0
# Label encoding
from sklearn import preprocessing
encoder = preprocessing.LabelEncoder()
encoder.fit(df["Education"].values)
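# Pass any list of categories seen during fit; each one is replaced by its integer label.
# LabelEncoder numbers the classes in sorted order, so this sample list returns array([3, 1, 0]).
encoder.transform(["Undergraduate", "Highschool", "Graduate"])
https://stackoverflow.com/questions/67013766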
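Neither approach above lets you pick the numbers yourself: get_dummies produces one indicator column per category, and LabelEncoder assigns integers in sorted order. If you need each category mapped to a specific value, one option is a plain dict passed to Series.map. The following is only a minimal sketch; the name education_levels, the new column name Education_code, and the numbers themselves are assumptions chosen for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    ["Highschool", "Undergraduate", "Highschool", "Graduate",
     "PHD", "Graduate", "Graduate", "Undergraduate"],
    columns=["Education"],
)

# Hypothetical mapping: the values are arbitrary, replace them with whatever you need.
education_levels = {"Highschool": 1, "Undergraduate": 2, "Graduate": 3, "PHD": 4}

# Series.map looks each value up in the dict.
# Categories that are missing from the dict become NaN.
df["Education_code"] = df["Education"].map(education_levels)

print(df)
```

Because a category missing from the dict silently becomes NaN, it may be worth checking df["Education_code"].isna().any() after mapping.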