I am standardizing my dataset:
def standardization(new_df2, labelcol):
    from sklearn.preprocessing import StandardScaler
    labels = new_df2[labelcol]
    del new_df2[labelcol]
    scaled_features = StandardScaler().fit_transform(new_df2.values)
    new_df3 = pd.DataFrame(scaled_features, index = new_df2, columns = new_df2.columns)
    new_df3[labelcol] = labels
    return new_df3

labelcol = new_df2.population #population is one of the columns in dataframe
new_df3 = standardization(new_df2, labelcol)
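print(new_df3)

I get the following error:

KeyError: '[ 322. 2401. 496. ..., 1007. 741. 1387.] not in index'

As far as I can tell, 322, 2401, ... are values from the population column.
How do I get rid of this error, and what does it mean?
P.S. new_df2.shape = (20640, 14) and labelcol.shape = (20640,)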
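
The failing lookup can be reproduced in isolation. The following is a minimal sketch on a made-up three-row frame (the `households` column and all values are invented for illustration, and the exact wording of the message depends on the pandas version): indexing with the column itself makes pandas treat each value as a column label.

```python
import pandas as pd

# Toy frame; the column names and values are invented for illustration only.
df = pd.DataFrame({"population": [322.0, 2401.0, 496.0],
                   "households": [126.0, 1138.0, 177.0]})

labelcol = df.population      # a Series of values, not a column name

try:
    df[labelcol]              # pandas looks for columns named 322.0, 2401.0, ...
    raised = False
except KeyError as err:
    raised = True
    print("KeyError:", err)

# Passing the column *name* selects the column as intended.
labels = df["population"]
```
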
Posted on 2017-12-19 23:59:54
The code below solved my problem:
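import pandas as pd
from sklearn.preprocessing import StandardScaler

def standardization(new_df2, labelcol):
    # labelcol is the column name (e.g. 'population'), not the column itself
    dflabel = new_df2[[labelcol]]
    std_df = new_df2.drop(labelcol, axis=1)
    scaled_features = StandardScaler().fit_transform(std_df.values)
    # keep the original index so concat lines the rows up
    new_df3 = pd.DataFrame(scaled_features, index=std_df.index, columns=std_df.columns)
    new_df3 = pd.concat([dflabel, new_df3], axis=1)
    return new_df3

Thanks to those who tried to help.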
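
As for what the error means: in the question, `labelcol` was the column itself (`new_df2.population`), so `new_df2[labelcol]` asked pandas for columns named 322, 2401, and so on, which do not exist. The working version expects the column name as a string. Below is a minimal sketch of calling such a function that way. The helper name `standardize_except`, the `feature_a`/`feature_b` columns, the toy values, and the non-default index are all assumptions made for illustration, not part of the original post.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

def standardize_except(df, labelcol):
    """Scale every column except `labelcol` (a column name) to zero mean and unit variance."""
    features = df.drop(columns=[labelcol])
    scaled = StandardScaler().fit_transform(features.values)
    # Reuse the original index so the label column lines up row by row.
    out = pd.DataFrame(scaled, index=features.index, columns=features.columns)
    out[labelcol] = df[labelcol]
    return out

# Toy data, invented for illustration.
toy = pd.DataFrame(
    {"population": [322.0, 2401.0, 496.0, 558.0],
     "feature_a": [126.0, 1138.0, 177.0, 219.0],
     "feature_b": [8.3, 8.3, 7.3, 5.6]},
    index=[10, 20, 30, 40],
)

result = standardize_except(toy, "population")   # pass the name, not toy.population
print(result)
```

Passing the index through explicitly matters only when the frame does not use the default 0..n-1 index; without it, combining the scaled frame with the label column could misalign rows and introduce NaN values.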
https://stackoverflow.com/questions/47601936