
ValueError: could not convert string to float: sklearn

Stack Overflow user
Asked 2020-02-10 01:15:22
Answers: 1 · Views: 5.3K · Followers: 0 · Votes: 1

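Recently I ran into an unexpected error while working with a dataset in Python. The error is: ValueError: could not convert string to float. The dataset does contain text data, which I converted to integers with LabelEncoder. But when I fit the model in the training step I still get this error, which makes no sense to me.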

Code:

import sklearn
from sklearn import model_selection
from sklearn import linear_model
from sklearn import preprocessing
import pandas as pd
import pickle
import numpy as np
data = pd.read_csv("house_train.csv")
data = data.fillna(value=0)
dataX_train = data.drop(["SalePrice"], axis = 1)
dataX_test = data.SalePrice


le = preprocessing.LabelEncoder()

dataX_train.MSZoning = le.fit_transform(list(data["MSZoning"]))
dataX_train.Street = le.fit_transform(list(data["Street"]))
dataX_train.Alley = le.fit_transform(list(data["Alley"]))
dataX_train.LotShape = le.fit_transform(list(data["LotShape"]))
dataX_train.LandContour = le.fit_transform(list(data["LandContour"]))
dataX_train.Utilities = le.fit_transform(list(data["Utilities"]))
dataX_train.LotConfig = le.fit_transform(list(data["LotConfig"]))
dataX_train.LandSlope = le.fit_transform(list(data["LandSlope"]))
dataX_train.Neighborhood = le.fit_transform(list(data["Neighborhood"]))
dataX_train.Condition1 = le.fit_transform(list(data["Condition1"]))
dataX_train.Condition2 = le.fit_transform(list(data["Condition2"]))
dataX_train.BldgType = le.fit_transform(list(data["BldgType"]))
dataX_train.HouseStyle = le.fit_transform(list(data["HouseStyle"]))
dataX_train.RoofStyle = le.fit_transform(list(data["RoofStyle"]))
dataX_train.RoofMatl = le.fit_transform(list(data["RoofMatl"]))
dataX_train.Exterior1st = le.fit_transform(list(data["Exterior1st"]))
dataX_train.Exterior2nd = le.fit_transform(list(data["Exterior2nd"]))
dataX_train.MasVnrType = le.fit_transform(list(data["MasVnrType"]))
dataX_train.ExterQual = le.fit_transform(list(data["ExterQual"]))
dataX_train.ExterCond = le.fit_transform(list(data["ExterCond"]))
dataX_train.Foundation = le.fit_transform(list(data["Foundation"]))
dataX_train.BsmtQual = le.fit_transform(list(data["BsmtQual"]))
dataX_train.BsmtExposure = le.fit_transform(list(data["BsmtExposure"]))
dataX_train.BsmtFinType1 = le.fit_transform(list(data["BsmtFinType1"]))
dataX_train.BsmtFinType2 = le.fit_transform(list(data["BsmtFinType2"]))
dataX_train.Heating = le.fit_transform(list(data["Heating"]))
dataX_train.HeatingQC = le.fit_transform(list(data["HeatingQC"]))
dataX_train.CentralAir = le.fit_transform(list(data["CentralAir"]))
dataX_train.Electrical = le.fit_transform(list(data["Electrical"]))
dataX_train.KitchenQual = le.fit_transform(list(data["KitchenQual"]))
dataX_train.Functional = le.fit_transform(list(data["Functional"]))
dataX_train.FireplaceQu = le.fit_transform(list(data["FireplaceQu"]))
dataX_train.GarageType = le.fit_transform(list(data["GarageType"]))
dataX_train.GarageFinish = le.fit_transform(list(data["GarageFinish"]))
dataX_train.GarageQual = le.fit_transform(list(data["GarageQual"]))
dataX_train.GarageCond = le.fit_transform(list(data["GarageCond"]))
dataX_train.PavedDrive = le.fit_transform(list(data["PavedDrive"]))
dataX_train.PoolQC = le.fit_transform(list(data["PoolQC"]))
dataX_train.Fence = le.fit_transform(list(data["Fence"]))
dataX_train.MiscFeature = le.fit_transform(list(data["MiscFeature"]))
dataX_train.SaleType = le.fit_transform(list(data["SaleType"]))
dataX_train.SaleCondition = le.fit_transform(list(data["SaleCondition"]))


best = 0

x_train, x_test, y_train, y_test = model_selection.train_test_split(dataX_train, dataX_test, 
test_size = 0.2)
clf = linear_model.LinearRegression()
clf.fit(x_train, y_train)
acc = clf.score(x_test, y_test)
if acc > best:
   best = acc
   with open("housingmodel.pickle", "wb") as f:
      pickle.dump(clf , f)
print(acc)

1 Answer

Stack Overflow user

Answered 2020-02-16 16:56:03

First, check whether you have really encoded every feature in dataX_train. I think you missed something.
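To illustrate why a single missed column is enough, here is a minimal sketch on a toy frame. The column names "Area" and "Quality" and all values are made up for illustration; "Quality" stands in for any text column that was left unencoded.

```python
import pandas as pd
from sklearn import linear_model

# Hypothetical toy data: "Quality" plays the role of a text column
# that was never passed through an encoder.
X = pd.DataFrame({
    "Area": [50.0, 80.0, 120.0],
    "Quality": ["TA", "Gd", "Ex"],
})
y = [100.0, 160.0, 240.0]

try:
    # fit() tries to convert the whole frame to floats and fails on the strings.
    linear_model.LinearRegression().fit(X, y)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

So even if dozens of columns were encoded correctly, one remaining text column should be enough to make fit() fail this way.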

Try dataX_train.dtypes and check whether any non-numeric columns remain, then use to_numeric on those columns. For example:

dataX_train['NonNumericCol'] = pd.to_numeric(dataX_train['NonNumericCol'])
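One caveat: pd.to_numeric only parses strings that look like numbers (such as "3.5"). For category labels such as "Pave" it raises an error by default, so those columns need an encoder instead. A minimal sketch of finding and encoding whatever text columns are left, rather than listing each column by hand, might look like this. The frame `features` is a hypothetical stand-in for dataX_train, and "ExtraTextCol" is an invented name for a column that was overlooked.

```python
import pandas as pd
from sklearn import preprocessing

# Hypothetical stand-in for dataX_train: one numeric and two text columns.
features = pd.DataFrame({
    "LotArea": [8450, 9600, 11250],
    "Street": ["Pave", "Pave", "Grvl"],
    "ExtraTextCol": ["TA", "Gd", "TA"],  # invented name for a missed column
})

# Every column whose dtype is not numeric still needs encoding.
non_numeric_cols = features.select_dtypes(exclude="number").columns.tolist()
print(non_numeric_cols)

for col in non_numeric_cols:
    # A fresh encoder per column; astype(str) keeps the values uniformly
    # strings even if fillna(0) mixed integers into a text column.
    encoder = preprocessing.LabelEncoder()
    features[col] = encoder.fit_transform(features[col].astype(str))

# Should be empty once every text column has been encoded.
remaining = features.select_dtypes(exclude="number").columns.tolist()
print(remaining)
```

Looping over select_dtypes means a column cannot be skipped by accident, which seems to be what happened in the question.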
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/60139200
