I am running Python 3.5.2 on a Macbook with OSX 10.2.1 (Sierra).
While attempting to run some code for the Titanic dataset from Kaggle, I keep receiving the following error:
NotFittedError                            Traceback (most recent call last)
in <module>()
      6
      7 # Make your prediction using the test set and print them.
----> 8 my_prediction = my_tree_one.predict(test_features)
      9 print(my_prediction)
     10

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/tree/tree.py in predict(self, X, check_input)
    429         """
    430
--> 431         X = self._validate_X_predict(X, check_input)
    432         proba = self.tree_.predict(X)
    433         n_samples = X.shape[0]

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/tree/tree.py in _validate_X_predict(self, X, check_input)
    386         """Validate X whenever one tries to predict, apply, predict_proba"""
    387         if self.tree_ is None:
--> 388             raise NotFittedError("Estimator not fitted, "
    389                                  "call `fit` before exploiting the model.")

NotFittedError: Estimator not fitted, call `fit` before exploiting the model.
The offending code seems to be this:
# Impute the missing value with the median
test.loc[152, "Fare"] = test["Fare"].median()
# Extract the features from the test set: Pclass, Sex, Age, and Fare.
test_features = test[["Pclass", "Sex", "Age", "Fare"]].values
# Make your prediction using the test set and print them.
my_prediction = my_tree_one.predict(test_features)
print(my_prediction)
# Create a data frame with two columns: PassengerId & Survived. Survived contains your predictions
PassengerId = np.array(test["PassengerId"]).astype(int)
my_solution = pd.DataFrame(my_prediction, PassengerId, columns = ["Survived"])
print(my_solution)
# Check that your data frame has 418 entries
print(my_solution.shape)
# Write your solution to a csv file with the name my_solution.csv
my_solution.to_csv("my_solution_one.csv", index_label = ["PassengerId"])
Here is a link to the rest of the code.
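Since I have already called the 'fit' function, I cannot understand this error message. Where am I going wrong? Thanks for your time.
Edit: It turns out the problem was inherited from the previous block of code.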
# Fit your first decision tree: my_tree_one
my_tree_one = tree.DecisionTreeClassifier()
my_tree_one = my_tree_one.fit(features_one, target)
# Look at the importance and score of the included features
print(my_tree_one.feature_importances_)
print(my_tree_one.score(features_one, target))
The line: my_tree_one = my_tree_one.fit(features_one, target)
generates the error:
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
Answered on 2016-12-03 10:29:59
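The error is self-explanatory: the features_one array or the target array contains NaNs or infinite values, so the estimator fails to fit, and therefore you cannot use it for prediction later.
Check these arrays and handle the NaN values accordingly before fitting.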
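As a rough illustration of that check, here is a minimal sketch using pandas and NumPy. The small `train` frame and its values are hypothetical stand-ins for the Kaggle training data, the `feature_cols` list is my own choice (the `Sex` column is left out because it would first need to be encoded as numbers), and filling gaps with the column median is only one possible strategy, not necessarily the right one for your data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Titanic training data; the real frame
# would be loaded from the Kaggle CSV file instead.
train = pd.DataFrame({
    "Pclass": [3, 1, 3, 2],
    "Age": [22.0, np.nan, 26.0, 35.0],
    "Fare": [7.25, 71.28, np.nan, 13.0],
    "Survived": [0, 1, 1, 0],
})

# Assumed feature list for this sketch (numeric columns only).
feature_cols = ["Pclass", "Age", "Fare"]

# Count missing values per column to see which features need attention.
nan_counts = train[feature_cols].isnull().sum()
print(nan_counts)

# Fill each gap with that column's median (one possible strategy).
train[feature_cols] = train[feature_cols].fillna(train[feature_cols].median())

features_one = train[feature_cols].values
target = train["Survived"].values

# Confirm that nothing non-finite remains before calling fit.
all_finite = bool(np.isfinite(features_one).all())
print(all_finite)
```

Once `all_finite` is true for the real arrays, `my_tree_one.fit(features_one, target)` should no longer raise the ValueError, and the later `predict` call should then find a fitted estimator. The test features need the same treatment before they are passed to `predict`.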
https://stackoverflow.com/questions/40937543