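I have saved a trained model and a test dataset, and I want to reload them just to verify that I get the same results, so that I can use the model in the future (I have no new data to test with at the moment). The CSV I saved contains no labels; it is the same test data that worked fine in the original train/test run.
I created the model like this: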
# copy split data for this model
dtc_test_X = test_X
dtc_test_y = test_y
dtc_train_X = train_X
dtc_train_y = train_y
# initialize the model
dtc = DecisionTreeClassifier(random_state = 1)
# fit the training data and predict on the test data
dtc_yhat = dtc.fit(dtc_train_X, dtc_train_y).predict(dtc_test_X)
# scikit-learn's accuracy scoring
acc = accuracy_score(dtc_test_y, dtc_yhat)
# scikit-learn's Jaccard Index
jacc = jaccard_similarity_score(dtc_test_y, dtc_yhat)
# scikit-learn's classification report
class_report = classification_report(dtc_test_y, dtc_yhat)
I then saved the model and the data as follows:
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.externals import joblib
# set up the pipeline
pipe = make_pipeline(DecisionTreeClassifier)
# save the model
joblib.dump(pipe, 'model.pkl')
dtc_test_X.to_csv('set_to_predict.csv')
When I reload the model and try to predict like this:
#Loading the saved model with joblib
pipe = joblib.load('model.pkl')
# New data to predict
pr = pd.read_csv('set_to_predict.csv')
pred_cols = list(pr.columns.values)
pred_cols
# apply the whole pipeline to data
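pred = pd.Series(pipe.predict(pr[pred_cols]))
the last line (the prediction) raises an exception:
TypeError: predict() missing 1 required positional argument: 'X'
When searching for an answer, I could only find examples of a similar exception that mention y instead of X, and those answers did not seem to apply. Why am I getting this error?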
Posted on 2019-08-14 03:08:05
Try replacing pipe.predict(pr[pred_cols]) with pipe.predict(X=pr[pred_cols]) and see whether it works or whether it throws a different error.
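The error message suggests that the pipeline holds the DecisionTreeClassifier class itself rather than the fitted dtc instance: make_pipeline(DecisionTreeClassifier) passes the class, so predict receives the data in place of self and no argument is left for X. Below is a minimal sketch of one way around this. It makes three assumptions that are not in the question: the iris dataset stands in for the original data, the standalone joblib package replaces sklearn.externals.joblib, and index_col=0 is used on reload because to_csv writes the index as an extra column. Treat it as an illustration under those assumptions, not a verified fix for the original setup.

```python
import os
import tempfile

import joblib  # assumption: standalone joblib instead of sklearn.externals.joblib
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): iris replaces the question's train/test split.
X, y = load_iris(return_X_y=True, as_frame=True)
train_X, test_X, train_y, test_y = train_test_split(X, y, random_state=1)

# Calling predict on the class (not an instance) binds the data to `self`,
# so Python reports that the X argument is missing.
try:
    DecisionTreeClassifier.predict(test_X)
    class_call_failed = False
except TypeError as exc:
    class_call_failed = True
    print("class-level call:", exc)

# Put an *instance* into the pipeline and fit it before saving.
dtc = DecisionTreeClassifier(random_state=1)
pipe = make_pipeline(dtc)
pipe.fit(train_X, train_y)
original_pred = pipe.predict(test_X)

with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.pkl")
    csv_path = os.path.join(tmp, "set_to_predict.csv")

    joblib.dump(pipe, model_path)
    test_X.to_csv(csv_path)  # the index is written as the first column

    reloaded = joblib.load(model_path)
    # index_col=0 keeps the saved index from being read back as a feature
    pr = pd.read_csv(csv_path, index_col=0)
    reloaded_pred = reloaded.predict(pr)

# The reloaded pipeline should reproduce the original predictions.
same = bool((original_pred == reloaded_pred).all())
print("predictions match:", same)
```

With a single estimator, the pipeline wrapper is optional; dumping the fitted dtc directly with joblib.dump would behave the same way in this sketch.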
https://stackoverflow.com/questions/57483923