GridSearchCV initialization

Stack Overflow user
Asked 2017-06-11 10:16:03
1 answer · 2.6K views · 0 followers · 1 vote

I want to search over a range of alphas (the Laplace smoothing parameter) to find the value that gives my Bernoulli Naive Bayes model the best accuracy.

def binarize_pixels(data, threshold=0.784):
    # Initialize a new feature array with the same shape as the original data.
    binarized_data = np.zeros(data.shape)

    # Apply a threshold to each feature.
    for feature in range(data.shape[1]):
        binarized_data[:,feature] = data[:,feature] > threshold
    return binarized_data

binarized_train_data = binarize_pixels(mini_train_data)

def BNB():
    clf = BernoulliNB()
    clf.fit(binarized_train_data, mini_train_labels)
    scoring = clf.score(mini_train_data, mini_train_labels)
    predsNB = clf.predict(dev_data)
    print "Bernoulli binarized model accuracy: {:.4}".format(np.mean(predsNB == dev_labels))

This model works fine, but my GridSearchCV cross-validation does not:

pipeline = Pipeline([('classifier', BNB())])
def P8(alphas):
    gs_clf = GridSearchCV(pipeline, param_grid = alphas, refit=True)
    y_predictions = gs_clf.best_estimator_.predict(dev_data)
    print classification_report(dev_labels, y_predictions)
alphas = {'alpha' : [0.0, 0.0001, 0.001, 0.01, 0.1, 0.5, 1.0, 2.0, 10.0]}
P8(alphas)

I get AttributeError: 'GridSearchCV' object has no attribute 'best_estimator_'


1 Answer

Stack Overflow user

Answered 2017-06-11 10:25:15

The problem is in these two lines:

gs_clf = GridSearchCV(pipeline, param_grid = alphas, refit=True)
y_predictions = gs_clf.best_estimator_.predict(dev_data)

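Note that you need to fit the model before calling predict. That is, call gs_clf.fit first; best_estimator_ is only set once the search has been fitted. See the following example from the documentation: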

>>> from sklearn import svm, datasets
>>> from sklearn.model_selection import GridSearchCV
>>> iris = datasets.load_iris()
>>> parameters = {'kernel':('linear', 'rbf'), 'C':[1, 10]}
>>> svr = svm.SVC()
>>> clf = GridSearchCV(svr, parameters)
>>> clf.fit(iris.data, iris.target)
...                             
GridSearchCV(cv=None, error_score=...,
       estimator=SVC(C=1.0, cache_size=..., class_weight=..., coef0=...,
                     decision_function_shape=None, degree=..., gamma=...,
                     kernel='rbf', max_iter=-1, probability=False,
                     random_state=None, shrinking=True, tol=...,
                     verbose=False),
       fit_params={}, iid=..., n_jobs=1,
       param_grid=..., pre_dispatch=..., refit=..., return_train_score=...,
       scoring=..., verbose=...)
>>> sorted(clf.cv_results_.keys())
...                             
['mean_fit_time', 'mean_score_time', 'mean_test_score',...
 'mean_train_score', 'param_C', 'param_kernel', 'params',...
 'rank_test_score', 'split0_test_score',...
 'split0_train_score', 'split1_test_score', 'split1_train_score',...
 'split2_test_score', 'split2_train_score',...
 'std_fit_time', 'std_score_time', 'std_test_score', 'std_train_score'...]
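
Applied to the code in the question, the fix might look like the minimal sketch below. Two further adjustments are included because the search would otherwise still fail after adding fit: the pipeline step has to be an estimator instance (the question's BNB() function returns None), and a parameter inside a Pipeline is addressed as <step name>__<parameter>, so the grid key becomes classifier__alpha. The random arrays are stand-ins for the question's data, which is not shown, and using BernoulliNB's own binarize argument in place of binarize_pixels is an assumption made to keep the example short. The value 0.0 is left out of the alpha grid here to avoid smoothing-related warnings.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption: the question's arrays are not available).
rng = np.random.RandomState(0)
train_data = rng.rand(200, 20)
train_labels = (train_data[:, 0] > 0.784).astype(int)
dev_data = rng.rand(50, 20)

# The pipeline step must be an estimator instance. binarize= thresholds the
# features inside BernoulliNB (assumed substitute for binarize_pixels).
pipeline = Pipeline([("classifier", BernoulliNB(binarize=0.784))])

# Parameters of a pipeline step are named <step name>__<parameter>.
param_grid = {"classifier__alpha": [0.0001, 0.001, 0.01, 0.1, 0.5, 1.0, 2.0, 10.0]}

gs_clf = GridSearchCV(pipeline, param_grid=param_grid, refit=True)

# Before fit, the attribute from the error message does not exist yet.
fitted_before = hasattr(gs_clf, "best_estimator_")

# Fitting runs the cross-validated search and then refits the best pipeline.
gs_clf.fit(train_data, train_labels)

y_predictions = gs_clf.best_estimator_.predict(dev_data)
print(gs_clf.best_params_)
```

With refit=True, calling gs_clf.predict(dev_data) directly should give the same predictions, since GridSearchCV delegates predict to the refitted best estimator.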
Votes: 1
Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/44479790
