
Using GradientBoostingClassifier with a BaseEstimator in scikit-learn?

Stack Overflow user
Asked on 2013-07-04 01:07:48
Answers: 4, Views: 9.1K, Followers: 0, Votes: 8

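I am trying to use GradientBoostingClassifier in scikit-learn, and it works fine with its default parameters. However, when I try to replace the BaseEstimator with a different classifier, it does not work and gives the following error: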

return y - np.nan_to_num(np.exp(pred[:, k] -
IndexError: too many indices

Do you have any suggestions for solving this problem?

The error can be reproduced with the following snippet:

import numpy as np
from sklearn import datasets
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.utils import shuffle

mnist = datasets.fetch_mldata('MNIST original')
X, y = shuffle(mnist.data, mnist.target, random_state=13)
X = X.astype(np.float32)
offset = int(X.shape[0] * 0.01)
X_train, y_train = X[:offset], y[:offset]
X_test, y_test = X[offset:], y[offset:]

### works fine when init is None
clf_init = None
print 'Train with clf_init = None'
clf = GradientBoostingClassifier(loss='deviance', learning_rate=0.1,
                             n_estimators=5, subsample=0.3,
                             min_samples_split=2,
                             min_samples_leaf=1,
                             max_depth=3,
                             init=clf_init,
                             random_state=None,
                             max_features=None,
                             verbose=2,
                             learn_rate=None)
clf.fit(X_train, y_train)
print 'Train with clf_init = None is done :-)'

print 'Train LogisticRegression()'
clf_init = LogisticRegression()
clf_init.fit(X_train, y_train)
print 'Train LogisticRegression() is done'

print 'Train with clf_init = LogisticRegression()'
clf = GradientBoostingClassifier(loss='deviance', learning_rate=0.1,
                             n_estimators=5, subsample=0.3,
                             min_samples_split=2,
                             min_samples_leaf=1,
                             max_depth=3,
                             init=clf_init,
                             random_state=None,
                             max_features=None,
                             verbose=2,
                             learn_rate=None)
clf.fit(X_train, y_train)  # <------ ERROR!!!!
print 'Train with clf_init = LogisticRegression() is done'

Here is the full traceback of the error:

Traceback (most recent call last):
File "/home/mohsena/Dropbox/programing/gbm/gb_with_init.py", line 56, in <module>
   clf.fit(X_train, y_train)
File "/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/gradient_boosting.py", line 862, in fit
   return super(GradientBoostingClassifier, self).fit(X, y)
File "/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/gradient_boosting.py", line 614, in fit
   random_state)
File "/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/gradient_boosting.py", line 475, in _fit_stage
   residual = loss.negative_gradient(y, y_pred, k=k)
File "/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/gradient_boosting.py", line 404, in negative_gradient
   return y - np.nan_to_num(np.exp(pred[:, k] -
IndexError: too many indices
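
For readers following the traceback: the failing expression indexes `pred[:, k]`, which only works when `pred` is a 2-D array with one column per class. A 1-D array, such as the class labels that `LogisticRegression.predict` returns, cannot be indexed that way. The sketch below reproduces just that indexing failure with NumPy; the arrays `pred_1d` and `pred_2d` are hypothetical stand-ins, not values taken from the original run.

```python
import numpy as np

# Hypothetical stand-ins: a 1-D array shaped like the labels from predict(),
# and a 2-D array with one column per class.
pred_1d = np.zeros(5)
pred_2d = np.zeros((5, 3))
k = 0

# Column indexing is valid on the 2-D array.
column = pred_2d[:, k]

# The same indexing on the 1-D array raises IndexError.
try:
    pred_1d[:, k]
    raised = False
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```

This suggests that whatever is passed as `init` has to produce 2-D predictions for the loss code in the traceback to index them.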

4 Answers

Stack Overflow user

Accepted answer

Posted on 2013-07-04 18:39:29

As suggested by the scikit-learn developers, the problem can be solved by using an adapter like this:

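# The snippet had no class line; the name Adapter is illustrative only.
class Adapter(object):
    def __init__(self, est):
        self.est = est
    def predict(self, X):
        # Return the probability of class 1 instead of the class label
        return self.est.predict_proba(X)[:, 1]
    def fit(self, X, y):
        self.est.fit(X, y)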
Votes: 5

Stack Overflow user

Posted on 2013-10-30 18:34:44

Improving on iampat's answer, a slight modification of the scikit-learn developers' suggestion should do the trick.

import numpy

class init:
    def __init__(self, est):
        self.est = est
    def predict(self, X):
        # Reshape the class-1 probabilities into a column vector
        return self.est.predict_proba(X)[:,1][:,numpy.newaxis]
    def fit(self, X, y):
        self.est.fit(X, y)
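
To make the modification concrete, here is a minimal sketch of what the extra `[:, numpy.newaxis]` changes. `StubEstimator` is a hypothetical stand-in for a fitted two-class classifier (it is not part of scikit-learn), used only so the shapes can be inspected without training anything.

```python
import numpy as np


class StubEstimator:
    """Hypothetical stand-in for a fitted two-class classifier."""

    def fit(self, X, y):
        return self

    def predict_proba(self, X):
        # Fixed probabilities [P(class 0), P(class 1)] for every row.
        return np.tile([0.25, 0.75], (len(X), 1))


class init:
    """Adapter from the answer above, with numpy imported as np."""

    def __init__(self, est):
        self.est = est

    def predict(self, X):
        return self.est.predict_proba(X)[:, 1][:, np.newaxis]

    def fit(self, X, y):
        self.est.fit(X, y)


X = np.zeros((4, 2))
flat = StubEstimator().predict_proba(X)[:, 1]   # 1-D: one value per sample
column = init(StubEstimator()).predict(X)       # 2-D: a single column
print(flat.shape, column.shape)
```

The values are the same in both cases; only the shape differs, and the 2-D column is what allows indexing of the form `pred[:, k]`. Note that the adapter returns a single column, so it is shaped for a two-class problem.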
Votes: 9

Stack Overflow user

Posted on 2015-05-28 20:43:31

In my opinion, this is a complete and simpler version of iampat's code snippet.

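import numpy
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

class RandomForestClassifier_compability(RandomForestClassifier):
    def predict(self, X):
        # Return class-1 probabilities as a column vector
        return self.predict_proba(X)[:, 1][:, numpy.newaxis]
base_estimator = RandomForestClassifier_compability()
classifier = GradientBoostingClassifier(init=base_estimator)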
Votes: 5
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/17454139
