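I am working on a problem from Stepik:
One tree is good, but where is the guarantee that it is the best one, or at least close to it? One way to find a more or less optimal set of tree parameters is to iterate over a set of trees with different parameters and choose the most suitable one. For this there is the GridSearchCV class, which iterates over every combination of the parameters specified for the model, trains on the data, and performs cross-validation. The model with the best parameters is then stored in the .best_estimator_ attribute. The task now is to iterate over all trees on the iris data with the following parameters: maximum depth, from 1 to 10 levels; minimum number of samples for a split, from 1 to 10; minimum number of samples in a leaf, from 1 to 10; and to store the best tree in the variable best_tree. Name the variable holding the GridSearchCV object search.
Here is my solution: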
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target
parameters = {'max_depth': range(1, 10), 'min_samples_split': range(2, 10), 'min_samples_leaf': range(1, 10)}
search = GridSearchCV(iris, parameters)
search.fit(X, y)
best_tree = search.estimator

Why am I getting this error?
Traceback (most recent call last):
  File "jailed_code", line 22, in <module>
    search.fit(X, y)
  File "/home/stepic/instances/master-plugins/sandbox/python3/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 595, in fit
    self.estimator, scoring=self.scoring)
  File "/home/stepic/instances/master-plugins/sandbox/python3/lib/python3.6/site-packages/sklearn/metrics/scorer.py", line 342, in _check_multimetric_scoring
    scorers = {"score": check_scoring(estimator, scoring=scoring)}
  File "/home/stepic/instances/master-plugins/sandbox/python3/lib/python3.6/site-packages/sklearn/metrics/scorer.py", line 274, in check_scoring
    "'fit' method, %r was passed" % estimator)
TypeError: estimator should be an estimator implementing 'fit' method, {'data': array([[5.1, 3.5, 1.4, 0.2],
[4.9, 3. , 1.4, 0.2],
[4.7, 3.2, 1.3, 0.2],
[4.6, 3.1, 1.5, 0.2],
[5. , 3.6, 1.4, 0.2],
[5.4, 3.9, 1.7, 0.4],
[4.6, 3.4, 1.4, 0.3],
[5. , 3.4, 1.5, 0.2],
...

Posted on 2022-01-11 05:38:26
You are passing the dataset instead of an estimator. If you haven't already, take a look at the documentation: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html
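A quick check can make the difference visible. This is only an illustrative sketch: the variable names `tree`, `iris_has_fit`, and `tree_has_fit` are made up here and do not appear in the question.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()               # a Bunch: a dict-like container holding the arrays
tree = DecisionTreeClassifier()  # an estimator instance

# GridSearchCV expects its first argument to be an object with a fit method.
iris_has_fit = hasattr(iris, "fit")
tree_has_fit = hasattr(tree, "fit")

print(type(iris).__name__, iris_has_fit)
print(type(tree).__name__, tree_has_fit)
```

The dataset object has no `fit` method, which is what the TypeError in the question is complaining about.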
This should work:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target
parameters = {'max_depth': range(1, 10), 'min_samples_split': range(2, 10), 'min_samples_leaf': range(1, 10)}
search = GridSearchCV(estimator=DecisionTreeClassifier(),
                      param_grid=parameters)
search.fit(X, y)
search.cv_results_

Posted on 2022-01-11 05:36:12
You have not passed any estimator to GridSearchCV. You must pass an instance of the estimator that you want GridSearchCV to fit, but you passed iris, which is not an estimator.
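Putting this together with the task statement, a minimal sketch of a complete solution might look like the following. Several choices here are assumptions rather than part of the question: the upper bounds are written as 11 because `range` excludes its end value, `min_samples_split` starts at 2 because scikit-learn rejects an integer value of 1 for that parameter, and `random_state=0` is added only to make the run repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# range() excludes its upper bound, so 11 is needed to include 10.
# min_samples_split starts at 2 (assumption: an integer 1 is not accepted).
parameters = {
    "max_depth": range(1, 11),
    "min_samples_split": range(2, 11),
    "min_samples_leaf": range(1, 11),
}

# Pass an estimator instance, not the dataset, as the first argument.
# random_state=0 is an assumption added for reproducibility.
search = GridSearchCV(DecisionTreeClassifier(random_state=0), parameters)
search.fit(X, y)

# After fitting, the refitted best model is in best_estimator_,
# while search.estimator is just the unfitted template passed in above.
best_tree = search.best_estimator_
print(search.best_params_)
```

Note that the question's last line reads `search.estimator`; the task statement refers to `.best_estimator_`, which is what the sketch stores in `best_tree`.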
https://stackoverflow.com/questions/70662018