I created a pipeline containing RFE and RandomForestClassifier, then applied RandomizedSearchCV to find the best hyperparameter values for both. My code looks like this:
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.pipeline import Pipeline
from sklearn.model_selection import RandomizedSearchCV
steps = [
    ("rfe", RFE(estimator=RandomForestClassifier(random_state=42))),
    ("est", RandomForestClassifier())
]
rf_clf_pl = Pipeline(steps=steps)
params = {
    "rfe__n_features_to_select": range(2, smote_X_train.shape[1] + 1),
    "est__random_state": np.linspace(0, 42, 5).astype(int),
    "est__n_estimators": range(50, 201, 10),
    "est__max_depth": [None] + list(range(5, max_depth, 3)),
    "est__max_leaf_nodes": [None] + list(range(100, max_leaf_nodes, 20))
}
rs = RandomizedSearchCV(estimator = rf_clf_pl, cv = 4, param_distributions = params, n_jobs = -1, n_iter = 100, random_state = 42)
rs.fit(smote_X_train, smote_y_train)
I then tried the following code, but it raised an error:
rf_clf_pl.named_steps["rfe"].support_
Error:
AttributeError Traceback (most recent call last)
<ipython-input-53-c73290f0e090> in <module>()
----> 1 rf_clf_pl.named_steps["rfe"].support_
AttributeError: 'RFE' object has no attribute 'support_'
How can I get the names of the retained features?
Posted on 2022-06-27 12:20:42
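You can access the retained features of the best estimator as follows:
rs.best_estimator_.named_steps['rfe'].support_
That is, you should access the best_estimator_ attribute of the fitted RandomizedSearchCV instance (i.e., the pipeline refit with the best hyperparameters, which exists because refit=True is the default for RandomizedSearchCV).
The way you tried to access support_ from the pipeline instance cannot work because you never fit the pipeline itself explicitly, and a fitted RandomizedSearchCV does not hand back the estimator you passed in as a fitted object. The search calls fit on clones of it, so rf_clf_pl stays unfitted; the only fitted pipeline you get back is best_estimator_, as described above.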
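The difference between the two objects can be checked directly. The following is a minimal sketch, not the asker's setup: the iris data, the tiny forests, and the small search (`n_iter=2`, `cv=3`) are assumptions chosen only to keep it fast.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)

# Small pipeline of the same shape as in the question (sizes are illustrative).
pipe = Pipeline([
    ("rfe", RFE(estimator=RandomForestClassifier(n_estimators=10, random_state=42))),
    ("est", RandomForestClassifier(n_estimators=10, random_state=42)),
])
search = RandomizedSearchCV(
    pipe,
    param_distributions={"rfe__n_features_to_select": [2, 3, 4]},
    n_iter=2,
    cv=3,
    random_state=42,
)
search.fit(X, y)

# The search fits clones, so the original pipeline's RFE step has no fitted attributes.
original_is_fitted = hasattr(pipe.named_steps["rfe"], "support_")
# best_estimator_ is a separate pipeline refit on the full data (refit=True by default).
best_is_fitted = hasattr(search.best_estimator_.named_steps["rfe"], "support_")

print(original_is_fitted, best_is_fitted)
```

If you do want rf_clf_pl itself to carry fitted attributes, you would have to call fit on it yourself, for example after setting it to rs.best_params_ with set_params.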
Here is an example:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.pipeline import Pipeline
from sklearn.model_selection import RandomizedSearchCV, train_test_split
iris = load_iris(as_frame=True)
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
steps = [
    ("rfe", RFE(estimator=RandomForestClassifier(random_state=42))),
    ("est", RandomForestClassifier())
]
rf_clf_pl = Pipeline(steps=steps)
params = {
    "rfe__n_features_to_select": range(2, X_train.shape[1] + 1),
    "est__random_state": np.linspace(0, 42, 5).astype(int),
    "est__n_estimators": range(50, 201, 10),
    "est__max_depth": [None] + list(range(5, 16, 3)),
    "est__max_leaf_nodes": [None] + list(range(100, 201, 20))
}
rs = RandomizedSearchCV(estimator = rf_clf_pl, cv = 4, param_distributions = params, n_jobs = -1, n_iter = 100, random_state = 42)
rs.fit(X_train, y_train)
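rs.best_estimator_.named_steps['rfe'].support_
Finally, if you want the explicit names of the retained features, you can retrieve them with rs.feature_names_in_[np.where(rs.best_estimator_.named_steps['rfe'].support_)[0]]. Note that feature_names_in_ is only available when the data passed to fit has string column names, as the DataFrame here does.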
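Since support_ is a boolean mask over the input columns, an equivalent route is to index the DataFrame's columns with it directly. The sketch below assumes the training data is a pandas DataFrame; the variable names and the deliberately small search settings are illustrative and not taken from the question.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline

# as_frame=True returns X as a pandas DataFrame, so column names are available.
X, y = load_iris(return_X_y=True, as_frame=True)

pipe = Pipeline([
    ("rfe", RFE(estimator=RandomForestClassifier(n_estimators=10, random_state=42))),
    ("est", RandomForestClassifier(n_estimators=10, random_state=42)),
])
search = RandomizedSearchCV(
    pipe,
    param_distributions={"rfe__n_features_to_select": [2, 3]},
    n_iter=2,
    cv=3,
    random_state=0,
)
search.fit(X, y)

# Fitted RFE step of the refit pipeline.
rfe = search.best_estimator_.named_steps["rfe"]
mask = rfe.support_                  # boolean array, True for each retained column
selected = list(X.columns[mask])     # names of the retained features

print(selected)
```

Which columns end up selected depends on the data and on the sampled hyperparameters, so the printed list is not fixed in advance.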
https://stackoverflow.com/questions/72768825