如果您能让我知道如何使用SMOTENC,我将非常感激。我写道:
num_indices1 = list(X.iloc[:,np.r_[0:94,95,97,100:123]].columns.values)
cat_indices1 = list(X.iloc[:,np.r_[94,96,98,99,123:160]].columns.values)
print(len(num_indices1))
print(len(cat_indices1))
pipeline=Pipeline(steps= [
# Categorical features
('feature_processing', FeatureUnion(transformer_list = [
('categorical', MultiColumn(cat_indices1)),
#numeric
('numeric', Pipeline(steps = [
('select', MultiColumn(num_indices1)),
('scale', StandardScaler())
]))
])),
('clf', rg)
]
)因此,正如我所指出的,我有5个分类特征。实际上,索引123到160与一个分类特性相关,其中有37个可能的值,使用get_dummies将其转换为37个列。
我认为SMOTENC应该插入到分类器('clf', reg)之前,但我不知道如何在SMOTENC中定义"categorical_features“。另外,你能告诉我在哪里使用imblearn.pipeline吗?
提前谢谢。
发布于 2019-10-25 13:51:04
如下所示,应使用两条管道:
num_indices1 = list(X.iloc[:,np.r_[0:94,95,97,100:120,121:123]].columns.values)
cat_indices1 = list(X.iloc[:,np.r_[94,96,98,99,120]].columns.values)
print(len(num_indices1))
print(len(cat_indices1))
cat_indices = [94, 96, 98, 99, 120]
from imblearn.pipeline import make_pipeline
pipeline=Pipeline(steps= [
# Categorical features
('feature_processing', FeatureUnion(transformer_list = [
('categorical', MultiColumn(cat_indices1)),
#numeric
('numeric', Pipeline(steps = [
('select', MultiColumn(num_indices1)),
('scale', StandardScaler())
]))
])),
('clf', rg)
]
)
pipeline_with_resampling = make_pipeline(SMOTENC(categorical_features=cat_indices), pipeline)https://datascience.stackexchange.com/questions/44219
复制相似问题