
Error saving a model with sklearn2pmml when using VotingClassifier

Stack Overflow user
Asked 2021-02-05 10:23:44
Answers: 1, Views: 96, Following: 0, Votes: 0

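I'm new to programming and am having trouble saving a model to PMML. I have a dataset from which I need to select attributes, then apply majority voting, and finally save the result as PMML. The majority-voting part works, but when I call sklearn2pmml on the last line to save the model, it raises an error.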

from pandas import read_csv
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from mlxtend.classifier import EnsembleVoteClassifier
from sklearn.metrics import accuracy_score
from sklearn2pmml import make_pmml_pipeline
from sklearn2pmml import sklearn2pmml
from sklearn.compose import ColumnTransformer, make_column_transformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn.ensemble import VotingClassifier
import joblib

url = 'D:/treinamento.CSV'
df = read_csv(url, header=None)
data = df.values

url_test = 'D:/TESTE.CSV'
df_test = read_csv(url_test, header=None)
data_test = df_test.values
   
X = data[:, :-1]
y = data_test[:, -1]

X_train = data[:, :-1]
X_test = data_test[:, :-1]
y_train = data[:, -1]
y_test = y
#features selection
features1 = [2, 5, 7]
features2 = [0, 1, 4, 5, 7]
features3 = [0, 1, 4, 5, 6]
features4 = [1, 4]
numeric_transformer = Pipeline(steps=[('scaler', StandardScaler())])
preprocessor1 = ColumnTransformer(transformers=[('numerical', numeric_transformer, features1)])
preprocessor2 = ColumnTransformer(transformers=[('numerical', numeric_transformer, features2)])
preprocessor3 = ColumnTransformer(transformers=[('numerical', numeric_transformer, features3)])
preprocessor4 = ColumnTransformer(transformers=[('numerical', numeric_transformer, features4)])

pipe1 = PMMLPipeline(steps=[('preprocessor', preprocessor1),('classifier', DecisionTreeClassifier(min_samples_split = 2))])
pipe2 = PMMLPipeline(steps=[('preprocessor', preprocessor2),('classifier', DecisionTreeClassifier(min_samples_split = 2))])
pipe3 = PMMLPipeline(steps=[('preprocessor', preprocessor3),('classifier', DecisionTreeClassifier(min_samples_split = 2))])
pipe4 = PMMLPipeline(steps=[('preprocessor', preprocessor4),('classifier', DecisionTreeClassifier(min_samples_split = 2))])



eclf = VotingClassifier(estimators=[('pipe1', PMMLPipeline(steps=[('preprocessor', preprocessor1),('classifier', DecisionTreeClassifier(min_samples_split = 2))])),
                                    ('pipe2', PMMLPipeline(steps=[('preprocessor', preprocessor2),('classifier', DecisionTreeClassifier(min_samples_split = 2))])),
                                    ('pipe3', PMMLPipeline(steps=[('preprocessor', preprocessor3),('classifier', DecisionTreeClassifier(min_samples_split = 2))])),
                                    ('pipe4', PMMLPipeline(steps=[('preprocessor', preprocessor4),('classifier', DecisionTreeClassifier(min_samples_split = 2))]))], voting='hard', weights=[1,1,1,1])

eclf.fit(X_train, y_train)
yhat = eclf.predict(X_test)
accuracy = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % (accuracy * 100))

sklearn2pmml(eclf, "D:/Mestrado/ARTIGO DRC/dados_pos_revisao/cross validation - dados reavaliados/4 revisao/5 FOLDS/1 FOLD/eclf.pmml", with_repr = True)

Error:

65 sklearn2pmml(eclf, "D:/mest/eclf.pmml", with_repr = True)

~\anaconda3\lib\site-packages\sklearn2pmml\__init__.py in sklearn2pmml(pipeline, pmml, user_classpath, with_repr, debug, java_encoding)
    222                 print("{0}: {1}".format(java_version[0], java_version[1]))
    223         if not isinstance(pipeline, PMMLPipeline):
--> 224                 raise TypeError("The pipeline object is not an instance of " + PMMLPipeline.__name__ + ". Use the 'sklearn2pmml.make_pmml_pipeline(obj)' utility function to translate a regular Scikit-Learn estimator or pipeline to a PMML pipeline")
    225         estimator = pipeline._final_estimator
    226         cmd = ["java", "-cp", os.pathsep.join(_classpath(user_classpath)), "org.jpmml.sklearn.Main"]

TypeError: The pipeline object is not an instance of PMMLPipeline. Use the 'sklearn2pmml.make_pmml_pipeline(obj)' utility function to translate a regular Scikit-Learn estimator or pipeline to a PMML pipeline

1 Answer

Stack Overflow user

Answered 2021-02-05 15:17:41

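The pipeline object is not an instance of PMMLPipeline

Did you read the SkLearn2PMML error message? Probably not, because it states the problem clearly!

You are using the PMMLPipeline class in completely the wrong place. It can only be used as the outermost wrapper around the VotingClassifier estimator; the voting members inside it should be plain scikit-learn Pipeline objects, and the object passed to sklearn2pmml must be that outer PMMLPipeline rather than the bare VotingClassifier.

Please reorganize your code like this: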

代码语言:javascript
复制
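from sklearn.ensemble import VotingClassifier
from sklearn.pipeline import Pipeline
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

# PMMLPipeline is the single outermost wrapper; the voting members are plain Pipelines.
pipeline = PMMLPipeline([
  ("classifier", VotingClassifier([
    ("pipe1", Pipeline(...)),
    ("pipe2", Pipeline(...)),
    ("pipe3", Pipeline(...))
  ]))
])
# Fit the outer pipeline (X_train, y_train as in the question) before exporting it.
pipeline.fit(X_train, y_train)
sklearn2pmml(pipeline, "pipeline.pmml")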
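
For a fuller picture, here is a minimal sketch of how the question's four feature subsets might fit into that layout. It is an illustration under assumptions, not the answerer's code: the helper names `make_member`, `build_voting_classifier` and `export_to_pmml`, the `FEATURE_SETS` dictionary, and the synthetic data standing in for the CSV files are all hypothetical. The export helper needs sklearn2pmml and a Java runtime, so it is defined but not called here, and whether a given converter version accepts every step in this pipeline is not verified by the sketch.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Column subsets copied from the question (features1 .. features4).
FEATURE_SETS = {
    "pipe1": [2, 5, 7],
    "pipe2": [0, 1, 4, 5, 7],
    "pipe3": [0, 1, 4, 5, 6],
    "pipe4": [1, 4],
}


def make_member(columns):
    """Hypothetical helper: a plain Pipeline that scales the chosen columns, then fits a tree."""
    preprocessor = ColumnTransformer(
        transformers=[("numerical", StandardScaler(), columns)]
    )
    return Pipeline(steps=[
        ("preprocessor", preprocessor),
        ("classifier", DecisionTreeClassifier(min_samples_split=2, random_state=0)),
    ])


def build_voting_classifier():
    """Hypothetical helper: hard-voting ensemble whose members are plain Pipelines."""
    estimators = [(name, make_member(cols)) for name, cols in FEATURE_SETS.items()]
    return VotingClassifier(estimators=estimators, voting="hard", weights=[1, 1, 1, 1])


def export_to_pmml(X, y, path):
    """Wrap the ensemble in one outer PMMLPipeline, fit it, and export it.

    Requires sklearn2pmml and a Java runtime, so it is not called in this sketch.
    """
    from sklearn2pmml import sklearn2pmml
    from sklearn2pmml.pipeline import PMMLPipeline

    pipeline = PMMLPipeline([("classifier", build_voting_classifier())])
    pipeline.fit(X, y)
    sklearn2pmml(pipeline, path)
    return pipeline


# Synthetic stand-in for the question's CSV data (assumption: 8 numeric feature columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = (X[:, 1] + X[:, 5] > 0).astype(int)
X_train, X_test = X[:200], X[200:]
y_train, y_test = y[:200], y[200:]

# The voting part alone runs with scikit-learn only.
eclf = build_voting_classifier()
eclf.fit(X_train, y_train)
yhat = eclf.predict(X_test)
print("Accuracy: %.3f" % (accuracy_score(y_test, yhat) * 100))
```

With sklearn2pmml and Java installed, calling `export_to_pmml(X_train, y_train, "eclf.pmml")` would be the step that replaces the failing call in the question, since the object it hands to `sklearn2pmml` is a PMMLPipeline.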
Votes: 0

Page content originally provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/66056723
