from sklearn.datasets import load_breast_cancer
# Load dataset
data = load_breast_cancer()
# Organize our data
label_names = data['target_names']
labels = data['target']
feature_names = data['feature_names']
features = data['data']
from sklearn.model_selection import train_test_split
# Split our data
train, test, train_labels, test_labels = train_test_split(features, labels, test_size=0.33, random_state=42)
from sklearn.naive_bayes import GaussianNB
from sklearn2pmml import PMMLPipeline
nb_pipeline = PMMLPipeline([
('classifier', GaussianNB())
])
#
# Train our classifier
nb_pipeline.fit(train, train_labels)
#
from sklearn2pmml import sklearn2pmml
sklearn2pmml(nb_pipeline, 'nb.pmml', with_repr=True, debug=True)
Error trace:
Exception in thread "main" java.lang.IllegalArgumentException: The estimator object of the final step (Python class sklearn.naive_bayes.GaussianNB) does not specify the number of outputs
at sklearn2pmml.pipeline.PMMLPipeline.initTargetFields(PMMLPipeline.java:564)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:132)
at com.sklearn2pmml.Main.run(Main.java:91)
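at com.sklearn2pmml.Main.main(Main.java:66)
I then debugged and found that the model nb_pipeline does not contain the key "target_fields". So, if I want to use GaussianNB, how can I convert the model to PMML? Any ideas would be appreciated, thanks!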
Answered on 2022-10-29 19:13:54
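This question has already been answered in jpmml/sklearn2pmml#357.
In short, the user appears to be using a legacy Scikit-Learn version that does not set the Estimator.n_outputs_ attribute. SkLearn2PMML package version 0.86.X expects that attribute to be present, which is why the conversion fails with "does not specify the number of outputs".
The error can be resolved by upgrading Scikit-Learn to 1.1.X (or newer), or by upgrading SkLearn2PMML to 0.87.0 (or newer).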
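To see which side of that version boundary an environment is on, a small check like the one below may help. It is only a sketch: the helper names (parse_version, meets_minimum, error_should_be_resolved) are made up for illustration, the version parsing is deliberately simplified and is not a full PEP 440 implementation, and the thresholds are taken directly from the answer above rather than verified independently.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '1.1.3' into (1, 1, 3); stops at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to the same length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def error_should_be_resolved(sklearn_version, sklearn2pmml_version):
    """Per the answer above, either upgrade alone is expected to be enough."""
    sklearn_ok = sklearn_version is not None and meets_minimum(sklearn_version, "1.1")
    converter_ok = (
        sklearn2pmml_version is not None
        and meets_minimum(sklearn2pmml_version, "0.87.0")
    )
    return sklearn_ok or converter_ok


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    sk = installed_version("scikit-learn")
    s2p = installed_version("sklearn2pmml")
    print("scikit-learn:", sk)
    print("sklearn2pmml:", s2p)
    print("version requirement from the answer met:", error_should_be_resolved(sk, s2p))
```

If the last line prints False, upgrading either package with pip to the versions named in the answer should be the next step.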
https://stackoverflow.com/questions/74191643