Errors when using sklearn2pmml and jpmml-sklearn

Stack Overflow user
Asked 2015-12-04 06:30:54

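Recently, while looking for a way to convert scikit-learn models to PMML, I came across sklearn2pmml and jpmml-sklearn. However, when I try the basic usage examples, I run into errors that I cannot make sense of.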

When I try the example from sklearn2pmml, I get the following error about casting a Long to an Integer:

Exception in thread "main" java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
    at numpy.core.NDArrayUtil.getShape(NDArrayUtil.java:66)
    at org.jpmml.sklearn.ClassDictUtil.getShape(ClassDictUtil.java:92)
    at org.jpmml.sklearn.ClassDictUtil.getShape(ClassDictUtil.java:76)
    at sklearn.linear_model.BaseLinearClassifier.getCoefShape(BaseLinearClassifier.java:144)
    at sklearn.linear_model.BaseLinearClassifier.getNumberOfFeatures(BaseLinearClassifier.java:56)
    at sklearn.Classifier.createSchema(Classifier.java:50)
    at org.jpmml.sklearn.Main.run(Main.java:104)
    at org.jpmml.sklearn.Main.main(Main.java:87)
Traceback (most recent call last):
  File "C:\Users\user\workspace\sklearn_pmml\test.py", line 40, in <module>
    sklearn2pmml(iris_classifier, iris_mapper, "LogisticRegressionIris.pmml")
  File "C:\Python27\lib\site-packages\sklearn2pmml\__init__.py", line 49, in sklearn2pmml
    os.remove(dump)
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\users\\user\\appdata\\local\\temp\\tmpmxyp2y.pkl'

Any suggestions as to what is going on here?

Code used:

#
# Step 1: feature engineering
#

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

import pandas
import sklearn_pandas

iris = load_iris()

iris_df = pandas.concat((pandas.DataFrame(iris.data[:, :], columns = ["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"]), pandas.DataFrame(iris.target, columns = ["Species"])), axis = 1)

iris_mapper = sklearn_pandas.DataFrameMapper([
    (["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"], PCA(n_components = 3)),
    ("Species", None)
])

iris = iris_mapper.fit_transform(iris_df)

#
# Step 2: training a logistic regression model
#

from sklearn.linear_model import LogisticRegressionCV

iris_X = iris[:, 0:3]
iris_y = iris[:, 3]

iris_classifier = LogisticRegressionCV()
iris_classifier.fit(iris_X, iris_y)

#
# Step 3: conversion to PMML
#

from sklearn2pmml import sklearn2pmml

sklearn2pmml(iris_classifier, iris_mapper, "LogisticRegressionIris.pmml")

Edit 12/6: After the new update, the conversion gets further, but the file removal still fails:

Dec 06, 2015 5:56:49 PM sklearn_pandas.DataFrameMapper updatePMML
INFO: Updating 1 target field and 3 active field(s)
Dec 06, 2015 5:56:49 PM sklearn_pandas.DataFrameMapper updatePMML
INFO: Mapping target field y to Species
Dec 06, 2015 5:56:49 PM sklearn_pandas.DataFrameMapper updatePMML
INFO: Mapping active field(s) [x1, x2, x3] to [Sepal.Length, Sepal.Width, Petal.Length, Petal.Width]
Traceback (most recent call last):
  File "C:\Users\user\workspace\sklearn_pmml\test.py", line 40, in <module>
    sklearn2pmml(iris_classifier, iris_mapper, "LogisticRegressionIris.pmml")
  File "C:\Python27\lib\site-packages\sklearn2pmml\__init__.py", line 49, in sklearn2pmml
    os.remove(dump)
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\users\\user\\appdata\\local\\temp\\tmpqeblat.pkl'

1 Answer

Stack Overflow user

Answered 2015-12-04 15:53:30

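JPMML-SkLearn expects ndarray.shape to be a tuple of i4 values (which the Pyrolite library maps to java.lang.Integer). In this case, however, it is a tuple of i8 values (mapped to java.lang.Long). Hence the cast exception.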
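To make the i4 versus i8 distinction concrete, here is a minimal Python sketch using only the standard library. It shows the byte widths behind the two type codes and a range check of the kind a converter could apply before narrowing a dimension to 32 bits. The helper name `narrow_shape_to_int32` and the sample shape are hypothetical illustrations; this is not the code from the JPMML-SkLearn fix, and it does not reproduce the original Python 2.7 setup.

```python
import struct

# Byte widths of the two integer encodings named in the answer.
I4_BYTES = struct.calcsize("<i")  # 4-byte signed integer, NumPy type code "i4"
I8_BYTES = struct.calcsize("<q")  # 8-byte signed integer, NumPy type code "i8"

# Range of values that a 32-bit signed integer can represent.
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def narrow_shape_to_int32(shape):
    """Hypothetical helper: return the shape unchanged if every dimension
    fits in a 32-bit signed integer, otherwise raise OverflowError."""
    narrowed = []
    for dim in shape:
        if not INT32_MIN <= dim <= INT32_MAX:
            raise OverflowError("dimension %d does not fit in i4" % dim)
        narrowed.append(int(dim))
    return tuple(narrowed)


# Sample shape chosen for illustration only.
print(I4_BYTES, I8_BYTES)
print(narrow_shape_to_int32((3, 3)))
```

Array dimensions of realistic models fit comfortably in 32 bits, so a width mismatch like this one is a type-mapping problem rather than a data problem.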

This issue has been fixed in JPMML-SkLearn commit f7c16ac2fb.

If you run into another exception (data conversion between platforms can be tricky), please open a JPMML-SkLearn issue about it as well.

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/34077482
