
Question: RuntimeError: A pipeline has not yet been optimized. Please call fit() first. Problem with TPOT automated machine learning in Python

Stack Overflow user

Asked 2019-08-04 21:39:11

Answers: 2, Views: 779, Followers: 0, Votes: 0

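When running the example code, I get the following error: "RuntimeError: A pipeline has not yet been optimized. Please call fit() first."

This is a problem with TPOT automated machine learning in Python. I am trying to reproduce the example Dataset 2: Mushroom Classification (https://towardsdatascience.com/tpot-automated-machine-learning-in-python-4c063b3e5de9)

Source code: https://www.kaggle.com/discdiver/tpot-mushroom-classification-task/

I tried moving the tpot.fit(X_train, y_train) call to a different place, but that did not solve the problem.

Libraries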

import time
import gc
import pandas as pd
import numpy as np
import seaborn as sns
import timeit
import plotly.offline as py
import plotly.graph_objs as go
py.init_notebook_mode(connected=True)
import matplotlib.pyplot as plt
%matplotlib inline
sns.set(font_scale=1.5, palette="colorblind")
import category_encoders

from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import train_test_split

from tpot import TPOTClassifier

# Read data
df_cogumelo = pd.read_csv('agaricus-lepiota.csv')  

# Visualization
pd.options.display.max_columns = 200
pd.options.display.width = 200

# separate out X
X = df_cogumelo.reindex(columns=[x for x in df_cogumelo.columns.values if x != 'class']) 

X = X.apply(LabelEncoder().fit_transform)

# separate out y
y = df_cogumelo.reindex(columns=['class'])   
print(y['class'].value_counts())
y = np.ravel(y)                     # flatten the y array
y = LabelEncoder().fit_transform(y)


X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.75, test_size=0.25, random_state=10) 


print(X_train.describe())
print("\n\n\n")
print(X_train.info())

# generation and population_size determine how many populations are made.

tpot = TPOTClassifier(verbosity=3, 
                  scoring="accuracy", 
                  random_state=10, 
                  periodic_checkpoint_folder="tpot_mushroom_results", 
                  n_jobs=-1, 
                  generations=2, 
                  population_size=10, use_dask=True) #use_dask=True

times = []
scores = []
winning_pipes = []

# run several fits 
for x in range(10):
    start_time = timeit.default_timer()

    tpot.fit(X_train, y_train)

    elapsed = timeit.default_timer() - start_time
    times.append(elapsed)

    winning_pipes.append(tpot.fitted_pipeline_)

    scores.append(tpot.score(X_test, y_test))
    tpot.export('tpot_mushroom.py')


# output results
times = [t / 60 for t in times]  # convert seconds to minutes
print('Times:', times)
print('Scores:', scores)   
print('Winning pipelines:', winning_pipes)

# The expected result is shown at:
# https://www.kaggle.com/discdiver/tpot-mushroom-classification-task/

python

scikit-learn

tpot

2 Answers

Stack Overflow user

Answered 2020-09-07 18:59:37

Removing use_dask=True solved this error for me.
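A minimal sketch of that change, assuming the classic TPOT API used in the question. The keyword arguments are copied from the question's code with use_dask omitted; TPOT_PARAMS and make_classifier are illustrative names, not part of TPOT.

```python
# Keyword arguments copied from the question, with use_dask left out
# so that TPOT uses its default (non-Dask) evaluation.
TPOT_PARAMS = dict(
    verbosity=3,
    scoring="accuracy",
    random_state=10,
    periodic_checkpoint_folder="tpot_mushroom_results",
    n_jobs=-1,
    generations=2,
    population_size=10,
)


def make_classifier(params=TPOT_PARAMS):
    """Build the classifier (hypothetical helper; TPOT is imported lazily)."""
    from tpot import TPOTClassifier

    return TPOTClassifier(**params)
```

With this in place, tpot = make_classifier() followed by tpot.fit(X_train, y_train) would replace the constructor call in the question; the rest of the loop stays the same.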

Votes: 0

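Stack Overflow user

Answered 2020-03-18 00:50:44

Your problem is not the code, it is your data. The agaricus-lepiota dataset has no header row. Open the file, insert a new first row that labels the columns (the names do not matter), and make sure the last column is named "class" (lowercase c). That should fix the problem. If you look at your output, you will see that printing the y['class'] counts gives you nothing. If you have already added the labels correctly, please post the output stack trace.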
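The effect described here can be checked without editing the file by hand. The sketch below is only an illustration under stated assumptions: the three-row sample and the column names other than "class" are made up, and an in-memory string stands in for agaricus-lepiota.csv. It shows that reindex silently creates an all-NaN "class" column when the label is missing, and that passing header=None with names to pd.read_csv supplies the header instead.

```python
import io

import pandas as pd

# Made-up header-less sample standing in for agaricus-lepiota.csv;
# the last column plays the role of the class label.
sample = "x,s,n,p\nb,y,w,e\nx,y,w,e\n"

# Default read: the first data row is consumed as the header,
# so there is no column called 'class'.
df_bad = pd.read_csv(io.StringIO(sample))
y_bad = df_bad.reindex(columns=["class"])  # all-NaN column, no error raised
print(y_bad["class"].value_counts())       # empty: NaN values are not counted

# Supplying column names explicitly (placeholder names, 'class' last).
names = ["cap-shape", "cap-surface", "cap-color", "class"]
df = pd.read_csv(io.StringIO(sample), header=None, names=names)
print(df["class"].value_counts())
```

If the counts are empty for the real file as well, every label passed to TPOT is derived from missing values, which is worth ruling out before looking at the TPOT settings.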

Votes: -1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/57347026
