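I want to apply an SVM using the approach below, but apparently the "Bunch" type is not suitable as it stands.
For dictionary-like objects such as this one, the attributes of interest are usually 'data', the data to learn from, and 'target', the classification labels. They can be accessed as .data and .target respectively. How can I make this work, given the code below?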
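import pandas as pd
from sklearn import preprocessing
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Load the data with scikit-learn, which returns it as Bunch objects
# (cats is the list of category names, defined elsewhere)
newsgroups_train = fetch_20newsgroups(subset='train', remove=('headers', 'footers', 'quotes'), categories=cats)
newsgroups_test = fetch_20newsgroups(subset='test', remove=('headers', 'footers', 'quotes'), categories=cats)
# Turn the raw text into TF-IDF features
vectorizer = TfidfVectorizer(stop_words='english')  # new
vectors = vectorizer.fit_transform(newsgroups_train.data)  # new
vectors_test = vectorizer.transform(newsgroups_test.data)  # new
# Scale each feature by its maximum absolute value
max_abs_scaler = preprocessing.MaxAbsScaler()
scaled_train_data = max_abs_scaler.fit_transform(vectors)  # corrected
scaled_test_data = max_abs_scaler.transform(vectors_test)
clf = CalibratedClassifierCV(OneVsRestClassifier(SVC(C=1)))
clf.fit(scaled_train_data, train_labels)  # train_labels is the part I am missing
predictions = clf.predict(scaled_test_data)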
proba = clf.predict_proba(scaled_test_data)

In place of "train_labels" on the clf.fit line, I put "vectorizer.vocabulary_.keys()", but that gave: ValueError: bad input shape (). What should I do to get the training labels and make this work?
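To make the 'data' / 'target' description above concrete, here is a minimal sketch that needs no download. It assumes a small hand-built Bunch (`toy_train`, a hypothetical stand-in for what fetch_20newsgroups returns) and uses a plain linear SVC instead of the calibrated one-vs-rest classifier, so it illustrates the shapes involved rather than the full pipeline. The labels come from `.target`, one per document, whereas `vectorizer.vocabulary_` has one entry per feature, which is likely why it is rejected as a label array.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MaxAbsScaler
from sklearn.svm import SVC
from sklearn.utils import Bunch

# Hypothetical toy stand-in for the Bunch returned by fetch_20newsgroups
toy_train = Bunch(
    data=[
        "the rocket reached orbit",
        "nasa launched a satellite",
        "the goalie stopped the puck",
        "the team scored in the hockey game",
    ],
    target=[0, 0, 1, 1],
    target_names=["sci.space", "rec.sport.hockey"],
)

# Same preprocessing steps as in the question
vectorizer = TfidfVectorizer(stop_words="english")
vectors = vectorizer.fit_transform(toy_train.data)
scaled = MaxAbsScaler().fit_transform(vectors)

# The labels live in .target: one integer per document
train_labels = toy_train.target

# vocabulary_ maps each term to a column index: one entry per feature
n_features = len(vectorizer.vocabulary_)

# Simplified classifier (assumption: linear SVC, no calibration)
clf = SVC(C=1, kernel="linear")
clf.fit(scaled, train_labels)
predictions = clf.predict(scaled)
```

In the code from the question, the analogous candidate would be newsgroups_train.target, with newsgroups_test.target available for checking the predictions.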
https://stackoverflow.com/questions/58256824