I am trying to run the code below, but when I execute `pipe['count']` I get the error `'Pipeline' object is not subscriptable`.
```python
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline
import numpy as np

corpus = ['this is the first document',
          'this document is the second document',
          'and this is the third one',
          'is this the first document']
vocabulary = ['this', 'document', 'first', 'is', 'second', 'the',
              'and', 'one']
pipe = Pipeline([('count', CountVectorizer(vocabulary=vocabulary)),
                 ('tfid', TfidfTransformer())]).fit(corpus)

pipe['count'].transform(corpus).toarray()
# array([[1, 1, 1, 1, 0, 1, 0, 0],
#        [1, 2, 0, 1, 1, 1, 0, 0],
#        [1, 0, 0, 1, 0, 1, 1, 1],
#        [1, 1, 1, 1, 0, 1, 0, 0]])

pipe['tfid'].idf_
# array([1.        , 1.22314355, 1.51082562, 1.        , 1.91629073,
#        1.        , 1.91629073, 1.91629073])

pipe.transform(corpus).shape
# (4, 8)
```

Posted on 2020-04-07 20:40:21
You can try `pipe.named_steps['count']` instead of `pipe['count']`. To access your TF-IDF step (named 'tfid' in your pipeline), use `pipe.named_steps['tfid']`.
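As a minimal sketch of this suggestion, the question's pipeline can be rebuilt with the steps looked up through `named_steps`. The variable names `count_step`, `tfid_step`, `counts` and `idf` are introduced here only for illustration. To the best of my knowledge, direct indexing such as `pipe['count']` was only added in scikit-learn 0.21, so the error most likely points to an older installed version; treat that version number as an assumption and check your own release notes.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline

corpus = ['this is the first document',
          'this document is the second document',
          'and this is the third one',
          'is this the first document']
vocabulary = ['this', 'document', 'first', 'is', 'second', 'the',
              'and', 'one']

pipe = Pipeline([('count', CountVectorizer(vocabulary=vocabulary)),
                 ('tfid', TfidfTransformer())]).fit(corpus)

# named_steps maps each step name to its fitted estimator, so no
# subscripting of the Pipeline object itself is needed.
count_step = pipe.named_steps['count']
tfid_step = pipe.named_steps['tfid']

counts = count_step.transform(corpus).toarray()  # raw term counts per document
idf = tfid_step.idf_                             # learned IDF weight per term

# pipe.steps is a plain list of (name, estimator) tuples; converting it
# to a dict is another lookup that does not rely on Pipeline indexing.
same_step = dict(pipe.steps)['count']

print(counts.shape)
print(idf.shape)
```

If upgrading scikit-learn is an option, `pipe['count']` should then work as written in the question, and `named_steps` remains available either way.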
https://stackoverflow.com/questions/59507657