Q: Using spaCy and textacy: need to find TF-IDF scores across a corpus of raw tweets, but cannot import the textacy Vectorizer

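Stack Overflow user

Asked 2018-04-20 15:01:45

1 answer, 3.3K views, 0 followers, 3 votes

I am new to both of these frameworks and to NLP. I am following an example that gives the code snippet below for computing the TF-IDF scores of all tokens in the tweets. However, I keep getting either an import error or a "Vectorizer is not defined" error.

Code: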

import spacy
from textacy.vsm import Vectorizer

vectorizer = Vectorizer(weighting='tfidf')
# One list of lemmas per tweet; spacy_tweets holds the spaCy-parsed tweets
term_matrix = vectorizer.fit_transform(
    [tok.lemma_ for tok in doc] for doc in spacy_tweets
)

Errors received:

from textacy.vsm import Vectorizer
ImportError: cannot import name 'Vectorizer'
//
import textacy
vectorizer = textacy.Vectorizer(weighting='tfidf')
AttributeError: module 'textacy' has no attribute 'Vectorizer'
//
import textacy
vectorizer = Vectorizer(weighting='tfidf')
NameError: name 'Vectorizer' is not defined

My environment

operating system: windows 10 64bit
python version: Python 3.6.4 :: Anaconda, Inc.
spacy version: 1.9.0-np111py36_vc14_1 installed
spacy models: en_core_web_sm 
textacy version: 0.3.4-py36_0

What is the correct import statement to access the textacy Vectorizer class?

python-3.x

tf-idf

spacy

textacy

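1 Answer

Stack Overflow user

Accepted answer

Answered 2018-04-20 17:16:59

Installing with conda gives you textacy version 0.3.4, and that version does not have the Vectorizer. Install textacy from the PyPI project instead.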

https://pypi.org/project/textacy/
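
To confirm which textacy version an environment actually picked up, one option is to ask the package metadata directly. The following is only a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`, which the asker's Python 3.6.4 does not ship). The helper name `installed_version` is hypothetical, and the answer only establishes that 0.3.4 lacks the Vectorizer, not which later release added it.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


found = installed_version("textacy")
if found is None:
    print("textacy is not installed in this environment")
elif found == "0.3.4":
    # Per the answer, the 0.3.4 build installed by conda has no Vectorizer.
    print("textacy 0.3.4 found: reinstall from PyPI to get the Vectorizer")
else:
    print("textacy", found, "found")
```

If this reports 0.3.4 after reinstalling, the interpreter is probably still resolving the conda-installed copy.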

To check whether the Vectorizer is available, you can do the following:

In [1]: import textacy

In [2]: dir(textacy)
Out[2]:
['Corpus',
'Doc',
'TextStats',
'TopicModel',
'Vectorizer',
'__builtins__',
'__cached__',
'__doc__',
'__file__',
'__loader__',
'__name__',
'__package__',
'__path__',
'__spec__',
'__version__',
'about',
'absolute_import',
'cache',
'compat',
'constants',
'corpus',
'data_dir',
'doc',
'extract',
'io',
'load_spacy',
'logger',
'logging',
'network',
'os',
'preprocess',
'preprocess_text',
'spacy_utils',
'text_stats',
'text_utils',
'tm',
'utils',
'viz',
'vsm']
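
The same check can be done programmatically, for example at the top of a script. This is a sketch rather than textacy's documented usage: `resolve_first` is a hypothetical helper, and the two candidate locations are simply the top-level name shown in the `dir(textacy)` listing and the `textacy.vsm` path from the question.

```python
import importlib


def resolve_first(candidates):
    """Return the first attribute resolvable from (module_name, attr_name) pairs.

    Returns None when no candidate module imports or none has the attribute.
    """
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module missing in this install, try the next location
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    return None


# Candidate locations (assumptions taken from the listing and the question).
Vectorizer = resolve_first(
    [("textacy", "Vectorizer"), ("textacy.vsm", "Vectorizer")]
)
if Vectorizer is None:
    print("No Vectorizer found: textacy is missing or too old")
else:
    print("Vectorizer available from", Vectorizer.__module__)
```

Failing early with a clear message like this avoids the less obvious `NameError` and `AttributeError` shown in the question.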

3 votes

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/49944599
