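I am building a sentiment analysis algorithm that splits a .txt corpus into sentences and words, but I am getting an error in my code and I don't know how to fix it.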
class Splitter(object):
    def _init_(self):
        self.nltk_splitter = nltk.data.load('tokenizers/punkt/english/pickle')
        self.nltk_tokenizer = nltk.tokenize.TreebankWordTokenizer()

    def split(self, text):
        """input format: a .txt file
        output format: a list of lists of words,
        e.g. [['this', 'is'], ['life', 'worth', 'living']]"""
        sentences = self.nltk_splitter.tokenize(text)
        tokenized_sentences = [self.nltk_tokenizer.tokenize(sent) for sent in sentences]
        return tokenized_sentences

Then I did the following:
>>> f = open('amazonshoes.txt')
>>> raw = f.read()
>>> text = nltk.Text(raw)
>>> splitter = Splitter()
>>> splitted_sentences = splitter.split(text)

The error is:
Traceback (most recent call last):
  File "<pyshell#21>", line 1, in <module>
    splitted_sentences = splitter.split(text)
  File "<pyshell#14>", line 9, in split
    sentences = self.nltk_splitter.tokenize(text)
AttributeError: 'Splitter' object has no attribute 'nltk_splitter'

Answered 2014-07-02 20:30:53
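The constructor of the class Splitter should be named __init__, with two leading and two trailing underscores.
As written, the _init_ method (single underscores) is just an ordinary method and is never executed automatically, so the Splitter object you create by calling Splitter() never gets the attribute nltk_splitter.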
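
To make the difference concrete, here is a minimal sketch that reproduces the error and the fix without NLTK. The class names BrokenSplitter and FixedSplitter, the attribute sentence_pattern, and the regex-based splitting are all hypothetical stand-ins for the question's Punkt and Treebank tokenizers, so the token output is cruder than what NLTK would give; only the constructor naming is the point.

```python
import re


class BrokenSplitter(object):
    # Single underscores: Python treats this as an ordinary method,
    # so it is NOT called when BrokenSplitter() is instantiated.
    def _init_(self):
        self.sentence_pattern = re.compile(r'(?<=[.!?])\s+')

    def split(self, text):
        # Fails here: self.sentence_pattern was never assigned.
        return [s.split() for s in self.sentence_pattern.split(text)]


class FixedSplitter(object):
    # Double underscores: this is the real constructor hook,
    # called automatically by FixedSplitter().
    def __init__(self):
        # Stand-in for the sentence tokenizer (assumption, not NLTK).
        self.sentence_pattern = re.compile(r'(?<=[.!?])\s+')

    def split(self, text):
        sentences = self.sentence_pattern.split(text.strip())
        # Stand-in for the word tokenizer: plain whitespace split.
        return [s.split() for s in sentences if s]


try:
    BrokenSplitter().split("This is life. Worth living.")
    broken_error = None
except AttributeError as exc:
    broken_error = str(exc)

result = FixedSplitter().split("This is life. Worth living.")
print(broken_error)
print(result)
```

In the question's class the same one-line change applies: rename _init_ to __init__ and leave the body of the method as it is.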
https://stackoverflow.com/questions/24531078