Can someone help me with the syntax for tagging a corpus in NLTK?
What do I import for the hunpos.HunPosTagger module?

import nltk
from nltk.corpus import PlaintextCorpusReader
from nltk.corpus.util import LazyCorpusLoader
corpus_root = './'
reader = PlaintextCorpusReader(corpus_root, '.*')
ntuen = LazyCorpusLoader('ntumultien', PlaintextCorpusReader, reader)
ntuen.fileids()
isinstance(ntuen, PlaintextCorpusReader)
# So how do I hunpos tag `ntuen`? I can't get the following code to work.
# please help me to correct my python syntax errors, I'm new to python
# but i really need this to work. sorry
##from nltk.tag import hunpos.HunPosTagger
ht = HunPosTagger('english.model')
for sentence in ntu.sent() ##looping through the no. of sentence
ht.tag(ntusent()[i])

Posted on 2011-02-23 22:16:01
import nltk
from nltk.tag.hunpos import HunposTagger
from nltk.tokenize import word_tokenize
corpus = "so how do i hunpos tag my ntuen ? i can't get the following code to work."
#please help me to correct my python syntax errors, i'm new to python
#but i really need this to work. sorry
##from nltk.tag import hunpos.HunPosTagger
ht = HunposTagger('en_wsj.model')
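print(ht.tag(word_tokenize(corpus)))

I think the problem is that you weren't tokenizing the words, but there may be other reasons the code doesn't work (the class is HunposTagger, not HunPosTagger). I made a simplified example from your question. If you have any more questions, please leave a comment.
I got everything from here: http://code.google.com/p/hunpos/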
python hunpos.py
('so','RB'),('how','WRB'),('do','VBP'),('i','FW'),('hunpos','NN'),('tag','NN'),('my','PRP$'),('ntuen','NN'),('?','.'),('i','FW'),('ca','MD'),("n't",'RB'),('get','VB'),('the','DT'),('following','JJ'),('code','NN'),('to','TO'),('work','VB'),('.','.')
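
To tag the whole corpus from the question rather than a single string, something along these lines might work. This is only a rough sketch: the helper names tag_sentences and tag_corpus_with_hunpos and the default file pattern are my own assumptions, not part of NLTK, and it assumes the hunpos-tag binary, a model file such as en_wsj.model, and NLTK's sentence tokenizer data are already installed.

```python
def tag_sentences(tagger, sentences):
    """Tag each tokenized sentence with any object exposing tag(tokens).

    Hypothetical helper: `tagger` can be a HunposTagger or any other
    NLTK-style tagger; `sentences` is an iterable of token lists.
    """
    return [tagger.tag(list(tokens)) for tokens in sentences]


def tag_corpus_with_hunpos(corpus_root, model_path, fileids=r'.*\.txt'):
    """Tag every sentence of the plain-text files under corpus_root.

    Assumption: only files matching `fileids` belong to the corpus.
    Nothing here checks that the hunpos binary or the model exists.
    """
    # Imported inside the function so tag_sentences stays usable
    # even where NLTK is not installed.
    from nltk.corpus import PlaintextCorpusReader
    from nltk.tag.hunpos import HunposTagger

    reader = PlaintextCorpusReader(corpus_root, fileids)
    tagger = HunposTagger(model_path)
    try:
        # reader.sents() yields each sentence as a list of word tokens,
        # so no extra call to word_tokenize is needed here.
        return tag_sentences(tagger, reader.sents())
    finally:
        # Shut down the hunpos subprocess started by the tagger.
        tagger.close()
```

You would then call something like tag_corpus_with_hunpos('./', 'en_wsj.model') and loop over the result, which holds one list of (word, tag) pairs per sentence. Reading the files directly with PlaintextCorpusReader avoids the LazyCorpusLoader step, which is meant for corpora installed in the nltk_data directory.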
https://stackoverflow.com/questions/5088448