I need to iterate over a list and add each word's synonyms and hyponyms to the list. For example:
list_of_words = ["bird", "smart", "cool", "happy"]
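list_of_words = list_of_words + list_of_words_synonyms + list_of_words_hypnonyms
I can get the synonyms and hyponyms of a single word, but I need to iterate over a list of values.
s = wordnet.synset(word)[0]
This needs to return a list in which each word's synonyms have been added to the original list.
The expected result is: list_of_words = "bird", "smart", "cool", "happy", "hen", "rooster", ...other synonyms, "clever", "intelligent", ...other synonyms of smart, and so on.
How do I make the synset function iterate over list_of_words and include these words in the list? I am new to text analysis. Any help is much appreciated.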
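Posted on 2016-04-14 05:42:12
(Creating this new answer instead of updating my existing one, because the question has been updated a lot.)
I finally got it working by installing the "pattern" package and debugging to see what wordnet.synsets() returns. Here is the working code: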
from pattern.en import wordnet
list_of_words = [u"bird", u"smart", u"cool", u"happy"]
list_of_words_synonyms = []
list_of_words_hypnonyms = []
for word in list_of_words:
    sts = wordnet.synsets(word)
    if len(sts):
        # use only the first synset found for the word
        st = sts[0]
        list_of_words_synonyms.extend(st.synonyms)
        # take the first sense of each hyponym synset
        list_of_words_hypnonyms.extend(hs.senses[0] for hs in st.hyponyms())
list_of_words = list_of_words + list_of_words_synonyms + list_of_words_hypnonyms
print(list_of_words)
Please note: to include every sense of each hyponym rather than only the first, the hyponym line can be written as:
list_of_words_hypnonyms.extend(sense for hs in st.hyponyms() for sense in hs.senses)
The result is:
[u'bird', u'smart', u'cool', u'happy', u'bird', u'smart', u'smarting', u'smartness', u'cool', u'dickeybird', u'cock', u'hen', u'nester', u'night bird', u'bird of passage', u'protoavis', u'archaeopteryx', u'Sinornis', u'Ibero-mesornis', u'archaeornis', u'ratite', u'carinate', u'passerine', u'nonpasserine bird', u'bird of prey', u'gallinaceous bird', u'parrot', u'cuculiform bird', u'coraciiform bird', u'apodiform bird', u'caprimulgiform bird', u'piciform bird', u'trogon', u'aquatic bird', u'twitterer']
Posted on 2016-04-14 01:03:00
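Here is a quick implementation. Don't worry too much about fakesynsets; it is just a mock of wordnet.synsets. You can skip straight to the code that follows the function.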
from collections import namedtuple
def fakesynsets(word):
    # mock of wordnet.synsets: returns a single fake synset with made-up data
    synset = namedtuple('synset', ['synonyms', 'hyponyms'])
    return [synset(synonyms=[word + 'syn' + str(ii) for ii in range(1, 3)],
                   hyponyms=lambda: [word + 'hyp' + str(ii) for ii in range(1, 3)])]
list_of_words = ["bird", "smart", "cool", "happy"]
list_of_words_synonyms = []
list_of_words_hypnonyms = []
for word in list_of_words:
    # take the first synset of each word and collect its synonyms and hyponyms
    s = fakesynsets(word)[0]
    list_of_words_synonyms.extend(s.synonyms)
    list_of_words_hypnonyms.extend(s.hyponyms())
list_of_words = list_of_words + list_of_words_synonyms + list_of_words_hypnonyms
print(list_of_words)
https://stackoverflow.com/questions/36610441
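One thing worth noting about the approach in this thread: concatenating the three lists keeps duplicates, because a word is usually listed among its own synonyms (the printed result in the first answer contains 'bird', 'smart' and 'cool' twice). Below is a minimal sketch of the same loop wrapped in a function that also removes duplicates while keeping first-seen order. The names `expand_words`, `stub_synsets` and `StubSynset` are hypothetical and not part of pattern or any other library. The stub only imitates the `synonyms` attribute and `hyponyms()` call used above, and its `hyponyms()` returns plain strings, as the mock in the second answer does, rather than synset objects. It assumes Python 3.7+, where `dict` preserves insertion order.

```python
from collections import namedtuple

# Hypothetical stand-in for a WordNet synset; it only mimics the two
# members used in this thread: .synonyms and .hyponyms().
StubSynset = namedtuple("StubSynset", ["synonyms", "hyponyms"])


def stub_synsets(word):
    """Hypothetical lookup returning made-up data, for illustration only."""
    if word == "unknown":
        return []  # simulate a word that has no synsets
    return [StubSynset(
        synonyms=[word, word + "_syn"],
        hyponyms=lambda: [word + "_hyp"],
    )]


def expand_words(words, lookup):
    """Return words plus the synonyms and hyponyms of each word's first
    synset, with duplicates dropped and first-seen order preserved."""
    synonyms, hyponyms = [], []
    for word in words:
        synsets = lookup(word)
        if not synsets:
            continue  # skip words the lookup does not know
        first = synsets[0]
        synonyms.extend(first.synonyms)
        hyponyms.extend(first.hyponyms())
    # dict.fromkeys keeps the first occurrence of each key, in insertion order
    return list(dict.fromkeys(list(words) + synonyms + hyponyms))


expanded = expand_words(["bird", "unknown", "smart"], stub_synsets)
print(expanded)
```

Passing the lookup in as an argument keeps the function testable without WordNet installed. With the real package, the hyponym step would need to pull words out of the returned synsets (for example via `senses`, as the first answer does) instead of using them directly.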