Sentiment analysis using Python

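Data Science user

Asked on 2014-11-30 00:42:28

3 answers · 1.7K views · 0 followers · 2 votes

I have some text files containing movie reviews, and I need to find out whether each review is good or bad. I tried the following code, but it doesn't work: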

import nltk
with open("c:/users/user/desktop/datascience/moviesr/movies-1-32.txt", 'r') as m11:
    mov_rev = m11.read()
mov_review1=nltk.word_tokenize(mov_rev)
bon="crap aweful horrible terrible bad bland trite sucks unpleasant boring dull moronic dreadful disgusting distasteful flawed ordinary slow senseless unoriginal weak wacky uninteresting unpretentious "
bag_of_negative_words=nltk.word_tokenize(bon)
bop="Absorbing Big-Budget Brilliant Brutal Charismatic Charming Clever Comical Dazzling Dramatic Enjoyable Entertaining Excellent Exciting  Expensive Fascinating Fast-Moving First-Rate Funny Highly-Charged Hilarious Imaginative Insightful Inspirational Intriguing Juvenile Lasting Legendary Pleasant Powerful Ripping Riveting Romantic Sad  Satirical Sensitive  Sentimental Surprising Suspenseful Tender Thought Provoking Tragic Uplifting Uproarious"
bop.lower()
bag_of_positive_words=nltk.word_tokenize(bop)
vec=[]
for i in bag_of_negative_words:
    if i in mov_review1:
        vec.append(1)
    else:
        for w in bag_of_positive_words:
            if w in moview_review1:
                vec.append(5)

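So, I want to check whether the review contains positive or negative words. If it contains a negative word, the value 1 should be appended to the vector vec; otherwise the value 5 should be appended. But the output I get is an empty vector.

Please help. Also, please suggest other ways of approaching this problem.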

python

nlp

sentiment-analysis

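3 Answers

Data Science user

Posted on 2014-12-02 20:07:14

Try the "bad words" list that Google has published (linked in the original post as "Google's official list of bad words"). For good words there is also a list, linked as "not the official list of good words".

For the code, I would do something like this: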

import difflib

# Read the review and split it into whitespace-separated words
with open('dir_to_your_text', 'r') as f:
    textArray = f.read().split()

# Bad words should be listed like this for the split function to work:
# "*** ****** **** ****" (the stars stand in for the censored words :P)
with open('dir_to_your_bad_word_file', 'r') as f:
    badArray = f.read().split()
with open('dir_to_your_good_word_file', 'r') as f:
    goodArray = f.read().split()

# Then use the matching algorithm from difflib to compare every good and
# bad word against every word of the text
goodMatchingCounter = 0.0
badMatchingCounter = 0.0

for goodWord in goodArray:
    for word in textArray:
        goodMatchingCounter += difflib.SequenceMatcher(None, goodWord, word).ratio()

for badWord in badArray:
    for word in textArray:
        badMatchingCounter += difflib.SequenceMatcher(None, badWord, word).ratio()

# Normalise by the number of comparisons and express as a percentage
goodMatchingCounter *= 100 / (len(goodArray) * len(textArray))
badMatchingCounter *= 100 / (len(badArray) * len(textArray))

print('Show the good measurement of the text in %: ' + str(goodMatchingCounter))
print('Show the bad measurement of the text in %: ' + str(badMatchingCounter))
print('Show the hotness of the text: ' + str(len(textArray) * goodMatchingCounter))

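The code will be slow but accurate :) I have not run or tested it, so please do that for me and post the corrected code :) because I would like to test it too :)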
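To get a feel for what the `SequenceMatcher(...).ratio()` averaging does, here is a minimal sketch on toy data. The helper name `average_similarity`, the two word lists and the sample review are all made up for illustration; they are not from the answer above.

```python
import difflib

def average_similarity(lexicon, words):
    """Mean SequenceMatcher ratio over every (lexicon word, text word) pair."""
    if not lexicon or not words:
        return 0.0
    total = sum(
        difflib.SequenceMatcher(None, lex_word, word).ratio()
        for lex_word in lexicon
        for word in words
    )
    return total / (len(lexicon) * len(words))

# Hypothetical toy data, for illustration only
good_words = ["excellent", "funny"]
bad_words = ["boring", "dull"]
review = "an excellent and funny film".split()

good_score = average_similarity(good_words, review)
bad_score = average_similarity(bad_words, review)
print(good_score, bad_score)
```

Note that the bad score is not zero even though no bad word appears in the review: fuzzy matching gives partial credit to any pair of words that share letters, so the two scores are only meaningful relative to each other.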

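Votes: 1

Data Science user

Posted on 2014-12-07 16:17:20

The link below contains a list of positive and negative sentiment words, each rated on a scale from -5 to 5. Just try computing a score based on word matching, and you will get an overall score for the movie review.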

AFINN
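As a rough sketch of the word-matching score described above: the lexicon below is a hypothetical four-word stand-in written in the same spirit (word mapped to an integer between -5 and 5). Its values are invented for illustration and are not taken from the real AFINN list; `score_review` is likewise an assumed helper name.

```python
import re

# Hypothetical mini-lexicon (word -> integer valence in [-5, 5]).
# These values are illustrative only, not copied from AFINN.
toy_lexicon = {"excellent": 3, "funny": 2, "boring": -2, "terrible": -3}

def score_review(text, lexicon):
    """Sum the valence of every word in the text; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)

print(score_review("An excellent, funny film.", toy_lexicon))  # 5
print(score_review("Boring and terrible.", toy_lexicon))       # -5
```

A positive total suggests a favourable review and a negative total an unfavourable one. With the real list, the dictionary would be loaded from the downloaded file instead of being written by hand.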

Votes: 1

Data Science user

Posted on 2014-12-01 09:58:00

Try this:

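vec = []

# One entry per negative word that occurs in the review
for word in bag_of_negative_words:
    if word in mov_review1:
        vec.append(1)

# One entry per positive word that occurs in the review
for word in bag_of_positive_words:
    if word in mov_review1:
        vec.append(5)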
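Two details in the question's code are worth checking as well. `bop.lower()` returns a new string that is thrown away, so the positive words stay capitalised and will not match lowercase review text. The inner loop also refers to `moview_review1`, a name that is never defined. Below is a minimal self-contained sketch of the same idea. It makes several assumptions: a simple regex tokenizer stands in for `nltk.word_tokenize`, the word lists are shortened, the review is a made-up sentence rather than the file, and the loop walks over the review tokens so that `vec` follows the order of the text.

```python
import re

def tokenize(text):
    """Lowercase the text and return its words (hyphenated words kept whole)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

# Shortened word lists; sets make the membership test fast
negative = set(tokenize("crap horrible terrible bad boring dull"))
positive = set(tokenize("Brilliant Charming Clever Enjoyable Excellent Funny"))

# Made-up sample review, standing in for the contents of the text file
review = "A clever and funny script, but the ending was dull."

vec = []
for token in tokenize(review):
    if token in negative:
        vec.append(1)
    elif token in positive:
        vec.append(5)

print(vec)  # [5, 5, 1]
```

Lowercasing both the word lists and the review in one place avoids the case mismatch entirely.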

Votes: 0

Original page content provided by Data Science.

Original link:

https://datascience.stackexchange.com/questions/2568
