Q: spaCy dependency parsing with a pandas DataFrame

Stack Overflow user

Asked 2021-04-19 00:35:57

Answers: 1, Views: 265, Followers: 0, Votes: 0

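I want to run spaCy's dependency parser on my pandas DataFrame to extract noun-adjective pairs for aspect-based sentiment analysis. I tried the code from "Named Entity Recognition in aspect-opinion extraction using dependency rule matching" on the Amazon Fine Food Reviews dataset from Kaggle.

However, something seems to be wrong with the way I feed my pandas DataFrame to spaCy. The results are not what I expect. Could someone help me debug this? Thanks a lot.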

!python -m spacy download en_core_web_lg
import nltk
nltk.download('vader_lexicon')

import spacy
nlp = spacy.load("en_core_web_lg")

from nltk.sentiment.vader import SentimentIntensityAnalyzer
sid = SentimentIntensityAnalyzer()


def find_sentiment(doc):
    # find roots of all entities in the text
  for i in df['Text'].tolist():
    doc = nlp(i)
    ner_heads = {ent.root.idx: ent for ent in doc.ents}
    rule3_pairs = []
    for token in doc:
        children = token.children
        A = "999999"
        M = "999999"
        add_neg_pfx = False
        for child in children:
            if(child.dep_ == "nsubj" and not child.is_stop): # nsubj is nominal subject
                if child.idx in ner_heads:
                    A = ner_heads[child.idx].text
                else:
                    A = child.text
            if(child.dep_ == "acomp" and not child.is_stop): # acomp is adjectival complement
                M = child.text
            # example - 'this could have been better' -> (this, not better)
            if(child.dep_ == "aux" and child.tag_ == "MD"): # MD is modal auxiliary
                neg_prefix = "not"
                add_neg_pfx = True
            if(child.dep_ == "neg"): # neg is negation
                neg_prefix = child.text
                add_neg_pfx = True
        if (add_neg_pfx and M != "999999"):
            M = neg_prefix + " " + M
        if(A != "999999" and M != "999999"):
            rule3_pairs.append((A, M, sid.polarity_scores(M)['compound']))
    return rule3_pairs
df['three_tuples'] = df['Text'].apply(find_sentiment) 
df.head()

My result looks like this, which clearly means something is wrong with my loop:

sentiment-analysis

python

pandas

nlp

spacy

1 Answer

Stack Overflow user

Accepted answer

Posted 2021-04-19 02:36:09

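When you call apply on df['Text'], you are in effect iterating over every value in that column and passing each value to the function, one at a time.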

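Here, however, your function itself loops over the same DataFrame column you are applying it to, and it also overwrites the value passed in at the very start of the function.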
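To make the effect concrete, here is a minimal sketch that reproduces the same pattern without spaCy. The toy DataFrame, the `buggy` and `fixed` function names, and the upper-casing step are all assumptions made for illustration; upper-casing simply stands in for the parsing work. Because the `return` sits inside the inner loop, the buggy version should hand back the result for the first row no matter which row apply passes in.

```python
import pandas as pd

# Toy stand-in for the review data; only the column name mirrors the question.
df = pd.DataFrame({"Text": ["good coffee", "stale chips", "great tea"]})


def buggy(doc):
    # Ignores the value that apply passes in and walks the whole column instead.
    for i in df["Text"].tolist():
        doc = i.upper()   # overwrites the argument with a value from the loop
        result = [doc]
        return result     # returns during the first pass, so only row 0 is used


def fixed(text):
    # Works only on the single value that apply hands over.
    return [text.upper()]


df["buggy"] = df["Text"].apply(buggy)
df["fixed"] = df["Text"].apply(fixed)
print(df)
```

Every entry in the `buggy` column comes from the first review, while the `fixed` column holds one result per row.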

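So I would start by rewriting the function as shown below and see whether it produces the expected results. I can't say for sure, because you didn't post any sample data, but this should at least move things forward: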

def find_sentiment(text):
    doc = nlp(text)
    ner_heads = {ent.root.idx: ent for ent in doc.ents}
    rule3_pairs = []
    for token in doc:
        children = token.children
        A = "999999"
        M = "999999"
        add_neg_pfx = False
        for child in children:
            if(child.dep_ == "nsubj" and not child.is_stop): # nsubj is nominal subject
                if child.idx in ner_heads:
                    A = ner_heads[child.idx].text
                else:
                    A = child.text
            if(child.dep_ == "acomp" and not child.is_stop): # acomp is adjectival complement
                M = child.text
            # example - 'this could have been better' -> (this, not better)
            if(child.dep_ == "aux" and child.tag_ == "MD"): # MD is modal auxiliary
                neg_prefix = "not"
                add_neg_pfx = True
            if(child.dep_ == "neg"): # neg is negation
                neg_prefix = child.text
                add_neg_pfx = True
        if (add_neg_pfx and M != "999999"):
            M = neg_prefix + " " + M
        if(A != "999999" and M != "999999"):
            rule3_pairs.append((A, M, sid.polarity_scores(M)['compound']))
    return rule3_pairs

Votes: 2

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/67150944
