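I have two DataFrames, which I have joined. On the joined DataFrame I wrote a user-defined function that should create a new column named "Day_Sentiment", based on the timestamp and the value counts of a column, using the values returned by the conditions listed below. But I am getting an error. Please tell me how to do this.
Input: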
Date Content Cleaned-content Sentiment
11/12/2020 abb bbc abb Bad
12/10/2020 xyz xxy Good
11/24/2020 tyu yuu Neutral
12/16/2020 iop yui Bad
Output:
Date Content Cleaned-content Sentiment Day_Sentiment
11/12/2020 abb bbc abb Bad Bad
12/10/2020 xyz xxy Good Bad
11/24/2020 tyu yuu Neutral Bad
12/16/2020 iop yui Bad Bad
Here is what I have tried so far:
df = input_data.join(results)
def compare_def(df):
    no.bad_senti = df.loc[df['Sentiment'] == 'Bad']
    no.neut_senti = df.loc[df['Sentiment'] == 'Neutral']
    no.good_senti = df.loc[df['Sentiment'] == 'Good']
    if ((no.bad_senti > no.good_senti) & (no.bad_senti > no.neut_senti)):
        output = 'Bad'
    elif ((no.good_senti > no.bad_senti) & (no.good_senti > no.neut_senti)):
        output = 'Good'
    elif ((no.neut_senti > no.bad_senti) & (no.neut_senti > no.good_senti)):
        output = 'Neutral'
    elif no.good_senti == no.bad_senti:
        output = 'Neutral'
    elif no.bad_senti == no.neut_senti:
        output = 'bad'
    elif no.good_senti == no.neut_senti:
        output = 'good'
    else:
        output = 'Neutral'
    return output
df['Day_Sentiment'] = output
Alternatively:
output = compare_def(df)
df['Day_Sentiment'] = output
Error:
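ValueError: Can only compare identically-labeled DataFrame objects
Example 1: the predicted sentiments are 2 Bad, 1 Good, 1 Neutral.
The function checks 2>1 and 2>1, so it returns Bad.
Example 2: the sentiments are 2 Bad, 5 Good, 5 Neutral.
Function:
2>5 is false; 5>2 and 5>5 is false; 5>2 and 5>5 is false; 5==2 is false; 2==5 is false; 5==5 is true, so it returns good.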
Posted on 2020-12-22 22:46:45
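Your code has several problems. First, the variables bad, good, and neut are pandas objects of different lengths that hold the matching rows (string data), not counts. You then evaluate them in several conditional tests, such as if ((bad > good) & (bad > neut)), and comparing objects with different labels is what raises the ValueError. I'm not quite sure what logic you are trying to implement, but the following template may help: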
def compare_data(row):
    value = 'Good'
    # The logic here escapes me
    # Evaluate the contents of row['Sentiment'] and modify value
    return value

df["Day Sentiment"] = df.apply(lambda row: compare_data(row), axis=1)
Yields:
Date Content Cleaned-content Sentiment Day Sentiment
0 11/12/2020 abb bbc abb Bad Good
1 12/10/2020 xyz xxy Good Good
2 11/24/2020 tyu yuu Neutral Good
3 12/16/2020 iop yui Bad Good
https://stackoverflow.com/questions/65410017
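If the intent is the majority rule described in the question's two examples, one possible approach is to count the labels first and compare plain integers, which avoids comparing DataFrames altogether. This is only a sketch under assumptions: the helper name day_sentiment is made up for illustration, the tie rules are copied in the order of the question's elif chain, the returned labels are capitalized consistently ('Bad'/'Good' rather than 'bad'/'good'), and the result is computed over the whole frame, as the expected output in the question suggests.

```python
import pandas as pd


def day_sentiment(sentiments: pd.Series) -> str:
    """Hypothetical helper: pick one label from the counts of each sentiment."""
    counts = sentiments.value_counts()
    # Series.get returns the default when a label never occurs.
    bad = int(counts.get("Bad", 0))
    good = int(counts.get("Good", 0))
    neut = int(counts.get("Neutral", 0))

    # Strict majority cases.
    if bad > good and bad > neut:
        return "Bad"
    if good > bad and good > neut:
        return "Good"
    if neut > bad and neut > good:
        return "Neutral"
    # Tie rules, in the same order as the question's elif chain.
    if good == bad:
        return "Neutral"
    if bad == neut:
        return "Bad"
    if good == neut:
        return "Good"
    # Fallback mirroring the question's else branch.
    return "Neutral"


# Sample data taken from the question (only the columns needed here).
df = pd.DataFrame(
    {
        "Date": ["11/12/2020", "12/10/2020", "11/24/2020", "12/16/2020"],
        "Sentiment": ["Bad", "Good", "Neutral", "Bad"],
    }
)

# A scalar assigned to a column is broadcast to every row.
df["Day_Sentiment"] = day_sentiment(df["Sentiment"])
print(df)
```

If the label should instead be computed per date, the same helper could presumably be applied per group, for example by mapping df["Date"] through df.groupby("Date")["Sentiment"].agg(day_sentiment), but the question does not say which is wanted.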