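I'm trying to run a twint search to retrieve a list of tweets, which I then run sentiment analysis on. I wrote a for loop that iterates over a pandas date range and runs a twint search with the given date arguments.
Here is my code: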
import twint
import pandas
from textblob import TextBlob

# Functions
def twint_to_pandas(columns): #Creds to Favio Vazques
    return twint.output.panda.Tweets_df[columns]

def getTweets(st, startDate, endDate): #runs a twint search and returns a pandas df
    c = twint.Config()
    c.Search = str(st)
    c.Limit = 20
    c.Lang = "en"
    c.Since = startDate
    c.Until = endDate
    #c.Verified = True
    c.Hide_output = True
    c.Pandas = True
    twint.run.Search(c)
    df = twint_to_pandas(["date", "username", "tweet"])
    return df

def getSentiScore(string):
    t = TextBlob(str(string)) #create a textblob class instance
    score = t.sentiment.polarity # get sentiment
    return score #pass it to next function

def getAverageScore(st, startDate, endDate):
    df = getTweets(st, startDate, endDate) #establish a variable for the fetched tweets
    results = [getSentiScore(str(x)) for x in df['tweet']] #list comprehension
    resultsDf = pandas.DataFrame(results, columns=['sentiScore']).dropna() #create dataframe for it
    mean = resultsDf['sentiScore'].mean() #get a mean sentiment score
    #median = resultsDf['sentiScore'].median()
    #mode = resultsDf['sentiScore'].mode()
    print("Mean" + str(mean)) # print the mean
    #print("Median" + str(median))
    #print("Mode" + str(mode))

def weeklyScoreToCSV(st, startDate, days):
    datetime = pandas.date_range(start=(str(startDate)), freq='D', periods=days, closed='left')
    datetimeDf = datetime.to_frame(index=False, name='date')
    datesDf = [i for i in (datetimeDf['date'])]
    dateLength = int(len(datesDf)-1)
    for i in range(0, dateLength):
        sentiScore = getAverageScore(st, str(datesDf[i]), str(datesDf[i+1]))
        #print(str(datesDf[i]) + str(datesDf[i+1]))

# Execution
#getAverageScore("Obama")
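weeklyScoreToCSV("a", '01/01/2019', 10)

Inside the weeklyScoreToCSV function, the getAverageScore call works fine whenever I type the date arguments in by hand. But when I run the code as given, I get the following error:
KeyError: "None of [Index(['date', 'username', 'tweet'], dtype='object')] are in the [columns]"
I can't figure out where I went wrong.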
Answered 2021-03-21 10:35:33
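I had a similar problem. Use Twitter's advanced search operators and change the search to searchstr = "(search string) until:2021-02-19 since:2021-02-17"
I suggest writing the whole query in the advanced search syntax of the Twitter website and passing it as c.Search = searchstr, rather than setting the other parameters on c (such as c.Since and c.Until).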
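As a rough sketch of that suggestion applied to the question's daily loop: the helper names below (build_search, daily_windows, fetch_tweets) are made up for illustration and are not part of twint. The sketch also assumes that an empty search leaves twint's Tweets_df without the expected columns, which would explain the KeyError; I have not confirmed that against the twint source.

```python
from datetime import date, timedelta


def build_search(term, since, until):
    """Return a query string using Twitter's since:/until: search operators."""
    return f"({term}) until:{until.isoformat()} since:{since.isoformat()}"


def daily_windows(start, days):
    """Yield (since, until) pairs of consecutive dates, one pair per day."""
    for offset in range(days):
        since = start + timedelta(days=offset)
        yield since, since + timedelta(days=1)


def fetch_tweets(term, since, until, limit=20):
    """Run one twint search for a single date window (needs twint installed)."""
    import twint  # imported lazily so the helpers above work without twint

    c = twint.Config()
    c.Search = build_search(term, since, until)  # dates live in the query itself
    c.Limit = limit
    c.Lang = "en"
    c.Hide_output = True
    c.Pandas = True
    twint.run.Search(c)

    df = twint.output.panda.Tweets_df
    wanted = ["date", "username", "tweet"]
    # Assumption: when a search returns nothing, the frame lacks these columns.
    # Return None in that case instead of letting df[wanted] raise a KeyError.
    columns = getattr(df, "columns", [])
    if any(col not in columns for col in wanted):
        return None
    return df[wanted]
```

It could then be used as `for since, until in daily_windows(date(2019, 1, 1), 10): df = fetch_tweets("a", since, until)`, skipping any day where the result is None.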
https://stackoverflow.com/questions/62554125