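I'm having a problem: twint does not seem to work properly. I can only scrape tweets from the past 10 days.
I tried modifying twint's source files as suggested here, but without success: https://issueexplorer.com/issue/twintproject/twint/1253
Does anyone have a suggestion for a fix? Or another package I could use?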
import twint
c = twint.Config()
c.Search = "biden"
c.Lang = "en"
c.Since = "2021-01-01"
c.Limit = 5000
c.Pandas = True
c.Show_hashtags = False
c.Hide_output = True
# Run search
try:
    twint.run.Search(c)
except:
    import nest_asyncio
    nest_asyncio.apply()

Posted on 2021-10-19 16:04:51
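I suggest using snscrape to scrape any number of tweets from an account or by keyword.
You can find more on how to search by keyword in this Medium article. Below is my code for scraping all tweets from a given account: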
import sqlite3 as sq
import pandas as pd
import snscrape.modules.twitter as sntwitter
maxTweets = 5086
# Creating list to append tweet data to
tweets_list = []
source="Twitter"
# Using TwitterSearchScraper to scrape data
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('from:peopleagainstVE').get_items()):
    # Stop once maxTweets tweets have been collected
    if i >= maxTweets:
        break
    print(i)
    tweets_list.append([tweet.id, tweet.url, tweet.user.username, tweet.content, tweet.date, source, tweet.retweetCount, tweet.likeCount, tweet.replyCount])
tweets_df = pd.DataFrame(tweets_list,columns=['Tweet_ID', 'URL', "Account_Name", 'Text', 'Datetime','Source','Number_Retweets', 'Number_Likes', 'Number_Comments'])
print(tweets_df)
data = tweets_df
sql_data = 'tweets2_PAVE.sqlite'  # name of the SQLite database file
conn = sq.connect(sql_data)
cur = conn.cursor()
cur.execute('''DROP TABLE IF EXISTS tweets2_PAVE''')
data.to_sql('tweets2_PAVE', conn, if_exists='replace', index=False)  # write the DataFrame to the SQLite DB
pd.read_sql('select * from tweets2_PAVE', conn)
conn.commit()
conn.close()

https://stackoverflow.com/questions/69584919
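For the keyword case from the question ("biden", English, since 2021-01-01), here is a minimal sketch of how the same snscrape approach might be adapted. It assumes that snscrape still works against the site and that the standard Twitter search operators `lang:`, `since:` and `until:` are honored by `TwitterSearchScraper`. The helper names `build_query`, `take_rows` and `scrape_keyword` are made up for this illustration and are not part of snscrape.

```python
from itertools import islice


def build_query(keyword, since=None, until=None, lang=None):
    """Build a Twitter search string from a keyword and optional filters."""
    parts = [keyword]
    if lang:
        parts.append(f"lang:{lang}")
    if since:
        parts.append(f"since:{since}")  # start date, YYYY-MM-DD
    if until:
        parts.append(f"until:{until}")  # tweets before this date, YYYY-MM-DD
    return " ".join(parts)


def take_rows(items, limit):
    """Keep at most `limit` tweets and flatten each one into a list."""
    return [[t.id, t.url, t.user.username, t.content, t.date]
            for t in islice(items, limit)]


def scrape_keyword(keyword, since, limit, lang="en"):
    """Hypothetical wrapper: search by keyword and return a DataFrame."""
    # Imported here so the helpers above work without snscrape installed.
    import pandas as pd
    import snscrape.modules.twitter as sntwitter

    query = build_query(keyword, since=since, lang=lang)
    items = sntwitter.TwitterSearchScraper(query).get_items()
    columns = ["Tweet_ID", "URL", "Account_Name", "Text", "Datetime"]
    return pd.DataFrame(take_rows(items, limit), columns=columns)


# Example call (needs network access):
# tweets_df = scrape_keyword("biden", since="2021-01-01", limit=5000)
```

Using `islice` caps the number of tweets without a manual counter, and keeping the query builder separate makes it easy to check the search string before sending any requests.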