I have a small script that is supposed to pull all comments from every user contained in the list below:
import asyncpraw
import pandas as pd
reddit = asyncpraw.Reddit(
    user_agent="XXX",
    client_id="XXX",
    client_secret="XXX",
    username="XXX",
    password="XXX",
)
dataset_one_author #(just a list of usernames)
body = []
author = []
created_utc = []
score = []
subreddit_id = []
permalink = []
count = 0
for index, row in dataset_one_author.iterrows():
    author_row = row['author']
    #try:
    count = count + 1
    print(count)
    for comment in reddit.redditor(author_row).comments.new(limit=None):
        body.append(comment.body)
        author.append(author_row)
        created_utc.append(comment.created_utc)
        score.append(comment.score)
        subreddit_id.append(comment.subreddit_id)
        permalink.append(comment.permalink)
    # except:
    #     body.append("user_deleted")
    #     author.append("user_deleted")
    #     created_utc.append("user_deleted")
    #     score.append("user_deleted")
    #     subreddit_id.append("user_deleted")
    #     permalink.append("user_deleted")
    #     #continue
a = pd.DataFrame(author, columns =['author'])
a['body'] = pd.DataFrame(body)
a['created_utc'] = pd.DataFrame(created_utc)
a['score'] = pd.DataFrame(score)
a['subreddit_id'] = pd.DataFrame(subreddit_id)
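a['permalink'] = pd.DataFrame(permalink)
When I run this script with regular praw, it works fine (although very slowly). As suggested, I switched to asyncpraw, and now I get "AttributeError: 'coroutine' object has no attribute 'comments'". I know I need to "await" something, but I can't figure out where; any help is much appreciated. If you happen to spot an obvious performance problem as well, I'd be glad to hear about it.
(I have commented out the try/except so that the async errors surface; the try/except is normally there to catch 404 errors from deleted users.)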
Posted on 2022-01-24 18:22:49
It sounds like you haven't awaited the async iterator, so instead of
for comment in reddit.redditor(author_row).comments.new(limit=None):
try
async for comment in reddit.redditor(author_row).comments.new(limit=None):
https://stackoverflow.com/questions/70838604
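The error text ('coroutine' object has no attribute 'comments') suggests that `reddit.redditor(...)` itself returns a coroutine, so it probably needs an `await` before `.comments` is touched, in addition to the `async for`. The sketch below shows that two-step pattern without any network access. `FakeReddit`, `FakeRedditor`, `FakeListing`, and `FakeComment` are hypothetical stand-ins that only mimic the shape implied by the question, not the real asyncpraw classes, and `LookupError` stands in for whatever a deleted user raises. It also runs the per-user fetches concurrently with `asyncio.gather`, which may help with the slowness mentioned in the question.

```python
import asyncio


class FakeComment:
    """Hypothetical stand-in for a comment object."""
    def __init__(self, body, score):
        self.body = body
        self.score = score


class FakeListing:
    """Hypothetical stand-in for `redditor.comments`."""
    def __init__(self, comments):
        self._comments = comments

    async def new(self, limit=None):
        # Async generator: it has to be consumed with `async for`.
        for comment in self._comments[:limit]:
            await asyncio.sleep(0)  # simulate a network round trip
            yield comment


class FakeRedditor:
    """Hypothetical stand-in for a redditor object."""
    def __init__(self, name, comments):
        self.name = name
        self.comments = FakeListing(comments)


class FakeReddit:
    """Hypothetical stand-in client; `redditor()` is a coroutine function."""
    def __init__(self, data):
        self._data = data

    async def redditor(self, name):
        await asyncio.sleep(0)
        if name not in self._data:
            raise LookupError(name)  # stand-in for a deleted-user error
        comments = [FakeComment(b, s) for b, s in self._data[name]]
        return FakeRedditor(name, comments)


async def fetch_comments(reddit, name):
    """Return one dict per comment for a single user, or [] on failure."""
    rows = []
    try:
        redditor = await reddit.redditor(name)  # step 1: await the coroutine
        # step 2: iterate the listing asynchronously
        async for comment in redditor.comments.new(limit=None):
            rows.append(
                {"author": name, "body": comment.body, "score": comment.score}
            )
    except LookupError:
        return []
    return rows


async def main():
    reddit = FakeReddit({"alice": [("hi", 3), ("yo", 1)], "bob": [("hey", 2)]})
    names = ["alice", "bob", "ghost"]
    # Fetch all users concurrently; gather keeps results in input order.
    per_user = await asyncio.gather(*(fetch_comments(reddit, n) for n in names))
    return [row for rows in per_user for row in rows]


rows = asyncio.run(main())
print(rows)
```

Collecting one dict per comment also lets the table be built in a single `pd.DataFrame(rows)` call instead of six parallel lists. With a real client, an unbounded `gather` over many users could run into rate limits, so capping concurrency (for example with `asyncio.Semaphore`) may be worth considering.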