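I have a code block (see below) that aggregates the timestamps of a large dataset. Each timestamp represents one tweet. The code aggregates the tweets per week and works well. I now also have a column that holds the sentiment value of each tweet. Is it possible to compute the average sentiment of the tweets per week? Ideally I would end up with a dataset that contains the number of tweets per week and the average sentiment of those aggregated tweets. Any hints are welcome :)
Greetings, Daniel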
weekly_counts_2 <- df_bw %>%
  drop_na(Timestamp) %>%
  mutate(weekly_cases = floor_date(
    Timestamp,
    unit = "week")) %>%
  count(weekly_cases) %>%
  tidyr::complete(
    weekly_cases = seq.Date(
      from = min(weekly_cases),
      to = max(weekly_cases),
      by = "week"),
    fill = list(n = 0))

Posted on 2022-01-31 01:26:51
Since no data was shared, the answer is hard to verify, but based on the description given here, you can try the following solution.
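library(dplyr)
library(tidyr)
library(lubridate)

weekly_counts_2 <- df_bw %>%
  drop_na(Timestamp) %>%
  # assign each tweet to the start of its week
  mutate(weekly_cases = floor_date(Timestamp, unit = "week")) %>%
  group_by(weekly_cases) %>%
  # one row per week: average sentiment and number of tweets
  summarise(mean_sentiment = mean(sentiment_value, na.rm = TRUE),
            count = n()) %>%
  # add weeks without tweets; their count is 0 and mean_sentiment stays NA
  complete(weekly_cases = seq.Date(min(weekly_cases),
                                   max(weekly_cases), by = "week"),
           fill = list(count = 0))

I have assumed that the column with the sentiment values is called sentiment_value; change it to match your data.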
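To check the idea without the real data, here is a minimal sketch on a made-up data frame. The values in df_bw below are hypothetical and only serve as an illustration, sentiment_value is still an assumed column name, and Timestamp is assumed to be of class Date (which seq.Date requires).

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical toy data: five tweets with invented sentiment scores,
# one of them with a missing timestamp.
df_bw <- tibble(
  Timestamp = as.Date(c("2022-01-03", "2022-01-04", "2022-01-05",
                        "2022-01-24", NA)),
  sentiment_value = c(0.5, -0.1, 0.2, 0.8, 0.3)
)

weekly_summary <- df_bw %>%
  drop_na(Timestamp) %>%                       # drop tweets without a timestamp
  mutate(weekly_cases = floor_date(Timestamp, unit = "week")) %>%
  group_by(weekly_cases) %>%
  summarise(
    count = n(),                               # tweets in that week
    mean_sentiment = mean(sentiment_value, na.rm = TRUE)
  ) %>%
  complete(                                    # insert the weeks with no tweets
    weekly_cases = seq.Date(min(weekly_cases), max(weekly_cases), by = "week"),
    fill = list(count = 0)                     # name must match the count column
  )

print(weekly_summary)
```

With lubridate's default week start (Sunday), this should give four weekly rows, where the two empty weeks in the middle have a count of 0 and a mean_sentiment of NA. The name passed to fill has to match the column created in summarise, otherwise the empty weeks get NA instead of 0.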
https://stackoverflow.com/questions/70920346