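I have a data file like this:
  PM2.5 PM10 SO2 datetime
1     4    4   7 2013-3-1
2     8    4   7 2013-3-1
3     7    7   3 2013-3-1
4     6    6   3 2013-3-2
5     3    3   3 2013-3-2
6     5    5   4 2013-3-2

Now I want to group by the datetime column and aggregate all the other columns. After the operation, the resulting data should look like this: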
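  PM2.5   PM10    SO2     datetime PM2.5_mean PM10_mean SO2_mean PM2.5_min PM10_min SO2_min PM2.5_max PM10_max SO2_max
1 [4,8,7] [4,4,7] [7,7,3] 2013-3-1       6.33         5     5.67         4        4       3         8        7       7
2 [6,3,5] [6,3,5] [3,3,4] 2013-3-2       4.67      4.67     3.33         3        3       3         6        6       4

I tried applying aggregate functions, but that only gives me the mean / min / max on its own. What I want is the mean, min, and max as separate columns for each existing column in the data. How can I do this? Or is there another way to get the desired result?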
Posted on 2020-01-02 22:25:57
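One option: after grouping by 'datetime', compute the mean and max of the remaining columns with mutate_at, add those new columns to the grouping with group_by_at, and then paste the values of the original columns together.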
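library(dplyr)

df1 %>%
  # group rows by date
  group_by(datetime) %>%
  # add <col>_mean and <col>_max for every non-grouping column
  mutate_at(vars(-group_cols()), list(mean = mean, max = max)) %>%
  # also group by the new columns; .add = TRUE keeps datetime (dplyr >= 1.0.0, older versions use add = TRUE)
  group_by_at(vars(matches('(mean|max)$')), .add = TRUE) %>%
  # collapse the original columns of each group into a "[a, b, c]" string
  summarise_at(vars(-group_cols()), ~ sprintf("[%s]", toString(.)))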
# A tibble: 2 x 10
# Groups: datetime, PM2.5_mean, PM10_mean, SO2_mean, PM2.5_max, PM10_max [2]
#  datetime PM2.5_mean PM10_mean SO2_mean PM2.5_max PM10_max SO2_max PM2.5     PM10      SO2
#  <chr>         <dbl>     <dbl>    <dbl>     <int>    <int>   <int> <chr>     <chr>     <chr>
#1 2013-3-1       6.33         5     5.67         8        7       7 [4, 8, 7] [4, 4, 7] [7, 7, 3]
#2 2013-3-2       4.67      4.67     3.33         6        6       4 [6, 3, 5] [6, 3, 5] [3, 3, 4]

Data
df1 <- structure(list(
  PM2.5 = c(4L, 8L, 7L, 6L, 3L, 5L),
  PM10 = c(4L, 4L, 7L, 6L, 3L, 5L),
  SO2 = c(7L, 7L, 3L, 3L, 3L, 4L),
  datetime = c("2013-3-1", "2013-3-1", "2013-3-1", "2013-3-2", "2013-3-2", "2013-3-2")),
  class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6"))

https://stackoverflow.com/questions/59570750
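The answer above produces only the mean and max columns, while the question also asks for min. As a rough sketch, assuming dplyr 1.0.0 or later is available, the same result including min could be obtained in a single summarise() call with across(). This is an alternative formulation, not the answerer's code; the names value_cols and res are introduced here purely for illustration.

```r
library(dplyr)

# Same sample data as df1 above, rebuilt so the snippet runs on its own.
df1 <- data.frame(
  PM2.5 = c(4L, 8L, 7L, 6L, 3L, 5L),
  PM10 = c(4L, 4L, 7L, 6L, 3L, 5L),
  SO2 = c(7L, 7L, 3L, 3L, 3L, 4L),
  datetime = rep(c("2013-3-1", "2013-3-2"), each = 3)
)

# Hypothetical helper name: every column except the grouping column.
value_cols <- setdiff(names(df1), "datetime")

res <- df1 %>%
  group_by(datetime) %>%
  summarise(
    # Creates <col>_mean, <col>_min and <col>_max for each value column.
    across(all_of(value_cols), list(mean = mean, min = min, max = max)),
    # Must come after the numeric summaries: this overwrites each original
    # column with a "[a, b, c]" string of the group's values.
    across(all_of(value_cols), ~ sprintf("[%s]", toString(.x))),
    .groups = "drop"
  )

print(res)
```

The order of the two across() calls matters: if the string-pasting step ran first, the numeric summaries would be computed on character data. With this data, PM10_max for 2013-3-1 comes out as 7, because the PM10 values in that group are 4, 4 and 7.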