Q: Reducing time series data from half-hourly to hourly in R

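Stack Overflow user

Asked 2019-12-18 15:44:12

Answers: 2 | Views: 693 | Followers: 0 | Votes: 0

I am working with smart meter data at half-hourly resolution. Because of the volume of data, I am trying to reduce the resolution from half-hourly to hourly, summing the consumption of the two half-hourly measurements in each hour. The problem is that my data frame also contains categorical data, which is lost when I use xts. My data looks like this: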

> head(test1)
      LCLid stdorToU            DateTime KWH.hh..per.half.hour.   Acorn Acorn_grouped
1 MAC000002      Std 2012-10-12 00:30:00                      0 ACORN-A      Affluent
2 MAC000002      Std 2012-10-12 01:00:00                      0 ACORN-A      Affluent
3 MAC000002      Std 2012-10-12 01:30:00                      0 ACORN-A      Affluent
4 MAC000002      Std 2012-10-12 02:00:00                      0 ACORN-A      Affluent
5 MAC000002      Std 2012-10-12 02:30:00                      0 ACORN-A      Affluent
6 MAC000002      Std 2012-10-12 03:00:00                      0 ACORN-A      Affluent

This is the code I have been trying to use, and the result I get.

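library(lubridate)  # ymd_hms()
library(xts)        # xts(), endpoints(), period.apply()

test1 <- read.csv("test.csv", stringsAsFactors = FALSE)
test1$DateTime <- ymd_hms(test1$DateTime)
test1$KWH.hh..per.half.hour. <- as.numeric(test1$KWH.hh..per.half.hour.)
# only the numeric column is passed to xts(), so the other columns are not carried along
test2 <- xts(test1$KWH.hh..per.half.hour., test1$DateTime)
head(test2)
period.apply(test2, endpoints(test2, "hours"), sum)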

> period.apply(test2, endpoints(test2, "hours"), sum)
                     [,1]
2012-10-12 00:30:00 0.000
2012-10-12 01:30:00 0.000
2012-10-12 02:30:00 0.000
2012-10-12 03:30:00 0.000
2012-10-12 04:30:00 0.000
2012-10-12 05:30:00 0.000
2012-10-12 06:30:00 0.000
2012-10-12 07:30:00 0.000
2012-10-12 08:30:00 0.000
2012-10-12 09:30:00 0.000
2012-10-12 10:30:00 0.000

Ideally, I need a dataset identical to my original (test1), only half the size, aggregated to hourly rather than half-hourly frequency. Can anyone help?

Thanks

time-series

2 Answers

Stack Overflow user

Accepted answer

Posted 2019-12-18 15:50:17

You need to create a grouping column and then sum by group.

# create grouped column
test1$grouped_time = lubridate::floor_date(test1$DateTime, unit = "hour")
# (use ceiling_date instead if you want to round the half hours up instead of down)

# sum by group
library(dplyr)
test2 = test1 %>%
  group_by(grouped_time, LCLid, stdorToU, Acorn, Acorn_grouped) %>%
  summarize(KWH.hh.per.hour = sum(KWH.hh..per.half.hour.))
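The choice between floor_date and ceiling_date mentioned in the comment above decides which two half hours end up in the same hourly row. The following is a minimal sketch of the difference, using four made-up timestamps in the same layout as test1$DateTime; which pairing is right depends on whether the meter stamps the start or the end of each interval, which the question does not say.

```r
library(lubridate)

# Four half-hourly timestamps (illustrative values, not taken from the real file)
x <- ymd_hms(c("2012-10-12 00:30:00", "2012-10-12 01:00:00",
               "2012-10-12 01:30:00", "2012-10-12 02:00:00"))

# floor_date() pairs each full hour with the half hour that follows it
floored <- floor_date(x, unit = "hour")

# ceiling_date() pairs each full hour with the half hour that precedes it;
# a POSIXct value already on the hour boundary is left where it is
ceiled <- ceiling_date(x, unit = "hour")

data.frame(DateTime = x, floored = floored, ceiled = ceiled)
```

With floor_date, 01:00 and 01:30 share a group; with ceiling_date, 00:30 and 01:00 do.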

The "Sum by Group" R-FAQ lists many alternatives to dplyr, in case you want to see more options.
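As one possible non-dplyr route, the same grouping and summing can be sketched in base R with trunc() and aggregate(). The toy data frame below is an assumption made up for illustration (the consumption values are hypothetical); only the column names follow the question.

```r
# Toy stand-in for test1; values are hypothetical
test1 <- data.frame(
  LCLid = "MAC000002",
  stdorToU = "Std",
  DateTime = as.POSIXct(c("2012-10-12 00:30:00", "2012-10-12 01:00:00",
                          "2012-10-12 01:30:00", "2012-10-12 02:00:00"),
                        tz = "UTC"),
  KWH.hh..per.half.hour. = c(0.1, 0.2, 0.3, 0.4),
  Acorn = "ACORN-A",
  Acorn_grouped = "Affluent",
  stringsAsFactors = FALSE
)

# Base R equivalent of flooring to the hour: trunc() returns POSIXlt,
# so convert back to POSIXct before using it as a grouping column
test1$grouped_time <- as.POSIXct(trunc(test1$DateTime, units = "hours"))

# Sum consumption for each combination of hour and categorical columns
hourly <- aggregate(
  KWH.hh..per.half.hour. ~ grouped_time + LCLid + stdorToU + Acorn + Acorn_grouped,
  data = test1,
  FUN = sum
)
hourly
```

The categorical columns survive because they are part of the grouping formula rather than being dropped before aggregation.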

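Note that this sums the KWH column for every unique combination of the other columns in group_by(). If some of them can change, for example if the stdorToU or Acorn value may differ from one half hour to the next, but you still want the rows combined, move that column out of group_by() and into summarize(), and specify which value to keep.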

# if Acorn can change and you want to keep the first one
# (the column is named Acorn in test1, so refer to it with that exact case)
test2 = test1 %>%
  group_by(grouped_time, LCLid, stdorToU, Acorn_grouped) %>%
  summarize(KWH.hh.per.hour = sum(KWH.hh..per.half.hour.),
            Acorn = first(Acorn))
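Before deciding which columns to move into summarize(), it may help to check whether any of them actually changes within an hour. The sketch below counts distinct Acorn values per meter and hour on a made-up data frame; the change from ACORN-A to ACORN-B is invented for illustration, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(lubridate)

# Toy data: the Acorn value deliberately changes inside the 01:00 hour
test1 <- data.frame(
  LCLid = "MAC000002",
  DateTime = ymd_hms(c("2012-10-12 01:00:00", "2012-10-12 01:30:00",
                       "2012-10-12 02:00:00", "2012-10-12 02:30:00")),
  KWH.hh..per.half.hour. = c(0.2, 0.3, 0.4, 0.1),
  Acorn = c("ACORN-A", "ACORN-B", "ACORN-A", "ACORN-A"),
  stringsAsFactors = FALSE
)

# List the meter/hour groups in which Acorn takes more than one value
varying <- test1 %>%
  mutate(grouped_time = floor_date(DateTime, unit = "hour")) %>%
  group_by(grouped_time, LCLid) %>%
  summarize(n_acorn = n_distinct(Acorn), .groups = "drop") %>%
  filter(n_acorn > 1)

varying
```

If this returns no rows on the real data, keeping Acorn in group_by() and moving it into summarize() give the same result.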

Votes: 3

Stack Overflow user

Posted 2019-12-18 20:09:14

> head(sm_2013_tof)
# A tibble: 6 x 6
# Groups:   grouped_time, LCLid, stdorToU, Acorn [6]
  grouped_time        LCLid     stdorToU Acorn   Acorn_grouped KWH.hh.per.hour
  <dttm>              <chr>     <chr>    <chr>   <chr>                   <dbl>
1 2013-01-01 00:00:00 MAC000146 ToU      ACORN-L Adversity               0.155
2 2013-01-01 00:00:00 MAC000147 ToU      ACORN-F Comfortable             0.276
3 2013-01-01 00:00:00 MAC000158 ToU      ACORN-H Comfortable             0.152
4 2013-01-01 00:00:00 MAC000165 ToU      ACORN-E Affluent                0.401
5 2013-01-01 00:00:00 MAC000170 ToU      ACORN-F Comfortable             0.64 
6 2013-01-01 00:00:00 MAC000173 ToU      ACORN-E Affluent                0.072
>

The output above is the hourly data after grouping.

If I convert it with as.data.frame, you can see that the 00:00 disappears:

sm_short_2013 <- as.data.frame(sm_2013_tof)

> head(sm_short_2013)
  grouped_time     LCLid stdorToU   Acorn Acorn_grouped KWH.hh.per.hour
1   2013-01-01 MAC000146      ToU ACORN-L     Adversity           0.155
2   2013-01-01 MAC000147      ToU ACORN-F   Comfortable           0.276
3   2013-01-01 MAC000158      ToU ACORN-H   Comfortable           0.152
4   2013-01-01 MAC000165      ToU ACORN-E      Affluent           0.401
5   2013-01-01 MAC000170      ToU ACORN-F   Comfortable           0.640
6   2013-01-01 MAC000173      ToU ACORN-E      Affluent           0.072

> dput(droplevels(sm_short_2013[1:10, ]))
structure(list(grouped_time = structure(c(1356998400, 1356998400, 
1356998400, 1356998400, 1356998400, 1356998400, 1356998400, 1356998400, 
1356998400, 1356998400), class = c("POSIXct", "POSIXt"), tzone = "UTC"), 
    LCLid = c("MAC000146", "MAC000147", "MAC000158", "MAC000165", 
    "MAC000170", "MAC000173", "MAC000186", "MAC000187", "MAC000193", 
    "MAC000194"), stdorToU = c("ToU", "ToU", "ToU", "ToU", "ToU", 
    "ToU", "ToU", "ToU", "ToU", "ToU"), Acorn = c("ACORN-L", 
    "ACORN-F", "ACORN-H", "ACORN-E", "ACORN-F", "ACORN-E", "ACORN-E", 
    "ACORN-L", "ACORN-D", "ACORN-D"), Acorn_grouped = c("Adversity", 
    "Comfortable", "Comfortable", "Affluent", "Comfortable", 
    "Affluent", "Affluent", "Adversity", "Affluent", "Affluent"
    ), KWH.hh.per.hour = c(0.155, 0.276, 0.152, 0.401, 0.64, 
    0.072, 0.407, 0.554, 0.725, 0.158)), row.names = c(NA, 10L
), class = "data.frame")
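The dput() output suggests the time component is still stored: 1356998400 seconds since the epoch is 2013-01-01 00:00:00 UTC. A likely explanation, offered here as an assumption rather than something the thread confirms, is that the default POSIXct formatting omits the time when every value shown falls exactly on midnight. A minimal sketch:

```r
# Same value as in the dput() output above
x <- as.POSIXct(1356998400, origin = "1970-01-01", tz = "UTC")

# Default formatting drops the time when it is exactly midnight
shown_default <- format(x)

# An explicit format string always shows the time
shown_full <- format(x, "%Y-%m-%d %H:%M:%S")

# The underlying number of seconds is the same either way
stored <- as.numeric(x)

shown_default
shown_full
stored
```

On this reading nothing is lost in the conversion; printing rows that include non-midnight hours should show the times again.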

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/59395540
