首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >data.table中的滚动连接使用了错误的时区

data.table中的滚动连接使用了错误的时区
EN

Stack Overflow用户
提问于 2019-08-23 19:57:28
回答 1查看 71关注 0票数 1

我试图使用滚动连接连接两个数据表。我的时间/日期邮票是POSIXct指定为东部标准时区(EST)。在连接期间,使用GMT检索值,即从EST-5小时中检索值。此外,它还对夏令时进行修正,在夏令时从EST -4小时中检索值。

我尝试过使用fuzzyjoin,这是正确的,但它的速度非常慢,即使是对于小连接和资源不足的更大的问题(我猜,因为它正在做一个完整的笛卡儿连接,然后过滤),所以不会运行我的真正的工作。我看过堆栈过低,我没有看到这个问题被描述

在下面的示例中,流数据是以15分钟间隔(以立方米/秒为单位)测量的流流量,而化学数据是分析抓取样本的结果。我想要的是找到流流(Q.cms)的值,它在时间上最接近抓取样本被收集的时间。结果表明,从流数据中检索到的Q.cms应该与flow.EST匹配,而不是flow.GMT (flow.EST和flow.GMT是使用Excel中的vlookup函数获得的)。您可以查看结果表,也可以使用ggplot来可视化结果。黑色符号应该落在1:1线上,而不是红色符号上。这里给出的例子中有一些是https://r-norberg.blogspot.com/2016/06/understanding-datatable-rolling-joins.html。在roll语句中,我也尝试过"-Inf“和"T”--也没有帮助。

代码语言:javascript
复制
library(tidyverse)
library(data.table)

NHchem <- tibble(SampleDateTime = list("2012-03-07 15:15:00",
                                      "2012-03-07 07:57:00",
                                      "2012-03-07 09:00:00",
                                      "2012-03-07 08:31:00",
                                      "2012-03-07 08:00:00"),
              Parameter = rep("HgD", 5),
              Value = c(11.83, 9.09, 8.98, 10.91, 10.91),
              flow.EST = c(1.734292, 1.730299, 1.711665, 1.721448, 1.729299),
              flow.GMT = c(1.729082, 1.855115, 1.818062,    1.838623,   1.855263)
              )

NHchem$SampleDateTime <- lubridate::ymd_hms(NHchem$SampleDateTime, tz = "EST")

NHflow <- tibble(DateTime = list("2012-03-07 00:00:00", "2012-03-07 00:15:00",  "2012-03-07 00:30:00",
                                 "2012-03-07 00:45:00", "2012-03-07 01:00:00",  "2012-03-07 01:15:00",
                                 "2012-03-07 01:30:00", "2012-03-07 01:45:00",  "2012-03-07 02:00:00",
                                 "2012-03-07 02:15:00", "2012-03-07 02:30:00",  "2012-03-07 02:45:00",
                                 "2012-03-07 03:00:00", "2012-03-07 03:15:00",  "2012-03-07 03:30:00",
                                 "2012-03-07 03:45:00", "2012-03-07 04:00:00",  "2012-03-07 04:15:00",
                                 "2012-03-07 04:30:00", "2012-03-07 04:45:00",  "2012-03-07 05:00:00",
                                 "2012-03-07 05:15:00", "2012-03-07 05:30:00",  "2012-03-07 05:45:00",
                                 "2012-03-07 06:00:00", "2012-03-07 06:15:00",  "2012-03-07 06:30:00",
                                 "2012-03-07 06:45:00", "2012-03-07 07:00:00",  "2012-03-07 07:15:00",
                                 "2012-03-07 07:30:00", "2012-03-07 07:45:00",  "2012-03-07 08:00:00",
                                 "2012-03-07 08:15:00", "2012-03-07 08:30:00",  "2012-03-07 08:45:00",
                                 "2012-03-07 09:00:00", "2012-03-07 09:15:00",  "2012-03-07 09:30:00",
                                 "2012-03-07 09:45:00", "2012-03-07 10:00:00",  "2012-03-07 10:15:00",
                                 "2012-03-07 10:30:00", "2012-03-07 10:45:00",  "2012-03-07 11:00:00",
                                 "2012-03-07 11:15:00", "2012-03-07 11:30:00",  "2012-03-07 11:45:00",
                                 "2012-03-07 12:00:00", "2012-03-07 12:15:00",  "2012-03-07 12:30:00",
                                 "2012-03-07 12:45:00", "2012-03-07 13:00:00",  "2012-03-07 13:15:00",
                                 "2012-03-07 13:30:00", "2012-03-07 13:45:00",  "2012-03-07 14:00:00",
                                 "2012-03-07 14:15:00", "2012-03-07 14:30:00",  "2012-03-07 14:45:00",
                                 "2012-03-07 15:00:00", "2012-03-07 15:15:00",  "2012-03-07 15:30:00",
                                 "2012-03-07 15:45:00", "2012-03-07 16:00:00",  "2012-03-07 16:15:00",
                                 "2012-03-07 16:30:00", "2012-03-07 16:45:00",  "2012-03-07 17:00:00",
                                 "2012-03-07 17:15:00", "2012-03-07 17:30:00",  "2012-03-07 17:45:00",
                                 "2012-03-07 18:00:00", "2012-03-07 18:15:00",  "2012-03-07 18:30:00",
                                 "2012-03-07 18:45:00", "2012-03-07 19:00:00",  "2012-03-07 19:15:00",
                                 "2012-03-07 19:30:00", "2012-03-07 19:45:00",  "2012-03-07 20:00:00",
                                 "2012-03-07 20:15:00", "2012-03-07 20:30:00",  "2012-03-07 20:45:00",
                                 "2012-03-07 21:00:00", "2012-03-07 21:15:00",  "2012-03-07 21:30:00",
                                 "2012-03-07 21:45:00", "2012-03-07 22:00:00",  "2012-03-07 22:15:00",
                                 "2012-03-07 22:30:00", "2012-03-07 22:45:00",  "2012-03-07 23:00:00",
                                 "2012-03-07 23:15:00", "2012-03-07 23:30:00",  "2012-03-07 23:45:00"
                                 ),
                 Q.cms = c(1.9183428976162, 1.90884554590853,   1.90659614969849,   1.89946370922413, 
                           1.89667381849492,    1.88823268972675,   1.89009845845496,   1.88925133710857,
                           1.89175644143891,    1.88902902166044,   1.88199493586431,   1.87318875971247,
                           1.86917038183388,    1.86071830324102,   1.85932613686066,   1.86093401873381,
                           1.87101369024615,    1.87296981545807,   1.8762632040545,    1.86625922525504,
                           1.86445655629997,    1.86091440792692,   1.86261797589624,   1.85849714690774,
                           1.85809949483732,    1.85365423835426,   1.86769219811402,   1.87137568188775,
                           1.86957715377645,    1.86178854932717,   1.85842153505713,   1.85511511281169,
                           1.85526345829864,    1.84606992374992,   1.8386232796469,    1.8306698161104,
                           1.81806190831676,    1.80972662643993,   1.80671560966769,   1.79692071214778,
                           1.79246415779217,    1.78383758090983,   1.77911151556738,   1.77540849330392,
                           1.76977546791765,    1.76704647349273,   1.75506646810193,   1.75116656853259,
                           1.74116032441497,    1.73694852237565,   1.73539800459245,   1.73029929958069,
                           1.72929873358336,    1.72145096214292,   1.72144822481096,   1.7137286373663,
                           1.71166508485562,    1.72512015058183,   1.72801058727519,   1.71946126373971,
                           1.72908219540057,    1.73003063754833,   1.73433620637326,   1.74704929729279,
                           1.75374511203315,    1.75751451176466,   1.74550984385788,   1.7374425791972,
                           1.73028285052801,    1.73164553811319,   1.71538821542136,   1.71111583998635,
                           1.69850042893639,    1.7018112620745,    1.70068743638687,   1.70624871061042,
                           1.70513766128531,    1.70189855875322,   1.71388172504149,   1.70687667156756,
                           1.72965507980508,    1.73429231190258,   1.73062829040751,   1.74161628160761,
                           1.74502884750137,    1.74764319440737,   1.76775550519358,   1.74926846992512,
                           1.75406991794209,    1.75302950118107,   1.75219840078475,   1.75821138471585,
                           1.74749746409036,    1.75047602988317,   1.76192575356191,   1.76360735465132)
                 )

NHflow$DateTime <- lubridate::ymd_hms(NHflow$DateTime, tz = "EST")

NHflowDT <- setDT(NHflow)
NHchemDT <- setDT(NHchem)

NHflowDT[, join_time := DateTime]
NHchemDT[, join_time := SampleDateTime]

setkey(NHflowDT, join_time)
setkey(NHchemDT, join_time)

NHflow_chem <- NHflowDT[ NHchemDT, roll = 'nearest']

NHflow_chem

ggplot(data = NHflow_chem, mapping = aes(x = flow.EST, y = Q.cms)) +
  geom_point(size = 4) +
  geom_point(aes(x = flow.GMT, y = Q.cms), color = "red", size = 4) +
  geom_abline(slope = 1, intercept = 0) +
  labs(x = "flow.EST (black), flow.GMT (red)")
EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2019-08-27 11:20:43

我不知道您的确切问题是什么,但是下面的滚动更新连接看起来很好,当我查看您的样本数据集时。

代码语言:javascript
复制
NHflowDT <- setDT(NHflow)
NHchemDT <- setDT(NHchem)

NHchemDT[, c("nearest_flow_time", "Q.cms") := 
       NHflowDT[NHchemDT, .(x.DateTime, x.Q.cms), 
                on = .(DateTime = SampleDateTime), 
                roll = "nearest"] ][]

输出

代码语言:javascript
复制
        SampleDateTime Parameter Value flow.EST flow.GMT   nearest_flow_time    Q.cms
1: 2012-03-07 15:15:00       HgD 11.83 1.734292 1.729082 2012-03-07 15:15:00 1.730031
2: 2012-03-07 07:57:00       HgD  9.09 1.730299 1.855115 2012-03-07 08:00:00 1.855263
3: 2012-03-07 09:00:00       HgD  8.98 1.711665 1.818062 2012-03-07 09:00:00 1.818062
4: 2012-03-07 08:31:00       HgD 10.91 1.721448 1.838623 2012-03-07 08:30:00 1.838623
5: 2012-03-07 08:00:00       HgD 10.91 1.729299 1.855263 2012-03-07 08:00:00 1.855263

> attributes( NHchemDT$SampleDateTime)
$class
[1] "POSIXct" "POSIXt" 

$tzone
[1] "EST"

> attributes( NHchemDT$nearest_flow_time)
$class
[1] "POSIXct" "POSIXt" 

$tzone
[1] "EST"
票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/57632336

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档