首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >如何合并R中的两个时间序列数据?

如何合并R中的两个时间序列数据?
EN

Stack Overflow用户
提问于 2020-11-16 20:40:50
回答 1查看 89关注 0票数 0

我有两个时间序列数据。一份包括药品名称、开始使用日期、停止使用日期、药物剂量和第二次数据,包括访问日期和评分,

代码语言:javascript
复制
data1<- data.frame("Drug Name" = c("Drug1","Drug1","Drug2","Drug1","Drug3","Drug2",
                                   "Drug4","Drug5","Drug1"), 
                   "Start Date" = c("7/1/2016","1/1/2016", "8/6/2015","2/1/2015","6/14/2017",
                                    "6/21/2017","1/24/2018","7/30/2018","7/30/2018"),
                   "Stop Date "=c("1/14/2017","1/14/2017", "1/14/2017","1/14/2017"
                                  ,"1/24/2018","6/29/2018","6/29/2018","Ongoing","Ongoing"),
                   "Dose"=c(12,20,32,3,5,6,6,8,9))
data2<-data.frame("visitdate"=c("8/24/2016","8/24/2016", "10/19/2016","12/7/2016","12/21/2016",
                                "3/22/2017","5/10/2017", "6/14/2017", "7/12/2017","9/27/2017", 
                                "11/29/2017", "1/24/2018","3/21/2018","5/30/2018","8/15/2018",
                                "10/3/2018", "11/28/2018"),
                  "Score"=c(1,2,3,34,6,7,9,5,6,8,5,5,7,9,1,2,5))

我想合并这两个数据的方式,例如,在访问日期,8/24/2016告诉我,病人服用了多少药物,其剂量加上临床评分。

EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2020-11-17 00:04:08

可能会有一些数据的预处理,以便尽早考虑。

首先,上面的例子有带有空格的列名,最好避免。我为这个例子编辑并删除了空格。

此外,您还有“正在进行的”约会。建议使用as.Date将日期转换为日期。但是,在转换那些具有“正在进行”的内容后,将被包含为NA。这些可以设置为Inf (无穷大),这将工作。

例如:

代码语言:javascript
复制
data1$StartDate <- as.Date(data1$StartDate, format = "%m/%d/%Y")
data1$StopDate <- as.Date(data1$StopDate, format = "%m/%d/%Y")
data2$VisitDate <- as.Date(data2$VisitDate, format = "%m/%d/%Y")

data1$StopDate[8:9] <- Inf

根据数据源的不同,有许多其他方法可以解决这一问题。

之后,您可以使用tidyversefuzzyjoin执行以下操作。使用fuzzy_left_join,您可以将两个数据框架连接在一起,其中仅包括日期范围内的日期。

您可以考虑将结果保持较长的形式。但是,如果需要宽格式,则可以使用pivot_wider。最后一个select按数字顺序排列列,如您的示例所示。

代码语言:javascript
复制
library(tidyverse)
library(fuzzyjoin)

fuzzy_left_join(data2,
                data1,
                by = c("VisitDate" = "StartDate",
                       "VisitDate" = "StopDate"),
                match_fun = list(`>=`, `<=`)) %>%
  select(-StartDate, -StopDate) %>%
  group_by(VisitDate, Score) %>%
  mutate(rn = row_number(),
         NumDrugs = ifelse(all(is.na(DrugName)) == TRUE, 0, n())) %>%
  pivot_wider(id_cols = c(VisitDate, Score, NumDrugs), names_from = rn, values_from = c(DrugName, Dose)) %>%
  select(VisitDate, Score, NumDrugs, names(.)[-c(1:3)][order(parse_number(names(.)[-c(1:3)]))])

输出

代码语言:javascript
复制
   VisitDate  Score NumDrugs DrugName_1 Dose_1 DrugName_2 Dose_2 DrugName_3 Dose_3 DrugName_4 Dose_4
   <date>     <dbl>    <dbl> <chr>       <dbl> <chr>       <dbl> <chr>       <dbl> <chr>       <dbl>
 1 2016-08-24     1        4 Drug1          12 Drug1          20 Drug2          32 Drug1           3
 2 2016-08-24     2        4 Drug1          12 Drug1          20 Drug2          32 Drug1           3
 3 2016-10-19     3        4 Drug1          12 Drug1          20 Drug2          32 Drug1           3
 4 2016-12-07    34        4 Drug1          12 Drug1          20 Drug2          32 Drug1           3
 5 2016-12-21     6        4 Drug1          12 Drug1          20 Drug2          32 Drug1           3
 6 2017-03-22     7        0 NA             NA NA             NA NA             NA NA             NA
 7 2017-05-10     9        0 NA             NA NA             NA NA             NA NA             NA
 8 2017-06-14     5        1 Drug3           5 NA             NA NA             NA NA             NA
 9 2017-07-12     6        2 Drug3           5 Drug2           6 NA             NA NA             NA
10 2017-09-27     8        2 Drug3           5 Drug2           6 NA             NA NA             NA
11 2017-11-29     5        2 Drug3           5 Drug2           6 NA             NA NA             NA
12 2018-01-24     5        3 Drug3           5 Drug2           6 Drug4           6 NA             NA
13 2018-03-21     7        2 Drug2           6 Drug4           6 NA             NA NA             NA
14 2018-05-30     9        2 Drug2           6 Drug4           6 NA             NA NA             NA
15 2018-08-15     1        2 Drug5           8 Drug1           9 NA             NA NA             NA
16 2018-10-03     2        2 Drug5           8 Drug1           9 NA             NA NA             NA
17 2018-11-28     5        2 Drug5           8 Drug1           9 NA             NA NA             NA

数据

(在转换日期之前)

代码语言:javascript
复制
data1 <- structure(list(DrugName = c("Drug1", "Drug1", "Drug2", "Drug1", 
"Drug3", "Drug2", "Drug4", "Drug5", "Drug1"), StartDate = c("7/1/2016", 
"1/1/2016", "8/6/2015", "2/1/2015", "6/14/2017", "6/21/2017", 
"1/24/2018", "7/30/2018", "7/30/2018"), StopDate = c("1/14/2017", 
"1/14/2017", "1/14/2017", "1/14/2017", "1/24/2018", "6/29/2018", 
"6/29/2018", NA, NA), Dose = c(12, 20, 32, 3, 5, 6, 6, 8, 9)), class = "data.frame", row.names = c(NA, 
-9L))

data2 <- structure(list(VisitDate = c("8/24/2016", "8/24/2016", "10/19/2016", 
"12/7/2016", "12/21/2016", "3/22/2017", "5/10/2017", "6/14/2017", 
"7/12/2017", "9/27/2017", "11/29/2017", "1/24/2018", "3/21/2018", 
"5/30/2018", "8/15/2018", "10/3/2018", "11/28/2018"), Score = c(1, 
2, 3, 34, 6, 7, 9, 5, 6, 8, 5, 5, 7, 9, 1, 2, 5)), class = "data.frame", row.names = c(NA, 
-17L))
票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/64865407

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档