I am currently working with a dataset in which each patient ID has multiple biopsies. For each patient, I need to find the biopsy result closest to a specific date. A dummy dataset is shown below:
df <- data.frame(m1 = c("1","1","1","2","2","2"),
                 patodate = c("2013-06-03","2014-01-06","2018-11-23","2004-03-03","2018-06-25","2018-12-19"),
                 baselinedate = c("2018-11-09","2018-11-09","2018-11-09","2018-07-24","2018-07-24","2018-07-24"),
                 biopsy = c("1","2","3","1","2","3"))
Then I calculated the time difference between patodate and baselinedate.
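library(dplyr)
# Convert the character columns to Date so they can be subtracted
df$patodate <- as.Date(df$patodate)
df$baselinedate <- as.Date(df$baselinedate)
df <- df %>%
  group_by(m1) %>%
  mutate(diff = baselinedate - patodate)
My question now is: I would like to add a new column named 'status' that shows, per group m1, the 'biopsy' result whose time difference is closest to 0. The final result would be: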
df <- data.frame(m1 = c("1","1","1","2","2","2"),
                 patodate = c("2013-06-03","2014-01-06","2018-11-23","2004-03-03","2018-06-25","2018-12-19"),
                 baselinedate = c("2018-11-09","2018-11-09","2018-11-09","2018-07-24","2018-07-24","2018-07-24"),
                 biopsy = c("1","2","3","1","2","3"),
                 status = c("3","3","3","2","2","2"))
I hope someone understands the problem and can help. Thank you very much.
Kind regards,
Tobias Berg
Posted on 2021-10-08 19:59:03
We could do:
library(dplyr)
df %>%
  group_by(m1) %>%
  mutate(status = abs(patodate - baselinedate),
         status = which(status == min(status))[1]) %>%
  ungroup()
Output:
# A tibble: 6 × 5
  m1    patodate   baselinedate biopsy status
  <chr> <date>     <date>       <chr>   <int>
1 1     2013-06-03 2018-11-09   1           3
2 1     2014-01-06 2018-11-09   2           3
3 1     2018-11-23 2018-11-09   3           3
4 2     2004-03-03 2018-07-24   1           2
5 2     2018-06-25 2018-07-24   2           2
6 2     2018-12-19 2018-07-24   3           2

Posted on 2021-10-08 11:27:49
You can get the index of the minimum absolute difference between the dates within each group.
library(dplyr)
df %>%
  group_by(m1) %>%
  mutate(status = which.min(abs(patodate - baselinedate))) %>%
  ungroup
#  m1    patodate   baselinedate biopsy status
#  <chr> <date>     <date>       <chr>   <int>
#1 1     2013-06-03 2018-11-09   1           3
#2 1     2014-01-06 2018-11-09   2           3
#3 1     2018-11-23 2018-11-09   3           3
#4 2     2004-03-03 2018-07-24   1           2
#5 2     2018-06-25 2018-07-24   2           2
#6 2     2018-12-19 2018-07-24   3           2

Posted on 2021-10-08 12:10:51
Here is another approach:
library(dplyr)
library(lubridate)
df %>%
  group_by(m1) %>%
  mutate(across(contains("date"), ymd),
         helper = abs(difftime(baselinedate, patodate))) %>%
  mutate(status = biopsy[helper == min(helper)]) %>%
  select(-helper)

  m1    patodate   baselinedate biopsy status
  <chr> <date>     <date>       <chr>  <chr>
1 1     2013-06-03 2018-11-09   1      3
2 1     2014-01-06 2018-11-09   2      3
3 1     2018-11-23 2018-11-09   3      3
4 2     2004-03-03 2018-07-24   1      2
5 2     2018-06-25 2018-07-24   2      2
6 2     2018-12-19 2018-07-24   3      2

https://stackoverflow.com/questions/69495031
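
One point worth noting about the approaches above: the two that use which() or which.min() return the row position within each group, which matches the biopsy label here only because the biopsies happen to be numbered 1 to 3 in row order. The following is a minimal sketch, not taken from any of the answers, that looks up the biopsy value itself. The variable name result is just for illustration, and it assumes that taking the first match is acceptable when two biopsies are equally close to the baseline date.

```r
library(dplyr)

# Dummy data from the question, with the date columns converted up front.
df <- data.frame(
  m1 = c("1", "1", "1", "2", "2", "2"),
  patodate = as.Date(c("2013-06-03", "2014-01-06", "2018-11-23",
                       "2004-03-03", "2018-06-25", "2018-12-19")),
  baselinedate = as.Date(c("2018-11-09", "2018-11-09", "2018-11-09",
                           "2018-07-24", "2018-07-24", "2018-07-24")),
  biopsy = c("1", "2", "3", "1", "2", "3")
)

result <- df %>%
  group_by(m1) %>%
  # which.min() gives the position of the smallest absolute gap in days
  # (the first one if there is a tie); indexing biopsy with that position
  # returns the biopsy label rather than the row number.
  mutate(status = biopsy[which.min(abs(as.numeric(patodate - baselinedate)))]) %>%
  ungroup()

print(result)
```

Because which.min() always returns a single position, this variant also avoids the length mismatch that biopsy[helper == min(helper)] could run into if a group contained tied minimum differences.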