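I am trying to add a binary indicator column that flags when the date in one column (dt_3) falls between the dates in two other columns (dt_1 and dt_2). I have a small sample of the data, but in my larger set the date column I want to compare (dt_3) has many NAs, and this throws Error: Expecting a single value:. What is the best way to check only the non-NA values against the two other columns?
Here is an example of my data: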
dt_1 dt_2 dt_3
2019-7-10 2019-8-21 2020-2-01
2019-8-22 2019-10-11 2019-9-01
2019-2-09 2019-3-02 NA
My current code is:
dates %>%
mutate(between = ifelse(between(dt_3, dt_1, dt_2), 1, 0))
Expected output:
dt_1 dt_2 dt_3 between
2019-7-10 2019-8-21 2020-2-01 0
2019-8-22 2019-10-11 2019-9-01 1
2019-2-09 2019-3-02 NA 0
Answered 2020-06-12 18:29:35
An alternative to between is to use the comparison operators (>=, <=) and then replace the NA with 0:
library(dplyr)
library(lubridate)
library(tidyr)
dates %>%
mutate(across(everything(), ymd)) %>%
mutate(between = replace_na(as.numeric(dt_3 >= dt_1 & dt_3 <= dt_2), 0))
With between, left and right are not vectorized in the dplyr version used here, i.e. each accepts only a single value. One option is rowwise:
dates %>%
mutate(across(everything(), ymd)) %>%
rowwise %>%
mutate(between = replace_na(as.numeric(between(dt_3, dt_1, dt_2)), 0))
# A tibble: 3 x 4
# Rowwise:
# dt_1 dt_2 dt_3 between
# <date> <date> <date> <dbl>
#1 2019-07-10 2019-08-21 2020-02-01 0
#2 2019-08-22 2019-10-11 2019-09-01 1
#3 2019-02-09 2019-03-02 NA 0
Data
dates <- structure(list(dt_1 = c("2019-7-10", "2019-8-22", "2019-2-09"
), dt_2 = c("2019-8-21", "2019-10-11", "2019-3-02"), dt_3 = c("2020-2-01",
"2019-9-01", NA)), class = "data.frame", row.names = c(NA, -3L
))
https://stackoverflow.com/questions/62350343
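For reference, the comparison-operator approach can also be sketched end to end with dplyr alone. This is a minimal sketch, not the answer's exact code: it assumes base as.Date is enough to parse the sample strings (in place of lubridate's ymd), swaps in dplyr's coalesce for tidyr's replace_na, and uses result as an illustrative name that does not appear in the original.

```r
library(dplyr)

# Sample data from the question, stored as character strings
dates <- data.frame(
  dt_1 = c("2019-7-10", "2019-8-22", "2019-2-09"),
  dt_2 = c("2019-8-21", "2019-10-11", "2019-3-02"),
  dt_3 = c("2020-2-01", "2019-9-01", NA)
)

result <- dates %>%
  # Convert every column to Date (assumption: as.Date instead of ymd)
  mutate(across(everything(), as.Date)) %>%
  # >= and <= are vectorized; rows where dt_3 is NA yield NA,
  # which coalesce() turns into FALSE before conversion to 0/1
  mutate(between = as.integer(coalesce(dt_3 >= dt_1 & dt_3 <= dt_2, FALSE)))

print(result)
```

Because the comparison is done on whole columns, no rowwise step is needed, and the indicator comes out as an integer column regardless of how many NAs dt_3 contains.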