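I have a dataset with multiple dates and conditions. I want to extract the rows starting from the first row where place == "A", that is, every row from that row's date up to 7 days later. For example: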
Date Place Value1 Value2
2018-10-27 C 20 8
2018-10-29 A 10 5
2018-10-31 B 15 6
2018-11-4 C 17 9
2018-11-8 D 18 5
I want:
Date Place Value1 Value2
2018-10-29 A 10 5
2018-10-31 B 15 6
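2018-11-4 C 17 9
As you can see, it must extract the first row with place == "A" and all rows within 7 days of it. After that first row the place no longer matters: it can be "A", "B", or "C". The result must start with "A". The row for 2018-11-8 is skipped because it is more than 7 days after 2018-10-29.
I tried this question: R: Extract data based on date, "if date lesser than", but I could not work out how to extract those 7 days.
Answered 2021-02-15 09:18:38
We can use match to get the Date of the first row where Place is "A" and then select all rows within 7 days of it.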
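As a small illustration of why match fits this task, the sketch below uses made-up vectors (place and dates are hypothetical, not the question's data) in which "A" occurs twice. match returns the index of the first occurrence only, so a later "A" does not move the start of the window.

```r
# Hypothetical toy vectors, not the question's data: "A" occurs twice.
place <- c("C", "A", "B", "A", "D")
dates <- as.Date(c("2018-10-27", "2018-10-29", "2018-10-31",
                   "2018-11-04", "2018-11-08"))

first_a <- match("A", place)  # index of the first "A" only
start   <- dates[first_a]     # date that opens the 7-day window

# TRUE for rows dated from start through start + 7 days (inclusive).
keep <- dates >= start & dates <= start + 7
place[keep]
```

If Place contains no "A" at all, match returns NA, so start is NA and no row can be selected reliably; that case would need a separate check.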
library(dplyr)
df %>%
  mutate(Date = as.Date(Date)) %>%
  filter({
    tmp <- Date[match('A', Place)]  # date of the first row where Place is "A"
    between(Date, tmp, tmp + 7)     # keep rows from that date through 7 days later
  })
# Date Place Value Value.1
#1 2018-10-29 A 10 5
#2 2018-10-31 B 15 6
#3 2018-11-04 C 17 9
dplyr lets you do this without creating a temporary variable in the global environment. The same solution can be written in base R as follows:
df$Date <- as.Date(df$Date)
date_val <- df$Date[match('A', df$Place)]
subset(df, Date >= date_val & Date <= date_val + 7)
Data
df <- structure(list(Date = structure(c(17831, 17833, 17835, 17839,
17843), class = "Date"), Place = c("C", "A", "B", "C", "D"),
Value = c(20L, 10L, 15L, 17L, 18L), Value.1 = c(8L, 5L, 6L,
9L, 5L)), row.names = c(NA, -5L), class = "data.frame")
Answered 2021-02-15 09:34:49
One option in base R is
# Find the difference in days
tmp1 <- df$Date - df[df$Place == "A", "Date"]
# Time differences in days
# [1] -2 0 2 6 10
# And then just subset your df
df[df$Place == "A" | (tmp1 <= 7 & tmp1 > 0), ]
# Date Place Value Value.1
# 2 2018-10-29 A 10 5
# 3 2018-10-31 B 15 6
# 4 2018-11-04 C 17 9
Data
df <- read.table( text = "Date Place Value Value
2018-10-27 C 20 8
2018-10-29 A 10 5
2018-10-31 B 15 6
2018-11-4 C 17 9
2018-11-8 D 18 5 ", header = TRUE)
df[, 1] <- as.Date(df[, 1])
Answered 2021-02-15 09:34:15
This also works. It is very similar to Ronak's answer, but it does not need to create the tmp variable.
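A note on indexing with Place == "A" directly, as this answer does: the expression yields one date per matching row, so it gives a single start date only when "A" appears exactly once. The sketch below is a base R variant under the assumption that the earliest "A" should anchor the window; the data frame is a hypothetical toy example, not the question's data.

```r
# Hypothetical toy data (not the question's data): "A" appears on two rows.
dat <- data.frame(
  Date  = as.Date(c("2018-10-27", "2018-10-29", "2018-10-31",
                    "2018-11-04", "2018-11-08")),
  Place = c("C", "A", "B", "A", "D")
)

a_dates <- dat$Date[dat$Place == "A"]  # two dates here, not a single start date
start   <- min(a_dates)                # assumption: the earliest "A" anchors the window

# Keep rows from the start date through 7 days later (inclusive).
res <- dat[dat$Date >= start & dat$Date <= start + 7, ]
```

Using min rather than the first element means the result does not depend on the rows already being sorted by date.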
#dput
dat <- structure(list(Date = c("2018-10-27", "2018-10-29", "2018-10-31",
"2018-11-04", "2018-11-08"), Place = c("C", "A", "B", "C", "D"
), Value1 = c(20L, 10L, 15L, 17L, 18L), Value2 = c(8, 5, 6, 9,
5)), class = "data.frame", row.names = c(NA, -5L))
#code
library(dplyr)
dat %>% mutate(Date = as.Date(Date)) %>%
filter(between(Date, Date[Place == "A"], Date[Place == "A"] + 7))
Date Place Value1 Value2
1 2018-10-29 A 10 5
2 2018-10-31 B 15 6
3 2018-11-04 C 17 9
https://stackoverflow.com/questions/66205311