If I have a data frame containing ranges:
start end
12 18
22 30
35 40
49 70
81 90
102 110
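120 124
How can I get, in R, the rows whose distance falls within a lower and an upper threshold (that is, the start of the next row minus the end of the previous row lies within that threshold)?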
For example, if I want the rows whose distance is between 5 and 10, I would like to get:
start.1 end.1 start.2 end.2
22 30 35 40
35 40 49 70
102 110 120 124
Here, start.2 - end.1 is always between 5 and 10.
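To make the criterion concrete, here is a minimal sketch that computes the gap between each row's end and the next row's start for the example data. The names `gaps` and `keep` are illustrative only, and the bounds are treated as inclusive because the expected output above contains gaps of exactly 5 and 10.

```r
# Example data from the question
df <- data.frame(
  start = c(12, 22, 35, 49, 81, 102, 120),
  end   = c(18, 30, 40, 70, 90, 110, 124)
)

# Gap between each row's end and the following row's start
gaps <- df$start[-1] - df$end[-nrow(df)]
gaps
# 4 5 9 11 12 10

# TRUE where the gap lies in the closed interval [5, 10]
keep <- gaps >= 5 & gaps <= 10
which(keep)
# 2 3 6
```

Positions 2, 3 and 6 correspond to the pairs (22-30, 35-40), (35-40, 49-70) and (102-110, 120-124) shown in the desired output.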
Posted on 2019-06-19 08:19:42
library(dplyr)

df <- data.frame(
  start = c(12, 22, 35, 49, 81, 102, 120),
  end   = c(18, 30, 40, 70, 90, 110, 124)
)

df %>%
  # gap between this row's start and the previous row's end
  mutate(difference = start - lag(end),
         start.1 = lag(start),
         end.1 = lag(end),
         start.2 = start,
         end.2 = end) %>%
  # keep pairs whose gap is within [5, 10]; the first row has an NA gap and is dropped
  filter(difference >= 5 & difference <= 10) %>%
  select(-c(difference, start, end))
Posted on 2019-06-19 11:24:28
One way using base R:
#Get the difference between consecutive start and end values
diffs <- df$start[-1] - df$end[-nrow(df)]
#Get indices where the condition of difference is satisfied
rows_inds <- which(diffs >= 5 & diffs <= 10)
#cbind the rows present in rows_inds and the next row
df1 <- cbind(df[rows_inds, ], df[rows_inds + 1, ])
#Make the columns name unique
names(df1) <- make.unique(names(df1))
df1
#  start end start.1 end.1
#2    22  30      35    40
#3    35  40      49    70
#6   102 110     120   124
https://stackoverflow.com/questions/56657961