I have a df like this:
check <- read.table(text='material previousUser currentUser status date originFrame currentFrame
123 inventory Dave draft 2016-1 1/1/2016 1/1/2016
123 Dave Carl transfer 2016-2 2/1/2016 1/1/2016
123 Carl customer sent 2016-4 4/1/2016 1/1/2016
123 inventory Dave draft 2016-1 1/1/2016 2/1/2016
123 Dave Carl transfer 2016-2 2/1/2016 2/1/2016
123 Carl customer sent 2016-4 4/1/2016 2/1/2016
123 inventory Dave draft 2016-1 1/1/2016 3/1/2016
123 Dave Carl transfer 2016-2 2/1/2016 3/1/2016
123 Carl customer sent 2016-4 4/1/2016 3/1/2016
123 inventory Dave draft 2016-1 1/1/2016 4/1/2016
123 Dave Carl transfer 2016-2 2/1/2016 4/1/2016
123 Carl customer sent 2016-4 4/1/2016 4/1/2016
123 inventory Dave draft 2016-1 1/1/2016 5/1/2016
123 Dave Carl transfer 2016-2 2/1/2016 5/1/2016
123 Carl customer sent 2016-4 4/1/2016 5/1/2016
123 inventory Dave draft 2016-1 1/1/2016 1/1/2017
123 Dave Carl transfer 2016-2 2/1/2016 1/1/2017
123 Carl customer sent 2016-4 4/1/2016 1/1/2017
123 inventory Dave draft 2016-1 1/1/2016 2/1/2017
123 Dave Carl transfer 2016-2 2/1/2016 2/1/2017
123 Carl customer sent 2016-4 4/1/2016 2/1/2017
123 inventory Dave draft 2016-1 1/1/2016 3/1/2017
123 Dave Carl transfer 2016-2 2/1/2016 3/1/2017
123 Carl customer sent 2016-4 4/1/2016 3/1/2017
123 inventory Dave draft 2016-1 1/1/2016 4/1/2017
123 Dave Carl transfer 2016-2 2/1/2016 4/1/2017
123 Carl customer sent 2016-4 4/1/2016 4/1/2017
123 inventory Dave draft 2016-1 1/1/2016 5/1/2017
123 Dave Carl transfer 2016-2 2/1/2016 5/1/2017
123 Carl customer sent 2016-4 4/1/2016 5/1/2017
104 inventory Dave draft 2017-1 1/1/2017 1/1/2016
104 Dave Carl transfer 2017-2 2/1/2017 1/1/2016
104 Carl customer sent 2017-4 4/1/2017 1/1/2016
104 inventory Dave draft 2017-1 1/1/2017 2/1/2016
104 Dave Carl transfer 2017-2 2/1/2017 2/1/2016
104 Carl customer sent 2017-4 4/1/2017 2/1/2016
104 inventory Dave draft 2017-1 1/1/2017 3/1/2016
104 Dave Carl transfer 2017-2 2/1/2017 3/1/2016
104 Carl customer sent 2017-4 4/1/2017 3/1/2016
104 inventory Dave draft 2017-1 1/1/2017 4/1/2016
104 Dave Carl transfer 2017-2 2/1/2017 4/1/2016
104 Carl customer sent 2017-4 4/1/2017 4/1/2016
104 inventory Dave draft 2017-1 1/1/2017 5/1/2016
104 Dave Carl transfer 2017-2 2/1/2017 5/1/2016
104 Carl customer sent 2017-4 4/1/2017 5/1/2016
104 inventory Dave draft 2017-1 1/1/2017 1/1/2017
104 Dave Carl transfer 2017-2 2/1/2017 1/1/2017
104 Carl customer sent 2017-4 4/1/2017 1/1/2017
104 inventory Dave draft 2017-1 1/1/2017 2/1/2017
104 Dave Carl transfer 2017-2 2/1/2017 2/1/2017
104 Carl customer sent 2017-4 4/1/2017 2/1/2017
104 inventory Dave draft 2017-1 1/1/2017 3/1/2017
104 Dave Carl transfer 2017-2 2/1/2017 3/1/2017
104 Carl customer sent 2017-4 4/1/2017 3/1/2017
104 inventory Dave draft 2017-1 1/1/2017 4/1/2017
104 Dave Carl transfer 2017-2 2/1/2017 4/1/2017
104 Carl customer sent 2017-4 4/1/2017 4/1/2017
104 inventory Dave draft 2017-1 1/1/2017 5/1/2017
104 Dave Carl transfer 2017-2 2/1/2017 5/1/2017
104 Carl customer sent 2017-4 4/1/2017 5/1/2017', header=TRUE, stringsAsFactors = FALSE)
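check[c('originFrame','currentFrame')] <- lapply(check[c('originFrame','currentFrame')], as.Date, format = '%m/%d/%Y')

I want to filter by group, grouping on currentFrame and material, and keep the rows where originFrame equals currentFrame. If no row in the group is equal, I want the largest originFrame that is less than currentFrame, like this: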
material previousUser currentUser status date originFrame currentFrame
123 inventory Dave draft 2016-1 1/1/2016 1/1/2016
123 Dave Carl transfer 2016-2 2/1/2016 2/1/2016
123 Dave Carl transfer 2016-2 2/1/2016 3/1/2016
123 Carl customer sent 2016-4 4/1/2016 4/1/2016
123 Carl customer sent 2016-4 4/1/2016 5/1/2016
123 inventory Dave draft 2016-1 4/1/2016 1/1/2017
123 Dave Carl transfer 2016-2 4/1/2016 2/1/2017
123 Dave Carl transfer 2016-2 4/1/2016 3/1/2017
123 Carl customer sent 2016-4 4/1/2016 4/1/2017
123 Carl customer sent 2016-4 4/1/2016 5/1/2017
104 inventory Dave draft 2016-1 1/1/2017 1/1/2016
104 Dave Carl transfer 2016-2 1/1/2017 2/1/2016
104 Dave Carl transfer 2016-2 1/1/2017 3/1/2016
104 Carl customer sent 2016-4 1/1/2017 4/1/2016
104 Carl customer sent 2016-4 1/1/2017 5/1/2016
104 inventory Dave draft 2016-1 1/1/2017 1/1/2017
104 Dave Carl transfer 2016-2 2/1/2017 2/1/2017
104 Dave Carl transfer 2016-2 2/1/2017 3/1/2017
104 Carl customer sent 2016-4 4/1/2017 4/1/2017
104 Carl customer sent 2016-4 4/1/2017 5/1/2017

The following runs, but it does not take the value of currentFrame into account and therefore gives incorrect results:
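library(dplyr)

check <- as.data.frame(
  check %>%
    group_by(currentFrame, material) %>%
    filter(
      ifelse(
        currentFrame %in% originFrame,
        originFrame == currentFrame,
        ifelse(
          max(originFrame) > currentFrame,
          # both branches are identical, so currentFrame has no effect here
          originFrame == max(originFrame),
          originFrame == max(originFrame)
        )
      )
    )
)

However, I can't seem to get it to work with the rule that the maximum must be lower than the currentFrame value. The following attempt returns the wrong number of observations: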
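check <- as.data.frame(
  check %>%
    group_by(currentFrame, material) %>%
    filter(
      ifelse(
        currentFrame %in% originFrame,
        originFrame == currentFrame,
        ifelse(
          max(originFrame) > currentFrame,
          # which.max() returns a position in the vector, not a date
          originFrame == which.max(originFrame < currentFrame),
          originFrame == max(originFrame)
        )
      )
    )
)

Edit*: I should have mentioned that the real data file contains many materials with different dates; the example has now been updated.
Edit 2*: OK, sorry, hopefully this is clearer now. If anyone has feedback on how I could word this question better, I would appreciate it.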
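Posted on 2018-03-15 18:10:12
I figured it out.
In the end, I split the data frame into three data frames: one for originFrame == currentFrame, one for originFrame < currentFrame, and one for originFrame > currentFrame. I then removed from data frame 2 everything that was in data frame 1, and from data frame 3 everything that was in data frames 1 and 2. After that I took the maximum originFrame from data frame 2 and the minimum originFrame from data frame 3. Once I bound them together, I had what I needed.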
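The answer above gives no code, so the following is only a sketch of how the three-way split might look in dplyr. It assumes that "removing everything in data frame 1" means dropping every (material, currentFrame) group that an earlier step already covers. The trimmed sample data and the names `keys`, `equal`, `earlier`, `later` and `result` are illustrative and do not come from the original post.

```r
library(dplyr)

# Trimmed, illustrative sample: two materials, three currentFrame groups.
check <- read.table(text = 'material status originFrame currentFrame
123 draft    1/1/2016 2/1/2016
123 transfer 2/1/2016 2/1/2016
123 sent     4/1/2016 2/1/2016
123 draft    1/1/2016 3/1/2016
123 transfer 2/1/2016 3/1/2016
123 sent     4/1/2016 3/1/2016
104 draft    1/1/2017 5/1/2016
104 transfer 2/1/2017 5/1/2016
104 sent     4/1/2017 5/1/2016', header = TRUE, stringsAsFactors = FALSE)
check[c('originFrame', 'currentFrame')] <-
  lapply(check[c('originFrame', 'currentFrame')], as.Date, format = '%m/%d/%Y')

# Columns that identify a group (assumption: one result row per group).
keys <- c('material', 'currentFrame')

# Data frame 1: exact matches.
equal <- filter(check, originFrame == currentFrame)

# Data frame 2: earlier originFrames, only for groups without an exact match;
# keep the latest one per group.
earlier <- check %>%
  filter(originFrame < currentFrame) %>%
  anti_join(equal, by = keys) %>%
  group_by(material, currentFrame) %>%
  filter(originFrame == max(originFrame)) %>%
  ungroup()

# Data frame 3: later originFrames, only for groups not covered by 1 or 2;
# keep the earliest one per group.
later <- check %>%
  filter(originFrame > currentFrame) %>%
  anti_join(equal, by = keys) %>%
  anti_join(earlier, by = keys) %>%
  group_by(material, currentFrame) %>%
  filter(originFrame == min(originFrame)) %>%
  ungroup()

# Bind the three pieces back together.
result <- bind_rows(equal, earlier, later) %>%
  arrange(material, currentFrame)
```

Using `anti_join()` on the group keys is one way to express the "remove what is already covered" step. The original author may well have done it differently.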
Posted on 2018-03-12 00:08:31
Your data, in a more consumable format:
check <- read.table(text='material previousUser currentUser status date originFrame currentFrame
123 inventory Dave draft 2016-1 1/1/2016 1/1/2016
123 Dave Carl transfer 2016-2 2/1/2016 1/1/2016
123 Carl customer sent 2016-4 4/1/2016 1/1/2016
123 inventory Dave draft 2016-1 1/1/2016 2/1/2016
123 Dave Carl transfer 2016-2 2/1/2016 2/1/2016
123 Carl customer sent 2016-4 4/1/2016 2/1/2016
123 inventory Dave draft 2016-1 1/1/2016 3/1/2016
123 Dave Carl transfer 2016-2 2/1/2016 3/1/2016
123 Carl customer sent 2016-4 4/1/2016 3/1/2016
123 inventory Dave draft 2016-1 1/1/2016 4/1/2016
123 Dave Carl transfer 2016-2 2/1/2016 4/1/2016
123 Carl customer sent 2016-4 4/1/2016 4/1/2016
123 inventory Dave draft 2016-1 1/1/2016 5/1/2016
123 Dave Carl transfer 2016-2 2/1/2016 5/1/2016
123 Carl customer sent 2016-4 4/1/2016 5/1/2016', header=TRUE, stringsAsFactors = FALSE)
check[c('originFrame','currentFrame')] <- lapply(check[c('originFrame','currentFrame')], as.Date, format = '%m/%d/%Y')

One way, continuing to use dplyr:
library(dplyr)
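check %>%
  # days from originFrame to currentFrame; negative means originFrame is later
  mutate(datediff = currentFrame - originFrame) %>%
  arrange(currentFrame, datediff) %>%
  group_by(currentFrame) %>%
  # keep only rows whose originFrame is on or before currentFrame
  filter(datediff >= 0) %>%
  # the first row per group now has the smallest non-negative difference
  slice(1) %>%
  ungroup() %>%
  select(-datediff)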
# # A tibble: 5 × 7
#   material previousUser currentUser status   date   originFrame currentFrame
#      <int> <chr>        <chr>       <chr>    <chr>  <date>      <date>
# 1      123 inventory    Dave        draft    2016-1 2016-01-01  2016-01-01
# 2      123 Dave         Carl        transfer 2016-2 2016-02-01  2016-02-01
# 3      123 Dave         Carl        transfer 2016-2 2016-02-01  2016-03-01
# 4      123 Carl         customer    sent     2016-4 2016-04-01  2016-04-01
# 5      123 Carl         customer    sent     2016-4 2016-04-01  2016-05-01

https://stackoverflow.com/questions/49226131
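The dplyr pipeline in this answer groups by currentFrame only, and its `filter(datediff >= 0)` step drops any group in which every originFrame is later than currentFrame. The sketch below is a possible variant for data with several materials. It groups by material as well and, as an assumption borrowed from the asker's own answer, falls back to the closest later originFrame when no earlier one exists. The trimmed sample data and the name `result` are illustrative.

```r
library(dplyr)

# Trimmed, illustrative sample: two materials, three currentFrame groups.
check <- read.table(text = 'material status originFrame currentFrame
123 draft    1/1/2016 2/1/2016
123 transfer 2/1/2016 2/1/2016
123 sent     4/1/2016 2/1/2016
123 draft    1/1/2016 3/1/2016
123 transfer 2/1/2016 3/1/2016
123 sent     4/1/2016 3/1/2016
104 draft    1/1/2017 5/1/2016
104 transfer 2/1/2017 5/1/2016
104 sent     4/1/2017 5/1/2016', header = TRUE, stringsAsFactors = FALSE)
check[c('originFrame', 'currentFrame')] <-
  lapply(check[c('originFrame', 'currentFrame')], as.Date, format = '%m/%d/%Y')

result <- check %>%
  # difference in days; negative means originFrame is after currentFrame
  mutate(datediff = as.numeric(currentFrame - originFrame)) %>%
  group_by(material, currentFrame) %>%
  # within each group: rows on or before currentFrame come first (FALSE sorts
  # before TRUE), then the closest date; later rows act only as a fallback
  arrange(datediff < 0, abs(datediff), .by_group = TRUE) %>%
  slice(1) %>%
  ungroup() %>%
  select(-datediff)
```

If groups without any earlier originFrame should be dropped instead, the original `filter(datediff >= 0)` step can be kept and only the `group_by()` call changed.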