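Suppose we have 2 data frames, each with 2 columns and 6 rows. We want to cbind them only where the date on the left (lhs) is earlier than the date on the right (rhs), while making sure no date is repeated across rows (in either lhs or rhs):
x = cbind(data.frame(lhs_date = seq(Sys.Date() - 5, Sys.Date(), 2)), letter = c("A","B","C","D","E","F"))
y = cbind(data.frame(rhs_date = seq(Sys.Date() - 5, Sys.Date(), 1)), letter = c("X","Y","Y","X","J","J"))
How can we cbind or left join x to y only when lhs date < rhs date, keeping every row unique?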
Posted on 2019-11-27 22:11:44
The solution I found builds on agila's initial input: after the fuzzy join, a dplyr pipeline does the rest:
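library(fuzzyjoin)  # provides fuzzy_left_join()
library(dplyr)      # provides %>%, group_by(), mutate(), filter()
x <- data.frame(lhs_date = seq(Sys.Date() - 5, Sys.Date(), 2), letter = c("A","B","C","D","E","F"))
y <- data.frame(rhs_date = seq(Sys.Date() - 5, Sys.Date(), 1), letter = c("X","Y","Y","X","J","J"))
# Left join that keeps the pairs where lhs_date < rhs_date
z <- fuzzy_left_join(
  x = x,
  y = y,
  by = c("lhs_date" = "rhs_date"),
  match_fun = list(`<`)
)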
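# Keep only the first joined row for each lhs_date
z %>%
  group_by(lhs_date) %>%
  mutate(flag = row_number()) %>%
  filter(flag == 1)
I could easily reproduce this in SQL, but I struggled with it in R. Thanks @Agila. Although incomplete, your answer pointed in the right direction and went a long way.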
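Since the author mentions that this is easy to express in SQL, here is a rough base R sketch of the same cross-join-then-filter idea. It is not the answer's method and does not use fuzzyjoin. It assumes a fixed date in place of Sys.Date() so the output is reproducible, writes the six-row lhs_date column out explicitly with rep(), and the names base_date, pairs and result are made up for illustration.

```r
# Assumption: a fixed reference date replaces Sys.Date() for reproducibility.
base_date <- as.Date("2019-11-27")

x <- data.frame(
  lhs_date = rep(seq(base_date - 5, base_date, by = 2), length.out = 6),
  letter   = c("A", "B", "C", "D", "E", "F"),
  stringsAsFactors = FALSE
)
y <- data.frame(
  rhs_date = seq(base_date - 5, base_date, by = 1),
  letter   = c("X", "Y", "Y", "X", "J", "J"),
  stringsAsFactors = FALSE
)

# Cross join: by = NULL makes merge() return every x/y combination.
# The shared column name `letter` becomes letter.x and letter.y.
pairs <- merge(x, y, by = NULL, suffixes = c(".x", ".y"))

# Keep only the combinations where the left date is strictly earlier.
pairs <- pairs[pairs$lhs_date < pairs$rhs_date, ]

# Sort so the earliest rhs_date comes first within each lhs_date,
# then keep the first row per lhs_date.
pairs  <- pairs[order(pairs$lhs_date, pairs$rhs_date), ]
result <- pairs[!duplicated(pairs$lhs_date), ]
rownames(result) <- NULL

print(result)
```

Note that this only enforces uniqueness on lhs_date. With this particular data each rhs_date also appears once in the result, but that follows from the data rather than from the code, so a different input might need an additional de-duplication step on rhs_date.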
https://stackoverflow.com/questions/59053536