I have two data frames, as shown below:
Dataframe A:
code1 code2 element1 from to
c1a c2a e1a 1 15
c1a c2a e1b 17 50
c1a c2b e1c 14 67
c1b c2c e1d 1 20
c1b c2d e1e 40 60
Dataframe B:
code1 code2 element2 number
c1a c2a e2a 7
c1a c2a e2b 10
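c1a c2a e2c 35
I basically need to join them where from <= number <= to, to get something like this:
Resulting dataframe
(A fragment only, since I don't have enough mock data. What I want is the complete merge of A and B.)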
code1 code2 element1 element2 from to number
c1a c2a e1a e2a 1 15 7
c1a c2a e1a e2b 1 15 10
c1a c2a e1b e2c 17 50 35
I could do this with a for loop and check each row manually, but is there a more "elegant" way to do it?
Posted on 2020-05-08 02:16:25
You can join the data on code1 and code2, and then filter for the values that fall within the range.
You can do this in dplyr:
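library(dplyr)
# A and B are the two data frames from the question.
# Join on the shared code columns, then keep rows where number lies in [from, to].
left_join(B, A, by = c('code1', 'code2')) %>%
  filter(number >= from & number <= to)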
# code1 code2 element2 number element1 from to
#1 c1a c2a e2a 7 e1a 1 15
#2 c1a c2a e2b 10 e1a 1 15
#3 c1a c2a e2c 35 e1b 17 50
Or in base R:
subset(merge(B, A, by = c('code1', 'code2')), number >= from & number <= to)
Posted on 2020-05-08 02:21:45
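Here is an approach using fuzzyjoin::fuzzy_inner_join. I understand from your output that, in addition to the condition from <= number <= to, you also want to join on code1 and code2.
The range condition is split into two inequalities: from is joined to number by the first inequality (i.e. from <= number), and to is joined to number by the second inequality (i.e. number <= to). code1 and code2 are joined by equality.
A feature of the fuzzy joins is that they output all columns from both data frames.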
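To make the two inequalities concrete before they are passed as match functions, here is a minimal base R sketch of the same pairwise test. It is only an illustration, not part of fuzzyjoin: the vectors hold the from, to and number values from the question for the rows where code1 is c1a and code2 is c2a, and the names lower_ok, upper_ok and in_range are made up for this example.

```r
# Interval bounds from Dataframe A and numbers from Dataframe B,
# restricted to the rows where code1 == "c1a" and code2 == "c2a".
from   <- c(e1a = 1, e1b = 17)
to     <- c(e1a = 15, e1b = 50)
number <- c(e2a = 7, e2b = 10, e2c = 35)

# outer() evaluates a vectorised function for every (row of A, row of B) pair.
lower_ok <- outer(from, number, function(l, r) l <= r)  # from <= number
upper_ok <- outer(to,   number, function(l, r) r <= l)  # number <= to

# A pair matches only when both inequalities hold.
in_range <- lower_ok & upper_ok
print(in_range)
```

The TRUE cells are the pairs e1a/e2a, e1a/e2b and e1b/e2c, which are the three rows the asker wants. The fuzzy join applies the same kind of per-column test, with equality on code1 and code2 added.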
library(fuzzyjoin)
fuzzy_inner_join(
df_A, df_B,
by = c(
"code1" = "code1",
"code2" = "code2",
"from" = "number",
"to" = "number"),
match_fun = c(
"code1" = function(l, r) l == r,
"code2" = function(l, r) l == r,
"from" = function(l, r) l <= r,
"to" = function(l, r) r <= l))
# code1.x code2.x element1 from to code1.y code2.y element2 number
# 1 c1a c2a e1a 1 15 c1a c2a e2a 7
# 2 c1a c2a e1a 1 15 c1a c2a e2b 10
# 3 c1a c2a e1b 17 50 c1a c2a e2c 35
Data
df_A <- structure(list(code1 = c("c1a", "c1a", "c1a", "c1b", "c1b"),
code2 = c("c2a", "c2a", "c2b", "c2c", "c2d"), element1 = c("e1a",
"e1b", "e1c", "e1d", "e1e"), from = c(1L, 17L, 14L, 1L, 40L
), to = c(15L, 50L, 67L, 20L, 60L)), class = "data.frame", row.names = c(NA, -5L))
df_B <- structure(list(code1 = c("c1a", "c1a", "c1a"), code2 = c("c2a",
"c2a", "c2a"), element2 = c("e2a", "e2b", "e2c"), number = c(7L,
10L, 35L)), class = "data.frame", row.names = c(NA, -3L))
https://stackoverflow.com/questions/61670728
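For anyone who wants to run the join-then-filter idea end to end without extra packages, the following is a minimal base R sketch on the sample data. The data frames are rebuilt with data.frame() using the same values as df_A and df_B. The final sort by number and the column reordering are additions made here to reproduce the layout shown in the question, not something either answer prescribes.

```r
# Sample data from the question (same values as df_A and df_B).
df_A <- data.frame(
  code1    = c("c1a", "c1a", "c1a", "c1b", "c1b"),
  code2    = c("c2a", "c2a", "c2b", "c2c", "c2d"),
  element1 = c("e1a", "e1b", "e1c", "e1d", "e1e"),
  from     = c(1L, 17L, 14L, 1L, 40L),
  to       = c(15L, 50L, 67L, 20L, 60L),
  stringsAsFactors = FALSE
)
df_B <- data.frame(
  code1    = c("c1a", "c1a", "c1a"),
  code2    = c("c2a", "c2a", "c2a"),
  element2 = c("e2a", "e2b", "e2c"),
  number   = c(7L, 10L, 35L),
  stringsAsFactors = FALSE
)

# Step 1: equality join on the two code columns (all matching pairs).
joined <- merge(df_A, df_B, by = c("code1", "code2"))

# Step 2: keep only the pairs where number lies in [from, to].
result <- subset(joined, from <= number & number <= to)

# Step 3 (assumption): sort by number and reorder the columns
# to match the layout requested in the question.
wanted <- c("code1", "code2", "element1", "element2", "from", "to", "number")
result <- result[order(result$number), wanted]
rownames(result) <- NULL
print(result)
```

Because merge() first builds every pair that shares code1 and code2, the intermediate table can grow large on big inputs. In that case a join that tests the range during matching, such as the fuzzyjoin approach, may be worth comparing.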