I'm really confused about why one of my functions behaves so oddly. Here is some data and the function itself:
match0 <- function(i, df) {
  df <- as.data.frame(df)
  j <- 1:nrow(df)
  if (!is.na(df$p201[i])) {
    l <- i
  } else {
    k <-
      (!(df$Ano[i] == df$Ano[j] & df$Trimestre[i] == df$Trimestre[j] & i != j)) &
      df$V2008[i] != 99 &
      df$V20081[i] != 99 &
      df$V20082[i] != 9999
    l <- ifelse(any(k), which(k), i)
  }
  return(l)
}
dataset <- structure(list(UF = structure(c(11, 11), format.stata = "%8.0g"),
UPA = structure(c(110000227, 110000227), format.stata = "%12.0g"),
V1008 = structure(c(1, 1), format.stata = "%8.0g"), V1014 = structure(c(1,
1), format.stata = "%8.0g"), V2007 = structure(c(1, 1), format.stata = "%8.0g"),
V2008 = structure(c(17, 17), format.stata = "%8.0g"), V20081 = structure(c(1,
1), format.stata = "%8.0g"), V20082 = structure(c(1969, 1969
), format.stata = "%8.0g"), Ano = structure(c(2012, 2012), format.stata = "%8.0g"),
Trimestre = structure(c("1", "2"), format.stata = "%9s"),
V2003 = structure(c(1, 1), format.stata = "%8.0g")), row.names = c(NA,
-2L), class = c("tbl_df", "tbl", "data.frame"))
Here is what I'm trying to do:
library(dplyr)
library(purrr)
dataset %>%
group_by(UF, UPA, V1008, V1014, V2007, V2008, V20081, V20082) %>%
arrange(UF, UPA, V1008, V1014, V2007, V2008, V20081, V20082, Ano, Trimestre, V2003) %>%
group_by(index = map_dbl(
seq(n()),
~ match0(.x, df = cur_data())
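), .add = TRUE)
The function should clearly produce index = 1 for both rows. However, running the code above does not give that result. If I skip map_dbl and check row by row manually, I do get the desired result.
Can anyone help me figure out why?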
Posted on 2020-08-09 00:51:01
From ?cur_data:
cur_data() gives the current data for the current group (excluding grouping variables)
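The documented behaviour can be checked on a toy example. This is only a minimal sketch: the tibble `toy` and its columns `g` and `x` are made up for illustration and are not part of the question's data. Recent dplyr versions may flag cur_data() as deprecated, so a warning here would not be surprising.

```r
library(dplyr)

# Hypothetical toy data: `g` is the grouping column, `x` is an ordinary column.
toy <- tibble(g = c("a", "a", "b"), x = 1:3)

# For each group, record the column names that cur_data() exposes.
seen <- toy %>%
  group_by(g) %>%
  summarise(cols = paste(names(cur_data()), collapse = ", "))

# The grouping column `g` should not appear in `cols`.
seen
```

If the grouping column is absent from `cols`, any function that receives cur_data() cannot look that column up by name.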
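So it passes the data without the grouping variables, and those are exactly the variables the function checks. A workaround for now is to cbind cur_group() to cur_data().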
library(dplyr)
dataset %>%
group_by(UF, UPA, V1008, V1014, V2007, V2008, V20081, V20082) %>%
arrange(UF, UPA, V1008, V1014, V2007, V2008, V20081, V20082,
Ano, Trimestre, V2003) %>%
group_by(index = purrr::map_dbl(seq(n()),
~ match0(.x, df = cbind(cur_group(), cur_data()))
), .add = TRUE)
# UF UPA V1008 V1014 V2007 V2008 V20081 V20082 V2003 Ano Trimestre p201 n_p index
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
#1 11 110000227 1 1 1 17 1 1969 1 2012 1 1 1 1
#2 11 110000227 1 1 1 17 1 1969 1 2012 2 NA 2 1
In the future there will be cur_data_all(), which will pass the current data together with the grouping variables.
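To see why the missing grouping columns make match0() return i instead of 1, the else branch can be replayed in base R on a data frame that lacks V2008. This is a rough sketch under that assumption: the two-row `df` below is a stand-in for what the function receives, and only one of the three missing columns is replayed, since the others behave the same way.

```r
# Stand-in for the data match0() receives: V2008 is a grouping variable,
# so it is absent here.
df <- data.frame(Ano = c(2012, 2012), Trimestre = c("1", "2"))
i <- 2
j <- 1:nrow(df)

# df$V2008 is NULL, so the comparison yields a zero-length logical vector.
missing_col <- df$V2008[i] != 99

same_period <- df$Ano[i] == df$Ano[j] &
  df$Trimestre[i] == df$Trimestre[j] &
  i != j

# Combining with a zero-length vector via & also yields zero length.
k <- !same_period & missing_col

length(k)
any(k)

# any(k) is FALSE, so the function falls back to i.
l <- ifelse(any(k), which(k), i)
l
```

With the grouping columns bound back in, the comparisons against 99 and 9999 return a single TRUE or FALSE again, so `k` keeps one entry per row.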
https://stackoverflow.com/questions/63321217