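The code below reads data from a remote URL, then finds the rows that are NA in every column of each of five different column subsets. The output table miss_recode gives the ID of each such row along with the variable recode_cols, a label for the column subset in which that row is entirely NA.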
suppressMessages(library(tidyverse))

# Build the URL of the remote CSV and read it.
urlRemote_path <- "https://raw.githubusercontent.com/"
github_path <- "DSHerzberg/TOD-R/master/INPUT-FILES/"
fileName_path <- "SO-data.csv"

input <- suppressMessages(read_csv(url(
  str_c(urlRemote_path, github_path, fileName_path)
)))

# For each column subset, keep the rows that are NA in every column of it.
miss1 <- input %>%
  filter(across(c(i001:i035),
                ~ is.na(.))) %>%
  mutate(recode_cols = "i001:i035")

miss2 <- input %>%
  filter(across(c(i036:i060),
                ~ is.na(.))) %>%
  mutate(recode_cols = "i036:i060")

miss3 <- input %>%
  filter(across(c(i061:i100),
                ~ is.na(.))) %>%
  mutate(recode_cols = "i061:i100")

miss4 <- input %>%
  filter(across(c(i101:i130),
                ~ is.na(.))) %>%
  mutate(recode_cols = "i101:i130")

miss5 <- input %>%
  filter(across(c(i131:i165),
                ~ is.na(.))) %>%
  mutate(recode_cols = "i131:i165")

# Stack the five results and keep only the ID and the subset label.
miss_recode <- bind_rows(
  miss1,
  miss2,
  miss3,
  miss4,
  miss5
) %>%
  select(ID, recode_cols)

I want to consolidate this code with purrr::map. The next snippet shows my attempt, but it returns Error: Can't subset columns that don't exist.
vec <- c("i001:i035", "i036:i060", "i061:i100", "i101:i130", "i131:i165")

miss_recode_map <- vec %>%
  map_df(~
    input %>%
      filter(across(c(!!sym(.x)),
                    ~ is.na(.))) %>%
      mutate(recode_cols = .x) %>%
      select(ID, recode_cols)
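  )

Clearly I am not getting the NSE right. This seems to be a new issue related to across(), which is now available in dplyr 1.0.0. In this example, it seems that one use of .x needs the elements of vec as quoted strings, while the other use of .x needs them unquoted.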
Thanks in advance for any help.
Answered 2020-06-25 11:55:00
You can't convert "i001:i035" into a symbol; you actually need to parse it.
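The difference can be checked directly in rlang. This is only a minimal sketch: the names as_symbol and as_call are made up for illustration and are not part of the question's code.

```r
library(rlang)

# sym() builds ONE symbol whose name is the whole string, colon included,
# so tidyselect would look for a single column literally named `i001:i035`.
as_symbol <- sym("i001:i035")
is_symbol(as_symbol)       # TRUE

# parse_expr() parses the string as R code, giving a call to `:` whose
# two arguments are the symbols i001 and i035, i.e. a column range.
as_call <- parse_expr("i001:i035")
is_call(as_call, ":")      # TRUE
```

That is presumably why the sym() version fails with "Can't subset columns that don't exist": no column carries that full name. The question's pipeline then only needs parse_expr() in place of sym():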
library(dplyr)
library(rlang)

# Parse each range string into an expression and inject it into across().
purrr::map_df(vec, ~ input %>%
  filter(across(!!parse_expr(.x), ~ is.na(.))) %>%
  mutate(recode_cols = .x) %>%
  select(ID, recode_cols))
# A tibble: 8 x 2
#       ID recode_cols
#    <dbl> <chr>
#1  201010 i036:i060
#2  214063 i036:i060
#3  262050 i036:i060
#4  262063 i036:i060
#5  205036 i061:i100
#6  231007 i061:i100
#7  208014 i101:i130
#8  231014 i131:i165

https://stackoverflow.com/questions/62567385
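For readers who cannot reach the remote CSV, the same approach can be tried on toy data. The sketch below rests on assumptions: the tibble toy, the vector toy_vec, and the result toy_recode are hypothetical stand-ins for the question's input, vec, and miss_recode. It also uses if_all() rather than across() inside filter(), which, as far as I know, is available from dplyr 1.0.4 onwards and is the form newer dplyr releases recommend for "all selected columns are NA".

```r
library(dplyr)
library(purrr)
library(rlang)

# Hypothetical toy data: an ID plus four item columns.
toy <- tibble(
  ID   = c(1, 2, 3),
  i001 = c(NA, 5, 1),
  i002 = c(NA, 3, 2),
  i003 = c(4, NA, 3),
  i004 = c(2, NA, NA)
)

# Column ranges stored as strings, as in the question.
toy_vec <- c("i001:i002", "i003:i004")

# For each range: parse the string into a `:` call, inject it with !!,
# keep rows that are NA in every column of the range, and label them.
toy_recode <- map(toy_vec, function(rng) {
  toy %>%
    filter(if_all(!!parse_expr(rng), is.na)) %>%
    mutate(recode_cols = rng) %>%
    select(ID, recode_cols)
}) %>%
  bind_rows()

toy_recode
```

Here row 1 is entirely NA in i001:i002 and row 2 is entirely NA in i003:i004, so toy_recode should hold those two IDs with their range labels. Row 3 is NA only in i004 and is therefore not reported.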