In the code below, map_dfr from purrr works, but future_map_dfr from furrr throws an error. How can I fix it?
#install.packages("randomForest"); install.packages("tidyverse"); install.packages("iml")
library(tidyverse); library(iml); library(randomForest)
library(furrr)
plan(multiprocess)
set.seed(42)
mtcars1 <- mtcars %>% mutate(vs = as.factor(vs),
                             id = row_number())
x <- "vs"
y <- paste0(setdiff(setdiff(names(mtcars1), "vs"), "id"), collapse = "+")
rf = randomForest(as.formula(paste0(x, "~ ", y)), data = mtcars1, ntree = 50)
predictor <- Predictor$new(rf, data = mtcars1, y = mtcars1$vs)
# Results using map_dfr() from purrr
shapelyresults <- map_dfr(1:nrow(mtcars), ~(Shapley$new(predictor, x.interest = mtcars1[.x,]) %>%
                                              .$results %>%
                                              as_tibble() %>%
                                              arrange(desc(phi)) %>%
                                              slice(1:5) %>%
                                              select(feature.value, phi) %>%
                                              mutate(id = .x)))
# Attempt to use future_map_dfr() from furrr
f_shapelyresults <- future_map_dfr(1:nrow(mtcars), ~(Shapley$new(predictor, x.interest = mtcars1[.x,]) %>%
                                                       .$results %>%
                                                       as_tibble() %>%
                                                       arrange(desc(phi)) %>%
                                                       slice(1:5) %>%
                                                       select(feature.value, phi) %>%
                                                       mutate(id = .x)))

Posted on 2020-03-19 23:35:21
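furrr uses future under the hood. Depending on your configuration, it can use R subprocesses mapped to different CPU cores or threads, each with its own environment/scope.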
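To make the point about separate subprocesses concrete, here is a minimal sketch. It assumes the `multisession` backend with two workers, which is my choice for illustration and not something taken from the question (the question uses `plan(multiprocess)`, which recent versions of future have deprecated). The names `main_pid` and `worker_pids` are hypothetical.

```r
library(future)
library(furrr)

# Assumption: a multisession plan with two background R sessions.
plan(multisession, workers = 2)

# Process ID of the main R session.
main_pid <- Sys.getpid()

# Each element is evaluated in a worker, so Sys.getpid() reports
# the worker's process ID rather than the main session's.
worker_pids <- future_map_int(1:4, ~ Sys.getpid())

# FALSE here means none of the elements ran in the main session.
main_pid %in% worker_pids

# Return to sequential evaluation and shut the workers down.
plan(sequential)
```

Because the workers are separate R sessions, packages attached and objects created in the main session are only available there if future ships them over or the function loads them itself.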
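In my experience, two types of problems usually come up:
So you can:
Rewrite the purrr lambda as a named function, and put require() calls at the top of the function to rule out the first type of problem. Try something like this: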
library(furrr)

my_function <-
  function(primary_object, Shapley_object) {
    # Load the packages inside the function so that each worker has them
    require(tidyverse); require(iml); require(randomForest)
    Shapley_object$new(predictor,
                       x.interest = mtcars1[primary_object, ]) %>%
      .$results %>%
      as_tibble() %>%
      arrange(desc(phi)) %>%
      slice(1:5) %>%
      select(feature.value, phi) %>%
      mutate(id = primary_object)
  }

f_shapelyresults <-
  future_map_dfr(
    .x = 1:nrow(mtcars),        # 1st argument: primary_object, above
    .f = my_function,
    Shapley_object = Shapley    # 2nd argument, as seen above
  )

Source: https://stackoverflow.com/questions/53399103
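As an alternative to calling require() inside the function, furrr (version 0.2.0 or later, as far as I know) lets you declare packages and globals for the workers through `furrr_options()`. The sketch below is a toy example, not the Shapley code from the question: `offset` and `add_offset` are hypothetical names, and `plan(multisession, workers = 2)` is an assumption on my part.

```r
library(furrr)

# Assumption: a multisession plan with two background R sessions.
plan(multisession, workers = 2)

# Hypothetical global object that the mapped function depends on.
offset <- 10

# Hypothetical function: uses tibble() unqualified, so the tibble
# package must be attached on the worker for this to run there.
add_offset <- function(i) {
  tibble(id = i, value = i + offset)
}

res <- future_map_dfr(
  1:3,
  add_offset,
  .options = furrr_options(
    globals  = c("offset"),  # export this object to the workers by name
    packages = "tibble"      # attach this package on the workers
  )
)

res

# Return to sequential evaluation and shut the workers down.
plan(sequential)
```

Declaring the dependencies explicitly keeps the function body free of require() calls, and a missing package or object then shows up as a clear error instead of depending on what future detects automatically.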