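I have a function in which, under certain conditions, I want to append some variables to a vector of variable names. That vector is later turned into a regression formula.
Code example: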
some_function <- function (df,mdl){
vars <- c("var1", "var2", "var3")
vars <- case_when(mdl== "model1" ~ vars<-("var3", "var4", vars),
mdl== "model2" ~ vars<-("var4", "var5", vars))
target_col <- "count"
target_formula <- as.formula(sprintf("%s ~ %s",
target_col,
paste(vars, collapse = " + ")))
}
mdl is a short text abbreviation that identifies the model; there are about 8-10 of them.
Answered on 2020-12-27 01:26:23
You can define model1/model2 in a list, then map over the list with as.formula:
library(purrr)
library(glue)
vars <- c("var1", "var2", "var3")
target <- "count"
models <- list(model1 = c("var3", "var4", vars),
model2 = c("var4", "var5", vars))
map(models, ~as.formula(glue("{target} ~ {paste(., collapse = ' + ')}")))
Output:
$model1
count ~ var3 + var4 + var1 + var2 + var3
<environment: 0x7fec25b8b770>
$model2
count ~ var4 + var5 + var1 + var2 + var3
<environment: 0x7fecf096b560>
Answered on 2020-12-27 01:28:58
In base R, we can use reformulate to create the formulas:
vars <- c("var1", "var2", "var3")
target <- "count"
models <- list(model1 = c("var3", "var4", vars),
model2 = c("var4", "var5", vars))
lapply(models, reformulate, response = target)
-output
#$model1
#count ~ var3 + var4 + var1 + var2 + var3
#<environment: 0x7f92c7658ed8>
#$model2
#count ~ var4 + var5 + var1 + var2 + var3
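#<environment: 0x7f92c7649a88>
This can be wrapped in a function, with if/else handling the condition. The dataset argument of the function does not appear to be used in the OP's post.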
some_function <- function (df,mdl){
vars <- c("var1", "var2", "var3")
vars <- if(mdl == "model1") {
c(vars, "var4")
} else c(vars, "var5")
target_col <- "count"
reformulate(vars, response = target_col)
}
-testing
some_function(iris, "model1")
#count ~ var1 + var2 + var3 + var4
#<environment: 0x7f92f0c77370>
some_function(iris, "model2")
#count ~ var1 + var2 + var3 + var5
#<environment: 0x7f92f0ce6db8>
Answered on 2020-12-27 01:34:49
Don't do assignment inside case_when; attempting it is almost never a good idea. Instead, try this:
some_function <- function (df,mdl){
newvars <- dplyr::case_when(
mdl == "model1" ~ c("var3", "var4"),
mdl == "model2" ~ c("var4", "var5")
)
vars <- c("var1", "var2", "var3", newvars)
# something else here
vars
}
some_function(mtcars, "model1")
# [1] "var1" "var2" "var3" "var3" "var4"
some_function(mtcars, "model2")
# [1] "var1" "var2" "var3" "var4" "var5"
This seems to work, but there are two things that could be improved.
First, "var3" is duplicated; perhaps we can add vars <- unique(vars) inside the function.
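Second, case_when is really a vectorized replacement for nested ifelse (or dplyr::if_else) calls; nesting those works but makes the code hard to follow and maintain. Using it therefore suggests that mdl may have length greater than 1. But look at what happens when we pass a length-2 argument:
some_function(mtcars, c("model1", "model2"))
# [1] "var1" "var2" "var3" "var3" "var5"
The first comparison in case_when finds that mdl == "model1" is true for the first element of mdl, but only the first element of c("var3", "var4") is used for it; likewise, the second comparison contributes only the second element of c("var4", "var5"). Moreover, if we pass more than two models, we get an error about incompatible vector lengths.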
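The two points above can be sketched in base R. This is only an illustration: check_model is a hypothetical helper name that does not appear in the original answers, and the example vector simply repeats the duplicated output shown earlier.

```r
# Point 1: unique() drops repeated names and keeps the first occurrence,
# so the original order of the variables is preserved.
vars <- c("var1", "var2", "var3", "var3", "var4")
vars <- unique(vars)
vars
# [1] "var1" "var2" "var3" "var4"

# Point 2: if only a single model name is meaningful, reject anything else
# up front instead of relying on vectorized recycling.
# (check_model is a hypothetical helper, assumed for illustration.)
check_model <- function(mdl) {
  if (!is.character(mdl) || length(mdl) != 1L) {
    stop("mdl must be a single character string")
  }
  invisible(mdl)
}

check_model("model1")                   # passes silently
# check_model(c("model1", "model2"))    # would raise an error
```

A guard like this turns the silent mixing of elements into an explicit error at the point where the function is called.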
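I suspect you intend mdl to have length 1, in which case you probably have several other models and want to add an extra set of variables on top of the defaults. Perhaps this is what you ultimately want?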
some_function <- function (df, mdl){
newvars <- switch(
mdl[1],
model1 = c("var3", "var4"),
model2 = c("var4", "var5"),
stop("unrecognized model: ", sQuote(mdl[1]))
)
vars <- c("var1", "var2", "var3", newvars)
# something else here
vars
}
some_function(mtcars, "model1")
# [1] "var1" "var2" "var3" "var3" "var4"
some_function(mtcars, "model2")
# [1] "var1" "var2" "var3" "var4" "var5"
some_function(mtcars, "model3")
# Error in some_function(mtcars, "model3") : unrecognized model: 'model3'
https://stackoverflow.com/questions/65458845
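As a rough sketch of how the pieces discussed in this thread might fit together, the switch lookup can be combined with unique() and the base R reformulate to return the regression formula directly. The name make_formula and its target_col default are assumptions made for illustration, not part of any of the original answers.

```r
# Hypothetical wrapper (illustrative name): look up the extra variables for
# one model, drop duplicates, and build the formula with base R.
make_formula <- function(mdl, target_col = "count") {
  # Exactly one model name is expected
  stopifnot(is.character(mdl), length(mdl) == 1L)
  newvars <- switch(
    mdl,
    model1 = c("var3", "var4"),
    model2 = c("var4", "var5"),
    # The unnamed last argument is the default branch of switch()
    stop("unrecognized model: ", sQuote(mdl))
  )
  # unique() removes the repeated "var3" while keeping the order
  vars <- unique(c("var1", "var2", "var3", newvars))
  reformulate(vars, response = target_col)
}

f <- make_formula("model1")
all.vars(f)
# [1] "count" "var1"  "var2"  "var3"  "var4"
```

With 8-10 models, each additional model would be one more named branch in the switch call.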