Q: Coding multiple models with a function in R Tidyverse

Stack Overflow user

Asked on 2019-06-26 04:09:19

1 answer · 623 views · 0 followers · 0 votes

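I am trying to fit several machine learning models with several formulas and store them as a list-column in a tibble.

I tried to adapt the code from a book chapter (Chapter 25: Many Models), but it only returns the last model's output. See the code below for details. The example uses the gapminder dataset from the gapminder package.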

lab_formula <- as.formula("pop ~ lifeExp ")

temp_formula <- as.formula("gdppercap ~ year")

formula_list <- list(lab_formula,temp_formula)
library(gapminder)

by_country <- gapminder %>% 
  dplyr :: group_by(country, continent) %>% 
  nest()

country_model <- function(df) {
for (i in formula_list) {
  lm(formula=formula[i], data = df)
  randomForest(formula=formula[i], data = df)
  gbm(formula=formula[i], data = df, n.minobsinnode = 2)
}
}

by_country <- by_country %>% 
  mutate(model = map(data, country_model))

by_country
# A tibble: 142 x 4
   country     continent data              model    
   <fct>       <fct>     <list>            <list>   
 1 Afghanistan Asia      <tibble [12 x 4]> <S3: gbm>
 2 Albania     Europe    <tibble [12 x 4]> <S3: gbm>
 3 Algeria     Africa    <tibble [12 x 4]> <S3: gbm>
 4 Angola      Africa    <tibble [12 x 4]> <S3: gbm>
 5 Argentina   Americas  <tibble [12 x 4]> <S3: gbm>
 6 Australia   Oceania   <tibble [12 x 4]> <S3: gbm>
 7 Austria     Europe    <tibble [12 x 4]> <S3: gbm>
 8 Bahrain     Asia      <tibble [12 x 4]> <S3: gbm>
 9 Bangladesh  Asia      <tibble [12 x 4]> <S3: gbm>
10 Belgium     Europe    <tibble [12 x 4]> <S3: gbm>
# ... with 132 more rows

There is no error message, but this does not achieve my objective of training the three machine learning models (LM, RF, GBM) with the different formulas.

Tags: function, tidyverse

1 Answer

Stack Overflow user

Accepted answer

Posted on 2019-06-26 05:48:19

You need to think about how to store your results. Here is one way to do it. First, create the list of formulas you want to apply.

library(randomForest)
library(gbm)
library(tidyverse)

lab_formula <- as.formula("pop ~ lifeExp ")
temp_formula <- as.formula("gdpPercap ~ year")
formula_list <- list(lab_formula,temp_formula)

Create a function that returns a list of models, applying only one formula at a time.

country_model <- function(df, formula_list, index) {
    list(lm(formula = formula_list[[index]] , data = df), 
         randomForest(formula=formula_list[[index]], data = df),
         gbm(formula=formula_list[[index]], data = df, n.minobsinnode = 2))
}

Now apply it to each element of data, passing formula_list and the index of the formula to use:

df1 <- by_country %>% 
  mutate(model1 = map(data, ~country_model(., formula_list, 1)), 
         model2 = map(data, ~country_model(., formula_list, 2)))
df1

# A tibble: 142 x 5
#   country     continent data              model1     model2    
#   <fct>       <fct>     <list>            <list>     <list>    
# 1 Afghanistan Asia      <tibble [12 × 4]> <list [3]> <list [3]>
# 2 Albania     Europe    <tibble [12 × 4]> <list [3]> <list [3]>
# 3 Algeria     Africa    <tibble [12 × 4]> <list [3]> <list [3]>
# 4 Angola      Africa    <tibble [12 × 4]> <list [3]> <list [3]>
# 5 Argentina   Americas  <tibble [12 × 4]> <list [3]> <list [3]>
# 6 Australia   Oceania   <tibble [12 × 4]> <list [3]> <list [3]>
# 7 Austria     Europe    <tibble [12 × 4]> <list [3]> <list [3]>
# 8 Bahrain     Asia      <tibble [12 × 4]> <list [3]> <list [3]>
# 9 Bangladesh  Asia      <tibble [12 × 4]> <list [3]> <list [3]>
#10 Belgium     Europe    <tibble [12 × 4]> <list [3]> <list [3]>
# … with 132 more rows

Each row of model1 now holds a list of the three models fitted with formula_list[[1]], and each row of model2 holds the same three models fitted with formula_list[[2]].
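If the number of formulas grows, hard-coding model1, model2, and so on becomes tedious. The following is a minimal sketch of one alternative, not part of the original answer: it maps over a named formula list so each row stores one named list of fits. The helper name fit_all and the list names pop_model and gdp_model are hypothetical, and only lm is fitted to keep the example short.

```r
library(gapminder)
library(dplyr)
library(tidyr)
library(purrr)

# Named formulas, so the fitted models can be looked up by name later
formula_list <- list(
  pop_model = pop ~ lifeExp,
  gdp_model = gdpPercap ~ year
)

# One row per country, with that country's data in a list-column
by_country <- gapminder %>%
  group_by(country, continent) %>%
  nest() %>%
  ungroup()

# Hypothetical helper: fit one lm per formula and keep the formula names
fit_all <- function(df, formulas) {
  map(formulas, ~ lm(.x, data = df))
}

models_by_country <- by_country %>%
  mutate(models = map(data, fit_all, formulas = formula_list))

# Models for the first country, accessed by formula name
first_models <- models_by_country$models[[1]]
coef(first_models$gdp_model)
```

The same pattern should extend to randomForest and gbm by returning a nested list per formula, at the cost of a deeper structure to index into.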

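To predict with these models, the gbm model needs different handling because its predict method takes an n.trees argument. Since the function returns the models in a fixed order, we know gbm is the third model in the list and can single it out by index.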

df1 %>%
   mutate(pred= map2(data,model1, function(x, y) 
     map(seq_along(y), function(i) 
        if (i == 3) predict(y[[i]], n.trees = y[[i]]$n.trees)
        else as.numeric(predict(y[[i]])))))
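Selecting a model by its position in the list breaks if the order of the fits ever changes. A hedged alternative, sketched below and not taken from the original answer, checks the model's class instead. The helper name predict_any is hypothetical, and distribution = "gaussian" is set explicitly here as an assumption so that gbm does not have to guess it.

```r
library(gapminder)
library(gbm)

# Hypothetical helper: predict on the training data,
# passing n.trees only when the model is a gbm fit
predict_any <- function(model) {
  if (inherits(model, "gbm")) {
    as.numeric(predict(model, n.trees = model$n.trees))
  } else {
    as.numeric(predict(model))
  }
}

# Try it on a single country's data
afg <- subset(gapminder, country == "Afghanistan")
set.seed(1)
models <- list(
  lm  = lm(gdpPercap ~ year, data = afg),
  gbm = gbm(gdpPercap ~ year, data = afg,
            distribution = "gaussian", n.minobsinnode = 2)
)

# One numeric prediction vector per model
preds <- lapply(models, predict_any)
lengths(preds)
```

Inside the pipeline, the inner function could then become map(y, predict_any), with no index bookkeeping.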

Votes: 1

Original page content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/56765044
