tidymodels: "The following required column is missing from `new_data` in step"

Stack Overflow user
Asked 2022-10-27 01:19:48
Answers: 1 · Views: 62 · Followers: 0 · Votes: 0

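I am creating and fitting a workflow for a lasso regression model in {tidymodels}. The model fits fine, but when I try to predict on the test set I get an error saying "The following required column is missing from `new_data`". The column ("price") is present in both the training set and the test set. Is this a bug? What am I missing?

Any help would be greatly appreciated.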

library(tidymodels)

# split the data (target variable in house_sales_df is "price")
split <- initial_split(house_sales_df, prop = 0.8)
train <- split %>% training()
test <-  split %>% testing()

# create and fit workflow
lasso_prep_recipe <-
  recipe(price ~ ., data = train) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_numeric())

lasso_model <- 
  linear_reg(penalty = 0.1, mixture = 1) %>% 
  set_engine("glmnet")

lasso_workflow <- workflow() %>% 
  add_recipe(lasso_prep_recipe) %>% 
  add_model(lasso_model)

lasso_fit <- lasso_workflow %>% 
  fit(data = train)

# predict test set
predict(lasso_fit, new_data = test)

predict() results in this error:

Error in `step_normalize()`:
! The following required column is missing from `new_data` in step 'normalize_MXQEf': price.
Backtrace:
  1. stats::predict(lasso_fit, new_data = test, type = "numeric")
  2. workflows:::predict.workflow(lasso_fit, new_data = test, type = "numeric")
  3. workflows:::forge_predictors(new_data, workflow)
  5. hardhat:::forge.data.frame(new_data, blueprint = mold$blueprint)
  7. hardhat:::run_forge.default_recipe_blueprint(...)
  8. hardhat:::forge_recipe_default_process(...)
 10. recipes:::bake.recipe(object = rec, new_data = new_data)
 12. recipes:::bake.step_normalize(step, new_data = new_data)
 13. recipes::check_new_data(names(object$means), object, new_data)
 14. cli::cli_abort(...)

1 Answer

Stack Overflow user

Accepted answer

Answered 2022-10-27 02:23:48

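You are getting the error because all_numeric() in step_normalize() selects the outcome price, which is not available at prediction time. predict() passes only the predictor columns of new_data through the recipe, so the step cannot find price even though the test set contains it. Use all_numeric_predictors() instead and you should be fine.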

library(tidymodels)

# split the data (target variable in house_sales_df is "price")
split <- initial_split(house_sales_df, prop = 0.8)
train <- split %>% training()
test <-  split %>% testing()

# create and fit workflow
lasso_prep_recipe <-
  recipe(price ~ ., data = train) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_numeric_predictors())

lasso_model <- 
  linear_reg(penalty = 0.1, mixture = 1) %>% 
  set_engine("glmnet")

lasso_workflow <- workflow() %>% 
  add_recipe(lasso_prep_recipe) %>% 
  add_model(lasso_model)

lasso_fit <- lasso_workflow %>% 
  fit(data = train)

# predict test set
predict(lasso_fit, new_data = test)
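
To see the difference between the two selectors without the original data, here is a minimal sketch. It assumes the built-in mtcars data set as a stand-in for house_sales_df, with mpg playing the role of the outcome price; neither comes from the question. It lists which columns each step_normalize() learned statistics for, then bakes new data that lacks the outcome, which is roughly the situation the recipe is in at prediction time.

```r
library(recipes)

# Assumption: mtcars stands in for house_sales_df, and mpg for the outcome `price`.
rec_all <- recipe(mpg ~ ., data = mtcars) %>%
  step_normalize(all_numeric()) %>%
  prep()

rec_pred <- recipe(mpg ~ ., data = mtcars) %>%
  step_normalize(all_numeric_predictors()) %>%
  prep()

# Columns each normalize step computed means and standard deviations for
cols_all <- unique(tidy(rec_all, number = 1)$terms)
cols_pred <- unique(tidy(rec_pred, number = 1)$terms)

"mpg" %in% cols_all   # the outcome was selected by all_numeric()
"mpg" %in% cols_pred  # the outcome was left alone

# New data without the outcome column, as at prediction time
new_obs <- mtcars[, names(mtcars) != "mpg"]

# The all_numeric() recipe cannot be baked without the outcome; capture the error
res_all <- tryCatch(bake(rec_all, new_data = new_obs), error = function(e) e)

# The all_numeric_predictors() recipe bakes the same data without complaint
res_pred <- bake(rec_pred, new_data = new_obs)
```

If the outcome itself needs to be rescaled, that is usually done outside the recipe steps applied at prediction time; the sketch above only covers the predictor-only selector recommended in this answer.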
Votes: 1
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/74215751
