
Error in eval(predvars, data, env): object 'oly.success' not found in a regression model

Stack Overflow user
Asked 2021-09-29 10:09:59
Answers: 1 · Views: 38 · Followers: 0 · Votes: 0

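I have looked into this error, and some people suggest that renaming columns might fix it. However, I cannot work out which column is causing the problem.

My code: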

library(Amelia)
library(corrplot)
library(GGally)
library(caret)

data <- asianmen_100.free
summary(data)

#remove unwanted variables
reject_vars <- names(data) %in% c("firstname","lastname","country","Event","Pool.Length","Competition",
                                  "Comp.Country","name","DOB","Date","mins","secs","minsAsSecDuration","earliest_date",
                                  "Final_Medal","Time","secsAsDuration")

data.new <- data[!reject_vars]
data.new$Age. <- as.numeric(data.new$Age.)


#Remove Target variables
remove_vars <- names(data.new) %in% c("oly_success") 
data.new <- data.new[!remove_vars]


ggcorr(data.new, label = TRUE)


# find variables that have higher cross-correlation
M <- data.matrix(data.new)
corrM <- cor(M)
highlyCorrM <- findCorrelation(corrM, cutoff=0.5)
names(data.new)[highlyCorrM]


#sample size
smp_size <- floor(2/3 * nrow(data.new)) 
set.seed(2)


#sample dataset
data.new <- data.new[sample(nrow(data.new)), ]
data.train <- data.new[1:smp_size, ]
data.test <- data.new[(smp_size+1):nrow(data.new), ]


#model building

formula = oly_success ~ .

rmodel <-  glm(formula = formula, 
               data=data.train, 
               family=binomial(link="logit")) 
  
summary(rmodel)   

Here is the data:

> head(data.new)
# A tibble: 6 x 8
   Age. timeAsDuration Success oly_success first_appear.age first_oly.age age_diff total_medal
  <dbl> <Duration>       <dbl>       <dbl>            <dbl>         <dbl>    <dbl>       <dbl>
1    20 49.37s               0           0               17            NA       NA           1
2    21 49.8s                0           0               21            NA       NA           0
3    16 57.75s               0           0               16            NA       NA           0
4    20 51.42s               0           0               17            NA       NA           0
5    21 51.01s               0           0               16            NA       NA           2
6    NA 54.11s               0           0               NA            NA       NA           0

Sample data:

> dput(data.new[1:10,])
structure(list(Age. = c(20, 21, 16, 20, 21, NA, 19, 25, 26, 24
), timeAsDuration = new("Duration", .Data = c(49.37, 49.8, 57.75, 
51.42, 51.01, 54.11, 50.88, 57.69, 51.49, 49.97)), Success = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0), oly_success = c(0, 0, 0, 0, 0, 0, 
0, 1, 0, 0), first_appear.age = c(17, 21, 16, 17, 16, NA, 19, 
25, 25, 23), first_oly.age = c(NA, NA, NA, NA, NA, NA, NA, 26, 
NA, NA), age_diff = c(NA, NA, NA, NA, NA, NA, NA, 1, NA, NA), 
    total_medal = c(1, 0, 0, 0, 2, 0, 0, 0, 0, 1)), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))

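I have already tried renaming some of the columns, and renaming the target variable to oly.success, but it still does not work. Where am I going wrong?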


1 Answer

Stack Overflow user

Accepted answer

Posted 2021-09-29 10:30:51

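There are two problems. First, in the output of dput(data.new) the target variable is named oly_success, whereas your formula uses oly.success. Second, you remove the target variable from data.new before fitting the model, with these commands: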

#Remove Target variables
remove_vars <- names(data.new) %in% c("oly_success") 
data.new <- data.new[!remove_vars]
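
To see why this produces the error, here is a minimal sketch on a made-up data frame (the name `toy` and all of its values are hypothetical; only the column names follow the question). Once the target column has been dropped, `glm()` has nothing to evaluate the left-hand side of the formula against, so it fails while building the model frame:

```r
# Hypothetical toy data: column names mirror the question, values are invented.
toy <- data.frame(
  Age.        = c(20, 21, 16, 20, 21, 19, 25, 26),
  total_medal = c(1, 0, 0, 0, 2, 0, 0, 1),
  oly_success = c(0, 0, 0, 1, 0, 1, 0, 1)
)

# Drop the target column, exactly as the question's code does.
remove_vars <- names(toy) %in% c("oly_success")
toy_no_target <- toy[!remove_vars]

# The formula still names oly_success, so glm() cannot find it in the data.
# tryCatch() captures the error message instead of stopping the script.
err <- tryCatch(
  glm(oly_success ~ ., data = toy_no_target,
      family = binomial(link = "logit")),
  error = function(e) conditionMessage(e)
)
print(err)

# A quick sanity check to run before fitting: is the response still a column?
target_present <- "oly_success" %in% names(toy_no_target)
print(target_present)
```

Running the `%in%` check on your real data right before the `glm()` call is a cheap way to confirm whether the response column survived the earlier preprocessing steps.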

If you fix these two errors, your code works:

library(Amelia)
library(corrplot)
library(GGally)
library(caret)
   
ggcorr(data.new, label = TRUE)


# find variables that have higher cross-correlation
M <- data.matrix(data.new)
corrM <- cor(M)
highlyCorrM <- findCorrelation(corrM, cutoff=0.5)
names(data.new)[highlyCorrM]


#sample size
smp_size <- floor(2/3 * nrow(data.new)) 
set.seed(2)


#sample dataset
data.new <- data.new[sample(nrow(data.new)), ]
data.train <- data.new[1:smp_size, ]
data.test <- data.new[(smp_size+1):nrow(data.new), ]


#model building
rmodel <-  glm(formula = oly_success ~ ., 
               data=data.new, #I use the entire dataset because the training one does not have all the levels for the logistic regression, since the example dataset is too small
               family=binomial(link="logit")) 

summary(rmodel)  
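
If you still want to leave the target out of the correlation step, one option is to build a separate predictors-only object rather than overwriting data.new. The following is only a sketch under assumptions: `toy`, `predictors`, `toy_train` and `toy_test` are hypothetical names, the data is simulated, and it uses base R only (no caret or GGally). It also checks that the training rows contain both outcome classes, which is the reason I fitted on the full data above.

```r
set.seed(2)

# Hypothetical simulated data; column names follow the question.
n <- 60
toy <- data.frame(
  Age.        = round(runif(n, 16, 28)),
  total_medal = rpois(n, 1),
  oly_success = rep(c(0, 1), length.out = n)
)

# Correlations are computed on the predictors only;
# toy itself keeps the target column for modelling.
predictors <- toy[setdiff(names(toy), "oly_success")]
corrM <- cor(data.matrix(predictors))

# Same 2/3 train/test split as in the question.
smp_size  <- floor(2/3 * nrow(toy))
shuffled  <- toy[sample(nrow(toy)), ]
toy_train <- shuffled[1:smp_size, ]
toy_test  <- shuffled[(smp_size + 1):nrow(shuffled), ]

# A logistic regression needs both outcome classes in the training rows.
both_classes <- length(unique(toy_train$oly_success)) == 2
stopifnot(both_classes)

rmodel <- glm(oly_success ~ ., data = toy_train,
              family = binomial(link = "logit"))
summary(rmodel)
```

With a very small sample, such as the ten rows you posted, the `both_classes` check can easily fail for the training subset, so it is worth keeping in front of the `glm()` call.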
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/69374342
