I am trying to run a linear regression on weighted data.
When I use speedlm, I get an error if the data contain missing values.
library(speedglm)
sampleData <- data.frame(w = round(runif(12,0,1)),
target = rnorm(12,100,50),
predictor = c(NA, rnorm(10, 40, 10),NA))
summary(sampleData)

       w              target          predictor
 Min.   :0.0000   Min.   : -3.381   Min.   :22.58
 1st Qu.:0.0000   1st Qu.: 48.321   1st Qu.:30.45
 Median :1.0000   Median : 84.156   Median :37.09
 Mean   :0.5833   Mean   : 92.306   Mean   :35.03
 3rd Qu.:1.0000   3rd Qu.:119.891   3rd Qu.:41.96
 Max.   :1.0000   Max.   :223.896   Max.   :43.48
                                    NA's   :2

#run linear regression without weights
linearNoWeights <- lm(formula("target~predictor"), data = sampleData)
speedLinearNoWeights <- speedlm(formula("target~predictor"), data = sampleData)
#run linear regression with weights
linearWithWeights <- lm(formula("target~predictor"), data = sampleData, weights =sampleData[,"w"] )
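speedLinearWithWheights <- speedlm(formula("target~predictor"), data = sampleData, weights =sampleData[,"w"] )

Error in base::crossprod(x, y) : non-conformable arguments
In addition: Warning messages:
1: In sqw * X :
  longer object length is not a multiple of shorter object length
2: In sqw * y :
  longer object length is not a multiple of shorter object length
Called from: base::crossprod(x, y)
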
Is there a way to do this that does not force me to fix the data before running the regression?
Posted on 2016-11-22 08:47:19
You should try changing the na.action option. Below is the code that ran for me after I set na.action to na.exclude (or na.omit).
library(speedglm)
sampleData <- data.frame(w = round(runif(12,0,1)),
target = rnorm(12,100,50),
predictor = c(NA, rnorm(10, 40, 10),NA))
summary(sampleData)
linearNoWeights <- lm(formula("target~predictor"), data = sampleData)
speedLinearNoWeights <- speedlm(formula("target~predictor"), data = sampleData)
options(na.action="na.exclude") # or "na.omit"
linearNoWeights <- lm(formula("target~predictor"), data = sampleData)
speedLinearNoWeights <- speedlm(formula("target~predictor"), data = sampleData)

You can check the documentation for na.omit and na.exclude to see when to use which. Hope this helps.
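
To make the na.omit / na.exclude distinction concrete, here is a minimal sketch using base lm. It also shows one possible workaround for the weighted case, based on my reading of the warnings in the question (sqw * X with mismatched lengths): the weights vector still has 12 entries while the incomplete rows have been dropped from the model matrix. That diagnosis is an assumption, not something confirmed against the speedglm source. The names fitOmit, fitExclude, completeRows and cleanData, and the set.seed call, are introduced here only for illustration.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

sampleData <- data.frame(w = round(runif(12, 0, 1)),
                         target = rnorm(12, 100, 50),
                         predictor = c(NA, rnorm(10, 40, 10), NA))

# na.omit: incomplete rows are dropped, and residuals cover only the rows used
fitOmit <- lm(target ~ predictor, data = sampleData, na.action = na.omit)

# na.exclude: same rows are used for fitting, but residuals and fitted
# values are padded with NA so they line up with the original data
fitExclude <- lm(target ~ predictor, data = sampleData, na.action = na.exclude)

length(residuals(fitOmit))      # one residual per complete row
length(residuals(fitExclude))   # one entry per original row, NA where dropped

# Possible workaround for the weighted fit: keep only complete rows, then
# take the weights from the same subset so the lengths agree
completeRows <- complete.cases(sampleData[, c("w", "target", "predictor")])
cleanData <- sampleData[completeRows, ]

linearWithWeights <- lm(target ~ predictor, data = cleanData, weights = w)

# Only attempted when speedglm is installed
if (requireNamespace("speedglm", quietly = TRUE)) {
  speedLinearWithWeights <- speedglm::speedlm(target ~ predictor,
                                              data = cleanData,
                                              weights = cleanData$w)
}
```

The coefficients from fitOmit and fitExclude are the same; only the shape of the residuals and fitted values differs. The subset step does not modify sampleData itself, so it is a fairly light form of "fixing the data".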
https://stackoverflow.com/questions/40736719