Q: Factoring for linear models: creating an lm with one factor

Stack Overflow user

Asked 2015-10-15 08:13:12

1 answer, 133 views, 0 followers, 0 votes

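This question is a more specific and simplified version of this one.

The data set I am using is too large for a single lm or speedlm computation.

I want to split my data set into smaller parts, but when I do, one (or more) of the columns ends up containing only a single factor level.

The code below is the minimal code needed to reproduce my problem. At the bottom of the question I include my test script for those who are interested.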

library(speedglm)

iris$Species <- factor(iris$Species)
i <- iris[1:20,]
summary(i)
speedlm(Sepal.Length ~ Sepal.Width + Species , i)

This results in the following error:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

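I tried converting iris$Species to a factor, but that did not help. I really don't know how to solve this at the moment.

How can I include Species in the model? (without increasing the sample size)

Edit:

I know that I only have one level, "setosa", but I still need to include it in the linear model, because I will eventually update the model with more levels, as shown in the example script below.

For those who are interested, here is an example script showing what I will use for my actual data set: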
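The cause can be seen with base R alone. The sketch below assumes that speedlm builds its model frame the same way lm does (the identical error message suggests so, but that is an assumption): the column is already a factor, and the first 20 rows simply contain a single observed level.

```r
# Base R only; speedglm is not needed to see the cause.
i <- iris[1:20, ]

is.factor(i$Species)             # already a factor, so calling factor() again changes nothing useful
nlevels(i$Species)               # subsetting keeps all declared levels
nlevels(droplevels(i$Species))   # but only one level actually occurs in these rows

# lm() drops unused levels when building the model frame,
# so the same "contrasts" error is raised here.
res <- tryCatch(
  lm(Sepal.Length ~ Sepal.Width + Species, data = i),
  error = function(e) conditionMessage(e)
)
print(res)
```

With only one observed level there is nothing to contrast against, which is why re-factoring the column cannot help on its own.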

library(speedglm)

testfunction <- function(start.i, end.i) {
  return(iris[start.i:end.i,])
}

  lengthdata <- nrow(iris)
  stepsize <- 20

## attempt to factor
  iris$Species <- factor(iris$Species)

## Creates the iris dataset in split parts
  start.i <- seq(0, lengthdata, stepsize)
  end.i   <- pmin(start.i + stepsize, lengthdata)

  dat <- Map(testfunction, start.i + 1, end.i)

## Loops through the split iris data
  for (i in dat) {
    if (!exists("lmfit")) {
      lmfit  <- speedlm(Sepal.Length ~ Sepal.Width + Species , i)
    } else if (!exists("lmfit2")) {
      lmfit2 <- updateWithMoreData(lmfit, i)
    } else {
      lmfit2 <- updateWithMoreData(lmfit2, i)
    }
  }
  print(summary(lmfit2))

factoring

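1 Answer

Stack Overflow user

Accepted answer

Answered 2015-10-15 10:27:50

There may be a better way, but if you reorder your rows, each split will contain more levels and therefore will not cause the error. I used a random order, but you may want to do it in a more systematic way.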

library(speedglm)

testfunction <- function(start.i, end.i) {
    return(iris.r[start.i:end.i,])
}

lengthdata <- nrow(iris)
stepsize <- 20

## attempt to factor
iris$Species <- factor(iris$Species)

##Random order
set.seed(1)
iris.r <- iris[sample(nrow(iris)),]

## Creates the iris dataset in split parts
start.i <- seq(0, lengthdata, stepsize)
end.i   <- pmin(start.i + stepsize, lengthdata)

dat <- Map(testfunction, start.i + 1, end.i)

## Loops through the split iris data
for (i in dat) {
    if (!exists("lmfit")) {
        lmfit  <- speedlm(Sepal.Length ~ Sepal.Width + Species , i)
    } else if (!exists("lmfit2")) {
        lmfit2 <- updateWithMoreData(lmfit, i)
    } else {
        lmfit2 <- updateWithMoreData(lmfit2, i)
    }
}
print(summary(lmfit2))
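Reshuffling only helps if every chunk really ends up with the levels the model needs. A quick sanity check, sketched here with base R only (the names `chunk.id` and `levels.per.chunk` are illustrative and not part of the answer), counts how many Species levels occur in each 20-row chunk of the shuffled data:

```r
# Shuffle as in the answer, then count the Species levels present per chunk.
set.seed(1)
iris.r <- iris[sample(nrow(iris)), ]

stepsize <- 20
# Rows 1-20 -> chunk 1, rows 21-40 -> chunk 2, ..., rows 141-150 -> chunk 8
chunk.id <- ceiling(seq_len(nrow(iris.r)) / stepsize)

levels.per.chunk <- tapply(
  iris.r$Species, chunk.id,
  function(x) nlevels(droplevels(x))
)
print(levels.per.chunk)
```

A chunk that reports fewer levels than the full data would produce a different set of dummy columns, which may not be compatible with the earlier fit, so it is worth running this before the update loop. The last chunk is shorter than the others and is the most likely to miss a level.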

Edit: instead of a random order, you can use modular arithmetic to systematically generate a spread-out index vector:

spred.i <- seq(1, by = 7, length.out = 150) %% 150 + 1
iris.r <- iris[spred.i,]
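Two properties of this index are worth confirming before relying on it: that it visits every row exactly once, and how many Species levels each chunk receives. The following is a minimal check in base R, assuming the same 20-row chunks as in the script above (`chunk.id` and `levels.per.chunk` are illustrative names):

```r
# Index from the answer: step through the rows in strides of 7, wrapping at 150.
spred.i <- seq(1, by = 7, length.out = 150) %% 150 + 1
iris.r <- iris[spred.i, ]

# 7 and 150 share no common factor, so every row should appear exactly once.
is.permutation <- all(sort(spred.i) == seq_len(150))
print(is.permutation)

# Count the Species levels that actually occur in each 20-row chunk.
chunk.id <- ceiling(seq_along(spred.i) / 20)
levels.per.chunk <- tapply(
  iris.r$Species, chunk.id,
  function(x) nlevels(droplevels(x))
)
print(levels.per.chunk)
```

The stride must stay coprime with the number of rows for the index to remain a permutation. The final chunk holds only 10 rows, so it can cover fewer levels than the full-size chunks; choosing a chunk size that divides the row count evenly is one way to avoid a short tail.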

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/33143257
