文章/答案/技术大牛

发布

问如何规范model.matrix？
EN

Stack Overflow用户

提问于 2015-06-18 14:20:02

回答 3查看 2.5K关注 0票数 3

# first, create your data.frame
mydf <- data.frame(a = c(1,2,3), b = c(1,2,3), c = c(1,2,3))

# then, create your model.matrix
mym <- model.matrix(as.formula("~ a + b + c"), mydf)

# how can I normalize the model.matrix?

目前，为了运行我的规范化函数，我必须将我的model.matrix转换回data.frame：

normalize <- function(x) { return ((x - min(x)) / (max(x) - min(x))) }
m.norm <- as.data.frame(lapply(m, normalize))

通过简单的标准化model.matrix来避免这一步吗？

model.matrix

回答 3

Stack Overflow用户

回答已采纳

发布于 2015-06-18 14:27:28

您可以将每一列规范化，而不必转换为具有apply函数的数据帧：

apply(mym, 2, normalize)
#   (Intercept)   a   b   c
# 1         NaN 0.0 0.0 0.0
# 2         NaN 0.5 0.5 0.5
# 3         NaN 1.0 1.0 1.0

实际上，您可能想让截取保持不动，比如：

cbind(mym[,1,drop=FALSE], apply(mym[,-1], 2, normalize))
#   (Intercept)   a   b   c
# 1           1 0.0 0.0 0.0
# 2           1 0.5 0.5 0.5
# 3           1 1.0 1.0 1.0

票数 3

Stack Overflow用户

发布于 2015-06-18 14:50:19

另一种选择是使用非常有用的matrixStats包将其矢量化(尽管apply在矩阵上和在列上应用时通常也非常有效)。这样，您也可以保留原始数据结构。

library(matrixStats)
Max <- colMaxs(mym[, -1]) 
Min <- colMins(mym[, -1])
mym[, -1] <- (mym[, -1] - Min)/(Max - Min)
mym
#   (Intercept)   a   b   c
# 1           1 0.0 0.0 0.0
# 2           1 0.5 0.5 0.5
# 3           1 1.0 1.0 1.0
# attr(,"assign")
# [1] 0 1 2 3

票数 3

Stack Overflow用户

发布于 2015-06-18 16:12:25

如果您想在某种意义上“正常化”，您只需使用scale函数，该函数将std.dev设置为1。

> scale( mym )
  (Intercept)  a  b  c
1         NaN -1 -1 -1
2         NaN  0  0  0
3         NaN  1  1  1
attr(,"assign")
[1] 0 1 2 3
attr(,"scaled:center")
(Intercept)           a           b           c 
          1           2           2           2 
attr(,"scaled:scale")
(Intercept)           a           b           c 
          0           1           1           1 
> mym
  (Intercept) a b c
1           1 1 1 1
2           1 2 2 2
3           1 3 3 3
attr(,"assign")
[1] 0 1 2 3

正如你所看到的，当“拦截”术语出现时，“规范化”所有的模型矩阵是没有意义的。所以你可以这样做：

> mym[ , -1 ] <- scale( mym[,-1] )
> mym
  (Intercept)  a  b  c
1           1 -1 -1 -1
2           1  0  0  0
3           1  1  1  1
attr(,"assign")
[1] 0 1 2 3

如果您的默认对比度选项设置为"contr.sum“，并且列是因子类型，则这实际上是模型矩阵。只有当要“规范化”的变量是因素时，这才会被接受为内部到model.matrix操作：

> mym <- model.matrix(as.formula("~ a + b + c"), mydf, contrasts.arg=list(a="contr.sum"))
Error in `contrasts<-`(`*tmp*`, value = contrasts.arg[[nn]]) : 
  contrasts apply only to factors
> mydf <- data.frame(a = factor(c(1,2,3)), b = c(1,2,3), c = c(1,2,3))
> mym <- model.matrix(as.formula("~ a + b + c"), mydf, contrasts.arg=list(a="contr.sum"))
> mym
  (Intercept) a1 a2 b c
1           1  1  0 1 1
2           1  0  1 2 2
3           1 -1 -1 3 3
attr(,"assign")
[1] 0 1 1 2 3
attr(,"contrasts")
attr(,"contrasts")$a
[1] "contr.sum"

票数 2

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/30917984

复制

相似问题

问如何规范model.matrix？
EN

回答 3

Stack Overflow用户

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问如何规范model.matrix？EN

回答 3

Stack Overflow用户

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问如何规范model.matrix？
EN