Q: enet() works, but not when run via caret::train()

Stack Overflow user

Asked 2013-10-01 17:17:41

1 answer · 3.9K views · 0 followers · 7 votes

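I'm trying to use elastic net, starting with the lasso and going from there. I can get it to run directly, but when I try to run the same parameters through train in the caret package, it fails. I'd like to get train working so that I can use it to evaluate model parameters.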

# Works
test <- enet( x=x, y=y, lambda=0, trace=TRUE, normalize=FALSE, intercept=FALSE )
# Doesn't
enetGrid <- data.frame(.lambda=0,.fraction=c(.01,.001,.0005,.0001))
ctrl <- trainControl( method="repeatedcv", repeats=5 )
> test2 <- train( x, y, method="enet", tuneGrid=enetGrid, trControl=ctrl, preProc=NULL )
  fraction lambda RMSE Rsquared RMSESD RsquaredSD
1    1e-04      0  NaN      NaN     NA         NA
2    5e-04      0  NaN      NaN     NA         NA
3    1e-03      0  NaN      NaN     NA         NA
4    1e-02      0  NaN      NaN     NA         NA
Error in train.default(x, y, method = "enet", tuneGrid = enetGrid, trControl = ctrl,  : 
  final tuning parameters could not be determined
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
...
50: In eval(expr, envir, enclos) :
  model fit failed for Fold10.Rep5: lambda=0, fraction=0.01 Error in enet(as.matrix(trainX), trainY, lambda = lmbda) : 
  Some of the columns of x have zero variance

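Note that any collinearity in the example above is only a result of subsetting the data for a reproducible example (1,000 rows versus 208,000 rows in the real data set).

I have checked the full data set in various ways, including findLinearCombos. Note that several hundred of the variables are dummy-coded clinical diagnoses, and are therefore binary with a low proportion of 1s.

How do I get train(..., method="enet") to run using the exact same settings as enet()?

Data for reproducibility, sessionInfo, etc.

Sample data x and y are available here.

sessionInfo() results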

R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C            LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C        LC_PAPER=C          
 [8] LC_NAME=C            LC_ADDRESS=C         LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 

attached base packages:
 [1] parallel  splines   grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] scales_0.2.3        elasticnet_1.1      fscaret_0.8.5.3     gsubfn_0.6-5        proto_0.3-10        lars_1.2            caret_5.17-7       
 [8] foreach_1.4.1       cluster_1.14.4      lubridate_1.3.0     HH_2.3-37           reshape_0.8.4       latticeExtra_0.6-24 leaps_2.9          
[15] multcomp_1.2-18     perturb_2.05        Zelig_4.2-0         sandwich_2.2-10     zoo_1.7-10          survey_3.29-5       Hmisc_3.12-2       
[22] survival_2.37-4     lme4_0.999999-2     bayesm_2.2-5        stargazer_4.0       pscl_1.04.4         vcd_1.2-13          colorspace_1.2-2   
[29] mvtnorm_0.9-9995    car_2.0-18          nnet_7.3-7          gdata_2.13.2        gtools_3.0.0        spBayes_0.3-7       Formula_1.1-1      
[36] magic_1.5-4         abind_1.4-0         MapGAM_0.6-2        gam_1.08            fields_6.7.6        maps_2.3-2          spam_0.29-3        
[43] FNN_1.0             spatstat_1.31-3     mgcv_1.7-24         rgeos_0.2-19        RArcInfo_0.4-12     automap_1.0-12      gstat_1.0-16       
[50] SDMTools_1.1-13     rgdal_0.8-10        spdep_0.5-60        coda_0.16-1         deldir_0.0-22       maptools_0.8-25     nlme_3.1-110       
[57] MASS_7.3-27         Matrix_1.0-12       lattice_0.20-15     boot_1.3-9          data.table_1.8.8    xtable_1.7-1        RCurl_1.95-4.1     
[64] bitops_1.0-5        RColorBrewer_1.0-5  testthat_0.7.1      codetools_0.2-8     devtools_1.3        stringr_0.6.2       foreign_0.8-54     
[71] ggplot2_0.9.3.1     sp_1.0-11           taRifx_1.0.5        reshape2_1.2.2      plyr_1.8            functional_0.4      R.utils_1.25.2     
[78] R.oo_1.13.9         R.methodsS3_1.4.4  

loaded via a namespace (and not attached):
 [1] LearnBayes_2.12  compiler_3.0.1   dichromat_2.0-0  digest_0.6.3     evaluate_0.4.4   gtable_0.1.2     httr_0.2         intervals_0.14.0 iterators_1.0.6 
[10] labeling_0.2     memoise_0.1      munsell_0.4.2    rpart_4.1-1      spacetime_1.0-5  stats4_3.0.1     tcltk_3.0.1      tools_3.0.1      whisker_0.3-2   
[19] xts_0.9-5

Update

Running on a 15% sample of the data set:

Warning in eval(expr, envir, enclos) :
  model fit failed for Fold10.Rep1: lambda=0, fraction=0.005
... (more of the same warning messages) ...
Warning in nominalTrainWorkflow(dat = trainData, info = trainInfo, method = method,  :
  There were missing values in resampled performance measures.
Error in if (lambda > 0) { : argument is of length zero
Calls: train ... train.default -> system.time -> createModel -> enet

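The x matrix has 806 columns, 801 of which are dummies. Many of these dummies are very sparse (1-3 observations out of the roughly 25k rows), while others have 0.1%-5% of their values TRUE. In total there are 108,867 TRUE values and 21 million FALSE values.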
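
One plausible reading of the "zero variance" failures, given how sparse the dummies are, is that a column can vary over the full data yet be constant within a resampled training subset. The sketch below illustrates that effect on hypothetical toy data (x_toy and folds are illustrative names), with a plain base-R fold assignment standing in for caret's resampling. It is not a reproduction of the real data or of what train() does internally.

```r
# Hypothetical toy data: 100 rows, one dense column and one dummy with a single 1.
set.seed(1)
n <- 100
x_toy <- cbind(dense = rnorm(n), sparse = 0)
x_toy[37, "sparse"] <- 1

# Assign each row to one of 10 folds (a base-R stand-in for caret's resampling).
folds <- sample(rep(1:10, length.out = n))

# For each fold, the training rows are all rows NOT in that fold;
# count how many columns have zero variance on those rows.
zero_var_per_fold <- sapply(1:10, function(k) {
  train_rows <- folds != k
  sum(apply(x_toy[train_rows, , drop = FALSE], 2, var) == 0)
})

# The whole-data check finds no constant column ...
full_data_zero_var <- sum(apply(x_toy, 2, var) == 0)
print(full_data_zero_var)

# ... but the fold that holds out row 37 trains on an all-zero `sparse` column.
print(zero_var_per_fold)
```

With only one to three nonzero entries per column, as described above, a check on the full data set can pass while individual resamples still hit a constant column.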

machine-learning

r-caret

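1 Answer

Stack Overflow user

Accepted answer

Answered 2013-10-12 13:29:19

For the sake of resolving this: I now have it working. I dropped all columns with fewer than 20 TRUEs (remember, this is out of 200k observations), since they do not carry enough information to be useful. That accounted for about half of the columns.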
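
The filtering step described above could look roughly like the following. This is a minimal sketch on hypothetical toy data, not the code actually used: x_toy, min_true, and the column names are made up for illustration, and a dummy is assumed to be any column containing only 0 and 1.

```r
# Hypothetical toy matrix: 2 continuous columns plus 3 dummies of varying sparsity.
set.seed(2)
n <- 1000
x_toy <- cbind(
  matrix(rnorm(n * 2), ncol = 2, dimnames = list(NULL, c("cont1", "cont2"))),
  rare   = as.numeric(seq_len(n) <= 3),    # 3 ones
  medium = as.numeric(seq_len(n) <= 25),   # 25 ones
  common = as.numeric(seq_len(n) <= 400)   # 400 ones
)

min_true <- 20  # threshold mentioned in the answer

# Assumption: a column is treated as a dummy if it contains only 0 and 1.
is_dummy <- apply(x_toy, 2, function(col) all(col %in% c(0, 1)))
n_true   <- colSums(x_toy == 1)

# Keep every non-dummy column, and every dummy with at least min_true ones.
keep <- !is_dummy | n_true >= min_true
x_filtered <- x_toy[, keep, drop = FALSE]
print(colnames(x_filtered))
```

caret also provides nearZeroVar(), which flags columns using a frequency ratio and the percentage of unique values. That is a different criterion from the fixed count used here, so the two may not select the same columns.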

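Going forward I will have to be careful that these sparse columns do not introduce too much bias and so on, but my hope is that with methods that perform variable selection (lasso, RF, etc.) it will not be much of a problem.

Thanks to @O_Devinyak for the help.

2 votes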

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/19122617
