How to fix the DataExplorer error during PCA: "Item 2 has no length"

Stack Overflow user
Asked 2018-11-24 23:28:26
Answers: 2 | Views: 1.4K | Followers: 0 | Votes: 1

I have a dataset df with 102 variables: 16 int, 80 factor, 8 logi. There are no NA values.

I have used DataExplorer before without problems, but when I run it on this dataset...

library(DataExplorer)
create_report(df)

...it proceeds fine, printing its progress...

# label: correlation_analysis
#   |................................................                 |  74%
#   ordinary text without R code

...until it reaches the PCA section, where it produces this error:

#  |..................................................               |  76%
# label: principle_component_analysis
# Quitting from lines 208-221 (report.rmd) 
#
# Error in data.table(pc = paste0("PC", seq_along(pca$sdev)), var = var_exp,  : 
#  Item 2 has no length. Provide at least one item (such as NA, NA_integer_ etc) to be repeated to match the 1 row in the longest column. Or, all columns can be 0 length, for insert()ing rows into. 

I googled this error, but only found pages explaining PCA, nothing about the error itself. Any suggestions?

Traceback:

26. stop("Item ", i, " has no length. Provide at least one item (such as NA, NA_integer_ etc) to be repeated to match the ", 
    nr, " row", if (nr > 1L) "s", " in the longest column. Or, all columns can be 0 length, for insert()ing rows into.") 
25. data.table(pc = paste0("PC", seq_along(pca$sdev)), var = var_exp, 
    pct = var_exp/sum(var_exp), cum_pct = cumsum(var_exp)/sum(var_exp)) 
24. plot_prcomp(data = structure(list(EnrollmentID = c(4603L, 8457L, 
3290L, 3323L, 6186L, 6501L, 3084L, 8662L, 7676L, 3229L, 6005L, 
3387L, 8204L, 9018L, 4517L, 3320L, 8840L, 7729L, 8835L, 5148L, 
7560L, 1239L, 5874L, 4963L, 3755L, 3397L, 9877L, 8609L, 6584L,  ... 
23. do.call(fun_name, c(list(data = data), report_config[[fun_name]])) at <text>#9
22. do_call("plot_prcomp", na_omit = TRUE) at <text>#8
21. eval(expr, envir, enclos) 
20. eval(expr, envir, enclos) 
19. withVisible(eval(expr, envir, enclos)) 
18. withCallingHandlers(withVisible(eval(expr, envir, enclos)), warning = wHandler, 
    error = eHandler, message = mHandler) 
17. handle(ev <- withCallingHandlers(withVisible(eval(expr, envir, 
    enclos)), warning = wHandler, error = eHandler, message = mHandler)) 
16. timing_fn(handle(ev <- withCallingHandlers(withVisible(eval(expr, 
    envir, enclos)), warning = wHandler, error = eHandler, message = mHandler))) 
15. evaluate_call(expr, parsed$src[[i]], envir = envir, enclos = enclos, 
    debug = debug, last = i == length(out), use_try = stop_on_error != 
        2L, keep_warning = keep_warning, keep_message = keep_message, 
    output_handler = output_handler, include_timing = include_timing) 
14. evaluate::evaluate(...) 
13. evaluate(code, envir = env, new_device = FALSE, keep_warning = !isFALSE(options$warning), 
    keep_message = !isFALSE(options$message), stop_on_error = if (options$error && 
        options$include) 0L else 2L, output_handler = knit_handlers(options$render, 
        options)) 
12. in_dir(input_dir(), evaluate(code, envir = env, new_device = FALSE, 
    keep_warning = !isFALSE(options$warning), keep_message = !isFALSE(options$message), 
    stop_on_error = if (options$error && options$include) 0L else 2L, 
    output_handler = knit_handlers(options$render, options))) 
11. block_exec(params) 
10. call_block(x) 
9. process_group.block(group) 
8. process_group(group) 
7. withCallingHandlers(if (tangle) process_tangle(group) else process_group(group), 
    error = function(e) {
        setwd(wd)
        cat(res, sep = "\n", file = output %n% "") ... 
6. process_file(text, output) 
5. knitr::knit(knit_input, knit_output, envir = envir, quiet = quiet, 
    encoding = encoding) 
4. render(input = report_dir, output_file = output_file, output_dir = output_dir, 
    intermediates_dir = output_dir, params = list(data = data, 
        report_config = config, response = y), ...) 
3. withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarning")) 
2. suppressWarnings(render(input = report_dir, output_file = output_file, 
    output_dir = output_dir, intermediates_dir = output_dir, 
    params = list(data = data, report_config = config, response = y), 
    ...)) 
1. create_report(df) 

Here is the session info:

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] car_3.0-2          knitr_1.20         rmarkdown_1.10     data.table_1.11.8 
 [5] DataExplorer_0.7.0 mosaic_1.4.0       Matrix_1.2-14      mosaicData_0.17.0 
 [9] ggformula_0.9.0    ggstance_0.3.1     mdsr_0.1.6         Lahman_6.0-0      
[13] ISLR_1.2           forcats_0.3.0      stringr_1.3.1      dplyr_0.7.8       
[17] purrr_0.2.5        readr_1.1.1        tidyr_0.8.2        tibble_1.4.2      
[21] ggplot2_3.1.0      tidyverse_1.2.1    lattice_0.20-35    carData_3.0-2     

loaded via a namespace (and not attached):
 [1] ggdendro_0.1-20  httr_1.3.1       RMySQL_0.10.15   jsonlite_1.5     splines_3.5.1   
 [6] modelr_0.1.2     assertthat_0.2.0 highr_0.7        cellranger_1.1.0 yaml_2.2.0      
[11] ggrepel_0.8.0    pillar_1.3.0     backports_1.1.2  glue_1.3.0       downloader_0.4  
[16] digest_0.6.18    rvest_0.3.2      colorspace_1.3-2 htmltools_0.3.6  plyr_1.8.4      
[21] pkgconfig_2.0.2  broom_0.5.0      haven_1.1.2      scales_1.0.0     openxlsx_4.1.0  
[26] rio_0.5.10       withr_2.1.2      lazyeval_0.2.1   cli_1.0.1        magrittr_1.5    
[31] crayon_1.3.4     readxl_1.1.0     evaluate_0.12    nlme_3.1-137     MASS_7.3-50     
[36] xml2_1.2.0       foreign_0.8-71   tools_3.5.1      hms_0.4.2        munsell_0.5.0   
[41] babynames_0.3.0  zip_1.0.0        bindrcpp_0.2.2   networkD3_0.4    compiler_3.5.1  
[46] rlang_0.3.0.1    grid_3.5.1       rstudioapi_0.8   htmlwidgets_1.3  igraph_1.2.2    
[51] labeling_0.3     mosaicCore_0.6.0 gtable_0.2.0     abind_1.4-5      DBI_1.0.0       
[56] curl_3.2         reshape2_1.4.3   R6_2.3.0         gridExtra_2.3    lubridate_1.7.4 
[61] rprojroot_1.3-2  bindr_0.1.1      stringi_1.2.4    parallel_3.5.1   Rcpp_1.0.0      
[66] dbplyr_1.2.2     tidyselect_0.2.5

Here is the output of introduce(df_dummified), as requested in the comments below:

A tibble: 1 x 9  
 rows columns discrete_columns continuous_columns  
<int>   <int>            <int>              <int>  
 9527     489                2                487  

all_missing_columns total_missing_values  
              <int>                <int>  
                  0                 7826  

complete_rows total_observations memory_usage  
        <int>              <int>        <dbl>  
         6889            4658703     18919440  

2 Answers

Stack Overflow user

Accepted answer

Posted 2018-11-28 13:03:24

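You could also skip the PCA section of the report by removing "plot_prcomp" from the create_report() configuration.

I had the same problem, and this still generated the rest of the report for me: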

library(DataExplorer)

config <- list(
  "introduce" = list(),
  "plot_str" = list(
    "type" = "diagonal",
    "fontSize" = 35,
    "width" = 1000,
    "margin" = list("left" = 350, "right" = 250)
  ),
  "plot_missing" = list(),
  "plot_histogram" = list(),
  "plot_qq" = list(sampled_rows = 1000L),
  "plot_bar" = list(),
  "plot_correlation" = list("cor_args" = list("use" = "pairwise.complete.obs")),
#  "plot_prcomp" = list(),
  "plot_boxplot" = list(),
  "plot_scatterplot" = list(sampled_rows = 1000L)
)

create_report(df, config = config)
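
If you already have a config list, the PCA entry can also be dropped by name instead of being commented out. A minimal base-R sketch, assuming the section is keyed "plot_prcomp" as in the list above; the shortened config here is only an illustration, not the full default:

```r
# Illustrative, shortened config; a real one would hold every section listed above.
config <- list(
  "introduce" = list(),
  "plot_correlation" = list("cor_args" = list("use" = "pairwise.complete.obs")),
  "plot_prcomp" = list(),
  "plot_boxplot" = list()
)

# Assigning NULL to a named list element removes that element.
config[["plot_prcomp"]] <- NULL

names(config)
```

The trimmed list can then be passed on as create_report(df, config = config).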

Votes: 4

Stack Overflow user

Posted 2018-11-25 20:12:08

PCA can only be applied to numeric data. Keep only the numeric columns for PCA and drop everything else.

nums <- unlist(lapply(df, is.numeric))
df_new <- df[, nums]

Then remove all constant columns, i.e. those with zero variance.

df_new <- df_new[, apply(df_new, 2, var) != 0]
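
The two filtering steps can also be combined into one helper. This is a sketch in base R: keep_pca_columns is a hypothetical name (not part of DataExplorer), and the na.rm = TRUE and drop = FALSE arguments are added assumptions so that missing values or a single surviving column do not break the subsetting. The toy data frame is made up for illustration.

```r
# Hypothetical helper: keep numeric columns with non-zero variance.
keep_pca_columns <- function(df) {
  # Keep numeric (integer or double) columns only; factors and logicals are dropped.
  is_num <- vapply(df, is.numeric, logical(1))
  num_df <- df[, is_num, drop = FALSE]

  # Drop columns whose variance is zero or cannot be computed (e.g. all NA).
  v <- vapply(num_df, function(x) var(x, na.rm = TRUE), numeric(1))
  num_df[, !is.na(v) & v != 0, drop = FALSE]
}

# Made-up example data with one column of each problematic kind.
toy <- data.frame(
  id = 1:5,
  constant = rep(3, 5),
  score = c(1.5, 2.0, NA, 4.0, 5.5),
  group = factor(c("a", "b", "a", "b", "a")),
  flag = c(TRUE, FALSE, TRUE, TRUE, FALSE)
)

toy_new <- keep_pca_columns(toy)
names(toy_new)
```

Here only id and score survive: constant has zero variance, and group and flag are not numeric.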

Reference: How to solve prcomp.default(): cannot rescale a constant/zero column to unit variance

Now run this. It will create a nice HTML report for you.

create_report(df_new)
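
Before generating the full report, it may be worth checking that PCA runs on the filtered data at all. The sketch below calls stats::prcomp directly on a made-up data frame standing in for df_new. Whether DataExplorer calls prcomp with exactly these arguments is an assumption, and na.omit() only loosely mirrors the na_omit = TRUE seen in the traceback.

```r
# Toy numeric data standing in for df_new (illustrative values only).
toy_new <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2),
  z = c(5, 3, 6, 2, 7, 1)
)

# prcomp() on a data frame errors on missing values, so drop incomplete rows first.
complete <- na.omit(toy_new)

# scale. = TRUE fails on constant columns, which is why they were removed earlier.
pca <- prcomp(complete, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component.
var_exp <- pca$sdev^2
var_exp / sum(var_exp)
```

If pca$sdev comes back with one value per column, the PCA step has usable input. An empty pca$sdev would match the zero-length var column named in the error message.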
Votes: 3
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/53463300
