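Finding the percentage of missing values in each column in parallel, using the bigmemory and parallel packages in R

Stack Overflow user
Asked 2014-05-11 21:59:31
Answers: 1 · Views: 279 · Followers: 0 · Votes: 0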

Here is what I am trying to do:

> library(parallel)
> library(bigmemory)
> big.mat=read.big.matrix("cp2006.csv",header=T)
Warning messages:
1: In na.omit(as.integer(firstLineVals)) : NAs introduced by coercion
2: In na.omit(as.double(firstLineVals)) : NAs introduced by coercion
3: In read.big.matrix("cp2006.csv", header = T) :
  Because type was not specified, we chose double based on the first line of data.
> jobs <- lapply(1:10, function(x) mcparallel(colMeans(is.na(big.mat))*100, name = big.mat))
Error in as.character.default(name) : 
  no method for coercing this S4 class to a vector
> res  <- mccollect(jobs)

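The problem, however, is that is.na apparently does not work on a big.matrix object. I searched the web and found mwhich, bigmemory's which-like function, but unfortunately I could not find a good tutorial on using it to find the missing (NA) values in a column. So I am not sure which function I should pass to mcparallel to make it work with a big.matrix object. In addition: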

> col.NA.mean<-colMeans(is.na(big.mat))*100
Error in colMeans(is.na(big.mat)) : 
  'x' must be an array of at least two dimensions
In addition: Warning message:
In is.na(big.mat) : is.na() applied to non-(list or vector) of type 'S4'

1 Answer

Stack Overflow user

Accepted answer

Answered 2014-05-11 23:14:53

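I figured it out. When we use big.mat we have to index it with [,], which extracts its contents as an ordinary R matrix that is.na and colMeans can handle. That is part of the answer: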

> colMeans(is.na(big.mat[,]))
             Year             Month        DayofMonth         DayOfWeek 
       0.00000000        0.00000000        0.00000000        0.00000000 
          DepTime        CRSDepTime           ArrTime        CRSArrTime 
       0.02102102        0.00000000        0.02402402        0.00000000 
    UniqueCarrier         FlightNum           TailNum ActualElapsedTime 
       1.00000000        0.00000000        0.97997998        0.02402402 
   CRSElapsedTime           AirTime          ArrDelay          DepDelay 
       0.00000000        0.02402402        0.02402402        0.02102102 
           Origin              Dest          Distance            TaxiIn 
       1.00000000        1.00000000        0.00000000        0.00000000 
          TaxiOut         Cancelled  CancellationCode          Diverted 
       0.00000000        0.00000000        1.00000000        0.00000000 
     CarrierDelay      WeatherDelay          NASDelay     SecurityDelay 
       0.00000000        0.00000000        0.00000000        0.00000000 
LateAircraftDelay 
       0.00000000 
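Note that big.mat[,] copies the whole matrix into ordinary R memory, which may not be possible for data that is truly too large for RAM. As a minimal sketch of an alternative, the same percentages can be computed one column at a time, so that only a single column is materialized at once. The small matrix `m` and the name `na_pct` below are made up for illustration and stand in for the real cp2006.csv data.

```r
library(bigmemory)

# Small stand-in for the real data (hypothetical values, not from cp2006.csv).
m <- matrix(c(1, NA, 3, 4,
              NA, NA, 7, 8,
              9, 10, 11, 12),
            nrow = 4,
            dimnames = list(NULL, c("a", "b", "c")))
big.mat <- as.big.matrix(m)

# Percentage of NA values per column. big.mat[, j] extracts only column j
# as an ordinary R vector, so is.na() and mean() work on it directly.
na_pct <- vapply(seq_len(ncol(big.mat)),
                 function(j) mean(is.na(big.mat[, j])) * 100,
                 numeric(1))
names(na_pct) <- colnames(m)
print(na_pct)
```

This trades some speed for a much smaller memory footprint, which is usually the reason for using a big.matrix in the first place.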

Here is the full answer:

> library(parallel)
> library(bigmemory)
> big.mat=read.big.matrix("cp2006.csv",header=T)
Warning messages:
1: In na.omit(as.integer(firstLineVals)) : NAs introduced by coercion
2: In na.omit(as.double(firstLineVals)) : NAs introduced by coercion
3: In read.big.matrix("cp2006.csv", header = T) :
  Because type was not specified, we chose double based on the first line of data.
> jobs <- lapply(1:10, function(x) mcparallel(colMeans(is.na(big.mat[,]))*100, name = big.mat))
Error in as.character.default(name) : 
  no method for coercing this S4 class to a vector
> jobs <- lapply(1:10, function(x) mcparallel(colMeans(is.na(big.mat[,]))*100, name = big.mat[,]))
> res  <- mccollect(jobs)
> res
$`2006`
             Year             Month        DayofMonth         DayOfWeek 
         0.000000          0.000000          0.000000          0.000000 
          DepTime        CRSDepTime           ArrTime        CRSArrTime 
         2.102102          0.000000          2.402402          0.000000 
    UniqueCarrier         FlightNum           TailNum ActualElapsedTime 
       100.000000          0.000000         97.997998          2.402402 
   CRSElapsedTime           AirTime          ArrDelay          DepDelay 
         0.000000          2.402402          2.402402          2.102102 
           Origin              Dest          Distance            TaxiIn 
       100.000000        100.000000          0.000000          0.000000 
          TaxiOut         Cancelled  CancellationCode          Diverted 
         0.000000          0.000000        100.000000          0.000000 
     CarrierDelay      WeatherDelay          NASDelay     SecurityDelay 
         0.000000          0.000000          0.000000          0.000000 
LateAircraftDelay 
         0.000000 

[The remaining nine list elements, each also named `2006`, are identical to the first and are omitted here.]

> 
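The transcript above launches ten jobs that each compute the same full result, so the work is repeated rather than divided. A rough sketch of one way to actually split the columns across workers follows. It swaps in mclapply for mcparallel/mccollect, passes a descriptor from describe() to each worker, and re-attaches the matrix there with attach.big.matrix(). The toy matrix and the names `desc`, `n_workers`, `col_chunks` and `chunk_na_pct` are assumptions made for this example, and on Windows it falls back to a single worker because forking is not available there.

```r
library(parallel)
library(bigmemory)

# Toy data standing in for cp2006.csv (hypothetical values).
m <- matrix(c(1, NA, 3, 4,
              NA, NA, 7, 8,
              9, 10, 11, 12,
              NA, 14, 15, 16),
            nrow = 4,
            dimnames = list(NULL, c("a", "b", "c", "d")))
big.mat <- as.big.matrix(m)

# A descriptor is a small object that lets another process attach to the
# same shared data instead of receiving a copy of it.
desc <- describe(big.mat)

# Forked workers are not available on Windows, so use one worker there.
n_workers <- if (.Platform$OS.type == "windows") 1L else 2L

# Divide the column indices into one chunk per worker.
col_chunks <- splitIndices(ncol(big.mat), n_workers)

# Each worker attaches to the matrix and handles only its own columns.
chunk_na_pct <- function(cols) {
  x <- attach.big.matrix(desc)
  colMeans(is.na(x[, cols, drop = FALSE])) * 100
}

res <- mclapply(col_chunks, chunk_na_pct, mc.cores = n_workers)
na_pct <- unlist(res)
print(na_pct)
```

Each worker only extracts its own block of columns, so the peak memory per process is roughly the size of one chunk rather than the whole matrix.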
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/23598404
