doParallel is slower than sequential processing

Asked by a Stack Overflow user on 2020-11-10 18:07:52
1 answer, 187 views, 0 followers, 0 votes

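I am trying to get parallel processing working, either in a local installation of RStudio or on RStudio Cloud, by using the doParallel package and following a tutorial.

Unfortunately, turning on parallel processing seems to slow the computation down rather than speed it up.

Test operations: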

microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))

system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))

Results without parallel processing:

Unit: milliseconds
                                    expr      min       lq    mean   median       uq      max neval
 foreach(i = 1:1000) %do% sum(tanh(1:i)) 183.1157 196.3723 222.237 206.3648 227.4821 417.8161   100

   user  system elapsed 
   0.33    0.04    0.19

Results with parallel processing turned on (it took twice as long!):

Unit: milliseconds
                                       expr      min       lq     mean   median       uq      max neval
 foreach(i = 1:1000) %dopar% sum(tanh(1:i)) 331.3142 371.2502 406.0369 389.7049 412.8814 814.3407   100

   user  system elapsed 
   0.28    0.10    0.37 

Very odd! Any tips? Below is the full script I ran, followed by the logs from a local RStudio session and from RStudio Cloud.

Full script:

install.packages('doParallel')
library(doParallel)
install.packages('microbenchmark')
library(microbenchmark)

# Without parallel processing
microbenchmark(foreach(i=1:1000) %do% sum(tanh(1:i)))

system.time(foreach(i=1:1000) %do% sum(tanh(1:i)))

# Without parallel processing, get a warning
microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))

system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))

# Turn on parallel with several cores
registerDoParallel(detectCores() - 2)

# See number of cores
getDoParWorkers()

# Test for speed improvement With parallel processing
microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))

system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))

# Return to one worker
registerDoParallel(1)
registerDoSEQ()

Log from the local run:

Restarting R session...


Warning message:
<REDACTED LINE>
Error 6 (The handle is invalid)
Features disabled: R source file indexing, Diagnostics
Error in summary.connection(connection) : invalid connection
Error in summary.connection(connection) : invalid connection
<REDACTED LINE>
> install.packages('doParallel')
Installing doParallel [1.0.16] ...
    OK [linked cache]
> library(doParallel)
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
Warning messages:
1: package ‘doParallel’ was built under R version 4.0.3 
2: package ‘foreach’ was built under R version 4.0.3 
3: package ‘iterators’ was built under R version 4.0.3 
> install.packages('microbenchmark')
Installing microbenchmark [1.4-7] ...
    OK [linked cache]
> library(microbenchmark)
Warning message:
package ‘microbenchmark’ was built under R version 4.0.3 
> 
> # Without parallel processing
> microbenchmark(foreach(i=1:1000) %do% sum(tanh(1:i)))
Unit: milliseconds
                                    expr      min       lq    mean   median       uq      max neval
 foreach(i = 1:1000) %do% sum(tanh(1:i)) 183.1157 196.3723 222.237 206.3648 227.4821 417.8161   100
> 
> system.time(foreach(i=1:1000) %do% sum(tanh(1:i)))
   user  system elapsed 
   0.33    0.04    0.19 
> 
> # Without parallel processing, get a warning
> microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
Unit: milliseconds
                                       expr      min      lq     mean   median       uq     max neval
 foreach(i = 1:1000) %dopar% sum(tanh(1:i)) 178.1788 188.879 213.9808 197.2124 227.6921 698.484   100
Warning message:
executing %dopar% sequentially: no parallel backend registered 
> 
> system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
   user  system elapsed 
   0.22    0.03    0.25 
> 
> # Turn on parallel with several cores
> registerDoParallel(detectCores() - 2)
> 
> # See number of cores
> getDoParWorkers()
[1] 6
> 
> # Test for speed improvement With parallel processing
> microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
Unit: milliseconds
                                       expr      min       lq     mean   median       uq      max neval
 foreach(i = 1:1000) %dopar% sum(tanh(1:i)) 331.3142 371.2502 406.0369 389.7049 412.8814 814.3407   100
> 
> system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
   user  system elapsed 
   0.28    0.10    0.37 
> 
> # Return to one worker
> registerDoParallel(1)
> registerDoSEQ()

Log from RStudio Cloud:

Restarting R session...

> install.packages('doParallel')
Installing package into ‘/home/rstudio-user/R/x86_64-pc-linux-gnu-library/4.0’
(as ‘lib’ is unspecified)
trying URL 'http://package-proxy/src/contrib/doParallel_1.0.16.tar.gz'
Content type 'application/x-tar' length 59776 bytes (58 KB)
==================================================
downloaded 58 KB

* installing *binary* package ‘doParallel’ ...
* DONE (doParallel)

The downloaded source packages are in
    ‘/tmp/RtmplDZYAT/downloaded_packages’
> library(doParallel)
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
> install.packages('microbenchmark')
Installing package into ‘/home/rstudio-user/R/x86_64-pc-linux-gnu-library/4.0’
(as ‘lib’ is unspecified)
trying URL 'http://package-proxy/src/contrib/microbenchmark_1.4-7.tar.gz'
Content type 'application/x-tar' length 61382 bytes (59 KB)
==================================================
downloaded 59 KB

* installing *binary* package ‘microbenchmark’ ...
* DONE (microbenchmark)

The downloaded source packages are in
    ‘/tmp/RtmplDZYAT/downloaded_packages’
> library(microbenchmark)
> 
> # Without parallel processing
> microbenchmark(foreach(i=1:1000) %do% sum(tanh(1:i)))
Unit: milliseconds
                                    expr      min       lq     mean   median       uq      max neval
 foreach(i = 1:1000) %do% sum(tanh(1:i)) 121.6417 126.5681 130.8152 129.7511 133.3043 171.6484   100
> 
> system.time(foreach(i=1:1000) %do% sum(tanh(1:i)))
   user  system elapsed 
  0.126   0.000   0.126 
> 
> # Without parallel processing, get a warning
> microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
Unit: milliseconds
                                       expr      min       lq     mean   median       uq      max neval
 foreach(i = 1:1000) %dopar% sum(tanh(1:i)) 117.6518 124.2508 127.9016 127.1467 129.9798 171.9952   100
Warning message:
executing %dopar% sequentially: no parallel backend registered 
> 
> system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
   user  system elapsed 
  0.169   0.000   0.169 
> 
> # Turn on parallel with several cores
> registerDoParallel(detectCores() - 2)
> 
> # See number of cores
> getDoParWorkers()
[1] 14
> 
> # Test for speed improvement With parallel processing
> microbenchmark(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
Unit: milliseconds
                                       expr      min       lq     mean   median       uq      max neval
 foreach(i = 1:1000) %dopar% sum(tanh(1:i)) 262.9285 302.7655 340.1377 325.8734 359.3806 707.4004   100
> 
> system.time(foreach(i=1:1000) %dopar% sum(tanh(1:i)))
   user  system elapsed 
  0.136   0.176   0.313 
> 
> # Return to one worker
> registerDoParallel(1)
> registerDoSEQ()
> 
1 Answer

Answered by a Stack Overflow user on 2020-11-10 18:39:55

In short, on Linux you should use the mclapply function to get better performance.

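There are a few problems here. First, not every task is well suited to multiprocessing, and yours does not look like a good fit: it is a toy example made of very small tasks. Second, multiprocessing in R comes in two flavors, multisession and multicore; see this question to find out why the distinction matters so much: R, mclapply, vs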
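To illustrate the first point, here is a minimal sketch that hands each worker one large chunk instead of 1000 tiny tasks, so the per-task dispatch cost is paid only a few times. The worker count of 2 and the names `chunks`, `res_chunked` and `res_seq` are assumptions made for illustration; they are not part of the original question, and whether this actually beats the sequential version depends on the machine.

```r
library(doParallel)  # also attaches foreach, iterators and parallel

n_workers <- 2L  # assumption: a small fixed worker count for illustration
cl <- makeCluster(n_workers)
registerDoParallel(cl)

# Split 1:1000 into one contiguous chunk per worker.
idx <- 1:1000
chunks <- split(idx, cut(seq_along(idx), n_workers, labels = FALSE))

# One foreach task per chunk instead of one task per element.
res_chunked <- foreach(ids = chunks, .combine = c) %dopar% {
  vapply(ids, function(i) sum(tanh(1:i)), numeric(1))
}

# Shut the workers down and fall back to the sequential backend.
stopCluster(cl)
registerDoSEQ()

# Sequential reference, used to check that both approaches agree.
res_seq <- vapply(idx, function(i) sum(tanh(1:i)), numeric(1))
all.equal(res_chunked, res_seq)
```

The chunks here are equal in length but not in cost, because `sum(tanh(1:i))` gets more expensive as `i` grows; a more careful split would balance the work rather than the element count.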

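On Linux you should use multicore (forked) processing, which should be far more efficient. If foreach runs as multisession (rather than multicore), it has to create separate R sessions and communicate between them. For a toy example this small, that extra overhead is significant.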
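As a rough sketch of the multicore route, the same computation can be written with `mclapply` from the base `parallel` package. The core count of 2 and the names `res_fork` and `res_seq` are assumptions for illustration. `mclapply` relies on forking, which is not available on Windows, so the sketch falls back to a single core there.

```r
library(parallel)

# Forking is unavailable on Windows, where mc.cores must be 1
# and the call simply runs sequentially.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L  # assumed count

# Forked workers share the parent's memory, so no separate R sessions
# have to be started and no objects need to be exported to them.
res_fork <- mclapply(1:1000, function(i) sum(tanh(1:i)), mc.cores = n_cores)

# Sequential reference, used to check that both approaches agree.
res_seq <- lapply(1:1000, function(i) sum(tanh(1:i)))
identical(res_fork, res_seq)
```

Even with forking, a task this small may not run faster than the plain `lapply` call; timing both with `system.time` or `microbenchmark` on the target machine is the only reliable way to tell.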

Votes: 1

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/64774418
