
SparkR error in RStudio

Stack Overflow user
Asked on 2016-07-20 05:29:38
1 answer · 277 views · 0 followers · 0 votes

SparkR in RStudio fails with an error when reading data.

How can I resolve this? What can I do?

Environment

R: 3.3.1
RStudio: 0.99.902
SparkR: 1.6.1
Mac OS X: 10.11.6

SPARK_HOME <- "/usr/local/Cellar/apache-spark/1.6.1/libexec"
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.4.0" "sparkr-shell"')
.libPaths(c(file.path(SPARK_HOME, "R", "lib"), .libPaths()))
library(SparkR)

sc <- sparkR.init(master="local[3]", sparkHome=SPARK_HOME,
                  sparkEnvir=list(spark.driver.memory="6g"),
                  sparkPackages="com.databricks:spark-csv_2.10:1.4.0")

sqlContext <- sparkRSQL.init(sc)

Warning

WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.

df <- read.df(sqlContext, "iris.csv", source="com.databricks.spark.csv", inferSchema="true")

Warning

WARN : Your hostname, xxxx-no-MacBook-Pro.local resolves to a loopback/non-reachable address: fe80:0:0:0:701f:d8ff:fe34:fd1%8, but we couldn't find any external IP address!

Error

ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)

Warning

16/07/20 14:00:44 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)

Error

16/07/20 14:00:44 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
16/07/20 14:00:44 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
16/07/20 14:00:44 INFO TaskSchedulerImpl: Cancelling stage 0
16/07/20 14:00:44 INFO DAGScheduler: ResultStage 0 (first at CsvRelation.scala:267) failed in 60.099 s
16/07/20 14:00:44 INFO DAGScheduler: Job 0 failed: first at CsvRelation.scala:267, took 60.168711 s
16/07/20 14:00:44 ERROR RBackendHandler: loadDF on org.apache.spark.sql.api.r.SQLUtils failed
invokeJava(isStatic = TRUE, className, methodName, ...) でエラー: 
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)

I don't understand how to resolve this. Please advise.


1 Answer

Stack Overflow user

Answered on 2016-07-20 14:26:17

Try this:

Sys.setenv(SPARK_HOME="/usr/local/Cellar/apache-spark/1.6.1/libexec")
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.4.0" "sparkr-shell"')
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R","lib")))
sc <- sparkR.init(master="local", sparkEnvir = list(spark.driver.memory="4g", spark.executor.memory="6g"))

sqlContext <- sparkRSQL.init(sc)

It works for me.
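As a follow-up sketch (not part of the original answer): with the context initialized as above and spark-csv on the classpath, the `read.df` call from the question should then succeed. The file name `iris.csv` is taken from the question; adjust the path, and the `header` option, to match your file.

```r
# Sketch: read the CSV through spark-csv once the context is up.
# Assumes iris.csv is in the working directory, as in the question,
# and that sqlContext was created by sparkRSQL.init(sc) above.
df <- read.df(sqlContext, "iris.csv",
              source = "com.databricks.spark.csv",
              inferSchema = "true", header = "true")
head(df)        # inspect the first rows
printSchema(df) # confirm the inferred column types
```

If the timeout persists, it may be related to the hostname warning above; in SparkR 1.6 the environment variable `SPARK_LOCAL_IP` can be set (e.g. to `127.0.0.1`) via `Sys.setenv` before `sparkR.init` so the driver binds to a reachable address.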

0 votes
Original content provided by Stack Overflow; translation supported by Tencent Cloud.

Original link: https://stackoverflow.com/questions/38472978