首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >在kubernetes中运行的新创建的spark executor不知道ignite配置

在kubernetes中运行的新创建的spark executor不知道ignite配置
EN

Stack Overflow用户
提问于 2020-02-06 04:04:49
回答 1查看 132关注 0票数 0

我有一个spark驱动和执行器在kubernetes上运行,执行器与apache ignite实例对话。但是如果executor-1死了,executor-2将由驱动程序创建。

现在新创建的executor-2正在抱怨executor 2):

代码语言:javascript
复制
class org.apache.ignite.IgniteIllegalStateException: Ignite instance with provided name doesn't exist. Did you call Ignition.start(..) to start an Ignite instance? [name=shared-grid]

at org.apache.ignite.internal.IgnitionEx.grid(IgnitionEx.java:1390)
at org.apache.ignite.Ignition.ignite(Ignition.java:531)
at org.apache.ignite.spark.impl.package$.ignite(package.scala:86)
at org.apache.ignite.spark.impl.IgniteRelationProvider$$anonfun$configProvider$1$2.apply(IgniteRelationProvider.scala:226)
at org.apache.ignite.spark.impl.IgniteRelationProvider$$anonfun$configProvider$1$2.apply(IgniteRelationProvider.scala:223)
at org.apache.ignite.spark.Once.apply(IgniteContext.scala:224)
at org.apache.ignite.spark.IgniteContext.ignite(IgniteContext.scala:145)
at org.apache.ignite.spark.impl.IgniteSQLDataFrameRDD.compute(IgniteSQLDataFrameRDD.scala:65)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

我们需要做什么才能将ignite实例详细信息传递给新创建的executor实例

EN

回答 1

Stack Overflow用户

发布于 2020-02-07 00:29:04

你使用IgniteSparkSession?了吗?我找不到票证,但它看起来像是一个已知的问题,有时IgniteSparkSession无法在物理分布式集群上启动内部客户端。下一个代码:

代码语言:javascript
复制
IgniteSparkSession igniteSession = IgniteSparkSession.builder()
               .appName("Spark Ignite catalog example")
               .igniteConfig(configPath)
               .getOrCreate();

可能会产生以下异常:

代码语言:javascript
复制
class org.apache.ignite.IgniteIllegalStateException: Ignite instance with provided name doesn't exist. Did you call Ignition.start(..) to start an Ignite instance? [name=grid]

作为解决方法,您可以尝试在启动作业之前使用提供的每个spark节点的配置启动客户端节点,但我不确定它是否能正常工作。

我建议在当前问题不解决之前避免使用IgniteSparkSession

请使用DataFrame接口语法:

代码语言:javascript
复制
String configPath = "client.xml";

SparkConf sparkConf = new SparkConf()
 .setAppName("Example");

SparkSession session = SparkSession.builder()
 .config(sparkConf)
 .getOrCreate();

Dataset < Row > csvDataset = session.read()
 .format("csv")
 .option("sep", ",")
 .option("header", true)
 .load("person.csv");

Dataset < Row > resultDF = csvDataset
 .select("id", "name", "city_id", "company")
 .sort("id")
 .limit(10000);

for (int i = 0; i < 10; i++) {
 DataFrameWriter < Row > df = resultDF
  .write()
  .format(IgniteDataFrameSettings.FORMAT_IGNITE())
  .option(IgniteDataFrameSettings.OPTION_CONFIG_FILE(), configPath)
  .option(IgniteDataFrameSettings.OPTION_TABLE(), "Person")
  .option(IgniteDataFrameSettings.OPTION_CREATE_TABLE_PRIMARY_KEY_FIELDS(), "id, city_id")
  .option(IgniteDataFrameSettings.OPTION_CREATE_TABLE_PARAMETERS(), "template=partitioned,backups=1")
  .mode(Append);

 df.save();
}

session.close();

这段代码运行良好。我将为它检查JIRA问题。也许我会创建一个新的。

更新:这是新的票证https://issues.apache.org/jira/browse/IGNITE-12637

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/60083307

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档