我们无法解决以下问题:我们试图使用Apache将数据保存到存储区。问题是当我们上传一个包含依赖项中的org.json包的胖jar时,df.save()应用程序正在失败。
java.lang.NoClassDefFoundError: org/json/JSONException
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:10847)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10047)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10128)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:209)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQLs(HoodieHiveClient.java:384)
at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQLUsingHiveDriver(HoodieHiveClient.java:367)
at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQL(HoodieHiveClient.java:357)
at org.apache.hudi.hive.HoodieHiveClient.createTable(HoodieHiveClient.java:262)
at org.apache.hudi.hive.HiveSyncTool.syncSchema(HiveSyncTool.java:176)
at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:130)
at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:94)
at org.apache.hudi.HoodieSparkSqlWriter$.org$apache$hudi$HoodieSparkSqlWriter$$syncHive(HoodieSparkSqlWriter.scala:321)
at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$metaSync$2.apply(HoodieSparkSqlWriter.scala:363)
at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$metaSync$2.apply(HoodieSparkSqlWriter.scala:359)即使我转到集群库并显式添加这依赖项,它在保存时仍然失败。另一方面,当我在笔记本上创建new JSONException("hello")时,一切似乎都很好。是什么导致了这种行为?谢谢
发布于 2021-01-23 17:27:06
结果发现,问题在于亚稳态服务和spark进程之间存在不同的类路径,因为它们运行在分离的JVM中。问题是通过一个init脚本将jar下载到类路径文件夹中。
发布于 2020-11-03 23:13:21
这可能是因为jar没有找到执行器节点,请尝试addJar (https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html#addJar-java.lang.String-)
发布于 2020-11-11 14:10:49
你用的是什么版本的Hudi?在0.6.0版本中,JSON有一个问题,并且有一个公开发行。我建议你现在使用0.5.2版。
https://stackoverflow.com/questions/64668022
复制相似问题