我试图从一个运行在EMR5.35(Hadoop2.10,Spark2.4.8,HBase 1.4.13)上的火花程序中连接到HBase,而不是试图连接到HBase,我的Spark程序运行得很完美。
但是,当我添加我的HBase代码时,星火程序在创建配置时会死掉:
conf = HBaseConfiguration.create();
for (Iterator<Map.Entry<String, String>> it = conf.iterator(); it.hasNext(); ) {
Map.Entry<String, String> e = it.next();
System.out.println(e);
}
connection = ConnectionFactory.createConnection(conf);
admin = connection.getAdmin();我试着增加资源:
conf = HBaseConfiguration.create();
conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
conf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"));但没有成功。
我已经注释掉了HBaseconfiguration.create()之后的所有行,但是程序无论如何都会死。我相信问题就在这里。我找不到有用的堆栈跟踪。司机一碰到线就死了。
POM:
<properties>
<spark.version>2.4.8</spark.version>
<hbase.version>1.4.13</hbase.version>
<hadoop.version>2.10.1</hadoop.version>
<jackson.version>2.13.2</jackson.version>
<!-- Maven stuff -->
<java.build.version>1.8</java.build.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>bom</artifactId>
<version>2.17.103</version>
<type>pom</type>
<scope>import</scope>
</dependency>
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-all</artifactId>
<version>4.1.77.Final</version>
</dependency>
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty</artifactId>
<version>3.9.9.Final</version>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-aws</artifactId>
<version>${hadoop.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>${hadoop.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>${hbase.version}</version>
<scope>provided</scope>
</dependency>
<!-- AWS -->
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>s3</artifactId>
</dependency>
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>athena</artifactId>
</dependency>
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>auth</artifactId>
</dependency>
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>opensearch</artifactId>
</dependency>
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>apache-client</artifactId>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpcore</artifactId>
<version>4.4.15</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-client</artifactId>
<version>5.6.16</version>
</dependency>发布于 2022-06-12 14:43:13
通过删除“提供的”作用域-and从而包括对uber-jar的依赖,加上添加Hbase依赖(而不是100%确定这是否是严格必要的)解决了这个问题。
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>${hbase.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase</artifactId>
<version>${hbase.version}</version>
<type>pom</type>
</dependency>https://stackoverflow.com/questions/72567864
复制相似问题