Here is my code from IntelliJ:
package com.dmngaya

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession

object ReadVertexPage {
  def main(args: Array[String]): Unit = {
    val conf: SparkConf = new SparkConf().setAppName("ReadVertexPage").setMaster("local")
    val sc: SparkContext = new SparkContext(conf)
    val spark = SparkSession
      .builder()
      .appName("Spark SQL basic example")
      .getOrCreate()
    val jdbcDF1 = spark.read.format("jdbc").options(
      Map(
        "driver" -> "com.tigergraph.jdbc.Driver",
        "url" -> "jdbc:tg:http://127.0.0.1:14240",
        "username" -> "tigergraph",
        "password" -> "tigergraph",
        "graph" -> "gsql_demo", // graph name
        "dbtable" -> "vertex Page", // vertex type
        "limit" -> "10", // number of vertices to retrieve
        "debug" -> "0")).load()
    jdbcDF1.show
  }
}

When I run it from the shell, launched with the driver jar on the classpath (/opt/shell/bin/shell jars /home/tigergraph/ecosys/tools/etl/tg-jdbc-driver/tg-jdbc-driver/target/tg-jdbc-driver-1.2.jar), it works fine:
scala> val jdbcDF1 = spark.read.format("jdbc").options(
| Map(
| "driver" -> "com.tigergraph.jdbc.Driver",
| "url" -> "jdbc:tg:http://127.0.0.1:14240",
| "username" -> "tigergraph",
| "password" -> "tigergraph",
| "graph" -> "gsql_demo", // graph name
| "dbtable" -> "vertex Page", // vertex type
| "limit" -> "10", // number of vertices to retrieve
| "debug" -> "0")).load()
jdbcDF1: org.apache.spark.sql.DataFrame = [v_id: string, page_id: string]
scala> jdbcDF1.show
Result:
+----+--------+
|v_id| page_id|
+----+--------+
| 7| 7|
| 5| 5|
| 10| 10|
|1002| 1002|
| 3| 3|
|1000|new page|
|1003| 1003|
| 1| 1|
| 6| 6|
|1001|        |
+----+--------+

In IntelliJ, however, I get the following error:
20/11/23 10:43:43 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir
20/11/23 10:43:43 INFO SharedState: Warehouse path is ...
Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: jdbc. Please find packages at http://spark.apache.org/third-party-projects.html
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:679)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:733)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:248)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:221)
    at com.dmngaya.ReadVertexPage$.main(ReadVertexPage.scala:25)
    at com.dmngaya.ReadVertexPage.main(ReadVertexPage.scala)
Caused by: java.lang.ClassNotFoundException: jdbc.DefaultSource
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:653)
    at scala.util.Try$.apply(Try.scala:213)
    at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:653)
    at scala.util.Failure.orElse(Try.scala:224)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:653)
    ... 5 more
20/11/23 10:43:46 INFO SparkContext: Invoking stop() from shutdown hook
20/11/23 10:43:46 INFO SparkUI: Stopped Spark web UI at http://tigergraph-01:4040
20/11/23 10:43:46 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
20/11/23 10:43:46 INFO MemoryStore: MemoryStore cleared
20/11/23 10:43:46 INFO BlockManager: BlockManager stopped
20/11/23 10:43:47 INFO BlockManagerMaster: BlockManagerMaster stopped
20/11/23 10:43:47 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
20/11/23 10:43:47 INFO SparkContext: Successfully stopped SparkContext
20/11/23 10:43:47 INFO ShutdownHookManager: Shutdown hook called
20/11/23 10:43:47 INFO ShutdownHookManager: Deleting directory /tmp/spark-66dd4dc4-c70b-4836-805b-d68b3183ccbf

Process finished with exit code 1
How can I fix this?
Answered on 2020-11-25 03:38:48
You should add the tg-jdbc-driver-1.2 dependency to your pom/sbt. The spark-shell run works because the jar is passed on the command line, but the IntelliJ run only sees what is on the project's classpath, so Spark cannot find the "jdbc" data source implementation.
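For example, in an sbt project this could look like the sketch below. The Maven coordinates are an assumption inferred from the jar name tg-jdbc-driver-1.2.jar; check the pom.xml in your local ecosys checkout for the actual groupId/artifactId, or use the unmanaged-jar variant, which only needs the path to the jar you already built:

// build.sbt -- a minimal sketch, not a verified configuration.
// Option 1: managed dependency; groupId "com.tigergraph.jdbc" is an
// assumption and requires the artifact in a reachable repository
// (e.g. after `mvn install` of the tg-jdbc-driver project).
libraryDependencies += "com.tigergraph.jdbc" % "tg-jdbc-driver" % "1.2"

// Option 2: unmanaged jar; points straight at the locally built jar,
// so no repository or coordinates are needed.
Compile / unmanagedJars += file("/home/tigergraph/ecosys/tools/etl/tg-jdbc-driver/tg-jdbc-driver/target/tg-jdbc-driver-1.2.jar")

After reloading the sbt project in IntelliJ, the driver is on the application classpath and spark.read.format("jdbc") can resolve the data source.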
https://stackoverflow.com/questions/64971868