While trying to connect to a MySQL database in RDS from an EMR Jupyter notebook, I ran into the following error.
Code used:
from pyspark.sql import SparkSession
hostname="hostname"
dbname = "mysql"
jdbcPort = 3306
username = "user"
password = "password"
jdbc_url = "jdbc:mysql://{0}:{1}/{2}?user={3}&password={4}".format(hostname, jdbcPort, dbname, username, password)
query = "(select * from framework.File_Columns) as table1"
df1 = spark.read.format('jdbc').options(driver='com.mysql.jdbc.Driver', url=jdbc_url, dbtable=query).load()
df1.show()

Error message:
An error occurred while calling o89.showString. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, ip-172-31-37-50.us-west-2.compute.internal, executor 1): java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
I downloaded the required mysql-connector-java-5.1.47.jar to /home/hadoop/mysql-connector-java-5.1.47.jar and updated the Spark configuration file as follows:
spark.master yarn
spark.driver.extraClassPath :/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/home/hadoop/extrajars/*:/home/hadoop/extrajars/mysql-connector-java-5.1.47.jar
spark.driver.extraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/home/hadoop/extrajars/*:/home/hadoop/extrajars/mysql-connector-java-5.1.47.jar
spark.executor.extraClassPath :/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/home/hadoop/extrajars/*:/home/hadoop/extrajars/mysql-connector-java-5.1.47.jar
spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/home/hadoop/extrajars/*:/home/hadoop/extrajars/mysql-connector-java-5.1.47.jar

Is there anything else I need to do to be able to connect to the MySQL DB from the Jupyter notebook?
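As an aside to the classpath problem: interpolating the raw password into the JDBC URL breaks as soon as it contains a reserved character such as `&` or `@`. A minimal stdlib sketch of one workaround (the helper name is illustrative; note that older Connector/J versions may not decode percent-encoded properties, so passing `user`/`password` as separate JDBC options is often the safer route):

```python
from urllib.parse import quote_plus

def mysql_jdbc_url(host, port, db, user, password):
    """Build a MySQL JDBC URL, percent-encoding the credentials so that
    reserved characters like '&' or '@' cannot corrupt the query string."""
    return "jdbc:mysql://{0}:{1}/{2}?user={3}&password={4}".format(
        host, port, db, quote_plus(user), quote_plus(password)
    )
```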
Posted on 2020-04-23 14:16:43
When you run it from a Jupyter notebook, it cannot find the driver class. To avoid this, try copying mysql-connector-java-5.1.47.jar into the $SPARK_HOME/jars folder. In my experience, that resolves the driver problem.
Posted on 2021-12-09 11:02:54
You can also do this:
spark.conf.set("jars", "s3://bucket-name/folder-name/mysql-connector-java-5.1.38-bin.jar")
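Note that `spark.conf.set` adjusts runtime SQL configuration and generally cannot add jars to a session that is already running; the `spark.jars` property has to be in place before the session starts. A hedged alternative is to put the same jar location (the S3 path here is the one from this answer, not verified) into `spark-defaults.conf`:

```
spark.jars s3://bucket-name/folder-name/mysql-connector-java-5.1.38-bin.jar
```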
https://stackoverflow.com/questions/61387861