I have a Spark operator with sparkVersion: "3.1.1" and want to use it for Structured Streaming from MinIO. However, I have not found a compatible combination of libraries newer than Hadoop 2.7.0, which does not support the new s3a:// paths.
Is there a compatible set of Spark/Hadoop/AWS libraries for Spark 3.1.1?
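For reference, this is roughly the S3A configuration needed to point Spark at MinIO; the endpoint and credentials below are placeholders:

import org.apache.spark.sql.SparkSession

// Minimal sketch of a SparkSession wired for MinIO over s3a://.
// The endpoint, access key, and secret key are placeholders.
val spark = SparkSession.builder()
  .appName("minio-structured-streaming")
  .config("spark.hadoop.fs.s3a.endpoint", "http://minio.example.local:9000") // placeholder endpoint
  .config("spark.hadoop.fs.s3a.access.key", "ACCESS_KEY")                    // placeholder
  .config("spark.hadoop.fs.s3a.secret.key", "SECRET_KEY")                    // placeholder
  .config("spark.hadoop.fs.s3a.path.style.access", "true")                   // MinIO uses path-style addressing
  .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
  .getOrCreate()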
My current dependencies in sbt should work according to https://mvnrepository.com/, but they do not (NoSuchMethodError):
scalaVersion := "2.12.0"

lazy val Versions = new {
  val spark = "3.1.1"
  val hadoop = "3.2.0"
  val scalatest = "3.0.4"
}

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % Versions.spark % Provided
  , "org.apache.spark" %% "spark-sql" % Versions.spark % Provided
  , "org.apache.spark" %% "spark-hive" % Versions.spark % Provided
  , "org.scalatest" %% "scalatest" % Versions.scalatest % Test
  , "org.apache.hadoop" % "hadoop-aws" % Versions.hadoop
  , "org.apache.hadoop" % "hadoop-common" % Versions.hadoop
  , "org.apache.hadoop" % "hadoop-mapreduce-client-core" % Versions.hadoop
  , "org.apache.hadoop" % "hadoop-client" % Versions.hadoop
  , "com.typesafe" % "config" % "1.3.1"
  , "com.github.scopt" %% "scopt" % "3.7.0"
  , "com.github.melrief" %% "purecsv" % "0.1.1"
  , "joda-time" % "joda-time" % "2.9.9"
)

Thanks a lot for your help.
Answered on 2022-04-08 08:17:17
This combination of libraries works:
"org.apache.spark" %% "spark-core" % "3.1.1" % Provided,
"org.apache.spark" %% "spark-sql" % "3.1.1" % Provided,
"org.apache.hadoop" % "hadoop-aws" % "3.2.0",
"org.apache.hadoop" % "hadoop-common" % "3.2.0",
"org.apache.hadoop" % "hadoop-client" % "3.2.0",
"org.apache.hadoop" % "hadoop-mapreduce-client-core" % "3.2.0",
"org.apache.hadoop" % "hadoop-minikdc" % "3.2.0",
"com.amazonaws" % "aws-java-sdk-bundle" % "1.11.375",
"com.typesafe" % "config" % "1.3.1"
, "joda-time" % "joda-time" % "2.9.9"诀窍是将此图像用于星火gcr.io/spark-operator/spark:v3.1.1-hadoop3,因为默认的图像仍然有Hadoop2.7,即使是Spark3.1.1也是如此
https://stackoverflow.com/questions/71648622