I'm stuck on a problem and am currently trying to find a solution. The issue concerns storing streaming data output to Azure Data Lake. Below is the exception I get when storing the data:
Exception in thread "main" org.apache.hadoop.fs.InvalidPathException: Invalid path name Wrong FS: adl://<azure-data-lake>.azuredatalakestore.net/eventstore/_spark_metadata, expected: adl://<azure-data-lake>.azuredatalakestore.net/
at org.apache.hadoop.fs.AbstractFileSystem.checkPath(AbstractFileSystem.java:383)
at org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:110)
at org.apache.hadoop.fs.FileContext$14.next(FileContext.java:1120)
at org.apache.hadoop.fs.FileContext$14.next(FileContext.java:1116)
at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1116)
at org.apache.hadoop.fs.FileContext$Util.exists(FileContext.java:1581)
at org.apache.spark.sql.execution.streaming.HDFSMetadataLog$FileContextManager.exists(HDFSMetadataLog.scala:390)
at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.<init>(HDFSMetadataLog.scala:65)
at org.apache.spark.sql.execution.streaming.CompactibleFileStreamLog.<init>(CompactibleFileStreamLog.scala:46)
at org.apache.spark.sql.execution.streaming.FileStreamSinkLog.<init>(FileStreamSinkLog.scala:85)
at org.apache.spark.sql.execution.streaming.FileStreamSink.<init>(FileStreamSink.scala:95)
at org.apache.spark.sql.execution.datasources.DataSource.createSink(DataSource.scala:316)
at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:293)

Below are my pom dependencies:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.0</version>
</dependency>
<dependency> <!-- Spark dependency -->
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.3.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.microsoft.azure/azure-eventhubs-spark -->
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-eventhubs-spark_2.11</artifactId>
    <version>2.3.12</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.4.3</version>
</dependency>
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-eventhubs</artifactId>
    <version>2.2.0</version>
</dependency>
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-data-lake-store-sdk</artifactId>
    <version>2.2.8</version>
</dependency>
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-eventhubs-eph</artifactId>
    <version>2.4.0</version>
</dependency>

Any help with this would be greatly appreciated.
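For reference, here is a minimal sketch of the kind of Structured Streaming job that hits this code path. The Event Hubs connection string, event hub name, sink format, and the output/checkpoint paths are placeholders I've assumed for illustration, not the exact job:

import org.apache.spark.eventhubs.{ConnectionStringBuilder, EventHubsConf}
import org.apache.spark.sql.SparkSession

object EventStoreWriter {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("eventhubs-to-adls")
      .getOrCreate()

    // Hypothetical Event Hubs source via the azure-eventhubs-spark connector.
    val ehConf = EventHubsConf(
      ConnectionStringBuilder("<event-hub-connection-string>")
        .setEventHubName("<event-hub-name>")
        .build)

    val events = spark.readStream
      .format("eventhubs")
      .options(ehConf.toMap)
      .load()

    // The file sink creates a _spark_metadata log under the output path;
    // HDFSMetadataLog.<init> in the stack trace above fails at that point.
    val query = events.writeStream
      .format("parquet")
      .option("checkpointLocation",
        "adl://<azure-data-lake>.azuredatalakestore.net/checkpoints")
      .start("adl://<azure-data-lake>.azuredatalakestore.net/eventstore")

    query.awaitTermination()
  }
}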
Posted on 2019-10-23 16:19:05
In the end, I was able to resolve this issue by adding the appropriate Maven dependency.
The dependency I used is:
hadoop-common v3.8.1
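Concretely, that would be an entry along these lines in the pom. The groupId org.apache.hadoop is my assumption (only the artifact name and version are given above), so treat this as a sketch rather than the exact entry:

<!-- Sketch of the added dependency; the version is copied from the answer as
     written, so check that it matches the Hadoop version of your cluster. -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>3.8.1</version>
</dependency>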
Hope this helps anyone else facing this kind of issue.
Thanks, Avinash
https://stackoverflow.com/questions/58460761