I am trying to use Sqoop to export a Parquet file from S3 to SQL Server, and I get the following error:
19/07/09 16:12:57 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI pattern: dataset:s3://mybucket/data-lake/serving-zone/part-00002-b5a1da42.snappy.parquet
Check that JARs for s3 datasets are on the classpath
org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI pattern: dataset:s3://mybucket/data-lake/serving-zone/part-00002-b5a1da42.snappy.parquet
Check that JARs for s3 datasets are on the classpath
    at org.kitesdk.data.spi.Registration.lookupDatasetUri(Registration.java:128)
    at org.kitesdk.data.Datasets.load(Datasets.java:103)
    at org.kitesdk.data.Datasets.load(Datasets.java:140)
    at org.kitesdk.data.mapreduce.DatasetKeyInputFormat$ConfigBuilder.readFrom(DatasetKeyInputFormat.java:92)
    at org.kitesdk.data.mapreduce.DatasetKeyInputFormat$ConfigBuilder.readFrom(DatasetKeyInputFormat.java:139)
    at org.apache.sqoop.mapreduce.JdbcExportJob.configureInputFormat(JdbcExportJob.java:83)
    at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:434)
    at org.apache.sqoop.manager.SQLServerManager.exportTable(SQLServerManager.java:192)
    at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:80)
    at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:99)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
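The dataset exists at the location above, and the path URI is correct. I tried exporting a CSV file from the same path, and that succeeded.
Here is my Sqoop export command: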
sqoop export --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
  --connection-manager org.apache.sqoop.manager.SQLServerManager \
  --connect "jdbc:sqlserver://localhost:1433;databaseName=salesdb" \
  --table DimEmployee_test --num-mappers 128 \
  --export-dir s3://mybucket/data-lake/serving-zone/part-00002-b5a1da42.snappy.parquet \
  --username db-user --password mypassword

Answered 2019-07-31 14:49:16
Your --connect URI looks a bit unusual. Try this format instead:
jdbc:jtds:sqlserver://<HOST>:<PORT>/<DATABASE>

https://stackoverflow.com/questions/56958814
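As a rough sketch of the URL format suggested in the answer, the connection string could be assembled like this. The host, port, and database values are taken from the question's command; the jTDS driver class and the assumption that the jTDS JAR is on Sqoop's classpath are mine, and the source does not confirm that this change resolves the DatasetNotFoundException.

```shell
#!/bin/sh
# Values copied from the question's --connect string; adjust for your server.
HOST="localhost"
PORT="1433"
DATABASE="salesdb"

# jTDS URL format from the answer: jdbc:jtds:sqlserver://<HOST>:<PORT>/<DATABASE>
JDBC_URL="jdbc:jtds:sqlserver://${HOST}:${PORT}/${DATABASE}"

# Print the URL so it can be checked before it is passed to sqoop --connect.
echo "${JDBC_URL}"

# Assumed usage (not run here; requires the jTDS JAR on Sqoop's classpath):
#   sqoop export --driver net.sourceforge.jtds.jdbc.Driver \
#     --connect "${JDBC_URL}" \
#     --table DimEmployee_test \
#     --export-dir s3://mybucket/data-lake/serving-zone/part-00002-b5a1da42.snappy.parquet \
#     --username db-user --password mypassword
```

Keeping the host, port, and database in separate variables makes it easy to confirm that the database name now follows a slash rather than the ;databaseName= property used in the original command.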