首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >使用spark-streaming将avro数据集加载到Teradata时出现问题

使用spark-streaming将avro数据集加载到Teradata时出现问题
EN

Stack Overflow用户
提问于 2018-02-09 01:42:06
回答 1查看 553关注 0票数 0

我尝试通过spark streaming (jdbc)将avro文件的数据集加载到Teradata表中。配置设置正确,加载在一定程度上成功(我可以验证数据行是否已插入到表中),但中途开始出现异常,加载失败。堆栈跟踪如下所示。有什么可能导致这种情况的线索吗?

代码语言:javascript
复制
18/02/08 17:27:42 ERROR executor.Executor: Exception in task 2.0 in stage 0.0 (TID 0)
java.sql.BatchUpdateException: [Teradata JDBC Driver] [TeraJDBC 16.20.00.02] [Error 1154] [SQLState HY000] A failure occurred while inserting the batch of rows destined for database table "database"."table". Details of the failure can be found in the exception chain that is accessible with getNextException.
    at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeBatchUpdateException(ErrorFactory.java:149)
    at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeBatchUpdateException(ErrorFactory.java:133)
    at com.teradata.jdbc.jdbc.fastload.FastLoadManagerPreparedStatement.executeBatch(FastLoadManagerPreparedStatement.java:2389)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:592)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:670)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:670)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:926)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:926)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1951)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1951)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLException: [Teradata JDBC Driver] [TeraJDBC 16.20.00.02] [Error 1147] [SQLState HY000] The next failure(s) in the exception chain occurred while beginning FastLoad of database table "database"."table"
    at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeDriverJDBCException(ErrorFactory.java:95)
    at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeDriverJDBCException(ErrorFactory.java:70)
    at com.teradata.jdbc.jdbc.fastload.FastLoadManagerPreparedStatement.beginFastLoad(FastLoadManagerPreparedStatement.java:966)
    at com.teradata.jdbc.jdbc.fastload.FastLoadManagerPreparedStatement.executeBatch(FastLoadManagerPreparedStatement.java:2210)
EN

回答 1

Stack Overflow用户

发布于 2018-03-07 01:19:13

该问题源于试图以append模式将数据加载到现有表。快速加载不支持这一点。每次进程运行时,该表都应为空(也称为截断)。这使得这对于在处理数据之前暂存数据很有用。但不是用来储存的。

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/48691646

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档