
Flink data type mismatch when adding a time attribute via a table source

Asked by a Stack Overflow user on 2020-02-13 11:45:09

I am trying to add a table source with an event-time attribute, following the Flink docs. My code is as follows:

class SISSourceTable
    extends StreamTableSource[Row]
    with DefinedRowtimeAttributes
    with FlinkCal
    with FlinkTypeTags {
  private[this] val profileProp = ConfigurationManager.loadBusinessProperty
  val topic: String = ...
  val schemas = Seq(
    (TsCol, SQLTimestamp),
    (DCol, StringTag),
    (CCol, StringTag),
    (RCol, StringTag)
  )

  override def getProducedDataType: DataType = DataTypes.ROW(extractFields(schemas): _*)

  override def getTableSchema: TableSchema =
    new TableSchema.Builder()
      .fields(extractFieldNames(schemas), extractFieldDataTypes(schemas))
      .build()

  override def getRowtimeAttributeDescriptors: util.List[RowtimeAttributeDescriptor] =
    Collections.singletonList(
      new RowtimeAttributeDescriptor(
        TsCol,
        new ExistingField(TsCol),
        new AscendingTimestamps
      )
    )

  override def getDataStream(execEnv: StreamExecutionEnvironment): DataStream[Row] = {
    val windowTime: Int = profileProp.getProperty("xxx", "300").toInt
    val source = prepareSource(topic)
    val colsToCheck = List(RCol, CCol, DCol)

    execEnv
      .addSource(source)
      .map(new MapFunction[String, Map[String, String]]() {
        override def map(value: String): Map[String, String] = ...
      })
      .map(new MapFunction[Map[String, String], Row]() {
        override def map(value: Map[String, String]): Row = {
          Row.of(new Timestamp(value(TsCol).toLong * 1000), value(DCol), value(CCol), value(RCol))
        }
      })
      .assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor[Row](Time.seconds(windowTime)) {
        override def extractTimestamp(element: Row): Long = element.getField(0).asInstanceOf[Timestamp].getTime
      })
  }
}

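The source I obtain in getDataStream is a Kafka source of strings, and TsCol is extracted from each Kafka record. I want to use TsCol as the event time. However, TsCol is a 10-digit timestamp stored as a string, so I need to convert it to a 13-digit Long. When I tried to use the 13-digit Long as the rowtime, I got an exception saying that a rowtime can only be extracted from a SQL_TIMESTAMP column, so I ended up converting it to java.sql.Timestamp. When I register the source table above and run the Flink job, I get the following exception:
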
org.apache.flink.table.api.TableException: TableSource of type com.mob.mobeye.flink.table.source.StayInStoreSourceTable returned a DataStream of data type ROW<`t` TIMESTAMP(3), `mac` STRING, `c` STRING, `r` STRING> that does not match with the data type ROW<`t` TIMESTAMP(3), `mac` STRING, `c` STRING, `r` STRING> declared by the TableSource.getProducedDataType() method. Please validate the implementation of the TableSource.
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlanInternal(StreamExecTableSourceScan.scala:113)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlanInternal(StreamExecTableSourceScan.scala:55)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlan(StreamExecTableSourceScan.scala:55)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:86)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlan(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecExchange.translateToPlanInternal(StreamExecExchange.scala:84)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecExchange.translateToPlanInternal(StreamExecExchange.scala:44)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecExchange.translateToPlan(StreamExecExchange.scala:44)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecGroupWindowAggregate.translateToPlanInternal(StreamExecGroupWindowAggregate.scala:140)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecGroupWindowAggregate.translateToPlanInternal(StreamExecGroupWindowAggregate.scala:55)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecGroupWindowAggregate.translateToPlan(StreamExecGroupWindowAggregate.scala:55)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:86)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlan(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLookupJoin.translateToPlanInternal(StreamExecLookupJoin.scala:97)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLookupJoin.translateToPlanInternal(StreamExecLookupJoin.scala:40)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLookupJoin.translateToPlan(StreamExecLookupJoin.scala:40)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:86)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlan(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLookupJoin.translateToPlanInternal(StreamExecLookupJoin.scala:97)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLookupJoin.translateToPlanInternal(StreamExecLookupJoin.scala:40)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLookupJoin.translateToPlan(StreamExecLookupJoin.scala:40)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:86)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlan(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToTransformation(StreamExecSink.scala:185)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.scala:133)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.scala:50)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlan(StreamExecSink.scala:50)
    at org.apache.flink.table.planner.delegation.StreamPlanner.$anonfun$translateToPlan$1(StreamPlanner.scala:61)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233)
    at scala.collection.Iterator.foreach(Iterator.scala:937)
    at scala.collection.Iterator.foreach$(Iterator.scala:937)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1425)
    at scala.collection.IterableLike.foreach(IterableLike.scala:70)
    at scala.collection.IterableLike.foreach$(IterableLike.scala:69)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike.map(TraversableLike.scala:233)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:226)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:60)
    at org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:149)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:439)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.insertInto(TableEnvironmentImpl.java:327)
    at org.apache.flink.table.api.internal.TableImpl.insertInto(TableImpl.java:411)

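I am confused as to why

ROW<`t` TIMESTAMP(3), `mac` STRING, `c` STRING, `r` STRING>

does not match the data type

ROW<`t` TIMESTAMP(3), `mac` STRING, `c` STRING, `r` STRING>

I ran into a similar error elsewhere and worked around it there by replacing the Timestamp type. Here, however, I need to extract column t as the rowtime, so its type must be TIMESTAMP(3). I would greatly appreciate any help with this.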
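
For reference, the conversion described in the question, from a 10-digit epoch-seconds string to the java.sql.Timestamp that a TIMESTAMP(3) rowtime field expects, can be sketched in isolation. The object and method names below are illustrative and do not appear in the original code.

```scala
import java.sql.Timestamp

object EpochSeconds {
  // Hypothetical helper: parses a 10-digit epoch-seconds string and
  // scales it to the 13-digit epoch-milliseconds value Timestamp expects.
  def toSqlTimestamp(epochSeconds: String): Timestamp =
    new Timestamp(epochSeconds.trim.toLong * 1000L)
}
```

Multiplying by `1000L` keeps the arithmetic in Long, mirroring the `value(TsCol).toLong * 1000` expression in the question.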

1 Answer

Answered by a Stack Overflow user on 2020-02-18 22:19:02

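Which Flink version are you using? If I am not mistaken, it is a version lower than 1.9.2. Is that right?

If so, the exception message is not very helpful, because it contains a bug that was fixed in https://issues.apache.org/jira/browse/FLINK-15726. Before that fix, the same type was effectively printed twice.

There are a few problems in your implementation. The type mismatch is most likely caused by the type of the stream produced by this map operator, which carries no explicit row type information (the suggested fix is to declare it via returns):
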
      .map(new MapFunction[Map[String, String], Row]() {
        override def map(value: Map[String, String]): Row = {
          Row.of(new Timestamp(value(TsCol).toLong * 1000), value(DCol), value(CCol), value(RCol))
        }
      })

Try changing it to:

      .map(new MapFunction[Map[String, String], Row]() {
        override def map(value: Map[String, String]): Row = {
          Row.of(new Timestamp(value(TsCol).toLong * 1000), value(DCol), value(CCol), value(RCol))
        }
      }).returns(Types.ROW(Types.SQL_TIMESTAMP, Types.STRING, Types.STRING, Types.STRING))
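
The snippet above depends on names from the question's class (TsCol, DCol and so on). As a self-contained sketch of the same idea, the following assumes a hypothetical comma-separated input line (`epochSeconds,mac,c,r`) and an illustrative object name `TypedRows`. It uses `Types.ROW_NAMED` rather than `Types.ROW`, on the assumption that the field names should line up with the `t`, `mac`, `c`, `r` printed in the exception; whether that is strictly required may depend on the Flink version.

```scala
import java.sql.Timestamp

import org.apache.flink.api.common.functions.MapFunction
import org.apache.flink.api.common.typeinfo.{TypeInformation, Types}
import org.apache.flink.streaming.api.datastream.DataStream
import org.apache.flink.types.Row

object TypedRows {
  // Explicit row type; field names are taken from the exception message.
  val rowType: TypeInformation[Row] = Types.ROW_NAMED(
    Array("t", "mac", "c", "r"),
    Types.SQL_TIMESTAMP, Types.STRING, Types.STRING, Types.STRING)

  // Assumed input format for this sketch only: "epochSeconds,mac,c,r".
  def parse(line: String): Row = {
    val parts = line.split(",", -1)
    Row.of(new Timestamp(parts(0).toLong * 1000L), parts(1), parts(2), parts(3))
  }

  // Uses the Java DataStream API, as StreamTableSource.getDataStream does.
  def fromCsv(lines: DataStream[String]): DataStream[Row] =
    lines
      .map(new MapFunction[String, Row] {
        override def map(line: String): Row = parse(line)
      })
      // Declare the produced type explicitly instead of relying on type extraction.
      .returns(rowType)
}
```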

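Second, you do not need to assign timestamps and watermarks inside the TableSource. They are assigned automatically, based on the information you provide through DefinedRowtimeAttributes.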
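
To make this second point concrete: the out-of-orderness bound that the question passes to `BoundedOutOfOrdernessTimestampExtractor` could instead be expressed in the rowtime descriptor itself. The sketch below assumes Flink's legacy table-source classes `ExistingField` and `BoundedOutOfOrderTimestamps` (the question already uses `ExistingField` and `AscendingTimestamps` from the same API). The object name `RowtimeDescriptors` and its parameters are illustrative, not part of the original code.

```scala
import java.util.{Collections, List => JList}

import org.apache.flink.table.sources.RowtimeAttributeDescriptor
import org.apache.flink.table.sources.tsextractors.ExistingField
import org.apache.flink.table.sources.wmstrategies.BoundedOutOfOrderTimestamps

object RowtimeDescriptors {
  // Builds a single rowtime descriptor that reads the timestamp from an
  // existing field and lets watermarks trail the maximum observed
  // timestamp by maxDelayMillis.
  def boundedOutOfOrder(field: String, maxDelayMillis: Long): JList[RowtimeAttributeDescriptor] =
    Collections.singletonList(
      new RowtimeAttributeDescriptor(
        field,
        new ExistingField(field),
        new BoundedOutOfOrderTimestamps(maxDelayMillis)))
}
```

With something like this, getRowtimeAttributeDescriptors in the question's class could return `RowtimeDescriptors.boundedOutOfOrder(TsCol, delayMillis)`, with `delayMillis` read from configuration, and getDataStream could end at the `.returns(...)` call without `assignTimestampsAndWatermarks`.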

Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/60200151
