首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >如何使流从卡夫卡主题到elasticsearch与汇合?

如何使流从卡夫卡主题到elasticsearch与汇合?
EN

Stack Overflow用户
提问于 2019-04-23 17:51:37
回答 1查看 1K关注 0票数 1

我从机器中读取数据,并将其作为JSON流到kafka主题。我想阅读这个主题,并将流数据存储到elasticsearch中。

我的步骤: 1.创建从JSON到AVRO的KSQL流

json流:

代码语言:javascript
复制
 CREATE STREAM source_json_pressure 
 (
  timestamp BIGINT, 
  opcuaObject VARCHAR, 
  value DOUBLE
  ) 
 WITH (KAFKA_TOPIC='7d12h100mbpressure',
   VALUE_FORMAT='JSON');

阿夫罗溪流:

代码语言:javascript
复制
 CREATE STREAM target_avro_pressure 
  WITH (
     KAFKA_TOPIC='7d12h100mbpressure_avro', 
     VALUE_FORMAT='AVRO'
  ) AS 
  SELECT * FROM source_json_pressure;

在这之后我得到了阿夫罗流:

代码语言:javascript
复制
 ksql> print "7d12h100mbpressure_avro";

 Format:AVRO
 23.04.19 19:29:58 MESZ,   jK?C, {"TIMESTAMP": 1556040449728, "OPCUAOBJECT": "DatLuDrUeb.EinDru", "VALUE": 7.42}

我的elasticsearch.properties:

代码语言:javascript
复制
15 name=elasticsearch-sink
16 connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
17 tasks.max=1
18 topics=7d12h100mbpressure_avro
19 key.ignore=true
20 connection.url=http://localhost:9200
21 type.name=kafka-connect

在此之后,我期待ES中的流,但是我得到没有流数据的索引。

我在哪里搞错了?

来自汇合日志连接的错误:

代码语言:javascript
复制
 [2019-04-24 11:01:29,316] INFO [Consumer clientId=consumer-4, groupId=connect-elasticsearch-sink] Setting newly assigned partitions: 7d12h100mbpressure_avro-3, 7d12h100mbpressure_avro-2, 7d12h100mbpressure_avro-1, 7d12h100mbpressure_avro-0 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:290)
 [2019-04-24 11:01:29,327] INFO [Consumer clientId=consumer-4, groupId=connect-elasticsearch-sink] Resetting offset for partition 7d12h100mbpressure_avro-3 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher:584)
 [2019-04-24 11:01:29,327] INFO [Consumer clientId=consumer-4, groupId=connect-elasticsearch-sink] Resetting offset for partition 7d12h100mbpressure_avro-2 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher:584)
 [2019-04-24 11:01:29,327] INFO [Consumer clientId=consumer-4, groupId=connect-elasticsearch-sink] Resetting offset for partition 7d12h100mbpressure_avro-1 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher:584)
 [2019-04-24 11:01:29,328] INFO [Consumer clientId=consumer-4, groupId=connect-elasticsearch-sink] Resetting offset for partition 7d12h100mbpressure_avro-0 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher:584)
 [2019-04-24 11:01:29,667] ERROR WorkerSinkTask{id=elasticsearch-sink-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:177)
 org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error handler
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:178)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:104)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:484)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:464)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:320)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:224)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:192)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
 Caused by: org.apache.kafka.connect.errors.DataException: Failed to deserialize data for topic 7d12h100mbpressure_avro to Avro:
    at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:107)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$0(WorkerSinkTask.java:484)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
    ... 13 more 
 Caused by: org.apache.kafka.common.errors.SerializationException: Error retrieving Avro schema for id 92747
 Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema not found; error code: 40403
    at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:226)
    at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:252)
    at io.confluent.kafka.schemaregistry.client.rest.RestService.getId(RestService.java:482)
    at io.confluent.kafka.schemaregistry.client.rest.RestService.getId(RestService.java:475)
    at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getSchemaByIdFromRegistry(CachedSchemaRegistryClient.java:151)
    at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getBySubjectAndId(CachedSchemaRegistryClient.java:230)
    at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getById(CachedSchemaRegistryClient.java:209)
    at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:116)
    at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserializeWithSchemaAndVersion(AbstractKafkaAvroDeserializer.java:215)
    at io.confluent.connect.avro.AvroConverter$Deserializer.deserialize(AvroConverter.java:145)
    at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:90)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$0(WorkerSinkTask.java:484)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:104)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:484)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:464)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:320)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:224)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:192)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
 [2019-04-24 11:01:29,668] ERROR WorkerSinkTask{id=elasticsearch-sink-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:178)

我的连接-avro-分发.属性:

代码语言:javascript
复制
 # Bootstrap Kafka servers. If multiple servers are specified, they should be comma-separated.
   bootstrap.servers=localhost:9092

  key.converter=io.confluent.connect.avro.AvroConverter
  key.converter.schema.registry.url=http://localhost:8081
  value.converter=io.confluent.connect.avro.AvroConverter
  value.converter.schema.registry.url=http://localhost:8081

  config.storage.topic=connect-configs
  offset.storage.topic=connect-offsets
  status.storage.topic=connect-statuses

  config.storage.replication.factor=1
  offset.storage.replication.factor=1
  status.storage.replication.factor=1

 internal.key.converter=org.apache.kafka.connect.json.JsonConverter
 internal.value.converter=org.apache.kafka.connect.json.JsonConverter
 internal.key.converter.schemas.enable=false
 internal.value.converter.schemas.enable=false
EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2019-04-24 18:06:00

但是,您可以从Elastic接收器设置key.ignore=true,但这并不能阻止Connect尝试反序列化记录。

当您只执行confluent start时,它将始终使用AvroConverter作为键和值转换器。

值得一提的是,我认为,KSQL中的VALUE_FORMAT='AVRO'只将值作为Avro,而不是关键。

其中一个原因也许可以解释为什么你会看到

  • 主体未找到
  • 模式未找到
  • 检索id的Avro模式时出错

为了解决这个问题,在您的elasticsearch.properties中,您可以重写key.converter,使其成为类似于org.apache.kafka.connect.storage.StringConverter的其他东西。

另外,与使用Connect+KSQL进行调试不同,我建议使用kafka-avro-console-consumer并包括--property print.key=true选项,以查看是否有类似的错误。

票数 2
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/55816841

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档