I started Kafka Connect in standalone mode as follows:

/usr/local/confluent/bin/connect-standalone /usr/local/confluent/etc/kafka/connect-standalone.properties /usr/local/confluent/etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties

After that, I used the REST API to create a connector with all the details, like this:
curl -X POST -H "Content-Type: application/json" --data '{"name":"elastic-search-sink-audit","config":{"connector.class":"io.confluent.connect.elasticsearch.ElasticsearchSinkConnector","tasks.max":"5","topics":"fsp-AUDIT_EVENT_DEMO","key.ignore":"true","connection.url":"https://**.amazonaws.com","type.name":"kafka-connect-distributed","name":"elastic-search-sink-audit","errors.tolerance":"all","errors.deadletterqueue.topic.name":"fsp-dlq-audit-event"}}' http://localhost:8083/connectors | jq

After that, when I checked the status, I could see 5 tasks running:
curl localhost:8083/connectors/elastic-search-sink-audit/tasks | jq

Question 1:
Does this mean I am running my Kafka Connect program in distributed mode, or only in standalone mode?
Question 2:
Do I have to modify the connect-distributed.properties file and then start it the same way I started standalone?
Question 3:
At the moment I run my entire setup on a single EC2 instance. If I now add 5 more EC2 instances to make the connector more parallel and faster, how can I make Kafka Connect understand that 5 more EC2 instances have been added and that it must share the workload?
Question 4: Do I have to install, start, and create Kafka Connect on all of the EC2 instances and just start them? How can I confirm that all 5 EC2 instances are running properly under the same connector?
Finally, here is what I tried in order to start the connector in distributed mode. At first I started it like this:

/usr/local/confluent/bin/connect-distributed /usr/local/confluent/etc/kafka/connect-distributed.properties /usr/local/confluent/etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties

Then, in another session, I submitted the connector over REST as shown below:

curl -X POST -H "Content-Type: application/json" --data '{"name":"elastic-search-sink-audit","config":{"connector.class":"io.confluent.connect.elasticsearch.ElasticsearchSinkConnector","tasks.max":"5","topics":"fsp-AUDIT_EVENT_DEMO","key.ignore":"true","connection.url":"https://**.amazonaws.com","type.name":"kafka-connect-distributed","name":"elastic-search-sink-audit","errors.tolerance":"all","errors.deadletterqueue.topic.name":"fsp-dlq-audit-event"}}' http://localhost:8083/connectors | jq

but as soon as I hit this, I started getting errors like these:
[2020-02-01 13:48:15,551] WARN [Producer clientId=producer-3] Got error produce response with correlation id 159 on topic-partition connect-configs-0, retrying (2147483496 attempts left). Error: NOT_ENOUGH_REPLICAS (org.apache.kafka.clients.producer.internals.Sender:598)
[2020-02-01 13:48:15,652] WARN [Producer clientId=producer-3] Got error produce response with correlation id 160 on topic-partition connect-configs-0, retrying (2147483495 attempts left). Error: NOT_ENOUGH_REPLICAS (org.apache.kafka.clients.producer.internals.Sender:598)
[2020-02-01 13:48:15,753] WARN [Producer clientId=producer-3] Got error produce response with correlation id 161 on topic-partition connect-configs-0, retrying (2147483494 attempts left). Error: NOT_ENOUGH_REPLICAS (org.apache.kafka.clients.producer.internals.Sender:598)
[2020-02-01 13:48:15,854] WARN [Producer clientId=producer-3] Got error produce response with correlation id 162 on topic-partition connect-configs-0, retrying (2147483493 attempts left). Error: NOT_ENOUGH_REPLICAS (org.apache.kafka.clients.producer.internals.Sender:598)
[2020-02-01 13:48:15,956] WARN [Producer clientId=producer-3] Got error produce response with correlation id 163 on topic-partition connect-configs-0, retrying (2147483492 attempts left). Error: NOT_ENOUGH_REPLICAS (org.apache.kafka.clients.producer.internals.Sender:598)

Finally, when I tried to create the connector using curl, the request timed out:
{ "error_code": 500, "message": "Request timed out" }

Please help me understand this.
Posted on 2020-02-01 13:46:16
Both modes start the REST API.
Distributed mode does not accept a connector properties file on the command line; the connector configuration must be POSTed to the REST API instead. In your standalone run there was no reason to POST it separately, because the connector you supplied on the command line was already running.
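For comparison, a distributed-mode start takes only the worker properties file, and the connector is then registered over REST. This is a sketch; the paths match the question, but `elastic-search-sink-audit.json` is a made-up file name standing in for the same JSON body you used in your curl command:

```shell
# Distributed mode: pass ONLY the worker properties file -- no connector
# properties file on the command line.
/usr/local/confluent/bin/connect-distributed \
  /usr/local/confluent/etc/kafka/connect-distributed.properties

# Then, from another session, register the connector over REST.
# elastic-search-sink-audit.json is a hypothetical file holding the same
# JSON connector config used in the question.
curl -X POST -H "Content-Type: application/json" \
  --data @elastic-search-sink-audit.json \
  http://localhost:8083/connectors | jq
```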
Distributed mode is recommended, because connector state is stored back into Kafka topics rather than in files on the single machine running standalone mode.
For more details, see Kafka Connect Concepts.
How will Kafka Connect understand that 5 more EC2 instances have been added and that it must share the workload? Do I have to install, start, and create Kafka Connect on all of the EC2 instances and just start them? How can I confirm that all 5 EC2 instances are running properly under the same connector?
Well, your EC2 machines don't know to start any processes on their own unless they are part of some distributed cluster, so you have to start distributed mode on each of them with the same settings (Confluent's Ansible repo makes this very easy).
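The key setting that makes workers join the same cluster is `group.id`: every worker started with the same `group.id` (and the same internal topic names) joins one Connect cluster and tasks are rebalanced across them automatically. A minimal sketch of the worker config you would deploy to each EC2 instance; the broker addresses, group name, and hostname below are assumptions, not values from the question:

```shell
# Sketch: the same worker config goes on every EC2 instance; only
# rest.advertised.host.name differs per machine. All names here are
# illustrative assumptions.
cat > /tmp/connect-distributed.properties <<'EOF'
bootstrap.servers=broker1:9092,broker2:9092,broker3:9092
# Workers sharing the same group.id form one Connect cluster and
# rebalance connector tasks among themselves automatically.
group.id=connect-cluster-audit
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Internal topics where the cluster stores its shared state.
offset.storage.topic=connect-offsets
config.storage.topic=connect-configs
status.storage.topic=connect-status
# Advertise this worker's own address so the other workers can reach it.
rest.advertised.host.name=ec2-node-1.internal
EOF

# Sanity-check that the cluster group id is present exactly once.
grep -c '^group.id=connect-cluster-audit$' /tmp/connect-distributed.properties
```

Each instance then runs `connect-distributed` against its copy of this file; the connector itself is POSTed once, to any one worker.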
You can hit the /status endpoint on any of the Connect servers to see which addresses are running which tasks.
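Against a live cluster that would be `curl -s localhost:8083/connectors/elastic-search-sink-audit/status | jq`. The sketch below pipes a hand-written sample of that response shape (the worker IDs are invented) through jq to pull out the distinct worker addresses, which is how you would confirm that tasks are spread across your EC2 instances:

```shell
# Sample /status response body (invented worker IDs), parsed the same way
# you would parse real curl output:
echo '{"name":"elastic-search-sink-audit",
  "connector":{"state":"RUNNING","worker_id":"10.0.0.1:8083"},
  "tasks":[{"id":0,"state":"RUNNING","worker_id":"10.0.0.1:8083"},
           {"id":1,"state":"RUNNING","worker_id":"10.0.0.2:8083"}]}' |
  jq -r '.tasks[].worker_id' | sort -u
```

If all tasks report the same worker_id, only one machine is doing the work.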
NOT_ENOUGH_REPLICAS
This happens because you don't have enough brokers for the internal Kafka topics that Connect creates to track its state.
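If this is a dev cluster with fewer than 3 brokers, one hedged fix is to lower the replication factor that the worker requests for its internal topics in connect-distributed.properties (the property names below are the stock ones; replication factor 1 is only acceptable outside production, and the brokers' min.insync.replicas must also be satisfiable):

```shell
# Append dev-only replication settings to a copy of the worker config.
# A replication factor of 1 is ONLY safe on a single-broker dev cluster;
# in production keep 3 and ensure at least that many brokers are alive.
cat >> /tmp/connect-distributed.properties <<'EOF'
config.storage.replication.factor=1
offset.storage.replication.factor=1
status.storage.replication.factor=1
EOF

# Confirm the three settings were written.
grep 'replication.factor=1' /tmp/connect-distributed.properties
```

If the connect-configs / connect-offsets / connect-status topics already exist with a replication factor your brokers cannot satisfy, they must be recreated (or the broker count raised) for the warnings to stop.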
https://stackoverflow.com/questions/60017403