我有3个节点的kafka集群,每个节点都有zookeeper和kafka。如果我显式地杀死引导节点zookeeper和kafka,那么整个集群都不会接受任何传入的数据,并等待节点返回。
kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 3 min.insync.replicas=2 --partitions 6 --topic logs使用上述命令创建的主题。
节点1
server.properties
broker.id=0
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://10.0.2.4:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181,10.0.2.5:2181,10.0.14.7:2181
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0zookeeper.properties
tickTime=2000
dataDir=/tmp/zookeeper/
initLimit=5
syncLimit=2
server.0=0.0.0.0:2888:3888
server.1=analyzer1:2888:3888
server.2=10.0.14.4:2888:3888
clientPort=2181每个节点对应的kafka和zookeeper分别为上述格式。
当我检查其余节点的zookeeper状态时,我可以看到一个新的领导者。但producer仍然无法发送数据。另外,两个kafka节点未响应以下错误。
WARN Client session timed out, have not heard from server in 30004ms for sessionid 0x0 (org.apache.zookeeper.ClientCnxn)有人能帮我吗?
如果您想要可用节点上的kafka日志?
[2020-10-08 19:40:13,607] WARN Client session timed out, have not heard from server in 12002ms for sessionid 0x2acefe00000 (org.apache.zookeeper.ClientCnxn)
[2020-10-08 19:40:13,608] INFO Client session timed out, have not heard from server in 12002ms for sessionid 0x2acefe00000, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2020-10-08 19:40:13,709] INFO [ZooKeeperClient Kafka server] Waiting until connected. (kafka.zookeeper.ZooKeeperClient)
[2020-10-08 19:40:13,709] INFO [ZooKeeperClient Kafka server] Connected. (kafka.zookeeper.ZooKeeperClient)
[2020-10-08 19:40:13,709] INFO [ZooKeeperClient Kafka server] Waiting until connected. (kafka.zookeeper.ZooKeeperClient)
[2020-10-08 19:40:13,709] INFO [ZooKeeperClient Kafka server] Connected. (kafka.zookeeper.ZooKeeperClient)
[2020-10-08 19:40:13,866] INFO Opening socket connection to server 10.0.14.7/10.0.14.7:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2020-10-08 19:40:13,867] INFO Socket error occurred: 10.0.14.7/10.0.14.7:2181: Connection refused (org.apache.zookeeper.ClientCnxn)
[2020-10-08 19:40:13,968] INFO [ZooKeeperClient Kafka server] Waiting until connected. (kafka.zookeeper.ZooKeeperClient)
[2020-10-08 19:40:13,968] INFO [ZooKeeperClient Kafka server] Waiting until connected. (kafka.zookeeper.ZooKeeperClient)
[2020-10-08 19:40:14,093] WARN [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Connection to node 1 (/10.0.2.5:9092) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2020-10-08 19:40:14,093] INFO [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error sending fetch request (sessionId=205463854, epoch=INITIAL) to node 1: {}. (org.apache.kafka.clients.FetchSessionHandler)
java.io.IOException: Connection to 10.0.2.5:9092 (id: 1 rack: null) failed.
at org.apache.kafka.clients.NetworkClientUtils.awaitReady(NetworkClientUtils.java:71)
at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:103)
at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:206)
at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:300)
at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:135)
at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:134)
at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:117)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)发布于 2020-10-08 22:29:21
当您只在KafkaProducer属性bootstrap_servers中提到其中一个代理时,通常会发生这种情况。我的假设是,此属性仅设置为代理10.0.2.5:9092,而不是列出所有三个节点。
尽管只提到其中一个就足以与整个集群通信,但建议至少列出两个代理地址(以逗号分隔的列表形式),以处理您面临的此类情况。
在单个代理失败的情况下,代理可能会将分区领导者切换为活动代理,如您在日志中所看到的那样。即使您没有列出属性bootstrap_servers中的所有代理,生产者也会确定它需要将数据发送到哪个代理(分区领导者)。
https://stackoverflow.com/questions/64264379
复制相似问题