I have a Strimzi cluster on GKE, and I have also deployed the kafkaExporter. The Kafka topic has ACLs, and a consumer group (spark-kafka-source-*) is defined which is able to read from the topic.
I'm running a Spark Structured Streaming program that reads data from the Kafka topic. The problem is: when I check the KafkaExporter metric kafka_consumergroup_lag, the consumer group does not seem to show up there.
The consumer group does appear in the metric kafka_consumergroup_members:
kafka_consumergroup_members{consumergroup="spark-kafka-source-657d6441-5716-43d9-b456-73657a5534a3-594190416-driver-0",container="versa-kafka-gke-kafka-exporter",endpoint="",instance="10.40.0.65:9404",job="monitoring/kafka-resources-metrics",kubernetes_pod_name="versa-kafka-gke-kafka-exporter-84c7ffbb79-jzqjn",namespace="kafka",node_ip="10.142.0.24",node_name="gke-versa-kafka-gke-default-pool-a92b23b7-n0x2",pod="versa-kafka-gke-kafka-exporter-84c7ffbb79-jzqjn",strimzi_io_cluster="versa-kafka-gke",strimzi_io_kind="Kafka",strimzi_io_name="versa-kafka-gke-kafka-exporter"}
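One way to narrow this down is to pull the exporter's /metrics output directly (e.g. with curl against port 9404, which appears in the instance label above) and compare which consumer groups appear in kafka_consumergroup_members versus kafka_consumergroup_lag. A minimal parsing sketch; the sample text below is illustrative, not real exporter output:

```python
import re

def groups_in_metric(metrics_text: str, metric_name: str) -> set:
    """Return the set of consumergroup label values seen for a given metric."""
    pattern = re.compile(metric_name + r'\{[^}]*consumergroup="([^"]+)"')
    return set(pattern.findall(metrics_text))

# Illustrative sample of exporter output (not real data):
sample = '''
kafka_consumergroup_members{consumergroup="spark-kafka-source-657d-driver-0"} 1
kafka_consumergroup_lag{consumergroup="ss.consumer",topic="t",partition="0"} 42
'''

members = groups_in_metric(sample, "kafka_consumergroup_members")
lag = groups_in_metric(sample, "kafka_consumergroup_lag")
# Groups that are members but have no lag series (i.e. no committed offsets):
print(sorted(members - lag))  # ['spark-kafka-source-657d-driver-0']
```

If a group appears in the members set but not in the lag set, the exporter sees the group but has no committed offsets to compute lag from.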
Here are the yamls:
kafka-deployment.yaml (contains the kafkaExporter tag)
-----------------------------------------------------
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: versa-kafka-gke #1
spec:
  kafka:
    version: 3.0.0
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
        authentication:
          type: tls
      - name: external
        port: 9094
        type: loadbalancer
        tls: true
        authentication:
          type: tls
    authorization:
      type: simple
    readinessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
    livenessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      log.message.format.version: "3.0"
      inter.broker.protocol.version: "3.0"
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 500Gi
          deleteClaim: false
    logging: #9
      type: inline
      loggers:
        kafka.root.logger.level: "INFO"
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics
          key: kafka-metrics-config.yml
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 2Gi
      deleteClaim: false
    resources:
      requests:
        memory: 1Gi
        cpu: "1"
      limits:
        memory: 2Gi
        cpu: "1.5"
    logging:
      type: inline
      loggers:
        zookeeper.root.logger: "INFO"
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics
          key: zookeeper-metrics-config.yml
  entityOperator: #11
    topicOperator: {}
    userOperator: {}
  kafkaExporter:
    topicRegex: ".*"
    groupRegex: ".*"
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kafka-metrics
  labels:
    app: strimzi
data:
  kafka-metrics-config.yml: |
    # See https://github.com/prometheus/jmx_exporter for more info about JMX Prometheus Exporter metrics
    lowercaseOutputName: true
    rules:
    # Special cases and very specific rules
    - pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), topic=(.+), partition=(.*)><>Value
      name: kafka_server_$1_$2
      type: GAUGE
      labels:
        clientId: "$3"
        topic: "$4"
        partition: "$5"
    - pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), brokerHost=(.+), brokerPort=(.+)><>Value
      name: kafka_server_$1_$2
      type: GAUGE
      labels:
        clientId: "$3"
        broker: "$4:$5"
    - pattern: kafka.server<type=(.+), cipher=(.+), protocol=(.+), listener=(.+), networkProcessor=(.+)><>connections
      name: kafka_server_$1_connections_tls_info
      type: GAUGE
      labels:
        cipher: "$2"
        protocol: "$3"
        listener: "$4"
        networkProcessor: "$5"
    - pattern: kafka.server<type=(.+), clientSoftwareName=(.+), clientSoftwareVersion=(.+), listener=(.+), networkProcessor=(.+)><>connections
      name: kafka_server_$1_connections_software
      type: GAUGE
      labels:
        clientSoftwareName: "$2"
        clientSoftwareVersion: "$3"
        listener: "$4"
        networkProcessor: "$5"
    - pattern: "kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+):"
      name: kafka_server_$1_$4
      type: GAUGE
      labels:
        listener: "$2"
        networkProcessor: "$3"
    - pattern: kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+)
      name: kafka_server_$1_$4
      type: GAUGE
      labels:
        listener: "$2"
        networkProcessor: "$3"
    # Some percent metrics use MeanRate attribute
    # Ex) kafka.server<type=(KafkaRequestHandlerPool), name=(RequestHandlerAvgIdlePercent)><>MeanRate
    - pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*><>MeanRate
      name: kafka_$1_$2_$3_percent
      type: GAUGE
    # Generic gauges for percents
    - pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*><>Value
      name: kafka_$1_$2_$3_percent
      type: GAUGE
    - pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*, (.+)=(.+)><>Value
      name: kafka_$1_$2_$3_percent
      type: GAUGE
      labels:
        "$4": "$5"
    # Generic per-second counters with 0-2 key/value pairs
    - pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+), (.+)=(.+)><>Count
      name: kafka_$1_$2_$3_total
      type: COUNTER
      labels:
        "$4": "$5"
        "$6": "$7"
    - pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+)><>Count
      name: kafka_$1_$2_$3_total
      type: COUNTER
      labels:
        "$4": "$5"
    - pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*><>Count
      name: kafka_$1_$2_$3_total
      type: COUNTER
    # Generic gauges with 0-2 key/value pairs
    - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Value
      name: kafka_$1_$2_$3
      type: GAUGE
      labels:
        "$4": "$5"
        "$6": "$7"
    - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Value
      name: kafka_$1_$2_$3
      type: GAUGE
      labels:
        "$4": "$5"
    - pattern: kafka.(\w+)<type=(.+), name=(.+)><>Value
      name: kafka_$1_$2_$3
      type: GAUGE
    # Emulate Prometheus 'Summary' metrics for the exported 'Histogram's.
    # Note that these are missing the '_sum' metric!
    - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Count
      name: kafka_$1_$2_$3_count
      type: COUNTER
      labels:
        "$4": "$5"
        "$6": "$7"
    - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.*), (.+)=(.+)><>(\d+)thPercentile
      name: kafka_$1_$2_$3
      type: GAUGE
      labels:
        "$4": "$5"
        "$6": "$7"
        quantile: "0.$8"
    - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Count
      name: kafka_$1_$2_$3_count
      type: COUNTER
      labels:
        "$4": "$5"
    - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.*)><>(\d+)thPercentile
      name: kafka_$1_$2_$3
      type: GAUGE
      labels:
        "$4": "$5"
        quantile: "0.$6"
    - pattern: kafka.(\w+)<type=(.+), name=(.+)><>Count
      name: kafka_$1_$2_$3_count
      type: COUNTER
    - pattern: kafka.(\w+)<type=(.+), name=(.+)><>(\d+)thPercentile
      name: kafka_$1_$2_$3
      type: GAUGE
      labels:
        quantile: "0.$4"
  zookeeper-metrics-config.yml: |
    # See https://github.com/prometheus/jmx_exporter for more info about JMX Prometheus Exporter metrics
    lowercaseOutputName: true
    rules:
    # replicated Zookeeper
    - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+)><>(\\w+)"
      name: "zookeeper_$2"
      type: GAUGE
    - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+)><>(\\w+)"
      name: "zookeeper_$3"
      type: GAUGE
      labels:
        replicaId: "$2"
    - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+)><>(Packets\\w+)"
      name: "zookeeper_$4"
      type: COUNTER
      labels:
        replicaId: "$2"
        memberType: "$3"
    - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+)><>(\\w+)"
      name: "zookeeper_$4"
      type: GAUGE
      labels:
        replicaId: "$2"
        memberType: "$3"
    - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+), name3=(\\w+)><>(\\w+)"
      name: "zookeeper_$4_$5"
      type: GAUGE
      labels:
        replicaId: "$2"
        memberType: "$3"

kafkaUser yaml:
---------
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: syslog-vani-noacl
  labels:
    strimzi.io/cluster: versa-kafka-gke
spec:
  authentication:
    type: tls
  authorization:
    type: simple
    acls:
      # Topics and groups used by the HTTP clients through the HTTP Bridge
      # Change to match the topics used by your HTTP clients
      - resource:
          type: topic
          name: syslog.ueba-us4.v1.versa.demo3
          patternType: literal
        operation: Read
        host: "*"
      - resource:
          type: topic
          name: syslog.ueba-us4.v1.versa.demo3
          patternType: literal
        operation: Describe
        host: "*"
      - resource:
          type: topic
          name: syslog.ueba-us4.v1.versa.demo3
          patternType: literal
        operation: Write
        host: "*"
      - resource:
          type: group
          name: 'spark-kafka-source-'
          patternType: prefix
        operation: Read
        host: "*"
      - resource:
          type: group
          name: 'ss.consumer'
          patternType: literal
        operation: Read
        host: "*"
      - resource:
          type: group
          name: 'versa-console-consumer'
          patternType: literal
        operation: Read
        host: "*"

None of the consumer groups listed in the kafkaUser yaml show up in the metric -> kafka_consumergroup_lag.
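For reference, the group ACL for the Spark groups uses patternType: prefix, so the Spark-generated group name (spark-kafka-source-...-driver-0) is authorized even though its full name is never listed. A tiny sketch of that matching logic (a hypothetical helper, not Kafka's actual authorizer code):

```python
def acl_matches(acl_name: str, pattern_type: str, group: str) -> bool:
    """Check whether a group-name ACL covers a given consumer group."""
    if pattern_type == "literal":
        return group == acl_name
    if pattern_type == "prefix":
        return group.startswith(acl_name)
    raise ValueError(f"unknown patternType: {pattern_type}")

spark_group = "spark-kafka-source-657d6441-5716-43d9-b456-73657a5534a3-594190416-driver-0"
print(acl_matches("spark-kafka-source-", "prefix", spark_group))   # True
print(acl_matches("ss.consumer", "literal", "ss.consumer"))        # True
print(acl_matches("ss.consumer", "literal", "ss.consumer-2"))      # False
```

So authorization is not the issue here: the Spark group is covered by the prefix ACL, which matches its appearing in kafka_consumergroup_members.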
Is there any way to debug/fix this?
TIA!
Please note: my Spark program runs on Dataproc (i.e., not on the Kubernetes cluster where Kafka is deployed). Could that affect how kafkaExporter reports consumer group lag?
Posted on 2022-08-22 21:32:35
The Kafka Exporter exports metrics to Prometheus based on the consumer offsets committed to the __consumer_offsets topic. So it will only see a consumer group and show it in the metrics once some consumer connects to your Kafka cluster, consumes some messages, and commits the offsets.
The KafkaUser CR, on the other hand, just lists ACLs. It gives the user the right to use such a consumer group, but that does not mean the consumer group exists. It will exist only once a consumer actually uses it and commits something.
So what you are seeing is probably completely fine and expected.
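To make that concrete: lag is derived per partition as (log end offset − committed offset), so a group that never commits offsets has nothing to compute lag from. A toy sketch of that calculation (an assumed simplification, not the exporter's real implementation; topic and offset values are made up):

```python
def consumer_lag(log_end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag = log end offset - last committed offset.
    Partitions with no committed offset yield no lag entry at all."""
    return {
        tp: log_end_offsets[tp] - offset
        for tp, offset in committed_offsets.items()
        if tp in log_end_offsets
    }

log_end = {("syslog.ueba-us4.v1.versa.demo3", 0): 1000,
           ("syslog.ueba-us4.v1.versa.demo3", 1): 800}
# Partition 0 has a committed offset; partition 1 was never committed to.
committed = {("syslog.ueba-us4.v1.versa.demo3", 0): 950}
print(consumer_lag(log_end, committed))  # {('syslog.ueba-us4.v1.versa.demo3', 0): 50}
```

This is also why the location of the Spark job (Dataproc vs. in-cluster) does not matter by itself: what matters is whether the consumer commits offsets back to Kafka.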
https://stackoverflow.com/questions/73450706