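kube-state-metrics collects and exposes metrics for a k8s cluster. It watches the K8s apiserver to generate metrics data and exposes that data through the /metrics endpoint, mainly for consumption by Prometheus. 1. Download kube-state-metrics you can pull it on a machine outside mainland China and push it to a domestic image registry to pull from there) 3. Install kube-state-metrics: kubectl apply -f ./ 4. Check whether the installation succeeded: [root@k8s-node1 kube-state-metrics [root@k8s-node1 kube-state-metrics]# curl 172.16.7.134:8080/healthz OK # The /metrics endpoint shows the full set of collected data. [root@k8s-node1 kube-state-metrics]# curl 172.16.7.134:8080/metrics # Log in to another Pod and run: [root@centos-777bdddd57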
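
To make the /metrics check above more concrete, here is a minimal Python sketch that parses a few lines in the Prometheus text exposition format and groups the samples by metric name. The sample text, its values, and the helper name `parse_samples` are illustrative assumptions, not output captured from the cluster in the snippet. The parser ignores timestamps and escaped label values, so it is only a reading aid and not a replacement for a full client library.

```python
import re
from collections import defaultdict

# Matches `name{labels} value` or `name value`. Timestamps and escaped
# quotes inside label values are deliberately not handled in this sketch.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Return {metric_name: [(labels_dict, float_value), ...]}."""
    samples = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comments
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples[name].append((labels, float(value)))
    return dict(samples)


# Hypothetical excerpt of what `curl <pod-ip>:8080/metrics` might return.
EXAMPLE = """\
# HELP kube_deployment_spec_replicas Number of desired pods for a deployment.
# TYPE kube_deployment_spec_replicas gauge
kube_deployment_spec_replicas{namespace="default",deployment="web"} 3
kube_deployment_spec_replicas{namespace="default",deployment="api"} 2
"""

parsed = parse_samples(EXAMPLE)
for labels, value in parsed["kube_deployment_spec_replicas"]:
    print(labels["deployment"], value)
```

Piping the real endpoint output into `parse_samples` gives a quick way to confirm that a particular metric family is present.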
0x00 Overview: While deploying the kube-state-metrics microservice in a K8S cluster, the container log was continuously flooded with errors. The main error messages were: E0824 13:09:36.768882 1 reflector.go:205] k8s.io/kube-state-metrics/pkg/collectors/builder.go:508: Failed to list *v1.Secret: secrets is scope E0824 13:09:36.742450 1 reflector.go:205] k8s.io/kube-state-metrics/pkg/collectors/builder.go: step, kube-state-metrics was not granted cluster-level permissions; 0x02 Grant kube-state-metrics the cluster-admin role: run the following command to give system:serviceaccount \ --clusterrole=cluster-admin \ --user=system:serviceaccount:monitoring:kube-state-metrics
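
The command in the snippet is truncated, but its visible flags bind the cluster-admin ClusterRole to the user system:serviceaccount:monitoring:kube-state-metrics. As a rough sketch of what such a binding looks like as a manifest, the Python below builds an equivalent ClusterRoleBinding and prints it as JSON. The binding name `kube-state-metrics-admin` and the helper `cluster_role_binding` are hypothetical; the snippet does not show the name the author used.

```python
import json


def cluster_role_binding(binding_name, cluster_role, user):
    """Build a ClusterRoleBinding manifest as a plain dict."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": binding_name},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": cluster_role,
        },
        "subjects": [
            {
                "apiGroup": "rbac.authorization.k8s.io",
                "kind": "User",
                "name": user,
            }
        ],
    }


manifest = cluster_role_binding(
    "kube-state-metrics-admin",  # hypothetical binding name
    "cluster-admin",
    "system:serviceaccount:monitoring:kube-state-metrics",
)
print(json.dumps(manifest, indent=2))
```

kubectl accepts JSON manifests, so the printed output could be saved to a file and applied with `kubectl apply -f`. Note that cluster-admin grants far more than kube-state-metrics needs; a read-only ClusterRole limited to list and watch on the monitored resources is the narrower option.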
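When you create a TKE cluster or associate one with a TMP instance, the core monitoring components, including kube-state-metrics, are usually deployed for you automatically. The community-maintained kube-prometheus-stack chart, which bundles Prometheus and kube-state-metrics, is a popular choice. /kube-state-metrics:vX.Y.Z to an image address that is reachable from within mainland China. Before: # image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.5.0 After (example): image: ccr.ccs.tencentyun.com -f . By default, this installs kube-state-metrics into the kube-system namespace.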
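
The image substitution described above can be sketched as a small Python helper that swaps the registry host of an image reference for a mirror prefix while keeping the repository path and tag. The mirror prefix `mirror.example.com/k8s` is a placeholder assumption, since the snippet truncates the real replacement address, and `rewrite_image` is a hypothetical helper name.

```python
def split_registry(image):
    """Split an image reference into (registry, remainder).

    The first path component counts as a registry host only if it looks
    like one (contains a dot or a port, or is "localhost").
    """
    first, sep, rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    return "", image


def rewrite_image(image, mirror):
    """Point an image reference at a mirror, keeping path and tag."""
    _, path = split_registry(image)
    return f"{mirror.rstrip('/')}/{path}"


original = "registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.5.0"
# "mirror.example.com/k8s" is a placeholder, not a real mirror.
print(rewrite_image(original, "mirror.example.com/k8s"))
```

The mirror must actually host the same tag for the rewritten reference to be pullable.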
We therefore need kube-state-metrics to do this collection for us. kube-state-metrics queries the Kubernetes API (it lists and watches objects) and returns metrics about resource objects: CronJob, DaemonSet, Deployment, Job : - kind: ServiceAccount name: kube-state-metrics namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: kube-state-metrics namespace: kube-system Create the kube-state-metrics Deployment : kube-state-metrics containers: - name: kube-state-metrics image: quay.io/coreos
Check whether the allocated CPU limits exceed the cluster's total capacity. Expression: sum(kube_pod_container_resource_limits_cpu_cores{job="kube-state-metrics Expression: kube_deployment_status_observed_generation{job="kube-state-metrics"} ! Expression: ( kube_deployment_spec_replicas{job="kube-state-metrics"} ! Expression: ( kube_statefulset_status_replicas_ready{job="kube-state-metrics"} ! Expression: kube_job_spec_completions{job="kube-state-metrics",job_name=~".*loki-weixin-notify.
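
The first check above compares the CPU limits assigned to containers with what the cluster can provide. The original expression is truncated, so the Python below does not reconstruct it. It only illustrates the arithmetic such a rule might perform, assuming the limits come from kube_pod_container_resource_limits_cpu_cores and the capacity from a per-node allocatable-CPU metric. All numbers and function names are made up for illustration.

```python
def cpu_overcommit_ratio(container_cpu_limits, node_allocatable_cpus):
    """Ratio of summed container CPU limits to summed node allocatable CPU."""
    total_allocatable = sum(node_allocatable_cpus)
    if total_allocatable <= 0:
        raise ValueError("cluster has no allocatable CPU")
    return sum(container_cpu_limits) / total_allocatable


def is_overcommitted(container_cpu_limits, node_allocatable_cpus):
    """True when the limits add up to more CPU than the nodes can allocate."""
    return cpu_overcommit_ratio(container_cpu_limits, node_allocatable_cpus) > 1


# Hypothetical samples: four containers, two 4-core nodes.
limits = [0.5, 1.0, 2.0, 4.0]
nodes = [4, 4]
print(cpu_overcommit_ratio(limits, nodes))
print(is_overcommitted(limits, nodes))
```

A ratio above 1 means the limits cannot all be honored at once, which is the condition an alert of this kind would fire on.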
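kube-state-metrics: kube-state-metrics watches the API Server and generates state metrics about resource objects such as Deployment, Node, and Pod. Note that kube-state-metrics However, kube-state-metrics and metrics-server are quite different. The main differences are: kube-state-metrics focuses mainly on business-related metadata, such as Deployment whereas kube-state-metrics fetches the latest metrics of the cluster. kind: ClusterRole name: kube-state-metrics subjects: - kind: ServiceAccount name: kube-state-metrics Pod auto-discovery is already configured, so you can add the corresponding annotations to the kube-state-metrics Pod to have it discovered automatically, and then simply create it: - job_name: 'kube-state-metrics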
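
The annotation-based discovery mentioned above can be pictured with a short Python sketch. The snippet does not show which annotation keys its Prometheus configuration matches on, so the widely used `prometheus.io/scrape`, `prometheus.io/port`, and `prometheus.io/path` convention is assumed here. The default port 8080, the Pod IP, and the helper name are likewise illustrative.

```python
def discovered_target(pod_ip, annotations):
    """Return the scrape URL for a Pod, or None if it has not opted in.

    Assumes the common prometheus.io/* annotation convention; a real
    setup depends on the relabel rules in the Prometheus configuration.
    """
    if annotations.get("prometheus.io/scrape") != "true":
        return None
    port = annotations.get("prometheus.io/port", "8080")
    path = annotations.get("prometheus.io/path", "/metrics")
    return f"http://{pod_ip}:{port}{path}"


# Hypothetical annotations on the kube-state-metrics Pod.
annotations = {"prometheus.io/scrape": "true", "prometheus.io/port": "8080"}
print(discovered_target("10.244.1.23", annotations))
print(discovered_target("10.244.1.24", {}))
```

Pods without the opt-in annotation are skipped, which is why adding the annotations is enough to get kube-state-metrics scraped.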
Brief introduction: kube-state-metrics (KSM) is a simple service that listens to the Kubernetes API server and generates metrics about the state of objects. It is not concerned with the health of the individual Kubernetes components /kube-state-metrics:v2.10.1 Step 3. Replace the image with weiyigeek/kube-state-metrics:v2.10.1 and then deploy. /kube-state-metrics created # deployment.apps/kube-state-metrics created # serviceaccount/kube-state-metrics created # service/kube-state-metrics created Step 4. Verify the kube-state-metrics deployment result, for service/kube-state-metrics $ kubectl get deployment,service,pod -n kube-system -l app.kubernetes.io/name=kube-state-metrics #
docker pull registry.aliyuncs.com/chenby/ceph:v18.2.1 # k8s.gcr.io registry: # Official address: docker pull k8s.gcr.io/kube-state-metrics /kube-state-metrics:v2.8.2 # Mirror address: docker pull k8s.chenby.cn/kube-state-metrics/kube-state-metrics:v2.8.2 # Alibaba Cloud address: docker pull registry.aliyuncs.com/chenby/kube-state-metrics:v2.8.2 # registry.k8s.io
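I have N services running. This is exactly what kube-state-metrics provides: built on client-go, it lists and watches the Kubernetes API and converts Kubernetes' structured information into kube-state-metrics is an open-source add-on from the Kubernetes project. Without further ado, on to the tutorial. Deployment tutorial. Download: from the official site https://github.com/kubernetes/kube-state-metrics download the source code and deployment scripts. This guide uses release1.9.7, that is, v1.9.7 of kube-state-metrics Run cd /kube-state-metrics/examples/standard, where you will see several files: cluster-role-binding.yaml : kube-state-metrics namespace: kube-system Deploy: cd /kube-state-metrics/examples/standard kubectl create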
Check whether the allocated CPU limits exceed the cluster's total capacity. Expression: sum(kube_pod_container_resource_limits_cpu_cores{job="kube-state-metrics Expression: sum (kube_pod_container_resource_requests_memory_bytes{job="kube-state-metrics"} ) by (namespace Expression: kube_deployment_status_observed_generation{job="kube-state-metrics"} ! Expression: ( kube_deployment_spec_replicas{job="kube-state-metrics"} ! Expression: ( kube_statefulset_status_replicas_ready{job="kube-state-metrics"} !
v0.18.0 quay.io/prometheus/alertmanager:v0.18.0 # docker tag registry.cn-hangzhou.aliyuncs.com/loong576/kube-state-metrics:v1.8.0 quay.io/coreos/kube-state-metrics:v1.8.0 # docker tag registry.cn-hangzhou.aliyuncs.com/loong576 created servicemonitor.monitoring.coreos.com/grafana created clusterrole.rbac.authorization.k8s.io/kube-state-metrics created clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created deployment.apps/kube-state-metrics created service/kube-state-metrics created serviceaccount/kube-state-metrics created servicemonitor.monitoring.coreos.com
We therefore need kube-state-metrics to do this collection for us. : - kind: ServiceAccount name: kube-state-metrics namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: kube-state-metrics namespace: kube-system Create the kube-state-metrics Deployment : kube-state-metrics containers: - name: kube-state-metrics image: quay.io/coreos /kube-state-metrics:v1.6.0 ports: - name: http-metrics containerPort: 8080
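|__scheme__://__address____metrics_path__|Deploy/ds, etc.| Tip: note that dynamic discovery of the URLs monitored through kube-state-metrics relies on label-based auto-completion. Procedure: Step 1. First check the kube-state-metrics compatibility matrix against our Kubernetes cluster version (reference address); it records at most 5 kube-state-metrics releases and 5 Kubernetes versions. : labels: app.kubernetes.io/name: kube-state-metrics app.kubernetes.io/version: kubectl apply -f kube-state-metrics.yaml # deployment.apps/kube-state-metrics created # service/ Service ports 443 8080 Example: node information collected by kube-state-metrics. To verify that metrics are being collected, request the kube-state-metrics pod IP on port 8080; if the following page appears, it is working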
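
The `__scheme__://__address____metrics_path__` pattern at the start of the snippet is how Prometheus assembles a scrape URL from a target's special labels. A minimal Python sketch of that assembly follows. The address value is a made-up example and `scrape_url` is a hypothetical helper; the defaults of `http` and `/metrics` match Prometheus's documented defaults.

```python
def scrape_url(labels):
    """Assemble the scrape URL from a target's special labels."""
    scheme = labels.get("__scheme__", "http")            # default scheme
    address = labels["__address__"]                      # host:port, required
    path = labels.get("__metrics_path__", "/metrics")    # default path
    return f"{scheme}://{address}{path}"


# Hypothetical target labels for a kube-state-metrics Pod.
target = {"__address__": "10.244.1.23:8080"}
print(scrape_url(target))
```

Relabeling rules can rewrite any of these three labels before the scrape, which is what label-based discovery builds on.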
Image: docker pull quay.io/coreos/kube-state-metrics:v1.9.7 docker tag quay.io/coreos/kube-state-metrics:v1.9.7 harbor.od.com/public/kube-state-metrics:v1.9.7 docker push harbor.od.com/public/kube-state-metrics /kube-state-metrics Deployment RBAC vim /var/k8s-yaml/kube-state-metrics/deployment.yaml apiVersion: spec: containers: - name: kube-state-metrics image: harbor.od.com/public/kube-state-metrics vim /var/k8s-yaml/kube-state-metrics/rbac.yaml apiVersion: v1 kind: ServiceAccount metadata: labels
clusterrolebinding.rbac.authorization.k8s.io/prometheus created service/prometheus created 4. Create the k8s metrics % kubectl apply -f kube-state-metrics / deployment.apps/kube-state-metrics created serviceaccount/kube-state-metrics created clusterrole.rbac.authorization.k8s.io /kube-state-metrics created clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created service /kube-state-metrics created Accessing k8s resources requires k8s SSL authentication. A. Generate serving.key (umask 077; openssl genrsa -out serving.key AGE custom-metrics-apiserver ClusterIP 10.96.172.58 <none> 443/TCP 8h kube-state-metrics
/kube-state-metrics:v2.3.0 This image cannot be pulled ┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/ : Container ID: Image: k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.3.0 Image As before, we look for an equivalent image in a Docker registry and then change it with kubectl edit pod k8s.gcr.io/kube-state-metrics/kube-state-metrics replaced with: ] └─$ ansible node -m shell -a "docker pull dyrnq/kube-state-metrics:v2.3.0" 192.168.26.82 | CHANGED | rc=0 >> v2.3.0: Pulling from dyrnq/kube-state-metrics e8614d09b7be: Pulling fs layer 53ccb90bafd7:
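This section mainly introduces kube-state-metrics; monitoring practices inside applications will be summarized separately when time permits. kube-state-metrics uses client-go, the Kubernetes Go client, to obtain metrics for the various resource objects in a Kubernetes cluster. 3.1 Deploying kube-state-metrics on Kubernetes: kube-state-metrics already provides manifest definition files for deployment on Kubernetes; the file definitions are all here. kube-state-metrics, and begins scraping metrics; a Prometheus outside the cluster can of course also pull this data from the Prometheus inside the cluster. For all the metrics exposed by kube-state-metrics, see the kube-state-metrics documentation (kube-state-metrics Documentation).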
3. Create new directories to reorganize the files [root@elasticsearch01 manifests]# mkdir -p operator node-exporter alertmanager grafana kube-state-metrics / clusterrole.rbac.authorization.k8s.io/kube-state-metrics created clusterrolebinding.rbac.authorization.k8s.io /kube-state-metrics created deployment.apps/kube-state-metrics created role.rbac.authorization.k8s.io /kube-state-metrics created rolebinding.rbac.authorization.k8s.io/kube-state-metrics created service/ kube-state-metrics created serviceaccount/kube-state-metrics created [root@elasticsearch01 manifests]
kube-state-metrics: First, we need to install kube-state-metrics. This component is a service that listens to the Kubernetes API and exposes metrics about the state of each resource object. Installing kube-state-metrics is also very simple; the corresponding GitHub repository contains the installation manifests: $ git clone https://github.com/kubernetes/ /kube-state-metrics configured clusterrole.rbac.authorization.k8s.io/kube-state-metrics configured deployment.apps /kube-state-metrics configured serviceaccount/kube-state-metrics configured service/kube-state-metrics configured $ kubectl get pods -n kube-system -l app.kubernetes.io/name=kube-state-metrics NAME