First, my Kubernetes cluster runs on bare metal.
Cluster info:
k get no -o wide
NAME       STATUS   ROLES    AGE    VERSION    INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME
k-master   Ready    master   142d   v1.18.10   192.168.6.211   <none>        CentOS Linux 7 (Core)   3.10.0-957.el7.x86_64   docker://18.9.9
k-node-1   Ready    <none>   142d   v1.18.10   192.168.6.212   <none>        CentOS Linux 7 (Core)   3.10.0-957.el7.x86_64   docker://18.9.9
k-node-2   Ready    <none>   142d   v1.18.10   192.168.6.213   <none>        CentOS Linux 7 (Core)   3.10.0-957.el7.x86_64   docker://18.9.9

When I installed Prometheus in the cluster to monitor k8s, these were my steps:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  - networking.k8s.io
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitor
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitor

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-conf
  namespace: monitor
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s
    scrape_configs:
    - job_name: 'kubernetes-nodes'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    - job_name: 'kubernetes-service'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: service
    - job_name: 'kubernetes-endpoints'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: endpoints
    - job_name: 'kubernetes-ingress'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: ingress
    - job_name: 'kubernetes-pods'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: pod

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-server
  namespace: monitor
spec:
  selector:
    matchLabels:
      app: prometheus-server
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus-server
    spec:
      containers:
      - name: prometheus-server
        image: prom/prometheus
        volumeMounts:
        - name: conf
          mountPath: "/etc/prometheus"
          readOnly: true
        ports:
        - containerPort: 9090
      serviceAccountName: prometheus
      volumes:
      - name: conf
        configMap:
          name: prometheus-conf

The Prometheus Service and Ingress have also been created, and I can access the Prometheus web UI, but the service discovery page shows a large number of unhealthy targets.

Details for the kubernetes-nodes job: server returned HTTP status 403 Forbidden

I don't know how to fix this one or the others. Can anyone help? Thanks!
Posted on 2021-03-19 09:40:35
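You can check the kubelet's server-side certificate file:
# openssl x509 -text -noout -in /var/lib/kubelet/pki/kubelet.crt
The file name may be different on your nodes. Check the SAN section of the output, which looks like this:
X509v3 Subject Alternative Name:
    DNS:host.yourdomain.com
If only the node name is listed there, accessing the kubelet by IP will not succeed. In that case, either scrape the nodes through the api-server or regenerate the certificate file with the node IP in the SAN field.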
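
For the first option, here is a rough sketch of what the kubernetes-nodes job might look like when it goes through the api-server instead of hitting each kubelet directly. It follows the relabeling pattern used in the Prometheus project's example Kubernetes configuration. The address kubernetes.default.svc:443 is an assumption (the usual in-cluster api-server service name), so adjust it if your cluster differs. The ClusterRole in the question already grants nodes/proxy, which this approach relies on.

```yaml
- job_name: 'kubernetes-nodes'
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  # Send every scrape to the api-server rather than to the node IP.
  # Assumption: the api-server is reachable at kubernetes.default.svc:443.
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  # Rewrite the metrics path so the api-server proxies to the right kubelet.
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics
```

With this layout Prometheus only talks TLS to the api-server, whose certificate is signed by the service account CA, so the SAN list on each kubelet certificate should no longer matter for this job.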
https://stackoverflow.com/questions/66704353