I have the following node-exporter DaemonSet:

    apiVersion: extensions/v1beta1
    kind: DaemonSet
    metadata:
      name: node-exporter
      labels:
        app: node-exporter
        tier: monitor
        category: platform
    spec:
      template:
        metadata:
          labels:
            app: node-exporter
            tier: monitor
            category: platform
          name: node-exporter
        spec:
          containers:
          - image: prom/node-exporter:0.12.0
            name: node-exporter
            ports:
            - containerPort: 9100
              hostPort: 9100
              name: scrape
          hostNetwork: true
          hostPID: true

When I run this, kube-controller-manager repeatedly outputs an error like the following:

    E1117 18:31:23.197206 1 endpoints_controller.go:513]
    Endpoints "node-exporter" is invalid:
    [subsets[0].addresses[0].nodeName: Forbidden: Cannot change NodeName for 172.17.64.5 to ip-172-17-64-5.ec2.internal,
    subsets[0].addresses[1].nodeName: Forbidden: Cannot change NodeName for 172.17.64.6 to ip-172-17-64-6.ec2.internal,
    subsets[0].addresses[2].nodeName: Forbidden: Cannot change NodeName for 172.17.80.5 to ip-172-17-80-5.ec2.internal,
    subsets[0].addresses[3].nodeName: Forbidden: Cannot change NodeName for 172.17.80.6 to ip-172-17-80-6.ec2.internal,
    subsets[0].addresses[4].nodeName: Forbidden: Cannot change NodeName for 172.17.96.6 to ip-172-17-96-6.ec2.internal]

The volume of these log lines is so large that it is hard to see any other logs in our logging console. How can I resolve this error?
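Because I built my k8s cluster from scratch, the cloud-provider=aws flag was not enabled at first. I turned it on recently, but I am not sure whether it is related to this issue.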
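
To make the error easier to read, here is a minimal Python sketch that extracts the old and new NodeName values from the message. It uses a shortened copy of the log above as sample input. The `node_name_changes` helper and the regular expression are illustrative, not part of Kubernetes. The output shows each address moving from an IP-style node name to an EC2 hostname. That would be consistent with the node names having changed after cloud-provider=aws was enabled, but this is only an assumption.

```python
import re

# Sample input: a shortened copy of the kube-controller-manager error above
# (two of the five entries).
LOG = (
    'Endpoints "node-exporter" is invalid: '
    "[subsets[0].addresses[0].nodeName: Forbidden: Cannot change NodeName for "
    "172.17.64.5 to ip-172-17-64-5.ec2.internal, "
    "subsets[0].addresses[1].nodeName: Forbidden: Cannot change NodeName for "
    "172.17.64.6 to ip-172-17-64-6.ec2.internal]"
)

# Hypothetical helper pattern: captures the old and new node names.
PATTERN = re.compile(r"Cannot change NodeName for (\S+) to ([^\s,\]]+)")


def node_name_changes(message):
    """Return a list of (old_name, new_name) pairs found in the message."""
    return PATTERN.findall(message)


for old, new in node_name_changes(LOG):
    print(f"{old} -> {new}")
```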
Posted on 2016-11-18 05:46:15
It looks like this was caused by another manifest file of mine:
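
    apiVersion: v1
    kind: Service
    metadata:
      name: node-exporter
      labels:
        app: node-exporter
        tier: monitor
        category: platform
      annotations:
        prometheus.io/scrape: 'true'
    spec:
      clusterIP: None
      ports:
      - name: scrape
        port: 9100
        protocol: TCP
      selector:
        app: node-exporter
      type: ClusterIP

I thought this Service was necessary to expose the node-exporter DaemonSet above, but it probably introduces some kind of conflict when hostNetwork: true is set in the DaemonSet (actually, the pod) manifest. I am not 100% sure, though. After I deleted this Service, the error disappeared, and I can still reach 172-17-96-6:9100 from outside the k8s cluster.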
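
To check that node-exporter is still reachable on a node's own address once the Service is gone, a plain TCP connection test is enough. The sketch below is in Python. The `is_port_open` helper is hypothetical, and the address in the example call is assumed from the error log, so replace it with one of your own nodes.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (address assumed from the error log; adjust for your nodes):
# print(is_port_open("172.17.96.6", 9100))
```

This only confirms that something is listening on the port. It does not check that the listener is actually node-exporter serving metrics.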
I was simply following this post while setting up Prometheus and node-exporter: https://coreos.com/blog/prometheus-and-kubernetes-up-and-running.html
I am leaving my notes here in case anyone else runs into the same problem.
https://stackoverflow.com/questions/40662809