I have a k8s cluster that had been running for 2 days when it started behaving strangely.
My specific problem is with kube-proxy: kube-proxy is not updating the iptables rules.
From the kube-proxy logs I can see that it fails to connect to the kubernetes-apiserver (in my case the connection is kube-proxy -> k8s API server). But the pod is still shown as Running.
Q: I expect the kube-proxy pod to go down (crash) if it cannot register for events with the apiserver.
How can I achieve this behavior through a liveness probe?
Note: after killing the pod, kube-proxy works fine again.
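For context, the manual workaround looks like this; a sketch assuming a kubeadm-style setup where the kube-proxy pods carry the k8s-app=kube-proxy label (adjust the selector for your cluster; <pod-name> is a placeholder):

# List the kube-proxy pods to find the one running on the affected node.
kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide
# Deleting the pod lets the DaemonSet recreate it, which restores iptables syncing.
kubectl -n kube-system delete pod <pod-name>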
kube-proxy logs:
sudo docker logs 1de375c94fd4 -f
W0910 15:18:22.091902 1 server.go:195] WARNING: all flags other than --config, --write-config-to, and --cleanup are deprecated. Please begin using a config file ASAP.
I0910 15:18:22.091962 1 feature_gate.go:226] feature gates: &{{} map[]}
time="2018-09-10T15:18:22Z" level=warning msg="Running modprobe ip_vs failed with message: `modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.15.0-33-generic/modules.dep.bin'\nmodprobe: WARNING: Module ip_vs not found in directory /lib/modules/4.15.0-33-generic`, error: exit status 1"
time="2018-09-10T15:18:22Z" level=error msg="Could not get ipvs family information from the kernel. It is possible that ipvs is not enabled in your kernel. Native loadbalancing will not work until this is fixed."
I0910 15:18:22.185086 1 server.go:409] Neither kubeconfig file nor master URL was specified. Falling back to in-cluster config.
I0910 15:18:22.186885 1 server_others.go:140] Using iptables Proxier.
W0910 15:18:22.438408 1 server.go:601] Failed to retrieve node info: nodes "$(node_name)" not found
W0910 15:18:22.438494 1 proxier.go:306] invalid nodeIP, initializing kube-proxy with 127.0.0.1 as nodeIP
I0910 15:18:22.438595 1 server_others.go:174] Tearing down inactive rules.
I0910 15:18:22.861478 1 server.go:444] Version: v1.10.2
I0910 15:18:22.867003 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_max' to 2883584
I0910 15:18:22.867046 1 conntrack.go:52] Setting nf_conntrack_max to 2883584
I0910 15:18:22.867267 1 conntrack.go:83] Setting conntrack hashsize to 720896
I0910 15:18:22.893396 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0910 15:18:22.893505 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0910 15:18:22.893737 1 config.go:102] Starting endpoints config controller
I0910 15:18:22.893749 1 controller_utils.go:1019] Waiting for caches to sync for endpoints config controller
I0910 15:18:22.893742 1 config.go:202] Starting service config controller
I0910 15:18:22.893765 1 controller_utils.go:1019] Waiting for caches to sync for service config controller
I0910 15:18:22.993904 1 controller_utils.go:1026] Caches are synced for endpoints config controller
I0910 15:18:22.993921 1 controller_utils.go:1026] Caches are synced for service config controller
W0910 16:13:28.276082 1 reflector.go:341] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: watch of *core.Endpoints ended with: very short watch: k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Unexpected watch close - watch lasted less than a second and no items received
W0910 16:13:28.276083 1 reflector.go:341] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: watch of *core.Service ended with: very short watch: k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Unexpected watch close - watch lasted less than a second and no items received
E0910 16:13:29.276678 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get https://127.0.0.1:6553/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:29.276677 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get https://127.0.0.1:6553/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:30.277201 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get https://127.0.0.1:6553/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:30.278009 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get https://127.0.0.1:6553/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:31.277723 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get https://127.0.0.1:6553/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:31.278574 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get https://127.0.0.1:6553/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:32.278197 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get https://127.0.0.1:6553/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:32.279134 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get https://127.0.0.1:6553/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:33.278684 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get https://127.0.0.1:6553/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
E0910 16:13:33.279587 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get https://127.0.0.1:6553/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6553: getsockopt: connection refused
Q: I expect the kube-proxy pod to go down (crash) if it cannot register for events with the apiserver.
kube-proxy is not supposed to go down. It listens for events on the apiserver and performs whatever it needs to do when a change or deployment happens. The rationale I can think of is that it may be caching information to keep the iptables rules on the system consistent. Kubernetes is designed so that if your master / kube-apiserver / master components go down, traffic should still keep flowing to the nodes without downtime.
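As a quick way to observe this from the node itself, kube-proxy already serves its own health endpoint; a minimal check, assuming the default healthz port 10256:

# kube-proxy's /healthz (default port 10256) reports whether the proxier has
# synced rules recently; as I understand it, it can keep returning 200 even
# while the apiserver is unreachable, since kube-proxy works from cached state.
curl -v http://127.0.0.1:10256/healthz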
How can I achieve this behavior through a liveness probe?
You can always add a liveness probe to the kube-proxy DaemonSet, but it's not a recommended practice:
spec:
  containers:
  - command:
    - /usr/local/bin/kube-proxy
    - --config=/var/lib/kube-proxy/config.conf
    image: k8s.gcr.io/kube-proxy-amd64:v1.11.2
    imagePullPolicy: IfNotPresent
    name: kube-proxy
    resources: {}
    securityContext:
      privileged: true
    livenessProbe:
      exec:
        command:
        # exec probes do not run a shell, so wrap the curl call in sh -c;
        # -f makes curl exit non-zero on HTTP errors so the probe can fail.
        # <apiserver> is the placeholder from the original snippet.
        - /bin/sh
        - -c
        - curl -f <apiserver>:10256/healthz
      initialDelaySeconds: 5
      periodSeconds: 5

Make sure that --healthz-port is enabled on kube-proxy.
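Since the pod above starts kube-proxy with --config, the healthz address usually lives in the KubeProxyConfiguration file rather than on the command line; a minimal fragment, where 0.0.0.0:10256 is the documented default:

# Fragment of the file passed via --config (e.g. /var/lib/kube-proxy/config.conf).
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# healthzBindAddress controls where /healthz is served; the default already
# matches the port used by the probe above.
healthzBindAddress: 0.0.0.0:10256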
https://stackoverflow.com/questions/52494732