I built a Kubernetes cluster on Ubuntu 18.04 and am facing a DNS issue: containers cannot communicate with each other using their hostnames.
Things that are working:
Resolving kubernetes.default.
Kubernetes master:
root@k8s-1:~# cat /etc/resolv.conf | grep -v ^\\#
nameserver 127.0.0.53
search home
root@k8s-1:~#
Pod:
root@k8s-1:~# kubectl exec dnsutils cat /etc/resolv.conf
nameserver 169.254.25.10
search default.svc.cluster.local svc.cluster.local cluster.local home
options ndots:5
root@k8s-1:~#
The CoreDNS pods are healthy:
root@k8s-1:~# kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-58687784f9-8rmlw 1/1 Running 0 35m
coredns-58687784f9-hp8hp 1/1 Running 0 35m
root@k8s-1:~#
Events and logs of the CoreDNS pods:
root@k8s-1:~# kubectl describe pods --namespace=kube-system -l k8s-app=kube-dns | tail -n 2
Normal Started 35m kubelet, k8s-2 Started container coredns
Warning DNSConfigForming 12s (x33 over 35m) kubelet, k8s-2 Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 4.2.2.1 4.2.2.2 208.67.220.220
root@k8s-1:~# kubectl logs --namespace=kube-system coredns-58687784f9-8rmlw
.:53
2020-02-09T22:56:14.390Z [INFO] plugin/reload: Running configuration MD5 = b9d55fc86b311e1d1a0507440727efd2
2020-02-09T22:56:14.391Z [INFO] CoreDNS-1.6.0
2020-02-09T22:56:14.391Z [INFO] linux/amd64, go1.12.7, 0a218d3
CoreDNS-1.6.0
linux/amd64, go1.12.7, 0a218d3
root@k8s-1:~#
root@k8s-1:~# kubectl logs --namespace=kube-system coredns-58687784f9-hp8hp
.:53
2020-02-09T22:56:20.388Z [INFO] plugin/reload: Running configuration MD5 = b9d55fc86b311e1d1a0507440727efd2
2020-02-09T22:56:20.388Z [INFO] CoreDNS-1.6.0
2020-02-09T22:56:20.388Z [INFO] linux/amd64, go1.12.7, 0a218d3
CoreDNS-1.6.0
linux/amd64, go1.12.7, 0a218d3
root@k8s-1:~#
CoreDNS seems to be exposed:
root@k8s-1:~# kubectl get svc --namespace=kube-system | grep coredns
coredns ClusterIP 10.233.0.3 53/UDP,53/TCP,9153/TCP 37m
root@k8s-1:~#
root@k8s-1:~# kubectl get ep coredns --namespace=kube-system
NAME ENDPOINTS AGE
coredns 10.233.64.2:53,10.233.65.3:53,10.233.64.2:53 + 3 more... 37m
root@k8s-1:~#
These are my problem pods; the whole cluster is affected by this issue:
root@k8s-1:~# kubectl get pods -o wide -n default
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 0 17m 10.233.66.7 k8s-3
dnsutils 1/1 Running 0 50m 10.233.66.5 k8s-3
nginx-86c57db685-p8zhc 1/1 Running 0 43m 10.233.64.3 k8s-1
nginx-86c57db685-st7rw 1/1 Running 0 47m 10.233.66.6 k8s-3
root@k8s-1:~#
Containers can reach each other by IP address, and can reach the internet using DNS:
root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping 10.233.64.3"
PING 10.233.64.3 (10.233.64.3) 56(84) bytes of data.
64 bytes from 10.233.64.3: icmp_seq=1 ttl=62 time=0.481 ms
64 bytes from 10.233.64.3: icmp_seq=2 ttl=62 time=0.551 ms
...
root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping google.com"
PING google.com (172.217.21.174) 56(84) bytes of data.
64 bytes from fra07s64-in-f174.1e100.net (172.217.21.174): icmp_seq=1 ttl=61 time=77.9 ms
...
root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping kubernetes.default"
PING kubernetes.default.svc.cluster.local (10.233.0.1) 56(84) bytes of data.
64 bytes from kubernetes.default.svc.cluster.local (10.233.0.1): icmp_seq=1 ttl=64 time=0.030 ms
64 bytes from kubernetes.default.svc.cluster.local (10.233.0.1): icmp_seq=2 ttl=64 time=0.069 ms
...
Actual issue:
root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping nginx-86c57db685-p8zhc"
ping: nginx-86c57db685-p8zhc: Name or service not known
command terminated with exit code 2
root@k8s-1:~#
root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping dnsutils"
ping: dnsutils: Name or service not known
command terminated with exit code 2
root@k8s-1:~#
root@k8s-1:~# kubectl exec -ti busybox -- nslookup nginx-86c57db685-p8zhc
Server: 169.254.25.10
Address: 169.254.25.10:53
** server can't find nginx-86c57db685-p8zhc.default.svc.cluster.local: NXDOMAIN
*** Can't find nginx-86c57db685-p8zhc.svc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.home: No answer
*** Can't find nginx-86c57db685-p8zhc.default.svc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.svc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.home: No answer
command terminated with exit code 1
root@k8s-1:~#
Am I missing something, or how can I fix communication between containers using hostnames?
Many thanks
Update
More checks:
root@k8s-1:~# kubectl exec -ti dnsutils -- nslookup kubernetes.default
Server: 169.254.25.10
Address: 169.254.25.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.233.0.1
I created a StatefulSet:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/application/web/web.yaml
I can resolve the "nginx" service:
root@k8s-1:~/kplay# k exec dnsutils -it nslookup nginx
Server: 169.254.25.10
Address: 169.254.25.10#53
Name: nginx.default.svc.cluster.local
Address: 10.233.66.8
Name: nginx.default.svc.cluster.local
Address: 10.233.64.3
Name: nginx.default.svc.cluster.local
Address: 10.233.65.5
Name: nginx.default.svc.cluster.local
Address: 10.233.66.6
I can also reach the StatefulSet members when using their FQDN:
root@k8s-1:~/kplay# k exec dnsutils -it nslookup web-0.nginx.default.svc.cluster.local
Server: 169.254.25.10
Address: 169.254.25.10#53
Name: web-0.nginx.default.svc.cluster.local
Address: 10.233.65.5
root@k8s-1:~/kplay# k exec dnsutils -it nslookup web-1.nginx.default.svc.cluster.local
Server: 169.254.25.10
Address: 169.254.25.10#53
Name: web-1.nginx.default.svc.cluster.local
Address: 10.233.66.8
But not when using just the hostname:
root@k8s-1:~/kplay# k exec dnsutils -it nslookup web-0
Server: 169.254.25.10
Address: 169.254.25.10#53
** server can't find web-0: NXDOMAIN
command terminated with exit code 1
root@k8s-1:~/kplay# k exec dnsutils -it nslookup web-1
Server: 169.254.25.10
Address: 169.254.25.10#53
** server can't find web-1: NXDOMAIN
command terminated with exit code 1
root@k8s-1:~/kplay#
They all live in the same namespace:
root@k8s-1:~/kplay# k get pods -n default
NAME READY STATUS RESTARTS AGE
busybox 1/1 Running 22 22h
dnsutils 1/1 Running 22 22h
nginx-86c57db685-p8zhc 1/1 Running 0 22h
nginx-86c57db685-st7rw 1/1 Running 0 22h
web-0 1/1 Running 0 11m
web-1 1/1 Running 0 10m
Another test confirms that I can reach services by name:
kubectl create deployment --image nginx some-nginx
kubectl scale deployment --replicas 2 some-nginx
kubectl expose deployment some-nginx --port=12345 --type=NodePort
root@k8s-1:~/kplay# k exec dnsutils -it nslookup some-nginx
Server: 169.254.25.10
Address: 169.254.25.10#53
Name: some-nginx.default.svc.cluster.local
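Address: 10.233.63.137
Final thoughts
Funny fact, but maybe this is how Kubernetes is supposed to work? I can reach service hostnames, and StatefulSet members if I want to reach particular pods individually. Reaching an individual pod does not seem very important when not using a StatefulSet, at least in my k8s usage (probably the same for everyone).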
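Posted on 2020-02-11 09:00:37
I suggested you follow this guide so that we could isolate any possible problem in your CoreDNS, and as you can see, it is working fine.
"Reaching an individual pod does not seem very important when not using a StatefulSet, at least in my k8s usage (probably the same for everyone)."
A pod can be reached using DNS records, but as you said, this is not very important in regular K8s implementations.
When enabled, pods are assigned a DNS A record in the form of pod-ip-address.my-namespace.pod.cluster.local. For example, a pod with IP 1.2.3.4 in the namespace default, in a cluster whose DNS name is cluster.local, would have an entry: 1-2-3-4.default.pod.cluster.local. (Source)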
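The naming rule quoted above can be sketched as a small shell helper. The function name pod_dns_name and its default arguments are hypothetical and only for illustration. It assumes an IPv4 pod address, replaces its dots with dashes, and appends the namespace and cluster domain.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl): build the pod A record name
# from a pod IPv4 address.
# Usage: pod_dns_name IP [NAMESPACE] [CLUSTER_DOMAIN]
pod_dns_name() {
    ip=$1
    namespace=${2:-default}        # assumed default namespace
    domain=${3:-cluster.local}     # assumed default cluster domain
    # Replace the dots of the address with dashes: 1.2.3.4 -> 1-2-3-4
    dashed=$(printf '%s' "$ip" | tr '.' '-')
    printf '%s.%s.pod.%s\n' "$dashed" "$namespace" "$domain"
}

# Same values as in the quoted documentation example.
pod_dns_name 1.2.3.4 default cluster.local
```

The resulting name can then be passed to nslookup from inside a pod, provided the pod records are enabled in the cluster DNS configuration.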
Example
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dnsutils 1/1 Running 20 20h 10.28.2.3 gke-lab-default-pool-87c6b085-wcp8
sample-pod 1/1 Running 0 2m11s 10.28.2.4 gke-lab-default-pool-87c6b085-wcp8
$ kubectl exec -ti dnsutils -- nslookup 10-28-2-4.default.pod.cluster.local
Server: 10.31.240.10
Address: 10.31.240.10#53
Name: 10-28-2-4.default.pod.cluster.local
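Address: 10.28.2.4
"Funny fact, but maybe this is how Kubernetes is supposed to work?"
Yes, your CoreDNS is working as expected, and everything you described is expected behavior.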
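To see why a bare pod name such as web-0 returns NXDOMAIN while web-0.nginx.default.svc.cluster.local resolves, it can help to list the names the resolver tries. The sketch below is a simplified model of how the search list and the ndots option in /etc/resolv.conf are applied; the function name dns_candidates is hypothetical, and real resolvers have more details than this.

```shell
#!/bin/sh
# Hypothetical helper: list, in order, the names a resolver would try
# for NAME given an ndots value and a search list. Simplified model.
# Usage: dns_candidates NAME NDOTS SEARCH_DOMAIN...
dns_candidates() {
    name=$1
    ndots=$2
    shift 2
    # A trailing dot marks the name as fully qualified: no search list.
    case $name in
        *.) printf '%s\n' "$name"; return 0 ;;
    esac
    # Count the dots in the name.
    only_dots=$(printf '%s' "$name" | tr -cd '.')
    dots=${#only_dots}
    # Enough dots: the name as given is tried first.
    if [ "$dots" -ge "$ndots" ]; then
        printf '%s\n' "$name"
    fi
    # Each search domain is appended in turn.
    for domain in "$@"; do
        printf '%s.%s\n' "$name" "$domain"
    done
    # Too few dots: the name as given is tried last.
    if [ "$dots" -lt "$ndots" ]; then
        printf '%s\n' "$name"
    fi
}

# ndots and search list taken from the dnsutils pod's /etc/resolv.conf.
dns_candidates web-0 5 default.svc.cluster.local svc.cluster.local cluster.local home
dns_candidates web-0.nginx 5 default.svc.cluster.local svc.cluster.local cluster.local home
```

None of the candidates for web-0 is a name the cluster DNS publishes, because pods are not registered under their bare pod name. The first candidate for web-0.nginx is web-0.nginx.default.svc.cluster.local, the StatefulSet member record shown in the question, so that shorter form would be expected to resolve from a pod in the default namespace.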
https://serverfault.com/questions/1002432