I am trying to build an HA cluster with kubeadm. Here is my configuration:
kind: MasterConfiguration
kubernetesVersion: v1.11.4
apiServerCertSANs:
- "aaa.xxx.yyy.zzz"
api:
  controlPlaneEndpoint: "my.domain.de:6443"
apiServerExtraArgs:
  apiserver-count: "3"
etcd:
  local:
    image: quay.io/coreos/etcd:v3.3.10
    extraArgs:
      listen-client-urls: "https://127.0.0.1:2379,https://$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4):2379"
      advertise-client-urls: "https://$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4):2379"
      listen-peer-urls: "https://$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4):2380"
      initial-advertise-peer-urls: "https://$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4):2380"
      initial-cluster-state: "new"
      initial-cluster-token: "kubernetes-cluster"
      initial-cluster: ${CLUSTER}
      name: $(hostname -s)
    serverCertSANs:
    - "$(hostname -s)"
    - "$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)"
    peerCertSANs:
    - "$(hostname -s)"
    - "$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)"
networking:
  podSubnet: "${POD_SUBNET}/${POD_SUBNETMASK}"
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: foobar.fedcba9876543210
  ttl: 24h0m0s
  usages:
  - signing
  - authentication

I run this on all three nodes and bring them up. After adding Calico, everything looked fine, and I even successfully added a worker:
ubuntu@master-2-test2:~$ kubectl get nodes
NAME             STATUS    ROLES     AGE       VERSION
master-1-test2   Ready     master    1h        v1.11.4
master-2-test2   Ready     master    1h        v1.11.4
master-3-test2   Ready     master    1h        v1.11.4
node-1-test2     Ready     <none>    1h        v1.11.4

Looking at the control plane, everything seems fine.
curl https://192.168.0.125:6443/api/v1/nodes works from both the masters and the worker node. All pods are running:
ubuntu@master-2-test2:~$ sudo kubectl get pods -n kube-system
NAME                                     READY     STATUS    RESTARTS   AGE
calico-node-9lnk8                        2/2       Running   0          1h
calico-node-f7dkk                        2/2       Running   1          1h
calico-node-k7hw5                        2/2       Running   17         1h
calico-node-rtrvb                        2/2       Running   3          1h
coredns-78fcdf6894-6xgqc                 1/1       Running   0          1h
coredns-78fcdf6894-kcm4f                 1/1       Running   0          1h
etcd-master-1-test2                      1/1       Running   0          1h
etcd-master-2-test2                      1/1       Running   1          1h
etcd-master-3-test2                      1/1       Running   0          1h
kube-apiserver-master-1-test2            1/1       Running   0          40m
kube-apiserver-master-2-test2            1/1       Running   0          58m
kube-apiserver-master-3-test2            1/1       Running   0          36m
kube-controller-manager-master-1-test2   1/1       Running   0          17m
kube-controller-manager-master-2-test2   1/1       Running   1          17m
kube-controller-manager-master-3-test2   1/1       Running   0          17m
kube-proxy-5clt4                         1/1       Running   0          1h
kube-proxy-d2tpz                         1/1       Running   0          1h
kube-proxy-q6kjw                         1/1       Running   0          1h
kube-proxy-vn6l7                         1/1       Running   0          1h
kube-scheduler-master-1-test2            1/1       Running   1          24m
kube-scheduler-master-2-test2            1/1       Running   0          24m
kube-scheduler-master-3-test2            1/1       Running   0          24m

But when I try to start a pod, nothing happens:
~$ kubectl get deployments
NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx     1         0         0            0           32m

I turned to the scheduler and controller-manager logs, and to my dismay found plenty of errors. The controller-manager log is full of:
E1108 00:40:36.638832 1 reflector.go:205] k8s.io/kubernetes/pkg/controller/garbagecollector/graph_builder.go:124: Failed to list <nil>: Unauthorized
E1108 00:40:36.639161 1 reflector.go:205] k8s.io/kubernetes/pkg/controller/garbagecollector/graph_builder.go:124: Failed to list <nil>: Unauthorized

and occasionally:
garbagecollector.go:649] failed to discover preferred resources: Unauthorized
E1108 00:40:36.639356 1 reflector.go:205] k8s.io/kubernetes/pkg/controller/garbagecollector/graph_builder.go:124: Failed to list <nil>: Unauthorized
E1108 00:40:36.640568 1 reflector.go:205] k8s.io/kubernetes/pkg/controller/garbagecollector/graph_builder.go:124: Failed to list <nil>: Unauthorized
E1108 00:40:36.642129 1 reflector.go:205] k8s.io/kubernetes/pkg/controller/garbagecollector/graph_builder.go:124: Failed to list <nil>: Unauthorized

The scheduler shows similar errors:
E1107 23:25:43.026465 1 reflector.go:205] k8s.io/kubernetes/vendor/k8s.io/client-go/informers/factory.go:130: Failed to list *v1beta1.ReplicaSet: Get https://mydomain.de:6443/apis/extensions/v1beta1/replicasets?limit=500&resourceVersion=0: EOF
E1107 23:25:43.026614 1 reflector.go:205] k8s.io/kubernetes/vendor/k8s.io/client-go/informers/factory.go:130: Failed to list *v1.Node: Get https://mydomain.de:6443/api/v1/nodes?limit=500&resourceVersion=0: EOF

So far I have no idea how to fix these errors. Any help would be greatly appreciated.
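For context, Unauthorized here means an apiserver actively rejected the client's credentials, not that the request never arrived. The credentials the controllers fall back to are service-account JWTs, whose claims can be inspected without verifying the signature. The helper below is only a sketch with an ad-hoc name, not something from the original post:

```shell
# decode_jwt_payload TOKEN
# Print the JSON payload of a JWT (e.g. a Kubernetes service-account token)
# WITHOUT verifying its signature. Useful for checking the `iss` and
# `kubernetes.io/serviceaccount/*` claims of the token a component presents.
decode_jwt_payload() {
  local seg
  # take the middle dot-separated segment and convert base64url -> base64
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # restore the '=' padding that JWT encoding strips
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
```

Even if the decoded claims look right, the token can still be rejected when the apiserver verifying it holds a different `sa.pub` than the `sa.key` that signed it, which is a common failure mode in hand-built HA setups.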
More information:
The kubeconfig for kube-proxy is:
----
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    server: https://my.domain.de:6443
  name: default
contexts:
- context:
    cluster: default
    namespace: default
    user: default
  name: default
current-context: default
users:
- name: default
  user:
    tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
Posted on 2018-11-08 09:32:23
Looks fine to me, though. Can you describe the worker node and check the resources available for pods? Also describe the pod and look at the errors it reports.
Posted on 2018-11-08 10:40:27
There is some authentication (certificate) problem talking to the active kube-apiserver at this endpoint: https://mydomain.de:6443/apis/extensions/v1beta1/replicasets?limit=500&resourceVersion=0.
Here are some suggestions:
Does the load balancer for your kube-apiservers point to the right ones? Are you using an L4 (TCP) load balancer rather than an L7 (HTTP) one?
Did you copy the same certificates to all the masters and make sure they are identical?
USER=ubuntu # customizable
CONTROL_PLANE_IPS="10.0.0.7 10.0.0.8"
for host in ${CONTROL_PLANE_IPS}; do
    scp /etc/kubernetes/pki/ca.crt "${USER}"@$host:
    scp /etc/kubernetes/pki/ca.key "${USER}"@$host:
    scp /etc/kubernetes/pki/sa.key "${USER}"@$host:
    scp /etc/kubernetes/pki/sa.pub "${USER}"@$host:
    scp /etc/kubernetes/pki/front-proxy-ca.crt "${USER}"@$host:
    scp /etc/kubernetes/pki/front-proxy-ca.key "${USER}"@$host:
    scp /etc/kubernetes/pki/etcd/ca.crt "${USER}"@$host:etcd-ca.crt
    scp /etc/kubernetes/pki/etcd/ca.key "${USER}"@$host:etcd-ca.key
    scp /etc/kubernetes/admin.conf "${USER}"@$host:
done

Did you check that the kube-apiserver and kube-controller-manager configurations under /etc/kubernetes/manifests are identical on all masters?
https://stackoverflow.com/questions/53200161