首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >当使用容器作为CRI时,kubernetes的脱机安装失败

当使用容器作为CRI时,kubernetes的脱机安装失败
EN

Server Fault用户
提问于 2021-11-24 08:25:37
回答 1查看 2.8K关注 0票数 1

由于某种原因,我不得不建立一个没有互联网连接的裸金属Kubernetes集群。

由于取消了dockershim,所以我决定使用容器作为CRI,但是在执行kubeadm init时,由于超时,kubeadm的脱机安装失败了。

代码语言:javascript
复制
    Unfortunately, an error has occurred:
            timed out waiting for the condition

    This error is likely caused by:
            - The kubelet is not running
            - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

    If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
            - 'systemctl status kubelet'
            - 'journalctl -xeu kubelet'

通过journalctl -u kubelet -f,我可以看到很多错误日志:

代码语言:javascript
复制
11 24 16:25:25 rhel8 kubelet[9299]: E1124 16:25:25.473188    9299 controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://133.117.20.57:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/rhel8?timeout=10s": dial tcp 133.117.20.57:6443: connect: connection refused
11 24 16:25:25 rhel8 kubelet[9299]: E1124 16:25:25.533555    9299 kubelet.go:2407] "Error getting node" err="node \"rhel8\" not found"
11 24 16:25:25 rhel8 kubelet[9299]: I1124 16:25:25.588986    9299 kubelet_node_status.go:71] "Attempting to register node" node="rhel8"
11 24 16:25:25 rhel8 kubelet[9299]: E1124 16:25:25.589379    9299 kubelet_node_status.go:93] "Unable to register node with API server" err="Post \"https://133.117.20.57:6443/api/v1/nodes\": dial tcp 133.117.20.57:6443: connect: connection refused" node="rhel8"
11 24 16:25:25 rhel8 kubelet[9299]: E1124 16:25:25.634625    9299 kubelet.go:2407] "Error getting node" err="node \"rhel8\" not found"
11 24 16:25:25 rhel8 kubelet[9299]: E1124 16:25:25.735613    9299 kubelet.go:2407] "Error getting node" err="node \"rhel8\" not found"
11 24 16:25:25 rhel8 kubelet[9299]: E1124 16:25:25.835815    9299 kubelet.go:2407] "Error getting node" err="node \"rhel8\" not found"
11 24 16:25:25 rhel8 kubelet[9299]: E1124 16:25:25.936552    9299 kubelet.go:2407] "Error getting node" err="node \"rhel8\" not found"
11 24 16:25:26 rhel8 kubelet[9299]: E1124 16:25:26.036989    9299 kubelet.go:2407] "Error getting node" err="node \"rhel8\" not found"
11 24 16:25:26 rhel8 kubelet[9299]: E1124 16:25:26.137464    9299 kubelet.go:2407] "Error getting node" err="node \"rhel8\" not found"
11 24 16:25:26 rhel8 kubelet[9299]: E1124 16:25:26.238594    9299 kubelet.go:2407] "Error getting node" err="node \"rhel8\" not found"
11 24 16:25:26 rhel8 kubelet[9299]: E1124 16:25:26.338704    9299 kubelet.go:2407] "Error getting node" err="node \"rhel8\" not found"
11 24 16:25:26 rhel8 kubelet[9299]: E1124 16:25:26.394465    9299 event.go:273] Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"rhel8.16ba6aab63e58bd8", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"rhel8", UID:"rhel8", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"Starting", Message:"Starting kubelet.", Source:v1.EventSource{Component:"kubelet", Host:"rhel8"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xc05f9812b2b227d8, ext:5706873656, loc:(*time.Location)(0x55a228f25680)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xc05f9812b2b227d8, ext:5706873656, loc:(*time.Location)(0x55a228f25680)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'Post "https://133.117.20.57:6443/api/v1/namespaces/default/events": dial tcp 133.117.20.57:6443: connect: connection refused'(may retry after sleeping)
11 24 16:25:27 rhel8 kubelet[9299]: E1124 16:25:27.143503    9299 kubelet.go:2407] "Error getting node" err="node \"rhel8\" not found"
11 24 16:25:27 rhel8 kubelet[9299]: E1124 16:25:27.244526    9299 kubelet.go:2407] "Error getting node" err="node \"rhel8\" not found"
11 24 16:25:27 rhel8 kubelet[9299]: E1124 16:25:27.302890    9299 remote_runtime.go:116] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"k8s.gcr.io/pause:3.2\": failed to pull image \"k8s.gcr.io/pause:3.2\": failed to pull and unpack image \"k8s.gcr.io/pause:3.2\": failed to resolve reference \"k8s.gcr.io/pause:3.2\": failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.2\": dial tcp: lookup k8s.gcr.io on [::1]:53: read udp [::1]:39732->[::1]:53: read: connection refused"
11 24 16:25:27 rhel8 kubelet[9299]: E1124 16:25:27.302949    9299 kuberuntime_sandbox.go:70] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to get sandbox image \"k8s.gcr.io/pause:3.2\": failed to pull image \"k8s.gcr.io/pause:3.2\": failed to pull and unpack image \"k8s.gcr.io/pause:3.2\": failed to resolve reference \"k8s.gcr.io/pause:3.2\": failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.2\": dial tcp: lookup k8s.gcr.io on [::1]:53: read udp [::1]:39732->[::1]:53: read: connection refused" pod="kube-system/kube-scheduler-rhel8"
11 24 16:25:27 rhel8 kubelet[9299]: E1124 16:25:27.302989    9299 kuberuntime_manager.go:815] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"k8s.gcr.io/pause:3.2\": failed to pull image \"k8s.gcr.io/pause:3.2\": failed to pull and unpack image \"k8s.gcr.io/pause:3.2\": failed to resolve reference \"k8s.gcr.io/pause:3.2\": failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.2\": dial tcp: lookup k8s.gcr.io on [::1]:53: read udp [::1]:39732->[::1]:53: read: connection refused" pod="kube-system/kube-scheduler-rhel8"
11 24 16:25:27 rhel8 kubelet[9299]: E1124 16:25:27.303080    9299 pod_workers.go:765] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"kube-scheduler-rhel8_kube-system(e5616b23d0312e4995fcb768f04aabbb)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"kube-scheduler-rhel8_kube-system(e5616b23d0312e4995fcb768f04aabbb)\\\": rpc error: code = Unknown desc = failed to get sandbox image \\\"k8s.gcr.io/pause:3.2\\\": failed to pull image \\\"k8s.gcr.io/pause:3.2\\\": failed to pull and unpack image \\\"k8s.gcr.io/pause:3.2\\\": failed to resolve reference \\\"k8s.gcr.io/pause:3.2\\\": failed to do request: Head \\\"https://k8s.gcr.io/v2/pause/manifests/3.2\\\": dial tcp: lookup k8s.gcr.io on [::1]:53: read udp [::1]:39732->[::1]:53: read: connection refused\"" pod="kube-system/kube-scheduler-rhel8" podUID=e5616b23d0312e4995fcb768f04aabbb

当我对Internet连接做同样的事情时,安装就成功了。而且,当使用docker而不是收集器时,即使没有Internet连接,安装也是成功的。

EN

回答 1

Server Fault用户

回答已采纳

发布于 2021-11-24 08:25:37

它是由容器造成的,它的设置可以在默认情况下从k8s.gcr.io中提取其k8s.gcr.io,即使没有互联网连接。

此设置是围绕/etc/containerd/config.toml文件的第57行指定的。

代码语言:javascript
复制
   [plugins."io.containerd.grpc.v1.cri"]
     
     sandbox_image = "k8s.gcr.io/pause:3.2"

我当前的k8s集群是v1.22.1,所以它使用的是pause:3.5,而不是pause:3.2。通过将其更改为现有映像(这次是k8s.gcr.io:3.5),我成功地构建了没有Internet连接的kubernetes集群。

票数 3
EN
页面原文内容由Server Fault提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://serverfault.com/questions/1084454

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档