我不知道该怎么调试它。我有一个Kubernetes主节点和三个从节点。使用本指南https://github.com/gluster/gluster-kubernetes/blob/master/docs/setup-guide.md,我已经在三个节点上部署了一个Gluster集群。
我创建了卷,一切都正常。但是,当我重新启动一个从节点,并且该节点重新连接到主节点时,从节点内部的glusterd.service显示为死机,并且在此之后什么都不工作。
[root@kubernetes-node-1 /]# systemctl status glusterd.service
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
Active: inactive (dead)我不知道从这里做什么,例如,/var/log/glusterfs/glusterd.log上次更新是在3天前(在重启或pod deletion+recreation之后,它没有更新错误)。
我只想知道glusterd在哪里崩溃,这样我就能找出原因。
我如何调试这个崩溃?
所有节点(主节点+从节点)都在Ubuntu Desktop 18 64 bit LTS Virtualbox VMs上运行。
请求的日志(kubectl get all --all-namespaces):
NAMESPACE NAME READY STATUS RESTARTS AGE
glusterfs pod/glusterfs-7nl8l 0/1 Running 62 22h
glusterfs pod/glusterfs-wjnzx 1/1 Running 62 2d21h
glusterfs pod/glusterfs-wl4lx 1/1 Running 112 41h
glusterfs pod/heketi-7495cdc5fd-hc42h 1/1 Running 0 22h
kube-system pod/coredns-86c58d9df4-n2hpk 1/1 Running 0 6d12h
kube-system pod/coredns-86c58d9df4-rbwjq 1/1 Running 0 6d12h
kube-system pod/etcd-kubernetes-master-work 1/1 Running 0 6d12h
kube-system pod/kube-apiserver-kubernetes-master-work 1/1 Running 0 6d12h
kube-system pod/kube-controller-manager-kubernetes-master-work 1/1 Running 0 6d12h
kube-system pod/kube-flannel-ds-amd64-785q8 1/1 Running 5 3d19h
kube-system pod/kube-flannel-ds-amd64-8sj2z 1/1 Running 8 3d19h
kube-system pod/kube-flannel-ds-amd64-v62xb 1/1 Running 0 3d21h
kube-system pod/kube-flannel-ds-amd64-wx4jl 1/1 Running 7 3d21h
kube-system pod/kube-proxy-7f6d9 1/1 Running 5 3d19h
kube-system pod/kube-proxy-7sf9d 1/1 Running 0 6d12h
kube-system pod/kube-proxy-n9qxq 1/1 Running 8 3d19h
kube-system pod/kube-proxy-rwghw 1/1 Running 7 3d21h
kube-system pod/kube-scheduler-kubernetes-master-work 1/1 Running 0 6d12h
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 6d12h
elastic service/glusterfs-dynamic-9ad03769-2bb5-11e9-8710-0800276a5a8e ClusterIP 10.98.38.157 <none> 1/TCP 2d19h
elastic service/glusterfs-dynamic-a77e02ca-2bb4-11e9-8710-0800276a5a8e ClusterIP 10.97.203.225 <none> 1/TCP 2d19h
elastic service/glusterfs-dynamic-ad16ed0b-2bb6-11e9-8710-0800276a5a8e ClusterIP 10.105.149.142 <none> 1/TCP 2d19h
glusterfs service/heketi ClusterIP 10.101.79.224 <none> 8080/TCP 2d20h
glusterfs service/heketi-storage-endpoints ClusterIP 10.99.199.190 <none> 1/TCP 2d20h
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 6d12h
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
glusterfs daemonset.apps/glusterfs 3 3 0 3 0 storagenode=glusterfs 2d21h
kube-system daemonset.apps/kube-flannel-ds-amd64 4 4 4 4 4 beta.kubernetes.io/arch=amd64 3d21h
kube-system daemonset.apps/kube-flannel-ds-arm 0 0 0 0 0 beta.kubernetes.io/arch=arm 3d21h
kube-system daemonset.apps/kube-flannel-ds-arm64 0 0 0 0 0 beta.kubernetes.io/arch=arm64 3d21h
kube-system daemonset.apps/kube-flannel-ds-ppc64le 0 0 0 0 0 beta.kubernetes.io/arch=ppc64le 3d21h
kube-system daemonset.apps/kube-flannel-ds-s390x 0 0 0 0 0 beta.kubernetes.io/arch=s390x 3d21h
kube-system daemonset.apps/kube-proxy 4 4 4 4 4 <none> 6d12h
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
glusterfs deployment.apps/heketi 1/1 1 0 2d20h
kube-system deployment.apps/coredns 2/2 2 2 6d12h
NAMESPACE NAME DESIRED CURRENT READY AGE
glusterfs replicaset.apps/heketi-7495cdc5fd 1 1 0 2d20h
kube-system replicaset.apps/coredns-86c58d9df4 2 2 2 6d12h请求:
tasos@kubernetes-master-work:~$ kubectl logs -n glusterfs glusterfs-7nl8l
env variable is set. Update in gluster-blockd.service发布于 2019-02-11 19:55:41
请查看以下类似主题:
GlusterFS deployment on k8s cluster-- Readiness probe failed: /usr/local/bin/status-probe.sh
和
https://github.com/gluster/gluster-kubernetes/issues/539
检查tcmu-runner.log日志以进行调试。
更新:
我认为这将是你的问题:https://github.com/gluster/gluster-kubernetes/pull/557
PR已准备好,但未合并。
更新2:
https://github.com/gluster/glusterfs/issues/417
请确保已安装rpcbind。
https://stackoverflow.com/questions/54621863
复制相似问题