CockroachDB on single-cluster Kube pods fails with CrashLoopBackOff

Stack Overflow user
Asked 2019-01-14 01:05:16
3 answers, 362 views, 0 followers, 1 vote

I am using VirtualBox with 4 CentOS 7 installs.

After a basic single-cluster Kubernetes install, following:

https://kubernetes.io/docs/setup/independent/install-kubeadm/
https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/

[root@k8s-master cockroach]# kubectl get nodes
NAME         STATUS   ROLES    AGE   VERSION
k8s-master   Ready    master   41m   v1.13.2
k8s-slave1   Ready    <none>   39m   v1.13.2
k8s-slave2   Ready    <none>   39m   v1.13.2
k8s-slave3   Ready    <none>   39m   v1.13.2

I have created 3 NFS PVs on the master for my slaves to pick up as part of the cockroachdb statefulset.yaml, as described here:

https://www.cockroachlabs.com/blog/running-cockroachdb-on-kubernetes/

However, my cockroach pods continually fail to communicate with each other.

[root@k8s-slave1 kubernetes]# kubectl get pods
NAME            READY   STATUS             RESTARTS   AGE
cockroachdb-0   0/1     CrashLoopBackOff   6          8m47s
cockroachdb-1   0/1     CrashLoopBackOff   6          8m47s
cockroachdb-2   0/1     CrashLoopBackOff   6          8m47s

[root@k8s-slave1 kubernetes]# kubectl get pvc
NAME                    STATUS   VOLUME           CAPACITY   ACCESS MODES   STORAGECLASS   AGE
datadir-cockroachdb-0   Bound    cockroachdbpv0   10Gi       RWO                           17m
datadir-cockroachdb-1   Bound    cockroachdbpv2   10Gi       RWO                           17m
datadir-cockroachdb-2   Bound    cockroachdbpv1   10Gi       RWO                           17m

...and the cockroach pod logs do not really tell me why.

[root@k8s-slave1 kubernetes]# kubectl logs cockroachdb-0
++ hostname -f
+ exec /cockroach/cockroach start --logtostderr --insecure --advertise-host cockroachdb-0.cockroachdb.default.svc.cluster.local --http-host 0.0.0.0 --join cockroachdb-0.cockroachdb,cockroachdb-1.cockroachdb,cockroachdb-2.cockroachdb --cache 25% --max-sql-memory 25%
W190113 17:00:46.589470 1 cli/start.go:1055  RUNNING IN INSECURE MODE!

- Your cluster is open for any client that can access <all your IP addresses>.
- Any user, even root, can log in without providing a password.
- Any user, connecting as root, can read or write any data in your cluster.
- There is no network encryption nor authentication, and thus no confidentiality.

Check out how to secure your cluster: https://www.cockroachlabs.com/docs/v2.1/secure-a-cluster.html
I190113 17:00:46.595544 1 server/status/recorder.go:609  available memory from cgroups (8.0 EiB) exceeds system memory 3.7 GiB, using system memory
I190113 17:00:46.600386 1 cli/start.go:1069  CockroachDB CCL v2.1.3 (x86_64-unknown-linux-gnu, built 2018/12/17 19:15:31, go1.10.3)
I190113 17:00:46.759727 1 server/status/recorder.go:609  available memory from cgroups (8.0 EiB) exceeds system memory 3.7 GiB, using system memory
I190113 17:00:46.759809 1 server/config.go:386  system total memory: 3.7 GiB
I190113 17:00:46.759872 1 server/config.go:388  server configuration:
max offset             500000000
cache size             947 MiB
SQL memory pool size   947 MiB
scan interval          10m0s
scan min idle time     10ms
scan max idle time     1s
event log enabled      true
I190113 17:00:46.759896 1 cli/start.go:913  using local environment variables: COCKROACH_CHANNEL=kubernetes-insecure
I190113 17:00:46.759909 1 cli/start.go:920  process identity: uid 0 euid 0 gid 0 egid 0
I190113 17:00:46.759919 1 cli/start.go:545  starting cockroach node
I190113 17:00:46.762262 22 storage/engine/rocksdb.go:574  opening rocksdb instance at "/cockroach/cockroach-data/cockroach-temp632709623"
I190113 17:00:46.803749 22 server/server.go:851  [n?] monitoring forward clock jumps based on server.clock.forward_jump_check_enabled
I190113 17:00:46.804168 22 storage/engine/rocksdb.go:574  opening rocksdb instance at "/cockroach/cockroach-data"
I190113 17:00:46.828487 22 server/config.go:494  [n?] 1 storage engine initialized
I190113 17:00:46.828526 22 server/config.go:497  [n?] RocksDB cache size: 947 MiB
I190113 17:00:46.828536 22 server/config.go:497  [n?] store 0: RocksDB, max size 0 B, max open file limit 60536
W190113 17:00:46.838175 22 gossip/gossip.go:1499  [n?] no incoming or outgoing connections
I190113 17:00:46.838260 22 cli/start.go:505  initial startup completed, will now wait for `cockroach init`
or a join to a running cluster to start accepting clients.
Check the log file(s) for progress.
I190113 17:00:46.841243 22 server/server.go:1402  [n?] no stores bootstrapped and --join flag specified, awaiting init command.
W190113 17:01:16.841095 89 cli/start.go:535  The server appears to be unable to contact the other nodes in the cluster. Please try:

- starting the other nodes, if you haven't already;
- double-checking that the '--join' and '--listen'/'--advertise' flags are set up correctly;
- running the 'cockroach init' command if you are trying to initialize a new cluster.

If problems persist, please see https://www.cockroachlabs.com/docs/v2.1/cluster-setup-troubleshooting.html.
I190113 17:01:31.357765 1 cli/start.go:756  received signal 'terminated'
I190113 17:01:31.359529 1 cli/start.go:821  initiating graceful shutdown of server
initiating graceful shutdown of server
I190113 17:01:31.361064 1 cli/start.go:872  too early to drain; used hard shutdown instead
too early to drain; used hard shutdown instead

...any ideas on how to debug this further?


3 Answers

Stack Overflow user

Accepted answer

Posted 2019-01-26 19:14:23

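OK, it came down to the fact that I was using NAT as my VirtualBox external network adapter. I changed it to Bridged and everything started working perfectly. If anyone can tell me why, that would be awesome :)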
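
One plausible explanation, offered as an assumption rather than a confirmed diagnosis for this cluster: in VirtualBox's default NAT mode each VM sits behind its own private NAT and usually receives the same guest address (10.0.2.15), so the VMs cannot reach each other through that adapter and every node may register with an identical InternalIP. Bridged mode gives each VM its own address on the host's network. The sketch below checks for duplicated node IPs; the function names `node_ips` and `duplicate_ips` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: detect Kubernetes nodes that registered with the same InternalIP.
# Assumption: kubectl is installed and pointed at the cluster in question.

# Prints "<node-name><TAB><InternalIP>" for every node.
node_ips() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
}

# Reads "<node-name> <ip>" lines on stdin and prints every IP
# that is shared by more than one node.
duplicate_ips() {
  awk '{ count[$2]++ } END { for (ip in count) if (count[ip] > 1) print ip }' | sort
}

# Usage:
#   node_ips | duplicate_ips
```

If this prints an address, several nodes share it, and traffic between pods on different nodes is unlikely to be routed correctly.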

Votes: 0

Stack Overflow user

Posted 2019-01-14 02:28:53

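I went through the *.yaml file at https://github.com/cockroachdb/cockroach/blob/master/cloud/kubernetes/cockroachdb-statefulset.yaml and noticed that storageClassName is not mentioned towards the bottom, which means that during the volume claim process the pods will look for the standard storage class. I am not sure whether you used the following annotation while provisioning the 3 NFS volumes -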

storageclass.kubernetes.io/is-default-class=true

You should be able to check this with the following command:

kubectl get storageclass

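If the output does not show a Standard storage class, then I would suggest either readjusting the persistent volume definitions by adding the annotation, or adding an empty string as storageClassName towards the end of the statefulset.yaml file.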
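
As a rough sketch of the annotation route: the `storageclass.kubernetes.io/is-default-class` annotation is set on a StorageClass object, and `kubectl patch` can apply it to an existing class. The helper names below and the class name `nfs` in the usage comment are hypothetical; substitute whatever `kubectl get storageclass` actually lists.

```shell
#!/usr/bin/env bash
# Sketch: mark an existing StorageClass as the cluster default.
# Assumption: kubectl is installed and the named StorageClass already exists.

# Builds the JSON patch body; pass "true" to set the default, "false" to clear it.
default_class_patch() {
  printf '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"%s"}}}' "$1"
}

# Applies the patch to the StorageClass named in $1.
mark_default_class() {
  kubectl patch storageclass "$1" -p "$(default_class_patch true)"
}

# Usage ("nfs" is a hypothetical class name):
#   kubectl get storageclass
#   mark_default_class nfs
```

After patching, `kubectl get storageclass` should show `(default)` next to the class name.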

More logs can be viewed using the following command -

kubectl describe statefulset cockroachdb
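
For crash-looping pods specifically, the pod events and the logs of the previous container run are often more telling than the StatefulSet description. A minimal sketch, assuming the pods live in the current namespace; the function names are invented for this example:

```shell
#!/usr/bin/env bash
# Sketch: inspect every pod that is currently in CrashLoopBackOff.
# Assumption: kubectl is installed and the pods are in the current namespace.

# Reads `kubectl get pods` output on stdin (header line included) and
# prints the names of pods whose STATUS column is CrashLoopBackOff.
crashlooping_pods() {
  awk 'NR > 1 && $3 == "CrashLoopBackOff" { print $1 }'
}

# Shows events (via describe) and the previous container's logs for each such pod.
inspect_crashloops() {
  kubectl get pods | crashlooping_pods | while read -r pod; do
    kubectl describe pod "$pod"
    kubectl logs "$pod" --previous
  done
}

# Usage:
#   inspect_crashloops
```

The Events section at the end of `kubectl describe pod` typically shows whether the container was killed by a failed probe or exited on its own.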
Votes: 0

Stack Overflow user

Posted 2019-07-28 21:57:46

In my case, I used the helm chart, as follows:

$ helm install stable/cockroachdb \
  -n cockroachdb \
  --namespace cockroach \
  --set Storage=10Gi \
  --set NetworkPolicy.Enabled=true \
  --set Secure.Enabled=true

Then I waited for the CSRs for cockroach to finish being added:

$ watch kubectl get csr

Several CSRs were pending:

$ kubectl get csr
NAME                                         AGE    REQUESTOR                                                   CONDITION
cockroachdb.client.root                      130m   system:serviceaccount:cockroachdb:cockroachdb-cockroachdb   Pending
cockroachdb.node.cockroachdb-cockroachdb-0   130m   system:serviceaccount:cockroachdb:cockroachdb-cockroachdb   Pending
cockroachdb.node.cockroachdb-cockroachdb-1   129m   system:serviceaccount:cockroachdb:cockroachdb-cockroachdb   Pending
cockroachdb.node.cockroachdb-cockroachdb-2   130m   system:serviceaccount:cockroachdb:cockroachdb-cockroachdb   Pending

To approve them, run the following command:

$ kubectl get csr -o json | \
  jq -r '.items[] | select(.metadata.name | contains("cockroachdb.")) | .metadata.name' | \
  xargs -n 1 kubectl certificate approve
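
A variant without `jq`, sketched under the assumption that the CSR names start with `cockroachdb.` as in the listing above and that CONDITION is the last column of `kubectl get csr`. It only touches requests that are still Pending; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: approve only the pending CockroachDB CSRs.
# Assumption: kubectl is installed and the CSR names begin with "cockroachdb.".

# Reads `kubectl get csr --no-headers` output on stdin and prints the names
# that start with "cockroachdb." and whose last column is exactly "Pending".
pending_cockroach_csrs() {
  awk '$1 ~ /^cockroachdb\./ && $NF == "Pending" { print $1 }'
}

# Approves each matching CSR, one at a time.
approve_pending_cockroach_csrs() {
  kubectl get csr --no-headers | pending_cockroach_csrs | while read -r name; do
    kubectl certificate approve "$name"
  done
}

# Usage:
#   approve_pending_cockroach_csrs
```

Approving certificate requests grants cluster credentials, so it is worth reading the filtered list (`kubectl get csr --no-headers | pending_cockroach_csrs`) before running the approval step.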
Votes: 0
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/54171192
