Here is the full YAML file (not inlined, because it is quite long and the describe output below covers many of the important bits):
https://gist.github.com/sporkmonger/46a820f9a1ed8a73d89a319dffb24608
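It uses a public container image that I created: sporkmonger/nsq-k8s:0.3.8
The container is identical to the official NSQ image, except that it uses Debian instead of Alpine/musl to work around DNS issues, which tend to be a problem on Alpine.
Here is what I see when I describe one of the pods: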
❯ kubectl describe pod nsqd-0
Name:                       nsqd-0
Namespace:                  default
Node:                       minikube/192.168.99.100
Start Time:                 Sun, 04 Dec 2016 20:58:06 -0800
Labels:                     app=nsq
Status:                     Terminating (expires Sun, 04 Dec 2016 21:02:31 -0800)
Termination Grace Period:   60s
IP:                         172.17.0.8
Controllers:                PetSet/nsqd
Containers:
  nsqd:
    Container ID:   docker://381e4a1313e4e13a63b8a17004d79a6e828a8bc1c9e20419b319f8a9757f266b
    Image:          sporkmonger/nsq-k8s:0.3.8
    Image ID:       docker://sha256:01691a91cee3e1a6992b33a10e99baa57c5b8ce7b765849540a830f0b554e707
    Ports:          4150/TCP, 4151/TCP
    Command:
      /bin/sh
      -c
    Args:
      /usr/local/bin/nsqd
      -data-path
      /data
      -broadcast-address
      $(hostname -f)
      -lookupd-tcp-address
      nsqlookupd-0.nsqlookupd.default.svc.cluster.local:4160
      -lookupd-tcp-address
      nsqlookupd-1.nsqlookupd.default.svc.cluster.local:4160
      -lookupd-tcp-address
      nsqlookupd-2.nsqlookupd.default.svc.cluster.local:4160
    State:          Running
      Started:      Sun, 04 Dec 2016 20:58:11 -0800
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:http/ping delay=5s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:http/ping delay=1s timeout=1s period=10s #success=1 #failure=3
    Volume Mounts:
      /data from datadir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-k6ufj (ro)
    Environment Variables:  <none>
Conditions:
  Type          Status
  Initialized   True
  Ready         True
  PodScheduled  True
Volumes:
  datadir:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  datadir-nsqd-0
    ReadOnly:   false
  default-token-k6ufj:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-k6ufj
QoS Class:      BestEffort
Tolerations:    <none>
Events:
  FirstSeen  LastSeen  Count  From                  SubobjectPath          Type      Reason     Message
  ---------  --------  -----  ----                  -------------          --------  ------     -------
  4m         4m        1      {default-scheduler }                         Normal    Scheduled  Successfully assigned nsqd-0 to minikube
  4m         4m        1      {kubelet minikube}    spec.containers{nsqd}  Normal    Pulling    pulling image "sporkmonger/nsq-k8s:0.3.8"
  4m         4m        1      {kubelet minikube}    spec.containers{nsqd}  Normal    Pulled     Successfully pulled image "sporkmonger/nsq-k8s:0.3.8"
  4m         4m        1      {kubelet minikube}    spec.containers{nsqd}  Normal    Created    Created container with docker id 381e4a1313e4; Security:[seccomp=unconfined]
  4m         4m        1      {kubelet minikube}    spec.containers{nsqd}  Normal    Started    Started container with docker id 381e4a1313e4
  0s         0s        1      {kubelet minikube}    spec.containers{nsqd}  Normal    Killing    Killing container with docker id 381e4a1313e4: Need to kill pod.

A fairly representative 30 seconds of watching cluster activity:
❯ kubectl get pods -w
NAME           READY   STATUS              RESTARTS   AGE
nsqadmin-0     1/1     Running             3          33m
nsqadmin-1     1/1     Running             0          32m
nsqd-0         1/1     Running             0          6m
nsqd-1         1/1     Running             0          4m
nsqd-2         1/1     Terminating         0          1m
nsqd-3         1/1     Running             0          30s
nsqlookupd-0   1/1     Running             0          30s
NAME           READY   STATUS              RESTARTS   AGE
nsqlookupd-1   0/1     Pending             0          0s
nsqlookupd-1   0/1     Pending             0          0s
nsqlookupd-1   0/1     ContainerCreating   0          0s
nsqlookupd-1   0/1     Running             0          4s
nsqlookupd-1   1/1     Running             0          8s
nsqlookupd-2   0/1     Pending             0          0s
nsqlookupd-2   0/1     Pending             0          0s
nsqlookupd-2   0/1     ContainerCreating   0          0s
nsqlookupd-2   0/1     Terminating         0          0s
nsqd-2         0/1     Terminating         0          2m
nsqd-2         0/1     Terminating         0          2m
nsqd-2         0/1     Terminating         0          2m
nsqlookupd-2   0/1     Terminating         0          4s
nsqlookupd-2   0/1     Terminating         0          5s
nsqlookupd-2   0/1     Terminating         0          5s
nsqlookupd-2   0/1     Terminating         0          5s
nsqlookupd-1   1/1     Terminating         0          29s
nsqlookupd-1   0/1     Terminating         0          30s
nsqlookupd-1   0/1     Terminating         0          30s
nsqlookupd-1   0/1     Terminating         0          30s
nsqlookupd-0   1/1     Terminating         0          1m
nsqd-2         0/1     Pending             0          0s
nsqd-2         0/1     Pending             0          0s
nsqd-2         0/1     ContainerCreating   0          0s
nsqlookupd-0   0/1     Terminating         0          1m
nsqlookupd-0   0/1     Terminating         0          1m
nsqlookupd-0   0/1     Terminating         0          1m
nsqlookupd-0   0/1     Pending             0          0s
nsqlookupd-0   0/1     Pending             0          0s
nsqlookupd-0   0/1     ContainerCreating   0          0s
nsqd-2         0/1     Running             0          4s
nsqlookupd-0   0/1     Running             0          4s
nsqd-2         1/1     Running             0          6s
nsqlookupd-0   1/1     Running             0          10s
nsqlookupd-0   1/1     Terminating         0          10s
nsqlookupd-0   0/1     Terminating         0          11s
nsqlookupd-0   0/1     Terminating         0          11s
nsqlookupd-0   0/1     Terminating         0          11s
nsqd-2         1/1     Terminating         0          12s
nsqlookupd-0   0/1     Pending             0          0s
nsqlookupd-0   0/1     Pending             0          0s
nsqlookupd-0   0/1     ContainerCreating   0          0s
nsqlookupd-0   0/1     Running             0          3s
nsqlookupd-0   1/1     Running             0          10s

Typical container logs:
❯ kubectl logs nsqd-0
[nsqd] 2016/12/05 05:21:34.666963 nsqd v0.3.8 (built w/go1.6.2)
[nsqd] 2016/12/05 05:21:34.667170 ID: 794
[nsqd] 2016/12/05 05:21:34.667200 NSQ: persisting topic/channel metadata to nsqd.794.dat
[nsqd] 2016/12/05 05:21:34.669232 TCP: listening on [::]:4150
[nsqd] 2016/12/05 05:21:34.669284 HTTP: listening on [::]:4151
[nsqd] 2016/12/05 05:21:35.896901 200 GET /ping (172.17.0.1:51322) 1.511µs
[nsqd] 2016/12/05 05:21:40.290550 200 GET /ping (172.17.0.1:51392) 2.167µs
[nsqd] 2016/12/05 05:21:40.304599 200 GET /ping (172.17.0.1:51394) 1.856µs
[nsqd] 2016/12/05 05:21:50.289018 200 GET /ping (172.17.0.1:51452) 1.865µs
[nsqd] 2016/12/05 05:21:50.299567 200 GET /ping (172.17.0.1:51454) 1.951µs
[nsqd] 2016/12/05 05:22:00.296685 200 GET /ping (172.17.0.1:51548) 2.029µs
[nsqd] 2016/12/05 05:22:00.300842 200 GET /ping (172.17.0.1:51550) 1.464µs
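[nsqd] 2016/12/05 05:22:10.295596 200 GET /ping (172.17.0.1:51698) 2.206µs

I am at a loss as to why Kubernetes keeps killing these pods. The containers themselves do not appear to be misbehaving, and Kubernetes itself seems to be the one terminating everything here...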
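Posted on 2016-12-05 06:08:22
Figured it out.
All of my services had the same selector. Each service matched every pod, which led Kubernetes to conclude that too many were running at once, so it killed the "extra" ones at random.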
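
As a rough illustration of that fix (a sketch under stated assumptions, not the original manifest): the describe output above shows the pod labeled only app=nsq, so one way to keep selectors from overlapping is to add a second, per-component label and select on both. The `component` label key and the port name `tcp` are hypothetical names introduced here; the port numbers and the `http` port name come from the output above.

```yaml
# Sketch only: "component" is an assumed label key, not taken from the original manifest.
apiVersion: v1
kind: Service
metadata:
  name: nsqd
  labels:
    app: nsq
    component: nsqd
spec:
  clusterIP: None            # headless service
  ports:
  - port: 4150
    name: tcp                # assumed port name
  - port: 4151
    name: http
  selector:
    app: nsq
    component: nsqd          # matches nsqd pods only, not every app=nsq pod
---
apiVersion: v1
kind: Service
metadata:
  name: nsqlookupd
  labels:
    app: nsq
    component: nsqlookupd
spec:
  clusterIP: None
  ports:
  - port: 4160
    name: tcp                # assumed port name
  selector:
    app: nsq
    component: nsqlookupd    # matches nsqlookupd pods only
```

For this to work, the pod template of each PetSet would need to carry the same pair of labels, and any other selector in the manifest that matches only on app=nsq would presumably need the same narrowing. Running `kubectl get pods -l app=nsq,component=nsqd` should then list only the nsqd pods.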
https://stackoverflow.com/questions/40967740