I'm having a problem doing a rolling update of our website, which runs in a container in our cluster, called website-cluster. The cluster contains two pods: one pod has a container running our production website, and the other has a container running the staging version of the same site. Here is the YAML for the replication controller used for the production pod:
apiVersion: v1
kind: ReplicationController
metadata:
  # These labels describe the replication controller
  labels:
    project: "website-prod"
    tier: "front-end"
    name: "website"
  name: "website"
spec: # specification of the RC's contents
  replicas: 1
  selector:
    # These labels indicate which pods the replication controller manages
    project: "website-prod"
    tier: "front-end"
    name: "website"
  template:
    metadata:
      labels:
        # These labels belong to the pod, and must match the ones immediately above
        # name: "website"
        project: "website-prod"
        tier: "front-end"
        name: "website"
    spec:
      containers:
        - name: "website"
          image: "us.gcr.io/skywatch-app/website"
          ports:
            - name: "http"
              containerPort: 80
          command: ["nginx", "-g", "daemon off;"]
          livenessProbe:
            httpGet:
              path: "/"
              port: 80
            initialDelaySeconds: 60
            timeoutSeconds: 3

We made a change that adds a new page to the website. After deploying it to the production pod, we got intermittent 404s while testing the production site. We updated the pod with the following commands (assume version 95.0 is currently running):
packer build website.json
gcloud docker push us.gcr.io/skywatch-app/website
gcloud container clusters get-credentials website-cluster --zone us-central1-f
kubectl rolling-update website --update-period=20s --image=us.gcr.io/skywatch-app/website:96.0

Here is the output of these commands:
==> docker: Creating a temporary directory for sharing data...
==> docker: Pulling Docker image: nginx:1.9.7
docker: 1.9.7: Pulling from library/nginx
docker: d4bce7fd68df: Already exists
docker: a3ed95caeb02: Already exists
docker: a3ed95caeb02: Already exists
docker: 573113c4751a: Already exists
docker: 31917632be33: Already exists
docker: a3ed95caeb02: Already exists
docker: 1e7c116578c5: Already exists
docker: 03c02c160fd7: Already exists
docker: f852bb4464c4: Already exists
docker: a3ed95caeb02: Already exists
docker: a3ed95caeb02: Already exists
docker: a3ed95caeb02: Already exists
docker: Digest: sha256:3b50ebc3ae6fb29b713a708d4dc5c15f4223bde18ddbf3c8730b228093788a3c
docker: Status: Image is up to date for nginx:1.9.7
==> docker: Starting docker container...
docker: Run command: docker run -v /tmp/packer-docker358675979:/packer-files -d -i -t nginx:1.9.7 /bin/bash
docker: Container ID: 0594bf37edd1311535598971140535166df907b1c19d5f76ddda97c53f884d5b
==> docker: Provisioning with shell script: /tmp/packer-shell010711780
==> docker: Uploading nginx.conf => /etc/nginx/nginx.conf
==> docker: Uploading ../dist/ => /var/www
==> docker: Uploading ../dist => /skywatch/website
==> docker: Uploading /skywatch/ssl/ => /skywatch/ssl
==> docker: Committing the container
docker: Image ID: sha256:d469880ae311d164da6786ec73afbf9190d2056accedc9d2dc186ef8ca79c4b6
==> docker: Killing the container: 0594bf37edd1311535598971140535166df907b1c19d5f76ddda97c53f884d5b
==> docker: Running post-processor: docker-tag
docker (docker-tag): Tagging image: sha256:d469880ae311d164da6786ec73afbf9190d2056accedc9d2dc186ef8ca79c4b6
docker (docker-tag): Repository: us.gcr.io/skywatch-app/website:96.0
Build 'docker' finished.
==> Builds finished. The artifacts of successful builds are:
--> docker: Imported Docker image: sha256:d469880ae311d164da6786ec73afbf9190d2056accedc9d2dc186ef8ca79c4b6
--> docker: Imported Docker image: us.gcr.io/skywatch-app/website:96.0
[2016-05-16 15:09:39,598, INFO] The push refers to a repository [us.gcr.io/skywatch-app/website]
e75005ca29bf: Preparing
5f70bf18a086: Preparing
5f70bf18a086: Preparing
5f70bf18a086: Preparing
0b3fbb980e2d: Preparing
40f240c1cbdb: Preparing
673cf6d9dedb: Preparing
5f70bf18a086: Preparing
ebfc3a74f160: Preparing
031458dc7254: Preparing
5f70bf18a086: Preparing
5f70bf18a086: Preparing
12e469267d21: Preparing
ebfc3a74f160: Waiting
031458dc7254: Waiting
12e469267d21: Waiting
5f70bf18a086: Layer already exists
673cf6d9dedb: Layer already exists
40f240c1cbdb: Layer already exists
0b3fbb980e2d: Layer already exists
ebfc3a74f160: Layer already exists
031458dc7254: Layer already exists
12e469267d21: Layer already exists
e75005ca29bf: Pushed
96.0: digest: sha256:ff865acd292409f3b5bf3c14494a6016a45d5ea831e5260304007a2b83e21189 size: 7328
[2016-05-16 15:09:40,483, INFO] Fetching cluster endpoint and auth data.
kubeconfig entry generated for website-cluster.
[2016-05-16 15:10:18,823, INFO] Created website-8c10af72294bdfc4d2d6a0e680e84f09
Scaling up website-8c10af72294bdfc4d2d6a0e680e84f09 from 0 to 1, scaling down website from 1 to 0 (keep 1 pods available, don't exceed 2 pods)
Scaling website-8c10af72294bdfc4d2d6a0e680e84f09 up to 1
Scaling website down to 0
Update succeeded. Deleting old controller: website
Renaming website-8c10af72294bdfc4d2d6a0e680e84f09 to website
replicationcontroller "website" rolling updated

This all looks fine, but after the update completed we still got random 404s on the new page. When I ran kubectl get pods, I found that I had three pods running instead of the expected two:
NAME READY STATUS RESTARTS AGE
website-8c10af72294bdfc4d2d6a0e680e84f09-iwfjo 1/1 Running 0 1d
website-keys9 1/1 Running 0 1d
website-staging-34caf57c958848415375d54214d98b8a-yo4sp 1/1 Running 0 3d

Using kubectl describe pod, I determined that pod website-8c10af72294bdfc4d2d6a0e680e84f09-iwfjo is running the new version (96.0), while pod website-keys9 is running the old version (95.0). We got the 404s because incoming requests were randomly served by the pod running the old version of the website. When I manually deleted the pod running the old version, the 404s went away.
Does anyone know under what conditions a rolling update will fail to delete the pod running the old version of the website? Do I need to make a change to the YAML or the commands to ensure that the pod running the old version gets deleted?
Thanks for any help or suggestions.
Posted on 2016-07-30 22:40:55
You were likely bitten by Kubernetes issue #27721. But even if you weren't, you would still have a window of time during which user traffic is served by both the old and the new pods. That is fine for most applications, but in your case it is undesirable because it causes unexpected 404s. I suggest creating the new pods with a label set that differs from the old one, for example by putting the image version in a label. You can then update the Service to select the new labels; this will quickly (not atomically, but quickly) switch all traffic from the old Service backends to the new ones.
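As a sketch of that label-switching approach, here is a hypothetical Service manifest; the Service name and the version label are assumptions for illustration, not taken from the post. You would first create a new replication controller whose pod template carries version: "96.0", wait for its pods to become Ready, and then apply this selector change:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: website            # assumed Service name
spec:
  selector:
    project: "website-prod"
    tier: "front-end"
    name: "website"
    version: "96.0"        # added label: pin the selector to the new release
  ports:
    - name: http
      port: 80
      targetPort: 80
```

Because the old 95.0 pods no longer match the selector, they stop receiving traffic as soon as the Service is updated, and the old controller can then be scaled down and deleted at leisure.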
That said, it is probably easier to switch to using a Deployment.
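A minimal sketch of the same workload as a Deployment, using the modern apps/v1 API and reusing the labels, image, and probe from the RC above. The maxUnavailable: 0 / maxSurge: 1 strategy keeps one serving pod available throughout the update, and the Deployment's ReplicaSet reconciliation removes the old pods rather than leaving them behind:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: website
spec:
  replicas: 1
  selector:
    matchLabels:
      project: "website-prod"
      tier: "front-end"
      name: "website"
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # allow one extra pod during the rollout
      maxUnavailable: 0    # never drop below the desired replica count
  template:
    metadata:
      labels:
        project: "website-prod"
        tier: "front-end"
        name: "website"
    spec:
      containers:
        - name: "website"
          image: "us.gcr.io/skywatch-app/website:96.0"
          command: ["nginx", "-g", "daemon off;"]
          ports:
            - name: "http"
              containerPort: 80
          livenessProbe:
            httpGet:
              path: "/"
              port: 80
            initialDelaySeconds: 60
            timeoutSeconds: 3
```

An update then becomes kubectl set image deployment/website website=us.gcr.io/skywatch-app/website:97.0, kubectl rollout status deployment/website reports when it finishes, and kubectl rollout undo reverts a bad release.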
https://serverfault.com/questions/777346