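I'm running a Django application in a Kubernetes cluster on gcloud. I implemented the database migration as a Helm pre-install hook that launches my app container and runs the migration. I use cloud-sql-proxy in a sidecar pattern, as recommended in the official tutorial: https://cloud.google.com/sql/docs/mysql/connect-kubernetes-engine
Basically, this launches my app and the cloud-sql-proxy container within the pod described by the job. The problem is that cloud-sql-proxy never terminates after my app has completed the migration, so the pre-install job times out and my deployment is cancelled. How do I gracefully exit the cloud-sql-proxy container after my app container completes, so that the job can complete?
Here is my Helm pre-install hook template definition: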
apiVersion: batch/v1
kind: Job
metadata:
  name: database-migration-job
  labels:
    app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
    app.kubernetes.io/instance: {{ .Release.Name | quote }}
    app.kubernetes.io/version: {{ .Chart.AppVersion }}
    helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
  annotations:
    # This is what defines this resource as a hook. Without this line, the
    # job is considered part of the release.
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-1"
    "helm.sh/hook-delete-policy": hook-succeeded,hook-failed
spec:
  activeDeadlineSeconds: 230
  template:
    metadata:
      name: "{{ .Release.Name }}"
      labels:
        app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
        app.kubernetes.io/instance: {{ .Release.Name | quote }}
        helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    spec:
      restartPolicy: Never
      containers:
        - name: db-migrate
          image: {{ .Values.my-project.docker_repo }}{{ .Values.backend.image }}:{{ .Values.my-project.image.tag}}
          imagePullPolicy: {{ .Values.my-project.image.pullPolicy }}
          env:
            - name: DJANGO_SETTINGS_MODULE
              value: "{{ .Values.backend.django_settings_module }}"
            - name: SENDGRID_API_KEY
              valueFrom:
                secretKeyRef:
                  name: sendgrid-api-key
                  key: sendgrid-api-key
            - name: DJANGO_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: django-secret-key
                  key: django-secret-key
            - name: DB_USER
              value: {{ .Values.postgresql.postgresqlUsername }}
            - name: DB_PASSWORD
              {{- if .Values.postgresql.enabled }}
              value: {{ .Values.postgresql.postgresqlPassword }}
              {{- else }}
              valueFrom:
                secretKeyRef:
                  name: database-password
                  key: database-pwd
              {{- end }}
            - name: DB_NAME
              value: {{ .Values.postgresql.postgresqlDatabase }}
            - name: DB_HOST
              {{- if .Values.postgresql.enabled }}
              value: "postgresql"
              {{- else }}
              value: "127.0.0.1"
              {{- end }}
          workingDir: /app-root
          command: ["/bin/sh"]
          args: ["-c", "python manage.py migrate --no-input"]
        {{- if eq .Values.postgresql.enabled false }}
        - name: cloud-sql-proxy
          image: gcr.io/cloudsql-docker/gce-proxy:1.17
          command:
            - "/cloud_sql_proxy"
            - "-instances=<INSTANCE_CONNECTION_NAME>=tcp:<DB_PORT>"
            - "-credential_file=/secrets/service_account.json"
          securityContext:
            #fsGroup: 65532
            runAsNonRoot: true
            runAsUser: 65532
          volumeMounts:
            - name: db-con-mnt
              mountPath: /secrets/
              readOnly: true
      volumes:
        - name: db-con-mnt
          secret:
            secretName: db-service-account-credentials
        {{- end }}
Interestingly enough, if I kill the job with "kubectl delete job database-migration-job" after the migration is done, the helm upgrade completes and my new app version gets installed.
Posted on 2020-06-22 07:19:37
OK, I have a solution that will work but might be hacky. First of all, this is a feature Kubernetes lacks, and it is being discussed in this issue.
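With Kubernetes v1.17, containers in the same Pod can share process namespaces. This lets us kill the proxy container from the app container. Since this is a Kubernetes Job, enabling postStop handlers for the app container should not cause any anomalies.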
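Process namespace sharing is switched on per Pod through the `shareProcessNamespace` field. A minimal sketch of where it could sit in the Job template from the question, showing only the relevant fields and assuming the rest of the spec stays unchanged:

```yaml
spec:
  activeDeadlineSeconds: 230
  template:
    spec:
      restartPolicy: Never
      # Let the containers in this Pod see (and signal) each other's processes.
      shareProcessNamespace: true
      containers:
        - name: db-migrate
          # ... unchanged from the question ...
        - name: cloud-sql-proxy
          # ... unchanged from the question ...
```

Signalling a process in another container is still subject to the usual Linux permission checks, so the app container may need to run as the same user as the proxy (65532 in the question) or as root. The original post does not say which user the app image runs as.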
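With this solution, when your app finishes and exits normally (or abnormally), Kubernetes runs one last command from your dying container, which in this case is to kill another process. This should make the job complete with success or failure, depending on how you kill the process. The process exit code becomes the container exit code, which in turn basically becomes the job exit code.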
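One way to get this effect without relying on a lifecycle hook is to wrap the migration command so that it signals the proxy itself before exiting. Kubernetes documents `postStart` and `preStop` hooks, and `preStop` is not run when a container exits on its own, so the sketch below swaps the handler for a shell wrapper. It assumes `shareProcessNamespace: true` is set on the Pod, that the app image ships `pkill`, and that the app container is permitted to signal the proxy process; none of these are confirmed by the original post.

```yaml
- name: db-migrate
  # image, env and workingDir as in the question
  command: ["/bin/sh", "-c"]
  args:
    - |
      # Run the migration and remember its exit status.
      python manage.py migrate --no-input
      status=$?
      # Ask the proxy to shut down (SIGTERM). The pattern matches the
      # /cloud_sql_proxy binary started in the sidecar container.
      pkill -TERM cloud_sql_proxy
      # Propagate the migration result as the container exit code.
      exit $status
```

Whether the Job is then reported as succeeded also depends on the exit code the proxy returns after receiving SIGTERM, which is worth checking for the proxy version in use.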
https://stackoverflow.com/questions/62503682