I've been trying to set up RabbitMQ on a k8s cluster. I finally got everything set up, but the management UI shows only one node. Here are my steps:

1. Dockerfile setup
I did this to enable autocluster.

    FROM rabbitmq:3.8-rc-management-alpine
    MAINTAINER kevlai
    RUN rabbitmq-plugins --offline enable rabbitmq_peer_discovery_k8s

2. Set up RBAC

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: borecast-rabbitmq
      namespace: borecast-production
    ---
    kind: Role
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: borecast-rabbitmq
      namespace: borecast-production
    rules:
    - apiGroups:
      - ""
      resources:
      - endpoints
      verbs:
      - get
    ---
    kind: RoleBinding
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: borecast-rabbitmq
      namespace: borecast-production
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: dev
    subjects:
    - kind: ServiceAccount
      name: borecast-rabbitmq
      namespace: borecast-production

3. Set up the Secret

    apiVersion: v1
    kind: Secret
    metadata:
      name: rabbitmq-secret
      namespace: borecast-production
    type: Opaque
    data:
      username: a2V2
      password: Ym9yZWNhc3RydWx6
      secretCookie: c2VjcmV0Y29va2llaGVyZQ==

4. Set up the StorageClass
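I'm setting up a StorageClass so that k8s automatically provisions the volumes for me on AWS.

    kind: StorageClass
    apiVersion: storage.k8s.io/v1beta1
    metadata:
      name: rabbitmq-sc
    provisioner: kubernetes.io/aws-ebs
    parameters:
      type: gp2
      zone: us-east-2a
    reclaimPolicy: Retain

5. Set up the StatefulSet and Services
As you can see, there are two Services. The headless Service is for the pods themselves. The management Service is the one I expose through the Ingress controller so that it can be reached from outside.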

    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: borecast-rabbitmq-management-service
      namespace: borecast-production
      labels:
        app: borecast-rabbitmq
    spec:
      ports:
      - port: 15672
        targetPort: 15672
        name: http
      - port: 5672
        targetPort: 5672
        name: amqp
      selector:
        app: borecast-rabbitmq
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: borecast-rabbitmq-service
      namespace: borecast-production
      labels:
        app: borecast-rabbitmq
    spec:
      clusterIP: None
      ports:
      - port: 5672
        name: amqp
      selector:
        app: borecast-rabbitmq
    ---
    apiVersion: apps/v1beta1
    kind: StatefulSet
    metadata:
      name: borecast-rabbitmq
      namespace: borecast-production
    spec:
      serviceName: borecast-rabbitmq-service
      replicas: 3
      template:
        metadata:
          labels:
            app: borecast-rabbitmq
        spec:
          serviceAccountName: borecast-rabbitmq
          containers:
          - image: docker.borecast.com/borecast-rabbitmq:v1.0.3
            name: borecast-rabbitmq
            imagePullPolicy: Always
            resources:
              requests:
                memory: "256Mi"
                cpu: "150m"
              limits:
                memory: "512Mi"
                cpu: "250m"
            ports:
            - containerPort: 5672
              name: amqp
            env:
            - name: RABBITMQ_DEFAULT_USER
              valueFrom:
                secretKeyRef:
                  name: rabbitmq-secret
                  key: username
            - name: RABBITMQ_DEFAULT_PASS
              valueFrom:
                secretKeyRef:
                  name: rabbitmq-secret
                  key: password
            - name: RABBITMQ_ERLANG_COOKIE
              valueFrom:
                secretKeyRef:
                  name: rabbitmq-secret
                  key: secretCookie
            - name: MY_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: K8S_SERVICE_NAME
              # value: borecast-rabbitmq-service.borecast-production.svc.cluster.local
              value: borecast-rabbitmq-service
            - name: RABBITMQ_USE_LONGNAME
              value: "true"
            - name: RABBITMQ_NODENAME
              value: "rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME)"
              # value: rabbit@$(MY_POD_NAME).borecast-rabbitmq-service.borecast-production.svc.cluster.local
            - name: RABBITMQ_NODE_TYPE
              value: disc
            - name: AUTOCLUSTER_TYPE
              value: "k8s"
            - name: AUTOCLUSTER_DELAY
              value: "10"
            - name: AUTOCLUSTER_CLEANUP
              value: "true"
            - name: CLEANUP_WARN_ONLY
              value: "false"
            - name: K8S_ADDRESS_TYPE
              value: "hostname"
            - name: K8S_HOSTNAME_SUFFIX
              value: ".$(K8S_SERVICE_NAME)"
              # value: .borecast-rabbitmq-service.borecast-production.svc.cluster.local
            volumeMounts:
            - name: rabbitmq-volume
              mountPath: /var/lib/rabbitmq
          imagePullSecrets:
          - name: regcred
      volumeClaimTemplates:
      - metadata:
          name: rabbitmq-volume
          namespace: borecast-production
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: rabbitmq-sc
          resources:
            requests:
              storage: 5Gi

Problem
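Everything appears to work. However, when I open the management UI (that is, borecast-rabbitmq-management-service on port 15672), I see only one node, when there should be three.

Also note that the cluster name is

    rabbit@borecast-rabbitmq-0.borecast-rabbitmq-service.borecast-production.svc.cluster.local

but when I log out and log in again, the 0 in borecast-rabbitmq-0 sometimes changes to 1 or 2.
Also note that the node name is

    rabbit@borecast-rabbitmq-1.borecast-rabbitmq-service

and, you guessed it, the 1 in borecast-rabbitmq-1 is sometimes 2 or 0.
I've been trying to debug this without success. The logs of each pod show nothing suspicious, and every Service and the StatefulSet are working normally. I have repeated the five steps many times; if your cluster is on AWS, you can reproduce my setup exactly by following them (after creating the namespace borecast-production, of course). If anyone can shed some light on this, I'd be forever grateful.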
Posted on 2019-07-11 19:07:03
The problem is in the headless service name definition:

    - name: K8S_SERVICE_NAME
      # value: borecast-rabbitmq-service.borecast-production.svc.cluster.local
      value: borecast-rabbitmq-service

which is a building block of the node name:

    - name: RABBITMQ_NODENAME
      value: "rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME)"

whereas the proper node name should be the Pod's fully qualified domain name (<statefulset name>-<ordinal index>.<headless_svc_name>.<namespace>.svc.cluster.local):

    - name: RABBITMQ_NODENAME
      value: "rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local"

So you ended up with the node name

    borecast-rabbitmq-1.borecast-rabbitmq-service

instead of:

    borecast-rabbitmq-1.borecast-rabbitmq-service.borecast-production.svc.cluster.local

Use the nslookup utility from inside the cluster to look up the FQDN of the pods created by the borecast-rabbitmq StatefulSet (in other words, the pods' SRV records) to see the expected form of RABBITMQ_NODENAME.
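
The corrected RABBITMQ_NODENAME references $(MY_POD_NAMESPACE), which the question's manifest never defines. Below is a minimal sketch of the relevant env entries, assuming the namespace is injected through the downward API in the same way MY_POD_NAME already is. The K8S_HOSTNAME_SUFFIX change is an assumption based on the commented-out values in the question, not something this answer states.

```yaml
env:
  - name: MY_POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  # Assumption: MY_POD_NAMESPACE is not in the original manifest; it is
  # added here via the downward API so the node name can reference it.
  - name: MY_POD_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: K8S_SERVICE_NAME
    value: borecast-rabbitmq-service
  - name: RABBITMQ_USE_LONGNAME
    value: "true"
  # Node name built from the pod's FQDN, as the answer recommends.
  - name: RABBITMQ_NODENAME
    value: "rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local"
  # Assumption: the suffix is extended to match the FQDN, mirroring the
  # commented-out alternative in the question.
  - name: K8S_HOSTNAME_SUFFIX
    value: ".$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local"
```

Kubernetes expands a $(VAR) reference only when the variable is defined earlier in the same env list, so MY_POD_NAMESPACE has to come before RABBITMQ_NODENAME and K8S_HOSTNAME_SUFFIX.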
Posted on 2020-12-29 04:45:00
Try exposing port 4369 on the headless service.
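
Port 4369 is the default port of epmd, the Erlang port mapper daemon that RabbitMQ nodes use to locate each other. Here is a sketch of the question's headless Service with that port added. The port name epmd is an arbitrary label chosen for this example, and the answer does not say whether this change alone is enough to make the nodes cluster.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: borecast-rabbitmq-service
  namespace: borecast-production
  labels:
    app: borecast-rabbitmq
spec:
  clusterIP: None
  ports:
  - port: 5672
    name: amqp
  # Added per the suggestion above: epmd peer discovery port.
  - port: 4369
    name: epmd
  selector:
    app: borecast-rabbitmq
```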
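Posted on 2021-08-26 16:18:29
Had the same problem. Bottom line: delete the rabbitmq resources, including the PVCs created under the manifest, and reinstall everything in the manifest.
https://stackoverflow.com/questions/56971652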