I am trying to deploy the following DaemonSet in my AWS K8s cluster:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  labels:
    app: prometheus
    chart: prometheus-11.12.1
    component: node-exporter
    heritage: Helm
    release: prometheus
  name: prometheus-node-exporter
  namespace: operations-tools-test
spec:
  selector:
    matchLabels:
      app: prometheus
      component: node-exporter
      release: prometheus
  template:
    metadata:
      labels:
        app: prometheus
        chart: prometheus-11.12.1
        component: node-exporter
        heritage: Helm
        release: prometheus
    spec:
      containers:
        - args:
            - --path.procfs=/host/proc
            - --path.sysfs=/host/sys
            - --web.listen-address=:9100
          image: prom/node-exporter:v1.0.1
          imagePullPolicy: IfNotPresent
          name: prometheus-node-exporter
          ports:
            - containerPort: 9100
              hostPort: 9100
              name: metrics
              protocol: TCP
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /host/proc
              name: proc
              readOnly: true
            - mountPath: /host/sys
              name: sys
              readOnly: true
      dnsPolicy: ClusterFirst
      hostNetwork: true
      hostPID: true
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: prometheus-node-exporter
      serviceAccountName: prometheus-node-exporter
      terminationGracePeriodSeconds: 30
      volumes:
        - hostPath:
            path: /proc
            type: ""
          name: proc
        - hostPath:
            path: /sys
            type: ""
          name: sys

However, after deploying it, the pod does not get deployed on one of the nodes.
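When a DaemonSet pod stays Pending, the scheduler's reason is usually visible in the DaemonSet status and the pod's events. A few commands that may help narrow it down (the `xxxxx` pod-name suffix is a placeholder, as in the manifests above):

```shell
# Desired vs. ready pod counts for the DaemonSet
kubectl get daemonset prometheus-node-exporter -n operations-tools-test

# Scheduling events for the stuck pod (replace xxxxx with the real suffix)
kubectl describe pod prometheus-node-exporter-xxxxx -n operations-tools-test

# All pods of the release, with the node each one landed on
kubectl get pods -n operations-tools-test -o wide -l app=prometheus,component=node-exporter
```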
The pod.yml for that node looks like this:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/psp: eks.privileged
  generateName: prometheus-node-exporter-
  labels:
    app: prometheus
    chart: prometheus-11.12.1
    component: node-exporter
    heritage: Helm
    pod-template-generation: "1"
    release: prometheus
  name: prometheus-node-exporter-xxxxx
  namespace: operations-tools-test
  ownerReferences:
    - apiVersion: apps/v1
      blockOwnerDeletion: true
      controller: true
      kind: DaemonSet
      name: prometheus-node-exporter
  resourceVersion: "51496903"
  selfLink: /api/v1/namespaces/namespace-x/pods/prometheus-node-exporter-xxxxx
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchFields:
              - key: metadata.name
                operator: In
                values:
                  - ip-xxx-xx-xxx-xxx.ec2.internal
  containers:
    - args:
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --web.listen-address=:9100
      image: prom/node-exporter:v1.0.1
      imagePullPolicy: IfNotPresent
      name: prometheus-node-exporter
      ports:
        - containerPort: 9100
          hostPort: 9100
          name: metrics
          protocol: TCP
      resources: {}
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
        - mountPath: /host/proc
          name: proc
          readOnly: true
        - mountPath: /host/sys
          name: sys
          readOnly: true
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: prometheus-node-exporter-token-xxxx
          readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  hostPID: true
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: prometheus-node-exporter
  serviceAccountName: prometheus-node-exporter
  terminationGracePeriodSeconds: 30
  tolerations:
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/disk-pressure
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/memory-pressure
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/pid-pressure
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/unschedulable
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/network-unavailable
      operator: Exists
  volumes:
    - hostPath:
        path: /proc
        type: ""
      name: proc
    - hostPath:
        path: /sys
        type: ""
      name: sys
    - name: prometheus-node-exporter-token-xxxxx
      secret:
        defaultMode: 420
        secretName: prometheus-node-exporter-token-xxxxx
status:
  conditions:
    - lastProbeTime: null
      lastTransitionTime: "2020-11-06T23:56:47Z"
      message: '0/4 nodes are available: 2 node(s) didn''t have free ports for the requested
        pod ports, 3 Insufficient pods, 3 node(s) didn''t match node selector.'
      reason: Unschedulable
      status: "False"
      type: PodScheduled
  phase: Pending
  qosClass: BestEffort

As seen above, nodeAffinity looks for metadata.name, which exactly matches what I have on the node.
But when I run the following command,

kubectl describe po prometheus-node-exporter-xxxxx

I get these events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 60m default-scheduler 0/4 nodes are available: 1 Insufficient pods, 3 node(s) didn't match node selector.
Warning FailedScheduling 4m46s (x37 over 58m) default-scheduler 0/4 nodes are available: 2 node(s) didn't have free ports for the requested pod ports, 3 Insufficient pods, 3 node(s) didn't match node selector.

I also checked the scheduler's CloudWatch logs and did not see any logs for my failed pod.
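The "Insufficient pods" part of that event can be checked directly: every node advertises a pod capacity, and the `hostPort: 9100` conflict can be confirmed by listing what already runs on each node. A sketch (the node name is a placeholder matching the redacted manifest above):

```shell
# Pod capacity advertised by each node
kubectl get nodes -o custom-columns=NAME:.metadata.name,PODS:.status.allocatable.pods

# How many pods are already running on the target node
kubectl get pods --all-namespaces --field-selector spec.nodeName=ip-xxx-xx-xxx-xxx.ec2.internal | wc -l

# Anything already binding hostPort 9100
kubectl get pods --all-namespaces -o wide | grep 9100
```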
The node also has enough resources:
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 520m (26%) 210m (10%)
memory 386Mi (4%) 486Mi (6%)

I don't see any reason why it shouldn't schedule a pod. Can anybody help me out here?
TIA
Posted on 2020-11-10 12:03:32
As discussed in the comments:

Please add the steps you followed to the question (any values edited in the Helm chart, etc.). Also, please check whether the node has reached the limit of pods that can be scheduled on it. You can find more reference here: link.

There is no process occupying port 9100 on the given node. @DawidKruk, the pod limit was reached. Thanks! I wish it had given me an error about that instead of the vague node-selector mismatch message.
Not entirely sure why the node-selector and free-ports messages were displayed, but the actual issue with Pods that could not be scheduled on the nodes (**Pending** state) is connected with the Insufficient pods message in the $ kubectl get events output.
That message is displayed when a node reaches its maximum pod capacity (for example: node1 can schedule a maximum of 30 pods).
More on Insufficient pods can be found in this GitHub issue comment:
That's true. That's because of the CNI implementation on EKS. The max pods number is limited by the number of network interfaces (ENIs) attached to the instance multiplied by the number of IPs per ENI, which depends on the instance size. Obviously, for small instances this number can be quite low. Docs.aws.amazon.com: AWS EC2: User Guide: Using ENI: IPs available per ENI -- https://github.com/kubernetes/autoscaler/issues/1576#issuecomment-454100551
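The EKS pod limit described in that comment follows a published formula: max pods = ENIs × (IPv4 addresses per ENI − 1) + 2. A quick sketch; the per-instance ENI/IP figures below are example values from AWS's published limits, so verify them against the current AWS documentation for your instance type:

```python
def eks_max_pods(enis: int, ips_per_eni: int) -> int:
    """Default AWS VPC CNI pod limit: one IP on each ENI is used by
    the ENI itself, and +2 accounts for host-networked system pods."""
    return enis * (ips_per_eni - 1) + 2

# Example instance types: (ENIs, IPv4 addresses per ENI)
examples = {
    "t3.small":  (3, 4),   # small instances hit the limit quickly
    "t3.medium": (3, 6),
    "m5.large":  (3, 10),
}

for itype, (enis, ips) in examples.items():
    print(itype, eks_max_pods(enis, ips))
```

With these figures, a t3.medium tops out at 17 pods, so a handful of system DaemonSets plus workloads can exhaust the node even when CPU and memory look nearly idle, which matches the "Insufficient pods" event above.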
Additional resources:
https://stackoverflow.com/questions/64724219