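I have configured Prometheus Alertmanager on an Ubuntu server to monitor multiple Azure VMs. Currently, alerts for all VM instances are sent to a default email group. I need to trigger alerts
I tried several routing combinations in alertmanager.yml, but they did not work as expected.
If anyone can explain the logic behind sending group-specific alert notifications in Alertmanager, the help would be much appreciated.
Thank you for your time!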
route:
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 2h
  receiver: 'default-receiver'
  routes:
    - match:
        alertname: A_down
      receiver: TeamA
    - match:
        alertname: B_down
      receiver: TeamB

My current alertmanager.yml file:

global:
  resolve_timeout: 1m
route:
  receiver: 'email-notifications'
receivers:
  - name: 'email-notifications'
    email_configs:
      - to: alertgroups@example.com
        from: default@example.com
        smarthost: smtp.gmail.com:587
        auth_username: default@example.com
        auth_identity: default@example.com
        auth_password: password
        send_resolved: true

alertrule.yml file:

groups:
  - name: alert.rules
    rules:
      - alert: InstanceDown
        # Condition for alerting
        expr: up == 0
        for: 1m
        # Annotation - additional informational labels to store more information
        annotations:
          title: 'Instance {{ $labels.instance }} down'
          description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute.'
        # Labels - additional labels to be attached to the alert
        labels:
          severity: 'critical'
      - alert: HostOutOfMemory
        # Condition for alerting
        expr: node_memory_MemAvailable / node_memory_MemTotal * 100 < 80
        for: 5m
        # Annotation - additional informational labels to store more information
        annotations:
          title: 'Host out of memory (instance {{ $labels.instance }})'
          description: 'Node memory is filling up (< 25% left)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}'
        # Labels - additional labels to be attached to the alert
        labels:
          severity: 'warning'
      - alert: HostHighCpuLoad
        # Condition for alerting
        expr: (sum by (instance) (irate(node_cpu{job="node_exporter_metrics",mode="idle"}[5m]))) > 80
        for: 5m
        # Annotation - additional informational labels to store more information
        annotations:
          title: 'Host high CPU load (instance {{ $labels.instance }})'
          description: 'CPU load is > 30%\n VALUE = {{ $value }}\n LABELS: {{ $labels }}'
        # Labels - additional labels to be attached to the alert
        labels:
          severity: 'warning'
      - alert: HostOutOfDiskSpace
        # Condition for alerting
        expr: (node_filesystem_avail{mountpoint="/"} * 100) / node_filesystem_size{mountpoint="/"} < 70
        for: 5m
        # Annotation - additional informational labels to store more information
        annotations:
          title: 'Host out of disk space (instance {{ $labels.instance }})'
          description: 'Disk is almost full (< 50% left)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}'

Posted on 2021-09-08 00:32:13

Use this configuration:

routes:
  - match:
      alertname: A_down
    receiver:
      - default-receiver
      - TeamA
  - match:
      alertname: B_down
    receiver:
      - default-receiver
      - TeamB

Don't forget to define default-receiver, TeamA, and TeamB in the "receivers" block.

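
One caveat: as far as I know, the Alertmanager route schema documents `receiver` as a single receiver name, not a list, so the list form may be rejected when the config is loaded. Below is a minimal sketch of an alternative, assuming the goal is for A_down to reach both TeamA and the default group (and likewise for B_down and TeamB). It relies on `continue: true`, which lets an alert keep being evaluated against later sibling routes after a match, plus a final child route with no matchers that catches everything for the default group. The team email addresses are hypothetical, and the SMTP values are simply copied from the question.

```yaml
global:
  resolve_timeout: 1m
  # Shared SMTP settings so each receiver only needs a "to" address.
  smtp_smarthost: smtp.gmail.com:587
  smtp_from: default@example.com
  smtp_auth_username: default@example.com
  smtp_auth_identity: default@example.com
  smtp_auth_password: password

route:
  # Fallback receiver, used only if no child route matches.
  receiver: 'default-receiver'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 2h
  routes:
    # A_down goes to TeamA, then evaluation continues to the routes below.
    - match:
        alertname: A_down
      receiver: TeamA
      continue: true
    # B_down goes to TeamB, then evaluation continues.
    - match:
        alertname: B_down
      receiver: TeamB
      continue: true
    # No matchers: matches every alert, so the default group
    # is notified in addition to any team matched above.
    - receiver: 'default-receiver'

receivers:
  - name: 'default-receiver'
    email_configs:
      - to: alertgroups@example.com
        send_resolved: true
  - name: 'TeamA'
    email_configs:
      - to: team-a@example.com   # hypothetical address
        send_resolved: true
  - name: 'TeamB'
    email_configs:
      - to: team-b@example.com   # hypothetical address
        send_resolved: true
```

Note that the rules in alertrule.yml define alerts such as InstanceDown, not A_down or B_down, so the `alertname` values here would need to match real alert names (or the routes could match on another label such as `job`). Running `amtool check-config alertmanager.yml` before reloading should flag schema errors.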
https://stackoverflow.com/questions/69083594