
Alertmanager: sending alerts to different receivers based on routes for a specific job name

Stack Overflow user
Asked 2021-09-07 06:51:53
1 answer, 4K views, 0 followers, score 2

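I have configured Prometheus Alertmanager on an Ubuntu server to monitor multiple Azure VMs. Currently, alerts for all VM instances are sent to a default email group. I need alerts to go to:

  1. Team A (user1, user2, user3) and the default group, if server A (identified by its job name) goes down,
  2. Team B (User1, User2) and the default group, if server B goes down.

I tried a few combinations of routes in alertmanager.yml, but none of them worked as expected.

If someone could explain the logic behind sending group-specific alert notifications in Alertmanager, I would greatly appreciate it.

Thanks for your time!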

route:
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 2h

  receiver: 'default-receiver'

  routes:
  - match:
      alertname: A_down
    receiver: TeamA
  - match:
      alertname: B_down
    receiver: TeamB

My current alertmanager.yml file:

global:
 resolve_timeout: 1m

route:
 receiver: 'email-notifications'

receivers:
- name: 'email-notifications'
  email_configs:
  - to: alertgroups@example.com
    from: default@example.com
    smarthost: smtp.gmail.com:587
    auth_username: default@example.com
    auth_identity: default@example.com
    auth_password: password
    send_resolved: true

alertrule.yml file:

groups:
- name: alert.rules
  rules:
  - alert: InstanceDown
   # Condition for alerting
    expr: up == 0
    for: 1m
   # Annotation - additional informational labels to store more information
    annotations:
      title: 'Instance {{ $labels.instance }} down'
      description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute.'
   # Labels - additional labels to be attached to the alert
    labels:
        severity: 'critical'

  - alert: HostOutOfMemory
   # Condition for alerting
    expr: node_memory_MemAvailable / node_memory_MemTotal * 100 < 80
    for: 5m
   # Annotation - additional informational labels to store more information
    annotations:
      title: 'Host out of memory (instance {{ $labels.instance }})'
      description: 'Node memory is filling up (< 25% left)\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}'
   # Labels - additional labels to be attached to the alert
    labels:
        severity: 'warning'

  - alert: HostHighCpuLoad
   # Condition for alerting
    expr: (sum by (instance) (irate(node_cpu{job="node_exporter_metrics",mode="idle"}[5m]))) > 80
    for: 5m
   # Annotation - additional informational labels to store more information
    annotations:
      title: 'Host high CPU load (instance {{ $labels.instance }})'
      description: 'CPU load is > 30%\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}'
   # Labels - additional labels to be attached to the alert
    labels:
        severity: 'warning'

  - alert: HostOutOfDiskSpace
   # Condition for alerting
    expr: (node_filesystem_avail{mountpoint="/"}  * 100) / node_filesystem_size{mountpoint="/"} < 70
    for: 5m
   # Annotation - additional informational labels to store more information
    annotations:
      title: 'Host out of disk space (instance {{ $labels.instance }})'
      description: 'Disk is almost full (< 50% left)\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}'

1 Answer

Stack Overflow user

Answered 2021-09-08 00:32:13

Use this configuration:

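  routes:
  # A route's `receiver` is a single receiver name, not a list, so each
  # alert gets two routes: the first notifies the default receiver and
  # sets `continue: true` so that matching goes on to the team route.
  - match:
      alertname: A_down
    receiver: default-receiver
    continue: true
  - match:
      alertname: A_down
    receiver: TeamA
  - match:
      alertname: B_down
    receiver: default-receiver
    continue: true
  - match:
      alertname: B_down
    receiver: TeamB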

Don't forget to define default-receiver, TeamA, and TeamB in the `receivers` block.
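
As a rough sketch of what the complete file might look like, the fragment below defines the three receivers and routes on the `job` label that Prometheus attaches to scraped series, since the question asks for routing by job name. The job names `serverA` and `serverB` and the team email addresses are placeholders, not values from the question. The SMTP settings are copied from the question's config and moved under `global` so that every email receiver shares them. Each route here names a single receiver, and a first child route without matchers, marked `continue: true`, is one way to keep the default group on every alert.

```yaml
global:
  resolve_timeout: 1m
  # SMTP settings shared by all email receivers (values from the question)
  smtp_smarthost: smtp.gmail.com:587
  smtp_from: default@example.com
  smtp_auth_username: default@example.com
  smtp_auth_identity: default@example.com
  smtp_auth_password: password

route:
  # Fallback receiver, used when no child route matches
  receiver: default-receiver
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 2h

  routes:
  # No matchers: matches every alert and notifies the default group.
  # `continue: true` lets the alert also be tested against the routes below.
  - receiver: default-receiver
    continue: true
  # Placeholder job names; replace with the job_name values from prometheus.yml
  - match:
      job: serverA
    receiver: TeamA
  - match:
      job: serverB
    receiver: TeamB

receivers:
- name: default-receiver
  email_configs:
  - to: alertgroups@example.com
    send_resolved: true
- name: TeamA
  email_configs:
  # One entry per recipient; these addresses are placeholders
  - to: user1@example.com
    send_resolved: true
  - to: user2@example.com
    send_resolved: true
  - to: user3@example.com
    send_resolved: true
- name: TeamB
  email_configs:
  - to: user1@example.com
    send_resolved: true
  - to: user2@example.com
    send_resolved: true
```

With this layout, an `InstanceDown` alert carrying `job="serverA"` would be delivered to both default-receiver and TeamA, while an alert from any other job would reach default-receiver only.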

Score: 0

Original content provided by Stack Overflow; translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link: https://stackoverflow.com/questions/69083594
