Q: cAdvisor container notifications from AlertManager

Stack Overflow user

Asked 2021-08-19 10:16:15

1 answer, 640 views, 0 followers, 2 votes

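I'm using the usual monitoring tools (Prometheus, cAdvisor, AlertManager). The problem I'm facing is that every 30 minutes one server triggers ContainerCpuUsage, but unfortunately I don't know which container it is (my guess is cAdvisor itself, but its CPU usage is really low!). So my first question is: is there a way to tell AlertManager, based on the Prometheus rule, to also send the container name?

(cAdvisor itself uses more CPU than the other containers.)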

cadvisor-rule.yaml

- alert: ContainerCpuUsage
  expr: (sum(rate(container_cpu_usage_seconds_total[3m])) BY (instance, name) * 100) > 80
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Container CPU usage (instance {{ $labels.instance }})"
    description: "Container CPU usage is above 80%\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

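I tried {{ $labels.name }} and {{ $labels.job }}, but neither worked.

So, let's call the instance A, with an nginx container and a cadvisor container running on it. The monitoring tools run on another instance. How can I get the container name into the rule labels, or is there another way to achieve this?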

Tags: prometheus, monitoring, prometheus-alertmanager, cadvisor

1 Answer

Stack Overflow user

Answered 2021-10-18 12:13:17

The cAdvisor container itself can sometimes consume a lot of CPU.

  # cAdvisor can sometimes consume a lot of CPU, so this alert will fire constantly.
  # If you want to exclude it from this alert, exclude the series having an empty name: container_cpu_usage_seconds_total{name!=""}
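
One possible reading of that comment, offered as a sketch rather than a confirmed diagnosis: cAdvisor typically also exports series for cgroups that are not named containers, and those series carry no name label, so an alert fired by one of them would render {{ $labels.name }} as empty. Restricting the selector to name!="" keeps only series that have a container name. The fragment below is the rule from the question with that single matcher added; nothing else is changed.

```yaml
# Sketch: the question's rule, limited to series that carry a container name.
- alert: ContainerCpuUsage
  # name!="" drops series without a name label (assumed here to be non-container cgroups).
  expr: (sum(rate(container_cpu_usage_seconds_total{name!=""}[3m])) BY (instance, name) * 100) > 80
  for: 5m
```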

In my case, I started the cAdvisor container with --name=cadvisor and used the following rule expression:

expr: (sum(rate(container_cpu_usage_seconds_total{name!="cadvisor"}[3m])) BY (instance, name) * 100) > 80
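
To get the container name into the notification itself, a complete rule file along the following lines may help. It is a sketch under stated assumptions: the group name cadvisor-rules is made up for illustration, the name label is assumed to hold the Docker container name that cAdvisor attaches, and the two exclusions (empty name and cadvisor) are combined here even though the expression above only uses the second.

```yaml
groups:
  - name: cadvisor-rules   # hypothetical group name, not from the original post
    rules:
      - alert: ContainerCpuUsage
        # Two matchers on the same label are allowed: skip unnamed series and cAdvisor itself.
        expr: (sum(rate(container_cpu_usage_seconds_total{name!="", name!="cadvisor"}[3m])) BY (instance, name) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          # name survives the aggregation, so it can be templated into the message.
          summary: "Container CPU usage (instance {{ $labels.instance }}, container {{ $labels.name }})"
          description: "Container CPU usage is above 80%\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
```

Because the expression aggregates BY (instance, name), only those two labels remain on the resulting alert. That is consistent with {{ $labels.job }} rendering empty: the job label is dropped by the aggregation unless it is added to the BY clause. The file can be validated with promtool check rules cadvisor-rule.yaml before Prometheus is reloaded.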

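Votes: 1

Original page content provided by Stack Overflow. The Chinese translation was provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link: https://stackoverflow.com/questions/68846003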
