Prometheus

Question

I am monitoring Docker containers with Prometheus.io. My problem is that I only get cpu_user_seconds_total or cpu_system_seconds_total.

How do I convert this ever-increasing value into a CPU percentage?

Currently I am querying:

rate(container_cpu_user_seconds_total[30s])

But I don't think it is quite right (compared to top).

So how do I convert cpu_user_seconds_total to a CPU percentage, like in top?

Answer 1

rate() returns a per-second value, so multiplying by 100 gives a percentage:

rate(container_cpu_user_seconds_total[30s]) * 100
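To make the arithmetic behind that query concrete, here is a minimal Python sketch of what "per-second rate times 100" means for a CPU-seconds counter. The function name and the sample values are made up for illustration, and the sketch is simplified: the real rate() also handles counter resets and extrapolates to the window boundaries.

```python
def cpu_percent(samples):
    """Approximate rate(counter[window]) * 100 from raw counter samples.

    samples: list of (time_seconds, cpu_seconds_total) pairs, oldest first.
    Assumes there is no counter reset inside the window.
    """
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    # CPU seconds consumed per wall-clock second, i.e. average cores in use.
    cores_used = (v_last - v_first) / (t_last - t_first)
    return cores_used * 100


# Hypothetical samples scraped every 15 s over a 30 s window.
window = [(0, 120.0), (15, 123.0), (30, 127.5)]
print(cpu_percent(window))  # 25.0
```

The counter grew by 7.5 CPU seconds in 30 wall-clock seconds, so the container used a quarter of one core on average, which is 25%.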

Answer 2

I also found this way of getting CPU usage to be accurate:

100 - (avg by (instance) (irate(node_cpu_seconds_total{job="node",mode="idle"}[5m])) * 100)

Source: http://www.robustperception.io/understanding-machine-cpu-usage/
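The query above works from the idle side: it averages the per-core idle rate for each instance and subtracts it from 100. A rough Python sketch of the same arithmetic, assuming the per-core idle fractions have already been computed (the function name and the numbers are hypothetical):

```python
def busy_percent(idle_fractions):
    """Mirror 100 - (avg(idle rate) * 100) for one instance.

    idle_fractions: one value per CPU core, each the fraction of time
    (0.0 to 1.0) that the core spent in mode="idle".
    """
    avg_idle = sum(idle_fractions) / len(idle_fractions)
    return 100 - avg_idle * 100


# Hypothetical 4-core machine: idle rates per core.
cores = [0.5, 0.75, 1.0, 0.25]
print(busy_percent(cores))  # 37.5
```

Because the idle rates are averaged across cores, the result stays between 0 and 100 for the whole machine, unlike the per-container queries, which can exceed 100%.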

Answer 3

For Windows users, with wmi_exporter:

100 - (avg by (instance) (irate(wmi_cpu_time_total{mode="idle"}[2m])) * 100)

Answer 4

Note that container_cpu_user_seconds_total and container_cpu_system_seconds_total are per-container counters, which show the CPU time used by a particular container in user space and in kernel space respectively (see these docs for more details). cAdvisor exposes an additional metric, container_cpu_usage_seconds_total. This metric equals the sum of container_cpu_user_seconds_total and container_cpu_system_seconds_total, i.e. it shows the overall CPU time used by each container. See these docs.

container_cpu_usage_seconds_total is a counter, i.e. it increases over time. This isn't very informative for determining CPU usage at a particular time. Prometheus provides the rate() function, which returns the average per-second increase rate of a counter. For example, the following query returns the average per-second increase of the per-container container_cpu_usage_seconds_total metric over the last 5 minutes (see the 5m lookbehind window in square brackets):

rate(container_cpu_usage_seconds_total[5m])

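This is basically the average number of CPU cores used during the last 5 minutes. Just multiply it by 100 to get CPU usage as a percentage. Note that the resulting value may exceed 100% if the container used more than one CPU core during the last 5 minutes.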

rate(container_cpu_usage_seconds_total[5m]) usually returns a large number of time series with many long labels in production Kubernetes, so it is better to use the following queries:

The average number of CPU cores used per pod over the last 5 minutes:

sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod)

The average number of CPU cores used per node over the last 5 minutes:

sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (node)

The average number of CPU cores used per namespace over the last 5 minutes:

sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (namespace)

The container!="" filter removes redundant metrics related to the cgroups hierarchy; see this answer for details.
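The effect of the container!="" filter combined with sum(...) by (label) can be sketched in Python as follows. This is only an illustration of the grouping logic, not how Prometheus implements it; the function name, pod names, and rate values are hypothetical.

```python
from collections import defaultdict


def sum_by(series, label):
    """Sum per-series rates grouped by one label, skipping empty containers.

    series: list of (labels_dict, rate_in_cores) pairs.
    A missing "container" label is treated like an empty one, which is
    how the container!="" matcher behaves.
    """
    totals = defaultdict(float)
    for labels, value in series:
        if labels.get("container", "") == "":
            continue  # cgroup-hierarchy aggregate, would double count
        totals[labels.get(label, "")] += value
    return dict(totals)


# Hypothetical per-container rates (in CPU cores).
series = [
    ({"pod": "web-1", "container": "app", "namespace": "prod"}, 0.5),
    ({"pod": "web-1", "container": "sidecar", "namespace": "prod"}, 0.25),
    ({"pod": "web-1", "container": "", "namespace": "prod"}, 0.75),
    ({"pod": "db-1", "container": "postgres", "namespace": "prod"}, 1.5),
]
print(sum_by(series, "pod"))        # {'web-1': 0.75, 'db-1': 1.5}
print(sum_by(series, "namespace"))  # {'prod': 2.25}
```

Without the filter, the series with the empty container label would be added on top of the two real containers, and web-1 would appear to use 1.5 cores instead of 0.75.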

Prometheus - Convert cpu_user_seconds to CPU Usage %?
