Which formula is the standard way to compute CPU utilization with Prometheus node_exporter?

洒满阳光的午后

Last modified 2023-12-28 14:54:45

License: CC 4.0 BY-SA

Tags: prometheus node_exporter CPU

First published 2023-12-26 09:15:23

Article link: https://blog.csdn.net/sinat_32582203/article/details/135196180

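This article examines a common pitfall in computing CPU utilization with the PromQL irate and rate functions and shows that the first formula below gives inaccurate results when CPU utilization is low. It recommends the second formula, which stays consistent with the per-state times that node_exporter reads from /proc/stat.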

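Two formulas for computing CPU utilization

There are currently two commonly cited ways to compute CPU utilization with PromQL. The first uses rate or irate (rate reflects the average over the range, irate the instantaneous value; in newer versions the metric is named node_cpu_seconds_total). For a single core: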

1 - rate(node_cpu{mode="idle"}[5m])

For the whole node, averaged over all cores:

avg(1 - rate(node_cpu{mode="idle"}[5m])) by (instance)

The second, for a single core:

1 - sum(increase(node_cpu{mode="idle"}[1m])) by (cpu,instance) / sum(increase(node_cpu[1m])) by (cpu,instance)

For the whole node:

1 - sum(increase(node_cpu{mode="idle"}[1m])) by (instance) / sum(increase(node_cpu[1m])) by (instance)

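Why computing CPU utilization with irate/rate gives inaccurate or wrong results

In practice, when a node's actual CPU utilization is low, the first formula produces a value far from the real one, because the formula contains a logical error. Take 1 - rate(node_cpu{mode="idle"}[5m]) as an example: it computes 1 - (CPU idle time accumulated over 5 minutes) / (5 minutes of total CPU run time). In other words, it assumes that the time the CPU spent in all states during the 5 minutes adds up to exactly 5m.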
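The difference between the two formulas can be made concrete with a small Python sketch. The function names and the numbers below are hypothetical and chosen only for illustration: they assume a core whose counters advanced by 60 s in total during a 300 s window, 59.4 s of which was idle.

```python
def util_assuming_full_window(idle_delta_s: float, window_s: float) -> float:
    """First formula: 1 - rate(idle). Divides idle time by the wall-clock window."""
    return 1.0 - idle_delta_s / window_s


def util_from_mode_sum(idle_delta_s: float, all_modes_delta_s: float) -> float:
    """Second formula: divides idle time by the time actually accounted across all modes."""
    return 1.0 - idle_delta_s / all_modes_delta_s


# Hypothetical counter increases for one core over a 5-minute window.
window_s = 300.0           # length of the range selector [5m]
all_modes_delta_s = 60.0   # assumed increase summed over every mode
idle_delta_s = 59.4        # assumed increase of mode="idle"

first = util_assuming_full_window(idle_delta_s, window_s)
second = util_from_mode_sum(idle_delta_s, all_modes_delta_s)

print(f"first formula:  {first:.1%}")   # 80.2%
print(f"second formula: {second:.1%}")  # 1.0%
```

The two results coincide only when the summed increase across all modes equals the window length, which is exactly the assumption built into the first formula.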

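We tested on an idle 4-core machine. top showed both node-level and per-core CPU utilization at roughly 1%. Evaluating sum(increase(node_cpu[5m])) by (cpu) at that point showed that the time across all states over 5 minutes summed to only about 50-60s.

We then applied load with chaosd (chaosd attack stress cpu -l 50 -w 4). top showed node-level and per-core CPU utilization at roughly 75-85%, and sum(increase(node_cpu[5m])) by (cpu) showed that the time across all states over 5 minutes summed to about 180-190s.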

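We increased the load further (chaosd attack stress cpu -l 90 -w 4). top showed node-level and per-core CPU utilization at roughly 92-93%, and sum(increase(node_cpu[5m])) by (cpu) showed that the time across all states over 5 minutes summed to about 280-290s.

This shows that the sum of the CPU's per-state times approximates the node's wall-clock run time only when CPU utilization is high. (This problem occurs when nohz=off has been added to the grub boot parameters in /boot/grub2/grub.cfg, which can be checked with cat /proc/cmdline. Without that parameter, irate/rate still yields correct values.)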
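To check many hosts for this boot parameter, the manual cat /proc/cmdline step could be scripted. The following is a minimal Python sketch; the helper names and the sample command line are hypothetical, and it only tests for an exact whitespace-separated token.

```python
def has_kernel_param(cmdline: str, param: str) -> bool:
    """Return True if `param` appears as a whole token in a kernel command line."""
    return param in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux only)."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Hypothetical command line, used here so the example runs on any platform.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro nohz=off quiet"
print(has_kernel_param(sample, "nohz=off"))
```

On a live Linux host, has_kernel_param(read_cmdline(), "nohz=off") performs the same check against the real boot parameters.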

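Conclusion

Computing CPU utilization with irate/rate is inaccurate, and the lower the CPU utilization, the larger the error.

When computing CPU utilization from node_exporter metrics, the second formula should therefore be used.

The values of node_exporter's CPU metrics come from /proc/stat: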

# https://man7.org/linux/man-pages/man5/proc.5.html

       /proc/stat
              kernel/system statistics.  Varies with architecture.
              Common entries include:

              cpu 10132153 290696 3084719 46828483 16683 0 25195 0
              175628 0
              cpu0 1393280 32966 572056 13343292 6130 0 17875 0 23933 0
                     The amount of time, measured in units of USER_HZ
                     (1/100ths of a second on most architectures, use
                     sysconf(_SC_CLK_TCK) to obtain the right value),
                     that the system ("cpu" line) or the specific CPU
                     ("cpuN" line) spent in various states:

                     user   (1) Time spent in user mode.

                     nice   (2) Time spent in user mode with low
                            priority (nice).

                     system (3) Time spent in system mode.

                     idle   (4) Time spent in the idle task.  This value
                            should be USER_HZ times the second entry in
                            the /proc/uptime pseudo-file.

                     iowait (since Linux 2.5.41)
                            (5) Time waiting for I/O to complete.  This
                            value is not reliable, for the following
                            reasons:

                            •  The CPU will not wait for I/O to
                               complete; iowait is the time that a task
                               is waiting for I/O to complete.  When a
                               CPU goes into idle state for outstanding
                               task I/O, another task will be scheduled
                               on this CPU.

                            •  On a multi-core CPU, the task waiting for
                               I/O to complete is not running on any
                               CPU, so the iowait of each CPU is
                               difficult to calculate.

                            •  The value in this field may decrease in
                               certain conditions.

                     irq (since Linux 2.6.0)
                            (6) Time servicing interrupts.

                     softirq (since Linux 2.6.0)
                            (7) Time servicing softirqs.

                     steal (since Linux 2.6.11)
                            (8) Stolen time, which is the time spent in
                            other operating systems when running in a
                            virtualized environment

                     guest (since Linux 2.6.24)
                            (9) Time spent running a virtual CPU for
                            guest operating systems under the control of
                            the Linux kernel.

                     guest_nice (since Linux 2.6.33)
                            (10) Time spent running a niced guest
                            (virtual CPU for guest operating systems
                            under the control of the Linux kernel).
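As a rough illustration of how the fields above map onto the second formula, the following Python sketch parses one cpuN line and computes utilization between two snapshots. The helper names are hypothetical and this is not node_exporter's own code. It sums only the first eight fields, on the assumption that guest and guest_nice time is already counted within user and nice; the second snapshot is invented for the example.

```python
FIELDS = (
    "user", "nice", "system", "idle", "iowait",
    "irq", "softirq", "steal", "guest", "guest_nice",
)
# Assumption: guest time is already included in user/nice, so it is not summed again.
SUMMED_MODES = FIELDS[:8]


def parse_cpu_line(line: str) -> tuple[str, dict[str, int]]:
    """Split a 'cpu' or 'cpuN' line from /proc/stat into (name, {field: ticks})."""
    parts = line.split()
    values = [int(v) for v in parts[1:]]
    return parts[0], dict(zip(FIELDS, values))


def utilization(before: dict[str, int], after: dict[str, int]) -> float:
    """1 - idle_delta / total_delta, with both deltas in USER_HZ ticks."""
    total = sum(after.get(m, 0) - before.get(m, 0) for m in SUMMED_MODES)
    if total <= 0:
        raise ValueError("no ticks elapsed between the two snapshots")
    idle = after.get("idle", 0) - before.get("idle", 0)
    return 1.0 - idle / total


# First snapshot: the cpu0 sample line from the man page excerpt.
name, t0 = parse_cpu_line("cpu0 1393280 32966 572056 13343292 6130 0 17875 0 23933 0")
# Second snapshot: invented for illustration (75 idle ticks and 25 user ticks later).
t1 = dict(t0, idle=t0["idle"] + 75, user=t0["user"] + 25)
print(name, utilization(t0, t1))  # cpu0 0.25
```

Because both the numerator and the denominator come from the same counters, the result does not depend on how much wall-clock time the counters cover, and the USER_HZ unit cancels out.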