Possible Bug in CPU Usage Calculation — Only User Time Considered, Ignoring System/Other Times #337

@trevor211

Hi,

I noticed that in perftest, the CPU usage is calculated using values from the first line of /proc/stat, but only the user and idle fields are used. Specifically, the calculation looks like this:

ustat_diff = user[1] - user[0];  // only user time
idle_diff = idle[1] - idle[0];   // idle time
cpu_usage = (ustat_diff / (ustat_diff + idle_diff)) * 100;

However, this approach seems incomplete because it ignores other CPU usage fields such as system, nice, irq, etc. A more accurate calculation should include all CPU time components. Typically, it should be:

total_diff = Δuser + Δnice + Δsystem + Δidle + Δiowait + Δirq + Δsoftirq + Δsteal + ...
idle_diff = Δidle + Δiowait
cpu_usage = (total_diff - idle_diff) / total_diff * 100

In short:

cpu_usage = (Δtotal - (Δidle + Δiowait)) / Δtotal * 100
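To make the proposal concrete, here is a minimal sketch of the fuller calculation in C. It is not taken from the perftest source: the struct `cpu_sample` and the functions `parse_cpu_line`, `read_cpu_sample`, and `cpu_usage_percent` are hypothetical names chosen for illustration. It assumes the standard Linux field order on the aggregate `cpu` line (user, nice, system, idle, iowait, irq, softirq, steal) and leaves out the guest fields, since as far as I know the kernel already counts guest time inside user and nice.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: one snapshot of the aggregate "cpu" line. */
struct cpu_sample {
    unsigned long long user, nice, system, idle, iowait, irq, softirq, steal;
};

/* Parse the aggregate "cpu ..." line (not the per-core "cpuN" lines).
 * Returns 0 on success, -1 if fewer than four counters were found.
 * Fields missing on older kernels are left at zero. */
int parse_cpu_line(const char *line, struct cpu_sample *s)
{
    memset(s, 0, sizeof(*s));
    int n = sscanf(line, "cpu %llu %llu %llu %llu %llu %llu %llu %llu",
                   &s->user, &s->nice, &s->system, &s->idle,
                   &s->iowait, &s->irq, &s->softirq, &s->steal);
    return n >= 4 ? 0 : -1;
}

/* Read the first line of /proc/stat into a sample. Returns 0 on success. */
int read_cpu_sample(struct cpu_sample *s)
{
    char line[256];
    FILE *f = fopen("/proc/stat", "r");
    if (!f)
        return -1;
    if (!fgets(line, sizeof(line), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_cpu_line(line, s);
}

/* Busy percentage between two samples: (total_diff - idle_diff) / total_diff. */
double cpu_usage_percent(const struct cpu_sample *a, const struct cpu_sample *b)
{
    unsigned long long idle_a = a->idle + a->iowait;
    unsigned long long idle_b = b->idle + b->iowait;
    unsigned long long total_a = idle_a + a->user + a->nice + a->system +
                                 a->irq + a->softirq + a->steal;
    unsigned long long total_b = idle_b + b->user + b->nice + b->system +
                                 b->irq + b->softirq + b->steal;

    if (total_b <= total_a)
        return 0.0; /* no elapsed ticks, avoid dividing by zero */

    unsigned long long total_diff = total_b - total_a;
    /* iowait is not guaranteed to be monotonic, so clamp defensively */
    unsigned long long idle_diff = idle_b > idle_a ? idle_b - idle_a : 0;
    if (idle_diff > total_diff)
        idle_diff = total_diff;

    return 100.0 * (double)(total_diff - idle_diff) / (double)total_diff;
}
```

A caller would take one sample with `read_cpu_sample` before the test, another after it, and pass both to `cpu_usage_percent`. Parsing is split from file reading only so the arithmetic can be checked against fixed strings.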

By only considering user time, the current calculation underestimates total CPU usage, especially under workloads that heavily use system or irq time.
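The size of the gap can be illustrated with made-up numbers. The sketch below compares the two formulas on hypothetical tick deltas (5 user, 40 system, 20 softirq, 35 idle); these values are invented for illustration and are not measurements from perftest. The function names `usage_user_only_pct` and `usage_all_fields_pct` are likewise hypothetical.

```c
/* Formula as described at the top of this issue: user vs. user + idle. */
double usage_user_only_pct(double d_user, double d_idle)
{
    double denom = d_user + d_idle;
    return denom > 0.0 ? d_user / denom * 100.0 : 0.0;
}

/* Proposed formula: everything that is not idle or iowait counts as busy. */
double usage_all_fields_pct(double d_user, double d_nice, double d_system,
                            double d_idle, double d_iowait, double d_irq,
                            double d_softirq, double d_steal)
{
    double idle = d_idle + d_iowait;
    double total = d_user + d_nice + d_system + idle +
                   d_irq + d_softirq + d_steal;
    return total > 0.0 ? (total - idle) / total * 100.0 : 0.0;
}

/* With the hypothetical deltas above:
 *   usage_user_only_pct(5, 35)                      -> 12.5
 *   usage_all_fields_pct(5, 0, 40, 35, 0, 0, 20, 0) -> 65.0
 */
```

On this invented workload the user-only formula reports 12.5% while the CPU was busy for 65% of the interval.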

This becomes particularly problematic in two-sided RDMA mode (e.g., RDMA Send/Recv), where the application might use event-driven programming (e.g., using poll/select/epoll). In such cases, user time remains low while system time or interrupt time may be high. As a result, the current method can misleadingly show near-zero CPU usage, even when the system is busy handling RDMA events.

Is this an oversight or an intentional simplification? If it is not intentional, I believe this is a bug worth correcting.

Thanks!
