
FAQ - What is the %steal CPU statistic in mpstat?

Detail

The mpstat command reports several columns of CPU statistics, for all CPUs combined and for each CPU individually. This short article describes each column and focuses specifically on the %steal column.

16:10:08     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
16:10:10     all    7.87    0.00   11.61    6.07    0.00    4.87    1.73    0.00    0.00   67.85
16:10:10       0    9.28    0.00   12.89    5.67    0.00    5.15    1.55    0.00    0.00   65.46
16:10:10       1    6.91    0.00   13.30    5.85    0.00    2.66    1.60    0.00    0.00   69.68
16:10:10       2    7.53    0.00   11.83    5.91    0.00    5.91    2.15    0.00    0.00   66.67
16:10:10       3    8.60    0.00   10.75    5.38    0.00    4.84    2.15    0.00    0.00   68.28
16:10:10       4    8.47    0.00   11.11    6.35    0.00    5.29    1.59    0.00    0.00   67.20
16:10:10       5    5.43    0.00   13.59    4.89    0.00    5.98    2.17    0.00    0.00   67.93
16:10:10       6    8.38    0.00    9.42    7.33    0.00    5.24    2.09    0.00    0.00   67.54
16:10:10       7    7.69    0.00    9.34    7.14    0.00    4.40    1.65    0.00    0.00   69.78

As seen in the output above, mpstat provides the following:

name      detail
%usr      the percentage of time spent in userland, running applications
%nice     the percentage of time spent running userland applications that have a non-default ‘nice’ value (priority)
%sys      the percentage of time spent on system handling, i.e. in the kernel and drivers
%iowait   the percentage of time the CPU was idle while the system had outstanding disk I/O requests. High values usually indicate a disk bottleneck
%irq      the percentage of time spent servicing hardware interrupts
%soft     the percentage of time spent servicing software interrupts. This often relates to network drivers and tends to rise together with %sys
%steal    the percentage of time the virtual CPU spent waiting involuntarily while the hypervisor serviced another virtual CPU (this is discussed in detail below)
%guest    the percentage of time spent running virtual CPUs for guest virtual machines. This is non-zero only if the machine itself is a hypervisor
%gnice    the same as %guest, but for guest virtual machines running with a ‘nice’ priority applied (niced guests)
%idle     the percentage of time the CPU was idle with no outstanding disk I/O requests
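
When %steal needs to be tracked over time, it can help to pull the column out of the mpstat text programmatically. The following is a minimal Python sketch, assuming the column layout shown in the output above; the function names `steal_by_cpu` and `cpus_above` and the 2.0 threshold are illustrative choices, not part of mpstat or Aerospike.

```python
# A few lines copied from the mpstat output shown in this article.
SAMPLE = """\
16:10:08     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
16:10:10     all    7.87    0.00   11.61    6.07    0.00    4.87    1.73    0.00    0.00   67.85
16:10:10       0    9.28    0.00   12.89    5.67    0.00    5.15    1.55    0.00    0.00   65.46
16:10:10       2    7.53    0.00   11.83    5.91    0.00    5.91    2.15    0.00    0.00   66.67
"""


def steal_by_cpu(mpstat_text):
    """Return {cpu_label: steal_percent} parsed from mpstat text output.

    Column positions are taken from the header line, so the function does
    not depend on the exact number of columns.
    """
    result = {}
    cpu_idx = steal_idx = None
    for line in mpstat_text.splitlines():
        fields = line.split()
        if "CPU" in fields and "%steal" in fields:
            # Header line: remember where the CPU and %steal columns are.
            cpu_idx = fields.index("CPU")
            steal_idx = fields.index("%steal")
            continue
        if steal_idx is None or len(fields) <= steal_idx:
            continue  # no header seen yet, or a blank/short line
        try:
            result[fields[cpu_idx]] = float(fields[steal_idx])
        except ValueError:
            continue  # not a data line
    return result


def cpus_above(steal, threshold):
    """Return the labels whose %steal is at or above the threshold, sorted."""
    return sorted(cpu for cpu, value in steal.items() if value >= threshold)


steal = steal_by_cpu(SAMPLE)
print(steal)
print(cpus_above(steal, 2.0))  # 2.0 is an illustrative threshold only
```

In practice the text would come from a captured mpstat run rather than an embedded sample; how it is captured is left out of this sketch.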

Answer

When running Aerospike on a virtualized platform, it is particularly important to monitor %steal. This value shows the share of time that has been “stolen” from the vCPU. In other words, it shows how long the vCPU core spent waiting while the physical CPU core dealt with another vCPU. With a 1:1 ratio of vCPUs to physical CPU cores, this number should be 0 or close to it. On overprovisioned hosts (i.e. those where the vCPU count exceeds the physical CPU count on the underlying hardware), this number increases as vCPUs compete for physical cores.
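
On Linux, the counters behind these percentages are exposed in /proc/stat, where each `cpu` line lists cumulative ticks in the order user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice. As a rough sketch of how a steal percentage can be derived from two snapshots of such a line (the function names and the sample numbers below are made up for illustration):

```python
def cpu_times(stat_line):
    """Parse one 'cpu ...' line from /proc/stat into a list of tick counts."""
    return [int(value) for value in stat_line.split()[1:]]


def steal_percent(before_line, after_line):
    """Percentage of elapsed CPU time accounted as steal between two snapshots."""
    before = cpu_times(before_line)
    after = cpu_times(after_line)
    deltas = [a - b for a, b in zip(after, before)]
    # Only the first eight fields are summed: guest and guest_nice are
    # already included in user and nice, so adding them would double count.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total  # index 7 is the steal field


# Hypothetical snapshots taken some seconds apart (values are invented).
BEFORE = "cpu  1000 0 500 8000 200 0 100 200 0 0"
AFTER = "cpu  1100 0 600 8700 250 0 130 220 0 0"

print(steal_percent(BEFORE, AFTER))
```

Here 20 of the 1000 elapsed ticks were accounted as steal, which corresponds to a %steal of 2.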

If you see this number increase, check with your cloud team (if hosting internally) or cloud vendor (if using a public cloud host).

Also note that a %steal of just 2% is significant. While it doesn’t exactly translate to this representation, it can be visualised as the vCPU core doing nothing for 1 second out of every 50 seconds. During that time, the physical CPU core is dealing with a request from another vCPU, belonging to another virtual machine.
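
The arithmetic behind that visualisation can be written out as a short Python sketch. The function name `stalled_seconds` is illustrative, and, as noted above, real steal time is not delivered as a single contiguous pause.

```python
def stalled_seconds(steal_percent, window_seconds):
    """Seconds within a time window that a vCPU spends waiting, given %steal."""
    return window_seconds * steal_percent / 100.0


print(stalled_seconds(2.0, 50))    # 1.0  -> 1 second out of every 50
print(stalled_seconds(2.0, 3600))  # 72.0 -> 72 seconds out of every hour
```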


Notes

mpstat

Applies To Earliest Version

Pre 4.9

Applies To Latest Version

Current Version