Wednesday, February 5, 2014

Linux: Understanding Processor Queue in Vmstat Output

Linux tools such as mpstat, iostat, and vmstat are useful for evaluating overall system performance at the OS level. In this article, we will look at the statistics vmstat provides, focusing on the CPU run queue (or processor queue).


vmstat Sample Output


vmstat reports information about processes, memory, paging, block IO, traps, and CPU activity. It takes an optional delay argument, among other options. For example, the following command asks vmstat to print a new report every 60 seconds.

$ vmstat 60

The first report produced gives averages since the last reboot. Additional reports give information on a sampling period of length delay. The process and memory reports are instantaneous in either case. If no delay is specified, only one report is printed with the average values since boot.

procs -----------memory---------- -swap- ----io--- --system-- -----cpu------
 r  b  swpd   free   buff  cache   si so bi    bo   in   cs   us sy id wa st

 3  0   0 34138256 2654680 30542460 0  0 112 7052 1220 26560   11  9 79  0  0
134 0   0 34070712 2654812 30589228 0  0 19 15130 1174 2061612 79 10  9  1  0
13  0   0 34077516 2654868 30581300 0  0 32 10688 1123 1218261 68 17 15  0  0
 7  0   0 33928980 2654924 30538396 0  0 33  9762 1402 338231  41 12 46  1  0
 7  0   0 33904644 2654932 30556204 0  0 0   7094 1260 22498   12  8 80  0  0
 5  0   0 33869408 2654956 30564832 0  0 77  7298 1203 107386  15  9 76  0  0
 6  0   0 33907632 2654976 30532548 0  0 78  7220 1215 106477  14  9 76  0  0
 5  1   0 33898596 2655004 30539928 0  0 32  7172 1202 21369   10  8 81  0  0
 2  0   0 33883432 2655008 30547144 0  0 19  7158 1208 30007   10  8 81  0  0

The second sample shows a high run queue value (134 in the r column), which is discussed further below.

vmstat (Virtual Memory Statistics)


The summary information output by vmstat includes:
  • r
    • CPU run queue: the number of runnable processes and threads, that is, those currently running on a CPU plus those waiting for a CPU to become available
  • b
    • Number of processes in uninterruptible sleep (usually waiting for IO)
  • swpd
    • Total swap space used (default: KB)
  • free
    • Total free memory (default: KB)
  • buff
    • Total buffer memory usage (default: KB): the portion of RAM used to cache disk blocks
  • cache
    • Total page cache memory usage (default: KB): similar to buff, but it caches pages read from files
  • si
    • Memory swapped in from disk (in KB/sec), that is, the amount of memory paged in
  • so
    • Memory swapped out to disk (in KB/sec), that is, the amount of memory paged out
    • Sustained high si/so values indicate that the system is swapping
  • bi
    • Blocks read in from IO devices (blocks per sec)
  • bo
    • Blocks written to IO devices (blocks per sec)
  • in
    • Interrupts per second
  • cs
    • Context switches per second
  • us
    • Percentage of user CPU utilization
  • sy
    • Percentage of kernel or system CPU utilization
    • High levels of system time suggest that something is wrong, or that the application is making many system calls. Investigating the cause of high system time is always worthwhile.
  • id
    • Percentage of idle or available CPU
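    • The five CPU columns (us, sy, id, wa, st) should sum to roughly 100. When wa and st are near zero, the sum of the us and sy columns is approximately 100 minus the id value. Small deviations (for example, a total of 99) are typically due to rounding.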
  • wa
    • Percentage of time spent waiting for IO. Prior to Linux 2.5.41, included in idle
  • st
    • Percentage of time stolen from a virtual machine. Prior to Linux 2.6.11, unknown
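
To work with these columns in a script, each numeric row can be split on whitespace and paired with the column names. The following is a minimal Python sketch, assuming the 17-column layout shown in the sample output; the function names `parse_vmstat_row` and `cpu_percent_total` are made up for illustration and are not part of vmstat.

```python
# Column names in the order vmstat prints them (17-column layout shown above).
VMSTAT_COLUMNS = [
    "r", "b", "swpd", "free", "buff", "cache",
    "si", "so", "bi", "bo", "in", "cs",
    "us", "sy", "id", "wa", "st",
]


def parse_vmstat_row(line):
    """Turn one numeric vmstat row into a {column: int} dict."""
    values = line.split()
    if len(values) != len(VMSTAT_COLUMNS):
        raise ValueError(
            f"expected {len(VMSTAT_COLUMNS)} fields, got {len(values)}"
        )
    return dict(zip(VMSTAT_COLUMNS, (int(v) for v in values)))


def cpu_percent_total(row):
    """Sum the five CPU percentage columns; the result should be close to 100."""
    return sum(row[name] for name in ("us", "sy", "id", "wa", "st"))


# Second data row of the sample output above.
row = parse_vmstat_row(
    "134 0   0 34070712 2654812 30589228 0  0 19 15130 1174 2061612 79 10  9  1  0"
)
print(row["r"], row["cs"], cpu_percent_total(row))  # prints: 134 2061612 99
```

The CPU columns of this row add up to 99 rather than 100, which is consistent with each percentage being rounded independently.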


CPU Run Queue (Processor Queue)


The sample output shows 134 threads in the processor queue at one point, on a 24-processor Linux system, which deserves attention. It means that 134 threads were either running or could have run had a CPU been available. Keep in mind that the run queue length covers everything on the machine, so some of those threads may belong to completely separate processes that also want to run.

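If there are more threads to run than available CPUs, performance will begin to degrade. In general, you want the processor queue length to be 0 on Windows and equal to (or less than) the number of CPUs on Unix systems. The targets differ because the Windows counter counts only threads waiting for a processor, whereas the Unix run queue also includes threads that are currently running.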

If the run queue length is too long for any significant period of time, it is an indication that your machine is overloaded and you should look into reducing the amount of work the machine is doing (either by moving jobs to another machine, or optimizing your code).[1]
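
The rule of thumb for Unix systems can be checked mechanically by comparing each r value with the CPU count. Here is a small Python sketch under that assumption; `overloaded_samples` is a hypothetical helper name, the r values are copied from the sample output, and 24 is the processor count mentioned above.

```python
def overloaded_samples(run_queue_lengths, cpu_count):
    """Return (index, r, r / cpu_count) for samples where r exceeds cpu_count."""
    return [
        (i, r, r / cpu_count)
        for i, r in enumerate(run_queue_lengths)
        if r > cpu_count
    ]


# r column from the sample vmstat output; 24 is the CPU count stated above.
r_values = [3, 134, 13, 7, 7, 5, 6, 5, 2]

for index, r, ratio in overloaded_samples(r_values, cpu_count=24):
    print(f"sample {index}: r={r}, {ratio:.1f} runnable threads per CPU")
# prints: sample 1: r=134, 5.6 runnable threads per CPU
```

On a live system, `os.cpu_count()` from the Python standard library could supply the CPU count instead of the hard-coded 24. A single sample above the threshold is not conclusive on its own; what matters is whether the condition persists across several reports.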

Context Switches


When the machine is overloaded, you will also see a high number of context switches at the same time. The line with the high run queue length (134) also shows a high number of context switches per second (2,061,612).
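
As a rough check on the sample data, the short Python sketch below confirms that the run queue peak and the context switch peak fall in the same report, and expresses the peak rate per CPU. The (r, cs) pairs are copied from the sample output; `peak_index` is a made-up helper name, and 24 is the processor count of the system described in this article.

```python
# (r, cs) pairs copied from the sample vmstat output.
samples = [
    (3, 26560), (134, 2061612), (13, 1218261), (7, 338231), (7, 22498),
    (5, 107386), (6, 106477), (5, 21369), (2, 30007),
]
CPU_COUNT = 24  # processor count stated in the article


def peak_index(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


r_peak = peak_index([r for r, _ in samples])
cs_peak = peak_index([cs for _, cs in samples])
cs_per_cpu = samples[cs_peak][1] // CPU_COUNT

print(r_peak == cs_peak)  # True: both peaks fall in the second report
print(cs_per_cpu)         # 85900 context switches per second per CPU
```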

There are two types of context switches:
  • Voluntary thread context switches 
    • An executing thread voluntarily takes itself off the CPU
  • Involuntary thread context switches 
    • A thread is taken off the CPU because its time quantum expired or because it was preempted by a higher-priority thread
A high number of involuntary context switches indicates that there are more threads ready to run than there are virtual processors available to run them, which results in:
  • A high run queue depth in vmstat
  • A high CPU utilization
  • A high number of migrations
In our case, the high run queue length lasted only a short time. It may have occurred because our application and database ran on the same system; while the run queue was long, the database may have been performing a lot of I/O.
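
vmstat reports only the combined context switch rate. On Linux, the per-process split between the two types is exposed in /proc/<pid>/status through the `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` fields. The Python sketch below parses those fields from the text of such a file; the function names and the sample numbers are invented for illustration.

```python
def parse_ctxt_switches(status_text):
    """Extract the two context switch counters from the text of a
    Linux /proc/<pid>/status file."""
    counters = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"):
            counters[key] = int(value.strip())
    return counters


def involuntary_share(counters):
    """Fraction of context switches that were involuntary (0.0 if none)."""
    voluntary = counters.get("voluntary_ctxt_switches", 0)
    involuntary = counters.get("nonvoluntary_ctxt_switches", 0)
    total = voluntary + involuntary
    return involuntary / total if total else 0.0


# Made-up excerpt in the format of /proc/<pid>/status (not real measurements).
sample_status = (
    "Name:\tjava\n"
    "voluntary_ctxt_switches:\t150\n"
    "nonvoluntary_ctxt_switches:\t450\n"
)

counters = parse_ctxt_switches(sample_status)
print(involuntary_share(counters))  # prints: 0.75
```

On a live system the input could come from reading `/proc/<pid>/status` for the process of interest. The counters are cumulative since the process started, so the difference between two readings taken some time apart is more informative than a single snapshot.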

References

  1. How to Troubleshoot High CPU Usage of Java Applications? (Xml and More)
