Tuesday, June 13, 2017

Linux sar Command: Using -o and -f in Pairs

System Activity Reporter (SAR) is one of the important tools for monitoring Linux servers. With this command you can analyze the history of different kinds of resource usage.

In this article, we will examine how to monitor the resource usage of servers (e.g., in a cluster) during the entire run of an application (e.g., a benchmark) using the following pair of sar commands:
  • Data Collection
    • nohup sar -A -o /tmp/sar.data 10 > /dev/null &
  • Record Extraction
    • sar -f /tmp/sar.data [-u | -d | -n DEV]

Sar Command Options


In the data collection phase, we use the -o option to save the data to a file in binary format. In the extraction phase, we use the -f option combined with other options (e.g., [-u | -d | -n DEV]) to extract the records for different statistics (e.g., CPU, I/O, network):

Main options

       -o [ filename ]
              Save the readings in the file in binary form. Each reading is in
              a separate record. The default value of the  filename  parameter
              is  the  current daily data file, the /var/log/sa/sadd file. The
              -o option is exclusive of the -f option.  All the data available
              from  the  kernel  are saved in the file (in fact, sar calls its
              data collector sadc with the option "-S ALL". See sadc(8) manual
              page).


       -f [ filename ]
              Extract records from filename (created by the -o filename flag).
              The default value of the filename parameter is the current daily
              data file, the /var/log/sa/sadd file. The -f option is exclusive
              of the -o option.

Others

       -u [ ALL ]
              Report CPU utilization. The ALL keyword indicates that  all  the
              CPU fields should be displayed.

       -d    Report activity for each block device  (kernels  2.4  and  newer
              only).

       -n { keyword [,...] | ALL }
              Report network statistics.
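
Before wiring the -o/-f pair into a long benchmark run, it can help to try it on a small scale. The sketch below assumes that sar accepts an interval and a count after its options, so the collection step ends on its own. The function name quick_sample, the 2-second interval, the count of 3, and the file path in the example are all placeholders chosen for illustration, not part of the setup described in this article.

```shell
#!/bin/sh
# quick_sample FILE: collect a few samples into FILE with -o, then read the
# CPU records back with -f. Interval (2 s) and count (3) are arbitrary.
quick_sample() {
  data=$1
  rm -f "$data"                        # start from an empty data file
  sar -A -o "$data" 2 3 > /dev/null    # collection: 3 samples, 2 s apart
  sar -f "$data" -u                    # extraction: CPU utilization records
}

# Example (hypothetical path):
# quick_sample /tmp/sar.test
```

Because the count is given, the first sar exits by itself, so no nohup or pkill is needed for such a short trial.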


Monitoring the Entire Run of a Benchmark


In this illustration, we use three benchmarks (scan, aggregation, and join) from the HiBench suite as examples (see [2] for details). At the beginning of each benchmark run, we start the sar command on every server in the cluster; we then run the Spark application for a specific workload; finally, we kill the sar processes at the end of the run.

run.sh
#!/bin/bash

if [ $# -ne 2 ]; then
  echo "usage: run.sh <workload> <target>"
  echo "  where <workload> could be:"
  echo "    scan"
  echo "    aggregation"
  echo "    join"
  echo "  where <target> could be:"
  echo "    mapreduce"
  echo "    spark/java"
  echo "    spark/scala"
  echo "    spark/python"
  exit 1
fi

workload=$1
target=$2
workloadsRoot=/data/hive/BDCSCE-HiBench/workloads

# -p is needed because the target may contain a slash (e.g., spark/java)
mkdir -p ~/"$workload/$target"

echo "start all sar commands ..."

./stats.sh start

while read -r vmIp
do
  echo "start stats on $vmIp"
  ./myssh opc@$vmIp "~/stats.sh start" &
done < vm.lst

# run a test in different workloads using different lang interfaces
$workloadsRoot/$workload/$target/bin/run.sh


echo "stop all sar commands ..."
./stats.sh stop

while read -r vmIp
do
  echo "stop stats on $vmIp"
  ./myssh opc@$vmIp "~/stats.sh stop" &
done < vm.lst

# wait for the background stop commands to finish before the script exits
wait


stats.sh

#!/bin/sh

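case $1 in
  'start')
        # stop any earlier collector and discard its data before starting anew
        pkill sar
        rm -f /tmp/sar.data
        nohup sar -A -o /tmp/sar.data 10 > /dev/null &
        ;;
  'stop')
        pkill sar
        # keep a copy of the collected data in the home directory
        scp /tmp/sar.data ~
        ;;
  *)
        # unquoted * matches any other argument (a quoted '*' would not)
        echo "usage: $0 start|stop"
        ;;
esac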

CPU Statistics


To view the overall CPU statistics, you can use option -u as follows:

$ sar -f sar.data -u

03:39:28 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
03:39:38 PM     all      0.03      0.00      0.01      0.02      0.00     99.94
03:39:48 PM     all      0.05      0.00      0.05      0.02      0.01     99.88
<snipped>

Average:        all      0.09      0.00      0.02      0.02      0.00     99.86
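
The Average line hides short bursts, so for a benchmark run it may also be useful to know the busiest interval. The following is a minimal sketch, assuming that %idle is the last column as in the sample output above; the function name peak_busy is made up for this example. It reads the text printed by sar -u on standard input and prints the highest value of 100 - %idle over all sampled intervals.

```shell
#!/bin/sh
# peak_busy: read `sar -u` text on stdin and print the largest (100 - %idle)
# seen in the per-interval rows. Assumes %idle is the last column.
peak_busy() {
  awk '
    # Per-interval rows carry "all" in field 2 (24-hour clock) or
    # field 3 (12-hour clock with AM/PM); the Average row is skipped.
    $1 != "Average:" && ($2 == "all" || $3 == "all") {
      busy = 100 - $NF
      if (busy > max) max = busy
    }
    END { printf "%.2f\n", max }
  '
}

# Example:
# sar -f sar.data -u | peak_busy
```

With the three lines of sample output shown above, the busiest interval is the one with 99.88 %idle.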

           

I/O Statistics of Block Devices


To view the activity for each block device, you can use option -d as follows:

$ sar -f sar.data -d


03:39:28 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
03:39:38 PM dev202-16      1.20      0.00     16.06     13.33      0.02     14.67      6.50      0.78
03:39:38 PM dev202-32      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:39:38 PM dev202-48      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:39:38 PM dev202-64      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:39:38 PM dev202-80      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:39:38 PM  dev251-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:39:38 PM  dev251-1      1.20      0.00     16.06     13.33      0.02     14.67      6.50      0.78
03:39:38 PM  dev251-2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:39:38 PM  dev251-3      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:39:38 PM  dev251-4      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
<snipped>

Average:          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
Average:    dev202-16      1.22      0.00     15.79     12.99      0.01     11.85      6.57      0.80
Average:    dev202-32      0.85      0.00      8.92     10.46      0.01     10.27      4.18      0.36
Average:    dev202-48      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:    dev202-64      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:    dev202-80      0.21      0.00      1.74      8.43      0.00      0.30      0.08      0.00
Average:     dev251-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:     dev251-1      1.25      0.00     15.97     12.73      0.01     11.78      6.37      0.80
Average:     dev251-2      0.90      0.00      8.92      9.88      0.01     10.44      3.95      0.36
Average:     dev251-3      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:     dev251-4      0.22      0.00      1.74      8.00      0.00      0.28      0.08      0.00

If you are interested in the average tps of dev251-1, which the man page defines as:
              tps
                     Indicate  the  number  of  transfers per second that were
                     issued to the device.  Multiple logical requests  can  be
                     combined  into  a  single  I/O  request  to the device. A
                     transfer is of indeterminate size.
you can extract it with the following command ($destDir is assumed to be the directory that holds the copied sar.data; grep -w keeps dev251-1 from also matching names such as dev251-10):
$ sar -f "$destDir/sar.data" -d | grep Average | grep -w dev251-1 | awk '{print $3}'
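
The pipeline above picks the value by its position ($3). As a hedged alternative, the sketch below looks the column up by its name in the "Average:" header row, which may be more convenient when you want several different columns. The function name avg_field is hypothetical, and the code assumes the header and data rows of the Average block have the same number of fields, as in the sample output above.

```shell
#!/bin/sh
# avg_field DEVICE COLUMN: read `sar -d` text on stdin and print the
# Average-row value of COLUMN for DEVICE (exact match on the device name).
avg_field() {
  awk -v dev="$1" -v col="$2" '
    # The "Average:" header row names the columns; remember where COLUMN is.
    $1 == "Average:" && $2 == "DEV" {
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      next
    }
    # Print the value only once the column position is known.
    $1 == "Average:" && $2 == dev && idx { print $idx }
  '
}

# Example:
# sar -f "$destDir/sar.data" -d | avg_field dev251-1 tps
```

Matching $2 exactly against the device name also avoids picking up devices whose names merely start with the same characters.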

Network Statistics


To view the overall statistics of network devices such as eth0, bond, etc., you can use option -n as follows:

Syntax: 
sar -n [VALUE]
The VALUE can be:
  • DEV: For network devices like eth0, bond, etc. 
  • EDEV: For network device failure details 
  • NFS: For NFS client info 
  • NFSD: For NFS server info 
  • SOCK: For sockets in use for IPv4 
  • IP: For IPv4 network traffic 
  • EIP: For IPv4 network errors 
  • ICMP: For ICMPv4 network traffic 
  • EICMP: For ICMPv4 network errors 
  • TCP: For TCPv4 network traffic 
  • ETCP: For TCPv4 network errors 
  • UDP: For UDPv4 network traffic 
  • SOCK6, IP6, EIP6, ICMP6, UDP6 : For IPv6 
  • ALL: For all above mentioned information.

$ sar -f sar.data -n DEV

03:39:28 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
03:39:38 PM      eth0     12.35     16.47      1.34      4.04      0.00      0.00      0.00
03:39:38 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
03:39:48 PM      eth0      9.63     14.64      1.17      4.03      0.00      0.00      0.00
03:39:48 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
<snipped>

Average:        IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
Average:         eth0     11.26     16.14      3.95      6.46      0.00      0.00      0.00
Average:           lo      1.23      1.23      0.33      0.33      0.00      0.00      0.00

If you are interested in the average rxkB/s or txkB/s of eth0, which the man page defines as:
              rxkB/s
                     Total number of kilobytes received per second.

              txkB/s
                     Total number of kilobytes transmitted per second.

you can extract them with the following commands (the first prints rxkB/s, the second txkB/s):
$ sar -f "$destDir/sar.data" -n DEV | grep Average | grep eth0 | awk '{print $5}'
$ sar -f "$destDir/sar.data" -n DEV | grep Average | grep eth0 | awk '{print $6}'
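
The two commands above read the data file twice. A minimal sketch of getting both values in a single pass is shown below; the function name net_avg and the output format are assumptions made for this example, and the column positions ($5 for rxkB/s, $6 for txkB/s) are taken from the sample output above, so they should be checked against the header printed by your sar version.

```shell
#!/bin/sh
# net_avg IFACE: read `sar -n DEV` text on stdin and print the average
# rxkB/s and txkB/s of IFACE. Column positions follow the sample output.
net_avg() {
  awk -v ifc="$1" '
    # Exact match on the interface name in the Average block.
    $1 == "Average:" && $2 == ifc {
      printf "rxkB/s=%s txkB/s=%s\n", $5, $6
    }
  '
}

# Example:
# sar -f "$destDir/sar.data" -n DEV | net_avg eth0
```

For the Average rows shown earlier, this reports 3.95 and 6.46 for eth0.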

References

  1. sar command for Linux system performance monitoring
  2. Three Benchmarks for SQL Coverage in HiBench Suite ― a Bigdata Micro Benchmark Suite
