System Activity Reporter (SAR) is one of the important tools for monitoring Linux servers. With the sar command you can analyze the history of different kinds of resource usage.
In this article, we will examine how to monitor the resource usage of servers (e.g., in a cluster) during the entire run of an application (e.g., a benchmark) using the following pair of sar commands:
- Data Collection
- nohup sar -A -o /tmp/sar.data 10 > /dev/null &
- Record Extraction
- sar -f /tmp/sar.data [-u | -d | -n DEV]
Sar Command Options
In the data collection phase, we will use the -o option to save the data to a file in binary format. In the record extraction phase, we will use the -f option combined with other options (e.g., [-u | -d | -n DEV]) to extract the records related to different statistics (e.g., CPU, I/O, network):
Main options
-o [ filename ]
Save the readings in the file in binary form. Each reading is in
a separate record. The default value of the filename parameter
is the current daily data file, the /var/log/sa/sadd file. The
-o option is exclusive of the -f option. All the data available
from the kernel are saved in the file (in fact, sar calls its
data collector sadc with the option "-S ALL". See sadc(8) manual
page).
-f [ filename ]
Extract records from filename (created by the -o filename flag).
The default value of the filename parameter is the current daily
data file, the /var/log/sa/sadd file. The -f option is exclusive
of the -o option.
Others
-u [ ALL ]
Report CPU utilization. The ALL keyword indicates that all the
CPU fields should be displayed.
-d Report activity for each block device (kernels 2.4 and newer
only).
-n { keyword [,...] | ALL }
Report network statistics.
Monitoring the Entire Run of a Benchmark
For illustration, we will use three benchmarks (scan, aggregation, and join) from the HiBench suite as examples (see [2] for details). At the beginning of each benchmark run, we start sar on every server in the cluster; we then run the Spark application for the chosen workload; finally, we kill the sar processes at the end of the run.
run.sh
#!/bin/bash
if [ $# -ne 2 ]; then
    echo "usage: run.sh <workload> <target>"
    echo " where <workload> could be:"
    echo " scan"
    echo " aggregation"
    echo " join"
    echo " where <target> could be:"
    echo " mapreduce"
    echo " spark/java"
    echo " spark/scala"
    echo " spark/python"
    exit 1
fi
workload=$1
target=$2
workloadsRoot=/data/hive/BDCSCE-HiBench/workloads
# -p is needed because targets such as spark/java create nested directories
mkdir -p ~/"$workload/$target"
echo "start all sar commands ..."
./stats.sh start
while read -r vmIp
do
    echo "start stats on $vmIp"
    ./myssh "opc@$vmIp" "~/stats.sh start" &
done < vm.lst
# wait until sar has been started on every VM before running the test
wait
# run a test in different workloads using different lang interfaces
"$workloadsRoot/$workload/$target/bin/run.sh"
echo "stop all sar commands ..."
./stats.sh stop
while read -r vmIp
do
    echo "stop stats on $vmIp"
    ./myssh "opc@$vmIp" "~/stats.sh stop" &
done < vm.lst
# wait for the remote stop commands to finish before exiting
wait
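stats.sh
#!/bin/sh
case $1 in
    'start')
        # stop any previous collector and discard its data
        pkill sar
        rm -f /tmp/sar.data
        # sample all statistics every 10 seconds into a binary file
        nohup sar -A -o /tmp/sar.data 10 > /dev/null 2>&1 &
        ;;
    'stop')
        pkill sar
        # copy the data file to the home directory
        scp /tmp/sar.data ~
        ;;
    *)
        # unquoted * is the catch-all pattern; '*' would match only a literal asterisk
        echo "usage: $0 start|stop"
        exit 1
        ;;
esac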
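Note that pkill sar selects processes by name, so it also stops sar processes that were started by someone else on the same server. One possible alternative, sketched here under assumptions and not taken from the original scripts, is to record the PID of the collector at start time and kill only that process. The names start_collector, stop_collector, and PIDFILE are hypothetical, as is the default location /tmp/sar.pid.

```shell
#!/bin/sh
# Hypothetical variant of stats.sh that tracks the collector by PID.
PIDFILE=${PIDFILE:-/tmp/sar.pid}      # assumed location of the PID file
DATAFILE=${DATAFILE:-/tmp/sar.data}   # same data file as in stats.sh
INTERVAL=${INTERVAL:-10}              # sampling interval in seconds

# start_collector [command...]: start the collector in the background.
# Without arguments it runs the same sar command as stats.sh.
start_collector() {
    rm -f "$DATAFILE"
    if [ $# -eq 0 ]; then
        set -- sar -A -o "$DATAFILE" "$INTERVAL"
    fi
    nohup "$@" > /dev/null 2>&1 &
    # remember the PID of the background process we just started
    echo $! > "$PIDFILE"
}

# stop_collector: kill only the process recorded in the PID file.
stop_collector() {
    [ -f "$PIDFILE" ] || return 0
    kill "$(cat "$PIDFILE")" 2>/dev/null
    rm -f "$PIDFILE"
}

# Usage:
#   start_collector      # before the benchmark
#   stop_collector       # after the benchmark
```

The optional command argument exists only so that the functions can be tried out with a harmless command such as sleep before pointing them at sar.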
CPU Statistics
To view the overall CPU statistics, you can use the -u option as follows:
$ sar -f sar.data -u
03:39:28 PM CPU %user %nice %system %iowait %steal %idle
03:39:38 PM all 0.03 0.00 0.01 0.02 0.00 99.94
03:39:48 PM all 0.05 0.00 0.05 0.02 0.01 99.88
<snipped>
Average: all 0.09 0.00 0.02 0.02 0.00 99.86
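The Average line of this report can be reduced to a single "busy" percentage by subtracting %idle from 100. The following is a minimal sketch rather than part of the original scripts; the function name cpu_busy is hypothetical, and it assumes that %idle is the last column, as in the output above.

```shell
#!/bin/sh
# cpu_busy: read `sar -u` output on stdin and print 100 - %idle
# for the "Average: all" row. Assumes %idle is the last column.
cpu_busy() {
    awk '$1 == "Average:" && $2 == "all" { printf "%.2f\n", 100 - $NF }'
}

# Usage (assumes sar.data was collected as described earlier):
#   sar -f sar.data -u | cpu_busy
```

For the Average row shown above (%idle of 99.86), this prints 0.14.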
I/O Statistics of Block Devices
To view the activity for each block device, you can use the -d option as follows:
$ sar -f sar.data -d
03:39:28 PM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
03:39:38 PM dev202-16 1.20 0.00 16.06 13.33 0.02 14.67 6.50 0.78
03:39:38 PM dev202-32 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:39:38 PM dev202-48 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:39:38 PM dev202-64 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:39:38 PM dev202-80 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:39:38 PM dev251-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:39:38 PM dev251-1 1.20 0.00 16.06 13.33 0.02 14.67 6.50 0.78
03:39:38 PM dev251-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:39:38 PM dev251-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:39:38 PM dev251-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
<snipped>
Average: DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
Average: dev202-16 1.22 0.00 15.79 12.99 0.01 11.85 6.57 0.80
Average: dev202-32 0.85 0.00 8.92 10.46 0.01 10.27 4.18 0.36
Average: dev202-48 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: dev202-64 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: dev202-80 0.21 0.00 1.74 8.43 0.00 0.30 0.08 0.00
Average: dev251-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: dev251-1 1.25 0.00 15.97 12.73 0.01 11.78 6.37 0.80
Average: dev251-2 0.90 0.00 8.92 9.88 0.01 10.44 3.95 0.36
Average: dev251-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: dev251-4 0.22 0.00 1.74 8.00 0.00 0.28 0.08 0.00
If you are interested in the average tps of dev251-1:
tps
Indicate the number of transfers per second that were
issued to the device. Multiple logical requests can be
combined into a single I/O request to the device. A
transfer is of indeterminate size.
you can run the following command:
$ sar -f "$destDir/sar.data" -d | grep Average | grep dev251-1 | awk '{print $3}'
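The pipeline above hard-codes the field position ($3), which silently breaks if the column order differs between sysstat versions. As a hedged alternative, the column can be looked up by name in the Average header row. This is only a sketch: the function name sar_avg is hypothetical, and it assumes the input contains a single report (for example only -d), not the combined output of several options.

```shell
#!/bin/sh
# sar_avg NAME COLUMN: read one sar report on stdin and print the value of
# COLUMN from the "Average:" row whose second field is NAME.
sar_avg() {
    awk -v name="$1" -v col="$2" '
        $1 != "Average:" { next }          # only the summary rows matter
        {
            # The Average header row names the columns; find COLUMN in it.
            found = 0
            for (i = 3; i <= NF; i++)
                if ($i == col) { idx = i; found = 1 }
            if (found) next                # this was the header row
        }
        idx && $2 == name { print $idx; exit }
    '
}

# Usage (assumes sar.data was collected as described earlier):
#   sar -f sar.data -d | sar_avg dev251-1 tps
```

If the column or the device is not present, the function prints nothing, which a calling script can detect by testing for an empty result.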
Network Statistics
Syntax:
sar -n [VALUE]
The VALUE can be:
- DEV: For network devices like eth0, bond, etc.
- EDEV: For network device failure details
- NFS: For NFS client info
- NFSD: For NFS server info
- SOCK: For sockets in use for IPv4
- IP: For IPv4 network traffic
- EIP: For IPv4 network errors
- ICMP: For ICMPv4 network traffic
- EICMP: For ICMPv4 network errors
- TCP: For TCPv4 network traffic
- ETCP: For TCPv4 network errors
- UDP: For UDPv4 network traffic
- SOCK6, IP6, EIP6, ICMP6, UDP6: For IPv6
- ALL: For all above mentioned information.
To view the overall statistics of network devices like eth0, bond, etc., you can use the -n option as follows:
$ sar -f sar.data -n DEV
03:39:28 PM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
03:39:38 PM eth0 12.35 16.47 1.34 4.04 0.00 0.00 0.00
03:39:38 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:39:48 PM eth0 9.63 14.64 1.17 4.03 0.00 0.00 0.00
03:39:48 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00
<snipped>
Average: IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
Average: eth0 11.26 16.14 3.95 6.46 0.00 0.00 0.00
Average: lo 1.23 1.23 0.33 0.33 0.00 0.00 0.00
If you are interested in the average rxkB/s or txkB/s of eth0:
rxkB/s
Total number of kilobytes received per second.
txkB/s
Total number of kilobytes transmitted per second.
you can run the following commands:
$ sar -f "$destDir/sar.data" -n DEV | grep Average | grep eth0 | awk '{print $5}'
$ sar -f "$destDir/sar.data" -n DEV | grep Average | grep eth0 | awk '{print $6}'
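An average over the whole run can hide short bursts of traffic, so it may also be useful to look at the peak value among the per-interval rows. Note that those rows carry a timestamp (and AM/PM in 12-hour locales), so their field positions differ from the Average rows. The sketch below avoids hard-coded positions by locating the IFACE label and the requested column in the header row. The function name sar_peak is hypothetical and the code is an illustration, not part of the original scripts.

```shell
#!/bin/sh
# sar_peak IFACE COLUMN: read `sar -n DEV` output on stdin and print the
# largest per-interval value of COLUMN for interface IFACE.
sar_peak() {
    awk -v name="$1" -v col="$2" '
        $1 == "Average:" { next }          # skip the summary rows
        {
            # Header rows tell us where the interface name and COLUMN are.
            for (i = 1; i <= NF; i++) {
                if ($i == "IFACE") key = i
                if ($i == col) idx = i
            }
        }
        key && idx && $key == name && (!seen || $idx + 0 > max + 0) {
            max = $idx                     # keep the value as printed by sar
            seen = 1
        }
        END { if (seen) print max }
    '
}

# Usage (assumes sar.data was collected as described earlier):
#   sar -f sar.data -n DEV | sar_peak eth0 rxkB/s
```

For the two intervals shown in the output above, the peak rxkB/s of eth0 is 1.34, while the average over the whole run is 3.95, which indicates that larger values occurred in the snipped part of the report.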