sub-second: Monitoring Linux Servers during Performance Tests

Monday, November 7, 2011

Monitoring Linux Servers during Performance Tests

Resource usage on the servers involved in a load or performance test should be monitored and saved so that system bottlenecks and behavior can be verified later. Linux servers can be monitored in a number of different ways.

If you are using Performance Center to drive load, Linux monitors can be enabled as part of the load test, so the Linux metrics are conveniently combined with other metrics in a single report. This requires the rstat daemon (rpc.rstatd) to be running on each monitored Linux server. On CentOS, the rstat daemon can be installed as follows:

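# rpc.rstatd registers with the RPC port mapper
sudo yum install rpcbind
# Download, unpack, and build rpc.rstatd (root is not needed for these steps)
wget http://www.wyzaerd.com/public/rpc.rstatd-4.0.1.tar.gz
tar xvzf rpc.rstatd-4.0.1.tar.gz
cd rpc.rstatd-4.0.1
./configure
make
# Install and start the daemon
sudo make install
sudo /usr/local/sbin/rpc.rstatd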

Monitors can then be configured in Performance Center to collect stats from the Linux server running rstatd.
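Before configuring the monitors, it can help to confirm that rstatd actually registered with the port mapper. The following is a minimal sketch, assuming the standard rpcinfo tool is installed; rstatd_registered is a hypothetical helper name, and the check relies on rstat using RPC program number 100001.

```shell
# Hypothetical helper: succeeds if the "rpcinfo -p" listing on stdin
# contains the rstat service (RPC program number 100001, normally
# listed with the service name "rstatd").
rstatd_registered() {
    awk '$1 == 100001 || $NF == "rstatd" { found = 1 } END { exit !found }'
}

# Typical use on the monitored server (rpcbind must be running):
# rpcinfo -p localhost | rstatd_registered && echo "rstatd is registered"
```

If the check fails, starting rpcbind before rpc.rstatd is the first thing to try.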

Another tool for viewing and collecting server resource usage on Linux is jperfmeter (http://jperfmeter.sourceforge.net/). Jperfmeter provides a GUI view of resource usage metrics for a list of servers, showing values over time.

[Screenshot: the jperfmeter console for a group of 4 Linux servers]

The jperfmeter metrics can be saved to a delimited text file for later analysis.

A third tool for monitoring Linux server resource usage is sar. The following script collects CPU, memory, disk, and network utilization:

# Run sar every 10 seconds until stopped
# cpu
sar -u 10 > sar.cpu.log &
# free memory
sar -r 10 > sar.freememory.log &
# disk total
sar -b 10 > sar.disk.log &
# network by device
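# - Note that you need to filter by the network interface in use.
# - Run "sar -n DEV 10" to see which interface is being used
# - --line-buffered (GNU grep) writes each sample to the log as it arrives
sar -n DEV 10 | grep --line-buffered eth1 > sar.network.log &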

Some versions of sar also require the number of data points to collect, as in the following:

# Run sar every 10 seconds for up to 99999 samples (about 11.5 days) or until stopped
# cpu
sar -u 10 99999 > sar.cpu.log &
# free memory
sar -r 10 99999 > sar.freememory.log &
# disk total
sar -b 10 99999 > sar.disk.log &
# network by device
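# - Note that you need to filter by the network interface in use.
# - Run "sar -n DEV 10" to see which interface is being used
# - --line-buffered (GNU grep) writes each sample to the log as it arrives
sar -n DEV 10 99999 | grep --line-buffered eth1 > sar.network.log &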
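The collectors above run in the background until they are stopped, so it is useful to record their process IDs. The following is a minimal sketch of one way to do that; start_collector, stop_collectors, and the sar.pids file name are hypothetical and not part of sar.

```shell
#!/bin/sh
# Sketch only: start_collector and stop_collectors are hypothetical helper
# names, and sar.pids is an assumed file name; none of this is part of sar.
PIDFILE="${PIDFILE:-sar.pids}"

# Run a command in the background, write its output to a log file,
# and record its PID so it can be stopped later.
start_collector() {
    collector_log="$1"
    shift
    "$@" > "$collector_log" 2>&1 &
    echo "$!" >> "$PIDFILE"
}

# Stop every collector recorded in the PID file, then remove the file.
stop_collectors() {
    [ -f "$PIDFILE" ] || return 0
    while read -r collector_pid; do
        kill "$collector_pid" 2>/dev/null || true
        # Reap the process if it is a child of this shell
        wait "$collector_pid" 2>/dev/null || true
    done < "$PIDFILE"
    rm -f "$PIDFILE"
}

# Example (commented out so the file can be sourced without starting sar):
# start_collector sar.cpu.log sar -u 10
# start_collector sar.freememory.log sar -r 10
# start_collector sar.disk.log sar -b 10
# ... run the load test ...
# stop_collectors
```

The sketch handles single commands only. For the network collector, which is a pipeline through grep, a simple option is to wrap the pipeline in a small script and pass that script to start_collector.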

For a test of one or two hours, a 10 second collection interval is convenient, providing 360 data points per hour. For an overnight test, a 300 second collection interval is usually appropriate.
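The arithmetic behind these intervals is simple enough to script, which also gives a sensible value for the count argument in place of 99999. This is a small sketch; sample_count is a hypothetical helper name, and the 12 hour figure is only an example of an overnight run.

```shell
# Hypothetical helper: number of samples needed to cover a test of the
# given length (in minutes) at the given interval (in seconds).
sample_count() {
    echo $(( $1 * 60 / $2 ))
}

sample_count 60 10     # 1 hour at 10 seconds    -> 360
sample_count 120 10    # 2 hours at 10 seconds   -> 720
sample_count 720 300   # 12 hours at 300 seconds -> 144
```

The result can be passed straight to sar, for example: sar -u 10 "$(sample_count 120 10)" > sar.cpu.log &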

Output

The CPU log file shows user and system CPU % utilization:

03:07:55 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
03:08:55 PM     all     73.99      0.00      2.43      0.21      0.00     23.37
03:09:55 PM     all     81.79      0.00      2.67      0.21      0.00     15.34
03:10:55 PM     all     82.29      0.00      2.68      0.17      0.00     14.86
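A quick way to summarize a CPU log is to treat everything that is not idle as busy. The following sketch assumes %idle is the last column, as in the output above; cpu_busy_summary is a hypothetical helper name.

```shell
# Hypothetical helper: read a "sar -u" log on stdin and print the average
# and peak busy CPU percentage, where busy = 100 - %idle (last column).
cpu_busy_summary() {
    awk '
        /^Average/ { next }            # skip the summary line sar prints on exit
        {
            is_all = 0
            for (i = 1; i <= NF; i++) if ($i == "all") is_all = 1
            if (!is_all) next          # skip headers, banners, and blank lines
            busy = 100 - $NF
            sum += busy
            n++
            if (busy > max) max = busy
        }
        END { if (n > 0) printf "avg=%.2f max=%.2f\n", sum / n, max }
    '
}

# Usage: cpu_busy_summary < sar.cpu.log
```

For the three samples shown above this prints avg=82.14 max=85.14.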

The free memory log file shows how much memory is free and used:

03:07:55 PM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
03:08:55 PM 110106128  88246712     44.49    356468  42850352  30363972      7.61
03:09:55 PM 110053452  88299388     44.52    356472  42879192  30371420      7.61
03:10:55 PM 109989584  88363256     44.55    356484  42914152  30372688      7.61
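In this output kbmemfree plus kbmemused adds up to total memory, so kbmemused includes memory that the kernel is using for buffers and cache. A rough estimate of the memory held by applications can be obtained by subtracting those two columns. The following is a sketch under that assumption; mem_app_used_kb is a hypothetical helper name, and it only works on logs that contain the kbbuffers and kbcached columns.

```shell
# Hypothetical helper: read a "sar -r" log on stdin and print, per sample,
# the timestamp and kbmemused - kbbuffers - kbcached (in kB).
# Columns are located by name from the header line.
mem_app_used_kb() {
    awk '
        /kbmemused/ { for (i = 1; i <= NF; i++) col[$i] = i; next }
        /^Average/ || NF == 0 || !("kbmemused" in col) { next }
        {
            ts = $1
            if ($2 == "AM" || $2 == "PM") ts = ts " " $2
            used = $(col["kbmemused"]) - $(col["kbbuffers"]) - $(col["kbcached"])
            printf "%s %.0f\n", ts, used
        }
    '
}

# Usage: mem_app_used_kb < sar.freememory.log
```

For the first sample above this gives 45039892 kB, roughly half of the reported kbmemused.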

The disk log file shows read and write transfers per second and blocks read and written per second:

03:07:55 PM       tps      rtps      wtps   bread/s   bwrtn/s
03:08:55 PM   7889.59      0.00   7889.59      0.00  58582.09
03:09:55 PM   8454.59      0.00   8454.59      0.00  62458.76
03:10:55 PM   8456.30      0.00   8456.30      0.00  62645.15
03:11:55 PM   7257.61      0.00   7257.61      0.00  57384.76
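The bread/s and bwrtn/s columns are easier to read as throughput. The following sketch converts a block rate to MiB per second, assuming 512-byte blocks as described in the sar man page; blocks_to_mib is a hypothetical helper name.

```shell
# Hypothetical helper: convert a sar block rate (blocks per second) to
# MiB per second, assuming each block is 512 bytes.
blocks_to_mib() {
    awk -v blocks="$1" 'BEGIN { printf "%.2f\n", blocks * 512 / 1048576 }'
}

blocks_to_mib 58582.09   # bwrtn/s from the 03:08:55 PM sample -> 28.60
```

So the write rate in the first sample is roughly 28.6 MiB per second.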

The network log file shows packets received and transmitted per second and kilobytes received and transmitted per second:

03:00:01 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
03:08:55 PM      eth1   3285.46   2965.12    956.75   1824.97      0.00      0.00      1.05
03:09:55 PM      eth1   3640.33   3307.06   1053.38   2074.92      0.00      0.00      1.14
03:10:55 PM      eth1   3617.67   3283.23   1047.62   2061.22      0.00      0.00      1.65
03:11:55 PM      eth1   2917.34   2657.74    842.35   1686.10      0.00      0.00      1.38
03:12:55 PM      eth1   3859.74   3502.98   1119.06   2194.43      0.00      0.00      1.15

These can be charted to provide a simple overview of server resource usage.
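Most charting tools and spreadsheets import comma-separated files more readily than sar's column layout. The following is a minimal sketch of a converter; sar_to_csv is a hypothetical helper name, and it assumes timestamps in the 12-hour format shown above (24-hour timestamps without AM/PM also work).

```shell
# Hypothetical helper: convert a sar log on stdin to CSV on stdout.
# Blank lines, the "Linux ..." banner, and the "Average:" summary are
# dropped; the time and its AM/PM marker are kept together as one field.
sar_to_csv() {
    awk '
        NF == 0 || /^Average/ || /^Linux/ { next }
        {
            line = $1
            start = 2
            if ($2 == "AM" || $2 == "PM") { line = line " " $2; start = 3 }
            for (i = start; i <= NF; i++) line = line "," $i
            print line
        }
    '
}

# Usage: sar_to_csv < sar.cpu.log > sar.cpu.csv
```

On long runs sar repeats its header line periodically; those repeated headers pass through this sketch unchanged and may need to be removed before charting.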
