Monday, August 27, 2012

Performance Testing ROI and Black Swans


What return on investment/ROI can be expected from performance testing?  There are two categories of "return" for performance and stability testing.

One category of return is a simple, quantifiable return in which performance testing results allow the amount of hardware required to be reduced in a couple of ways:

  • Application tuning and optimization allowing the same amount of hardware to handle additional traffic
  • Proof of hardware overcapacity, showing that a reduced amount of hardware would be sufficient to handle peak traffic, allowing hardware inventory to be reduced  

The second category of return is not a return so much as an insurance policy, or protection against a fat tail or black swan event.  In this case, you are protecting yourself against system stability bugs whose consequences can be absolutely catastrophic, up to the complete destruction of the business.

Consider the example of Knight Capital's botched software rollout on August 1st, 2012.  The rollout triggered unexpected trades that cost the firm $440 million, depleting its operating capital.  This weakened position allowed outside investors to take a controlling 70% stake in the company as terms of its bailout.

By doing performance and stability testing, catastrophic stability bugs may be missed due to limitations of the test environment or test data, or failure to anticipate the nature of production behavior leading to the disaster.  However, if a catastophic stability bug is discovered and fixed, which happens regularly, a black swan event has been quietly and successfully sidestepped.  The value is avoiding a devastating black swan event that brings down the company through loss of customers, loss of reputation, lawsuits, takeovers, etc.  Not to mention more individual losses such as loss of job, bonus, promotion, career, etc.  The cost of the testing is fixed and predicable, simply the cost of supporting ongoing performance testing, in effect the insurance policy premiums protecting against the black swan.


Friday, August 24, 2012

How To Add a Timestamp to Each Line of Output on Linux

For performance monitoring on linux, you typically would like your shell script metrics and statistics that are logged periodically to include the timestamp with each line output.  That way, the information can later be correlated with other system events such as event log entries or system errors.  There are two basic scenarios.
  1. Your shell script is echoing information to standard output periodically in some loop
  2. You are just calling a program such as vmstat directly that outputs information periodically
Adding a timestamp in the two scenarios can be handled easily as follows:

Monitoring Loop

If your monitoring script includes a loop that is outputting stats periodically, a timestamp can be added to each line by capturing the output of the "date" command and then adding that information to the output line.
The output of the date command can be captured by using forward single quotes, as follows:

savedDate = `date`

Here is an example of a script making use of this technique:

#
# output stats every $1 seconds until stopped
#
if [ "$1" == "" ] ; then
  echo "syntax= .stats.sh <intervalSeconds>"
  exit
fi
time=$1
echo "Time , QueriesPerSechod" > mysql-stats.log

while :
do
    qps=`mysql -u mysql -e "show status" | awk '{if ($1 == "Queries") print $2}'`
    date=`date`
    echo "$date , $qps" >> mysql-stats.log
    sleep $time
done

Here is an example of the output:

Time , QueriesPerSecond
Wed Aug  8 16:11:58 PDT 2012 , 3029
Wed Aug  8 16:12:08 PDT 2012 , 18195 
Wed Aug  8 16:12:18 PDT 2012 , 18903
...

Direct Output

If you are just calling an app or script such as vmstat, iostat, etc., that outputs stats periodically for you, you can achieve the same effect by piping the command through an app that prepends a timestamp to each line.  For example, to prepend timestamps to the output from vmstat, free, and iostat, you would do the following, depending on a script "timestamp.sh" with the timestamp prepend behavior:

# free
free -g -s 10 | ./timestamp.sh > free.log &
# vmstat
vmstat 10 | ./timestamp.sh > vmstat.log &
# iostat
iostat 10 | ./timestamp.sh > iostat.log &

This would give output such as the following:

07/23/12 10:41:44,procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
07/23/12 10:41:44, r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
07/23/12 10:41:44, 6  0   2736 3381052 505844 159601296    0    0    62   119    1    1  6  1 93  0  0
07/23/12 10:41:54, 2  0   2736 3358836 505900 159601728    0    0   102    25  640  414  1  0 99  0  0
07/23/12 10:42:04, 7  0   2736 3329736 505948 159601936    0    0     9    32  927 1263  1  0 98  0  0

The timestamp.sh script is as follows:


#!/usr/bin/perl
#while (<>) { print localtime() . ": $_"; }
while (<>) 
{
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
$secpad = sprintf( "%02d", $sec );
$minpad = sprintf( "%02d", $min );
$hourpad = sprintf( "%02d", $hour );
$monpad = sprintf( "%02d", $mon );
$mdaypad = sprintf( "%02d", $mday );
$yearpad = sprintf( "%02d", ($year + 1900) % 100 );
$time = "$monpad\/$mdaypad\/$yearpad $hourpad:$minpad:$secpad";
print $time . ",$_" ;
#17/04/112 16:40:14
}





Tuesday, August 14, 2012

How to Measure Response Times in Java

This post describes how to measure response time in Java.  You can use either System.currentTimeMillis() or System.nanoTime() depending on what response times you will be recording.

System.currentTimeMillis is accurate to 1 millisecond, so is typically appropriate if you have response times on the order of magnitude of 1 second or more.  Using currentTimeMillis, you could report results in seconds with three decimals of precision, such as 1.205 seconds, or results in milliseconds with zero decimals of precision, such as 1205 milliseconds. 

nanoTime is typically appropriate if you want to report response times with greater precision than milliseconds, useful in cases in which differences of a fraction of a millisecond or microsecond are needed.

The following java code sample shows how to capture the response time of a transaction using both currentTimeMillis and nanoTime, showing the results in milliseconds with three decimals of precision.

Code Sample


/**
 * Created with IntelliJ IDEA.
 * User: ehammer
 */

import java.net.*;
import java.io.*;

public class TimerTest
{
    public static final Double NANO_TO_MILLIS = 1000000.0;
    public static void main(String[] args) throws MalformedURLException, IOException
    {
        // start timer
        long nanoStart = System.nanoTime();
        long milliStart = System.currentTimeMillis();

        // do work to be timed
        doWork();

        // stop timer
        long nanoEnd = System.nanoTime();
        long milliEnd = System.currentTimeMillis();

        // report response times
        long nanoTime = nanoEnd - nanoStart;
        long milliTime = milliEnd - milliStart;
        reportResponseTimes(nanoTime, milliTime);
    }

    private static void reportResponseTimes(long nanoTime, long milliTime)
    {
        // convert nanoseconds to milliseconds and display both times with three digits of precision (microsecond)
        String nanoFormatted = String.format("%,.3f", nanoTime / NANO_TO_MILLIS);
        String milliFormatted = String.format("%,.3f", milliTime / 1.0 );

        System.out.println("Milliseconds using nanoTime(): " + nanoFormatted);
        System.out.println("Milliseconds using currentTimeMillis: " + milliFormatted);
    }

    private static void doWork() throws MalformedURLException, IOException
    {
        URL url = new URL("http://www.google.com/");
        URLConnection conn = url.openConnection();
        BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        while (in.readLine() != null) {}
        in.close();
    }
}


Sample Output


Milliseconds using nanoTime(): 277.913
Milliseconds using currentTimeMillis: 278.000

Process finished with exit code 0

Walk Through

Timer is started as follows:
        long nanoStart = System.nanoTime();
        long milliStart = System.currentTimeMillis();

The transaction to be timed is then run.

The timer is stopped:
        long nanoEnd = System.nanoTime();
        long milliEnd = System.currentTimeMillis();

The difference between end and start is taken:
        long nanoTime = nanoEnd - nanoStart;
        long milliTime = milliEnd - milliStart;

Results are formatted in the desired precision, in this case as milliseconds with three decimals of precision (microsecond precision):
        String nanoFormatted = String.format("%,.3f", nanoTime / NANO_TO_MILLIS);
        String milliFormatted = String.format("%,.3f", milliTime / 1.0 );

Note that nanoTime results show microsecond precision, while currentTimeMillis shows results in 1 millisecond increments.

Monday, August 13, 2012

Recording Loadrunner Traffic in Fiddler

It can be useful to capture loadrunner http traffic in fiddler.  This can be useful, for example, for seeing the timeline of http requests, which ones are slow, what sort of parallelism exists, if any, etc.  It can also be useful to compare the vugen timeline of requests to the same information captured directly from the browser, as vugen may not match browser behavior as expected, by serializing what is sent asynchronously in parallel by the browser, etc.

Vugen can be configured in Runtime Settings, Internet Protocol, Proxy, Use Proxy Server, address=localhost, port=8888.
You then put breakpoints in the script before and after the section you want to record, run the script in vugen to the first breakpoint (with logging disabled), start capturing traffic in fiddler, run the script to the second breakpoint and stop capturing traffic in fiddler.  You can then see a visualization of the request pipeline in fiddler for your loadrunner traffic:
It is also a good idea to capture the traffic by running the script from the controller on the load generator to reduce vugen overhead.  You will get a more compact and realistic pipeline that way.

You can then record the same transaction manually using the web browser for comparison to loadrunner.