Tuesday, December 3, 2013

Performance Testing in a Continuous Integration Environment

Software development groups with continuous integration in place (to automatically build and test on every check-in) should ideally include performance and load testing in the continuous integration process. This article describes my experience implementing such a solution at a large enterprise software company.

Infrastructure Needed

  • A continuous integration system such as Jenkins, TeamCity, etc., that can build code and run tests
  • An automated deployment system that can programmatically deploy and start newly built code
  • A performance test framework whose tests can be run by the continuous integration system against the automatically deployed code

Performance test framework
The performance test framework could be a third-party framework such as JMeter that integrates well with CI systems, unit test frameworks (such as TestNG or JUnit), and other open source software. The solution I chose was to implement an in-house performance test framework, designed to make it trivial to take an arbitrary JUnit test class and convert it to a performance test with less than 5 minutes of work. This allows the extensive library of automated tests already implemented (in the JUnit framework) to be leveraged as performance test cases.

The framework supported the following functionality, which would be a reasonable minimal feature set for any performance test framework:


  • Allow any test case to be converted to a performance test with minimal effort.  (Here, this is achieved by having the test class extend a PerfTestBase class and implement an IPerfTest interface; see the sketch after this list.)
  • Allow any block of code to be wrapped as a transaction to be measured.
  • Allow the test to be run for a configured amount of time or a configured number of iterations.
  • Allow the test to be run concurrently by a configurable number of client threads.
  • Allow the test to gradually scale up load as configured, such as 1 test case at a time, 2 concurrent test cases, 5 concurrent test cases, etc.
  • Handle exceptions and errors.
  • Allow the test to pass or fail based on configurable performance criteria such as response time SLAs, performance versus previous test runs, etc.
  • Allow the test to be run in an IDE, from the command line, and from the continuous integration system.
  • Persist results in a database for historical reporting.
  • Automatically generate test result artifacts and charts including response time trends over time, performance relative to previous test runs, performance versus number of concurrent test cases, etc.
  • Automatically calculate any performance metrics needed for reporting.
  • Automatically push results to sinks such as email distribution lists, dashboards, third party reporting repositories, etc.
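
As a rough illustration of the first item above, converting an existing JUnit test might look something like the sketch below. PerfTestBase and IPerfTest are the names used by the in-house framework, but the specific methods shown (the transaction calls, the SLA assertion) and the SearchClient being exercised are hypothetical stand-ins for illustration, not the actual API.

public class SearchServicePerfTest extends PerfTestBase implements IPerfTest {

    // Hypothetical client under test; in practice this is whatever the
    // original JUnit test already exercises.
    private final SearchClient client = new SearchClient("http://localhost:8080");

    // Called repeatedly by the framework for the configured duration or
    // iteration count, on the configured number of concurrent client threads.
    @Override
    public void runIteration() throws Exception {
        beginTransaction("searchQuery");   // wrap the block to be measured
        try {
            client.search("laptop");       // the original test logic
        } finally {
            endTransaction("searchQuery");
        }
    }

    // Pass/fail criteria evaluated after the run, e.g. a response time SLA.
    @Override
    public void verifyResults() {
        assertAverageResponseTimeBelowMillis("searchQuery", 500);
    }
}

Because the duration, iteration count, thread count, and load ramp-up are all external configuration, the same class can be run from an IDE for a quick functional check or from the continuous integration system as a sustained, multi-threaded load test.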

Stretch goals for a performance test framework include the following:


  • Handle stress test cases in which the test process, the services under test, or the OS might freeze up, requiring external monitoring and catastrophic failure handling.
  • Handle capacity testing, in which load is increased programmatically until the absolute capacity of a service is reached.
  • Handle performance of fault-tolerance scenarios in which some of the services under test are programmatically brought down and up.


Findings
My experience doing performance testing within the continuous integration system is that the tests find deep stability and performance bugs on a daily basis. Bugs found include deadlocks, race conditions, out-of-memory errors, process crashes, response time regressions, etc. This allows a very large amount of performance testing to be done with minimal staffing. There are up-front costs in implementing the tests and getting them running in the continuous integration system. Once the performance tests are in place, the bulk of the ongoing work involves triaging, driving, and resolving the performance and stability bugs that are found automatically each day.

I believe this type of system constitutes a significant leap forward over traditional manual performance testing, which is labor intensive and typically requires expensive, antiquated third-party testing tools.

Tuesday, February 5, 2013

Timing Command Line Applications on Linux



The command-line time utility, available on Linux as /usr/bin/time, can be used to measure the elapsed time of test applications launched from the command line, along with other metrics such as CPU time. Using the --format option, the output can be written as a single CSV line per application run, rather than the default multi-line output, so the results can easily be collected into a CSV file and charted.

Script


echo TestCase , Seconds , RealTime , CpuTimeUser , CpuTimeKernel
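# Note: /usr/bin/time writes its --format line to stderr, so to capture these
# CSV rows in a file, redirect stderr (e.g. "./timing.sh > results.csv 2>&1")
# or use time's -o FILE option.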
/usr/bin/time --format="MyTestCase1 , %e , %E , %U , %S" ./mytestcase1.sh mytestcase1parameter

/usr/bin/time --format="MyTestCase2 , %e , %E , %U , %S" java -jar mytestcase2.jar


Output


TestCase , Seconds , RealTime , CpuTimeUser , CpuTimeKernel
MyTestCase1 , 679.33 , 11:19.33 , 227.13 , 105.85
MyTestCase2 , 460.99 , 7:40.99 , 96.32 , 44.64

Format String


The format string can include any of the following options (the descriptions below are from the time man page):

       Time
       %E     Elapsed real time (in [hours:]minutes:seconds).
       %e     (Not in tcsh.) Elapsed real time (in seconds).
       %S     Total number of CPU-seconds that the process spent in kernel mode.
       %U     Total number of CPU-seconds that the process spent in user mode.
       %P     Percentage of the CPU that this job got, computed as (%U + %S) / %E.

       Memory
       %M     Maximum resident set size of the process during its lifetime, in Kbytes.
       %t     (Not in tcsh.) Average resident set size of the process, in Kbytes.
       %K     Average total (data+stack+text) memory use of the process, in Kbytes.
       %D     Average size of the process's unshared data area, in Kbytes.
       %p     (Not in tcsh.) Average size of the process's unshared stack space, in Kbytes.
       %X     Average size of the process's shared text space, in Kbytes.
       %Z     (Not in tcsh.) System's page size, in bytes.  This is a per-system constant, but varies between systems.
       %F     Number of major page faults that occurred while the process was running.  These are faults where the page has to be read in from disk.
       %R     Number of minor, or recoverable, page faults.  These are faults for pages that are not valid but which have not yet been claimed by other virtual pages.  Thus the data in the page is still valid but the system tables must be updated.
       %W     Number of times the process was swapped out of main memory.
       %c     Number of times the process was context-switched involuntarily (because the time slice expired).
       %w     Number of waits: times that the program was context-switched voluntarily, for instance while waiting for an I/O operation to complete.

       I/O
       %I     Number of file system inputs by the process.
       %O     Number of file system outputs by the process.
       %r     Number of socket messages received by the process.
       %s     Number of socket messages sent by the process.
       %k     Number of signals delivered to the process.
       %C     (Not in tcsh.) Name and command-line arguments of the command being timed.
       %x     (Not in tcsh.) Exit status of the command.