Thursday, May 24, 2012

Jstat Java Heap and Garbage Collection Analysis

Java memory utilization and garbage collection activity can be analyzed fairly nicely with a set of seven charts showing utilization and capacity of the different generations, and garbage collection activity in the generations:

  • Garbage Collection - Old Generation (number of collections and time spent in collection)
  • Garbage Collection - Young Generation (number of collections and time spent in collection)
  • Utilization and Capacity - Old Generation Space
  • Utilization and Capacity - Survivor Space 0
  • Utilization and Capacity - Survivor Space 1
  • Utilization and Capacity - Eden Space
  • Utilization and Capacity - Permanent Space

The following is an example of these seven charts:

Garbage collections in the old generation, number of collections per interval and time spent in collections per interval:

Garbage collections in the young generation, number of collections per interval and time spent in collections per interval:
Old generation utilization and capacity:
Survivor space 1 utilization and capacity:
Survivor space 0 utilization and capacity:
Eden space utilization and capacity:
Permanent space utilization and capacity:


Jstat is a tool that comes with the Java JDK that can be used to monitor the java heap, including information on the generations, the number of garbage collections and the time taken by garbage collections.  With its output the above seven charts can be created.

Jstat can be collected as follows for a given java process pid:

jstat -gc 2862 60000


In this case information is output every 60 seconds for java process 2862:

S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT   
104832.0 104832.0 26921.8  0.0   838912.0 224806.4 9437184.0   797674.3  85568.0 85341.5    112    3.127   0      0.000    3.127
104832.0 104832.0  0.0   52798.7 838912.0 828526.5 9437184.0   843524.4  86144.0 86031.4    122    3.402   0      0.000    3.402
,104832.0 104832.0  0.0   26349.2 838912.0 282471.0 9437184.0   902120.4  86336.0 86084.9    133    3.759   0      0.000    3.759

The columns that are output can be interpreted as follows:

ColumnDescription
S0CCurrent survivor space 0 capacity (KB).
S1CCurrent survivor space 1 capacity (KB).
S0USurvivor space 0 utilization (KB).
S1USurvivor space 1 utilization (KB).
ECCurrent eden space capacity (KB).
EUEden space utilization (KB).
OCCurrent old space capacity (KB).
OUOld space utilization (KB).
PCCurrent permanent space capacity (KB).
PUPermanent space utilization (KB).
YGCNumber of young generation GC Events.
YGCTYoung generation garbage collection time.
FGCNumber of full GC events.
FGCTFull garbage collection time.
GCTTotal garbage collection time.


It is convenient to have a script that determines the java pid rather than having to enter it manually.  A script such as the following can do that:

ps -ef|grep java|grep -v grep|awk '{print "jstat -gc "$2" 10000}'

This will find the java process running and generate a jstat command on that process id.  If the system may have multiple java processes, additional filters are needed to isolate a single pid.

It is also convenient to prepend a timestamp to jstat rows.  While there is a  jstat flag to output milliseconds since java startup, a human-readable timestamp is more useful for later correlation to other system events.  This can be done using the following timestamp.sh perl script:


#!/usr/bin/perl
#while (<>) { print localtime() . ": $_"; }
while (<>) 
{
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
$secpad = sprintf( "%02d", $sec );
$minpad = sprintf( "%02d", $min );
$hourpad = sprintf( "%02d", $hour );
$monpad = sprintf( "%02d", $mon );
$mdaypad = sprintf( "%02d", $mday );
$yearpad = sprintf( "%02d", ($year + 1900) % 100 );
$time = "$monpad\/$mdaypad\/$yearpad $hourpad:$minpad:$secpad";
print $time . ",$_";
#17/04/112 16:40:14
}

The timestamp.sh script can be integrated into the jstat output as follows:

jstat -gc 2862 10000 | timestamp.sh

This can be combined with java pid filter and run in the background outputting to a log file as follows:

ps -ef|grep java|grep -v grep|awk '{print "jstat -gc "$2" 10000 | ./timestamp.sh > jstat.log"}' | sh &

The result is data such as the following:


04/22/12 15:07:49, S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT   
04/22/12 15:07:49,104832.0 104832.0 26921.8  0.0   838912.0 224806.4 9437184.0   797674.3  85568.0 85341.5    112    3.127   0      0.000    3.127
04/22/12 15:08:49,104832.0 104832.0  0.0   52798.7 838912.0 828526.5 9437184.0   843524.4  86144.0 86031.4    122    3.402   0      0.000    3.402
04/22/12 15:09:49,104832.0 104832.0  0.0   26349.2 838912.0 282471.0 9437184.0   902120.4  86336.0 86084.9    133    3.759   0      0.000    3.759
04/22/12 15:10:49,104832.0 104832.0 48749.7  0.0   838912.0 544053.8 9437184.0   954768.5  86336.0 86092.8    144    4.083   0      0.000    4.083
04/22/12 



Analyze jstat log

To get the total time and number of Full GC and young generation GCs, the cumulative values output by jstat need to be diff'd versus the previous line.  The jstat log can be analyzed into a simple csv file with this per-interval GC information using a perl script such as the following:


# Read in jstat log file
# Calculate the difference from previous lines for full and young gen GCs, both count and time
#    YGC Number of young generation GC Events.
#    YGCT Young generation garbage collection time.
#    FGC Number of full GC events.
#    FGCT Full garbage collection time.
#
#  04/21/12 16:15:26, S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT   
#  04/21/12 16:15:26,3328.0 3328.0  0.0   3248.5 1302784.0 732415.6 5863232.0  4035920.6  21248.0 8874.5    195   38.069   6     70.234  108.303
#
#  Output the result to jstat-parsed.log

$infile = $ARGV[0];
$outfile = "jstat-parsed.csv";

open (INFILE, "<$infile") or die "can't open input file $infile";
open (OUTFILE, ">$outfile") or die "can't open output file $outfile";

print OUTFILE "Time,GCYoungGenCount,GCYoungGenTime,GCOldGenCount,GCOldGenTime,OldGenCapacity,OldGenUtilization\n";
print "Time,YGCount,YGTime,FGCount,FGTime\n";
#read header
<INFILE>;
#prime variables on first line of input
$line = <INFILE>;
@cells = split(/\s+/, $line);
$YGC = $cells[11];
$YGCT = $cells[12];
$FGC = $cells[13];
$FGCT = $cells[14];
#print "PRIME:timestamphere,$YGC,$YGCT,$FGC,$FGCT\n";

# now read in each line, calculating the diff from previous line and outputting values
while ($line = <INFILE>) {
#$line = <INFILE>;
# get new row's values
@cells = split(/\s+/, $line);
$time = $cells[0] . " " . $cells[1];
$time =~ s/,.*//g;
$nYGC = $cells[11];
$nYGCT = $cells[12];
$nFGC = $cells[13];
$nFGCT = $cells[14];
$OGCapacity = $cells[7];
$OG = $cells[8];
#print "NEW:$time,$nYGC,$nYGCT,$nFGC,$nFGCT\n";

# calculate differences from previous row's values
$diffYGC = $nYGC - $YGC;
$diffYGCT = $nYGCT - $YGCT;
$diffFGC = $nFGC - $FGC;
$diffFGCT = $nFGCT - $FGCT;
print OUTFILE "$time,$diffYGC,$diffYGCT,$diffFGC,$diffFGCT,$OGCapacity,$OG\n";
print "$time,$diffYGC,$diffYGCT,$diffFGC,$diffFGCT,$OGCapacity,$OG\n";

# now reset variables to current line's values
$YGC = $nYGC;
$YGCT = $nYGCT;
$FGC = $nFGC;
$FGCT = $nFGCT;
}
close (OUTFILE);
close (INFILE);



This perl script will give output such as the following:

Time,GCYoungGenCount,GCYoungGenTime,GCOldGenCount,GCOldGenTime,OldGenCapacity,OldGenUtilization
04/21/12 15:39:13,68,3.349,2,0.0530000000000002,10754040.0,5733223.0
04/21/12 15:44:13,68,3.261,2,0.0499999999999998,10754040.0,4523825.4
04/21/12 15:49:13,69,3.41,2,0.0550000000000002,10754040.0,3766710.6
04/21/12 15:54:13,68,3.23700000000001,2,0.0509999999999999,10754040.0,3008307.3
04/21/12 15:59:13,68,3.169,2,0.052,10754040.0,2428630.4


2 comments:

  1. Awesome post... can you please share the code for generating 7 charts for GC

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete