Wednesday, March 27, 2013

Analyzing the Performance Issue Caused by WebLogic Session Size Too Big

In this article, we will show you lots of interesting stuffs:
  1. How to investigate an issue with too many live objects kept in old generation, which cannot be reclaimed in Full GC by HotSpot VM
    • For background information regarding Garbage Collectors, read [1-7]
  2. How important sesssion-timeout element in web.xml is
  3. How to use java tool to generate heap dump
  4. How to use Eclipse Memory Analyzer to analyze heap dump (i.e. fod.hprof file)
    • Download link: here

The Issues


We have seen slow performance in one of our benchmarks (i.e., FOD).  Here are the symptoms:
  • GC took about 50% of the application's CPU time
  • Full GC's were not triggered by full Perm Gen, but by full Old Gen
    • The old generation is completely full and it isn't clearing up anything but a few soft references during Full GC.
So, the symptoms all point to too many live objects kept in Old Gen and they have caused many Full GC's.  There could be different causes for the above symptoms.  To investigate, you need to create heap dumps and check out what objects HotSpot VM is holding onto.  Below we will show you how to investigate this.

Generating Heap Dump


First, we have used a Java tool jmap to generate a summary of the heap contents:

$./jmap -histo 16345 >/scratch/aime1/tmp/fod.summary

 num     #instances         #bytes  class name
----------------------------------------------
   1:      15777231      822030680  [Ljava.lang.Object;
   2:       3074270      463615192  [C
   3:       4076972      331840256  [Ljava.util.HashMap$Entry;
   4:       9287101      297187232  java.util.HashMap$Entry
   5:       2105437      151591464  org.apache.myfaces.trinidad.bean.util.PropertyHashMap
   6:       4075444       97810656  java.util.HashMap$FrontCache
   7:       1893014       90864672  java.util.HashMap
   8:       2018043       80721720  org.apache.myfaces.trinidad.bean.util.FlaggedPropertyMap
   9:       3017161       72411864  java.lang.String


Unfortunately, it didn't tell us much regarding the biggest java object that have taken ~800M bytes.  But, we did notice that two other property maps also took up a lot of memory space.  So, the next step is to generate a full dump as follows:

$./jmap -dump:live,file=/scratch/aime1/tmp/fod.hprof 16345

To analyze the full heap dump, you need to use Eclipse Memory Analyzer which we have chosen the standalone version.

Analyzing Heap Dump With Memory Analyzer (MAT)


I started the Memory Analyzer with the following command:

$ cd /scratch/sguan/mat/
$ ./MemoryAnalyzer &

When I opened the heap dump file (i.e., fod.hprof), MAT failed with a message related to Java Heap Space.  So, I need to modified the following line in the MemoryAnalyzer.ini file:

-vmargs
-Xmx1024m

by changing -Xmx1024m to -Xmx6240m.  Note that how big you should set for -Xmx option depends on:
  • Size of heap dump
    • Our original file size is 6.4g and was reduced to 3.6g after loading.  Note that MAT will remove unreachable objects in the loading process.
  • Size of physical RAM in your system

The Culprit— WebLogic Session Objects


From the MAT, we have found out that the biggest live object kept in the heap was:
  • weblogic.servlet.internal.session.MemorySessionContext @ 0x705e0ff08 
Also, for two previously-mentioned property map objects, they are also related to the session object.  If you trace those objects back to the GC root, you will see that they are pointed to by the session state. They are the reason the session state is so big.

session-timeout Element in web.xml


Session object is used by WebLogic container to track a user's session that uses the StoreFrontModule web application in our FOD benchmark.  Every time a new user comes in, it starts a session. This session object keeps user's data or states in it. If the user leaves, the session is considered idle and the server will retain it in memory for a period of time before reclaiming it.

Java web applications can be configured with a session timeout value which specifies the number of minutes a session can be idle before it is abandoned.  This session timeout element is defined in the Web Application Deployment Descriptor web.xml[9]. For example, this was the session timeout value we had:

<session-config>
<session-timeout>35</session-timeout>
</session-config>

The problem with our benchmark's slowness is due to this session timeout value being too high.  This has caused the sessions being piled up and taking up lots of memory. Setting it to be a lower value (see [10] on how to patch web.xml in an EAR), our benchmark then performs normally.

No comments: