Sunday, January 15, 2012

Understanding Garbage Collection

Every Java programmer will encounter
  • java.lang.OutOfMemoryError
sooner or later.  The most commonly offered advice includes:
  • Try increasing MaxPermSize first
  • Increase the maximum Java heap size to 512m
In this article, we explain what the Java heap space and MaxPermSize are.  Using HotSpot as our JVM, we cover the following topics: the garbage collector, minor versus full GC, OutOfMemoryError, and the JVM command options.
Garbage Collector

A garbage collector is responsible for
  • Allocating memory
  • Ensuring that any referenced objects remain in memory 
  • Recovering memory used by objects that are no longer reachable from references in executing code. 
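
The notion of reachability in the list above can be illustrated with a small sketch. The class name ReachabilityDemo and its fields are hypothetical, chosen only for this example. It keeps an object reachable through a static field, observes it through a WeakReference, and then drops the strong reference. Note that System.gc() is only a hint, so the second result is typical for HotSpot but not guaranteed.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    // A static field acts as a GC root: whatever it points to stays reachable.
    static Object root;

    /** Returns true while the referent has not been cleared by the collector. */
    static boolean isAlive(WeakReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) {
        root = new byte[1024 * 1024];                      // strongly reachable via 'root'
        WeakReference<Object> probe = new WeakReference<>(root);

        System.gc();                                       // a hint, not a command
        System.out.println("while referenced: alive=" + isAlive(probe));

        root = null;                                       // drop the only strong reference
        System.gc();
        // Usually false on HotSpot at this point, but the JVM makes no promise.
        System.out.println("after release:    alive=" + isAlive(probe));
    }
}
```

A weak reference does not keep its referent alive, which is why it can serve as a probe here: as long as the static field points to the array, the collector must retain it.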

The process of locating and removing those dead objects can stall your Java application and consume as much as 25 percent of throughput.
The GC provided by the HotSpot VM is a generational GC[9,16,18]: memory is divided into generations, that is, separate pools holding objects of different ages.  The design is based on the Weak Generational Hypothesis:
  1. Most newly-allocated objects die young
  2. There are few references from old to young objects

This separation into generations has proven effective at reducing garbage collection pause times and overall costs in a wide range of applications.  The HotSpot VM divides its heap space into three generational spaces:
  1. Young Generation
    • When a Java application allocates Java objects, those objects are allocated in the young generation space
    • Is typically small and collected frequently
    • The garbage collection algorithm chosen for a young generation typically puts a premium on speed, since young generation collections are frequent. 
  2. Old Generation
    • Objects that are longer-lived are eventually promoted, or tenured, to the old generation
    • Is typically larger than the young generation, and its occupancy grows more slowly
    • The old generation is typically managed by an algorithm that is more space efficient, because the old generation takes up most of the heap and old generation algorithms have to work well with low garbage densities.
  3. Permanent Generation
    • Holds VM and Java class metadata as well as interned Strings and class static variables
    • Note that PermGen will be removed from Java heap in JDK 8.
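
To see how a running JVM splits its memory, the standard java.lang.management API can list the memory pools. The following is a minimal sketch; the class name MemoryPoolReport is hypothetical. Pool names depend on the JVM version and the selected collector (for example, the parallel collector on older HotSpot releases reports names such as "PS Eden Space" and "PS Old Gen"), so the output should be read as indicative only.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolReport {
    /** Builds one descriptive line per memory pool reported by the running JVM. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();           // null if the pool is no longer valid
            long usedKb = (usage == null) ? -1 : usage.getUsed() / 1024;
            lines.add(String.format("%-16s %-32s used=%d KB",
                    pool.getType(), pool.getName(), usedKb));
        }
        return lines;
    }

    /** Counts the pools that belong to the Java heap (as opposed to non-heap memory). */
    static long heapPoolCount() {
        long count = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```
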
Minor vs Full GC

A garbage collection occurs when one of the three generational spaces is considered full and a request for additional space cannot be satisfied.  There are two types of garbage collection activity: minor and full.  When the young generation fills up, it triggers a minor collection, in which surviving objects are copied to a survivor space or, once they are old enough, promoted to the old generation.  When the old generation fills up, it triggers a full collection, which involves the entire object heap.

Minor GC
  • Occurs when the young generation space does not have enough room
  • Tends to be short in duration relative to full garbage collections
Full GC
  • When the old or permanent generation fills up, a full collection is typically done.
  • A Full GC can also be started explicitly by the application using System.gc()
  • Takes longer, with a duration that depends on the heap size.  As a rule of thumb, a full GC that takes longer than 3 to 5 seconds is too long[1].
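
One way to observe how often collections happen is to query the collectors through java.lang.management. The sketch below is illustrative; the class name GcActivity and the churn helper are hypothetical. It allocates short-lived arrays to put pressure on the young generation and then prints each collector's count and accumulated time. Collector names vary by JVM and configuration (the parallel collector, for example, reports "PS Scavenge" for minor and "PS MarkSweep" for full collections).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcActivity {
    // Publishing each array to a volatile field keeps the JIT from removing the allocation.
    static volatile byte[] sink;

    /** Allocates 'count' short-lived 8 KB arrays; each one becomes garbage on the next iteration. */
    static void churn(int count) {
        for (int i = 0; i < count; i++) {
            sink = new byte[8 * 1024];
        }
    }

    /** Sums the collection counts of all collectors; undefined counts (-1) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        churn(200_000);                                    // roughly 1.6 GB of garbage in total
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-25s collections=%d time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

With a workload like this, the young generation collector normally accounts for most or all of the collections, since almost nothing survives long enough to be promoted.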
Full garbage collections typically have the longest duration and, as a result, are the number one reason applications fail to meet their latency or throughput requirements.  The goal of GC tuning is to reduce the number and frequency of full garbage collections the application experiences.  There are two sides from which to approach this:
  • From the system side
    • You should use as large a heap size as possible without causing your system to "swap" pages to disk.  Typically, you should use 80 percent of the available RAM (not taken by the operating system or other processes) for your JVM[1].
    • The larger the Java heap space, the better the garbage collector and application perform when it comes to throughput and latency.
  • From the application side
    • Reducing object allocations and, more importantly, object retention reduces the live data size, which in turn helps both the GC and the application perform better.
    • See Java Performance Tips [19] for further advice.
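
As an illustration of reducing object retention on the application side, consider a cache. An unbounded map keeps every entry reachable forever, so the live data size only grows. The sketch below, with the hypothetical class name BoundedCache, caps the number of retained entries using the removeEldestEntry hook of java.util.LinkedHashMap. This is one possible technique, not a recommendation taken from the references.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true);                            // true = order entries by access (LRU)
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<Integer, byte[]> cache = new BoundedCache<>(100);
        for (int i = 0; i < 10_000; i++) {
            cache.put(i, new byte[1024]);                  // older entries become unreachable
        }
        System.out.println("entries retained: " + cache.size());
    }
}
```

Evicted entries are no longer referenced by the map, so the collector can reclaim them instead of promoting them to the old generation.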
OutOfMemoryError

The dreaded OutOfMemoryError is something that no Java programmer ever wishes to see. Yet it can happen, particularly if your application processes a large amount of data or is long-lived.
The total memory size of an application includes:
  • Java heap size
  • Thread stacks
  • I/O buffers
  • Memory allocated by native libraries
If an application runs out of memory and the JVM's GC cannot reclaim enough space, an OutOfMemoryError will be thrown.  An OutOfMemoryError does not necessarily imply a memory leak.  The problem might simply be one of configuration, for example if the specified heap size (or the default size if none is specified) is insufficient for the application.
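
Before suspecting a leak, it can help to check how close the application is to its configured heap limit. The following sketch uses java.lang.Runtime for that; the class name HeapHeadroom is hypothetical. It covers only the Java heap, not the thread stacks, I/O buffers, or native library memory listed above.

```java
class HeapHeadroom {
    /** Heap currently in use: the committed heap minus the free part of it. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Space the heap can still grow into before reaching the maximum heap size. */
    static long headroomBytes() {
        return Runtime.getRuntime().maxMemory() - usedBytes();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap:       " + rt.maxMemory() / mb + " MB");
        System.out.println("committed heap: " + rt.totalMemory() / mb + " MB");
        System.out.println("used heap:      " + usedBytes() / mb + " MB");
        System.out.println("headroom:       " + headroomBytes() / mb + " MB");
    }
}
```

If the used heap stays close to the maximum even right after a full collection, the heap is either too small for the live data or something is retaining objects longer than intended.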

JVM Command Options

Whether you are running a client or a server application, if your system is running low on heap and spending a lot of time in garbage collection, you'll want to investigate adjusting your heap size.  At the same time, you don't want to set the heap size so large that it affects other applications running on the system.

GC tuning is non-trivial.  Finding the optimal generation space sizes involves an iterative process[3,10,12].  Here we assume you have successfully identified the optimal heap space sizes for your application.  Then you can use the following JVM command options to set them:



GC Command Line Option          Description
-Xms                            Sets the initial and minimum Java heap size.  For example, -Xms512m (note that there is no "=").
-Xmx                            Sets the maximum Java heap size.
-Xmn                            Sets the initial, minimum, and maximum size of the young generation space.  Note that the size of the old generation space is implicitly set based on the size of the young generation space.
-XX:PermSize=<n>[g|m|k]         Sets the initial and minimum size of the permanent generation space.
-XX:MaxPermSize=<n>[g|m|k]      Sets the maximum size of the permanent generation space.
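
To confirm that the options in the table actually reached the JVM, a program can inspect its own startup arguments. The sketch below is a minimal example; the class name HeapSettings and the parseSize helper are hypothetical. It reads the arguments through RuntimeMXBean.getInputArguments() and converts sizes such as 512m into bytes. It assumes the simple <n>[g|m|k] form shown in the table and does not handle other suffixes.

```java
import java.lang.management.ManagementFactory;
import java.util.Locale;

class HeapSettings {
    /** Parses a size such as "512m", "2g", "64k", or a plain byte count, into bytes. */
    static long parseSize(String text) {
        String s = text.trim().toLowerCase(Locale.ROOT);
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty size");
        }
        long multiplier = 1;
        switch (s.charAt(s.length() - 1)) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024; break;
            case 'g': multiplier = 1024L * 1024 * 1024; break;
            default: break;                                // no suffix: value is already in bytes
        }
        String digits = (multiplier == 1) ? s : s.substring(0, s.length() - 1);
        return Long.parseLong(digits) * multiplier;
    }

    public static void main(String[] args) {
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            if (arg.startsWith("-Xms") || arg.startsWith("-Xmx") || arg.startsWith("-Xmn")) {
                // The size follows the four-character option name directly, with no "=".
                System.out.println(arg.substring(0, 4) + " = " + parseSize(arg.substring(4)) + " bytes");
            }
        }
        System.out.println("max heap seen by the JVM: " + Runtime.getRuntime().maxMemory() + " bytes");
    }
}
```

For example, after compiling the class, running it as java -Xms512m -Xmx512m HeapSettings prints both options converted to bytes. The maximum reported by Runtime.maxMemory() can be slightly lower than the -Xmx value, depending on the collector.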


As a final note, ergonomics for servers was first introduced in Java SE 5.0[13].  It has greatly reduced tuning time for server applications, particularly for heap sizing and advanced GC tuning.  In many cases, running on a server with no tuning options at all is the best tuning you can do.

References
  1. Tuning Java Virtual Machines (JVMs)
  2. Diagnosing Java.lang.OutOfMemoryError
  3. Java Performance by Charlie Hunt and Binu John
  4. Java HotSpot VM Options
  5. GCViewer (a free open source tool)
  6. Comparison Between Sun JDK And Oracle JRockit 
  7. Java SE 6 Performance White Paper 
  8. FAQ about Garbage Collection
  9. Hotspot Glossary of Terms 
  10. Memory Management in the Java HotSpot Virtual Machine White Paper
  11. Java Hotspot Garbage Collection 
  12. FAQ about GC in the Hotspot JVM  (with good details) 
  13. Java Heap Sizing: How do I size my Java heap correctly? 
  14. Java Tuning White Paper 
  15. JRockit JVM Heap Size Options
  16. Pick up performance with generational garbage collection
  17. Which JVM?
  18. Memory Management in the Java HotSpot™ Virtual Machine
  19. Java Performance Tips
  20. A Case Study of java.lang.OutOfMemoryError: GC overhead limit exceeded
  21. HotSpot—java.lang.OutOfMemoryError: PermGen space
