Sunday, September 14, 2014

G1 GC: Tuning Mixed Garbage Collections in JDK 8

A good place to start when tuning G1 GC (Garbage First Garbage Collector) is [1]. For the latest information, read [2].

This article assumes you have at least a basic knowledge of G1 GC and focuses on tuning Mixed GC for better performance.

The Need to Tune Mixed GC


G1 GC is a generational garbage collector; read [3] to learn what the eden, survivor, and old generation spaces are. It also uses a region-based architecture, which divides the large contiguous Java heap into multiple fixed-size heap regions.

The destination region for a particular object depends upon the object's age; an object that has aged sufficiently is evacuated to an old generation region (or promoted); otherwise, the object is evacuated to a survivor region and will be included in the CSet[5] of the next young or mixed garbage collection.

During young collections, G1 GC adjusts its young generation (eden and survivor sizes) to meet its pause target. During mixed collections, the G1 GC adjusts the number of old regions that are collected based on a target number of mixed garbage collections, the percentage of live objects in each region of the heap, and the overall acceptable heap waste percentage (see details later).

Depending on your application's workload, a Full GC can be expensive in G1 GC. Because Mixed GC collects both young and old regions, tuning it can improve performance: the goal is to reduce the number of Full GCs while your application runs.

More on Mixed Garbage Collections


Upon successful completion of a concurrent marking cycle, the G1 GC switches from performing young garbage collections to performing mixed garbage collections. In a mixed garbage collection, the G1 GC optionally adds some old regions to the set of eden and survivor regions that will be collected. The exact number of old regions added is controlled by a number of flags that will be discussed later. After the G1 GC collects a sufficient number of old regions (over multiple mixed garbage collections), G1 reverts to performing young garbage collections until the next marking cycle completes.

To summarize, Mixed GC can be characterized by:
  • Collecting both young and old regions
  • Denoted by: [GC pause (G1 Evacuation Pause) (mixed) in the GC log file[2]
  • Only after a completed marking cycle
  • A sequence of mixed GC events to collect old regions, up to 8 by default
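
One way to check how many mixed collections a run actually performed is to scan the GC log for the marker quoted in the list above. The following is a minimal sketch, assuming the JDK 8 log format shown there; the class name MixedGcLogScan and the log path passed on the command line are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

final class MixedGcLogScan {
    // Marker text taken from the log line format quoted in the list above.
    static final String MIXED_MARKER = "[GC pause (G1 Evacuation Pause) (mixed)";

    /** Counts the log lines that contain the given marker text. */
    static long countLinesContaining(List<String> lines, String marker) {
        long count = 0;
        for (String line : lines) {
            if (line.contains(marker)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a GC log file (hypothetical; supply your own).
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("mixed pauses: " + countLinesContaining(lines, MIXED_MARKER));
    }
}
```

Comparing this count before and after a flag change gives a rough view of whether the change produced more or fewer mixed collections.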

Tuning Mixed Garbage Collections


You can play with the following options to tune Mixed GC:[1,2,4]
  • -XX:InitiatingHeapOccupancyPercent 
    • Percentage of the (entire) heap occupancy to start a concurrent GC cycle. It is used by GCs that trigger a concurrent GC cycle based on the occupancy of the entire heap, not just one of the generations.  The default value is 45.
    • This option can be used to change the marking threshold
      • If the threshold is exceeded, a concurrent marking cycle is initiated next.
      • The higher the threshold, the fewer concurrent marking cycles there will be, which also means fewer mixed GC evacuations.
  • -XX:G1MixedGCLiveThresholdPercent 
    • This option can be used to change the threshold which determines whether a region should be added to the CSet or not.
      • Only regions whose live data percentage is less than the threshold will be added to the CSet.
      • The higher the threshold (default: 65), the more likely a region is to be added to the CSet, which also means more mixed GC evacuation and longer evacuation times.
  • -XX:G1HeapWastePercent
    • Amount of space, expressed as a percentage of the heap size, that G1 is willing to leave uncollected in order to avoid expensive GCs.
    • The current default is 10%.  Reducing it to 5% makes G1 clean up more old regions, which also means the mixed GC phase lasts longer.
      • Note that G1 keeps triggering mixed GCs as long as the reclaimable space is higher than the waste threshold.  So a lower value means more mixed GCs, and possibly more expensive ones.
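
The three thresholds above boil down to simple percentage comparisons. The sketch below is a simplified model of those checks as this article describes them, not HotSpot's actual code; the class name, method names, and the sample heap figures are all hypothetical.

```java
final class MixedGcThresholds {
    /** InitiatingHeapOccupancyPercent: marking starts once whole-heap occupancy exceeds the threshold. */
    static boolean shouldStartMarking(long usedBytes, long heapBytes, int ihopPercent) {
        return usedBytes * 100 > heapBytes * (long) ihopPercent;
    }

    /** G1MixedGCLiveThresholdPercent: an old region qualifies only if its live data is below the threshold. */
    static boolean isCSetCandidate(long liveBytes, long regionBytes, int liveThresholdPercent) {
        return liveBytes * 100 < regionBytes * (long) liveThresholdPercent;
    }

    /** G1HeapWastePercent: mixed GCs continue while reclaimable space is above the accepted waste. */
    static boolean shouldContinueMixedGcs(long reclaimableBytes, long heapBytes, int heapWastePercent) {
        return reclaimableBytes * 100 > heapBytes * (long) heapWastePercent;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long heap = 4096 * mb; // hypothetical 4 GB heap

        // 2 GB used out of 4 GB is 50% occupancy, compared against the default of 45.
        System.out.println("start marking: " + shouldStartMarking(2048 * mb, heap, 45));

        // A 1 MB region holding 700 KB of live data, compared against the default of 65.
        System.out.println("CSet candidate: " + isCSetCandidate(700 * 1024, mb, 65));

        // 300 MB reclaimable out of 4 GB, checked against waste settings of 10 and 5.
        System.out.println("continue at 10%: " + shouldContinueMixedGcs(300 * mb, heap, 10));
        System.out.println("continue at 5%: " + shouldContinueMixedGcs(300 * mb, heap, 5));
    }
}
```

The last two lines illustrate the trade-off described above: with the same amount of reclaimable space, lowering the waste percentage keeps the mixed GC sequence going longer.
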
Besides the above-mentioned options, you may also want to tune the following ones:
  • -XX:G1MixedGCCountTarget 
    • Sets the target number of mixed garbage collections after a marking cycle to collect old regions with at most G1MixedGCLiveThresholdPercent live data.
    • The default is 8 mixed garbage collections. The goal for mixed collections is to be within this target number.
  • -XX:G1OldCSetRegionThresholdPercent
    • Sets an upper limit on the number of old regions to be collected during a mixed garbage collection cycle. The default is 10 percent of the Java heap.
  • -XX:G1HeapRegionSize
    • Sets the size of a G1 region. The value will be a power of two and can range from 1MB to 32MB. 
    • The goal is to have around 2048 regions based on the minimum Java heap size.
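
To get a feel for how these three options interact, the arithmetic can be sketched as follows. This is a rough model under stated assumptions, not HotSpot's exact logic: it assumes the region size is the heap size divided by 2048 and rounded down to a power of two, that the old-region cap is a percentage of the total region count rounded up, and that candidate regions are spread evenly over the target number of mixed collections. The class name, method names, and sample figures are hypothetical.

```java
final class MixedGcSizing {
    static final long MB = 1024L * 1024L;
    static final int TARGET_REGION_COUNT = 2048;

    /** Heap size / 2048, rounded down to a power of two, clamped to [1 MB, 32 MB]. */
    static long regionSizeBytes(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGION_COUNT, MB);
        long powerOfTwo = Long.highestOneBit(raw); // largest power of two <= raw
        return Math.min(powerOfTwo, 32 * MB);
    }

    /** G1OldCSetRegionThresholdPercent: cap on old regions per mixed GC, rounded up. */
    static long maxOldRegionsPerMixedGc(long totalRegions, int oldCSetRegionThresholdPercent) {
        return (totalRegions * oldCSetRegionThresholdPercent + 99) / 100;
    }

    /** G1MixedGCCountTarget: old regions per mixed GC needed to finish within the target, at least 1. */
    static long minOldRegionsPerMixedGc(long candidateRegions, int mixedGcCountTarget) {
        long perGc = (candidateRegions + mixedGcCountTarget - 1) / mixedGcCountTarget;
        return Math.max(perGc, 1);
    }

    public static void main(String[] args) {
        long heap = 4096 * MB; // hypothetical 4 GB heap
        long regionSize = regionSizeBytes(heap);
        long totalRegions = heap / regionSize;
        System.out.println("region size (MB): " + regionSize / MB);
        System.out.println("total regions: " + totalRegions);
        System.out.println("max old regions per mixed GC: " + maxOldRegionsPerMixedGc(totalRegions, 10));
        // 400 candidate old regions is a made-up figure for illustration.
        System.out.println("min old regions per mixed GC: " + minOldRegionsPerMixedGc(400, 8));
    }
}
```

Under these assumptions, raising G1MixedGCCountTarget spreads the old-region work over more, shorter mixed collections, while G1OldCSetRegionThresholdPercent bounds how much any single mixed collection may take on.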

Conclusion


Each application is unique, so you may need to tune G1 GC iteratively.  Note that all of the above options are product options except G1MixedGCLiveThresholdPercent and G1OldCSetRegionThresholdPercent, which are experimental options and must be unlocked with -XX:+UnlockExperimentalVMOptions before they can be set.  Be warned that Oracle may remove experimental options at its discretion in future releases.
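
When tuning iteratively, it helps to confirm which values a running JVM actually uses. The sketch below queries the flags discussed in this article through the HotSpotDiagnosticMXBean API. It assumes a HotSpot-based JVM; some flags, in particular experimental ones that have not been unlocked, may not be visible through this API, in which case the sketch reports them as unavailable. The class name G1FlagReport is hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

final class G1FlagReport {
    // The flags discussed in this article.
    static final String[] FLAGS = {
        "InitiatingHeapOccupancyPercent",
        "G1MixedGCLiveThresholdPercent",
        "G1HeapWastePercent",
        "G1MixedGCCountTarget",
        "G1OldCSetRegionThresholdPercent",
        "G1HeapRegionSize"
    };

    /** Returns "name = value (origin: ...)", or "name: unavailable" if the JVM does not expose the flag. */
    static String describe(String flagName) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            return flagName + ": unavailable";
        }
        try {
            VMOption option = bean.getVMOption(flagName);
            return flagName + " = " + option.getValue() + " (origin: " + option.getOrigin() + ")";
        } catch (IllegalArgumentException e) {
            // Thrown when the JVM does not know (or does not expose) the option.
            return flagName + ": unavailable";
        }
    }

    public static void main(String[] args) {
        for (String flag : FLAGS) {
            System.out.println(describe(flag));
        }
    }
}
```

The origin field distinguishes defaults from values set on the command line, which is a quick way to verify that a tuning change took effect.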

References

  1. Garbage First Garbage Collector Tuning
  2. g1gc logs - Ergonomics -how to print and how to understand 
  3. Understanding Garbage Collection  
  4. Garbage First (G1) Garbage Collection Options
  5. CSet
    • The G1 GC reduces heap fragmentation by incremental parallel copying of live objects from one or more sets of regions (called Collection Set (CSet)) into different new region(s) to achieve compaction.
    • The goal of G1 GC is to reclaim as much heap space as possible, starting with those regions that contain the most reclaimable space (i.e., garbage first), while attempting to not exceed the pause time goal.