Sunday, November 2, 2014

G1 GC: What Is "to-space exhausted" in GC Log?

As shown in [1], you can enable basic garbage collection (GC) event printing using two options:
  • -XX:+PrintGCTimeStamps (or -XX:+PrintGCDateStamps) 
  • -XX:+PrintGCDetails
These additional printing incurs minimal overhead while provide useful information for understanding the healthiness of your running garbage collector (e.g., Generation First GC).

In this article, we will discuss the significance of to-space exhausted events in your GC logs.

What Is "to-space exhausted"?


When you see to-space exhausted messages in GC logs, it means that G1 does not have enough memory for either survived or tenured objects, or for both, and the Java heap cannot be further expanded since it is already at its max.[5]  This event only happens in mixed GC 's when G1 tries to copy live objects from the source space to the destination space.  Note that, in G1, it avoids heap fragmentation by compaction (i.e., done when objects are copied from source region to destination region).

Example message:

5843.494: [GC pause (G1 Evacuation Pause) (mixed)... (to-space exhausted), 0.1145790 secs]

When that happens, GC pause time will be longer (i.e., since both RSet and CSet[5] need to be re-generated) and sometimes it may require a Full GC to resolve the issue. 

A Case Study of "to-space exhausted" Events


Using a benchmark with large live data size,[3] a "to-space exhausted" event can be triggered if there is not enough breathing room for shuffling live data objects in the Java heap.  For example, we have a test case which its live data set occupies 68% of the Java heap at run time.  In a 4-hour run, we have seen the following sequence of events:

5843.494: [GC pause (G1 Evacuation Pause) (mixed)... (to-space exhausted), 0.1145790 secs]
5848.831: [GC pause (G1 Evacuation Pause) (mixed)... (to-space exhausted), 0.0847160 secs]
5856.640: [GC pause (G1 Evacuation Pause) (mixed)... 0.0864060 secs]
5866.182: [GC pause (G1 Evacuation Pause) (mixed)... 0.0821220 secs]
5875.550: [GC pause (G1 Evacuation Pause) (mixed)... 0.0549090 secs]
5882.353: [GC pause (G1 Evacuation Pause) (mixed)... 0.0827630 secs]
5896.331: [GC pause (G1 Evacuation Pause) (mixed)... 0.0632660 secs]
5901.966: [GC pause (G1 Evacuation Pause) (mixed)... 0.5356580 secs]
5902.813: [GC pause (G1 Evacuation Pause) (mixed)... (to-space exhausted), 1.3480320 secs]
5904.162: [Full GC (Allocation Failure) ... 5.0413590 secs]

A series of "to-space exhausted" events eventually led to a Full GC.  As discussed in [4], adding -Xmn400m on the command line to the same test case further add insult to injury—instead of seeing 3 "to-space exhausted" events, now we saw 20 of them.  This is due to we have further cut down the breathing room for G1.



How to Deal with "to-space exhausted" Events?


"to-space exhausted" events happen when G1 runs out of space (either survivor space or old space; see diagram above) to copy live objects to.  When it happens, sometimes G1's ergonomics can get by the issue by dynamically re-sizing heap regions.  If not, it eventually leads to a Full GC event.

When "to-space exhausted" events happen, it could mean you have a very tight heap allocated for your application.  The easiest approach to resolve this is increasing your heap size if it's possible. Alternatively, sometimes it's possible to tune G1's mixed GC to get by (see [4]). As shown in the above section, we have a high live data occupancy (i.e., 68%) in Java heap.  However, after tuning mixed GC, we can get by with a couple of Full GC's in a 4-hour run, which yields satisfactory result.

Note that mixed GC can clean up both young and old regions. If it does not take an old region, it would be a young(-only) GC. G1's mixed GC's are meant as a replacement for Full GC's. Instead of doing all the work at once, the idea is to achieve the same task in steps. Each small steps in mixed GC is smaller than a Full GC. Because G1 does not intend to reclaim all garbage as in a Full GC event, mixed GC uses less time.  However, Full GC is a compaction event which it can walk around "to-space exhausted" issue without needing extra space.

Acknowledgement


Some writings here are based on the feedback from Thomas Schatzl and Yu Zhang. However, the author would assume the full responsibility for the content himself.

References

  1. g1gc logs - basic - how to print and how to understand
  2. g1gc logs - Ergonomics -how to print and how to understand
  3. JRockit: How to Estimate the Size of Live Data Set
  4. JDK 8: Is Tuning MaxNewSize in G1 GC a Good Idea?
  5. Garbage First Garbage Collector Tuning
    • Note that "to-space overflow" message was removed and only "to-space exhausted" remains.
    • RSet (Remembered Set)—Tracks object references into a given region. There is one RSet per region in the heap. The RSet enables the parallel and independent collection of a region. The overall footprint impact of RSets is less than 5%.
    • CSet (Collection Set)—Are the set of regions that will be collected in a GC. All live data in a CSet is evacuated (copied/moved) during a GC. Sets of regions can be Eden, survivor, and/or old generation. CSets have a less than 1% impact on the size of the JVM. 
  6. HotSpot Virtual Machine Garbage Collection Tuning Guide
  7. Garbage-First Garbage Collector Tuning
  8. Getting Started with the G1 Garbage Collector
  9. Garbage-First Garbage Collector (JDK 8 HotSpot Virtual Machine Garbage Collection Tuning Guide)
  10. Other JDK 8 articles on Xml and More
  11. Concurrent Marking in G1

No comments: