Unless you really know what you are doing, the –XgcPrio flag is the preferred way to tell JRockit what garbage collection strategies to run.[1] However, if you want further control over GC behavior, a more fine grained garbage collection strategy can be set from the command line using the –Xgc flag.
In this article, we will examine four Generational Garbage Collectors (Generational GCs) in JRockit, which uses nursery. Note that JRockit refers to the young generations[2] as nurseries.
Here we only cover Generational GCs that use the nursery. For example, we will ignore "–Xgc:singlecon"—or single generational concurrent—in this discussion. As shown below, JRockit supports four different garbage collection strategies. They differ only in the using either concurrent or parallel in its mark and sweep phases.
Running JRockit with –Xverbose:gc will output plenty of verbose information on what the JVM memory management system is doing. This information includes garbage collections, where they take place (nurseries or old space), changes of GC strategy, and the time a particular garbage collection takes. For example, we have specified the following options:
Stopping the world means that collectors are halting all executing Java threads to run a collection cycle, thus guaranteeing that new objects are not allocated and objects do not suddenly become unreachable while the collector is running. This has the obvious disadvantage that the application can perform no useful work while a collection cycle is running.
In memory management, the time spent in GC is detrimental to an application's performance. So, minimizing the time spent in GC seems to be a good thing to achieve (or is it?). Collectors use simple stop-the-world strategy consumes less CPU time because it is simple and requires less bookkeeping. However, simple stop-the-world collectors halt all Java threads and they are not a good fit for pause critical applications. Note that lantency is caused by not spending CPU cycle executing Java code.
Optimizing for low lantencies is basically a matter of avoiding stopping the world and let the Java application run as much as possible. However, performing GC and executing Java code concurrently requires a lot more bookkeeping and thus, the total time spent in GC will be longer. Also, if the garbage collector gets too little total CPU time and it can't keep up with the allocation frequency, the heap will fill up and an OutOfMemoryError[6-8] will be thrown by the JVM. Therefore, another key to low latency is to keep heap usage and fragmentation at a proper level.
To conclude, the tradeoff in memory management is between maximizing throughput and maintaining low latencies. In the real world, we can't achieve both at the same time.
Generational Garbage Collection Strategies
The garbage collection strategy is defined by nursery, mark strategy and sweep strategy. The nursery can either be present or not present. The mark and sweep phases can either be concurrent or parallel.
Here we only cover Generational GCs that use the nursery. For example, we will ignore "–Xgc:singlecon"—or single generational concurrent—in this discussion. As shown below, JRockit supports four different garbage collection strategies. They differ only in the using either concurrent or parallel in its mark and sweep phases.
-Xgc: option | Mark | Sweep |
genconcon or gencon | Concurrent | Concurrent |
genconpar | Concurrent | Parallel |
genparpar or genpar (default) | Parallel | Parallel |
genparcon | Parallel | Concurrent |
GC Mode
Running JRockit with –Xverbose:gc will output plenty of verbose information on what the JVM memory management system is doing. This information includes garbage collections, where they take place (nurseries or old space), changes of GC strategy, and the time a particular garbage collection takes. For example, we have specified the following options:
-Xverbose:gc -Xverbosedecorations=level,module,timestamp,millis,pidto find the GC mode of each strategy uses. As you can see, different strategies are designed for different purposes:
- Throughput, or
- Low lantency (or short pausetimes)
-Xgc: option | GC Mode |
genconcon or gencon | Garbage collection optimized for short pausetimes, strategy: Generational Concurrent Mark & Sweep. |
genconpar | Garbage collection optimized for short pausetimes, strategy: Generational Concurrent Mark, Parallel Sweep. |
genparpar or genpar | Garbage collection optimized for throughput, strategy: Generational Parallel Mark & Sweep. |
genparcon | Garbage collection optimized for short pausetimes, strategy: Generational Parallel Mark, Concurrent Sweep. |
Stopping the World (STW)
Stopping the world means that collectors are halting all executing Java threads to run a collection cycle, thus guaranteeing that new objects are not allocated and objects do not suddenly become unreachable while the collector is running. This has the obvious disadvantage that the application can perform no useful work while a collection cycle is running.
Even though an algorithm such as mark and sweep may run mostly in parallel, it is still a complicated problem that references may change during actual garbage collections. If the Java application is allowed to run, executing arbitrary field assignments and move instructions at the same time as the garbage collector tries to clean up the heap, synchronization between the application and the garbage collector is needed. Stopping the world for short periods of time is necessary for most garbage collectors and this is one of the main sources of latency and non-determinism in a runtime.
The mark-and-sweep algorithm consists of two phases:[5]
Parallel collectors require stop-the-world pause for the whole duration of major collection phases (mark or sweep), but employ all available cores to compress pause time. Parallel collectors usually have better throughput, but they are not a good fit for pause critical applications.
Concurrent collectors try to do most work concurrently (though they also do it in parallel on multi-core systems), stopping the application only for short duration. Note that the concurrent collection algorithm in JRockit is fairly different from both HotSpot's concurrent collectors (CMS[3] and G1[4] ).
Parallel vs Concurrent Collectors
The mark-and-sweep algorithm consists of two phases:[5]
- Mark phase
- In which, it finds and marks all accessible objects (or live objects).
- Marking is very parallelizable and large parts of a mark phase can also be run concurrently with executing Java code.
- Sweep phase
- In which, it scans through the heap and reclaims all the unmarked objects.
- Sweeping and compaction (JRockit uses partial compaction to avoid fragmentation), however, tend to be more troublesome for parallelization, even though it is fairly simple to compact parts of the heap while others are swept, thus achieving better throughput.
Parallel collectors require stop-the-world pause for the whole duration of major collection phases (mark or sweep), but employ all available cores to compress pause time. Parallel collectors usually have better throughput, but they are not a good fit for pause critical applications.
Concurrent collectors try to do most work concurrently (though they also do it in parallel on multi-core systems), stopping the application only for short duration. Note that the concurrent collection algorithm in JRockit is fairly different from both HotSpot's concurrent collectors (CMS[3] and G1[4] ).
Conclusions
In memory management, the time spent in GC is detrimental to an application's performance. So, minimizing the time spent in GC seems to be a good thing to achieve (or is it?). Collectors use simple stop-the-world strategy consumes less CPU time because it is simple and requires less bookkeeping. However, simple stop-the-world collectors halt all Java threads and they are not a good fit for pause critical applications. Note that lantency is caused by not spending CPU cycle executing Java code.
Optimizing for low lantencies is basically a matter of avoiding stopping the world and let the Java application run as much as possible. However, performing GC and executing Java code concurrently requires a lot more bookkeeping and thus, the total time spent in GC will be longer. Also, if the garbage collector gets too little total CPU time and it can't keep up with the allocation frequency, the heap will fill up and an OutOfMemoryError[6-8] will be thrown by the JVM. Therefore, another key to low latency is to keep heap usage and fragmentation at a proper level.
To conclude, the tradeoff in memory management is between maximizing throughput and maintaining low latencies. In the real world, we can't achieve both at the same time.
References
- Oracle JRockit- The Definitive Guide
- Understanding Garbage Collection (HotSpot)
- Understanding CMS GC Logs (HotSpot)
- Garbage First Garbage Collector Tuning (HotSpot)
- Mark-and-Sweep Garbage Collection
- How to Debug Native OutOfMemory in JRockit (Xml and More)
- Diagnosing OutOfMemoryError or Memory Leaks in JRockit (XML and More)
- Diagnosing Java.lang.OutOfMemoryError (XML and More)
- JRockit: All Posts on "Xml and More"
- JRockit R27.8.1 and R28.3.1 versioning
- Note that R28 went from R28.2.9 to R28.3.1—these are just ordinary maintenance releases, not feature releases. There is zero significance to the jump in minor version number.
- JRockit: Out-of-the-Box Behavior of Four Generational GC's (Xml and More)
No comments:
Post a Comment