In this article, we present a case study of Thread Local Area (TLA) tuning for an application with the following characteristics:
- A multi-threaded application
- Allocating a lot of objects
- Allocating large objects
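To make these characteristics concrete, here is a minimal sketch of a workload with the same shape. It is not the application benchmarked in this article; the class name, the object sizes (64 bytes and 256 KB), and the one-in-a-hundred ratio are all assumptions chosen for illustration. It sticks to Java 6 syntax so that it can run on JRockit R28.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class AllocationWorkload {
    // Hypothetical sizes: one "small" and one "large" object size.
    static final int SMALL_BYTES = 64;
    static final int LARGE_BYTES = 256 * 1024;

    // Runs `threads` workers; each allocates `iterationsPerThread` small
    // arrays plus one large array on every 100th iteration.
    // Returns the total number of array payload bytes requested.
    static long run(int threads, final int iterationsPerThread) throws InterruptedException {
        final AtomicLong allocatedBytes = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(new Runnable() {
                public void run() {
                    long local = 0;
                    for (int i = 0; i < iterationsPerThread; i++) {
                        byte[] small = new byte[SMALL_BYTES];     // many small objects
                        local += small.length;
                        if (i % 100 == 0) {
                            byte[] large = new byte[LARGE_BYTES]; // occasional large object
                            local += large.length;
                        }
                    }
                    allocatedBytes.addAndGet(local);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return allocatedBytes.get();
    }

    public static void main(String[] args) throws InterruptedException {
        long total = run(4, 10000);
        System.out.println("Requested bytes: " + total);
    }
}
```

Running such a workload once with the default settings and once with tuned settings gives a rough feel for how TLA sizes interact with allocation patterns, although a real application will behave differently.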
TLA Basics
See [1] for the differences in JRockit's large-object handling before and after R28. This article uses JRockit R28.2.5. In this version, TLA sizes can be tuned with the following option:
- -XXtlaSize:min=size,preferred=size,wasteLimit=size
Changing Default TLA Sizes
For our Linux system, the default TLA settings are:
- min=2k
- preferred=16k
- wasteLimit=2k
When changing these values, we set -XXtlaSize:wasteLimit to the same value as -XXtlaSize:min.
After some trial and error, we found the following TLA settings to perform better than the defaults:
- -XXtlaSize:min=8k,preferred=512k,wasteLimit=8k
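As a small illustration of how the option string is put together, the sketch below assembles it from its three parts and builds a launch command around it. The helper names, the `java` binary name, and the main class `com.example.MyApp` are hypothetical; only the option syntax and the values come from this article.

```java
import java.util.ArrayList;
import java.util.List;

public class TlaLaunchCommand {
    // Formats the three TLA sizes into a single -XXtlaSize option.
    static String tlaSizeOption(String min, String preferred, String wasteLimit) {
        return "-XXtlaSize:min=" + min
                + ",preferred=" + preferred
                + ",wasteLimit=" + wasteLimit;
    }

    // Builds a command line: <java binary> <TLA option> <main class>.
    static List<String> command(String javaBinary, String tlaOption, String mainClass) {
        List<String> cmd = new ArrayList<String>();
        cmd.add(javaBinary);
        cmd.add(tlaOption);
        cmd.add(mainClass);
        return cmd;
    }

    public static void main(String[] args) {
        // Tuned values from this article; the main class is a placeholder.
        String tuned = tlaSizeOption("8k", "512k", "8k");
        System.out.println(command("java", tuned, "com.example.MyApp"));
    }
}
```

The resulting list could be handed to a `ProcessBuilder` to start the application, provided the `java` binary on the path is a JRockit JVM that understands the option.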
Performance Comparison: Default vs. TLA Tuned
Looking at several KPIs, we found that the tuned settings improved the application's Average Response Time (ART) by 5.9% and its 90% Response Time by 3.32%. The gain comes from reducing the percentage of time spent in GC pauses and the total CPU percentage, at the expense of total memory footprint (-1.6%).
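For readers who want to produce the same kind of comparison from their own runs, the sketch below shows one common way to compute a relative improvement for a lower-is-better metric such as response time. Whether the figures above were computed exactly this way is an assumption, and the sample values of 1000 ms and 941 ms are hypothetical, picked only so that the result matches the 5.9% scale.

```java
import java.util.Locale;

public class KpiImprovement {
    // Relative improvement in percent for a metric where lower is better:
    // (baseline - tuned) / baseline * 100.
    static double improvementPercent(double baseline, double tuned) {
        if (baseline <= 0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return (baseline - tuned) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Hypothetical average response times in milliseconds.
        double baselineArt = 1000.0;
        double tunedArt = 941.0;
        double gain = improvementPercent(baselineArt, tunedArt);
        System.out.println(String.format(Locale.ROOT, "ART improvement: %.1f%%", gain));
    }
}
```

A negative result means the tuned run was worse on that metric, which is how a trade-off such as a larger memory footprint would show up.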
Conclusion
As the benchmark results show, tuning TLA sizes can improve your application's performance. The improvement may vary with your JRockit version and system capabilities. Before you do any fine-tuning, however, read [2, 4] first.
Finally, be warned that options specified with -XX are not stable and are not recommended for casual use; they are subject to change without notice [3].
References
[1] JRockit: Thread Local Area Size and Large Objects
[2] Optimizing Memory Allocation Performance (Section 4.4 of this pdf)
[3] -XXtlaSize Parameters
[4] Oracle JRockit: The Definitive Guide
[5] Finding L2 cache size in Linux