G1 is the newest GC algorithm in Java and is set to become the default GC in Java 9. Instead of running synthetic benchmarks, we compare it here against other GC algorithms in a real-world use case.
The system under test was our 'Data Collector' servers. All JVMs connected to DripStat ping the data collectors every minute at the same time. So at the tick of each minute, all JVMs simultaneously send a request to the data collectors with their performance metric data.
The data collectors verify the data and pass it on to the storage system.
Our aim is for the data collector to respond to each request in under 500 ms. As can be seen, our typical response time is about 250 ms.
(We use DripStat to monitor itself, so all screenshots you see from here on are from DripStat.)
Each JVM was running on a single AWS instance.
CPU Core Count - 2
Java Version - 1.8.0_40 64 bit
System RAM - 3.94 GB
Initially we were using the Parallel GC. With it, we started seeing recurring spikes in memory usage and GC activity.
This was negatively impacting our response time since every few minutes a large stop-the-world pause would occur.
We tried looking at the details of what was going on:
It seems that all the data coming in at once at the tick of the minute was causing the Survivor Space to fill up and spill into Old Gen. This resulted in memory being held for longer, ultimately leading to the large full GC pause.
We tried many combinations of sizes for Eden, Survivor Space and Old Gen, and even increased the max heap size, but we could never find the magic ratio that kept the full GC pauses from occurring frequently.
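To see how full each heap region is from inside the JVM, the standard `java.lang.management` API exposes one bean per memory pool. The following is a minimal sketch, not the tooling used for the screenshots in this post; the class name `HeapPoolSnapshot` is made up for illustration. The pool names depend on the collector in use (for example "PS Survivor Space" under the Parallel GC, "G1 Survivor Space" under G1).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: prints the current usage of every heap memory pool.
public class HeapPoolSnapshot {

    /** Returns used bytes per heap pool, keyed by the pool name the JVM reports. */
    static Map<String, Long> usedBytesByHeapPool() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as Metaspace or the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // getUsage() returns null when the pool is no longer valid
            }
            used.put(pool.getName(), usage.getUsed());
        }
        return used;
    }

    public static void main(String[] args) {
        usedBytesByHeapPool().forEach((name, bytes) ->
                System.out.printf("%-25s %8.1f MB%n", name, bytes / (1024.0 * 1024.0)));
    }
}
```

Calling this once a second around the tick of the minute would show whether the Survivor pool tops out just before Old Gen starts to grow.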
### Concurrent Mark Sweep (CMS) GC
We then decided to switch GC algorithms and tried using the CMS collector instead.
While this eliminated the large single pause, it kept a portion of the CPU constantly busy doing GC. On top of that, our heap still kept reaching close to 2 GB before being collected. This did not seem like an optimal solution either.
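One rough way to put a number on "time spent doing GC" is to compare the accumulated collection time reported by the JVM with its uptime. This is only a sketch under assumptions: the class `GcOverhead` is hypothetical, and `getCollectionTime()` reports approximate elapsed time per collector, which is not the same thing as the CPU usage shown in our charts.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: estimates the share of JVM uptime attributed to GC.
public class GcOverhead {

    /** Sums the accumulated collection time of all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Returns gcMillis / uptimeMillis, or 0.0 when the uptime is not positive. */
    static double fraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC share of uptime: %.2f%%%n",
                100.0 * fraction(totalGcMillis(), uptime));
    }
}
```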
We then tried using the G1 GC. We even added the `-XX:+UseStringDeduplication` argument, which lets the G1 collector eliminate duplicate copies of Strings.
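When switching collectors it is easy to start a JVM with the wrong flags, so it can help to have the process report which collectors it is actually running. The sketch below uses the standard management beans; the class name `ActiveCollectors` is an assumption for illustration, and the exact bean names (such as "G1 Young Generation") vary by collector and JVM version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper. Example launch after compiling:
//   java -XX:+UseG1GC -XX:+UseStringDeduplication ActiveCollectors
public class ActiveCollectors {

    /** Names of the garbage collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** The -XX: options this JVM was started with, in command-line order. */
    static List<String> xxFlags() {
        List<String> flags = new ArrayList<>();
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            if (arg.startsWith("-XX:")) {
                flags.add(arg);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        System.out.println("Collectors: " + collectorNames());
        System.out.println("-XX flags:  " + xxFlags());
    }
}
```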
The results here were much better. The heap didn't grow past the 1 GB mark. The CPU was not occupied doing GC either.
Looking at the individual pauses, the single longest pause over a 4-hour period was just 120 ms. Most pauses were only about 25 ms, and only one such pause occurred every 4 minutes.
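The figures above (longest pause, typical pause) can be derived from any list of pause durations, for example ones parsed out of a GC log. Here is a minimal sketch of that summary; the class `PauseSummary` is hypothetical, and the sample values in `main` are made up for illustration, not our measured data.

```java
import java.util.Arrays;

// Hypothetical helper: summarizes a list of GC pause durations in milliseconds.
public class PauseSummary {

    /** Longest pause in the sample, or 0 for an empty sample. */
    static long longest(long[] pausesMs) {
        return Arrays.stream(pausesMs).max().orElse(0L);
    }

    /** Median pause (lower median for even-sized samples), or 0 for an empty sample. */
    static long median(long[] pausesMs) {
        if (pausesMs.length == 0) {
            return 0L;
        }
        long[] sorted = pausesMs.clone(); // sort a copy so the caller's array is untouched
        Arrays.sort(sorted);
        return sorted[(sorted.length - 1) / 2];
    }

    public static void main(String[] args) {
        long[] sample = {25, 22, 120, 27, 25}; // illustrative values only
        System.out.println("longest = " + longest(sample) + " ms");
        System.out.println("median  = " + median(sample) + " ms");
    }
}
```

The median is used for the "typical" pause because a single outlier, like a 120 ms pause, would pull an average upward.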
We found the G1 GC to be the best of all the options we tried. It did not require any special tuning and gave us both short individual pauses and a small number of pauses. The impact on CPU usage was minimal. We have made it the default GC for all our systems.