Java applications tend to contain hundreds (sometimes thousands) of threads. The majority of these threads are in the WAITING or TIMED_WAITING (i.e., dormant) state, while only a small portion of them are actively executing code. So we were curious to know whether dormant threads consume less memory than active threads.
To answer this question, we conducted a small study. The findings were interesting, so we are sharing them here.
What is stored in a thread’s stack?
Before you read further, you should know what information is stored in a thread’s stack. For a complete picture, you can refer to this article. In a nutshell, the following information is stored in the thread’s stack:
1. Local variables that are created in the method.
2. The code path that is currently being executed by the thread.
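To make these two items concrete, here is a minimal sketch that lists the frames currently on a thread's stack. The class and method names (StackFrameDemo, outer, inner, currentFrames) are made up for illustration and are not part of the study; the sketch only relies on the standard Thread.getStackTrace() call.

```java
import java.util.ArrayList;
import java.util.List;

final class StackFrameDemo {

    // Returns the method names on the calling thread's stack, innermost first.
    static List<String> currentFrames() {
        List<String> names = new ArrayList<>();
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            names.add(frame.getMethodName());
        }
        return names;
    }

    static List<String> outer() {
        int localInOuter = 1;           // local variable held in outer()'s frame
        return inner(localInOuter);
    }

    static List<String> inner(int x) {
        double localInInner = x * 2.5;  // local variable held in inner()'s frame
        return currentFrames();
    }

    public static void main(String[] args) {
        // Each printed name corresponds to one frame on the main thread's stack.
        outer().forEach(System.out::println);
    }
}
```

Each method call adds one frame that holds that call's local variables, and the sequence of frames is the code path the thread is currently executing.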
Study
To facilitate our study, we wrote two simple programs. Let’s review these two programs and their performance characteristics.
1. Threads with empty stack frames
We created a simple Java program that creates 1000 threads. All the threads in this program have almost empty stacks, so that they need as little stack memory as possible.
public class EmptyStackFrameProgram {

    public void start() {
        // Create 1000 threads
        for (int counter = 0; counter < 1000; ++counter) {
            new EmptyStackFrameThread().start();
        }
    }
}

public class EmptyStackFrameThread extends Thread {

    public void run() {
        try {
            // Just sleep forever
            while (true) {
                Thread.sleep(10000);
            }
        } catch (Exception e) {
        }
    }
}
In this Java program, 1000 threads are created in the ‘EmptyStackFrameProgram’ class. All the ‘EmptyStackFrameThread’ threads sleep in an infinite loop and do nothing else. Thus their stacks will be almost empty, because they aren’t executing any new lines of code or creating any new local variables.
Note: We put the threads into an infinite sleep so that they stay alive, which is essential for studying their memory usage.
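As a side check, a thread that is inside Thread.sleep() is reported in the TIMED_WAITING state, which is the dormant state discussed at the start. The following is a minimal sketch of how that can be observed; the class and method names (SleepStateDemo, observeSleepingThreadState) are hypothetical and were not part of the study.

```java
final class SleepStateDemo {

    // Starts a daemon thread that sleeps in a loop and returns its observed state.
    static Thread.State observeSleepingThreadState() {
        Thread sleeper = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(10_000);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // leave the loop when interrupted
            }
        });
        sleeper.setDaemon(true); // do not keep the JVM alive for this demo
        sleeper.start();

        // Poll for up to about 5 seconds until the thread has entered sleep().
        Thread.State state = sleeper.getState();
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (state != Thread.State.TIMED_WAITING && System.nanoTime() < deadline) {
            Thread.yield();
            state = sleeper.getState();
        }
        sleeper.interrupt(); // clean up the demo thread
        return state;
    }

    public static void main(String[] args) {
        System.out.println(observeSleepingThreadState());
    }
}
```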
2. Threads with loaded stack frames
Here is another simple Java program that creates 1000 threads. All the threads in this program have their stacks heavily loaded with data, so one would expect them to consume a lot more memory than the threads of the earlier program.
public class FullStackFrameProgram {

    public void start() {
        // Create 1000 threads with full stack
        for (int counter = 0; counter < 1000; ++counter) {
            new FullStackFrameThread().start();
        }
    }
}

public class FullStackFrameThread extends Thread {

    public void run() {
        try {
            int x = 0;
            simpleMethod(x);
        } catch (Exception e) {
        }
    }

    /**
     * Recurse 10,000 times and then sleep, so that the stack is filled up.
     *
     * @param x current recursion depth
     * @throws Exception if the sleep is interrupted
     */
    private void simpleMethod(int x) throws Exception {
        // Creating local variables to fill up the stack.
        float y = 1.2f * x;
        double z = 1.289898d * x;

        // Recursing 10,000 times to fill up the stack.
        if (x < 10000) {
            simpleMethod(++x);
        }

        // After 10,000 invocations, sleep forever
        while (true) {
            Thread.sleep(10000);
        }
    }
}
In this Java program, 1000 threads are created in the ‘FullStackFrameProgram’ class. Each ‘FullStackFrameThread’ recursively invokes ‘simpleMethod(int x)’ about 10,000 times. Once the deepest invocation is reached, the thread goes into an infinite sleep, so none of the invocations ever return. Since the thread invokes ‘simpleMethod(int x)’ about 10,000 times, each thread will have about 10,000 stack frames, and each stack frame will hold the local variables ‘x’, ‘y’, and ‘z’.
Fig: EmptyStackFrameThread and FullStackFrameThread stack
The above diagram visualizes the stacks of EmptyStackFrameThread and FullStackFrameThread. Notice that ‘EmptyStackFrameThread’ contains only 2 stack frames, whereas ‘FullStackFrameThread’ contains 10,000+ stack frames. Besides that, each stack frame of ‘FullStackFrameThread’ contains the local variables x, y, and z. This makes the ‘FullStackFrameThread’ stack heavily loaded, so one would expect it to consume more memory.
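To see how recursion depth translates into stack frames, here is a minimal sketch that recurses a fixed number of times inside a thread created with an explicit stack size. The class and method names (RecursionDepthDemo, recurse, depthReached) are assumptions made for illustration. The stack size is passed through the four-argument Thread constructor, which the JDK documents as a hint that some platforms may ignore; the study itself used the -Xss JVM argument instead.

```java
final class RecursionDepthDemo {

    // Recurses until `remaining` reaches zero, then returns the depth reached.
    static int recurse(int depth, int remaining) {
        float y = 1.2f * depth;       // locals occupy slots in every frame
        double z = 1.289898d * depth;
        if (remaining == 0) {
            return depth;
        }
        return recurse(depth + 1, remaining - 1);
    }

    // Runs `calls` nested invocations in a thread with the requested stack size.
    // Returns 0 if the worker thread fails, for example with a StackOverflowError.
    static int depthReached(int calls, long stackSizeBytes) {
        int[] result = new int[1];
        Thread worker = new Thread(
                null,                                      // default thread group
                () -> result[0] = recurse(1, calls - 1),
                "deep-stack",
                stackSizeBytes);                           // a hint, not a guarantee
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        // 10,000 nested calls with a 2 MB stack request, mirroring the study's numbers.
        System.out.println(depthReached(10_000, 2L * 1024 * 1024));
    }
}
```

Whether 10,000 frames fit depends on the JVM, the platform, and the size of each frame, so the requested stack size may need adjusting in other environments.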
Memory consumption
We executed the above two programs with the following settings:
a. Configured the thread stack size to be 2 MB (i.e., passed the -Xss2m JVM argument to both programs).
b. Used OpenJDK 1.8.0_265, 64-Bit Server VM.
c. Ran both programs simultaneously on an AWS ‘t3a.medium’ EC2 instance.
Below you can see the memory consumption of both programs as reported by the system monitoring tool ‘top’.
Fig: Memory consumption of both programs is the same
Notice that both programs consume exactly 4686 MB of memory. This indicates that the threads of both programs consume the same amount of memory, even though FullStackFrameThread does far more work and fills its stack, whereas EmptyStackFrameThread is almost dormant.
To validate this theory, we further analyzed both the programs using the JVM root cause analysis tool yCrash. Below is the thread analysis report generated by the yCrash tool.
Fig: yCrash tool reporting EmptyStackFrameThread stack almost empty
Fig: yCrash tool reporting FullStackFrameThread stack almost full
The yCrash tool also clearly shows that EmptyStackFrameProgram contains 1000 threads with about two stack frames each, whereas FullStackFrameProgram contains 1000 threads with about 10,000 stack frames each.
Conclusion
This study indicates that a thread’s stack memory is allocated when the thread is created, not on the basis of the thread’s runtime demand. Both hard-working threads and almost dormant threads consume the same amount of memory. Modern Java applications tend to create hundreds (sometimes several thousands) of threads, but most of those threads are in the WAITING or TIMED_WAITING state and do nothing. Given that a thread occupies its maximum allocated memory as soon as it is created, as an application developer you can do the following to optimize your application’s memory consumption:
1. Create only *necessary* threads for your application.
2. Try to come up with an optimal stack size (i.e., -Xss) for your application threads. Say you configure the thread stack size (i.e., -Xss) to 2 MB, but at runtime your application uses only 512 KB. You will then be wasting 1.5 MB of memory for each thread in your application. If your application has 500 threads, you will be wasting 750 MB (i.e., 500 threads x 1.5 MB) of memory per JVM instance, which is not cheap in the modern cloud computing era.
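The arithmetic in the second recommendation can be reproduced with a small helper. This is only a sketch: the class and method names (StackWasteEstimate, wastedMb) are hypothetical, and the inputs are the example figures from the text, not measurements.

```java
final class StackWasteEstimate {

    // Returns the configured-but-unused stack memory across all threads, in MB.
    static double wastedMb(int threadCount, int configuredStackKb, int usedStackKb) {
        long wastedKb = (long) threadCount * (configuredStackKb - usedStackKb);
        return wastedKb / 1024.0;
    }

    public static void main(String[] args) {
        // 500 threads, 2 MB (2048 KB) configured, 512 KB actually used.
        System.out.println(wastedMb(500, 2048, 512)); // prints 750.0
    }
}
```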
You can use tools like yCrash, which can tell you how many threads are in active use and how many are in a dormant state. It can also tell you how deep each thread’s stack is. Based on that, you can come up with an optimal thread count and thread stack size for your application.
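For a rough in-process approximation of the same information, the standard Thread.getAllStackTraces() call returns a snapshot of every live thread and its stack. The sketch below counts threads per state and reports the deepest stack. The class and method names (ThreadSummary, countByState, maxStackDepth) are assumptions for illustration; this is not how yCrash works, and it is not a substitute for a full thread dump analysis.

```java
import java.util.EnumMap;
import java.util.Map;

final class ThreadSummary {

    // Counts live threads per state, based on a snapshot of the current JVM.
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            counts.merge(thread.getState(), 1, Integer::sum);
        }
        return counts;
    }

    // Returns the deepest stack, in frames, among all live threads.
    static int maxStackDepth() {
        int max = 0;
        for (StackTraceElement[] frames : Thread.getAllStackTraces().values()) {
            max = Math.max(max, frames.length);
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println("Threads by state: " + countByState());
        System.out.println("Deepest stack (frames): " + maxStackDepth());
    }
}
```

A large share of WAITING or TIMED_WAITING threads with shallow stacks suggests that both the thread count and the configured stack size may be higher than the application needs.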