Java 7: memory model unraveled!

This is a series of posts (it’s not a series yet, but trust me, it will be) about Java memory: the model, its different parts, what they are for and how they are used. As a Java developer doing your daily coding tasks, you are hardly aware that Java uses memory at all; you are probably not thinking about it on a daily basis. It is only every once in a while that it pops up in your mind, and usually that is because all hell has broken loose in the form of an OutOfMemoryError. You don’t have to know about the Java memory model to be a Java developer. Sometimes curious coders do investigate, only to give up right after they have been bitch-slapped with terms like ‘Eden space’ and ‘Method area’. Terms that, in my opinion, don’t really speak for themselves.

Why bother?

I can think of a whole lot of reasons why. Personally, I think that in order to become a better developer, you should understand the memory model. Figuring this out made me more aware, more conscious of what I was really doing when assigning variables.

It has helped me when troubleshooting memory issues. When an application runs out of memory, I have found it very helpful to know the difference between heap space and permgen space. It is good to know because you can immediately start looking for the usual suspects instead of just guessing.

Or maybe you are a junior programmer and you have just experienced your first OutOfMemoryError: frenzied managers declaring the apocalypse, co-developers going berserk, spouting expensive words and memory jibbah jabbah. And maybe, just maybe, you want to know what the heck they are talking about.

Java memory and its model

Like any other process on your computer, the JVM uses your machine’s memory. It does not really matter if this is pure RAM or processor cache, this is what I will call the native memory. It’s memory that could just as well be allocated to any other, non-Java process.

Java memory is the native memory that is taken by your JVM in order to run Java applications. This memory is structured by a model, separated into different parts (also known as spaces or data areas), and each part has its purpose. These parts are basically native machine memory put under control of the Java memory manager. Below is my attempt at a comprehensible picture of the Java memory model for Java 7. For Java 8, it looks a bit different, but more on that later.

Frames

What are frames? They seem to be the building blocks of a stack. I must admit, before writing this post, I had no idea that they were a thing in Java memory. According to the Java 7 JVM specification (section 2.6), a frame is used for storing data and partial results, returning values from methods and dispatching exceptions. And here comes the clue: a frame is created when a method is invoked and destroyed when that method is done. It doesn’t matter whether the method ended normally or threw an exception. You might say that a frame controls the scope of local variables.

Frames are put on the stack of the thread that created the frame. In other words, if a thread invokes a method, a new frame will be created and put on the stack of that thread. So basically frames are somewhat the building blocks of the stack. Each frame has its own array of local variables, its own operand stack and some class related references.

Only one frame per thread can be active at any given point in time. This is because of how the stack works. A frame cannot be shared with another thread.
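To make frames a bit more tangible, here is a minimal sketch that counts the entries in the current thread's stack trace. The class and method names (FrameDemo, depth, inner, outer) are made up for this illustration. A stack trace is only a best-effort view of the frames, but on a HotSpot JVM each extra method call on the way to depth() should show up as one extra entry.

```java
class FrameDemo {

    // Number of entries in the current thread's stack trace,
    // roughly one per frame on this thread's stack.
    static int depth() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Each call below adds one more frame before depth() runs.
    static int inner() {
        return depth();
    }

    static int outer() {
        return inner();
    }

    public static void main(String[] args) {
        int direct = depth();
        int nested = outer();
        System.out.println("frames when calling depth() directly: " + direct);
        System.out.println("frames when going through outer() and inner(): " + nested);
        System.out.println("extra frames: " + (nested - direct));
    }
}
```

The absolute numbers depend on how the program was started; the difference between the two is the interesting part, since it matches the two extra method invocations.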

The stack

The stack is a data structure. You can only add data on the ‘top’ (which is called pushing) and you can only remove data that is on ‘top’ (which is called popping). So the first thing you pushed on the stack will be the last thing that is removed (something called ‘LIFO’, which stands for ‘Last In, First Out’). This is how a stack works, and it is the reason that for every stack there can only be one active frame at any given time: the one on top of the stack.

Every thread in Java has its own stack, which is created at the same time as its owning thread. That means that a running Java application can have multiple stacks at the same time. All the stacks in a single Java program share the same heap space. The amount of memory used can be set with an

So what is the role of the stack in the Java memory model? Well, it holds the local variables of a method and the parameters of a method, and it plays a part in returning values when a method has finished. All of this information is packed in a frame. Here is a simple representation of the stack and what it holds.

 

The picture gives an idea of how the stack works. When the main method is invoked, the args parameter is pushed onto the stack, followed by the ‘age’ variable and so on up to the ‘name’ variable, in that very order.

There is something funky though: both args and name seem to have values on the stack that do not match what is in the code. This is true. These variables hold references to objects (the values in the picture are not a good representation of pointers or references; it’s just a way to indicate that they are not holding objects). Objects are not stored on the stack, they are stored in the heap. More on how this actually works and interacts with the heap in the following post.
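The difference between a value on the stack and a reference on the stack can be shown with a small sketch. This is not the code behind the picture; the class StackVsHeapDemo and the variables in it are hypothetical and only borrow the ‘age’ and ‘name’ idea. Copying a primitive copies the value itself, while copying a reference gives you a second handle to the very same object on the heap.

```java
class StackVsHeapDemo {

    // Primitive: the value itself lives in the frame's local variables.
    static int copiedPrimitive() {
        int age = 30;
        int copy = age;   // copies the value
        copy++;           // only the copy changes
        return age;       // still 30
    }

    // Reference: the frame only holds a reference, the object lives on the heap.
    static String sharedObject() {
        StringBuilder name = new StringBuilder("Duke");
        StringBuilder alias = name;   // copies the reference, not the object
        alias.append(" Java");        // changes the one object both variables point to
        return name.toString();       // "Duke Java"
    }

    public static void main(String[] args) {
        System.out.println(copiedPrimitive()); // prints 30
        System.out.println(sharedObject());    // prints Duke Java
    }
}
```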

Each stack has a fixed amount of memory. That means that if the stack grows and would require more memory than permitted, the JVM throws a StackOverflowError. A great way of triggering such an error is writing a never-ending recursive method.
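A never-ending recursive method is easy to try out for yourself. The sketch below (the class name StackOverflowDemo and its methods are invented for this example) keeps calling itself until the JVM gives up, catches the error and reports how deep it got. Catching a StackOverflowError is fine for a demo like this, but not something I would rely on in real code.

```java
class StackOverflowDemo {

    static int depth = 0;

    // Never returns normally: every call pushes one more frame on the stack.
    static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the stack is exhausted and returns the depth that was reached.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The frames have been popped again by the time we get here.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Depth reached before StackOverflowError: " + measure());
    }
}
```

The depth you get depends on the stack size of the thread and on how much each frame needs, so expect a different number on every machine and even between runs.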

A stack can also cause an OutOfMemoryError. Since native memory is allocated for the stack of every new thread, it is possible that there is no more native memory available to create a new stack.

The heap

The heap is a run-time data area that is shared by all threads. It is created when the JVM is started. This is the area where all the objects and arrays are stored. The heap consists of multiple spaces that all have their own purpose and, each in its own way, help the performance of the JVM. But more details on that in a later post about garbage collection.

The young generation: This is the part of the heap where all the new or recently created objects are stored in (hence ‘young generation’ 😉 ). Although it is not very clear in my picture of the memory model, the young generation is not as large as the old generation. The young generation is separated into 3 more spaces: the eden space and 2 survivor spaces.

Eden space: Honestly, the name ‘eden space’ really doesn’t speak for itself in my opinion. But this is the space where all the newly created objects are stored. It is not a very large space, because most objects are only needed for a short time. Being small, the eden space fills up quite quickly, and when it is almost full a garbage collection is triggered. This means that every object without a reference will be destroyed and there is again memory available in the eden space. In fact, after a garbage collection, eden space is completely empty again (more on garbage collection and how this works later). How is this possible? Surely there must be objects that still have a reference and survive the garbage collection onslaught?

The survivor spaces: This is where the survivors of a garbage collection are stored. Not forever though. After a garbage collection of the young generation (that is right, a garbage collection affects the entire young generation, not only eden space), one of the two survivor spaces is empty at the start of a new cycle. The survivor spaces are small in size, because the number of objects in there will always be very small. When objects have survived a certain number of garbage collection cycles, they are ‘evicted’ from the survivor space and moved to another space.

The old generation or tenured space: Again, the name ‘tenured space’ really doesn’t say much to me, but maybe that is because English is not my native language. I prefer ‘old generation’, since they are basically the same thing. But as you can guess, this is the space where die-hard objects are stored. These are objects which are expected to live for a long time. The old generation or tenured space is the largest space. It needs to hold the most objects. It should be noted that at any given time during the lifetime of a Java application, some of these objects may no longer be referenced. In that case, they are eligible for garbage collection and will be destroyed when a garbage collection is done.
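You can ask a running JVM which spaces it actually has. Here is a minimal sketch using the standard java.lang.management API; the class name MemoryPoolDemo and the helper poolNames are my own. The names that come back depend on the JVM and on the garbage collector in use, so take any specific name with a grain of salt: on a Java 7 HotSpot JVM with the parallel collector you would expect something like ‘PS Eden Space’, ‘PS Survivor Space’, ‘PS Old Gen’ and ‘PS Perm Gen’.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolDemo {

    // Collects "<type>: <name>" for every memory pool the JVM reports.
    static List<String> poolNames() {
        List<String> names = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() tells whether the pool counts as heap or non-heap memory.
            names.add(pool.getType() + ": " + pool.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : poolNames()) {
            System.out.println(name);
        }
    }
}
```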

The heap space can grow up to a certain amount of memory during the lifetime of a program. How come? It is explained in great depth in this article, but here is a short summary. On startup of your JVM (so when you start a Java application) the JVM sets an initial heap size and a maximum heap size. The default values depend on the system configuration (the hardware that the JVM is running on), according to this Oracle article (search for -Xmsn or -Xmxn). Consider this example: the initial heap size could be 64 MB and the maximum size could be 200 MB. That means that there is 136 MB of unused memory. This memory is reserved by the JVM. It is not used yet, but it could be used in the future. Because the JVM reserves this unused native memory, it cannot be used by any other process.
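To see these numbers for your own JVM, the standard Runtime class is enough. The sketch below uses a made-up class name (HeapSizeDemo) and simply reports the heap the JVM currently holds, the maximum it is allowed to grow to, and how much of the current heap is free.

```java
class HeapSizeDemo {

    // Returns { current heap size, maximum heap size, free part of current heap } in MB.
    static long[] heapSizesInMb() {
        Runtime runtime = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        return new long[] {
            runtime.totalMemory() / mb,  // heap currently claimed by the JVM
            runtime.maxMemory() / mb,    // upper limit the heap may grow to
            runtime.freeMemory() / mb    // unused part of the current heap
        };
    }

    public static void main(String[] args) {
        long[] sizes = heapSizesInMb();
        System.out.println("current heap: " + sizes[0] + " MB");
        System.out.println("maximum heap: " + sizes[1] + " MB");
        System.out.println("free in heap: " + sizes[2] + " MB");
    }
}
```

Running it as `java -Xms64m -Xmx200m HeapSizeDemo` mirrors the example above; the reported values will be close to, but usually not exactly, the numbers you passed in.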

When the heap reaches its limit, a garbage collection cycle is triggered. When only a very small amount of memory is freed by a garbage collection cycle, the JVM will throw an OutOfMemoryError: Java heap space. Pay attention to the name of the error: it’s different from the one that is thrown when the stack runs out of memory. And also pay attention to the information that follows the colon. The JVM is very specific about the part that has run out of memory. It specifically says that the error is about the heap space.

The Method Area or Permgen space

The permgen space, method area or permanent generation is a tricky space to write about. A lot has been written about it and some articles just contradict each other. In this post, I will only address what I think is important to know about the permanent generation. I think I will do more research on this space and then try to clear things up in a separate post.

What I do know is that the permanent generation in Java 7 holds metadata of all the classes that have been loaded so far. More details on how that happens can be read in this (older) article. What kind of metadata are we talking about? Well, just about anything there is to know about classes, so that the JVM knows how to describe those classes. That means field data, method data, names of the classes, static methods and fields, etc. Before Java 7, String interning and creating String pools also happened in the permanent generation. In Java 7, this is no longer the case.
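In case String interning does not ring a bell, here is a minimal sketch of what it does. The class InternDemo is made up for the occasion. The code cannot show where the pool lives, only that interning gives you back the one pooled instance of a String.

```java
class InternDemo {

    // A String created with 'new' is a separate object from the pooled literal.
    static boolean sameObjectBeforeIntern() {
        String literal = "permgen";
        String copy = new String("permgen");
        return literal == copy;            // false: two different objects
    }

    // intern() returns the pooled instance, which is the same object as the literal.
    static boolean sameObjectAfterIntern() {
        String literal = "permgen";
        String copy = new String("permgen");
        return literal == copy.intern();   // true: both refer to the pooled String
    }

    public static void main(String[] args) {
        System.out.println(sameObjectBeforeIntern()); // prints false
        System.out.println(sameObjectAfterIntern());  // prints true
    }
}
```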

When the permanent generation is full but more metadata needs to be stored, an OutOfMemoryError: PermGen space is thrown. Again the JVM is very specific about which space has run out of memory. This is important information when you face a Java memory issue in real life. When a space is running out of memory, there are basically 2 main reasons: your application needs more memory than you initially estimated, or you have a memory leak. But a memory leak in the heap space has different causes than a memory leak in the permgen space. More on that in a follow-up post.

The native stack

The native stack is basically the same as the Java stack, but instead of supporting Java methods, it supports methods that are written in a language other than Java (typically C or C++). Which language that is again depends on the JVM implementation and the machine that it is running on.

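Just like the Java stack, there is a native stack for every thread that is started by your Java application. And again, the JVM throws the same errors for it as for the Java stack: a StackOverflowError when a stack needs more room than permitted and an OutOfMemoryError when there is no memory left to create one.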
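If you wonder which methods end up using the native stack: they are the ones declared with the native keyword, and reflection can tell you whether a method is one of them. The sketch below uses a hypothetical helper class, NativeMethodDemo. On HotSpot/OpenJDK, Object.hashCode() is declared native, while String.length() is a plain Java method; other JVM implementations may differ.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethodDemo {

    // True if the public, no-argument method with this name is declared 'native'.
    static boolean isNative(Class<?> type, String methodName) {
        try {
            Method method = type.getMethod(methodName);
            return Modifier.isNative(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such public no-arg method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Object.hashCode() native? " + isNative(Object.class, "hashCode"));
        System.out.println("String.length() native?   " + isNative(String.class, "length"));
    }
}
```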