This is part #1 of a series of articles that will provide you with a consolidated view of the Sun Java HotSpot VM arguments from the online Oracle documentation. It will also serve as a foundation and baseline for future Blog posts, case studies on Java HotSpot VM performance tuning and the upcoming JDK 1.7, so I suggest you bookmark this post for future reference.
Server-Class Machine Detection (HotSpot JDK 1.5+)
Find below the Java HotSpot behaviour when the Server-Class Machine Detection mode is used (i.e. neither -server nor -client is specified). It is important to note that -server mode should normally be used on machines with a minimum of 2 processors and a minimum of 2 GB of physical memory.
Architecture  | OS                 | client VM | if server-class, server VM; otherwise, client VM | server VM
SPARC 32-bit  | Solaris            |           | X                                                 |
i586          | Solaris            |           | X                                                 |
i586          | Linux              |           | X                                                 |
i586          | Microsoft Windows  | X         |                                                   |
SPARC 64-bit  | Solaris            | —         |                                                   | X
AMD64         | Linux              | —         |                                                   | X
AMD64         | Microsoft Windows  | —         |                                                   | X
Legend: X = default VM; — = client VM not provided for this platform
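If you want to confirm which HotSpot VM was auto-selected on a given machine, a minimal sketch such as the one below can be used (the class name is mine, not from the Oracle documentation); it simply prints the standard java.vm.name and java.vm.version system properties along with the processor count used by the detection logic.

public class WhichVM {
    public static void main(String[] args) {
        // Standard JVM system properties; typically "Java HotSpot(TM) Server VM" vs "Client VM"
        System.out.println("java.vm.name    = " + System.getProperty("java.vm.name"));
        System.out.println("java.vm.version = " + System.getProperty("java.vm.version"));
        // Number of processors visible to the JVM (one of the server-class criteria)
        System.out.println("processors      = " + Runtime.getRuntime().availableProcessors());
    }
}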
Java HotSpot VM argument matrix – standard options
* Standard options guaranteed to be supported by all VM implementations
This case study describes the complete root cause analysis and resolution of a Java ZipFile OutOfMemoryError problem triggered during the deployment of an Oracle Weblogic Integration 9.2 application.
Environment Specs
· Java EE server: Oracle Weblogic Integration 9.2 MP2
· OS: Sun Solaris 5.10
· JDK: Sun Java VM 1.5.0_10-b03 - 32-bit
· Java VM arguments: -server -Xms2560m -Xmx2560m -XX:PermSize=256m -XX:MaxPermSize=512m
· Platform type: BPM
Monitoring and troubleshooting tools
· Solaris pmap command
· Java verbose GC (for Java Heap and PermGen memory monitoring)
Problem overview
· Problem type: java.lang.OutOfMemoryError at java.util.zip.ZipFile.open(Native Method)
An OutOfMemoryError problem was observed during the deployment process of one of our Weblogic Integration 9.2 applications.
Gathering and validation of facts
As usual, a Java EE problem investigation requires the gathering of technical and non-technical facts so we can either derive other facts and/or conclude on the root cause. Before applying a corrective measure, the facts below were verified in order to conclude on the root cause:
· What is the client impact? HIGH (full outage of our application)
· Recent change of the affected platform? Yes, a minor update to the application was done along with an increase of the minimum and maximum size of the Java Heap from 2 GIG (2048m) to 2.5 GIG (2560m) in order to reduce the frequency of full garbage collections
· Any recent traffic increase to the affected platform? No
· How long has this problem been observed? Since the increase of the Java Heap to 2.5 GIG
· Is the OutOfMemoryError happening on start-up or under load? The OOM is triggered at deployment time only, with no traffic to the environment
· What was the utilization of the Java Heap at that time? Really low, at only 20% utilization (no traffic)
· What was the utilization of the PermGen space at that time? Healthy, at ~70% utilization and not leaking
· Did a restart of the Weblogic server resolve the problem? No
- Conclusion #1: The problem trigger appears to be related to the 500 MB increase of the Java Heap to 2.5 GIG. ** This problem initially puzzled the troubleshooting team until a deeper dive analysis was performed, as per below **
Weblogic log file analysis
A first analysis of the problem was done by reviewing the OOM error in the Weblogic managed server log.
java.lang.OutOfMemoryError
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:203)
at java.util.jar.JarFile.<init>(JarFile.java:132)
at java.util.jar.JarFile.<init>(JarFile.java:97)
at weblogic.utils.jars.JarFileDelegate.<init>(JarFileDelegate.java:32)
at weblogic.utils.jars.VirtualJarFactory.createVirtualJar(VirtualJarFactory.java:24)
at weblogic.application.ApplicationFileManagerImpl.getVirtualJarFile(ApplicationFileManagerImpl.java:194)
at weblogic.application.internal.EarDeploymentFactory.findOrCreateComponentMBeans(EarDeploymentFactory.java:162)
at weblogic.application.internal.MBeanFactoryImpl.findOrCreateComponentMBeans(MBeanFactoryImpl.java:48)
at weblogic.application.internal.MBeanFactoryImpl.createComponentMBeans(MBeanFactoryImpl.java:110)
at weblogic.application.internal.MBeanFactoryImpl.initializeMBeans(MBeanFactoryImpl.java:76)
at weblogic.management.deploy.internal.MBeanConverter.createApplicationMBean(MBeanConverter.java:88)
at weblogic.management.deploy.internal.MBeanConverter.createApplicationForAppDeployment(MBeanConverter.java:66)
at weblogic.management.deploy.internal.MBeanConverter.setupNew81MBean(MBeanConverter.java:314)
As you can see, the OutOfMemoryError is thrown during the loading of our application (EAR file). The Weblogic server relies on the Java JDK ZipFile class to load any application EAR / JAR files.
- Conclusion #2: The OOM error is triggered during a native call (ZipFile.open(Native Method)) from the Java JDK ZipFile class to load our application EAR file. This native JVM operation requires proper native memory and virtual address space to be available in order to execute its loading operation. The conclusion at this point was that our Java VM 1.5 was running out of native memory / virtual address space at deployment time.
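To illustrate where this native call comes from, here is a minimal, hedged sketch (the file path and class name are hypothetical, not from our environment) showing that simply constructing a java.util.zip.ZipFile on an EAR / JAR file goes through ZipFile.open(Native Method), which allocates native memory and address space outside the Java Heap:

import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class EarOpenSketch {
    public static void main(String[] args) throws IOException {
        // The constructor triggers ZipFile.open(Native Method); on JDK 1.4 / 1.5 the
        // whole archive is mapped into the native address space of the Java process
        ZipFile ear = new ZipFile("/apps/myapp/MyApplication.ear"); // hypothetical path
        try {
            Enumeration<? extends ZipEntry> entries = ear.entries();
            while (entries.hasMoreElements()) {
                System.out.println(entries.nextElement().getName());
            }
        } finally {
            ear.close(); // releases the underlying native resources
        }
    }
}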
Sun Java VM native memory and MMAP files
Before we go any further in the root cause analysis, you may want to revisit the internal structure of the Sun Java HotSpot VM for JDK 1.5 and JDK 1.6. A proper understanding of the internal Sun Java VM is quite important, especially when dealing with native memory problems.
When using JDK 1.4 / 1.5, any JAR / ZIP file loaded by the Java VM gets mapped entirely into a native address space. This means that the more EAR / JAR files you are loading into a single JVM, the higher the native memory footprint of your Java process.
This also means that the larger your Java Heap and PermGen space are, the less memory is left for the other native memory spaces such as the C-Heap and MMAP Files, which can definitely be a problem if you are deploying too many separate applications (EAR files) to a single 32-bit Java process.
Please note that Sun came up with improvements in JDK 1.6 (Mustang) and changed the behaviour so that the JAR file's central directory is still mapped, but the entries themselves are read separately, reducing the native memory requirement.
I suggest that you review the Sun Bug Id link below for more detail on this JDK 1.4 / 1.5 limitation.
Native memory footprint: Solaris pmap to the rescue!
The Solaris pmap command allows Java EE application developers to analyse how an application uses memory by providing a breakdown of all allocated address spaces of the Java process.
Now back to our OOM problem: pmap snapshots were generated following the OutOfMemoryError, and analysis of the data did reveal that we were getting very close to the upper 4 GIG limit of a 32-bit process on Solaris after our Java Heap increase to 2.5 GIG.
Now find below the reduced raw data along with a snapshot and an explanation of the findings.
As you can see in the above pmap analysis, our 32-bit Java process size was getting very close to the 4 GIG limit, leaving no room for an additional EAR file deployment.
We can see a direct correlation between the Java Heap increase and the appearance of this OutOfMemoryError. Since the Java 1.5 VM is mapping the entire EAR file to a native memory address space, proper native memory and address space must be available to fulfil such a ZipFile.open() native operation.
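For illustration only, a rough address space budget based on the above JVM arguments: the Java Heap reservation of 2.5 GIG (2560 MB) plus the MaxPermSize of 512 MB already accounts for approximately 3 GIG of the ~4 GIG address space available to a 32-bit Solaris process. The remaining ~1 GIG must then fit the JVM code itself, the Thread stacks, the C-Heap and all MMAP'ed EAR / JAR files, which explains why the extra 500 MB of Java Heap left insufficient room for the ZipFile.open() native operation. The exact split depends on your OS and JVM version, so treat these numbers as an approximation.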
The solution was simply to revert our Java Heap increase, which did allow the deployment of our EAR file.
Other long-term solutions will be discussed shortly, such as vertical scaling of the Weblogic Integration platform (adding more JVMs / managed servers to the existing physical servers), switching to a 64-bit JVM and / or upgrading to the Sun Java VM 1.6.
Conclusion and recommendations
- When facing an OutOfMemoryError with the Sun Java VM, always do a proper analysis and your due diligence to determine which memory space is the problem (Java Heap, PermGen, native / C-Heap)
- When facing an OutOfMemoryError due to native memory exhaustion, always generate Solaris pmap snapshots of your Java process and do your analysis. Do not increase the Java Heap (Xms / Xmx) as this will make the problem even worse
- Be very careful before attempting to increase your Java Heap or PermGen space. Always ensure that you understand your total Java VM memory footprint and that you have enough native memory available for the non-Java Heap memory allocations such as MMAP Files for your application EAR files
This short article will provide you with the most common problem patterns you can face with Java Threads hanging at SocketInputStream.socketRead0. For more detail and troubleshooting approaches for this type of problem, please visit my original post on this subject.
Problem overview
Any communication protocol such as HTTP / HTTPS, JDBC, RMI etc. ultimately relies on the JDK java.net layer to perform the lower-level TCP-IP / Socket operations. The creation of a java.net.Socket is required in order for your application and JVM to connect to, send and receive data from an external source (Web Service, Oracle database etc.).
SocketInputStream.socketRead0 is the actual native blocking IO operation executed by the JDK in order to read and receive data from the remote source. This is why Thread hanging problems on this operation are so common in the Java EE world.
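As a minimal illustration (the host name, port and class name below are hypothetical), the following sketch performs a plain blocking read; with no socket timeout configured, the read() call blocks natively inside SocketInputStream.socketRead0 until data arrives or the connection is dropped:

import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

public class BlockingReadSketch {
    public static void main(String[] args) throws Exception {
        Socket socket = new Socket();
        // Hypothetical remote service provider endpoint
        socket.connect(new InetSocketAddress("remote.service.example", 8080));
        InputStream in = socket.getInputStream();
        // A Thread Dump taken here would show this Thread inside SocketInputStream.socketRead0
        int firstByte = in.read();
        System.out.println("First byte received: " + firstByte);
        socket.close();
    }
}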
java.net.SocketInputStream.socketRead0() – why is it hanging?
There are a few common scenarios which can lead your application and Java EE server Threads to hang for some time or even forever at java.net.SocketInputStream.socketRead0.
# Problem pattern #1
Slowdown or instability of a remote service provider invoked by your application such as:
- A Web Service provider (via HTTP/HTTPS)
- An RDBMS (Oracle) database
- An RMI server etc.
- Other remote service providers (FTP, pure TCP-IP etc.)
This is by far the most common problem (~90%+ of cases). See below an example of a hung Thread from a Thread Dump data extract, due to instability of a remote Web Service provider:
# Problem pattern #2
Functional problem causing long running transaction(s) from your remote service provider
This is quite similar to problem pattern #1 but the difference is that the remote service provider is healthy but taking more time to process certain requests from your application due to a bad functional behaviour.
A good example is a long running Oracle database SQL query (lack of indexes, execution plan issue etc.) that will show up in the Thread Dump as per below:
# Problem pattern #3
Intermittent or consistent network slowness or latency.
Severe network latency will cause the data transfer between the client and the server to slow down, causing the Socket read() and write() operations to take more time to complete and the Thread to hang at SocketInputStream.socketRead0 until the data bytes are received from the network.
java.net.SocketInputStream.socketRead0() – what is the solution?
# Problem pattern #1
The solution for this problem pattern is to contact the support team of the affected remote service provider and share your observations from your application, Threads etc. so they can investigate and resolve their global system problem.
# Problem pattern #2
The solution for this problem pattern will depend on the technology involved. A root cause analysis must be performed in order to identify and fix the culprit (missing table indexes, too much data returned from the Web Service etc.).
# Problem pattern #3
The solution for this problem pattern will require the engagement of your network support team so they can conduct a "sniffing" exercise of the TCP-IP packets between your application server(s) and the affected remote service provider(s).
You should also attempt to replicate the problem using OS commands such as ping and traceroute to provide some guidance to your network support team.
Final recommendation – timeout implementation is critical!
Proper timeouts should be implemented, when possible, in order to prevent a domino effect situation on your Java EE application server. The timeout value should be low enough to prevent your Threads from hanging for too long and should be tested properly (negative testing).
Socket timeouts (connect, read & write operations) for Web Services via HTTP / HTTPS are quite easy to implement and can be achieved by following the proper API documentation for your platform (JAX-WS, Weblogic, WAS, JBoss WS, Apache AXIS etc.).
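As a simple, hedged illustration (the URL, timeout values and class name are hypothetical and must be tuned and tested per platform), the snippet below shows connect and read timeouts applied to a plain java.net HTTP connection; the same idea applies to the Web Service APIs listed above:

import java.net.HttpURLConnection;
import java.net.URL;

public class TimeoutSketch {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://remote.service.example/ws/endpoint"); // hypothetical endpoint
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(5000); // fail fast if the remote provider cannot be reached (ms)
        conn.setReadTimeout(10000);   // bound the time a Thread can block in socketRead0 (ms)
        System.out.println("HTTP response code: " + conn.getResponseCode());
        conn.disconnect();
    }
}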
This is part 1 of a series of posts that will provide you with a root cause analysis approach and resolution guide for Java EE OutOfMemoryError problems.
Part 1 will focus on how you can first isolate the problem and identify which JVM memory space ran out of memory.
OutOfMemoryError: What is it?
This is one of the most common problems you can face when supporting or developing Java EE production systems or a standalone Java application.
An OutOfMemoryError is thrown by the Java VM when it cannot allocate an object because it is out of memory, and no more memory could be made available by the garbage collector.
The actual Java Exception returned by the JVM is java.lang.OutOfMemoryError, which is a subclass of java.lang.Error. The error message provided by the VM will be different depending on your JVM vendor and version and on which memory space is depleted.
Find below an example of OutOfMemoryError thrown by a Java HotSpot VM 1.6 following the depletion of the Java Heap space.
java.lang.OutOfMemoryError: Java heap space
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
at java.lang.StringBuilder.append(StringBuilder.java:119)
at sun.security.util.ManifestDigester.<init>(ManifestDigester.java:117)
at java.util.jar.JarVerifier.processEntry(JarVerifier.java:250)
at java.util.jar.JarVerifier.update(JarVerifier.java:188)
at java.util.jar.JarFile.initializeVerifier(JarFile.java:321)
at java.util.jar.JarFile.getInputStream(JarFile.java:386)
at org.jboss.virtual.plugins.context.zip.ZipFileWrapper.openStream(ZipFileWrapper.java:215)
at org.jboss.virtual.plugins.context.zip.ZipEntryContext.openStream(ZipEntryContext.java:1084)
at org.jboss.virtual.plugins.context.zip.ZipEntryHandler.openStream(ZipEntryHandler.java:154)
at org.jboss.virtual.VirtualFile.openStream(VirtualFile.java:241)
Analysis step #1 - Identify which JVM memory space ran out of memory
The first step is to determine which memory space is depleted. The analysis approach will depend on which JVM vendor and version you are using. I have built a quick matrix guide to help with this task. Please simply review each of the affected memory spaces below and determine which one is applicable in your situation.
Please feel free to post any comment or question if you are still having doubts with these problem isolation steps.
JVM Memory Space: Java Heap
Analysis Approach
1. Review the OutOfMemoryError message. It should give you information such as java.lang.OutOfMemoryError: Java heap space.
2. If you are not seeing any explicit error message then you need to analyze the OutOfMemoryError Stack Trace. Look at the first 5 lines; they will give you the type of Java operation the Thread was executing that led to the OOM error. Java Heap depletion will be triggered by Java operations such as the population of a StringBuffer, adding objects to a HashMap data structure etc.
3. If you are not sure about either approach #1 or #2 then you will need to enable the JVM verbose GC (-verbose:gc) in order to identify and confirm whether Java Heap depletion (Young / Old Gen) is your problem (see the illustrative arguments right after this list).
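As an illustration (exact flag availability and output format can vary slightly per HotSpot version), verbose GC can be enabled by adding JVM arguments similar to the following to your Java / Java EE server start-up script:

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log

The resulting GC log will show the Young / Old Gen (and PermGen on HotSpot 1.7 and lower) occupancy before and after each collection, which is enough to confirm whether the Java Heap is truly depleted.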
Problem Patterns
The Java Heap is the most common memory space in which you will face an OutOfMemoryError, since it is storing your Java program's short and long term Object instances.
The most common problem is a lack of proper maximum capacity (via the -Xmx argument). Java Heap memory leaks are also quite common and will require you to analyze a JVM Heap Dump to pinpoint the root cause.
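For illustration, a trivial, hedged sketch (the class name is mine) that reproduces this first pattern by retaining objects until the Java Heap is depleted; run it with a small maximum capacity such as -Xmx64m and you should quickly get java.lang.OutOfMemoryError: Java heap space:

import java.util.ArrayList;
import java.util.List;

public class HeapDepletionSketch {
    public static void main(String[] args) {
        // Objects are retained in the List so the garbage collector cannot reclaim them
        List<byte[]> holder = new ArrayList<byte[]>();
        while (true) {
            holder.add(new byte[1024 * 1024]); // allocate and retain 1 MB per iteration
        }
    }
}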
JVM Memory Space: PermGen ** NOTE: PermGen was replaced by the Metaspace starting with JDK 1.8 **
Applicable JVM vendors
· Oracle Java HotSpot (JDK 1.7 and lower)
Analysis Approach
1. Review the OutOfMemoryError message. It should give you information such as java.lang.OutOfMemoryError: PermGen space.
2. If you are not seeing any explicit error message then you need to analyze the OutOfMemoryError Stack Trace. Look at the first 5 lines; they will give you the type of Java operation the Thread was executing that led to the PermGen depletion. Java PermGen space depletion will be triggered by JVM operations such as loading a class into a class loader, as per the example below.
3. If you are not sure about either approach #1 or #2 then you will need to enable the JVM verbose GC (-verbose:gc) in order to identify if PermGen space depletion is your problem.
java.lang.OutOfMemoryError: PermGen space
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(Unknown Source)
Problem Patterns
The Java HotSpot PermGen space is the second most common memory space in which you will face an OutOfMemoryError, since it is storing your Java program's Class descriptor related objects.
Configuring a PermGen space that is too small vs. your Java / Java EE program size is the most common problem. Other scenarios include class loader leaks that can be triggered, for example, by too many deploy / undeploy operations without a JVM restart.
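Class loading is the typical trigger described above; for a quick illustration that is easier to reproduce in a few lines, the hedged sketch below (class name is mine) fills the PermGen through String interning instead, since interned Strings are stored in the PermGen on HotSpot JDK 1.6 and lower. Run it with a small -XX:MaxPermSize such as 32m to trigger java.lang.OutOfMemoryError: PermGen space:

import java.util.ArrayList;
import java.util.List;

public class PermGenDepletionSketch {
    public static void main(String[] args) {
        // Each unique interned String is added to the PermGen (HotSpot 1.6 and lower)
        List<String> holder = new ArrayList<String>();
        int i = 0;
        while (true) {
            holder.add(("permgen-filler-" + i++).intern());
        }
    }
}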
JVM Memory Space: Native Heap (C-Heap)
Applicable JVM vendors
· Oracle Java HotSpot (any version)
· IBM Java J9 (any version)
· Oracle JRockit (any version)
Analysis Approach
Native OutOfMemoryError messages are normally not very informative. A deeper analysis of the OutOfMemoryError Stack Trace is required.
Native Heap depletion will be triggered by Java operations such as loading a JAR file (MMAP file), trying to create a new Java Thread etc., which all require enough native memory available to the C-Heap. Find below an example of OutOfMemoryError due to Native memory depletion of a 32-bit VM:
java.lang.OutOfMemoryError
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:203)
at java.util.jar.JarFile.<init>(JarFile.java:132)
at java.util.jar.JarFile.<init>(JarFile.java:97)
at weblogic.utils.jars.JarFileDelegate.<init>(JarFileDelegate.java:32)
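Another Native Heap scenario mentioned above is Java Thread creation: each Thread needs a native stack allocated outside the Java Heap. The hedged sketch below (class name is mine) keeps starting sleeping daemon Threads; on a 32-bit JVM with a large Java Heap this will eventually fail with java.lang.OutOfMemoryError: unable to create new native thread:

public class NativeThreadSketch {
    public static void main(String[] args) {
        int count = 0;
        while (true) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    try {
                        Thread.sleep(Long.MAX_VALUE); // keep the Thread (and its native stack) alive
                    } catch (InterruptedException ignored) {
                    }
                }
            });
            t.setDaemon(true);
            t.start(); // each start() allocates a native Thread stack from the C-Heap / address space
            System.out.println("Started thread #" + (++count));
        }
    }
}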
Problem Patterns
An OutOfMemoryError resulting from a Native Heap depletion is less common but can happen, for example, if your physical server is running out of virtual memory.
Other scenarios include a memory leak from a third-party native library, such as a monitoring agent, or trying to deploy too many applications (EAR files) or Java classes to a single 32-bit JVM.