bug #5048
Updated by Andreas Kohlbecker over 4 years ago
Our cdm-servers are suffering from memory problems! Evidence for this can be seen at least since the middle of last week.
* edit-int was consuming about 12GB of RAM with only 4 instances running. This became obvious because the server was swapping and had become more or less unresponsive. After a restart, RAM consumption was at 5GB.
* edit-test was today also consuming about 12GB of RAM, but with over 20 instances. It was also stalled due to memory being swapped to disk.
* edit-production spit out a lot of monit warnings that the server had reached the warning limit of more than 80% RAM consumption.
see also #5375
----
## Native memory tracking
* https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/tooldescr007.html (*****)
* https://stackoverflow.com/questions/2756798/java-native-memory-usage
* https://www.techpaste.com/2012/07/steps-debugdiagnose-memory-memory-leaks-jvm/ (****)
* https://plumbr.io/blog/memory-leaks/native-memory-leak-example
### jemalloc
* [jemalloc home](http://jemalloc.net/)
* [jemalloc github](https://github.com/jemalloc/jemalloc)
* **https://technology.blog.gov.uk/2015/12/11/using-jemalloc-to-get-to-the-bottom-of-a-memory-leak/**
* https://www.evanjones.ca/java-native-leak-bug.html
* https://docs.tibco.com/pub/bwce/2.4.5/doc/html/GUID-231E1EFC-EA7C-4072-B0F4-0D92093D3161.html
**installation and usage**
Download jemalloc-5.2.1.tar.bz2 from https://github.com/jemalloc/jemalloc/releases and extract the archive. In the extracted folder ($JMALLOC_FOLDER):
~~~
./configure --enable-prof
make
~~~
The executable and the library file can now be found here:
* $JMALLOC_FOLDER/bin/jeprof
* $JMALLOC_FOLDER/lib/libjemalloc.so
To profile the cdmlib-remote-webapp instance or the cdm-server, add the following environment variables to the eclipse jetty launcher:
~~~
LD_PRELOAD=$JMALLOC_FOLDER/lib/libjemalloc.so
MALLOC_CONF=prof:true,lg_prof_interval:24,lg_prof_sample:17
~~~
Compared to the posts linked above, I am using the rather low lg_prof_interval of 2^24 (~16MB) to record allocations at a fine granularity. With this setting I could pin the massive allocations down to the cause `Java_java_util_zip_Inflater_init` (see the results below).
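As a sanity check on these two settings (both are the log2 of a byte count):

~~~
# lg_prof_interval:24 -> dump a heap profile every 2^24 bytes allocated
echo $((1 << 24))   # → 16777216 (~16MB)
# lg_prof_sample:17 -> take one allocation sample per 2^17 bytes on average
echo $((1 << 17))   # → 131072 (128KB)
~~~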
![](picture323-1.png)
Now start the launcher to run the application. jemalloc will write several files named like `jeprof.*.heap` into the working directory.
The `jeprof` executable can create diagrams from these files:
~~~
bin/jeprof --show-bytes --gif /opt/java-oracle/jdk1.8/bin/java jeprof.*.heap > app-profiling.gif
~~~
### perf
**installation**
~~~
apt install linux-tools-$(uname -r) linux-tools-generic
~~~
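A minimal usage sketch for finding out where a process triggers page faults; `page-faults` is a standard perf software event, and `<pid>` is a placeholder for the process id of the cdm-server:

~~~
# sample call stacks on page-fault events of a running process for 60 seconds
perf record -e page-faults -g -p <pid> -- sleep 60
# print an aggregated report of the recorded samples
perf report --stdio
~~~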
## Similar issues
### MALLOC_ARENA Issue
* http://stackoverflow.com/questions/561245/virtual-memory-usage-from-java-under-linux-too-much-memory-used#28935176
* http://serverfault.com/questions/341579/what-consumes-memory-in-java-process#answer-554697
* http://stackoverflow.com/questions/18734389/huge-memory-allocated-outside-of-java-heap
* http://stackoverflow.com/questions/26041117/growing-resident-memory-usage-rss-of-java-process
* https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en
The MALLOC_ARENA issue is a known problem with glibc >= 2.10; your servers running jessie have glibc 2.19 installed!
Setting the MALLOC_ARENA_MAX environment variable to a low value (0-4) could help:
~~~
export MALLOC_ARENA_MAX=4
~~~
In our case, however, this only led to a decrease in performance and did not reduce memory consumption at all.
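Note that for the jsvc-based cdm-server the variable has to be in the environment of the service, not just in an interactive shell. A sketch, assuming the init script sources a defaults file (the path /etc/default/cdmserver is an assumption, adjust to the actual setup):

~~~
# /etc/default/cdmserver  (assumed path; adjust to the actual init script)
# cap the number of glibc malloc arenas for the JVM process
export MALLOC_ARENA_MAX=4
~~~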
----
## Diagnoses and results
**Diagnosis of edit-test as of 30.06.2015**
settings for the cdm-server:
~~~
-Xmx4500M -XX:PermSize=512m -XX:MaxPermSize=1800m
~~~
system memory usage of the cdm-server process:
~~~
KiB Mem: 10266200 total, 9720060 used, 546140 free, 95836 buffers
KiB Swap: 2097148 total, 1870592 used, 226556 free. 1442588 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31761 cdm 20 0 18.997g 6.710g 3084 S 0.0 68.5 42:35.69 jsvc
~~~
**Almost all of the Swap space is being used.**
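To confirm that it is really the cdm-server process occupying the swap space, the kernel's per-process counters can be read directly (PID 31761 taken from the top output above):

~~~
# resident and swapped-out memory of a single process, in KiB
PID=31761
grep -E '^(VmRSS|VmSwap)' /proc/$PID/status
~~~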
~~~
$ jmap -heap 31761
Attaching to process ID 31761, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 24.71-b01
using thread-local object allocation.
Garbage-First (G1) GC with 4 thread(s)
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 4718592000 (4500.0MB)
NewSize = 1363144 (1.2999954223632812MB)
MaxNewSize = 17592186044415 MB
OldSize = 5452592 (5.1999969482421875MB)
NewRatio = 2
SurvivorRatio = 8
PermSize = 536870912 (512.0MB)
MaxPermSize = 1887436800 (1800.0MB)
G1HeapRegionSize = 1048576 (1.0MB)
Heap Usage:
G1 Heap:
regions = 4380
capacity = 4592762880 (4380.0MB)
used = 3908545928 (3727.479866027832MB)
free = 684216952 (652.520133972168MB)
85.10228004629754% used
G1 Young Generation:
Eden Space:
regions = 602
capacity = 874512384 (834.0MB)
used = 631242752 (602.0MB)
free = 243269632 (232.0MB)
72.18225419664269% used
Survivor Space:
regions = 10
capacity = 10485760 (10.0MB)
used = 10485760 (10.0MB)
free = 0 (0.0MB)
100.0% used
G1 Old Generation:
regions = 3116
capacity = 3707764736 (3536.0MB)
used = 3266817416 (3115.479866027832MB)
free = 440947320 (420.52013397216797MB)
88.10746227454275% used
Perm Generation:
capacity = 1504706560 (1435.0MB)
used = 1504138704 (1434.4584503173828MB)
free = 567856 (0.5415496826171875MB)
99.96226134615908% used
59361 interned Strings occupying 5910712 bytes.
~~~
So in total ~10GB of memory is in use on this machine, whereas the G1 Old Generation and the Perm Generation of this single process alone account for ~5GB of it!
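The ~5GB figure follows directly from the `used` values in the jmap output (numbers copied from above):

~~~
# G1 Old Generation used + Perm Generation used, converted to MiB
echo $(( (3266817416 + 1504138704) / 1024 / 1024 ))   # → 4549 MiB, i.e. ~4.77e9 bytes ≈ 5GB
~~~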
### Diagnosing with jemalloc
Result:
![cutout of attachment:app-profiling.gif](picture743-1.png)