bug #5048

Updated by Andreas Kohlbecker over 4 years ago

Our cdm-servers are suffering from memory problems! Evidence for this can be seen at least since the middle of last week.

 *    edit-int was consuming about 12GB of RAM with only 4 instances running. This became obvious because the server was swapping and more or less unresponsive. After a restart, RAM consumption was at 5GB.
 *    edit-test was today also consuming about 12GB of RAM, but with over 20 instances. It was also stalled due to memory being swapped to disk.
 *    edit-production emitted a lot of monit warnings that the server had reached the warning limit of more than 80% RAM consumption.

 see also #5375 

 ---- 

 ## Native memory tracking 

 * https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/tooldescr007.html (*****) 
 * https://stackoverflow.com/questions/2756798/java-native-memory-usage 
 * https://www.techpaste.com/2012/07/steps-debugdiagnose-memory-memory-leaks-jvm/ (****) 
 * https://plumbr.io/blog/memory-leaks/native-memory-leak-example 
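
 For reference, the JDK's own Native Memory Tracking (NMT, first link above) can be enabled with a startup flag and then queried via jcmd. A minimal sketch, assuming a JDK 8 VM, with `<pid>` standing for the cdm-server process id:

 ~~~
 # start the JVM with native memory tracking enabled (summary or detail level);
 # the flag has to be added to the start command of the cdm-server / launcher
 java -XX:NativeMemoryTracking=summary <remaining options>

 # take a baseline from the running process and later diff against it
 jcmd <pid> VM.native_memory baseline
 jcmd <pid> VM.native_memory summary.diff
 ~~~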

 ### jemalloc  

 * [jemalloc home](http://jemalloc.net/) 
 * [jemalloc github](https://github.com/jemalloc/jemalloc) 

 * **https://technology.blog.gov.uk/2015/12/11/using-jemalloc-to-get-to-the-bottom-of-a-memory-leak/** 
 * https://www.evanjones.ca/java-native-leak-bug.html 
 * https://docs.tibco.com/pub/bwce/2.4.5/doc/html/GUID-231E1EFC-EA7C-4072-B0F4-0D92093D3161.html 

 **Installation and usage**

 Download jemalloc-5.2.1.tar.bz2 from https://github.com/jemalloc/jemalloc/releases and extract the archive. In the extracted folder ($JMALLOC_FOLDER), build with profiling support enabled:

 ~~~ 
 ./configure --enable-prof 
 make 
 ~~~ 

 The executable and the library file can now be found here:

 * $JMALLOC_FOLDER/bin/jeprof 
 * $JMALLOC_FOLDER/lib/libjemalloc.so 

 To profile the cdmlib-remote-webapp instance or the cdm-server, add the following environment variables to the Eclipse Jetty launcher:

 ~~~ 
 LD_PRELOAD=$JMALLOC_FOLDER/lib/libjemalloc.so 
 MALLOC_CONF=prof:true,lg_prof_interval:24,lg_prof_sample:17 
 ~~~  
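
 The same preload also works outside of Eclipse when starting the server from a shell, since the two variables only need to be present in the environment of the Java process. A sketch; the start command below is only a placeholder for however the cdm-server is actually launched:

 ~~~
 export LD_PRELOAD=$JMALLOC_FOLDER/lib/libjemalloc.so
 export MALLOC_CONF=prof:true,lg_prof_interval:24,lg_prof_sample:17
 # placeholder start command, adapt to the actual cdm-server launch
 java -jar cdm-server.jar
 ~~~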

 Compared to the posts linked above, I am using the quite low lg_prof_interval of 2^24 (~16MB), so that a heap profile is dumped after roughly every 16MB of allocation and the allocations are recorded fine-grained. With this setting I could pin the massive allocations down to the cause `Java_java_util_zip_Inflater_init` (see the results below).

 ![](picture323-1.png) 

 Now start the launcher to run the application. jemalloc will write several files named like `jeprof.*.heap` to the working directory.

 The `jeprof` executable can create diagrams from these files:

 ~~~ 
 bin/jeprof --show-bytes --gif /opt/java-oracle/jdk1.8/bin/java jeprof.*.heap > app-profiling.gif 
 ~~~ 
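
 Besides the gif, jeprof can also print a plain-text top list of allocation sites, which is easier to grep. A sketch using the same input files; note that depending on the jeprof version the option may be spelled `--show_bytes`, as in the gov.uk post linked above:

 ~~~
 bin/jeprof --show-bytes --text /opt/java-oracle/jdk1.8/bin/java jeprof.*.heap | head -n 30
 ~~~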

 ## Similar issues 

 ### MALLOC_ARENA Issue  

 * http://stackoverflow.com/questions/561245/virtual-memory-usage-from-java-under-linux-too-much-memory-used#28935176 
 * http://serverfault.com/questions/341579/what-consumes-memory-in-java-process#answer-554697 
 * http://stackoverflow.com/questions/18734389/huge-memory-allocated-outside-of-java-heap 
 * http://stackoverflow.com/questions/26041117/growing-resident-memory-usage-rss-of-java-process 
 * https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en 


 The MALLOC_ARENA issue is a known problem with glibc >= 2.10; the servers running jessie have glibc 2.19 installed!
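
 Whether a host is affected can be checked by looking at the installed glibc version:

 ~~~
 ldd --version | head -n 1
 # prints the glibc version, e.g. 2.19 on Debian jessie
 ~~~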


 Setting the MALLOC_ARENA_MAX env variable to a low value (0-4) could help:

 ~~~ 
 export MALLOC_ARENA_MAX=4 
 ~~~ 
 In our case, however, this only led to a decrease in performance and did not reduce memory consumption at all.


 ---- 

 ## Diagnoses and results 

 **Diagnosis of edit-test as of 30.06.2015**  

 Settings for the cdm-server:

 ~~~ 
 -Xmx4500M -XX:PermSize=512m -XX:MaxPermSize=1800m 
 ~~~ 

 System memory usage of the cdm-server process:

 ~~~ 
 KiB Mem:    10266200 total,    9720060 used,     546140 free,      95836 buffers 
 KiB Swap:    2097148 total,    1870592 used,     226556 free.    1442588 cached Mem 

   PID USER        PR    NI      VIRT      RES      SHR S    %CPU %MEM       TIME+ COMMAND                                                                    
 31761 cdm         20     0 18.997g 6.710g     3084 S     0.0 68.5    42:35.69 jsvc  
 ~~~ 
  **Almost all of the Swap space is being used.**  


 ~~~ 
 $ jmap -heap 31761 

 Attaching to process ID 31761, please wait... 
 Debugger attached successfully. 
 Server compiler detected. 
 JVM version is 24.71-b01 

 using thread-local object allocation. 
 Garbage-First (G1) GC with 4 thread(s) 

 Heap Configuration: 
    MinHeapFreeRatio = 40 
    MaxHeapFreeRatio = 70 
    MaxHeapSize        = 4718592000 (4500.0MB) 
    NewSize            = 1363144 (1.2999954223632812MB) 
    MaxNewSize         = 17592186044415 MB 
    OldSize            = 5452592 (5.1999969482421875MB) 
    NewRatio           = 2 
    SurvivorRatio      = 8 
    PermSize           = 536870912 (512.0MB) 
    MaxPermSize        = 1887436800 (1800.0MB) 
    G1HeapRegionSize = 1048576 (1.0MB) 

 Heap Usage: 
 G1 Heap: 
    regions    = 4380 
    capacity = 4592762880 (4380.0MB) 
    used       = 3908545928 (3727.479866027832MB) 
    free       = 684216952 (652.520133972168MB) 
    85.10228004629754% used 
 G1 Young Generation: 
 Eden Space: 
    regions    = 602 
    capacity = 874512384 (834.0MB) 
    used       = 631242752 (602.0MB) 
    free       = 243269632 (232.0MB) 
    72.18225419664269% used 
 Survivor Space: 
    regions    = 10 
    capacity = 10485760 (10.0MB) 
    used       = 10485760 (10.0MB) 
    free       = 0 (0.0MB) 
    100.0% used 
 G1 Old Generation: 
    regions    = 3116 
    capacity = 3707764736 (3536.0MB) 
    used       = 3266817416 (3115.479866027832MB) 
    free       = 440947320 (420.52013397216797MB) 
    88.10746227454275% used 
 Perm Generation: 
    capacity = 1504706560 (1435.0MB) 
    used       = 1504138704 (1434.4584503173828MB) 
    free       = 567856 (0.5415496826171875MB) 
    99.96226134615908% used 

 59361 interned Strings occupying 5910712 bytes. 
 ~~~ 

 So in total this is ~10GB of heap capacity for this process, whereas the G1 Old Generation and the Perm Generation together are using ~5GB alone!

 ### Diagnosing with jemalloc 

 Result: 

 ![cutout of attachment:app-profiling.gif](picture743-1.png) 
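
 The allocation site `Java_java_util_zip_Inflater_init` typically points to `java.util.zip.Inflater` instances (for example wrapped inside a `GZIPInputStream`) whose native zlib buffers are only released by the finalizer because the stream is never closed, see also the evanjones.ca post linked above. A minimal Java sketch of that pattern and its fix; this is only an illustration, not code from the cdmlib code base:

 ~~~
 import java.io.ByteArrayInputStream;
 import java.io.IOException;
 import java.util.zip.GZIPInputStream;

 public class InflaterLeakDemo {

     // Leaky variant: the GZIPInputStream and its internal Inflater are never
     // closed, so the native buffer allocated in Inflater_init is only freed
     // when the finalizer eventually runs, and native memory piles up outside the heap.
     static void readLeaky(byte[] gzipped) throws IOException {
         GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped));
         while (in.read() != -1) {
             // consume the stream, but never call in.close()
         }
     }

     // Fixed variant: try-with-resources closes the stream, which calls
     // Inflater.end() and releases the native buffer immediately.
     static void readFixed(byte[] gzipped) throws IOException {
         try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
             while (in.read() != -1) {
                 // consume the stream
             }
         }
     }
 }
 ~~~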


