[Feature] Add used memory after GC to memory reports
i0xHeX opened this issue · 3 comments
Problem:
The used memory reported by, for example, /spark health includes garbage that will be collected soon. When looking for a source of lag (or a memory leak), this figure can be misleading: at one moment usage may read around 80%, and once the GC has collected the garbage it drops to 47%:
https://i.imgur.com/MIdEkcl.png
This forces you to run the command again and again to estimate the memory usage excluding garbage.
Solution:
We can calculate the used memory excluding garbage. Let's assume we check it every tick:
- Record the current memory usage in current_mem_usage.
- If prev_mem_usage has been recorded, compare the two. If current_mem_usage < prev_mem_usage, assign no_garbage_mem_usage = current_mem_usage; otherwise do nothing.
- Assign prev_mem_usage = current_mem_usage.
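The steps above can be sketched in Java roughly as follows. This is only an illustration of the idea, not spark's code: the class PostGcMemoryTracker and its methods are hypothetical names, and reading the heap through Runtime is an assumption about where the samples would come from.

```java
import java.util.OptionalLong;

/** Hypothetical helper: remembers the usage seen right after a drop between two samples. */
final class PostGcMemoryTracker {
    private long prevMemUsage = -1;      // -1 means no sample recorded yet
    private long noGarbageMemUsage = -1; // -1 means no drop observed yet

    /** Feed one sample, e.g. once per tick. */
    void sample(long currentMemUsage) {
        // A drop compared to the previous sample is taken as a sign that the GC ran.
        if (prevMemUsage >= 0 && currentMemUsage < prevMemUsage) {
            noGarbageMemUsage = currentMemUsage;
        }
        prevMemUsage = currentMemUsage;
    }

    /** Convenience: sample the current JVM heap usage (assumed data source). */
    void sampleRuntime() {
        Runtime rt = Runtime.getRuntime();
        sample(rt.totalMemory() - rt.freeMemory());
    }

    /** Usage recorded at the last observed drop, or empty if none was seen yet. */
    OptionalLong noGarbageMemUsage() {
        return noGarbageMemUsage >= 0
                ? OptionalLong.of(noGarbageMemUsage)
                : OptionalLong.empty();
    }
}
```

One caveat with this approach: a drop between two ticks only suggests that a collection happened, and a young-generation collection may leave a lot of garbage behind, so the value is an approximation.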
In short, whenever we see that the GC has collected garbage, we update the so-called no_garbage_mem_usage to the current memory usage. This gives us a "stable" memory usage figure. I think it would be much better to use GC listeners if possible. I have never worked with them, but a quick search turned up this:
https://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/GarbageCollectionNotificationInfo.html
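For reference, a listener built on the API linked above might look roughly like the sketch below. It assumes a HotSpot-style JVM where the GC MXBeans emit notifications; the class GcListenerSketch and the field usedAfterLastGc are hypothetical names, and this is not taken from spark's source.

```java
import com.sun.management.GarbageCollectionNotificationInfo;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;
import javax.management.openmbean.CompositeData;

/** Hypothetical sketch: records memory usage reported after each GC. */
final class GcListenerSketch {
    /** Used bytes after the most recent GC notification, or -1 if none was received yet. */
    static final AtomicLong usedAfterLastGc = new AtomicLong(-1);

    /** Attaches a listener to every GC bean that emits notifications; returns how many. */
    static int install() {
        NotificationListener listener = (notification, handback) -> {
            if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION
                    .equals(notification.getType())) {
                return;
            }
            GarbageCollectionNotificationInfo info = GarbageCollectionNotificationInfo
                    .from((CompositeData) notification.getUserData());
            // Sums all memory pools in the notification; this includes
            // non-heap pools such as Metaspace, not only the heap.
            long used = 0;
            for (MemoryUsage usage : info.getGcInfo().getMemoryUsageAfterGc().values()) {
                used += usage.getUsed();
            }
            usedAfterLastGc.set(used);
        };
        int attached = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (gc instanceof NotificationEmitter) {
                ((NotificationEmitter) gc).addNotificationListener(listener, null, null);
                attached++;
            }
        }
        return attached;
    }
}
```

Compared with sampling every tick, this reacts to actual collections instead of inferring them from a drop in usage.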
Running /spark health --memory will report the memory usage at the last GC.
Also, /spark gcmonitor will report how much memory is freed on each collection (it hooks into the GC listeners you mentioned).