Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Splash uses
resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
which have several downsides and I believe is not very suitable for this task:Looks like it can be an issue for Splash instance as well
max_rss
never goes down, so Splash instance will be shut down after one such memory usage peak despite current memory usage might be really low - which leads to server restart, makes service unavailable, clears all cache and forces all pages to be downloaded once again which is bad in terms of performance.An article about difference between
max_rss
andpsutils
can be found [here].(http://fa.bianp.net/blog/2013/different-ways-to-get-memory-consumption-or-lessons-learned-from-memory_profiler/)Splash already depends on
psutil
, so it's not a problem to use it for a memory check.First commit is intended to start discussion, it's not the code I want to merge right away. First of all because this change is to some extent backwards incompatible, and probably we can have another Splash option to enable
psutil
memory check and keepmaxrss
option for compatibility, maybe remove it later. But there's one more thing about howpsutil
measures memory. In general it would allow spikes of memory usage and would work okay, util memory check will be launched during some memory intensive process. In this casepsutil
behaviour would be the same asmax_rss
, but this PR rather aims to find better solution for memory check than simply replacemax_rss
withpsutil
. So instead of simply measuring memory usage withpsutil
I wanted to propose using moving average as a metric for memory measurement. We can collect 3-5 points of memory usage and compare to the limit. This will allow to remove some random fluctuations and shut down instance only in case when memory usage in average is higher than some give watermark.Disclaimer: proposal about moving average came to my mind after committing first commit which simply replaces
max_rss
withpsutil
, but I hope the idea is quite clear 😃