
Simple stats in worker script #6

Open · raggleton opened this issue Jul 12, 2016 · 3 comments

@raggleton (Owner)

Would it be handy to have simple statistics printed out at the end of the worker script, e.g. time taken, memory usage, etc.? There is the .log file, which has something similar, but perhaps there is more customisability in putting it in the worker script?

@kreczko (Contributor) commented Oct 25, 2016

Would be better to have them as an additional file in JSON format so they can be processed.
I would even add entries every 5 min or so, which would allow showing a distribution.
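For illustration, each periodic entry could be a small timestamped record, so the file ends up holding a time series (field names and numbers here are made up):

```python
import json
import time

# One entry would be appended per interval; the resulting list forms a
# time series that can be aggregated or plotted later.
entry = {
    "timestamp": time.time(),
    "rss_bytes": 123456789,   # made-up value for illustration
    "cpu_percent": 87.5,      # made-up value for illustration
}
print(json.dumps(entry))
```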

@raggleton (Owner, Author) commented Oct 29, 2016

Ok, so whilst bored in a meeting I added some rudimentary ability for the worker to output process info as JSON. See the logger branch, especially condor_worker.py. This uses the psutil package, and logs info from the spawned command plus all its recursive children every 15 s. (All the recursive children are stored because often one runs a script, which runs a command, and eventually you get to the "meaty" command.)
Info that can be stored: https://pythonhosted.org/psutil/#psutil.Process
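Roughly, the collection loop works like the sketch below (a simplified illustration, not the exact condor_worker.py code; the wrapped command and output filename are made up). The psutil calls are the real API:

```python
import json
import subprocess
import time

import psutil

# Spawn the wrapped command (name is made up for this sketch).
proc = subprocess.Popen(["./my_job_script.sh"])
parent = psutil.Process(proc.pid)

snapshots = []
while proc.poll() is None:
    # Snapshot the parent and all recursive children, since the "meaty"
    # command is often buried several scripts deep.
    infos = []
    for p in [parent] + parent.children(recursive=True):
        try:
            infos.append(p.as_dict(attrs=["pid", "name", "cpu_percent",
                                          "memory_info"]))
        except psutil.NoSuchProcess:
            pass  # process exited between listing and querying
    snapshots.append({"time": time.time(), "processes": infos})
    with open("stats.json", "w") as f:
        json.dump(snapshots, f, default=str)
    time.sleep(15)  # log every 15 s
```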

An example set of JSONs can be found at /hdfs/user/ra12451/status_29_Oct_16_11*.json

There's also a rudimentary analysis/plotting script in the dashboard directory that shows a very basic usage of the JSON using the plot.ly library (pip install plotly). The example plot page is also included (analysis.html). There's a live copy here: http://raggleton.github.io/analysis.html
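As a rough sketch of the idea (assuming the JSON structure from the snippet above, not the actual dashboard script), plotting total memory against time looks something like:

```python
import json

import plotly.graph_objs as go
from plotly.offline import plot

with open("stats.json") as f:
    snapshots = json.load(f)

times = [s["time"] for s in snapshots]
# memory_info serialises as a list, with rss as its first field; sum
# over the parent and all children at each snapshot.
rss = [sum(p["memory_info"][0] for p in s["processes"])
       for s in snapshots]

fig = go.Figure(
    data=[go.Scatter(x=times, y=rss, mode="lines")],
    layout=go.Layout(title="Total RSS vs time",
                     xaxis=dict(title="Unix time [s]"),
                     yaxis=dict(title="RSS [bytes]")),
)
plot(fig, filename="analysis.html")  # writes a standalone HTML page
```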

It ain't pretty, but it's a start!

(Some more info in case you have trouble running: I installed psutil via pip install --user psutil. The pip used is the one from my own miniconda installation. My PATH=/software/ra12451/miniconda/bin:$PATH, so it would still import psutil even after I'd cmsenv'd. Dunno how that works...)

@raggleton (Owner, Author) commented Oct 29, 2016

Some other ideas/thoughts:

  • Where to stick the JSON? At the moment it ends up in /hdfs/user/$LOGNAME, but that's just for a quick test. It does have to be put on HDFS initially, though, as that's the only place the worker node can write to.
  • Is it possible to do it "live", i.e. transfer the JSON after each update so the user can plot it as the job runs? I worry that hadoop fs -copyFromLocal is too slow for this (or could cause issues with O(100) - O(1000) jobs doing this simultaneously every O(minute)). See the sketch after this list.
  • Is JSON the best format? (Probably yes for now)
  • Plotting/dashboard: how to serve it? How to handle different jobs/DAGs? Is there a quick framework to put one together? plot.ly is maybe not ideal; I used it because it was the first thing that came to mind... but there's bokeh, Google Charts, d3.js...
  • How to choose which stats to plot?
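On the "live" transfer point above, the per-update push would be something like this (a sketch only; the destination path is illustrative, and it assumes a Hadoop version where -f is available to overwrite the previous copy):

```python
import getpass
import subprocess

def push_stats_to_hdfs(local_path="stats.json"):
    """Copy the latest stats JSON to HDFS, overwriting the old copy.

    Illustrative helper, not part of the repo; the destination
    directory is made up.
    """
    hdfs_dir = "/user/%s" % getpass.getuser()
    subprocess.check_call(
        ["hadoop", "fs", "-copyFromLocal", "-f", local_path, hdfs_dir])
```

Whether O(1000) jobs doing this every minute would hammer the namenode too hard is exactly the open question.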
