
Simple stats in worker script #6

Open · raggleton opened this issue Jul 12, 2016 · 3 comments

@raggleton (Owner)

Would it be handy to have simple statistics printed out at the end of the worker script, e.g. time taken, memory usage, etc.? There is the .log file, which has something similar, but perhaps there is more customisability in putting it in the worker script?

@kreczko (Contributor) commented Oct 25, 2016

Would be better to have them as an additional file in JSON format so they can be processed.
I would even add entries every 5 min or so, which would allow showing a distribution.
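For illustration, each periodic entry could be a small timestamped record, so the file ends up holding a time series (field names and numbers here are made up):

```python
import json
import time

# One entry would be appended per interval; the resulting list forms a
# time series that can be aggregated or plotted later.
entry = {
    "timestamp": time.time(),
    "rss_bytes": 123456789,   # made-up value for illustration
    "cpu_percent": 87.5,      # made-up value for illustration
}
print(json.dumps(entry))
```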

@raggleton (Owner, Author) commented Oct 29, 2016

Ok, so whilst bored in a meeting I added some rudimentary ability for the worker to output process info as JSON. See the logger branch, especially condor_worker.py. This uses the psutil package, and logs info from the spawned command plus all its recursive children every 15 s. (All the recursive children are stored because often one runs a script, which runs a command, and eventually you get to the "meaty" command.)
Info that can be stored: https://pythonhosted.org/psutil/#psutil.Process
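Roughly, the collection loop works like the sketch below (a simplified illustration, not the exact condor_worker.py code; the wrapped command and output filename are made up). The psutil calls are the real API:

```python
import json
import subprocess
import time

import psutil

# Spawn the wrapped command (name is made up for this sketch).
proc = subprocess.Popen(["./my_job_script.sh"])
parent = psutil.Process(proc.pid)

snapshots = []
while proc.poll() is None:
    # Snapshot the parent and all recursive children, since the "meaty"
    # command is often buried several scripts deep.
    infos = []
    for p in [parent] + parent.children(recursive=True):
        try:
            infos.append(p.as_dict(attrs=["pid", "name", "cpu_percent",
                                          "memory_info"]))
        except psutil.NoSuchProcess:
            pass  # process exited between listing and querying
    snapshots.append({"time": time.time(), "processes": infos})
    with open("stats.json", "w") as f:
        json.dump(snapshots, f, default=str)
    time.sleep(15)  # log every 15 s
```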

An example set of JSONs can be found at /hdfs/user/ra12451/status_29_Oct_16_11*.json

There's also a rudimentary analysis/plotting script in the dashboard directory that shows a very basic usage of the JSON using the plot.ly library (pip install plotly). The example plot page is also included (analysis.html). There's a live copy here: http://raggleton.github.io/analysis.html
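As a rough sketch of the idea (assuming the JSON structure from the snippet above, not the actual dashboard script), plotting total memory against time looks something like:

```python
import json

import plotly.graph_objs as go
from plotly.offline import plot

with open("stats.json") as f:
    snapshots = json.load(f)

times = [s["time"] for s in snapshots]
# memory_info serialises as a list, with rss as its first field; sum
# over the parent and all children at each snapshot.
rss = [sum(p["memory_info"][0] for p in s["processes"])
       for s in snapshots]

fig = go.Figure(
    data=[go.Scatter(x=times, y=rss, mode="lines")],
    layout=go.Layout(title="Total RSS vs time",
                     xaxis=dict(title="Unix time [s]"),
                     yaxis=dict(title="RSS [bytes]")),
)
plot(fig, filename="analysis.html")  # writes a standalone HTML page
```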

It ain't pretty, but it's a start!

(Some more info in case you have trouble running: I installed psutil via pip install --user psutil. The pip used is the one from my own miniconda installation. My PATH=/software/ra12451/miniconda/bin:$PATH, so it would still import psutil even after I'd cmsenv'd. Dunno how that works...)

@raggleton (Owner, Author) commented Oct 29, 2016

Some other ideas/thoughts:

  • Where to stick the JSON? At the moment it ends up in /hdfs/user/$LOGNAME, but that's just for a quick test. It does have to be put on HDFS initially, though, as that's the only place the worker node can write to.
  • Is it possible to do it "live", i.e. transfer the JSON after each update so the user can plot it as the job runs? I worry that hadoop fs -copyFromLocal is too slow for this (or could cause issues with O(100) - O(1000) jobs doing this simultaneously every O(minute)). See the sketch after this list.
  • Is JSON the best format? (Probably yes for now)
  • Plotting/dashboard: how to serve it? How to handle different jobs/DAGs? Is there a quick framework to put one together? plot.ly is maybe not ideal; I used it because it was the first thing that came to mind... but there's bokeh, Google Charts, d3.js...
  • How to choose which stats to plot?
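On the "live" transfer point above, the per-update push would be something like this (a sketch only; the destination path is illustrative, and it assumes a Hadoop version where -f is available to overwrite the previous copy):

```python
import getpass
import subprocess

def push_stats_to_hdfs(local_path="stats.json"):
    """Copy the latest stats JSON to HDFS, overwriting the old copy.

    Illustrative helper, not part of the repo; the destination
    directory is made up.
    """
    hdfs_dir = "/user/%s" % getpass.getuser()
    subprocess.check_call(
        ["hadoop", "fs", "-copyFromLocal", "-f", local_path, hdfs_dir])
```

Whether O(1000) jobs doing this every minute would hammer the namenode too hard is exactly the open question.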
