New method to get profiler data as python object #13

Open
wants to merge 5 commits into base: master

Conversation


@nima nima commented Jul 28, 2016

Hey!

I needed a way to access the profiler data programmatically. To do this I've implemented the `get_stats` method; any other changes are there to reduce code duplication.

The next steps I'd like to work on would be:

  1. Rework `annotate` to use `get_stats`
  2. Add any missing `callgrind` data to the object returned by `get_stats`, and rework the `callgrind` method to also use `get_stats`

To see how this new method is used, please check out the `pydata` branch. In IPython, you can basically `ls`, `cat`, and `grep` the files that the profiler picked up; there's a lot of room for awesome features, but I've only been working on what I need most. Here's a quick demo using a sample file called `sample.py`:

import time
import os
import pdb

def _sleep(t):
    time.sleep(t / 1000000.0)
    return t

def aaa(a):
    return a

def bbb(b):
    return _sleep(b)

def ccc(c):
    if not os.path.exists('/proc/%d' % c): c ^= c
    return c

print('test')

def foo(a, b, c):
    r = 0

    for i in range(a):
        for j in range(b):
            for k in range(c):
                r += sum([
                    aaa(a), bbb(b),
                    ccc(c)
                ])

    return r

def bar(i=3):
    return foo(i, 20, 100)

And now, the profiler:

In [31]: import sample
In [33]: fn = (sample.bar, 8)
In [34]: s = pprofiler.Profiler.profile('test@alpha', *fn) # profile sample.bar(8)
In [35]: p = pprofiler.Profiler(s)

# Utility function for color-printing python objects as JSON
In [39]: from pprofile.pprofiler import jprint
In [40]: jprint(p.overhead())
{
  "actual": 18.111714124679565,
  "overhead": 0.0001728534698486328,
  "total": 18.111886978149414
}

# List files, ordered by cumulative time; optionally takes a number of seconds to match or exceed (to reduce output to the worst-case files)
In [41]: p.ls()
  0 |   17.3 | /tmp/sample.py
  1 |    0.8 | /opt/lib/python2.7/genericpath.py
  2 |    0.0 | /tmp/pprofile/pprofiler.py

# Concatenate the file (using its path, or its index according to p.ls); show time/hit score and line numbers.
In [42]: p.cat(0)
  0.0/0         | 0001 | import time
  0.0/0         | 0002 | import os
  0.0/0         | 0003 | import pdb
  0.0/0         | 0004 |
  0.1/16000     | 0005 | def _sleep(t):
 14.2/16000     | 0006 |     time.sleep(t / 1000000.0)
  0.2/16000     | 0007 |     return t
  0.0/0         | 0008 |
  0.1/16000     | 0009 | def aaa(a):
  0.1/16000     | 0010 |     return a
  0.0/0         | 0011 |
  0.1/16000     | 0012 | def bbb(b):
  0.4/16000     | 0013 |     return _sleep(b)
  0.0/0         | 0014 |
  0.1/16000     | 0015 | def ccc(c):
  0.5/16000     | 0016 |     if not os.path.exists('/proc/%d' % c): c ^= c
  0.2/16000     | 0017 |     return c
  0.0/0         | 0018 |
  0.0/0         | 0019 | print('test')
  0.0/0         | 0020 |
  0.0/1         | 0021 | def foo(a, b, c):
  0.0/1         | 0022 |     r = 0
  0.0/0         | 0023 |
  0.0/9         | 0024 |     for i in range(a):
  0.0/168       | 0025 |         for j in range(b):
  0.2/16160     | 0026 |             for k in range(c):
  0.1/16000     | 0027 |                 r += sum([
  0.6/16000     | 0028 |                     aaa(a), bbb(b),
  0.4/16000     | 0029 |                     ccc(c)
  0.0/0         | 0030 |                 ])
  0.0/0         | 0031 |
  0.0/1         | 0032 |     return r
  0.0/0         | 0033 |
  0.0/1         | 0034 | def bar(i=3):
  0.0/1         | 0035 |     return foo(i, 20, 100)

# Grep for a string in all files known to the profiler
In [43]: p.grep('def ccc')
# file[0]: /tmp/sample.py
  0.1/16000     | 0015 | def ccc(c):

# Get a list of the worst lines of code, exceeding 0.5s here, and optionally limit the search by providing a filter expression (not demonstrated here)
In [54]: jprint(p.get_filtered_profile(0.5))
[
  {
    "address": "/tmp/sample.py:6",
    "datum": {
      "duration": 17.271899461746216,
      "file_name": "/tmp/sample.py",
      "line": {
        "block_id": [
          "/tmp/sample.py",
          "_sleep",
          5
        ],
        "duration": 14.248581409454346,
        "hits": 16000,
        "line": "    time.sleep(t / 1000000.0)",
        "line_no": 6,
        "line_profile": []
      }
    }
  },
  {
    "address": "/tmp/sample.py:28",
    "datum": {
      "duration": 17.271899461746216,
      "file_name": "/tmp/sample.py",
      "line": {
        "block_id": [
          "/tmp/sample.py",
          "foo",
          21
        ],
        "duration": 0.6394095420837402,
        "hits": 16000,
        "line": "                    aaa(a), bbb(b),",
        "line_no": 28,
        "line_profile": [
          {
            "callee_file": "/tmp/sample.py",
            "callee_line": 12,
            "callee_name": "bbb",
            "duration": 15.001068115234375,
            "hits": 16000
          },
          {
            "callee_file": "/tmp/sample.py",
            "callee_line": 9,
            "callee_name": "aaa",
            "duration": 0.20499634742736816,
            "hits": 16000
          }
        ]
      }
    }
  }
]

pprofile.py Outdated
@@ -883,14 +962,37 @@ def runpath(path, argv, filename=None, threads=True, verbose=False):
"""
_run(threads, verbose, 'runpath', filename, path, argv)

_allsep = os.sep + (os.altsep or '')
Owner

The advantage of defining it at module level is to compute it once on module import rather than once per call.

Author

Yes, what you say is true in the general case. In the specific case here, however, given that this code is executed at most once per execution of pprofile (correct me if I'm wrong), shouldn't readability/style take higher precedence?

Basically unless a profiler tells me something is significantly faster in the less good-looking style, I opt for the good-looking style :)

Owner

Tastes... :)

Another aspect is that this changes a piece of code which does not have to change for the new feature, like the s/close/close_fn/ change in main. This reduces (admittedly by a trivial amount here) patch readability, and later, VCS history readability. I'm not opposed to style patches (although they can quickly degenerate into style wars), but I do not like patches which touch both functionality and style outside of that functionality.

Author

Well, I can't argue with that; 👍

@vpelletier (Owner)

I commented on minor, main-feature-independent issues in the code.

On the main feature, I am a bit worried about the impact on forward compatibility: I wonder how dicts of lists of dicts (etc) work as an API.

For example, `line_profile['block_id']` has at least 3 formats:

  • `(module, None, None)`
  • `(module, None, '')`
  • `(module, func, lineno)`

The 3rd item can take 3 types depending on some internal details. Caller code will look like a 3-branch "if", and if it does not, it is likely to fail when profiling another source code. Is the caller supposed to treat this as a black box? How do we document this? How do we prevent naïve introspection of this value?
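
For illustration, the 3-branch dispatch such caller code would need might look like the following sketch. Only the three tuple formats come from this PR; `describe_block` itself is a hypothetical name, not part of pprofile's API:

```python
# Hypothetical caller code for the three block_id formats described above;
# describe_block is illustrative, not pprofile API.
def describe_block(block_id):
    module, func, third = block_id
    if func is None and third is None:
        return 'module %s' % module                        # (module, None, None)
    elif func is None and third == '':
        return 'module %s (toplevel)' % module             # (module, None, '')
    else:
        return '%s:%s at line %s' % (module, func, third)  # (module, func, lineno)

print(describe_block(('/tmp/sample.py', '_sleep', 5)))
# /tmp/sample.py:_sleep at line 5
```

A caller that forgets one of the branches only finds out when it profiles a source file that happens to produce that format, which is exactly the forward-compatibility worry raised above.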

From a quick look at the pprofiler.py file in your branch, maybe this is what should get exposed instead: it would subclass the profiling classes and access the needed internal details directly (_FileTiming instances). Maybe it could be named ipprofile.py to express the interactive-usage intention (which softens the constraints on forward compatibility).

@nima (Author)

nima commented Jul 29, 2016

On block_id: this wasn't a successful venture and I'm going to remove it. I was trying to provide the user with an abstract view of the code, so that they can run queries like "what are the slowest blocks in the code?" rather than simply "what are the slowest lines?". I will revert this change and push, but I would be really interested to hear your thoughts: can we do this in a deterministic way? Rather than treating the source code as a pile of lines, can we dually profile it as a collection of objects, methods, and functions, and construct a tree? That would be super-cool!

Other than the block_id thing, however, I did try to make the return object consistent; it should be consistent enough that we could convert the output to JSON and validate it against a JSON schema. In fact, that would be a great unit test to add; I can add a JSON schema (http://json-schema.org/documentation.html) for it if you agree.
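
As a sketch of what such a test could check, here is a hand-rolled shape validation (a stand-in for a real JSON Schema library, to keep it dependency-free). The field names mirror the `get_filtered_profile` sample output shown earlier in this PR; `validate_entry` itself is a hypothetical helper:

```python
# Minimal shape check for one get_filtered_profile() entry; a real unit
# test could use a proper JSON Schema instead. validate_entry is illustrative.
REQUIRED_DATUM_KEYS = {'duration', 'file_name', 'line'}
REQUIRED_LINE_KEYS = {'duration', 'hits', 'line', 'line_no', 'line_profile'}

def validate_entry(entry):
    if set(entry) != {'address', 'datum'}:
        return False
    datum = entry['datum']
    if not REQUIRED_DATUM_KEYS <= set(datum):
        return False
    line = datum['line']
    return (REQUIRED_LINE_KEYS <= set(line)
            and isinstance(line['line_profile'], list))

sample = {
    'address': '/tmp/sample.py:6',
    'datum': {
        'duration': 17.27,
        'file_name': '/tmp/sample.py',
        'line': {
            'duration': 14.25,
            'hits': 16000,
            'line': '    time.sleep(t / 1000000.0)',
            'line_no': 6,
            'line_profile': [],
        },
    },
}
print(validate_entry(sample))  # True
```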

So my goal here was to make get_stats() the source of the profile data, then rewrite functions such as annotate() and callgrind() to call this function for their input data.

Does that sound like a bad idea?

It would be nice for the user not to have to subclass this, but simply to be able to call get_stats() and do what they want with that data, the same way annotate() and callgrind() would. In fact, those could conceptually be split out the same way a user would be expected to use get_stats().

I'm open to changing the actual format of the object returned by get_stats(), moving towards a standard structure: either dict/list format, or an actual object per entity (but that may slow things down). By entity I mean a line, module, function, etc.

What are your thoughts?

@vpelletier (Owner)

vpelletier commented Jul 30, 2016

I was trying to provide the user with an abstract view of the code

I would tend to do this using extra run-time introspection, like what I did in zpprofile.py when encountering specific functions (object database and relational database queries), just generalised to all functions: from the current callable's details, tell whether it is bound to an instance, maybe of what class, etc. Then it opens the possibility of tracking not only hot classes but also hot instances.
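
A generic version of that introspection could start from the standard `inspect` module. This is only a sketch of the idea; `describe_callable` and `Query` are illustrative names, not pprofile or zpprofile code:

```python
import inspect

# Tell whether a callable is bound to an instance, and of what class;
# illustrative only, not part of pprofile.
def describe_callable(fn):
    if inspect.ismethod(fn):  # bound method: __self__ is the instance
        return 'bound to %s instance' % type(fn.__self__).__name__
    return 'plain callable'

class Query(object):
    def run(self):
        pass

print(describe_callable(Query().run))        # bound to Query instance
print(describe_callable(describe_callable))  # plain callable
```

Keeping a reference to `fn.__self__` (or a weak reference to it) is what would let a profiler aggregate per-instance rather than only per-class.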

So my goal here was to make get_stats() the source of the profile data, then rewrite functions such as annotate() and callgrind() to call this function for their input data.

On the avoiding-code-rot level, it is a good idea.

But as I said, I think it would be better to expose the interactive-usage-friendly layer directly rather than an extra level of internal state. It is easy to defend breaking the API in the former case (by future evolutions, and I prefer not to presume what evolutions will do to the internal data structures) because the caller is a human on a command line: he can quickly see that something changed and adapt. But it is hard to defend breaking the API in the latter case, because Python structures are just too tempting to use in another program (integrating pprofile), and once the caller is a program, nothing auto-corrects anymore.

So I would rather:

  • not expose data in a new python structure (if, and that's still a big "if" to me, it should be done, maybe _FileTiming would be a better format to expose, as it allows more room for backward-compatibility code than having to later fake being a dict/list by overloading __getitem__)
  • as a consequence, keep the callgrind & annotate output generators as-is
  • add classes in a new module similar to zpprofile.py, which would contain interactive-friendly commands and would be free to access _FileTiming instances and all pprofile internals themselves: if the internal data structure changes, that code can follow easily, as it is in the same repository (easy to find) and released at the same time

nima added 5 commits August 2, 2016 16:36
Quoting `vpelletier`:

> Well, in the case of qcachegrind, this is specifically for the
> callgrind output, so it made sense to be local to that method.
Will reserve style-changes for another branch to eliminate unnecessary
noise in this branch's diff:

 - Reverted `allsep`
 - Reverted `close` -> `close_fn` (style-change)
 - Reverted `x` -> `path` (style-change)
The `relative_path` kwarg is now mandatory, so the user has to choose
(since no default makes sense or is otherwise intuitive).

Also updated the docstring.
Quoting `vpelletier`:

> This test will prevent any output in statistic mode, as no time is
> tracked but only hits. I think it can be dropped and contained block
> be un-indented without any other change.