Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collect instrumentation data from correct dir. #45

Merged
merged 4 commits into from
Apr 20, 2018
Merged

Conversation

snim2
Copy link
Contributor

@snim2 snim2 commented Apr 19, 2018

This fixes a bug (Issue #43) where VM instrumentation data could not be drawn with changepoints or outliers, due to the way we were guessing the name of the directory containing the instrumentation
data.

In the new code, we look for all files or directories named *_instr_data, filter out any that are not directories, and filter out any that do not have a prefix common to the filename of the benchmarking data. i.e. if the input filename is: warmup_outliers_w200_changepoints.json.bz2 then we would accept directores such as warmup_instr_data/ or warmup_whatever_instr_data/

If there is more than one candidate directory, we issue a warning to the user and pick one at random.

Example (with outliers):
test_issue43.pdf

Fixes #43

@snim2 snim2 requested a review from vext01 April 19, 2018 14:15
@snim2
Copy link
Contributor Author

snim2 commented Apr 19, 2018

It looks like this fix interacts oddly with --xlimits, sometimes the instrumentation data isn't drawn, sometimes changepoints are missing. More commits coming up...

@snim2
Copy link
Contributor Author

snim2 commented Apr 19, 2018

./bin/plot_krun_results -o test3.pdf --with-outliers -w 150 --with-changepoint-means --xlimits 2,1500 openresty_results_outliers_w150_changepoints.json.bz2

results in:

Traceback (most recent call last):
  File "../bin/plot_krun_results", line 1643, in <module>
    inset_xlimits=options.inset_xlimits)
  File "../bin/plot_krun_results", line 361, in main
    core_cycles, cycles_ylimits, inset_xlimits)
  File "../bin/plot_krun_results", line 986, in draw_page
    p_exec_charts[index].plot_stack()
  File "../bin/plot_krun_results", line 523, in plot_stack
    self.plot_wallclock_times()
  File "../bin/plot_krun_results", line 587, in plot_wallclock_times
    self._plot_steady_equivalent(self.wallclock_axis)
  File "../bin/plot_krun_results", line 672, in _plot_steady_equivalent
    if (self.segment_means[0] + self.segment_vars[0] >= lower_bound and
AttributeError: 'ProcessExecChart' object has no attribute 'segment_vars'

@snim2
Copy link
Contributor Author

snim2 commented Apr 20, 2018

And a related bug, the steady-equivalent plots don't quite work in the insets when the user passes in --xlimits

screenshot from 2018-04-20 12-03-31

# then we expect the VM instrumentation files to be in a directory
# called `warmup_instr_data/` If there is more than one data named
# *_instr_data/ then we issue a warning to the user and chose one a
# random.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would an easier approach be to have a --instr-dir switch on the command line? Then the user can put the instrumentation data in a directory named as they wish.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible. Since @fsfod is our main user at the moment, I'm happy to go with whatever he thinks is sensible. If we do it this way though, I'll probably split the other fixes into a separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe keep it automatic with but the command line an optional override for weird situations

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you wish, but let's try to keep it as simple as possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I think we either need to do one or the other. It would be really baffling for a user who doesn't pass in the CLI option to suddenly find that we have picked up data from a directory when they didn't expect it. Any opinions @ltratt ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does sound like a case where an explicit option is the least surprising thing to do.

@snim2
Copy link
Contributor Author

snim2 commented Apr 20, 2018

OK, I have recorded the xlimits bugs in Issue #46 and a fix will be in a later PR, not this one. Commits to resolve the instr data directory issue coming up...

@snim2
Copy link
Contributor Author

snim2 commented Apr 20, 2018

OK, the latest commits work for me, we now pass in --instr-dir DIRECTORY to plot instrumentation data. Note that this doesn't fix the issue with --xlimits, which will come in a separate PR.

@snim2
Copy link
Contributor Author

snim2 commented Apr 20, 2018

Needs squashing BTW

@fsfod
Copy link
Contributor

fsfod commented Apr 20, 2018

Should bin/warmup_stats not also be changed to support --instr-dir?

@snim2
Copy link
Contributor Author

snim2 commented Apr 20, 2018

That's a good point...

@snim2
Copy link
Contributor Author

snim2 commented Apr 20, 2018

OK, this works for me, and I've fixed another small bug. Please do check before we squash though!

@fsfod
Copy link
Contributor

fsfod commented Apr 20, 2018

--instr-dir for both warmup_stats and plot_krun_results works for me.

print('No VM instrumentation data is available.')
else:
if not os.path.exists(options.instr_dir):
fatal_error('%s (VM instrumentation data directory) does not exist.')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing format argument here?

if not os.path.exists(options.instr_dir):
fatal_error('%s (VM instrumentation data directory) does not exist.')
elif not os.path.isdir(options.instr_dir):
fatal_error('%s (VM instrumentation data directory) is not a directory.')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing format argument here as well

@snim2
Copy link
Contributor Author

snim2 commented Apr 20, 2018

Ready to squash?

@fsfod
Copy link
Contributor

fsfod commented Apr 20, 2018

yes you can squash

Sarah Mount added 4 commits April 20, 2018 16:07
This sets a directory containing VM instrumentation data. A
parser for the data must be present in warmup.vm_instruments in
order for the data to be charted.
Only useful for generating plots with VM instrumentation data.
Fixes a bug where the window size of a set of benchmarks was not
stored if the user passed in a Krun JSON results file which
included outliers.
@snim2
Copy link
Contributor Author

snim2 commented Apr 20, 2018

OK, squashed!

@fsfod fsfod merged commit 56e1f3b into master Apr 20, 2018
@fsfod fsfod deleted the fix-issue-43 branch April 20, 2018 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Instrumentation data assumes no changepoints or outliers
4 participants