talk_draft_2.org


Intro: the busy scientist

Finch (as an example)

  • Suppose you’re running a heat simulation, as in this example.
  • You tweak some parameters and suddenly…

[Show graphics]

Ugh, so frustrating.

You start printing out values to hunt for the NaNs… this is very time-consuming.

After a while, you find a place where the NaNs are coming from. You fix the problem, but the output still doesn’t look quite right…

You find a place where a NaN disappears…

You wonder… maybe that’s not the only problem with exceptional values in your code?

Our tool finds these things for you: where NaNs come from and where they disappear!

RxInfer: You too can be a wizard

Before we go further, let me say that this story played out in the “real world” with some real researchers…

… show quotes …

Now that I’ve hopefully got you excited, let’s talk about what we’ve done. First, a refresher on what floating-point exceptions are. Then we’ll dive deep into some cases where we’ve used our tool. Finally, we’ll talk about how we implemented it.

Floating-point review

Brief refresher for mostly Julia-focused people

  • Floating-point is an approximation of the reals—the gaps introduce error
  • Exceptional values are useful annoyances
    • Helpful to see when something has gone wrong, but annoying because they ruin our calculations
    • Can be mitigated by tweaking our arithmetic (if someone hasn’t done this before, I think this will be eye-opening)
  • It can be hard to track down where these things come from—that’s what we help with
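A small sketch to have on hand for the refresher, using only plain Julia arithmetic. The variable names are made up for illustration; the last part shows one possible way to "tweak the arithmetic" so that an overflow never happens.

```julia
# Gaps between neighboring Float64 values introduce rounding error.
rounding_gap = eps(1.0)            # distance from 1.0 to the next Float64
sum_is_exact = (0.1 + 0.2 == 0.3)  # false: each literal is already rounded

# Exceptional values appear when a result has no finite representation.
overflowed = 1e308 * 10.0          # Inf (overflow)
invalid = overflowed - overflowed  # NaN (Inf - Inf is undefined)

# Tweaking the arithmetic: the naive norm overflows, a rearranged one does not.
x = 1e200
naive_norm = sqrt(x^2 + x^2)       # x^2 overflows to Inf, so the result is Inf
scaled_norm = x * sqrt(2.0)        # same value mathematically, stays finite

println((rounding_gap, sum_is_exact, overflowed, invalid, naive_norm, scaled_norm))
```

The point for the audience: the exceptional value tells us something went wrong, and a small rearrangement can avoid it altogether.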

How we characterize floating-point events

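  • gen (an operation creates an exceptional value), prop (an exceptional value passes through an operation), kill (an exceptional value goes in but does not come out), with one or two examples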
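One way to make the three event kinds concrete is to compare whether NaN shows up in an operation's inputs and in its output. The `classify` helper below is hypothetical, written only for this slide; it is not FloatTracker's API.

```julia
# Hypothetical helper (not part of FloatTracker): label a single operation
# by comparing NaN-in-the-inputs with NaN-in-the-output.
function classify(f, args...)
    nan_in = any(isnan, args)
    result = f(args...)
    nan_out = result isa AbstractFloat && isnan(result)
    if !nan_in && nan_out
        return :gen    # NaN created from non-NaN inputs
    elseif nan_in && nan_out
        return :prop   # NaN passed through
    elseif nan_in && !nan_out
        return :kill   # NaN went in, something else came out
    else
        return :none
    end
end

println(classify(-, Inf, Inf))   # gen
println(classify(+, NaN, 1.0))   # prop
println(classify(<, NaN, 1.0))   # kill: the comparison returns false
```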

FloatTracker introduction

Live demo of adding FloatTracker to the maximum example

Use the max example to illustrate simultaneously how a NaN kill can be dangerous and how FloatTracker helps us find the problem.
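A possible starting point for the demo, assuming the maximum example is a hand-written loop along these lines (`naive_maximum` is an illustrative name, not code from any package):

```julia
# Illustrative hand-written maximum; every comparison against NaN is false,
# so whether the NaN survives depends on where it sits in the input.
function naive_maximum(xs)
    best = xs[1]
    for x in xs
        if x > best
            best = x
        end
    end
    return best
end

println(naive_maximum([1.0, NaN, 3.0]))  # 3.0: the NaN is silently killed
println(naive_maximum([NaN, 1.0, 3.0]))  # NaN: best starts as NaN and is never replaced
println(maximum([1.0, NaN, 3.0]))        # NaN: Base.maximum propagates it
```

The same data in a different order gives a different answer, which is what makes the kill dangerous.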

ShallowWaters

Setup

  • CFL parameter to control simulation speed
  • Tracking down where NaNs come from can help us stabilize simulations without sacrificing simulation speed
  • NaN does not lead to a crash
    • We’re lucky the output looks so bad—kills can quietly disrupt a program
  • Walk through CSTGs that we get

Sometimes we want to know how the instability evolves over time. We are still exploring different slicing schemes, but if we slice the graphs, we can diff them!

OrdinaryDiffEq: Fuzzing

  • Started out by fuzzing NBodySimulator; turns out the root cause was in OrdinaryDiffEq!
  • We have some fine-grained tools to help you fuzz the places you’re interested in
  • We keep a recording of when and how we inject
  • Here’s the kill—fun fact, I started looking at a different kill in the logs until I looked at the CSTG!
  • The smoking gun: the outer loop goes until the list of tstops is empty, and the inner loop pulls items out of there, but that inner loop never runs!
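The loop shape described in the last bullet can be sketched as follows. This is a simplified, hypothetical reconstruction for the slide, not the actual OrdinaryDiffEq code: `drain_tstops`, its arguments, and the `max_outer` safety cap are all invented here so the example terminates.

```julia
# Hypothetical sketch of the loop structure; not OrdinaryDiffEq's source.
function drain_tstops(dt, tstops; max_outer = 1_000)
    stops = sort(tstops)   # sort returns a new vector, the input is untouched
    t = 0.0
    outer = 0
    # Outer loop: keep going until the list of stops is empty.
    while !isempty(stops)
        outer += 1
        # Safety cap added for the demo; without it a NaN hangs forever.
        outer > max_outer && return (finished = false, outer = max_outer)
        t += dt
        # Inner loop: pull out every stop we have reached.
        # If t is NaN, this comparison is always false and nothing is removed.
        while !isempty(stops) && first(stops) <= t
            popfirst!(stops)
        end
    end
    return (finished = true, outer = outer)
end

println(drain_tstops(0.25, [0.5, 1.0]))  # finishes after 4 outer iterations
println(drain_tstops(NaN, [0.5, 1.0]))   # never empties the list, hits the cap
```

No NaN ever reaches the output here; the only symptom is a loop that does not end.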

Making FloatTracker work

Intercept floating-point exceptions

  • Create our own types; all instances of TrackedFloat
  • AbstractTrackedFloat <: AbstractFloat
  • We overload each of the built-in operators for these types; type dispatch takes care of the rest!
  • How we characterize events
  • Log events: we give you some tools to control what and how much gets logged
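A miniature of the interception idea, as a sketch only. `MiniTracked`, `EVENT_LOG`, and `record` are hypothetical names made up for this slide; FloatTracker's real types, operator coverage, and logging are more involved.

```julia
# Hypothetical miniature; FloatTracker's real types and logging differ.
struct MiniTracked <: AbstractFloat
    value::Float64
end

const EVENT_LOG = Tuple{Symbol,Symbol}[]   # (event kind, operator name)

# Classify an operation by NaN-in versus NaN-out and log anything interesting.
function record(op::Symbol, nan_in::Bool, nan_out::Bool)
    if !nan_in && nan_out
        push!(EVENT_LOG, (:gen, op))
    elseif nan_in && nan_out
        push!(EVENT_LOG, (:prop, op))
    elseif nan_in && !nan_out
        push!(EVENT_LOG, (:kill, op))
    end
    return nothing
end

# Overload several arithmetic operators in one loop with @eval.
for op in (:+, :-, :*, :/)
    @eval function Base.$op(a::MiniTracked, b::MiniTracked)
        result = $op(a.value, b.value)
        record($(QuoteNode(op)), isnan(a.value) || isnan(b.value), isnan(result))
        return MiniTracked(result)
    end
end

# Comparisons return a Bool, so a NaN input can only be killed here.
function Base.:<(a::MiniTracked, b::MiniTracked)
    record(:<, isnan(a.value) || isnan(b.value), false)
    return a.value < b.value
end

a = MiniTracked(Inf)
b = a - a                   # gen: Inf - Inf
c = b + MiniTracked(1.0)    # prop: NaN + 1.0
d = c < MiniTracked(2.0)    # kill: NaN < 2.0 is false
println(EVENT_LOG)
```

Code written against `AbstractFloat` accepts the wrapper unchanged, which is why dispatch does most of the work.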

Summarize with CSTG

Remember: CSTG is a separate tool.

Fuzzing: Inject NaNs

Using metaprogramming

Conclusion

Acknowledgments