- Suppose you’re working on doing a heat simulation as seen in this example.
- You tweak some parameters and suddenly…
[Show graphics]
Ugh, so frustrating.
Start printing out NaNs… this is very time-consuming.
After a while, you find a place where the NaNs are coming from. You fix the problem, but the output still doesn’t look quite right…
You find a place where a NaN disappears…
You wonder… maybe that’s not the only problem with exceptional values in your code?
Our tool does these things!
Before we go further, let me say that this story played out in the “real world” with some real researchers…
… show quotes …
Now that I’ve hopefully got you all excited about what we’ve done, let’s talk about what we’ve done. To start, let’s refresh what floating-point exceptions are. Then we’ll dive deep into some instances where we’ve used our tool. Finally, we’ll talk about how we implemented this.
- Floating-point is an approximation of the reals—the gaps introduce error
- Exceptional values are useful annoyances
- Helpful to see when something has gone wrong, but annoying because they ruin our calculations
- Can be mitigated by tweaking our arithmetic (if someone hasn’t done this before, I think this will be eye-opening)
- It can be hard to track down where these things come from—that’s what we help with
- gen, prop, kill, with one or two examples
Use the max example to illustrate simultaneously how a NaN kill can be dangerous and how FloatTracker helps us find the problem.
- CFL parameter to control simulation speed
- Tracking down where NaNs come from can help us stabilize simulations without sacrificing simulation speed
- NaN does not lead to a crash
- We’re lucky the output looks so bad—kills can quietly disrupt a program
- Walk through CSTGs that we get
Sometimes we want to know how the instability evolves over time. We are still exploring different slicing schemes, but if we slice the graphs, we can diff them!
- Started out by fuzzing NBodySimulator; turns out root cause was in OrdinaryDiffEq!
- We have some fine-grained tools to help you fuzz the places you’re interested in
- We keep a recording of when and how we inject
- Here’s the kill—fun fact, I started looking at a different kill in the logs until I looked at the CSTG!
- The smoking gun: the outer loop goes until the list of
tstops
is empty, and the inner loop pulls items out of there, but that inner loop never runs!
- Create our own types; all instances of
TrackedFloat
AbstractTrackedFloat <: AbstractFloat
- We do this for each of the built-in operators; type dispatch takes care of the rest!
- How we characterize events
- Log events: we give you some tools to control what and how much gets logged
Remember: CSTG is a separate tool.