-
-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a fast path for evenly-spaced abscissae #198
Conversation
Linearly equispaced data coordinates: using BenchmarkTools, DataInterpolations
# Equispaced
begin
exec_times = map(3:8) do x
N = 10^x
u = rand(N)
tmax = exp(rand() * 10)
t = (collect(1:N) .- 1) ./ (N - 1) .* tmax
li = LinearInterpolation(u, t)
b = @benchmark $li(rand() * $tmax)
mean(b.times)
end
@show exec_times
end
# Log-distributed
begin
logdist_times = map(3:8) do x
N = 10^x
u = rand(N)
tmax = exp(rand() * 10)
t = (exp.((collect(1:N) .- 1) ./ (N - 1)) .- 1)
t .*= (tmax / t[end])
li = LinearInterpolation(u, t)
b = @benchmark $li(rand() * $tmax)
mean(b.times)
end
@show logdist_times
end
# Irregularly distributed
begin
randist_times = map(3:8) do x
N = 10^x
u = rand(N)
tmax = exp(rand() * 10)
tdiff = rand(N)
t = cumsum(tdiff .- first(tdiff))
t .*= (tmax / t[end])
li = LinearInterpolation(u, t)
b = @benchmark $li(rand() * $tmax)
mean(b.times)
end
@show randist_times
end
# ranges
begin
range_times = map(3:8) do x
N = 10^x
u = rand(N)
tmax = exp(rand() * 10)
t = ((1:N) .- 1) ./ (N - 1) .* tmax
li = LinearInterpolation(u, t)
b = @benchmark $li(rand() * $tmax)
mean(b.times)
end
@show range_times
end |
It's clear that the log-distributed case is quite bad for the PR with large amounts of data (or, potentially, just a smaller cache; I'm on a Ryzen 7800 X3D, which is an outlier in that sense), probably because it has to look up the coordinate array during both the bracketing and the |
that would be good. It would be nice if during construction time there was a type you can set to choose different search behaviors. That would also be a way to balance it. |
For benchmarks see: * SciML/DataInterpolations.jl#198 * SciML/DataInterpolations.jl#147 > While the cost of constructing the interpolator does not change, tracking the last index results in a best-case speedup of ~2.7x for CubicSpline, when successive values are close together (and a little higher for simpler interpolators). In the worst case (where successive values are always on opposite ends of the vectors), it can result in a ~15% slowdown due to the unhelpful expanding binary search at the beginning. However, the original approach of not tracking the index at all is also still available; it now involves essentially one extra if statement, which seems to be lost in the timing noise.
One note is that this PR should get updated to call the functionality which has moved to https://github.com/SciML/FindFirstFunctions.jl. As a way to finish it, what about the following. Calculate using Statistics
var(diff(1:6)) # 0.0
var(diff(exp.(1:6))) # 10773.646024936865 we would just need a tunable cutoff point here then, and we can flip on "fastmode" if the constructed interpolation satisfied the criteria. |
Sounds good (I was similarly thinking on using a threshold based on testing with randomly-spaced abscissae, though not based on variance). I have some doubts regarding implementation: do you think it would be best to add a field or type parameter to every interpolant cache recording that test, or do you envision some alternative? |
I think a field with a bool is fine. Runtime checking that is pretty cheap. |
Alright, will try to reopen this soon |
Revisiting this topic, I added a simple mechanism along the lines of what we discussed. In particular:
Right now I have only added this mechanism in the (Note that these benchmarks are with a different computer than the previous ones). |
@DaniGlez I like the concept, although I cannot make any assumptions about the distribution of |
Yeah, there are indeed false positives with this criterion; I felt that having minimal overhead with a cheap criterion is worth some false positives (particularly when you can manually opt-out at construction time) but it's absolutely up for debate. |
For a switchable mode, doing |
Alright, I made a new proposal, it's similar to the variance one but I take the different w.r.t. the straight line, which seems more directly relevant (you could build a contrived counterexample to straightforward |
Seems like the right path to me, though |
Ah, yeah, the flag name slipped. I'll change it later and apply it to other interpolant caches if everyone is happy with this. |
Alright, finally got back to finishing this with remaining caches and suggestions. The docstring component is maybe a bit too wordy but everything seems to be working fine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments. Can you add a few tests which demonstrate the speedup mentioned (with regular spaced t) in comments in the PR to make sure everything is working correctly?
I've incorporated your suggestions, @sathvikbhagavan. I'm a bit unsure about the benchmark, because it's a bit expensive as a test (a few seconds) compared to most of the test suite and CPU timings are non-deterministic (thus, the test might fail because of some random stuff). I also have a question regarding the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of small comments.
I had a quick question - should this not be used in Lagrange Interpolation or even if it is used, it wouldn't make much difference?
Last step, run the julia formatter.
In Lagrangian interpolation, each datapoint contribuyes a term towards the result, thus there is no need to call |
Not sure what's wrong with the format of |
Add a newline at the end of file. |
I had thought that would not be an issue since the live version https://github.com/SciML/DataInterpolations.jl/blob/master/docs/src/manual.md?plain=1 seems to not have it either, let's see now |
Amazing, thanks for seeing this through. It would be great to see if we could reuse this as well with the diffeq interpolations, but that can wait. |
Do you mean, using DataInterpolations in DiffEqBase for interpolations? If so, are there any methods missing or its just the matter of trying it out? |
I mean, for the special interpolations for the ODE solvers, there is a similar search phase to find where to interpolate, and that just uses a naive findfirst. There's now a lot of optimizations used here to find the right time more quickly, and it would be nice to get those used there as well. |
Maybe move the functionality to |
Possibly yeah, since this is really a smarter findfirst. |
yeah, abstracting this in findfirst would be better. |
Currently, all interpolation methods but Lagrangian perform a two-step procedure to determine the index of the datapoints / spline weights to use in the corresponding expression: first, a call to
bracketstrictlymontonic
to get a bracket of the viable indices and then a call tosearchsortedfirst
(respectively,searchsortedlast
) using the bracket as an initial guess¹.bracketstrictlymontonic
performs an accelerating expansion of the bracket from a size 1 integer interval to the corresponding side; however, this behavior is implicitly disabled right now, because the provided initial guess is0
which triggers a bailout path in which the function essentially does nothing.The current PR enables this behavior with a linear guess. This is a "greedy" approach that aims to provide faster performance for cases where the data is equispaced (many cases, in practice) or nearly so, with hopefully a small penalty otherwise. The first goal is clearly met by the PR (puzzingly, it's even faster than the range shortcut added in #192, not sure why really), while the second one is not, at the moment (see benchmarks in the subsequent comment).
¹ As an additional note, looking into
Base
code it seems that the bracket is indeed expected to contain the "correct" solution, which would require a stronger guarantee than the one given in thebracketstrictlymontonic
docstring (its behavior might provide it, though? not 100% sure).