-
-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamical integrators #545
base: master
Are you sure you want to change the base?
Conversation
if is_diagonal_noise(prob) | ||
if prob.f isa DynamicalSDEFunction | ||
noise_rate_prototype = rate_prototype.x[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems like an odd way to put it. In other words, it's diagonal on one part of the system?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we just shouldn't classify that as diagonal noise at all, but it looks like it doesn't lead to too many required changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The noise is zero on some part of the system, so it is assumed that the given noise only applied to the other part
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the proposed change here is to allow DynamicalSDEFunction to have non diagonal noise, as the previous implementation assume the noise was diagonal.
I think the implementation looks reasonable. @frankschae could you take a look through it as well? |
I have added some correction to OBABO that actually require 2 noise processes. |
test/sde/sde_dynamical.jl
Outdated
|
||
sim1 = analyticless_test_convergence(dts,prob1,ABOBA(gamma=γ),(1/2)^10;trajectories=Int(5e2),use_noise_grid=false) | ||
display(sim1.𝒪est) | ||
@test abs(sim1.𝒪est[:weak_final]-1) < 0.3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weak final order 1 seems strange to me -- aren't these methods supposed to be second-order in weak convergence?
(Having said this, based on the paper https://arxiv.org/abs/1304.3269, it seems that the error may increase when you lower the step size (Figure 3) and second-order convergence can only be seen over very small intervals (Figure 4), so I am not too worried but maybe we should still try to reproduce the plots?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that they should be 2nd order, but the numerics give me first order. I will try to reproduce the figure 4 in the paper over this smaller step size range to see what it give.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an update for this plot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is only 200 trajectories of 10^7 timesteps so less trajectories than the paper, but results are similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And now the panel (a) and (c) of the figure 4 of the paper, I cannot go as far in stepsize since SciML detect instability at timestep starting from 0.24 and abort the computation so I went to sm
aller timestep to better see the scaling. There is less trajectories than in the paper, here 1000 of 10^7 timestep to be compared to 500 trajectories of 10^9 timestep, but the computation is quite demanding so I went for less timestep. The main features of the figure 4 are there (with BP=OBABO)
@@ -869,3 +869,16 @@ struct BAOAB{T} <: StochasticDiffEqAlgorithm | |||
scale_noise::Bool | |||
end | |||
BAOAB(;gamma=1.0, scale_noise=true) = BAOAB(gamma, scale_noise) | |||
|
|||
struct ABOBA{T} <: StochasticDiffEqAlgorithm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add a docstring that contains the SDE form:
(or to the SDE solver docs https://docs.sciml.ai/DiffEqDocs/stable/solvers/sde_solve/ given that it doesn't match our common diagonal/non-diagonal noise classification exactly?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe it needs a different problem type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already the DynamicalSDEProblem (was introduced by the previous implementation of BAOAB) that is supposed to be solved by those integrators
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I add some alg_compatible() functions to allow only DynamicalSDEProblem to be solved by Dynamical integrators?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes that would be helpful in order to gate it to the right form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realize that the DynamicalSDEProblem constructor actually return a SDEProblem so alg_comptable functions won't be useful here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it can dispatch on prob.problem_type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I think we should use DynamicalSDEProblem
and it's a good idea to use alg_compatible
to make sure that only BAOAB, ABOBA, etc. can be used to solve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it can dispatch on
prob.problem_type
Not sure to understand, SDEProblem does not have a field problem_type.
test/sde/sde_dynamical.jl
Outdated
@@ -19,6 +19,15 @@ g(u,p,t) = 1 | |||
sim1 = analyticless_test_convergence(dts,prob1,BAOAB(gamma=γ),(1/2)^10;trajectories=Int(2e2),use_noise_grid=false) | |||
display(sim1.𝒪est) | |||
@test abs(sim1.𝒪est[:weak_final]-1) < 0.3 | |||
|
|||
sim1 = analyticless_test_convergence(dts,prob1,ABOBA(gamma=γ),(1/2)^10;trajectories=Int(5e2),use_noise_grid=false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
convergence tests are failing on CI -- did you try increasing the number of trajectories? 500 seems relatively small.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did run then locally, here are the results (1e5 trajectories from 0 to 1.5 / step size = from 1/2^6 to (1/2) ^ 3 / reference 1/2^10)
ABOBA:
Dict{Any, Any} with 2 entries:
:weak_final => 1.97714
:final => 1.00363
Dict{Any, Any} with 2 entries:
:weak_final => 1.95064
:final => 1.00363
Dict{Any, Any} with 2 entries:
:weak_final => 1.35648
:final => 1.01876
Asking for more trajectories or smaller timesteps ask too much memory for my computer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did increase the number of trajectories in tests and there are now passing (but longer computation time)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Maybe we should move the weak convergence tests also to the buildkite queue (@ChrisRackauckas?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes for sure.
I think the implementation looks pretty good to go? @frankschae what do you think should be done with the problem types here? |
test/sde/sde_dynamical.jl
Outdated
display(sim1.𝒪est) | ||
@test abs(sim1.𝒪est[:weak_final]-1) < 0.3 | ||
@test abs(sim1.𝒪est[:weak_final]-1.5) < 0.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ambiguous as to whether the weak is 1 or 2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The theory say it should be 2, the figure above says it should be 1, but I can change it to be 2 and increase the allowed range to pass the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to change it to 2, leave the allowed range as it is, and mark the test as broken (using @test_broken
). Might be that it takes a lot of playing around with the dts
range to see it.
I just realized I forgot about these. Let's see what's going on with those convergence tests and get this merged. |
5f84223
to
f836738
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The perform_step
functions of BAOAB, ABOBA, and OBABO look correct to me (except for c1
and c2
not being constants if dt
is not constant).
src/caches/dynamical_caches.jl
Outdated
c2 = sqrt(1 - alg.scale_noise*c1^2) # if scale_noise == false, c2 = 1 | ||
BAOABConstantCache(k, uEltypeNoUnits(1//2), uEltypeNoUnits(c1), uEltypeNoUnits(c2)) | ||
if typeof(alg.gamma) <: AbstractMatrix | ||
c1 = exp(-alg.gamma*dt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c1
and c2
should be functions. They shouldn't be constants because they depend on dt
. So if someone uses a non-uniform time-stepping, this gets wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are chosen as constant for efficiency in the constant dt case, as it is the most common case. Should I pass them as function anyway? Knowing that I have as plan for the future to implement the case of gamma being a function of the current state, in which case c1 and c2 would be function of both dt and gamma.
The alternative is to add to the documention (to be written) "Fixed timestepping only." as in the SplitODE solvers for example (https://docs.sciml.ai/DiffEqDocs/stable/solvers/split_ode_solve/)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Writing "fixed time-stepping only" might not be sufficient because I think we typically only distinguish between adaptively chosen time steps and fixed time-stepping schemes. In particular, there could be situations where one has a concrete idea when dt
should be small and when it should be large apriori. For instance, Frank and Moritz proposed a rescaling of the time stepping of the stochastic process in https://projecteuclid.org/journals/electronic-journal-of-statistics/volume-11/issue-1/Bayesian-estimation-of-discretely-observed-multi-dimensional-diffusion-processes-using/10.1214/17-EJS1290.full conditioned on the observation times. And we allow this behavior for fixed time stepping schemes by passing tstops
only to the solve call (https://docs.sciml.ai/DiffEqDocs/stable/basics/common_solver_opts/).
It also seems relatively cheap to compute c1
and c2
-- do you have an example in mind where this matters in practice?
I think
Knowing that I have as plan for the future to implement the case of gamma being a function of the current state, in which case c1 and c2 would be function of both dt and gamma.
is another very good reason to do it in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally symplectic integrators lose the symplectic property if you adapt the time steps. Even the ODE solver ones are only fixed time step. But indeed we might as well make it safe for future work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the efficiency question: If you have a matricial gamma, if you have to do a matrix exponential and a chlolesky at each timestep you increase the computation time by a few percent.
At least on my case, the typical use case of this type of dynamics is to infers it from larger system (molecular dynamics) that are too expensive to run for long trajectories and run the inferred dynamics for very long times, so even a few percent increase of runtime could lead to extra hours of computation.
But ok, I will implement those coefficients as function, I could always add some caching behavior latter one if this becomes a performance issue. To be honest, I have never really thought to the possibility of time-stepping (losing the symplectic property is certainly an issue in my case), thanks for the reference, I will look at it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, then it might also be enough to check this and throw an error (if nobody would ever want to/should change it)? I think
- fixing it and throwing an error (or documenting it very explicitly maybe) or
- allowing to adapt the time steps
both sound good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the efficiency question: If you have a matricial gamma, if you have to do a matrix exponential and a chlolesky at each timestep you increase the computation time by a few percent.
I can see this point. But the cholesky/exponential also needs to be recomputed for fixed-time steps when gamma depends on the state, right? (so then it only hurts a few percent in the case of large matrix-valued gamma and fixed dt -- I have no intuition if that's a common use case.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That make sense, I will implement those as function, will do it this week.
src/caches/dynamical_caches.jl
Outdated
k = zero(rate_prototype.x[1]) | ||
|
||
if typeof(alg.gamma) <: AbstractMatrix | ||
c₂ = exp(-alg.gamma*dt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not label them c1 and c2 as in the other methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good points, no particular reason, I will change this. The coefficients will be c2 for the friction term and σ for the diffusion term
I change the convergence tests (and separate also the tests into convergence and interface tests). I did not update the c1 and c2 as function as @frankschae requested yet, waiting the reply to my comment |
Unexpected Pass
Expression: abs(sim1.𝒪est[:weak_final] - 2) < 0.5
Got correct result, please change to @test if no longer broken. :-D |
By the way, did you know a way to reduce the frequency of the log output?
The logs start to be very heavy when reaching 100000 iterations |
No but I think a PR would be very welcome. You could add a |
Following discussion in #259 this is a proposed implementation of the ABOBA and OBABO integrators (GJ set of integrators coming later). I have also modified the BAOAB implementation to allow matrix as friction (the gamma parameter) in the equation.
That seems to work, but I am not sure to have done everything properly (in particular to verify is gamma is a matrix or not). So let me know if there are changes to do.
Also, I would like to add the integrator to the documentation, but I am not sure of where I should propose changes.