Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamical integrators #545

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

HadrienNU
Copy link

Following discussion in #259 this is a proposed implementation of the ABOBA and OBABO integrators (GJ set of integrators coming later). I have also modified the BAOAB implementation to allow matrix as friction (the gamma parameter) in the equation.

That seems to work, but I am not sure to have done everything properly (in particular to verify is gamma is a matrix or not). So let me know if there are changes to do.

Also, I would like to add the integrator to the documentation, but I am not sure of where I should propose changes.

Comment on lines +214 to +217
if is_diagonal_noise(prob)
if prob.f isa DynamicalSDEFunction
noise_rate_prototype = rate_prototype.x[1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like an odd way to put it. In other words, it's diagonal on one part of the system?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we just shouldn't classify that as diagonal noise at all, but it looks like it doesn't lead to too many required changes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The noise is zero on some part of the system, so it is assumed that the given noise only applied to the other part

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the proposed change here is to allow DynamicalSDEFunction to have non diagonal noise, as the previous implementation assume the noise was diagonal.

@ChrisRackauckas
Copy link
Member

I think the implementation looks reasonable. @frankschae could you take a look through it as well?

@HadrienNU
Copy link
Author

I have added some correction to OBABO that actually require 2 noise processes.


sim1 = analyticless_test_convergence(dts,prob1,ABOBA(gamma=γ),(1/2)^10;trajectories=Int(5e2),use_noise_grid=false)
display(sim1.𝒪est)
@test abs(sim1.𝒪est[:weak_final]-1) < 0.3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weak final order 1 seems strange to me -- aren't these methods supposed to be second-order in weak convergence?

(Having said this, based on the paper https://arxiv.org/abs/1304.3269, it seems that the error may increase when you lower the step size (Figure 3) and second-order convergence can only be seen over very small intervals (Figure 4), so I am not too worried but maybe we should still try to reproduce the plots?)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that they should be 2nd order, but the numerics give me first order. I will try to reproduce the figure 4 in the paper over this smaller step size range to see what it give.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an update for this plot?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fig3
Here is already the figure 3 of the paper (OBABO here correspond to BP there). Figure 4 is running, I should get it by tomorrow

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is only 200 trajectories of 10^7 timesteps so less trajectories than the paper, but results are similar

Copy link
Author

@HadrienNU HadrienNU Oct 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fig4
And now the panel (a) and (c) of the figure 4 of the paper, I cannot go as far in stepsize since SciML detect instability at timestep starting from 0.24 and abort the computation so I went to sm
aller timestep to better see the scaling. There is less trajectories than in the paper, here 1000 of 10^7 timestep to be compared to 500 trajectories of 10^9 timestep, but the computation is quite demanding so I went for less timestep. The main features of the figure 4 are there (with BP=OBABO)

@@ -869,3 +869,16 @@ struct BAOAB{T} <: StochasticDiffEqAlgorithm
scale_noise::Bool
end
BAOAB(;gamma=1.0, scale_noise=true) = BAOAB(gamma, scale_noise)

struct ABOBA{T} <: StochasticDiffEqAlgorithm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a docstring that contains the SDE form:
Screen Shot 2023-09-19 at 11 44 02 AM

(or to the SDE solver docs https://docs.sciml.ai/DiffEqDocs/stable/solvers/sde_solve/ given that it doesn't match our common diagonal/non-diagonal noise classification exactly?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe it needs a different problem type?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already the DynamicalSDEProblem (was introduced by the previous implementation of BAOAB) that is supposed to be solved by those integrators

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I add some alg_compatible() functions to allow only DynamicalSDEProblem to be solved by Dynamical integrators?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes that would be helpful in order to gate it to the right form.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realize that the DynamicalSDEProblem constructor actually return a SDEProblem so alg_comptable functions won't be useful here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it can dispatch on prob.problem_type

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I think we should use DynamicalSDEProblem and it's a good idea to use alg_compatible to make sure that only BAOAB, ABOBA, etc. can be used to solve it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it can dispatch on prob.problem_type

Not sure to understand, SDEProblem does not have a field problem_type.

@@ -19,6 +19,15 @@ g(u,p,t) = 1
sim1 = analyticless_test_convergence(dts,prob1,BAOAB(gamma=γ),(1/2)^10;trajectories=Int(2e2),use_noise_grid=false)
display(sim1.𝒪est)
@test abs(sim1.𝒪est[:weak_final]-1) < 0.3

sim1 = analyticless_test_convergence(dts,prob1,ABOBA(gamma=γ),(1/2)^10;trajectories=Int(5e2),use_noise_grid=false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convergence tests are failing on CI -- did you try increasing the number of trajectories? 500 seems relatively small.

Copy link
Author

@HadrienNU HadrienNU Oct 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did run then locally, here are the results (1e5 trajectories from 0 to 1.5 / step size = from 1/2^6 to (1/2) ^ 3 / reference 1/2^10)
ABOBA:
convergenceABOBA

Dict{Any, Any} with 2 entries:
  :weak_final => 1.97714
  :final      => 1.00363

BAOAB:
convergenceBAOAB

Dict{Any, Any} with 2 entries:
  :weak_final => 1.95064
  :final      => 1.00363

OBABO:
convergenceOBABO

Dict{Any, Any} with 2 entries:
  :weak_final => 1.35648
  :final      => 1.01876

Asking for more trajectories or smaller timesteps ask too much memory for my computer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did increase the number of trajectories in tests and there are now passing (but longer computation time)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Maybe we should move the weak convergence tests also to the buildkite queue (@ChrisRackauckas?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes for sure.

@ChrisRackauckas
Copy link
Member

I think the implementation looks pretty good to go? @frankschae what do you think should be done with the problem types here?

display(sim1.𝒪est)
@test abs(sim1.𝒪est[:weak_final]-1) < 0.3
@test abs(sim1.𝒪est[:weak_final]-1.5) < 0.5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ambiguous as to whether the weak is 1 or 2?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The theory say it should be 2, the figure above says it should be 1, but I can change it to be 2 and increase the allowed range to pass the test

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to change it to 2, leave the allowed range as it is, and mark the test as broken (using @test_broken). Might be that it takes a lot of playing around with the dts range to see it.

@ChrisRackauckas
Copy link
Member

I just realized I forgot about these. Let's see what's going on with those convergence tests and get this merged.

Copy link
Member

@frankschae frankschae left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The perform_step functions of BAOAB, ABOBA, and OBABO look correct to me (except for c1 and c2 not being constants if dt is not constant).

c2 = sqrt(1 - alg.scale_noise*c1^2) # if scale_noise == false, c2 = 1
BAOABConstantCache(k, uEltypeNoUnits(1//2), uEltypeNoUnits(c1), uEltypeNoUnits(c2))
if typeof(alg.gamma) <: AbstractMatrix
c1 = exp(-alg.gamma*dt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c1 and c2 should be functions. They shouldn't be constants because they depend on dt. So if someone uses a non-uniform time-stepping, this gets wrong.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are chosen as constant for efficiency in the constant dt case, as it is the most common case. Should I pass them as function anyway? Knowing that I have as plan for the future to implement the case of gamma being a function of the current state, in which case c1 and c2 would be function of both dt and gamma.
The alternative is to add to the documention (to be written) "Fixed timestepping only." as in the SplitODE solvers for example (https://docs.sciml.ai/DiffEqDocs/stable/solvers/split_ode_solve/)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing "fixed time-stepping only" might not be sufficient because I think we typically only distinguish between adaptively chosen time steps and fixed time-stepping schemes. In particular, there could be situations where one has a concrete idea when dt should be small and when it should be large apriori. For instance, Frank and Moritz proposed a rescaling of the time stepping of the stochastic process in https://projecteuclid.org/journals/electronic-journal-of-statistics/volume-11/issue-1/Bayesian-estimation-of-discretely-observed-multi-dimensional-diffusion-processes-using/10.1214/17-EJS1290.full conditioned on the observation times. And we allow this behavior for fixed time stepping schemes by passing tstops only to the solve call (https://docs.sciml.ai/DiffEqDocs/stable/basics/common_solver_opts/).

It also seems relatively cheap to compute c1 and c2 -- do you have an example in mind where this matters in practice?

I think

Knowing that I have as plan for the future to implement the case of gamma being a function of the current state, in which case c1 and c2 would be function of both dt and gamma.

is another very good reason to do it in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally symplectic integrators lose the symplectic property if you adapt the time steps. Even the ODE solver ones are only fixed time step. But indeed we might as well make it safe for future work.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the efficiency question: If you have a matricial gamma, if you have to do a matrix exponential and a chlolesky at each timestep you increase the computation time by a few percent.

At least on my case, the typical use case of this type of dynamics is to infers it from larger system (molecular dynamics) that are too expensive to run for long trajectories and run the inferred dynamics for very long times, so even a few percent increase of runtime could lead to extra hours of computation.

But ok, I will implement those coefficients as function, I could always add some caching behavior latter one if this becomes a performance issue. To be honest, I have never really thought to the possibility of time-stepping (losing the symplectic property is certainly an issue in my case), thanks for the reference, I will look at it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, then it might also be enough to check this and throw an error (if nobody would ever want to/should change it)? I think

  • fixing it and throwing an error (or documenting it very explicitly maybe) or
  • allowing to adapt the time steps
    both sound good to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the efficiency question: If you have a matricial gamma, if you have to do a matrix exponential and a chlolesky at each timestep you increase the computation time by a few percent.

I can see this point. But the cholesky/exponential also needs to be recomputed for fixed-time steps when gamma depends on the state, right? (so then it only hurts a few percent in the case of large matrix-valued gamma and fixed dt -- I have no intuition if that's a common use case.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That make sense, I will implement those as function, will do it this week.

k = zero(rate_prototype.x[1])

if typeof(alg.gamma) <: AbstractMatrix
c₂ = exp(-alg.gamma*dt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not label them c1 and c2 as in the other methods?

Copy link
Author

@HadrienNU HadrienNU Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points, no particular reason, I will change this. The coefficients will be c2 for the friction term and σ for the diffusion term

src/caches/dynamical_caches.jl Outdated Show resolved Hide resolved
test/sde/sde_dynamical.jl Outdated Show resolved Hide resolved
src/algorithms.jl Show resolved Hide resolved
@HadrienNU
Copy link
Author

I change the convergence tests (and separate also the tests into convergence and interface tests).

I did not update the c1 and c2 as function as @frankschae requested yet, waiting the reply to my comment

@frankschae
Copy link
Member

Unexpected Pass
 Expression: abs(sim1.𝒪est[:weak_final] - 2) < 0.5
 Got correct result, please change to @test if no longer broken.

:-D

@HadrienNU
Copy link
Author

By the way, did you know a way to reduce the frequency of the log output?

[ Info: Monte Carlo iteration: 90088/100000

The logs start to be very heavy when reaching 100000 iterations

@frankschae
Copy link
Member

No but I think a PR would be very welcome. You could add a verbose kwarg to

https://github.com/SciML/DiffEqDevTools.jl/blob/5a4439f33ddcc5c1ca561cad125f77e01015af72/src/convergence.jl#L121

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants