This repository has been archived by the owner on May 14, 2020. It is now read-only.

ODE/DAE solver interface #5

Closed · 1 of 4 tasks
mauro3 opened this issue Oct 31, 2016 · 155 comments

@mauro3
Contributor

mauro3 commented Oct 31, 2016

I think it would be great if all the ODE/DAE solvers could be called through the same interface. This would essentially mirror the approach of https://github.com/JuliaOpt/MathProgBase.jl . Would it be worth trying to hammer out this interface in this road-map issue?

Edit: By "same interface" I don't mean that ODE and DAE solvers have the same interface. But that all ODE solvers are wrapped to conform to the same interface (similarly all DAE, and maybe all IMEX, etc). So, Sundials.jl, ODE.jl, etc. would then depend on ODEProgBase (or probably DiffEqBase) and extend some solve function there, taking as input some problem specification type. Likewise all higher level Diff-Eq stuff (currently DifferentialEquations.jl and IVPTestSuite.jl) would program against this interface.

TODO

Specify:

  • solve signature: solve(p::Problem, ::Type{Alg}; kwargs...)
  • Problem types
  • Solution types
  • default options
  • for callback options see Callback Interface #10
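
A rough sketch of the shape such an interface could take (everything below is hypothetical, just to fix ideas):

# Hypothetical sketch of a shared interface package; all names are placeholders.
abstract DEProblem
abstract AbstractODESolution

immutable ODEProblem{F,U} <: DEProblem
    f::F   # du/dt = f(t,u)
    u0::U  # initial condition
end

# Each solver package would then extend `solve` for its own algorithm type:
# solve(p::ODEProblem, ::Type{SomeAlg}; kwargs...) -> AbstractODESolution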
@ChrisRackauckas
Member

ChrisRackauckas commented Oct 31, 2016

I am not sure how you'd do this or if it's possible. DAE solvers just require more information: (t,u,du,out) or whatever you want to name them. There's no reason for the fourth argument for an ODE solver, but the fourth argument is what the DAE solver is "zeroing". You can't make the DAE interface purely out-of-place either, because Sundials requires the in-place style (we wrap the simple interface so that f! = (t,u,du,out) -> (out[:] = f(t,u,du)), but then the benefits of in-place updates aren't available). Or we could have ODE functions take 4 arguments and ignore 1: I don't see that as a good idea because it would be slightly confusing.
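
A minimal illustration of the signature difference (just the two in-place forms being contrasted, not a proposed API):

# An in-place ODE function fills `du`:
function f_ode!(t, u, du)
    du[1] = -u[1]            # du/dt = -u
end

# An in-place DAE residual fills `out` with the quantity being "zeroed":
function f_dae!(t, u, du, out)
    out[1] = du[1] + u[1]    # residual of du/dt + u = 0
end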

You could make them all f(t,u,du) and have the DAE solver be a call overloaded type which writes the output in-place to f.out, and use that to build the function for Sundials. However, using that would require users to write the function as both in-place and as a call overloaded type, which is really asking a lot unless the main option is essentially a DSL with macros. This would also put heavy limitations on the usage: you wouldn't be able to use something like ForwardDiff because the restrictions on the type parameters for f.out (which would have to be strict for performance) would be too strict to allow DualNumbers (this is why ParameterizedFunctions.jl ends up generating extra "parameter explicit" functions to allow for explicit parameter derivatives in sensitivity analysis).
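
The call-overloaded variant described above would look roughly like this (a sketch; DAEFunction is a hypothetical name):

# Wrap an out-of-place DAE residual so it presents the three-argument form,
# writing the residual in-place to a pre-allocated `out` field.
type DAEFunction{F,O}
    f::F
    out::O
end
(df::DAEFunction)(t, u, du) = (df.out[:] = df.f(t, u, du); nothing)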

@mauro3 changed the title from "ODE/DAE solver wrapping" to "ODE/DAE solver interface" on Oct 31, 2016
@mauro3
Contributor Author

mauro3 commented Oct 31, 2016

Sorry, I should have been clearer; I updated the above post.

@ChrisRackauckas
Member

There is already a unified interface in DifferentialEquations.jl. The only difference is that the problem object needs to be specified slightly differently (the function specification difference noted above, and DAEs need an initial condition for the derivative). You can compare these two pages (note the DAE docs are a little bit behind here because of the modularization changes):

http://juliadiffeq.github.io/DifferentialEquations.jl/latest/tutorials/ode_example.html
http://juliadiffeq.github.io/DifferentialEquations.jl/latest/tutorials/dae_example.html

All of the other analysis and plotting functions work the same (though I don't have the overloaded interpolations on the DAEs because I need to make a special output for Sundials for that) since this is all handled (in the newest modularization change) in DiffEqBase. The same interface/API and functionality is then provided to SDEs, PDEs, etc., and will be provided to DDEs by the end of next month. I think I've shown that it's a good and extendable solution, though it's likely in need of dev docs (which I'm adding soon now that the modularization split happened).

If someone wants to take a different approach, that's fine. I don't know what exactly you would gain (or what a different approach would look like) but offshoots of experimental approaches aren't harmful.

@mauro3
Contributor Author

mauro3 commented Oct 31, 2016

Yes, I know, the interface of DifferentialEquations.jl resides in https://github.com/JuliaDiffEq/DiffEqBase.jl. Here I'm wondering whether it would make sense to declare it, or some variation thereof, as the "community standard". Packages like Sundials.jl, OrdinaryDiffEq.jl, ODEInterface.jl, ODE.jl, etc. could then implement this interface (or make a wrapper package) and this would open up their use elsewhere. Also, making this a "standard" would make them more accessible from outside the DifferentialEquations.jl universe. For me as a user this would be useful: I'd depend on DiffEqBase.jl and could call into all solvers. I know that currently I can do this with DifferentialEquations.jl, but usually I prefer hooking into a lower level package.

Then, on a personal note, for me as a contributor, this would also help to alleviate my feeling that DifferentialEquations.jl is steamrolling all ODE related stuff, that any contributions of mine would be greeted by an "I'll wrap this in DifferentialEquations.jl" and just end up being an insignificantly tiny cog in its machinery. If I could instead program against such an interface, I would probably feel a bit more independent and motivated to contribute (whilst it would still be immediately usable to DifferentialEquations.jl). Of course, I realize that my personal feelings should not be an indication of how to take JuliaDiffEq forward, even more so as the number and quality of my contributions will stay modest at best. In the end, the best & best-maintained software should and will be the "winner".

Last, maybe it's still too early to do this. DifferentialEquations.jl is young, so is the iterator interface in ODE.jl. Thus it may not be clear yet what a useful interface would be.

@ChrisRackauckas
Member

Here I'm wondering whether it would make sense to declare it, or some variation thereof, as the "community standard".

The thing is, I don't know what right I, or anyone else, would have to declare a "community standard". What I can do is build everything I do off of this standard, document it, and personally recommend it. I am at least building it as a standard type hierarchy and interface for everything that I am doing. If there are ways to make this a more useful interface in a broader sense, let's discuss it.

On that note Sundials.jl is undergoing an API change. I was trying to keep it "as Sundials-y" as possible (i.e. not introduce problem/solution types) but am realizing that to take full advantage of Sundials I am going to want at least a special solution type. To handle this I was going to write an alternative Sundials API using DiffEqBase.jl things, and use this through OrdinaryDiffEq.jl so as to not affect the current Sundials API too much.

Then, on a personal note, for me as a contributor, this would also help to alleviate my feeling that DifferentialEquations.jl is steamrolling all ode related stuff, that any contributions of mine would be greeted by a "I'll wrap this in DifferentialEquations.jl" and just end up being an insignificantly tiny cog in its machinery.

I hope that a small contribution isn't seen as insignificant. If it's the right algorithm for the right problem then it's a significant contribution for that purpose. I plan on making sure Julia has (optimized) implementations available in DifferentialEquations.jl for every algorithm that I know of or can find, so in that sense any algorithm is a small but meaningful contribution.

The thing I really need to think about is how to deal with existing implementations which I think are not optimized and may need to be re-written. For example, the GSoC project is written in a variable-coefficient format instead of a Nystrom format, which is known to not be as well-optimized (and it uses a lot of mutable data structures in the inner loops). There's a double-edged sword here in that I am going to be implementing a Nystrom version of ABM methods in order to have a well-optimized version available, but yes, it would be in some sense stepping on another's toes (and the change would pretty much require a re-design of the current ABM anyways). I do see how this could build a culture against contributions and an "ehh, just let Chris do it" attitude, which is not good for me or the community because, even though I have been pretty productive, this has a huge scope and I do need help.

Last, maybe it's still too early to do this. DifferentialEquations.jl is young, so is the iterator interface in ODE.jl. Thus it may not be clear yet what a useful interface would be.

Indeed, that's why I'm taking a kind of brute-force tactic of "build it all and see what sticks". However, maybe we can hammer out a type-structure which works for all of us. Maybe this should then move over to DiffEqBase.jl, but here's some thoughts on that:

  • It may be better in many cases to define an interface over abstract types. For example, the plot recipes work on AbstractODESolution types; AbstractSDESolution <: AbstractODESolution and both contain the fields t and timeseries, so they plot the same. A clear definition of the interface would allow people to write tools over them, or get all of this stuff "for free".
  • The problem types I use are pretty simple, see this. Maybe that could go over to DiffEqBase.jl with some modifications. Add places for Jacobians and Hessians? I think that the Jacobians and Hessians are properties of the function, and should thus be interfaced as a dispatch on the function itself like in ParameterizedFunctions.jl. Because we are using such a powerful language, we should also allow for symbolically calculated inverse/pre-factorized Jacobians and Hessians, which ParameterizedFunctions.jl again already does; so I think it's another case where an interface through overloading the function makes it easier for people to use this without the extra machinery if it doesn't exist.
  • The Solution objects definitely have to be tailored to the solver output. Think about how for Sundials the mem type has to be stored and all interpolations need to be built for that, which is completely different from native algorithm dense output. Another case where this crops up is that with Hermite interpolations you want to store the derivative at each timepoint, whereas for higher order interpolations you want a vector of derivatives at evaluation points for each timepoint, so both what is stored and the algorithm on it are very different. What we could do is just define an abstract interface here, where for an AbstractODESolution, dense==true means that the object has its call overloaded with the interpolating function, that is, sol(t) is the continuous solution at time t. This is already pretty much standardized, and other interfaces can be documented like that. Then, even if the implementations are very different for solutions of different algorithms, any algorithm written on an AbstractODESolution (like parameter estimation or sensitivity analysis) knows to just use sol(t), no matter the algorithm.

And the real biggie:

  • ode_solve.jl in OrdinaryDiffEq.jl is a mess. I am not sure using alg=:DP5 and what not for choosing the algorithm was the right idea. If this was instead based off of types and dispatch (with a smart tolerance and specifier-based default dispatch using a plan_ode method, or just a dispatch from a DefaultODEAlgorithm), then conditional dependencies wouldn't be necessary, and one could just add new algorithms to OrdinaryDiffEq.jl in their own separate package by adding a new dispatch (you can already do this, but you would currently need to add a lot of stuff to internal dictionaries). If fixing up this algorithm selection mechanism would help bring in contributors, then this will get bumped up the priority list.

All in all, the modularization changes were really to help pull in the possibility for others to contribute. DifferentialEquations.jl was undergoing some pretty large changes daily which would have made it a little bit tough to contribute, but with it being modular now, at least some of the packages during some timeframes will be more stable (I'll actually be more productive now that this is transitioning from being a side project I work on at night. Julia and JuliaDiffEq are now part of my research instead of a distraction, and so I can easily double my amount of development time towards this stuff now. So it's more so that certain packages will be stable since I am concentrating more time elsewhere, mostly with stochastic ODEs/PDEs and inverse problems. I never intended to work on ODEs, but I just needed something which was fast enough for the PDEs I was solving). I hope the stability makes it easier for others to contribute, and I hope to have a load of dev docs explaining a lot of what I've done.

@mauro3
Contributor Author

mauro3 commented Nov 1, 2016

Here I'm wondering whether it would make sense to declare it, or some variation thereof, as the "community standard".

The thing is, I don't know what right I, or anyone else, would have to declare a "community standard". What I can do is build everything I do off of this standard, document it, and personally recommend it. I am at least building it as a standard type hierarchy and interface for everything that I am doing. If there are ways to make this a more useful interface in a broader sense, let's discuss it.

I think that is fine as a community standard. I guess one thing is that the interface ought to be stable-ish, which it sounds like you're heading towards.

On that note Sundials.jl is undergoing an API change. I was trying to keep it "as Sundials-y" as possible (i.e. not introduce problem/solution types) but am realizing that to take full advantage of Sundials I am going to want at least a special solution type. To handle this I was going to write an alternative Sundials API using DiffEqBase.jl things, and use this through OrdinaryDiffEq.jl so as to not affect the current Sundials API too much.

Yes, it would be just fine for solvers to have their own interface too. Although, one indication of a good standard interface would be that most packages eventually adopt it as their only interface.

The thing I really need to think about is how to deal with existing implementations which I think are not optimized and may need to be re-written. For example, the GSoC project is written in a variable-coefficient format instead of a Nystrom format, which is known to not be as well-optimized (and it uses a lot of mutable data structures in the inner loops). There's a double-edged sword here in that I am going to be implementing a Nystrom version of ABM methods in order to have a well-optimized version available, but yes, it would be in some sense stepping on another's toes (and the change would pretty much require a re-design of the current ABM anyways). I do see how this could build a culture against contributions and an "ehh, just let Chris do it" attitude, which is not good for me or the community because, even though I have been pretty productive, this has a huge scope and I do need help.

That is fine, no worries. So all the ODE solvers you're writing would go into DiffEqODE.jl (or DAE), right?

Last, maybe it's still too early to do this. DifferentialEquations.jl is young, so is the iterator interface in ODE.jl. Thus it may not be clear yet what a useful interface would be.

Indeed, that's why I'm taking a kind of brute-force tactic of "build it all and see what sticks". However, maybe we can hammer out a type-structure which works for all of us. Maybe this should then move over to DiffEqBase.jl, but here's some thoughts on that:

  • It may be better in many cases to define an interface over abstract types. For example, the plot recipes work on AbstractODESolution types; AbstractSDESolution <: AbstractODESolution and both contain the fields t and timeseries, so they plot the same. A clear definition of the interface would allow people to write tools over them, or get all of this stuff "for free".

Yes, that sounds right. Maybe it would make sense to have a standard type which has one field which can be used freely:

type ODESolution{T,Y,O}
    t::T # times
    y::Y # solution
    ...
    other::O # put whatever else is needed for a particular solver in-here
end
  • The problem types I use are pretty simple, see this. Maybe that could go over to DiffEqBase.jl with some modifications. Add places for Jacobians and Hessians? I think that the Jacobians and Hessians are properties of the function, and should thus be interfaced as a dispatch on the function itself like in ParameterizedFunctions.jl. Because we are using such a powerful language, we should also allow for symbolically calculated inverse/pre-factorized Jacobians and Hessians, which ParameterizedFunctions.jl again already does; so I think it's another case where an interface through overloading the function makes it easier for people to use this without the extra machinery if it doesn't exist.

I like simple. Julia's powers make it easy to come up with too complicated solutions. Having said this, I'll need to check ParameterizedFunctions.jl again as that did seem quite complicated to me.

  • The Solution objects definitely have to be tailored to the solver output. Think about how for Sundials the mem type has to be stored and all interpolations need to be built for that, which is completely different from native algorithm dense output. Another case where this crops up is that with Hermite interpolations you want to store the derivative at each timepoint, whereas for higher order interpolations you want a vector of derivatives at evaluation points for each timepoint, so both what is stored and the algorithm on it are very different. What we could do is just define an abstract interface here, where for an AbstractODESolution, dense==true means that the object has its call overloaded with the interpolating function, that is, sol(t) is the continuous solution at time t. This is already pretty much standardized, and other interfaces can be documented like that. Then, even if the implementations are very different for solutions of different algorithms, any algorithm written on an AbstractODESolution (like parameter estimation or sensitivity analysis) knows to just use sol(t), no matter the algorithm.

In all types there needs to be room for custom fields. Using the call sol(t) to get the solution at time t looks good. Maybe use [] for it though, to mirror Interpolations.jl?

And the real biggie:

  • ode_solve.jl in OrdinaryDiffEq.jl is a mess. I am not sure using alg=:DP5 and what not for choosing the algorithm was the right idea. If this was instead based off of types and dispatch (with a smart tolerance and specifier-based default dispatch using a plan_ode method, or just a dispatch from a DefaultODEAlgorithm), then conditional dependencies wouldn't be necessary, and one could just add new algorithms to OrdinaryDiffEq.jl in their own separate package by adding a new dispatch (you can already do this, but you would currently need to add a lot of stuff to internal dictionaries). If fixing up this algorithm selection mechanism would help bring in contributors, then this will get bumped up the priority list.

We were/are looking at this over in PR49 and we do dispatch on the solver-type because of the concern you expressed above. Our strategy is the following: a problem is a pair: (IVP-specification, integrator), which corresponds to your (ODEProblem, ODEIntegrator). In our case the types are called (IVP, AbstractSolver) and wrapped in a Problem type. (Note, a bit confusing: what you call "Problem" is our IVP-specs only, whereas we call the pair IVP+integrator the "Problem"). See here.

The thing we noted then is that the integrator instance (<:AbstractSolver) can only be made once the IVP is specified, as its type parameters depend on the parameters of the IVP (say time is in Float32). Thus to create an instance of a Solver you need to pass it the IVP-instance: Solver(ivp; opts...), and thus Problem(ivp, Solver(ivp; opts...)). (And the same goes for the structure holding the options, as its type parameters also depend on the IVP-parameters and solver parameters. This also goes in the direction of addressing your plan_ode concern.)

Thus an integrator needs to provide the constructor Solver(ivp; opts...), and then it can be used in PR49. A simple example is here and a more complicated one is here.

TLDR Maybe a suitable interface would thus be:

abstract DEProblem
abstract AbstractIntegrator

solve{I<:AbstractIntegrator}(de::DEProblem, ::Type{I}; options...) = solve(de, I(de; options...))

If for some reason a package cannot implement the I(de; options...) constructor, then it can just overload solve directly: solve(de::DEProblem, ::Type{MyIntegrator}; options...) = ...

We should provide standard subtypes of DEProblem for say ODE, DAE, etc. But if a solver requires a special subtype of DEProblem then it can define one.

Not sure whether we should try to standardize the options too. In PR49 we try this: https://github.com/JuliaODE/ODE.jl/blob/master/src/options.jl.

Also, it may be good to have several levels to the interface. The above would be the top level, which would make most functionality of downstream packages available. But then if something extra is needed, say sensitivity analysis requires some extra bits, then a deeper layer of the interface also needs implementing to unlock that feature.

I never intended to work on ODEs, but I just needed something which was fast enough for the PDEs I was solving.

Same here ;-)

@ChrisRackauckas
Member

I think that is fine as a community standard. I guess one thing is that the interface ought to be stable-ish, which it sounds like you're heading towards.

Indeed, I am aiming for this to be stable and pretty much untouched. In order to get there, it's pretty bare and mostly includes abstract types with no concrete types. I need to power through the DDEs, some FDM implementations, and usage of explicit Jacobians/Hessians to really have the Problem types nailed down. It's easy to think it's set up alright for these things, but there's probably something that I missed.

So all the ODE solvers you're writing would go into DiffEqODE.jl (or DAE), right?

Yes, all directly into the ode_integrators.jl file. There's a top level above it, ode_solve.jl, which chooses to call the algorithms out of there, or interfaces with Sundials, ODEInterface, ODE. ode_integrators.jl has a bunch of associated tools so that writing algorithms is fast and easy (and they end up with top-notch performance), but because of that there is quite a lot of complexity in there. But given this structure, that's why I am fine with having solvers in different packages because I can just bring them all together anyways as long as the API is "close enough". However, given the DSL in ode_integrators.jl, maybe I should just dev doc that more because you can essentially copy/paste algorithms out of a paper to get it running in there, so the only reason why others wouldn't use it is because it's not documented enough.

Yes, that sounds right. Maybe it would make sense to have a standard type which has one field which can be used freely

At that point, though, we'd only be adding the field because we want an interface. I think it should just go the route of a documented interface like the rest of Julia.

As for the use of y, I don't think you're going to get me to adopt that bad idea. If you use y, you run into a bunch of problems:

  • You have to drop the compatibility of the basic parts of the interface between most PDEs and ODEs. If you have an interface with t as time and y as the solution, y doesn't make sense for a PDE solution since it also has a spatial component. There's a reason why PDE books use u.
  • y doesn't even make sense for many of the test ODEs, like Lorenz and Pleiades. Anything physical which has a 2D location has y as a variable, and so you already have name clashing.
  • y(t) isn't even standard amongst ODE books. x(t) is another standard, or for books which will do PDEs later, many just do u(t) from the beginning.

I like simple. Julia's powers make it easy to come up with too complicated solutions. Having said this, I'll need to check ParameterizedFunctions.jl again as that did seem quite complicated to me.

# Assumes a parameter type such as: type LotkaVolterra; a::Float64; b::Float64 end
(p::LotkaVolterra)(t,u,du) = begin
    du[1] = p.a * u[1] - p.b * u[1]*u[2]
    du[2] = -3 * u[2] + u[1]*u[2]
end
function (p::LotkaVolterra)(t,u,du,J,::Type{Val{:Jac}})
    J[1,1] = p.a - p.b * u[2]
    J[1,2] = -(p.b) * u[1]
    J[2,1] = 1 * u[2]
    J[2,2] = -3 + u[1]
    nothing
end

It's just an interface on call overloading, which is now a standard and widely used feature in Julia. Most users shouldn't have to use it directly anyways because the macro is powerful enough (or if it's not, make it more powerful).

In all types there needs to be room for custom fields. Using the call sol(t) to get the solution at time t looks good. Maybe use [] for it though, to mirror Interpolations.jl?

No, [] clashes with the array interface. sol[i] is the ith numerical solution. This makes the solution object work in the array interface and thus makes it easy to plug into already duck-typed algorithms. sol(t) evaluates based off of time. If you make getindex dispatch off of floats as times and ints as solution indices, then sol[2.0] != sol[2], which is bad behavior. You can't guarantee that everyone will want continuous output either, because it takes more memory and for some equations/problems a continuous output is neither easy nor always possible (always have SPDEs in mind, or even better, RPDDAEs).
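
A toy sketch of the distinction, with linear interpolation standing in for whatever dense output a given algorithm actually provides (all names hypothetical):

abstract AbstractODESolution

immutable ToyODESolution{T,U} <: AbstractODESolution
    t::Vector{T}           # saved times (assumed sorted, length >= 2)
    timeseries::Vector{U}  # saved values
end

# Array interface: sol[i] is the ith saved step.
Base.getindex(sol::ToyODESolution, i::Int) = sol.timeseries[i]
Base.length(sol::ToyODESolution) = length(sol.timeseries)

# Call interface: sol(t) is the continuous solution at time t.
function (sol::ToyODESolution)(tq)
    i = clamp(searchsortedlast(sol.t, tq), 1, length(sol.t) - 1)
    θ = (tq - sol.t[i]) / (sol.t[i+1] - sol.t[i])
    (1 - θ) * sol.timeseries[i] + θ * sol.timeseries[i+1]
end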

Note, a bit confusing: what you call "Problem" is our IVP-specs only, whereas we call the pair IVP+integrator the "Problem"

That is highly confusing, because when I tell people about an ODE problem, it's always independent of numerical methods. There's a semantic issue with having the two together: the integrator is how you solved it, not the specification of the mathematical problem (it is part of the computational problem specification I guess). Optim.jl's dispatch method is better than both of ours since it lets you choose the algorithm in the solve command (which the alg=:DP5 already lets you do), but then does the choice based off of non-dynamic dispatch (which the PR49 method does). It's the same as solve(de::DEProblem, ::Type{MyIntegrator}; options...) = ....

We should provide standard subtypes of DEProblem for say ODE, DAE, etc. But if a solver requires a special subtype of DEProblem then it can define one.

Trying. DDEs/explicit Jacobian usage needs to get hammered out before those are stable.

Not sure whether we should try to standardize the options too.

Yes, but that set of options is far too restrictive. It doesn't even allow control over the true options available in ODE.jl, which is a small subset of anything OrdinaryDiffEq or ODEInterface does. This is the set of options I've needed so far: https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/solve/ode_constants.jl#L53 . It doesn't include norm choice because a devectorized and non-allocating form for that needs to be hammered out. Output controls, verbosity (progressbar) control, stabilization controls, and callbacks are all part of standard ODE interfaces; they are not present in the ODE.jl list of options but are essential for usage. It gets complicated because I have to alias between the naming differences of the different packages (you can see all the aliases on that page).

Also, it maybe good to have several levels to the interface. Above would be the top-level which would make most functionality of down-stream packages available. But then if something extra is needed, say sensitivity analysis requires some extra bits, then a deeper layer of the interface also needs implementing to unlock that feature.

Of course. Call overloading makes this seamless on the problem specification side. Surprisingly, it's the solution handling which is much more complex, but that's a developer detail, not a user detail, so it just needs docs.

@mauro3
Contributor Author

mauro3 commented Nov 1, 2016

It's easy to think it's set up alright for these things, but there's probably something that I missed.

Yep, code always gets better with 20-20 hindsight.

So all the ODE solvers you're writing would go into DiffEqODE.jl (or DAE), right?

Yes, all directly into the ode_integrators.jl file. There's a top level above it, ode_solve.jl, which chooses to call the algorithms out of there, or interfaces with Sundials, ODEInterface, ODE. ode_integrators.jl has a bunch of associated tools so that writing algorithms is fast and easy (and they end up with top-notch performance), but because of that there is quite a lot of complexity in there. But given this structure, that's why I am fine with having solvers in different packages because I can just bring them all together anyways as long as the API is "close enough". However, given the DSL in ode_integrators.jl, maybe I should just dev doc that more because you can essentially copy/paste algorithms out of a paper to get it running in there, so the only reason why others wouldn't use it is because it's not documented enough.

Maybe I'm missing something, but how about a slightly different organization: DiffEqBase.jl has everything in it to make a callable, standard wrapper/interface for a solver package using it. So, for instance, Sundials.jl would get an additional file where that interface is implemented (as opposed to in OrdinaryDiffEq.jl). For packages outside of JuliaDiffEq we could have a wrapper package implementing the interface (or submit a PR to the package in question). OrdinaryDiffEq.jl would then be the place for the solvers you coded, and not where all the wrappers live too. Example:

A package needing an ODE integrator could then only depend on DiffEqBase:

module SuperDuperModel
using DiffEqBase

...

run_it{I<:AbstractIntegrator}(model::SuperDuperType, odesolver::Type{I}) = solve(model.eq, odesolver)

end

and then a user of that package could pick their solver of choice to use with it:

using SuperDuperModel
import Sundials
model = ...
run_it(model, Sundials.Ida)

I think this decoupling would make the whole thing more modular. As an added bonus it would also clear up my personal discomfort mentioned above.

Yes, that sounds right. Maybe it would make sense to have a standard type which has one field which can be used freely

At that point, though, we'd only be adding the field because we want an interface. I think it should just go the route of a documented interface like the rest of Julia.

Sounds good.

As with the use of y, I don't think you're going to get me to adopt that bad idea. If you use y, you have to do a bunch of things:

This is just about naming, using y vs u, right? I'm easy, but preferably consistent. Looks like you use u, t, and (internally) Δt for the step size. I'll go with that in PR49 too.

Note, a bit confusing: what you call "Problem" is our IVP-specs only, whereas we call the pair IVP+integrator the "Problem"

That is highly confusing, because when I tell people about an ODE problem, it's always independent of numerical methods. There's a semantic issue with having the two together: the integrator is how you solved it, not the specification of the mathematical problem (it is part of the computational problem specification I guess).

I'm not advocating any particular names here, my above statement was just to not get you confused. But yes, we thought of the "Problem" as all that was needed to actually produce an output, i.e. maths+numerics.

Optim.jl's dispatch method is better than both of ours since it lets you choose the algorithm in the solve command (which the alg=:DP5 already lets you do), but then does the choice based off of non-dynamic dispatch (which the PR49 method does). It's the same as solve(de::DEProblem, ::Type{MyIntegrator}; options...) = ....

I guess the difference is that Optim uses an instance and PR49 uses a type for dispatch. The reason we went for the latter is that the type parameters of the created instance could be set to match those of the IVP-specs.

Not sure whether we should try to standardize the options too.

Yes, but that set of options is far too restrictive.

Yes, it would only make sense to standardize the common options to keep them consistent between solvers, e.g. abstol, reltol... But many/most solvers will have their specialized options too.

@ChrisRackauckas
Member

Maybe I'm missing something, but how about a slightly different organization: DiffEqBase.jl has everything in it to make a callable, standard wrapper/interface for a solver package using it. So, for instance, Sundials.jl would get an additional file where that interface is implemented (as opposed to in OrdinaryDiffEq.jl). For packages outside of JuliaDiffEq we could have a wrapper package implementing the interface (or submit a PR to the package in question). OrdinaryDiffEq.jl would then be the place for the solvers you coded, and not where all the wrappers live too.

Yes, using dispatch will make this better and work almost automatically. Then it will make OrdinaryDiffEq.jl simpler, but also get rid of the conditional dependencies that it uses.

This is just about naming, using y vs u, right? I'm easy, but preferably consistent. Looks like you use u, t, and (internally) Δt for the step size. I'll go with that in PR49 too.

Yeah, just naming. t and u. \Delta t was a bad idea which bit me when I had to use a cluster which only supported ASCII. After that experience, I want to make sure anything that users have to interface with doesn't require Unicode. The answer to that is to have dt alias with \Delta t so you can set the option however you wish. I think that's the only user-facing Unicode that I had.

I guess the difference is that Optim uses an instance and PR49 uses a type for dispatch. The reason we went for the latter is that the type parameters of the created instance could be set to match those of the IVP-specs.

It probably will be types instead of instances of immutable types.

solve(de::AbstractODEProblem, tspan, alg=DefaultODEAlgorithm; options...)

(Should tspan be moved to be in the problem type?)

Dispatches to different packages could then be made off of abstract types

function solve{alg<:OrdinaryDiffEqAlgs}(...)

with OrdinaryDiffEqAlgs <: AbstractODEAlgs. This would allow packages to have "common setup functions", like a function which sets up output for any Sundials algorithm, or the like. The setup is necessary because this outer function naturally puts a function barrier in front of any type-instability. Then, instead of having dictionaries of algorithms to choose solvers, dispatch would work naturally for things like CVODE_BDF <: SundialsODEAlgs <: AbstractODEAlgs. DefaultODEAlgorithm would then run a selection step before the solver in order to choose an algorithm based on the solver options.
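
A toy sketch of that dispatch design (all of these names are hypothetical placeholders, not the actual packages' types):

abstract AbstractODEAlgs
abstract OrdinaryDiffEqAlgs <: AbstractODEAlgs
abstract SundialsODEAlgs    <: AbstractODEAlgs

immutable DP5       <: OrdinaryDiffEqAlgs end
immutable CVODE_BDF <: SundialsODEAlgs    end

# One method per package; the outer call doubles as a function barrier
# against any type-instability in the shared setup code.
solve{A<:OrdinaryDiffEqAlgs}(prob, tspan, ::Type{A}; kwargs...) =
    "dispatched to the native Julia solvers for $A"
solve{A<:SundialsODEAlgs}(prob, tspan, ::Type{A}; kwargs...) =
    "dispatched to the common Sundials setup for $A"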

Is this design making sense? Have any suggestions?

Yes, it would only make sense to standardize the common options to keep them consistent between solvers, e.g. abstol, reltol... But many/most solvers will have their specialized options too.

Even more difficult is when default options are different. And the biggest difficulty, the one which led to the most code bloat for me, is algorithm-specific defaults. For example, good default stabilization parameters for the Verner methods are different from what should be used with DP5, so even kwargs don't work well because there needs to be a good way to specialize the options to the algorithm choices.

Specialized options for solvers are not an issue if you just always use kwargs..., since that will simply pass things on if you don't know what to do with them. Writing the solve command like that would allow different implementations to choose which options to use, but ignore others (without error). Documenting it well will require a Plots.jl-like attribute table.
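
For example (a toy sketch of the kwargs pass-through):

# This solver recognizes only `reltol`; anything else lands in `kwargs`
# and can be ignored or forwarded without error.
function toy_solve(prob; reltol = 1e-3, kwargs...)
    reltol  # ... solve using reltol, ignoring unrecognized options
end

toy_solve(nothing; reltol = 1e-6, progressbar = true)  # `progressbar` is silently ignored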


This is planning a large refactor which I won't be implementing right now (I'm refactor tired out... this modularization has been a lot of work for "no gains"... well, organizational gains, but I didn't get to see anything go faster...). Let's try to hammer this design out and I probably won't get to it until the end of the month, but will handle getting both OrdinaryDiffEq and Sundials together. ODEInterface kind of does its own thing... so what I have in OrdinaryDiffEq might need to essentially just turn into a wrapper package which adds this interface to ODEInterface. This will get rid of the conditional dependencies and make Tony happy too.

@mauro3
Contributor Author

mauro3 commented Nov 2, 2016

solve(de::AbstractODEProblem, tspan, alg=DefaultODEAlgorithm; options...)

(Should tspan be moved to be in the problem type?)

@pwl and I discussed this at length but did not come to a firm conclusion. Good to see that you're struggling too ;-) I think that it should go into the AbstractODEProblem: t0 definitely belongs in AbstractODEProblem. So, I think it would be good to have the whole of tspan there.

Dispatches to different packages could then be made off of abstract types

function solve{alg<:OrdinaryDiffEqAlgs}(...)

with OrdinaryDiffEqAlgs <: AbstractODEAlgs. This would allow packages to have "common setup functions", like a function which sets up output for any Sundials algorithm, or the like. The setup is necessary because this outer function naturally puts a function barrier in front of any type-instability. Then, instead of having dictionaries of algorithms to choose solvers, dispatch would work naturally for things like CVODE_BDF <: SundialsODEAlgs <: AbstractODEAlgs. DefaultODEAlgorithm would then run a selection step before the solver in order to choose an algorithm based on the solver options.

Is this design making sense? Have any suggestions?

Yes this sounds right. To bikeshed names: I'd prefer AbstractIntegrator instead of AbstractODEAlg, as Alg is a bit non-specific. But maybe Integrator is not specific enough either when considering the whole DiffEq-universe.

Yes, it would only make sense to standardize the common options to keep them consistent between solvers, e.g. abstol, reltol... But many/most solvers will have their specialized options too.

Even more difficult is when default options are different. And the biggest difficulty, the one which led to the most code bloat for me, is algorithm-specific defaults. For example, good default stabilization parameters for the Verner methods are different from what should be used with DP5, so even kwargs don't work well because there needs to be a good way to specialize the options to the algorithm choices.

How about something like this for standard options (adapted from PR49):

immutable AdaptiveOptions{T,N<:Function,O<:Function} <: Options{T}
    reltol::T
    abstol::T
    ...
end
# default constructor defines defaults, dispatches on AbstractIntegrator:
function (::Type{AdaptiveOptions{T}}){I<:AbstractIntegrator,T,N,O}(::Type{I};
                                                     reltol   = eps(T)^T(1//3)/T(10),
                                                     abstol   = eps(T)^T(1//2)/T(10),
                                                     ...,
                                                     kwargs...)
    AdaptiveOptions{T,N,O}(reltol,abstol,...)
end

...

## For a solver which has different defaults, overload the constructor:
function (::Type{AdaptiveOptions{T}}){I<:SundialsIntegrator,T,N,O}(::Type{I};
                                                     reltol   = 1e-4,
                                                     abstol   = 1e-6,
                                                     ...,
                                                     kwargs...)
    AdaptiveOptions{T,N,O}(reltol,abstol,...)
end

An integrator type would then look like:

abstract SundialsIntegrator{T} <: AbstractIntegrator{T}
immutable SundialsIda{T} <: SundialsIntegrator{T}
    opts::AdaptiveOptions{T}
    other_opts::IdaOpts{T} # solver specific options
    ...
end

function SundialsIda{T}(::AbstractODEProblem{T}; opts...)
    SundialsIda{T}( AdaptiveOptions(SundialsIda; opts...), IdaOpts(opts...) )
end

# and finally solve would look like
function solve{I<:SundialsIntegrator}(de::AbstractODEProblem, alg::Type{I}; opts...)
    solver = I(de; opts...)
    sundials_solve(...)
end

I think something along these lines could work, but maybe too complicated and not general enough...

Specialized options for solvers are not an issue if you just always use kwargs..., since that will simply pass things on if you don't know what to do with them. Writing the solve command like that would allow different implementations to choose which options to use, but ignore others (without error). Documenting it well will require a Plots.jl-like attribute table.

Yes, that is good and what I put in the above example.

This is planning a large refactor which I won't be implementing right now (I'm refactor tired out... this modularization has been a lot of work for "no gains"... well, organizational gains, but I didn't get to see anything go faster...). Let's try to hammer this design out and I probably won't get to it until the end of the month, but will handle getting both OrdinaryDiffEq and Sundials together. ODEInterface kind of does its own thing... so what I have in OrdinaryDiffEq might need to essentially just turn into a wrapper package which adds this interface to ODEInterface. This will get rid of the conditional dependencies and make Tony happy too.

:-) Your clock ticks at a different rate from mine. A month from now is plenty early!

@pwl

pwl commented Nov 5, 2016

I've just caught up with the whole discussion. All of this looks quite right to me, and seems to be intersecting PR49 at surprisingly many places.

As for the naming I would rather have Problem for IVP and Solver for the combination of IVP+algorithm, but I wouldn't mind the current PR49 naming.

The example https://github.com/JuliaODE/ODE.jl/blob/master/examples/custom_integrator.jl shows how one can add a solver to the current PR49 interface, and also shows how to handle options as kwargs. The Options type that we implemented in PR49 was merely for convenience and its use is not obligatory; it just implements the common parameters used by most algorithms. I think that the number of things to do when implementing a new solver should be kept to a minimum. As @mauro3 suggested, we could have a simple dependency (using DiffEqBase or whatnot) and then define a few methods to define a solver. In this example we are only defining init, onestep! and the constructor for EulerIntegrator{T}, and the integrator is ready to go with minimal dependencies.

As for tspan=[t0,t1] being a part of IVP, we only kept it separate because of the iterator interface. In principle the iterators don't need t1 as they don't need to know the stop time. If you don't plan on adding iterators to your interface there is no need to keep tspan out of IVP. That said, I still think that the iterators are the way to go with ODE solvers as they give you more control over the interrupting, saving and plotting of the solution without having to implement these inside the integrators as callbacks or options. At the same time you can build the higher-level interface on top of them so the end user doesn't have to use or understand them, you also don't have to care much about them when implementing a new solver, as illustrated in the example of EulerIntegrator{T} I posted earlier.

I will try to follow this discussion as much as time permits, but @mauro3 seems to be doing a good job in conveying the problems and ideas that we encountered while developing PR49. Congrats on becoming a full time JuliaDiffEq developer @ChrisRackauckas! I basically only have weekends to think about this as my daily schedule is now completely covered from 8am to 8pm.

@ChrisRackauckas
Member

As for the naming I would rather have Problem for IVP and Solver for the combination of IVP+algorithm

That does work in any case where one wants to define problems independently of the solver, but it incorrectly adds a difficulty for anyone who wants to "change the method". In something like DiffEqProblemLibrary.jl, do we just set default methods? The current work is to move the problem definitions out of OrdinaryDiffEq.jl et al. and into DiffEqBase so that DiffEqProblemLibrary needs no solver dependencies, but if you have to define the problem with an algorithm, it will still need the dependency on the solver algorithm packages. If the solution to this problem is to put a dummy BlankSolver into each problem that is defined (or if it gets DefaultSolver, then it would still require the package which defines that!), then that's admitting there's a problem with putting the solver method in with the problem specification. This problem pops up in all of the *Models packages too, which are for creating models and domain-specific problems that will be solver independent (and not have a dependency on the solvers) so that the tools can be easily used outside of DiffEq (MultiScaleModels.jl is already picking up other uses).

So because it's semantically weird and only causes problems for actual implementation, I don't think they should be put together. I know that you want to put them together because in the iterator interface you want to pass one object around to everything. That's a developer issue, and not a user issue, so it shouldn't propagate beyond the solver. Anyways, it has an easy fix: the first thing you do in your methods is you put the two in an immutable which holds both types like you currently have it. Done. As noted above, the opposite way does not have an easy fix, and imposes a structure on everything even when it doesn't make sense.
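
The "easy fix" mentioned above might look like this (a hypothetical sketch):

# Solvers that want one object to pass around can combine the two internally,
# without imposing that structure on users.
immutable IteratorState{P,A}
    prob::P  # the mathematical problem specification
    alg::A   # the integrator/algorithm and its options
end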

As for the names, we are pretty much settled on Problem, except you abbreviate it as IVP. Julia types usually are not abbreviations; see the Style Guide and check out Base conventions (in the Base source). I think that is a good reason to not use IVP and instead write out Problem. I think ODEProblem is clearly the IVP, and we can make ODEBoundaryProblem for the BVP. This keeps going: HeatProblem is clearly the problem for the heat equation, SDEProblem is clearly for SDEs, whereas SIVP is not an abbreviation which is used. These are the justifications for the Problem naming scheme.

As for the solution, I think it's very clear that we should keep the Problem separate from the Solution. Again, it's a developer-side issue that for iterator schemes these end up being the same thing, because you're just mutating the Problem type. But by keeping Problem and Solution different, we can have different interfaces. This is important because you can require parts of the Solution interface for certain functionality, for example a full timeseries for plot recipes and an interpolation scheme for parameter estimation. The other problem is that different solvers need different solution objects. Sundials saves its memory quite differently, which makes overloading the interpolant very different from RK methods where you have full access to the k vector. Making different Problem types with different fields to initialize for different solvers again adds unnecessary user burden when there is a simple solution: keep Problem separate from Solution.

By keeping these separate, the documentation is also more clear. We can document the Problem interface and the Solution interface. People who only write new parameter estimation techniques need to know the Solution interface well, but not necessarily the Problem interface.

So I think it's very clear that Problem and Solution are here to stay. They're easy to document, extend to other problem domains well, and from all of the feedback I've had, are very intuitive. However, that doesn't mean I'm not conceding to any changes; there are plenty of changes in the PR49 direction that I will be making. Here are some changes and justifications:

  • Put tspan in the Problem. This should've been done in the first place. In DiffEqProblemLibrary.jl you can see that in the docstrings I usually mention the timespan for the problems... that makes it very clear I made a mistake.
  • Move the options over to some type. It's very clear that this dictionary method was not a good idea. Your usage of types here is much better. However, the true solution is the JuliaML/Optim way of using instances of the types. The reason is that types can then be immutables which hold their default values (stabilization betas, etc.) and override the defaults. Each standard option will have a boolean for whether the default is overwritten. It's the solver's job to handle kwargs seamlessly. This gives algorithm-specific defaults which are general enough for the ecosystem, but also easy on the user.

Let me give a specific example. q is the stepsize modifier: the next step you adapt to is q*dt (this is Hairer's notation?). It's common to have a qmax. It's also common for that qmax to be dependent on the algorithm: DP5 uses 10.0 while DP8 uses 6.0, following the Hairer convention / tests. 10.0 is a standard default for most algorithms. So the logic goes as follows (a toy sketch follows the list):

  • If the user hasn't set a qmax, then the option type is made with qmax=10.0 and qmax_set=false. DP8 has a field qmax=6.0, and has a function set_alg_defaults!(opts,alg) which sees qmax_set=false and replaces (or makes a new option type, depending on if it's mutable) the value with qmax=6.0.
  • If the user passes qmax=10.0 as a kwarg, then the solver does a generic dispatch where it reads the kwargs and makes an option type from them, then calls the solver with the options. Thus algorithm developers don't need to worry about kwarg support (which can get complicated due to aliasing).
  • For the case where the user passed the kwarg, qmax=10.0 and qmax_set=true. Inside set_alg_defaults(opts,alg), it sees qmax_set=true and thus does not override the value.
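
A toy sketch of that logic (all names here are hypothetical):

type ToyOptions
    qmax::Float64
    qmax_set::Bool  # true only if the user explicitly passed qmax
end
ToyOptions(; qmax = 10.0, qmax_set = false) = ToyOptions(qmax, qmax_set)

immutable DP8 end
alg_default_qmax(::DP8) = 6.0
alg_default_qmax(alg)   = 10.0  # generic fallback

function set_alg_defaults!(opts::ToyOptions, alg)
    # Apply the algorithm default only if the user didn't set the option,
    # even when the user's value happens to equal the global default.
    opts.qmax_set || (opts.qmax = alg_default_qmax(alg))
    opts
end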

This solves three issues:

  1. It makes it so that algorithm defaults are optional. Fewer issues for the developers to handle.
  2. It handles aliasing at the highest level. This keeps things simple for those who aren't aware of how the terms are used throughout the ecosystem, both developers and users.
  3. It gives a way for algorithm defaults to not override user input, even if the user input is the global default.

set_alg_defaults will just have to be part of the advanced solver interface, but many people can ignore it. Sane defaults will be set in the option definition (i.e. matching the MATLAB/SciPy/ODEInterface/Sundials/LSODE/DifferentialEquations conventions most people know and which are well tested). The last thing is that the options type will have a dictionary at the end for nonstandard / algorithm-specific options.

There's no need to have a FixedOptions and AdaptiveOptions. There's an adaptive option to turn adaptive timestepping on/off, and if you don't need some options, just ignore them. There's not really a cost to ignoring options, but there's a big conceptual/implementation cost to splitting apart and having multiple options types. Instead, just ODEOptions, SDEOptions, etc., since between problem domains these are very different. I think it's much easier to have a pretty encompassing set of options which "most algorithms use" and have a chart mentioning what's available for each algorithm.

@ChrisRackauckas
Member

ChrisRackauckas commented Nov 5, 2016

PR49 was never about implementing newer or faster solvers but more about designing a flexible and simple interface, so if the plan is to wrap it up with another layer of interfaces anyway, then maybe there no point in having PR49 at all.

While I think it was an interesting experiment, I don't think that it's flexible or simple. In fact, I think its main problem is that it can be very misleading. Here's my full take on it.

Flexible

It is flexible, but not any more flexible than callback functions. Essentially, when you write:

for (t,y) in ODE.iterate(ode)
    @show (t,y)
    if t > 1
        break
    end
end

what you're doing is making the algorithm have a callback function

function callback(t,y)
    @show (t,y)
    if t > 1
        return true   # `break` isn't valid inside a function; return a stop flag instead
    end
    return false
end

which runs after each iteration. Not only is that strictly less flexible than having the saving behavior in the callback (in the iterator interface you need the temps as part of the object, so you have some form of saving already occurring, and this causes problems when you have immutables unless you make the type mutable, which then reduces performance), but it's not as flexible in algorithm choice: only iterator-style algorithms can do this. This excludes many algorithms. Sure, you can put an iterator interface on Sundials, but you can't on ODEInterface. And if someone writes GPU algorithms? If you want to port more C algorithms? They will all allow for callbacks, but not iterator interfaces.

But while callbacks are strictly more capable than the iterator interface, you can still use a callback function with the iterator interface. Because you have the iterator interface, I can define the solver to use

for (t,y) in ODE.iterate(ode)
    callback(t,y) && break
end

and then if someone passes a callback function, it will work with PR49.

So while it's more flexible than what ODE had before and makes it easy to set up ODE with callbacks, it's still less flexible than callbacks, and so this is not a point in favor of iterators.

Simple

No. I agree with you that

I think that the number of things to do when implementing a new solver should be kept to a minimum

That means an iterator interface shouldn't be required. It takes a good amount of Julia knowledge to really make the iterator interface work for something as complex as solving differential equations. Indeed, I still don't understand some of the choices you had to make in full detail. The choice of an iterator interface that people have to write towards would exclude the simplest way to implement methods: a for loop. If we want to make the interface accessible to those who might have very little development/Julia knowledge, then they should just be able to write a loop which uses the input to compute the formula from the paper. Someone should be able to write an algorithm as:

  @ode_preamble
  halfΔt::tType = Δt/2
  local du::rateType
  while t < T
      k = f(t+halfΔt,u+halfΔt*f(t,u))
      u = u + Δt*k
      t += Δt
  end
  return u,t

and have it work (replacing the preamble with explicit code, basically saying f=prob.f etc.). That's clearly the midpoint method. Does this version have callbacks, adaptive timestepping, etc.? No, but it should work. This will allow people to contribute algorithms for things like PDEs without knowing much Julia development. Someone can come later and add to it, but I think the simplest possible implementation should be allowed, and that's not an iterator interface on a custom type, it's a loop.

The iterator interface does make it easy to add more advanced features (for example, the callbacks), but that's an advanced developer issue, not something to complicate things with for users and less advanced developers.

Not General

Not only is the iterator interface not more flexible, it's also not general. While all algorithms iterate in some form, the iterations can mean something very different in different algorithms. The iterator interface you have works well in time, but not all algorithms do that. Some algorithms don't necessarily step forward in time, and instead iterate a vector of steps towards a solution (some parareal methods). BVPs tend to take this form as well. Some SDE methods don't necessarily treat time always in a forward motion. So restricting the interface to iterating (t,y) is not general, and you'd have to say "well, for some algorithms you do this... and for others...". But they can all use solve. And whether you have to do callbacks differently (or what callbacks even mean for some algorithms!) is a higher order issue with extra documentation that should only need to be addressed if the user has specific requirements.

Misleading

The iterator interface allows for misleadingly simple callbacks. By misleading I mean, it lets you easily do bad things. In your manual you already show one:

for (t,y) in ODE.iterate(ode)
    @show (t,y)
    if t > 1
        break
    end
end

if t>1 isn't good because you step over t=1. So then you backtrack and say "nonono, don't do that!". And this is precisely the problem. Anytime you want to do event handling like this, you most likely want to use interpolations, root finding, etc. to get it right. A simple check like x<0 will hit the wrong time and worse, it will not even find the events if there are oscillations (every timestep can be positive, with it negative in between!). It's even worse for stochasticity. I have a research project in event handling for stochasticity. Think about this: if you're very close to x=0, then you "probably hit it already" in the stochastic case. So what's the distribution you should use to pull back your time?

This is not easy stuff to do correctly, and telling users "oh, you define events right here!" leaves correctness as a user burden. A user has to know proper event handling algorithms to do this correctly. I think that most people who develop differential equation algorithms don't even know this stuff! Yet this interface leaves it ripe for abuse. If the discontinuities are not handled appropriately, you lose all of your accuracy (if it's handled with a conditional as you show, then every algorithm is first order O(dt)!). You need to find the point of the event, backtrack the solver to that time, discard whatever was after and save the new value, and decide how to continue. This is not shown in your example, and it won't be easy to do in general (different algorithms have different interpolation schemes. Just showing the way to do it for Hermite interpolation is really bad because it's 3rd order with a large constant for the error!)
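
To make the required machinery concrete, here is a hedged sketch of just the localization step, assuming the solver exposes an interpolant interp(Θ) over the last step [t, t+dt] and an event condition g(u) whose sign changes at the event (both names are illustrative, not an existing API):

# bisect on the interpolant (not on the steps!) to localize the event time
function find_event_time(interp, g, t, dt; tol=1e-12)
    a, b = 0.0, 1.0                    # Θ ∈ [0,1] parameterizes the step
    ga = g(interp(a))
    while (b - a) * dt > tol
        m = (a + b) / 2
        if g(interp(m)) * ga <= 0
            b = m                      # sign change in [a,m]
        else
            a = m; ga = g(interp(a))   # sign change in [m,b]
        end
    end
    t + b * dt                         # the solver must backtrack to here
end

And that's only the localization: the backtracking, discarding, re-saving, and restarting logic still has to be layered on top of it.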

This is why event handling really needs a DSL. There is a lot of functionality that you want to make simple, but is actually much more complex. This still isn't very stable, but check out the docs for it. How to handle the DSL for different packages is still an issue that needs to be worked out (if other packages get as far as event handling) since currently the macros make callbacks specific for OrdinaryDiffEq (likely, the callbacks/macros will have to be package-specific. Making it work for all OrdinaryDiffEq algorithms was already a feat which pushed the limits of what's doable).

Performance

Packing everything into a type (as is required by the iterator interface) makes you use more mutable structures for when things change. It's much easier to write an algorithm using only stack-allocated immutables, limited function calls (and testing that each function inlines), etc. when it's not in an iterator interface. Julia optimizes the standard loops very well, and so we should allow someone to just write the loop!

Conclusion

There are a lot of downsides to the iterator interface. It's a Julia-specific interface which only makes sense in some cases (though those cases are more normal than not), but also is much more complex, can hinder performance, and makes it extremely easy to destroy the accuracy of the methods. It also adds more user-burden since the iterator interface is undeniably more complex than "solve the problem", and there's developer-burden if it's made into a requirement (some algorithms I'm not quite sure how I'd even do it!). If there wasn't a simple solution which doesn't have these problems, I would go with the iterator interface. But there is a simple solution which doesn't have these problems and is perfectly compatible with non-Julia codes: callback functions.

That doesn't mean there aren't any pros to the iterator interface. It's clear that, if you design your set of algorithms with an iterator interface, it's easy to put a common footer on each algorithm and uniformly implement callbacks. Also, by folding things into types, "algorithm-switching" (detecting stiffness and switching to a different algorithm) can be easier to implement (indeed, this research path is why I looked so deeply into iterators myself). However, there are too many pervasive and hard to avoid negatives that I cannot recommend that it be the common interface. I think it was a good experiment, but the conclusion is that the iterator interfaces work much better on machine learning / optimization problems, and iterative solvers of Ax=b, than they do for differential equations.

@ChrisRackauckas
Member

Also, thanks guys for being very responsive. It helps to be able to knock this out and get back to coding!

@ChrisRackauckas
Member

ChrisRackauckas commented Nov 5, 2016

To make it a little more clear why using the iterator interface ends up being a user trap, consider the standard event handling example: the bouncing ball with dampening. Let the first value be the position, and the second value be the velocity. If someone wants to use the iterator interface, they might think the following code makes sense:

for (t,y) in ODE.iterate(ode)
    if y[1] < 0 # Check if the position is negative
        y[2] = -a*y[2] # Reverse the velocity and dampen
    end
end

However, this code will run into many issues for some very fundamental reasons.

  1. The solver will not backtrack, so you will save values of y which are negative. This is unphysical.
  2. You will miss the actual point of discontinuity, and so the order of accuracy is now O(dt), even if you were using a high order method.
  3. Adaptive timestepping will have made the steps smaller near the discontinuity, and so even for mild dampening values a, your next step might take you from a negative position to a negative position. This will give you weird behavior: the ball will get stuck below the ground, and bounce every step from that point on.
  4. Your interpolant is now all screwed up. You have to save, apply the discontinuity, and then re-save at the same timepoint so that way the interpolation works. If you don't do the second save, then I don't even think the interpolation is 1st order accurate in the next timestep since the interpolation value in the next time interval will have the wrong endpoint.

Another case is the simple ODE y' = cos(y) starting at y=0.3. Let's say you want to stop the calculation when y<0, so you use

for (t,y) in ODE.iterate(ode)
    if y < 0 # Check if the solution went negative
        break
    end
end

However, if you use any high order method, the timesteps will become large enough that there is a very good chance it will skip over the negative timepoints. This is why event handling functions check at intermediate interpolated points for events.
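
For intuition, here is a hedged sketch of what "checking intermediate interpolated points" means, again assuming an interp(Θ) interpolant over the current step (illustrative names only):

# sample the dense output inside the step so that sign changes between
# saved steps aren't missed, even when every saved value is positive
function event_occurred(interp, g; npoints=10)
    any(g(interp(Θ)) < 0 for Θ in linspace(0, 1, npoints))
end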

Those are two very basic examples which show why event handling in diffeq solvers has evolved the way it has. This is why, once you have the iterators, it's still not even a good idea for most users to directly use the iterator interface. This is why MATLAB and all other "standard solvers" have specific functionality for dealing with events: it is hard to get it correct. Yet handling of events is the one thing that the iterator interface seemingly gives you, and it's clear it misleads users. The only other features are:

  1. Printing values, which is just as easily handled by a callback
  2. Inserting progress bars, which arguably is best handled by the solver with a default callback since how many users know Juno.jl in detail? Just spend the 1/2 minute to make it an option.
  3. Sophisticated saving. Sure, you can put some pretty sophisticated saving logic in the iterator part. But you can do exactly the same with callbacks, and you can also make this be solver options which is easier for users because then users don't have to worry about how to do the interpolations.

Again, in every case except for 1, there is an easier solution (and for 1, callbacks work just fine). I hope that helps explain the point.

@pwl

pwl commented Nov 6, 2016

the iterator interface, you need the temps part of the object so you have some form of saving already occurring

We never added any storage because of the iterators. We only actually implemented two solvers for the iterator interface so the sample size might not be significant but the temp storage in the solvers was necessary for them to work anyway.

This excludes many algorithms. Sure, you can put an iterator interface on Sundials, but you can't on ODEInterface. And if someone writes GPU algorithms?

Well, this is an upstream issue, over which we have no influence. In that case you simply don't publish an iterator interface but a top-level interface (with callbacks functionality) instead.

But while callbacks are strictly more capable than the iterator interface, you can still use a callback function with the iterator interface.

And vice versa, I don't see how one is more flexible than the other. You even say it yourself: "The iterator interface does make it easy to add more advanced features (for example, the callbacks)". I don't think your arguments are sufficient to claim that callbacks are "strictly more capable".

That means an iterator interface shouldn't be required. It takes a good amount of Julia knowledge to really make the iterator interface work for something as complex as solving differential equations.
[...]
Someone should be able to write an algorithm as:

@ode_preamble
halfΔt::tType = Δt/2
local du::rateType
while t < T
    k = f(t+halfΔt,u+halfΔt*f(t,u))
    u = u + Δt*k
    t += Δt
end
return u,t

I get the idea, this particular example is simpler than what PR49 proposes and it has a lot of appeal to non-Julia users. The more I look at this example the more uncomfortably convinced I feel. But then you start adding the macros for loopfooter and loopheader and it gets more and more complicated to clearly see the whole picture (and you have to add them for the callback to work). Even with the effort you put in keeping these macros as simple as possible (they don't even take any arguments!) they are still quite extensive (on the order of hundreds of lines). And if you want to do something less standard and fiddle with parts of the code you would have to redefine the whole macro. While I certainly see the appeal of simplicity, comparing a code that uses them with a code that doesn't is not a fair game.

While I'm not saying that the PR49 approach is simpler on the user side, I want to highlight that PR49 makes the definition of the solver almost completely self-contained. You can have everything clearly laid out in a single file. You still have to define a type, a constructor and a few methods, which, compared to the level of simplicity you suggest, seems like fancy programming. Maybe there is a way to get the best of both worlds? It should be possible to define macros generating the PR49-like solvers.

Looking at the code in OrdinaryDiffEq.jl I couldn't help but notice a huge number of function arguments, type parameters and type fields. Maintaining this kind of code might be easy for you but it looks a little nightmarish to me. I wonder if the price for the simplicity on the user end and performance is worth making the base code less accessible.

BVP tends to take this form as well.
[...]
Some SDE methods don't necessarily treat time always in a forward motion.

I don't see how this is related. BVP (I guess you mean boundary value problem) and SDE are completely different animals; we are talking about an interface for deterministic IVPs (see the topic, "ODE/DAE solver interface"). The classical IVP solvers seem to conform very well to this decomposition into single forward steps.

The stochastic case is out of my area of expertise, but maybe ODE/DAE and SDEs are simply not similar enough to warrant a common interface. The same goes for BVPs, which you mention earlier. You've probably given it much more thought than I did, so could you please elaborate on why it makes sense to fit all of these problems into a single interface? Would it be possible to first build a lower level interface for IVP and only unify it with the interfaces for SDE/BVP one step higher in the interface hierarchy? This way the more specific functionality of IVP solvers would be still accessible to users working only with this class of problems.

if t>1 isn't good because you step over t=1. So then you backtrack and say "nonono, don't do that!". And this is precisely the problem. Anytime you want to do event handling like this, you most likely want to use interpolations, root finding, etc. to get it right.
[...]
This is not easy stuff to do correctly, and telling users "oh, you define events right here!" leaves correctness as a user burden. A user has to know proper event handling algorithms to do this correctly.

I agree that this example might be misleading but it only proves that this is a bad example. This could be fixed by better documentation. You are also right about users becoming confused and trying to (ab)use iterators for events, but then again, this might be fixed with better documentation and with first introducing the user to a top-level interface (similar to what you propose), and the iterator interface later.

And while I'm at it, we had root finding implemented at some point; we just removed it to simplify and refactor the code. We already have generic/algorithm-specific interpolations though, along with the dense output. This proves that you can build more complex functionality on top of the iterator interface without changing the integration algorithms. We were thinking about event functions, which might be more challenging but still possible.

Packing everything into a type (as is required by the iterator interface) makes you use more mutable structures for when things change. It's much easier to write an algorithm using only stack-allocated immutables, limited function calls (and testing that each function inlines), etc. when it's not in an iterator interface. Julia optimizes the standard loops very well, and so we should allow someone to just write the loop!

This is a valid concern. We are always passing a mutable type to iterator related functions and this might lead to some slowdown. That would require a benchmark comparison between ODE.jl (or other existing packages) and the optimized PR49. I used the form "the optimized PR49" to mark a difference between the "current PR49" which has never been completely optimized.

There are a lot of downsides to the iterator interface. It's a Julia-specific interface which only makes sense in some cases (though those cases are more normal than not), but also is much more complex, can hinder performance, and makes it extremely easy to destroy the accuracy of the methods. It also adds more user-burden since the iterator interface is undeniably more complex than "solve the problem", and there's developer-burden if it's made into a requirement (some algorithms I'm not quite sure how I'd even do it!). If there wasn't a simple solution which doesn't have these problems, I would go with the iterator interface. But there is a simple solution which doesn't have these problems and is perfectly compatible with non-Julia codes: callback functions.

I believe that you are drawing these conclusions too hastily. I addressed most (all?) of your points and your most important concerns were about simplicity and performance. For the latter I haven't seen any actual numbers but I acknowledge that there might be some hard limitations. As for the simplicity, in DifferentialEquations.jl it is mostly achieved by a set of macros. In principle you could use similar macros to generate iterators, so I don't really see that as a fundamental issue (my distaste for macros aside).

@pwl

pwl commented Nov 6, 2016

To make it a little more clear why using the iterator interface ends up being a user trap, consider the standard event handling example: the bouncing ball with dampening.

You seem to be confusing the iterator interface with event functions and root finding.

@musm

musm commented Nov 6, 2016

IMO the iterator idiom just seems strange for ODEs. Chris lays out the arguments better than I could.

Is there a precedent, or some sort of application you are thinking of, where this iterator interface is necessary/preferred?

@ChrisRackauckas
Member

ChrisRackauckas commented Nov 6, 2016

We never added any storage because of the iterators. We only actually implemented two solvers for the iterator interface so the sample size might not be significant but the temp storage in the solvers was necessary for them to work anyway.

That's not addressing the statement. You already saved the current value of y to the array. If you want to backtrack, you need a second copy. This isn't about storage, it's about unnecessary copy operations. I already gave a solution with a default callback for saving. For the iterator to do this, you'd have to tell everyone to write the saving behavior inside the loop. That's not user-friendly, and so the interface has put you in a bad position, which could be avoided.
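
For concreteness, a hedged sketch of the "default callback for saving" idea (the name and wiring are illustrative, not an existing API):

# if the user passes no callback, the solver installs something like this,
# so saving lives in callback-land and copies happen only when actually wanted
function default_saving_callback(ts, us)
    (t, u) -> begin
        push!(ts, t)
        push!(us, copy(u))
        nothing
    end
end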

Well, this is an upstream issue, over which we have no influence. In that case you simply don't publish an iterator interface but a top-level interface (with callbacks functionality) instead.

This would be an issue if the iterator interface was the common interface. However, I already offered a solution which completely avoids this.

But then you start adding the macros for loopfooter and loopheader and it gets more and more complicated to clearly see the whole picture (and you have to add them for the callback to work)

Then don't use macros? Those macros were just for my codes. With the solve interface, you can choose to write the code however you like. I chose to use a macro, others who extend solve don't have to. If it's an iterator interface, everyone has to define their solution as an iterator.

I wonder if the price for the simplicity on the user end and performance is worth making the base code less accessible.

I'd argue the iterators are even less accessible. But still, this has nothing to do with the interface. You can still write a dispatch for solve without understanding any of OrdinaryDiffEq (with the changes to dispatching on an algorithm type as above). If the iterator interface is the common interface, you have to understand how to write everything to an iterator interface in order to be a part of the common interface.

You've probably given it much more thought than I did, so could you please elaborate on why it makes sense to fit all of these problems into a single interface?

Because most people see these equations as extensions of ODEs. When you follow the logic of papers, people talk about an ODE system and ask "what happens if you add noise?". People simulate a gene regulatory network and then ask "what changes if you add the fact that the construction of proteins is controlled by a delayed signal?". In many ecological problems, people ask "what happens if I add a spatial element? Some kind of diffusion?" So for users, there is a natural transition from ODEs to SDEs/DDEs/PDEs.

This is also a developer issue. Not only are delay equations related to ODEs, but the standard method for solving DDEs is to use ODEs with event handling (for discontinuity propagation). Solving PDEs is essentially making and solving ODEs, and SDEs can be solved by solving PDEs or using MCMC methods. So for developers, there is a natural translation between the problems which is commonly exploited.

The solve interface is simple and lets you literally use the same function data and the same solve command to solve the more complicated equation. The iterator interface is less simple and doesn't handle all of these cases.

Would it be possible to first build a lower level interface for IVP and only unify it with the interfaces for SDE/BVP one step higher in the interface hierarchy?

Why not use the simpler interface for both?

This way the more specific functionality of IVP solvers would be still accessible to users working only with this class of problems.

What extra functionality are you gaining from the iterator interface? If you're not supposed to be putting code there because naively doing so handles many cases incorrectly, what specific functionality does it give you (that isn't done with callbacks)?

I believe that you are drawing these conclusions too hastily. I addressed most (all?) of your points and your most important concerns were about simplicity and performance.

No, you didn't address the point. What you talked about instead was this:

I wonder if the price for the simplicity on the user end and performance is worth making the base code less accessible

Sure, you don't find my code accessible and I know it's not the most accessible because while my loops "are just the math", I attempt every hack under the sun to have both good performance and a huge array of features. I don't find your code accessible because the iterator interface "doesn't look like the math" for lots of its internal logic. Clearly, the most accessible code is ODE.jl which is why it still has a purpose.

But that's not even what we're talking about. The discussion is:

What should a common solver interface look like?

I proposed the following:

solve(prob,alg=AlgorithmType;kwargs...)

where users define a problem, and solve dispatches on the algorithm type. Developers can write a dispatch however they want to return a solution.
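
For example, a hedged sketch of what a third-party dispatch could look like under this proposal (JoesEuler and build_solution are hypothetical names, not existing API):

immutable JoesEuler <: AbstractODEAlgorithm end

function solve(prob::AbstractODEProblem, alg::JoesEuler; dt=0.01, kwargs...)
    t, u = prob.tspan[1], prob.u0
    while t < prob.tspan[end]          # a plain loop is fine internally
        u = u + dt * prob.f(t, u)      # out-of-place form, for simplicity
        t += dt
    end
    build_solution(prob, alg, t, u)    # hypothetical solution constructor
end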

The other possible proposal is an iterator interface. My argument against it is that:

  1. It doesn't work for all of the problems we want to solve in the DiffEq ecosystem (or at least isn't very intuitive for some of them), which already makes it a problem as a common interface.
  2. It pigeon-holes developers by requiring them to write their solver in the iterator interface.
  3. The iterator interface doesn't give users anything.

That doesn't mean that you can't write your solvers using an iterator interface, and have it plug into the common interface. Rather, you can dispatch solve(prob,alg=AlgorithmType;kwargs...) to make an iterator and solve it like that if you please. But I still don't see a clear argument as to why it should be the common interface. I went through the cons, what pro does it give you?

@ChrisRackauckas
Member

ChrisRackauckas commented Nov 6, 2016

You seem to be confusing the iterator interface with event functions and root finding.

I agree that this example might be misleading but it only proves that this is a bad example. This could be fixed by better documentation. You are also right about users becoming confused and trying to (ab)use iterators for events, but then again, this might be fixed with better documentation and with first introducing the user to a top-level interface (similar to what you propose), and the iterator interface later.

If that's not what it's for, then what does it do for a user? Is the manual supposed to say:

for (t,y) in ODE.iterate(ode)
  # Don't put code here. It won't do what you want it to do
end

? The pro that you seem to suggest is summed up here:

While I'm not saying that the PR49 approach is simpler on the user side, I want to highlight that PR49 makes the definition of the solver almost completely self-contained. You can have everything clearly laid out in a single file. You still have to define a type, a constructor and a few methods, which, compared to the level of simplicity you suggest, seems like fancy programming.

The pro is: you like to code a few IVPs in an iterator format. But that's personal (and not something I agree with). Should it be imposed as the common interface for all DiffEq solvers? Should everyone have to code in an iterator style in order to be a part of the common interface? I don't see how you get there.

What is clear is that, with the solve(prob,alg=AlgorithmType;kwargs...) common interface, you can still have your codes internally be iterators, I can use macros to more quickly build complicated loops, Joe can write his solver to be a straight for loop, Jill can directly translate a Fortran code, and Julian can wrap a GPU code he wrote. With a common iterator interface, you're telling me to re-write all of my codes to an iterator interface and try to get the same performance, you're telling Joe that he needs to learn how to write everything as an iterator before getting involved with JuliaDiffEq, you're telling Jill that a translation won't work and instead she has to re-interpret the algorithm in an iterator form, and you're telling Julian he can't wrap his GPU code because the interface doesn't support it (at least, he won't be part of the common interface).

And why? What would we actually gain by having an iterator common interface? It doesn't make code faster. It doesn't allow for more features. It doesn't support more equations. It doesn't support more implementation styles, modes, or languages. And it's not always simpler to code. Again, what about it makes it a compelling must-have common interface?

@ChrisRackauckas
Member

Again, let me be clear, I do not have any issue at all with having a package in the JuliaDiffEq ecosystem which internally uses iterators. The "Package-Oriented Development, Distributed Maintenance" policy is about letting there be a plurality of implementation styles to encourage more developers to have their package or their method join the ecosystem, while helping them maintain the package for use in a common interface. We will want to show users a common interface: "look! Julia gives you fast and feature-filled methods for solving ODEs/SDEs/DAEs/DDEs/PDEs!", and this common interface will allow us to easily test between different implementations (very useful for research!) and recommend/default to different algorithms/packages in different situations at no cost.

The solve interface I proposed already does most of this. The extensions from the discussion with @mauro3 fix the problem of the interface being package specific, and make it dispatch related so that the interface can freely use different packages seamlessly. This seems to solve the remaining problem.

The iterator interface does not accommodate all of this well, which is why I am against it as the common interface. That doesn't mean that you can't have an iterator interface for your package, and plug into the common interface by defining and solving an iterator. And on your package's README / extra documentation, you can detail this.

@pwl

pwl commented Nov 6, 2016

I'm sold on the solve(prob,alg=AlgorithmType;kwargs...) part, which, as you say, has nothing to do with whether the internals use iterators or not. You can have packages simply implementing iterate(prob,alg=AlgorithmType;kwargs...) internally (which is the case for PR49) to get the underlying iterable object and we can both be happy.

Then don't use macros? Those macros were just for my codes.

What interface do you plan to be the standard for the callback argument? Is it the one defined by the macros in https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/solve/ode_callbacks.jl? Is this going to be a part of the common interface?

@ChrisRackauckas
Member

What interface do you plan to be the standard for the callback argument? Is it the one defined by the macros in https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/solve/ode_callbacks.jl? Is this going to be a part of the common interface?

I did mention earlier that I don't know. It probably got lost in everything else:

This is why event handling really needs a DSL. There is a lot of functionality that you want to make simple, but is actually much more complex. This still isn't very stable, but check out the docs for it. How to handle the DSL for different packages is still an issue that needs to be worked out (if other packages get as far as event handling) since currently the macros make callbacks specific for OrdinaryDiffEq (likely, the callbacks/macros will have to be package-specific. Making it work for all OrdinaryDiffEq algorithms was already a feat which pushed the limits of what's doable).

The issue is, since a callback is essentially injecting code, I don't see an easy way to make one function definition work for all implementations: it needs to be tailored to the algorithm it's dealing with in order to get all of the functionality (the callbacks as implemented are very powerful, and can even do things like resize the internal caches, which clearly requires some specificity to the internal algorithm).

One cheap solution is to have a well-defined DSL (macro for callback, event detection, and event apply), and let the different implementations add different macros which have the same user-interface but make different functions (with similar functionality). So for example, the current @ode_callback_def could be ordinarydiffeq_callback_def which works for these methods, and another set for yours. Cheap solution, not necessarily good, but will get the job done.

But it lends itself to a good long term solution. Once those are all implemented, you can make it so that writing a callback using the DSL:

callback = @ode_callback begin
  @ode_event event_f apply_event!
end

will generate specific codes for each algorithm. This is done by:

  1. Instead of having it define a function, you have it make a call-overloaded type with different dispatches for each algorithm. Thus callback will act as a function, and if the arguments that the different algorithms use are unambiguous (which should happen naturally since these signatures may need to be quite long and so randomly it should mostly work out, otherwise we can force it to work out with a type dispatch), then callback will always act as the right function in the right algorithm.
  2. The callback, being defined as a macro, can smartly interpret and substitute any internal DSL. Essentially what it can do is replace @ode_event event_f apply_event! with @ordinarydiffeq_event event_f apply_event! in that specific dispatch, @sundials_event event_f apply_event! in that specific dispatch, etc. so anything defined using the macros will naturally work between algorithms even though it is implementation specific.
  3. The most common arguments, t, u, etc., will likely be in every dispatch, so very simple things should work naturally.
  4. Algorithms/implementations could still document how to manually define a callback specific to their algorithm. I suspect that the super vast majority of users will never want to do this (and instead most cases should be a proposal to expand the DSL), but it will allow an absurd level of customization if users want it.

This idea is designed to match JuMP which uses a DSL for all of its user definitions to handle the problem that its solvers are very different backends with very different requirements on the definitions. JuMP has handled this very well, and so I think we can do the same. We also have a leg up because the implementation for our case is very easy: each package/implementation implements the callback DSL on its stuff, and one common find/replace macro will work (this common one won't need to be more than like 100 lines, and could be in DiffEqBase).
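
As a rough sketch of that find/replace idea (all names hypothetical, ignoring macro hygiene and edge cases), the common macro could walk the callback body and retarget the generic DSL calls:

# rewrite @ode_event calls into package-specific macro calls, e.g.
# @ode_event -> @ordinarydiffeq_event, recursing through the expression tree
function retarget_dsl!(ex, pkg)
    isa(ex, Expr) || return ex
    if ex.head == :macrocall && ex.args[1] == Symbol("@ode_event")
        ex.args[1] = Symbol(string("@", pkg, "_event"))
    end
    for arg in ex.args
        retarget_dsl!(arg, pkg)
    end
    ex
end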

@ChrisRackauckas
Member

ChrisRackauckas commented Nov 7, 2016

I pruned this common interface proposal down to things which just about every implementation uses (or will be using in the near future). We can bikeshed names if you want.

Problem Type

type ODEProblem{uType,uEltype,tType,fType<:Base.Callable} <: AbstractODEProblem
  f::fType
  u0::uType
  analytic::Base.Callable
  knownanalytic::Bool
  numvars::Int
  isinplace::Bool
  tspan::Vector{tType}
end

Most of this should make sense. analytic is an analytical solution which is used in DiffEqDevTools for error calculations. It should be in the standard problem type because any implemented algorithm should have an easy way to calculate errors to test convergence, benchmark, etc. (and you just ignore it if the analytical solution isn't there). isinplace is whether the function is in-place: we accept f(t,u,du) and du = f(t,u) because the first is more performant (should be used in most cases) and the second is the only way that works with immutable structures (numbers, static arrays, whatever crazy things are thrown at it which might not even exist yet).

Note that the common constructor is:

ODEProblem(f::Function,u0,tspan=[0,1];analytic=nothing)

and has two things to be aware of: isinplace = numparameters(f)>=3 determines in-placeness by the number of arguments, and tspan=[0,1] is a positional optional argument because this is very useful for testing (though that can definitely go now that tspan has moved to the problem definition, and the testing library can be set up with the right tspans).
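
A hedged sketch of how such an arity check might look (the real numparameters helper may well differ):

# count positional arguments by inspecting the method table;
# m.sig.parameters includes the function's own type, hence the -1
numparameters(f) = maximum(length(m.sig.parameters) - 1 for m in methods(f))
isinplace(f) = numparameters(f) >= 3   # f(t,u,du) mutates du; f(t,u) returns it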

Other Problem Types

Other problem types look very similar. It makes obvious small changes as necessary (for example, two functions for SDEs, the function in DAEs has a different signature and needs du0 as well, etc).

I'm not sure on how to handle "extra functions". To you this probably just means Jacobians, but since we have the power of symbolic calculations (see ParameterizedFunctions), we can do things like symbolically pre-calculate the inverse of W=I-gamma*J (the Rosenbrock factor), pre-calculate in-place functions for Hessians for usage in optimization / parameter estimation algorithms, etc. Binding ourselves to make Jacobians "special" in this sense is just following a C-convention which was created because they didn't have the ability to metaprogram the ASTs and do symbolic calculations... but we can and so we should (these are things I already do!). Maybe have a dictionary and list out what can be in there?

I suspect most people will let ParameterizedFunctions do this automatically (who really wants to write out Jacobians?), but we should still have a clear interface for doing this yourself. And this is part of the problem specification (this is all inferrable from the problem) and not part of the algorithm. The common constructor will just pull all of this from a ParameterizedFunction if it exists, or rely on user input.

Common ODE Options

Here's what I got in categories:

  dt
  adaptive

  gamma
  abstol
  reltol
  qmax
  qmin
  dtmax
  dtmin

  qoldinit
  beta1
  beta2

  internalnorm

The standard adaptive timestepping options. dt is the initial timestep or, for fixed timestepping, is the timestep. adaptive is a boolean for turning adaptive stepping on/off. The next 7 options are the standard timestepping tuning options that simple timestepping algorithms will have. The next three are the PI controls (currently in OrdinaryDiffEq and ODEInterface, and unexposed in the Sundials API for now but I hope to change that soon). Lastly, you can choose a norm function internalnorm(u) for condensing the errors.

  dense
  save_timeseries
  timeseries_steps
  saveat
  calck

The next set of options controls the saving behavior. dense is the highest level: it's a boolean that turns on continuous (dense) output. I.e. solutions for which dense=true act as a continuous function (sol(t)). I've already implemented it in OrdinaryDiffEq, Sundials will get the same treatment, and my wrapper over PR49 does it so it's definitely easy there. Next is whether to save every step the solver takes. This is an important boolean for PDEs, where saving every step would often be far too costly in memory. save_timeseries is the boolean for turning that on/off. timeseries_steps controls how often it saves: every 1, 2, n steps. saveat are "special values to save at even if the stepping doesn't hit there" (matching default behavior of Sundials). Lastly, there's calck, for turning on and off internal dense calculations. Normally this can be inferred by whether saveat and dense are on, but that's not always the case when events are involved.
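
Hypothetical usage under the proposed interface (the option values are just for illustration):

sol = solve(prob, DP5(); dense=true, save_timeseries=true,
            timeseries_steps=2, saveat=[0.5, 1.5])
sol(0.25)   # with dense=true, the solution acts as a continuous function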

Then there's

  timeseries_errors
  dense_errors

This turns on/off the errors to calculate if an analytical solution is given. This is because in some cases it can be costly. timeseries_errors are like l2, and dense_errors are like L2. DiffEqDevTools can calculate errors and convergence on anything using the common interface like this (and can calculate it from an approximate solution), and it will always by default calculate the error at the last timepoint (if the analytical solution is given).

  progressbar
  progress_steps

Turns on a progressbar, and designates how often (in steps) to update it (since it can be costly). The Juno team got a new progressbar which has time estimates; I'm using it and adding it to Sundials and ODEInterface via default callbacks. I suggest everyone implement this because it's easy to do and good for users.

Then there's some important stragglers:

autodiff

Many algorithms can choose to autodifferentiate or not. This should always be an option because it's not always possible (it requires a bit of type-looseness), but it's always better when available.

tType

The type for time. The type for the dependent variables u is determined by the initial condition and is thus part of the prob. This can be inferred from dt, but if you want automatic dt then you can set it manually (setting time's type separately from u's type is important when dealing with units, writing DDE solvers from ODE solvers, etc.).
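
For illustration, a hypothetical units example under the constructor proposed above (assuming Unitful quantities flow through a solver):

using Unitful
f(t, u) = 2.0u"m/s"                                # du = f(t,u), out-of-place form
prob = ODEProblem(f, 1.0u"m", [0.0u"s", 1.0u"s"])  # time carries seconds, u carries meters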

callback

Sets a callback function.

Common Options for Other Problems

The common options look pretty much the same. Anything missing for other problems? I think that this even covers everything in PDEs (that aren't algorithm options).

ODE Algorithm Specification and Options

Algorithms will be specified by an immutable type in some type hierarchy. In DiffEqBase, there will be an AbstractODEAlgorithm which all ODE algorithms descend from. The word algorithm is nice because it's easy for everyone to understand, works in every context, shortens to alg, etc. Do you like the word Integrator better? Each package will declare an abstract type for the package, for example OrdinaryDiffEqAlgorithms <: AbstractODEAlgorithm. Then each algorithm can hold its specific defaults, for example, for the Dormand-Prince 4/5 algorithm, I will have

immutable DP5
  order
  adaptive_order
  beta1
  beta2  
end

and have the constructor DP5() put in the correct values. This will allow you to do things like let beta1 override global defaults only for this algorithm (and only if the user did not set a value for beta1!). Of course, you can add whatever things are specific to an algorithm. One compelling use case is when you have very specific requirements, like preconditioners and linear solvers. For the PDE method CrankNicholson, you can have

immutable CrankNicholson
  order
  linear_solver
end

and have CrankNicholson() default to using \, but allow people to override this with cg! from IterativeSolvers, etc. using the constructor.
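
A minimal sketch of that default-vs-override pattern (the constructor signatures here are hypothetical, and cg!'s exact signature may differ):

using IterativeSolvers

CrankNicholson() = CrankNicholson(2, (x, A, b) -> (x[:] = A \ b))   # default: `\`
cn_cg = CrankNicholson(2, (x, A, b) -> cg!(x, A, b))                # override: conjugate gradients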

So the interface would normally look something like:

solve(prob,DP5();kwargs...)

to choose the DP5 algorithm, which could dispatch to my package's OrdinaryDiffEqAlgorithms <: AbstractODEAlgorithm where I could pre-condition all of the arguments, and pass it to an inner solver.

Passing Options

I am backtracking on supporting the options in a type. Here's why:

  1. It's not flexible. Keyword arguments allow you to add to the "package-wide interface". For example, you could accept a keyword arg which everyone else ignores. You can't do that unless you add a dictionary to the options type as a "here's where extra stuff goes", which is really just admitting that you want to splat.
  2. It doesn't let you set package-wide defaults. If there's an option type, there has to be a default constructor, and it has to be ecosystem wide and in DiffEqBase. Ecosystem-wide defaults don't make sense: if someone makes a package for exponential integrators, why would we think they would want the standard reltol=1e-3; abstol=1e-6 which is for "standard interactive computing"? Even ODE and PR49 don't use these because of their stepping behavior (though Sundials and ODEInterface do). Because of this, I think there would be a lot of cases where people set package-wide defaults anyways, and just override the ecosystem-wide defaults in a non-uniform manner that will be hard to keep track of.
  3. You can easily use a type for this if you want:

solve(prob,DummyAlgorithm;kwargs...) = solve(prob,DummyAlgorithm,Options(kwargs...))

As previously noted, the other way around is much rougher because types have a fixed number of spots.

  4. With a function barrier (like the last line), there's no performance problem and it will dispatch just fine (and dispatch on kwargs is coming soon via Struct)

Any objections?

Default Algorithm

The default algorithm will have

immutable DefaultODEAlgorithm
  alg_hints::Vector{Symbol}
end

with a constructor so that a user can do something like

solve(prob,:stiff,:interpolant,:highprecision;kwargs...)

and then DefaultODEAlgorithm(opargs...) will slurp those up into alg_hints, and choose an algorithm from the qualifiers.
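
A sketch of that hint-slurping entry point (the wiring is hypothetical):

DefaultODEAlgorithm(alg_hints::Symbol...) = DefaultODEAlgorithm(collect(alg_hints))

solve(prob::AbstractODEProblem, hints::Symbol...; kwargs...) =
    solve(prob, DefaultODEAlgorithm(hints...); kwargs...)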

Documentation

The DifferentialEquations docs will update to have the stuff about the common interface. For each equation, there will be a page on the problem type and the solver. The solver page will talk about the available algorithms and have a table for how kwargs are implemented. The table will be a checkbox yes/no if something is implement, and the defaults for things which are different amongst different packages (abstol, reltol). It will pull a lot of ideas from how Plots.jl documents attributes and backend compatibility.

Solutions

Solutions are <:AbstractODESolution, etc. for each Problem type. Let's talk about the common interface for this in a little bit, and first focus on user input. This post has already been a lot.

Timing

I should have most of this implemented on OrdinaryDiffEq by the end of the week, so let's bikeshed the names (I'll just regex change them as necessary). I hope to hammer this out and get a blog post together explaining the transition.

@pwl

pwl commented Nov 7, 2016

I am sorry I don't have the time to read the whole thing so I will only comment on the problem type.

type ODEProblem{uType,uEltype,tType,fType<:Base.Callable} <: AbstractODEProblem
  f::fType
  u0::uType
  analytic::Base.Callable
  knownanalytic::Bool
  numvars::Int
  isinplace::Bool
  tspan::Vector{tType}
end
  1. What is uEltype for? Isn't it redundant with uType?
  2. The same for numvars, isn't it just the length(u0)?
  3. Having an analytic solution as a part of a problem seems odd to me, you write "It should be in the standard problem type because any implemented algorithm should have an easy way to calculate errors to test convergence, benchmark, etc. (and you just ignore it if the analytical solution isn't there)."
    but this is not the typical use case and a solution could be wrapped together with a problem internally in the test/benchmark package. This would also save us from having knownanalytic as a type field.
  4. If we agree on supporting the not-in-place formulation, isinplace should be a type parameter, so that you could dispatch methods depending on the problem formulation. Otherwise you would have to have if isinplace boilerplate code in all algorithms (see the sketch after this list).
  5. Shouldn't tspan be a tuple instead?
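
To illustrate points 4 and 5, a sketch of how dispatch could replace the boilerplate (ODEProblem is redefined here purely for the example, and rhs! is a hypothetical helper):

immutable ODEProblem{uType,tType,inplace,F} <: AbstractODEProblem
    f::F
    u0::uType
    tspan::Tuple{tType,tType}   # point 5: a tuple instead of a Vector
end

# point 4: dispatch on the parameter instead of `if isinplace` branches
rhs!{uT,tT,F}(p::ODEProblem{uT,tT,true,F},  t, u, du) = p.f(t, u, du)
rhs!{uT,tT,F}(p::ODEProblem{uT,tT,false,F}, t, u, du) = (du[:] = p.f(t, u))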

As for the in-place or non-in-place definitions, we decided to focus on the in-place approach (it leads to faster code) and have a simple mutable type with a single field, say a Vector, but this could be anything else, to support the immutable types. Our thinking was that these use cases are pretty rarely performance critical code and serve mostly the pedagogical purpose of showing off the various integration algorithms. Correct me if I'm wrong, but most performance critical code usually involves working with mutable objects. And even if there is a slight difference in performance I would go for mutable types only, because this way you don't have to support two use cases in one type and all (most?) algorithms.

This leads me to another question. What is the minimal type interface (traits) that we require from a type for it to be supported? This should be fixed as a part of the interface. You are often mentioning supporting Unitful types, or DualNumbers and so on, so what is, in your view, the minimal type that an integrator should support? A minimal working type for PR49.jl is declared here https://github.com/JuliaODE/ODE.jl/blob/master/src/tests/minimal_types.jl. From the mathematical point of view u should be at least an element of a Banach space, in most cases a finite dimensional vector space, and t should belong to the real sub-field of this vector space. On top of that we demand a bunch of promotion rules and other technical stuff, but this minimal type is not too restrictive.

And one more thing, I don't know what the standard Julia indentation is, but we should pick one and stick with it. I, and I think @mauro3 too, am using a 4 space indent while you are using 2 spaces. If there is going to be a common part of the code we should use consistent indentation. I looked it up in Julia Base and it seems that 4 spaces are standard.

@pwl

pwl commented Nov 7, 2016

Just one more quick note

  dense
  save_timeseries
  timeseries_steps
  saveat

If we add these options, would it mean that dense output and saving options would have to be supported on a per solver basis?

The old arguments aside, one benefit of the iterator interface is that you could easily add generic dense output, save options and root finding algorithms on top of the integrators. The suggested approach to the interface is going to encourage implementing these on a per-integrator basis, which again adds boilerplate code. The other option would be to use some standard macros, as you do in your solvers, but I had the impression that we agreed we should be able to avoid them.

The same goes for other suggested options

  progressbar
  progress_steps

these would be cool to have but I don't imagine Jack or Jane implementing them. But still it would be nice to have these options for all the algorithms as they involve essentially the same functionality. That's why it helps to decompose the integrator into its loopy part and be able to do arbitrary stuff in between each iteration. We use iterators for that, you use macros, but both approaches address the same problems.

@ChrisRackauckas
Member

  1. What is uEltype for? Isn't it redundant with uType?

I guess it isn't needed in the problem interface.

  2. The same for numvars, isn't it just the length(u0)?

Same. That can be pruned (the default constructor indeed does this calculation).

  3. Having an analytic solution as a part of a problem seems odd to me, you write "It should be in the standard problem type because any implemented algorithm should have an easy way to calculate errors to test convergence, benchmark, etc. (and you just ignore it if the analytical solution isn't there)." but this is not the typical use case and a solution could be wrapped together with a problem internally in the test/benchmark package. This would also save us from having knownanalytic as a type field.

It should be a common enough use case that every single algorithm that is developed uses it (I mean, you should have tests). It doesn't affect users at all if the default constructor makes it optional. As in the example:

ODEProblem(f::Function,u0,tspan=[0,1];analytic=nothing)

means that most people might never know you could give it an analytical solution or ever use it, but it'll still be there for tests.

As for the in-place or non-in-place definitions, we decided to focus on the in-place approach (it leads to faster code) and have a simple mutable type with a single field, say a Vector, but this could be anything else, to support the immutable types. Our thinking was that these use cases are pretty rarely performance critical code and serve mostly the pedagogical purpose of showing off the various integration algorithms. Correct me if I'm wrong, but most performance critical code usually involves working with mutable objects. And even if there is a slight difference in performance I would go for mutable types only, because this way you don't have to support two use cases in one type and all (most?) algorithms.

First of all, it's fine if you don't want to explicitly support it. In the code where I wrapped your solvers, I just had

    if typeof(u) <: Number
      u = [u]                                  # promote scalars to a length-1 vector
    end
    if !isinplace && typeof(u)<:AbstractArray
      f! = (t,u,du) -> (du[:] = prob.f(t,u))   # wrap out-of-place f as in-place
    else
      f! = prob.f
    end

and it works. However, I want to support it because there are many interesting differential equations on non-vectors. Usually these arise in the cases of DDEs and SDEs (one famous one is the Black-Scholes equation SDE which is probably one of the most solved stochastic equations!). The performance difference between working with an immutable vs a vector of size 1 is pretty massive. It does require doubling up code so I understand if not everyone wants to fully support it, but the quick way out I gave above provides support at the cost of not having the performance gains.

What is the minimal type interface (traits) that we require from a type for it to be supported?

If it has +, -, *, then it's supported by OrdinaryDiffEq without adaptivity. If there's a norm that can be put on it and the unitless uType supports exponentiation by a unitless uType, then it is supported with adaptivity. That is far less restrictive than requiring it to live in a Banach space (for example, this includes rationals for non-adaptivity, which can be important in DDEs). Maybe there's a more succinct way of saying that.
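
A minimal sketch in the spirit of ODE.jl's minimal_types.jl: a wrapper type with just +, -, * meets the non-adaptive interface, and adding a norm enables adaptivity (this exact type is illustrative):

import Base: +, -, *, norm

immutable MyNum
    x::Float64
end
+(a::MyNum, b::MyNum) = MyNum(a.x + b.x)
-(a::MyNum, b::MyNum) = MyNum(a.x - b.x)
*(a::Number, b::MyNum) = MyNum(a * b.x)   # scalar * state, e.g. Δt*k
norm(a::MyNum) = abs(a.x)                 # needed for the error estimates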

@ChrisRackauckas
Member

Okay thanks! Also, after reading (a lot of) your stuff on traits: overloaded types having Jacobians should be a trait, etc. I just really need to find a way to make code which gets rid of the other branches when this is the case, and it looks like using has_jacobian(f) with the methods table doesn't cut it. I think the two standing issues are:

  1. How to make "Functions with Jacobians" something that's easy to setup and has type information. I think just using the standard call overloading syntax with an alias to a ParameterizedFunction (which would have a lot of booleans) isn't a bad solution.
  2. How to get algorithm sub-choices like "autodiff" to be type-information which doesn't cause dynamic dispatch. Pre-instantiating the types would work, using type parameters on algorithms would work, and initializing could work if there's a good clean way to do it.

I think when those two issues are solved we can close this and the remaining problems could be handled in their own threads (callbacks, DAE types, etc.)

@mauro3
Contributor Author

mauro3 commented Nov 14, 2016

Regarding 1., rather than all of the bools, could you use an approach like https://github.com/JuliaODE/ODE.jl/blob/e47748ba852ebf4f9b1acf4ce0771e577bbe92e5/src/base.jl#L51: i.e. each (possible) function-field has a type parameter. If it is used then it's the function type, if not then it's Void.

@ChrisRackauckas
Member

ChrisRackauckas commented Nov 14, 2016

We could do that. And a separate dispatch to make an ODEProblem from a ParameterizedFunction to put the right functions in the right places. The only problem is, are you sure you want that many fields? The kinds of functions I have already for this is:

  1. Jacobian
  2. Inverse Jacobian
  3. Rosenbrock Inverse-W
  4. Rosenbrock Transformed Inverse-W
  5. Hessian
  6. Inverse Hessian

not including the explicit parameter functions and parameter derivatives which aren't necessary for ODEs. I also plan on adding explicit functions for exp(dt*J) (for very fast exponential integrator methods) and who knows what else. Since I have the power to symbolically compute and use macro programming to build these as explicit in-place updating functions (and the domain I work in, Systems Biology, has most of its equations compatible with what I am doing, even in the PDE form after certain transformations), I plan to be researching this area quite a bit since I think it can result in "stupid fast" methods (in the specific cases where it's possible). We could put them all in the ODEProblem and make the ODEProblem have some keyword arguments for these things, but maybe there's a better way, and I'm sure a few months from now you'll be asking for the ODEProblem to be cleaned up with the amount of functions put in there.

@mauro3
Contributor Author

mauro3 commented Nov 14, 2016

ODEProblem has just one parameter for the function f. That parameter can then be the type of just a function, say typeof(sin), or a complex beast of a ParameterizedFunction. If you add them later, say an analytic Jacobian, you will get a different problem instance. I think this can (and should) be done internally without the user ever knowing.

Internally, going down the init path, this would mean that init also potentially re-creates the problem instance: p0_, alg = init(p0, Alg; kwargs...). Then, say, typeof(p0.f)==typeof(sin) but typeof(p0_.f)==ParameterizedFunction{...}.

@ChrisRackauckas
Member

But how do you check if a ParameterizedFunction has a specific dispatch in a type-stable way?

@mauro3
Contributor Author

mauro3 commented Nov 14, 2016

Could this work:

type ParaF{..., tJ, ...}
...
J::tJ
...
end

# typealias not needed but nice to avoid juggling function-parameters:
typealias ParaFwithJ{...,tJ,...} ParaF{...,tJ,...}
typealias ParaFwithoutJ{...} ParaF{...,Void,...}

f(::ParaFwithJ) = 1
f(::ParaFwithoutJ) = 2

@ChrisRackauckas
Member

ChrisRackauckas commented Nov 14, 2016

No, the whole reason for using overloading was to have the parameter implicit forms:

(p::LotkaVolterra)(t,u,du) = begin
         du[1] = p.a * u[1] - p.b * u[1]*u[2]
         du[2] = -3 * u[2] + u[1]*u[2]
end

Putting functions as fields won't allow this. Also, having the functions as fields isn't any easier to parse for existence than putting bools in the type parameter (since that's what you'd essentially be doing here: checking the type parameters).

@mauro3
Contributor Author

mauro3 commented Nov 14, 2016

I see. Is each ParameterizedFunction its own type? (I think so and your above example suggests it)

If yes, then you could go with traits because then you can add functions later without having to re-do the type because one of the parameters changes.

using SimpleTraits # you can hand code it as well
@traitdef HasJ{X}
@traitdef HasH{X}

immutable ParaF # no parameters
end

...
(p::LotkaVolterra)(Val{:J}, t, u, J) = ...
@traitimpl HasJ{LotkaVolterra}


@traitfn fn(p::ParaF::HasJ, ...) = 1
@traitfn fn(p::ParaF::(!HasJ), ...) = 2

@ChrisRackauckas
Member

I see. Is each ParameterizedFunction its own type? (I think so and your above example suggests it)

Yes, this is the simplest example of a ParameterizedFunction:

type  LotkaVolterra <: ParameterizedFunction
         a::Float64
         b::Float64
end
f = LotkaVolterra(0.0,0.0)
(p::LotkaVolterra)(t,u,du) = begin
         du[1] = p.a * u[1] - p.b * u[1]*u[2]
         du[2] = -3 * u[2] + u[1]*u[2]
end

I guess call-overloading on types isn't too widespread yet (it's something that was introduced in v0.5), but it's the (p::LotkaVolterra)(t,u,du) which allows you to make your type have a call. In v0.5, anonymous functions are defined like:

immutable HiddenType end
(f::HiddenType)(x,y)  = x^2 + y

where it generates a type with some weird unique name. So this is really just a more general way of defining functions (since you now have the whole type, so you can put fields in there).

If yes, then you could go with traits because then you can add functions later without having to re-do the type because one of the parameters changes.

Thanks for showing how to use traits. Should we just embrace them before a Base release? I mean, then the code for defining a Jacobian manually would be:

type  LotkaVolterra <: ParameterizedFunction
         a::Float64
         b::Float64
end
f = LotkaVolterra(0.0,0.0)
(p::LotkaVolterra)(t,u,du) = begin
         du[1] = p.a * u[1] - p.b * u[1]*u[2]
         du[2] = -3 * u[2] + u[1]*u[2]
end
(p::LotkaVolterra)(Val{:J}, t, u, J) = ...
@traitimpl HasJ{LotkaVolterra}

which I don't think looks bad at all (users who aren't familiar with such overloading would probably just need one example like this, right? Maybe it's one of those things where, since I already know it, it looks easy).

What I really like is then:

@traitfn fn(p::ParaF::HasJ, ...) = 1
@traitfn fn(p::ParaF::(!HasJ), ...) = 2

is extremely nice for writing the solvers. This would make it extremely easy to write something nested. For example, in Rosenbrock methods you really want to calculate W^(-1) where W = (I - gamma*J), and so if I am understanding the traits correctly the Jacobian code could be:

@traitfn calc_jac(p::ParaF::HasJ,t,u,J) = p(Val{:J},t,u,J)
@traitfn calc_jac(p::ParaF::(!HasJ),t,u,J,method=Val{:FD}) = ... # calculate the Jacobian using ForwardDiff or Calculus.jl?

and then calc_jac could always be used and it will essentially auto-optimize? That's really freaking cool! I would propose a separate interface on this then, having calc_jac in DiffEqBase, and the DiffEq universe could just always use this code agnostic of whether it's using a user-defined Jacobian or not.
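
A concrete fallback body might look like the following sketch. It assumes ForwardDiff's in-place jacobian! and naively allocates the closure and du on every call, which a real version would cache:

using ForwardDiff

@traitfn function calc_jac(p::ParaF::(!HasJ), t, u, J)
    du = similar(u)
    # differentiate u -> du through the in-place user function
    ForwardDiff.jacobian!(J, (du,u) -> p(t,u,du), du, u)
    J
end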

But, if I'm understanding it correctly, it gives really complex things like

# Define W-calculations
@traitfn function Wsolve(p::ParaF::HasWinv,t,u,out,b,Jcache)
    p(Val{:Winv}, t, u, Jcache)
    @into out = Jcache*b # uses InPlaceOps.jl to make `*` in-place
    Jcache # return Winv
end
@traitfn function Wsolve(p::ParaF::(!HasWinv),t,u,out,b,Jcache)
    calc_jac(p,t,u,Jcache) # trait dispatch picks the right calc_jac method
    fac = lufact(I - gamma*Jcache)
    @into out = fac \ b # uses InPlaceOps.jl to make `\` in-place
    fac # return the factorization
end

To solve Wx=b for W = (I - gamma*J) in the most efficient manner given which functions have been defined (without overhead). Is this on the right track? Of course, there's still some cleanup to do (like how we need Wsolve_cached(...) which handles the subsequent solves using either the cached Winv or the cached factorization, etc.), but do traits actually give us this design? Because if that's the case, I am 10000% for traits.
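
For the caching part, the subsequent solves could simply dispatch on whatever Wsolve returned; a sketch using the v0.5 in-place Base functions (Wsolve_cached! is a made-up name):

# Winv was cached: apply it with an in-place multiply
Wsolve_cached!(out, Winv::AbstractMatrix, b) = A_mul_B!(out, Winv, b)
# an LU factorization was cached: re-use it for an in-place solve
Wsolve_cached!(out, fac::Base.LinAlg.LU, b) = (copy!(out, b); A_ldiv_B!(fac, out))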

@mauro3

mauro3 commented Nov 14, 2016

Concerning #5 (comment):
The way we handle it in PR49 is to have the options as part of the Alg type, whereas you (if I understand correctly) have the Alg type as part of the options type (ODEIntegrator). I don't think that makes much difference. The advantage of our approach is that each Alg only has the options it can deal with, whereas your approach probably makes unifying the options a bit easier. The init function in your case would then be p0_, odeint = init(p0, Alg; kwargs...) and the kwarg-less solve invocation solve(p0_, odeint). Anyway, I think it should be fine but I may well be missing something.
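
Spelled out with concrete (hypothetical) choices of Alg and kwargs, that two-phase invocation would read:

p0_, odeint = init(p0, RK4; dt=0.1, abstol=1e-8)  # all kwargs consumed by init
sol = solve(p0_, odeint)                          # kwarg-less solve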

Concerning your last post: yep, I suspect that this should work. I'm a bit hesitant to make SimpleTraits a requirement, but maybe that would be ok. If someone doesn't want to do trait-dispatch they could always use a provided check has_jac(f) and do hand (or dynamic) dispatch. Am I correct that the common interface wouldn't need to depend on ParameterizedFunctions? It would just specify to provide a Jacobian by overloading f(::Val{:jac},...) (but that would be compatible with ParameterizedFunctions).
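
That hand-dispatch check could be provided on top of the trait itself; a sketch, assuming SimpleTraits' istrait:

using SimpleTraits
@traitdef HasJ{X}

# runtime-usable check for users who don't want @traitfn dispatch
has_jac{T}(f::T) = istrait(HasJ{T})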

Note that this example of yours

@traitfn calc_jac(p::ParaF::HasJ,t,u,J) = p(Val{:J},t,u,J)
@traitfn calc_jac(p::ParaF::(!HasJ),t,u,J,method=Val{:FD}) = ... # calculate the Jacobian using ForwardDiff or Calculus.jl?

doesn't quite work, as dispatch first happens on types as usual; only if a trait method is dispatched to does the trait dispatch kick in. So method would have to be a kwarg, i.e. ...,J; method=Val{:FD}), or probably let the first definition also have the method argument (but just ignore it).
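
i.e. something along the lines of the following (whether @traitfn handles optional arguments exactly like this would need checking, and fd_jacobian! is a hypothetical fallback helper):

# identical positional signatures, so type dispatch ties and the traits decide
@traitfn calc_jac(p::ParaF::HasJ,    t, u, J, method=Val{:FD}) = p(Val{:J}, t, u, J)  # method ignored
@traitfn calc_jac(p::ParaF::(!HasJ), t, u, J, method=Val{:FD}) = fd_jacobian!(J, p, t, u, method)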

@ChrisRackauckas

The init function in your case would then be p0_, odeint = init(p0, Alg; kwargs...) and the kwarg-less solve invocation solve(p0_, odeint). Anyway, I think it should be fine but I may well be missing something.

Oh I see, that will work.

I'm a bit hesitant to make SimpleTraits a requirement, but maybe that would be ok. If someone doesn't want to do trait-dispatch they could always use a provided check has_jac(f) and do hand (or dynamic) dispatch.

Yeah, but if has_jac(f) is defined in DiffEqBase they might as well use it. It would require people who define their Jacobians by hand to use SimpleTraits though.

Am I correct that the common interface wouldn't need to depend on ParameterizedFunctions? It would just specify to provide a Jacobian by overloading f(::Val{:jac},...) (but that would be compatible with ParameterizedFunctions).

Yes. ParameterizedFunctions.jl just has macros to make a ParameterizedFunction, but the abstract type is in DiffEqBase. So all code can be written against just DiffEqBase, without ParameterizedFunctions.jl.

doesn't quite work, as dispatch first happens on types as usual; only if a trait method is dispatched to does the trait dispatch kick in. So method would have to be a kwarg, i.e. ...,J; method=Val{:FD}), or probably let the first definition also have the method argument (but just ignore it).

Oh yeah, oops. So yeah, put the method optional arg in the first definition too and just ignore it.

@mauro3

mauro3 commented Nov 14, 2016

Yes. ParameterizedFunctions.jl just has macros to make a ParameterizedFunction, but the abstract type is in DiffEqBase. So all code can be written against just DiffEqBase, without ParameterizedFunctions.jl.

Ok, but what about just using straight methods? That should work just fine too even if they are not call-overloaded. Or would someone implementing the DiffEqBase interface need to subtype from ParameterizedFunction?

@ChrisRackauckas

Ok, but what about just using straight methods?

What do you mean "straight methods"?

@mauro3

mauro3 commented Nov 14, 2016

Say I implement an Alg: solve(p::Prob, ::Type{Euler}; kwargs...) = .... Inside the solve method, I do not need to care whether p.f is a plain Function or a ParameterizedFunction. I think for "normal" algorithms, f could be either and I wouldn't need to know which it is. Thus, I think we don't need the ParameterizedFunction type as part of DiffEqBase. But I might be wrong; if so, I don't understand how. Maybe you can point me to the places where it is needed?
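
To illustrate the point, a toy Euler implementation never needs to know what p.f is, only that it is callable (a sketch with made-up Prob fields):

immutable Prob{F,U,T}
    f::F                 # plain Function or call-overloaded type -- solve doesn't care
    u0::U
    tspan::Tuple{T,T}
end
immutable Euler end

function solve(p::Prob, ::Type{Euler}; dt=0.01)
    t, tend = p.tspan
    u = copy(p.u0); du = similar(u)
    while t < tend
        p.f(t, u, du)    # duck-typed call: works for either kind of f
        for i in eachindex(u)
            u[i] += dt*du[i]
        end
        t += dt
    end
    u
end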

@ChrisRackauckas

How do you do has_jac(f) without it? The traits example dispatches off of ParameterizedFunction. Using a typealias like FunctionWithJacobian is just using type parameters (in a hidden and easier style), but still requires subtyping. The method table is not inferrable since it's a global. Is there a way to write a function which can check which methods a generic function has at compile time?
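
For comparison, the dynamic check would be method_exists, which queries the global method table at runtime rather than being anything the compiler can fold away:

# true iff f has a method matching this signature -- a runtime query
method_exists(f, Tuple{Type{Val{:J}}, Float64, Vector{Float64}, Matrix{Float64}})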

@mauro3

mauro3 commented Nov 15, 2016

@ChrisRackauckas can you give me access rights to DiffEqBase.jl (and Roadmap whilst you're at it)? I have a branch to push.

@ChrisRackauckas

@ChrisRackauckas can you give me access rights to DiffEqBase.jl (and Roadmap whilst you're at it)? I have a branch to push.

Done.

@mauro3

mauro3 commented Nov 15, 2016

No, that didn't work. I can't push to either, nor can I access the settings on the web page. Can you try again?

@ChrisRackauckas

Try it.

@mauro3

mauro3 commented Nov 15, 2016

Tnx!

Here's a function interface draft.

Here's an example implementing the interface.

@mauro3

mauro3 commented Nov 15, 2016

Thanks, I made two PRs: SciML/DiffEqBase.jl#2 and SciML/DiffEqBase.jl#3, maybe discuss it further there.

@ChrisRackauckas

Alright, I moved my discussion over there. Is there any other high-level topic to discuss that doesn't have its own thread?

@ChrisRackauckas

Closing this monster. I think we can start talking about specifics with @mauro3 's PRs.

@mauro3

mauro3 commented Nov 16, 2016

Thanks to all! I think this was very productive.

@ChrisRackauckas

Yeah, I learned quite a bit. I'm excited for traits now too! 👍

@pwl

pwl commented Nov 16, 2016

Thanks for a nice discussion; I'm very happy we agreed on some common ground. And sorry for my limited participation.
