AD: change the backend transparently #25

Open · 2 tasks done
jbcaillau opened this issue Feb 9, 2023 · 19 comments
Labels: enhancement (New feature or request)

Comments

@jbcaillau (Member) commented Feb 9, 2023

  • We currently use ForwardDiff.jl but should be able to move transparently (that is, without any user code change) to another backend.
  • The backend choice should be a default / global behaviour, not hard-coded (see the sketch below).
  • As soon as something more efficient than dual numbers (e.g. Enzyme) is available, switch.
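
A minimal sketch of what such a global default could look like, using ADTypes.jl backend objects (the names default_backend and set_backend! are hypothetical, not existing CTBase API):

using ADTypes: AbstractADType, AutoForwardDiff, AutoEnzyme

# Hypothetical global default: user code never has to mention the backend,
# but it can be swapped in one place, globally and dynamically.
const _DEFAULT_BACKEND = Ref{AbstractADType}(AutoForwardDiff())

default_backend() = _DEFAULT_BACKEND[]                      # current default backend
set_backend!(b::AbstractADType) = (_DEFAULT_BACKEND[] = b)  # e.g. set_backend!(AutoEnzyme())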
@ocots (Member) commented Mar 28, 2023

This is more a discussion than an issue, no? Should it be transferred elsewhere?

jbcaillau transferred this issue from control-toolbox/OptimalControl.jl on Mar 28, 2023
@jbcaillau (Member, Author)

It is an issue. Now in CTBase.jl, FWIW

@ocots (Member) commented May 9, 2023

I am not convinced that this is an issue. It is more a wish :-)

@jbcaillau (Member, Author)

OK, move it!

@jbcaillau (Member, Author)

We should move from ForwardDiff to AbstractDifferentiation

@jbcaillau (Member, Author)

See also: FastDifferentiation.jl

@ocots (Member) commented May 7, 2024

Check also DifferentiationInterface.jl.

@gdalle (Contributor) commented May 28, 2024

Friendly ping from the creator of DifferentiationInterface: I'm available to help you make the transition if you want me to :)

@ocots (Member) commented May 29, 2024

Hi @gdalle! This would be great, thanks. I propose to first post here how we use AD. @jbcaillau and @PierreMartinon, please complete the list.

  • We have defined some auxiliary functions that use the ForwardDiff.jl package (a possible DifferentiationInterface.jl version is sketched after this list):

function ctgradient(f::Function, x::ctNumber)
    return ForwardDiff.derivative(x -> f(x), x)
end

function ctjacobian(f::Function, x::ctNumber)
    return ForwardDiff.jacobian(x -> f(x[1]), [x])
end
  • We use these auxiliary functions for differential geometry in the CTBase.jl package:

function (X::VectorField{Autonomous, <: VariableDependence}, f::Function)::Function

  • We also use these auxiliary functions in the CTFlows.jl package:

https://github.com/control-toolbox/CTFlows.jl/blob/fff879627ccec8d3252694ae2ad27252522d676f/src/hamiltonian.jl#L61

function rhs(h::AbstractHamiltonian)
    function rhs!(dz::DCoTangent, z::CoTangent, v::Variable, t::Time)
        n      = size(z, 1) ÷ 2                         # state / costate dimension
        foo(z) = h(t, z[rg(1,n)], z[rg(n+1,2n)], v)     # H as a function of z = (x, p)
        dh     = ctgradient(foo, z)                     # ∇H(x, p)
        dz[1:n]    =  dh[n+1:2n]                        # ẋ =  ∂H/∂p
        dz[n+1:2n] = -dh[1:n]                           # ṗ = -∂H/∂x
    end
    return rhs!
end
  • I think we also use AD directly from third-party packages like ADNLPModels.jl:

https://github.com/control-toolbox/CTDirect.jl/blob/60edc0c8be071bba860db12c768f46f29e482592/src/solve.jl#L40

    # call NLP problem constructor
    docp.nlp = ADNLPModel!(x -> DOCP_objective(x, docp), 
                    x0,
                    docp.var_l, docp.var_u, 
                    (c, x) -> DOCP_constraints!(c, x, docp), 
                    docp.con_l, docp.con_u, 
                    backend = :optimized)
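
For reference, a minimal sketch of what the ctgradient / ctjacobian helpers above could look like on top of DifferentiationInterface.jl, with the backend passed explicitly; the backend keyword and its AutoForwardDiff default are assumptions, not CTBase's actual code:

import DifferentiationInterface as DI
using ADTypes: AbstractADType, AutoForwardDiff
import ForwardDiff  # the chosen backend package still has to be loaded

const ctNumber = Real  # as stated later in the thread, ctNumber is just Real

# Scalar input: a derivative rather than a gradient.
function ctgradient(f::Function, x::ctNumber; backend::AbstractADType = AutoForwardDiff())
    return DI.derivative(f, backend, x)
end

# Scalar input treated as a 1-vector, mirroring the ForwardDiff version above.
function ctjacobian(f::Function, x::ctNumber; backend::AbstractADType = AutoForwardDiff())
    return DI.jacobian(x -> f(x[1]), backend, [x])
end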

@gdalle (Contributor) commented May 29, 2024

Thanks for the links, I'll take a look but I already have a few questions.

Why do you call the derivative the gradient? What is this ctNumber that you use?

Are the derivative and Jacobian the only operators you need? What are the typical input and output dimensionalities for the Jacobian? Depending on the answer, you may want to parametrize with different AD backends for the derivative (forward mode always) and the Jacobian (forward mode for large input and small output, reverse mode for small input and large output, otherwise unclear).

Do you take derivatives or Jacobians of the same function several times, but with different input vectors? If so, you will hugely benefit from a preparation mechanism like the one that is implemented in DifferentiationInterface.
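
For context, a minimal sketch of the preparation mechanism mentioned here, assuming a recent DifferentiationInterface.jl release (the exact preparation signature has changed across versions); the function and dimensions are placeholders:

import DifferentiationInterface as DI
using ADTypes: AutoForwardDiff
import ForwardDiff

f(z) = sum(abs2, z)        # placeholder function, differentiated many times
backend = AutoForwardDiff()
z0 = rand(10)              # representative input, used only to build the preparation

prep = DI.prepare_gradient(f, backend, z0)   # pay the setup cost (caches, configs) once

for _ in 1:1000
    z = rand(10)
    g = DI.gradient(f, prep, backend, z)     # reuse the preparation for each new input
end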

@gdalle (Contributor) commented May 29, 2024

As for ADNLPModels, they are also considering a switch to DifferentiationInterface, but it might be slightly slower.

@jbcaillau (Member, Author) commented May 29, 2024

@gdalle Thanks for the PR and comments.

Why do you call the derivative the gradient? What is this ctNumber that you use?

ctNumber = Real. We want to deal more or less uniformly with reals and one-dimensional vectors, which is why the case where the variable is a single real is handled explicitly.

Are the derivative and Jacobian the only operators you need? What are the typical input and output dimensionalities for the Jacobian? Depending on the answer, you may want to parametrize with different AD backends for the derivative (forward mode always) and the Jacobian (forward mode for large input and small output, reverse mode for small input and large output, otherwise unclear).

Dimensions are < 1e2, e.g. to build the right-hand side of a Hamiltonian system.

Do you take derivatives or Jacobians of the same function several times, but with different input vectors? If so, you will hugely benefit from a preparation mechanism like the one that is implemented in DifferentiationInterface.

✅ to be tested elsewhere (see also this comment)

@ocots (Member) commented Jun 15, 2024

I think a step has been made for CTBase.jl. Do we close this issue? We will see next how to handle this in CTFlows.jl.

ocots mentioned this issue on Jun 15, 2024
@gdalle (Contributor) commented Jun 17, 2024

To me this is not yet done, because #141 added a backend kwarg to ctgradient and the like, but this kwarg is not passed down from further up the chain. As a result, users cannot change the AD backend, even though package developers can through the __auto() function.

@jbcaillau (Member, Author)

@gdalle check this PR

@gdalle (Contributor) commented Jun 21, 2024

To clarify, even with this PR, you're currently doing something like

function solve_control_problem(f)
    # ...
    for i in 1:n
        x -= gradient(f, x)
    end
    # ...
end

function gradient(f, x, backend=default_backend())
    # ...
end

And for users who only care about high-level interfaces, and who never call gradient directly, the following seems better to me:

function solve_control_problem(f, backend)
    # ...
    for i in 1:n
        x -= gradient(f, x, backend)
    end
    # ...
end

function gradient(f, x, backend=default_backend())
    # ...
end

But you know best if that's relevant in your case or not.

@ocots (Member) commented Jun 21, 2024

We totally agree with you. The second choice is better.

But actually, the function that solves optimal control problems is not in the CTBase.jl package.

Besides, our function

function gradient(f, x, backend=default_backend())
    # ...
end

is not used in the function that solves optimal control problems. It is used, for instance, in the CTFlows.jl package here. I agree that there I will have to add a kwarg for the AD backend.

As for solving optimal control problems, we go through ADNLPModels.jl, and again we want the user to be able to choose the AD backend.
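
For illustration, a hedged sketch of how the rhs constructor quoted earlier could thread an AD backend down to ctgradient; it reuses the CTFlows.jl types from that snippet and assumes a backend keyword on ctgradient, so it is not the package's actual code:

using ADTypes: AbstractADType, AutoForwardDiff

function rhs(h::AbstractHamiltonian; backend::AbstractADType = AutoForwardDiff())
    function rhs!(dz::DCoTangent, z::CoTangent, v::Variable, t::Time)
        n      = size(z, 1) ÷ 2
        foo(z) = h(t, z[rg(1,n)], z[rg(n+1,2n)], v)
        dh     = ctgradient(foo, z; backend = backend)  # hypothetical backend kwarg
        dz[1:n]    =  dh[n+1:2n]
        dz[n+1:2n] = -dh[1:n]
    end
    return rhs!
end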

@jbcaillau (Member, Author)

@gdalle Agreed, thanks for the feedback. Actually, there is now a setter that allows users / devs to change the backend (globally and dynamically); it is also easy to add an optional kwarg to allow this anywhere it makes sense (solvers, etc.). We leave this issue open for further testing, e.g. for cases requiring a change of backend between first-order and second-order derivative computations.

On a side note: check this upcoming talk at JuliaCon 2024 (we'll also be around)

@gdalle (Contributor) commented Jun 21, 2024

Thanks for pointing out ADOLC.jl, we're already on the ball ;) See TimSiebert1/ADOLC.jl#7 to track progress.

ocots added the enhancement (New feature or request) label on Jul 26, 2024