Default Automatic Differentiation Choice #353

Closed
1 of 3 tasks
avik-pal opened this issue Jan 16, 2024 · 9 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@avik-pal
Member

avik-pal commented Jan 16, 2024

Forward AD

  • The non-sparse default AD should be AutoPolyesterForwardDiff (if that package is loaded). This would be similar to SimpleNonlinearSolve.

Reverse AD

  • For in-place problems, we default to AutoFiniteDiff. This is a really bad choice. We should default to: (conditional on the package being loaded)
    • AutoReverseDiff (after implementing the corresponding version in SparseDiffTools) for non-GPU versions
    • AutoEnzyme for all other cases
@ChrisRackauckas
Member

Should we just depend on Polyester and make it the default here?

> AutoPolyesterForwardDiff (if that package is loaded)

OrdinaryDiffEq.jl depends on Polyester, so you might get an odd interaction where some codes run better or worse depending on whether the ODE solver is loaded, and this would be invisible to many users.

> For in-place problems, we default to AutoFiniteDiff. This is a really bad choice. We should default to: (conditional on the package being loaded)

Why not Forward? Are you talking about a specific size of the Jacobian?

@avik-pal
Member Author

I am still debating whether to make AutoPolyesterForwardDiff the default. For the bruss problem we see a clear improvement, but for the battery problem there is a slowdown. I need to investigate this a bit to verify that it is not my code that is problematic.

@avik-pal
Member Author

> Why not Forward? Are you talking about a specific size of the Jacobian?

We could construct the full Jacobian and then compute the VJP, but the default was based on the implementations available in SparseDiffTools and was not updated after that. Currently we maintain the JacobianOperator in-house, so we can easily switch that as well.
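
As a rough illustration of the "construct the full Jacobian, then compute the VJP" route, here is a minimal, self-contained Python sketch. It is not NonlinearSolve.jl or SparseDiffTools code: the `Dual` class, `jacobian_forward`, `vjp_via_full_jacobian`, and the residual `f` are all hypothetical names made up for this example. It builds $J$ one column at a time with forward-mode dual numbers and then forms $J^T v$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Minimal forward-mode dual number (hypothetical helper, + - * only)."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule on the derivative part.
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Promote a plain number to a Dual with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x))


def jacobian_forward(f, x):
    """Full Jacobian of f at x, using one forward pass per input."""
    n = len(x)
    columns = []
    for j in range(n):
        # Seed direction e_j: derivative part is 1 for input j, 0 elsewhere.
        seeded = [Dual(xi, 1.0 if i == j else 0.0) for i, xi in enumerate(x)]
        columns.append([_lift(out).der for out in f(seeded)])
    m = len(columns[0])
    # columns[j][i] holds dF_i/dx_j; transpose into row-major J[i][j].
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def vjp_via_full_jacobian(f, x, v):
    """Compute J^T v by materializing J first (n forward passes)."""
    J = jacobian_forward(f, x)
    return [sum(J[i][j] * v[i] for i in range(len(v))) for j in range(len(x))]


def f(x):
    # Hypothetical residual: F(x) = [x0^2 + x1, x0 * x1]
    return [x[0] * x[0] + x[1], x[0] * x[1]]


J = jacobian_forward(f, [2.0, 3.0])
g = vjp_via_full_jacobian(f, [2.0, 3.0], [1.0, 1.0])
print(J)  # [[4.0, 1.0], [3.0, 2.0]]
print(g)  # [7.0, 3.0]
```

Note that this route needs one forward pass per input, so the cost of a single $J^T v$ grows with the problem size, whereas a reverse-mode backend obtains the same product in one pass.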

@ChrisRackauckas
Member

The problem is that if any other threads are in use, it's not going to be a speedup, since you'll lock the threads. This makes it pretty unsafe unless the user knows that Polyester is going to be used. That is why in OrdinaryDiffEq.jl it's always an opt-in (and maybe something we can make into an extension), and I think the same would need to be done here.

I think we should highlight it in documentation and tutorials much better than we do now, since indeed for any large enough problem it's a good idea, but it's hard to make something that bypasses hierarchical threading into a default.

@ChrisRackauckas
Member

> We could construct the full Jacobian and then compute the VJP, but the default was based on the implementations available in SparseDiffTools and was not updated after that. Currently we maintain the JacobianOperator in-house, so we can easily switch that as well.

Oh you're talking about the default vjp, for some line searches?

@avik-pal
Member Author

> Oh you're talking about the default vjp, for some line searches?

For some of the line searches, and also if you use a Krylov method like LSMR, which requires both $J^T v$ and $Ju$.
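
For reference, here is a short sketch of the two products in standard notation. The symbols $F$, $x$, $m$, and $n$ are assumptions introduced here for illustration, with residual $F:\mathbb{R}^n \to \mathbb{R}^m$ and Jacobian $J = \partial F/\partial x \in \mathbb{R}^{m \times n}$:

$$
Ju = \lim_{\epsilon \to 0} \frac{F(x + \epsilon u) - F(x)}{\epsilon},
\qquad
J^T v = \nabla_x \left( v^T F(x) \right), \quad u \in \mathbb{R}^n,\ v \in \mathbb{R}^m .
$$

$Ju$ is a directional derivative, which forward mode delivers in a single pass. $J^T v$ is the gradient of the scalar $v^T F(x)$ with $v$ held fixed, which reverse mode delivers in a single pass. A solver that only needs these two products never has to materialize $J$, so the forward and reverse defaults can be chosen independently.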

@avik-pal
Member Author

> That is why in OrdinaryDiffEq.jl it's always an opt-in (and maybe something we can make into an extension), and I think the same would need to be done here.

Do you have a link to the docs for that? We can make it consistent here.

@ChrisRackauckas
Member

It's not documented well, and it's used in a very different way: in some methods you can just set threads=PolyesterThreads(). We should highlight it in the docs and make it into a package extension, though.

@gdalle
Collaborator

gdalle commented Oct 31, 2024

I'm only seeing this now, but I want to highlight a caveat. In DI (DifferentiationInterface), ReverseDiff is the only package that supports constant arguments but is slowed down significantly by them. The reason is that you can no longer tape anything if the constants change in later function calls.

3 participants