Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zygote support broke in new release #298

Closed
avik-pal opened this issue Jul 16, 2024 · 9 comments · Fixed by #315 or #324
Closed

Zygote support broke in new release #298

avik-pal opened this issue Jul 16, 2024 · 9 comments · Fixed by #315 or #324

Comments

@avik-pal
Copy link
Member

See https://github.com/LuxDL/LuxLib.jl/actions/runs/9965073839/job/27534650459?pr=86#step:6:2929. Started showing up on update.

Probably needs a @non_differentiable on methods

@sathvikbhagavan
Copy link
Member

@SouthEndMusic, can you look into this as you were working on #289?

@SouthEndMusic
Copy link
Member

It's hard to deduce from the link what exactly is failing. Which derivatives are being computed? What is desired behavior?

From working on #289 I know that differentiating w.r.t. A.u is broken in the new release, which is because of the caching of parameters at the initialization of the interpolation objects. In the discussion about that caching in #271 we came to the conclusion that mutation of A.t and A.u should not be allowed because then this data gets out of sync with the cached parameters. But updating A.u in place for optimization is a very sensible use case (see also #83). Initializing a new interpolation object for each iteration (as I do here) is now slow because of allocation for the cached parameters. For that use case it might be better be able to set cache_parameters = false.

Regarding #289: I've managed to get the Zygote gradients working again by writing custom ChainRulesCure.rrule for the parameter cache constructors. I have no idea whether this is the right direction to go in, especially with @ChrisRackauckas comments in mind that most effort should be put in Enzyme support.

@ChrisRackauckas
Copy link
Member

If the rule exists then we should add it and use it. I just said I personally wasn't going to spend much timing optimizing Zygote rules, but we shouldn't regress Zygote support.

@avik-pal
Copy link
Member Author

avik-pal commented Aug 5, 2024

I am still seeing some failures after updating to v6 https://github.com/LuxDL/Boltz.jl/actions/runs/10240025636/job/28326309967?pr=45#step:6:716

@ChrisRackauckas
Copy link
Member

Interesting, @SouthEndMusic is this a different case then the ones you looked at?

@SouthEndMusic
Copy link
Member

Could be, I only looked at u::AbstractVector{<:Real}, t::AbstractVector{<:Real}.

@SouthEndMusic
Copy link
Member

@avik-pal can you specify what type of u is used in these tests?

@avik-pal
Copy link
Member Author

avik-pal commented Aug 6, 2024

It is a vector of arrays

@SouthEndMusic
Copy link
Member

I made a PR and made it work in the case that u is a matrix, couldn't make it work yet with u a vector of arrays. Maybe someone else can have a look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants