Iteration over params(m) in explicit mode gives no gradient #2091

Closed · mcabbott opened this issue Oct 20, 2022 · 4 comments · Fixed by #2118

mcabbott (Member) commented Oct 20, 2022

The use of params within explicit-mode gradients has been broken by #2054. Sadly, there were no tests of this:

julia> m = (x=[1,2.0], y=[3.0]);

julia> gradient(m -> (sum(norm, Flux.params(m))), m)
((x = [0.4472135954999579, 0.8944271909999159], y = [1.0]),)  # before #2054 (earlier Flux 0.13 patch)
(nothing,)  # after #2054 (later Flux 0.13 patch)

julia> gradient(() -> (sum(norm, Flux.params(m))), Flux.params(m))
Grads(...)

julia> ans[m.x]  # unchanged
2-element Vector{Float64}:
 0.4472135954999579
 0.8944271909999159

(This form really is very annoying. Suggestions are fine, but required textboxes seem to be solving a problem we didn't have. Also, it seems to break links to issues, like #2054 above. Edited to delete the boilerplate.)

mcabbott added the bug label Oct 20, 2022
ToucheSir (Member) commented Oct 20, 2022

How much do we care about this working? I'm not sure I have the wherewithal to figure it out while not regressing Flux.

> This form really is very annoying. Suggestions are fine but required textboxes seem to be solving a problem we didn't have...

We (or really GitHub) provide an escape hatch: you can still open a blank issue via the “Don’t see your issue here? Open a blank issue.” link at the bottom of the new-issue page.

mcabbott (Member, Author) commented:

IDK, it's the officially documented way to add a regularisation term: http://fluxml.ai/Flux.jl/stable/models/regularisation/.

Ideally that would move to something like FluxML/Optimisers.jl#57. But if the goal is to provide a soft transition to explicit mode, so that you can adopt the new API gradually during 0.13 rather than all at once on 0.14, then I think the old way ought to keep working a bit longer?
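
For context, the documented pattern is roughly the following. This is a minimal sketch, not the docs page verbatim; the Dense model and mse loss here are stand-ins:

using Flux, LinearAlgebra

m = Dense(3 => 2)  # any Flux model

# the documented idea: sum a norm over all trainable parameters
penalty(m) = sum(norm, Flux.params(m))

loss(m, x, y) = Flux.Losses.mse(m(x), y) + penalty(m)

# explicit-mode gradient; on affected versions the params(m) term
# contributes `nothing` instead of a gradient for the model
g = gradient(m -> loss(m, rand(Float32, 3), rand(Float32, 2)), m)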

mcabbott transferred this issue from FluxML/Zygote.jl Oct 20, 2022
ToucheSir (Member) commented:

Is there a smart rule we could write which avoids #2054? I think the main consideration would be how to map the trainable-parameter pullbacks back onto the original tree structure and the final params vector. Maybe maintaining an auxiliary tree structure with indices at the leaves?
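
Concretely, such an index tree might be built with Functors' fmap, along these lines (a rough sketch only; it assumes every array leaf is a trainable parameter, and real code would need to respect `trainable`):

using Functors

# Sketch: replace each array leaf of the model with its index in
# traversal order, keeping the surrounding tree structure intact.
function index_tree(m)
    i = Ref(0)
    return fmap(x -> (i[] += 1), m; exclude = x -> x isa AbstractArray)
end

index_tree((x = [1, 2.0], y = [3.0]))  # gives (x = 1, y = 2)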

mcabbott (Member, Author) commented:

This is surely possible; someone just has to write one more Functors utility which builds this gradient tree, as the rrule for params(m).
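
The tree-rebuilding half could look something like this (a sketch only, not necessarily how #2118 implements it; `rebuild_grad` and the IdDict-of-cotangents interface are hypothetical):

using Functors

# Hypothetical helper: given the model and an IdDict mapping each
# parameter array to its cotangent, rebuild a nested gradient with
# the same structure as the model, i.e. the shape explicit mode expects.
function rebuild_grad(m, deltas::IdDict)
    return fmap(x -> get(deltas, x, nothing), m;
                exclude = x -> x isa AbstractArray)
end

m = (x = [1, 2.0], y = [3.0])
deltas = IdDict(m.x => [0.447, 0.894], m.y => [1.0])
rebuild_grad(m, deltas)  # (x = [0.447, 0.894], y = [1.0])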
