Make params non-differentiable (Closes #2040 & #2048) #2054
Conversation
@ToucheSir can you enable the downstream tests for this PR?
FastAI failure is innocuous and known. Metalhead I think too (cc @darsnack to confirm it's just the ResNet weights loading). GeometricFlux I'm not sure about, but there don't appear to be any AD calls in https://github.com/FluxML/GeometricFlux.jl/blob/master/test/layers/graphlayers.jl. @yuehhua can you confirm?

Edit: I forgot AtomicGraphNets in the CI jumble, but the CI there has failed for all other current Flux PRs which run downstream tests, so I don't think it's related.
The failure in GeometricFlux is not related to AD, so this PR can go ahead. I will check the error further.
src/functor.jl (outdated)

```diff
@@ -88,6 +88,9 @@ function params(m...)
   return ps
 end

+# Allows caching of the parameters when params is called within gradient()
```
Can you add a quick mention of #2040 here? Otherwise this looks good to go!
Just pushed with the edited comment.
Thanks!
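For context, here is a minimal sketch of the kind of declaration this hunk is about, assuming ChainRulesCore's `@non_differentiable` is the mechanism (the authoritative code is the PR diff itself; in the PR this sits next to the `params` definition in src/functor.jl):

```julia
using ChainRulesCore
import Flux: params

# Allows caching of the parameters when params is called within gradient(); see #2040.
# Declaring params non-differentiable means Zygote (via ChainRules) does not trace into
# its internals, so the Params collection built inside a differentiated closure can be
# treated as a constant and cached.
ChainRulesCore.@non_differentiable params(::Any...)
```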
Unfortunately this breaks the use of `Flux.params` inside explicit-style `gradient` calls:

```julia
julia> m = (x=[1,2.0], y=[3.0]);

julia> gradient(m -> (sum(norm, Flux.params(m))), m)
((x = [0.4472135954999579, 0.8944271909999159], y = [1.0]),)  # before, [email protected]
(nothing,)                                                    # after, [email protected]

julia> gradient(() -> (sum(norm, Flux.params(m))), Flux.params(m))
Grads(...)

julia> ans[m.x]  # unchanged
2-element Vector{Float64}:
 0.4472135954999579
 0.8944271909999159
```
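(Illustration, not from the thread: under the new behaviour, an explicit gradient of such a structure can still be taken by differentiating the fields directly rather than going through `params`.)

```julia
julia> using Flux, LinearAlgebra

julia> m = (x = [1, 2.0], y = [3.0]);

julia> gradient(m -> norm(m.x) + norm(m.y), m)  # no params call, so still differentiable
((x = [0.4472135954999579, 0.8944271909999159], y = [1.0]),)
```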
This is the new pull request I mentioned in #2048 to allow `params()` calls from within `gradient()` to be cached by making `params()` non-differentiable. This would close issue #2040 and supersede pull request #2048.

Might be worth turning on downstream tests like in the other PR.