rm Flux.Zeros #1882

Merged: 7 commits, Mar 5, 2022
Conversation

mcabbott (Member):
This is the same change as #1407, but easier to replace than to rebase.

This year's motivation for killing Flux.Zeros is that its broadcasting rules are broken for what's needed in #1881. Last year, and the one before... well, there's a long history. The extent of this thing was reduced in #1379.

Closes #1407. Fixes #1392. Fixes #1402. Does not re-break #1332.

@mcabbott mcabbott mentioned this pull request Feb 19, 2022
Comment on lines -410 to -419
```julia
@testset "$b1 to $b2" for (b1, b2, be) in (
    (Flux.zeros32, Flux.ones32, Flux.ones32), # load ones as bias into a model with zeros as bias -> model gets ones as bias
    (Flux.ones32, nobias, Flux.zeros32),      # load Zeros as bias into a model with ones as bias -> model gets zeros as bias
    (nobias, Flux.ones32, nobias),            # load ones as bias into a model with Zeros as bias -> model bias does not change
)
    m1 = dm(b1)
    m2 = dm(b2)
    loadparams!(m1, b1 == nobias ? weights(m2) : pararray(m2))
    testdense(m1, be)
end
```
mcabbott (Member, Author):

These are the tests where I'm not quite sure what's being tested or what the intent is. Does anyone know?

Member:

I think the intent is if the model already has the bias as false (Zeros), then loading a numeric bias into it should be ignored. The opposite direction is not true, loading a false (Zeros) into a numeric bias should zero-out the entries of the numeric bias without disabling it.
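The semantics described can be sketched like this (a hypothetical illustration using Flux's public `Dense`, `params`, and `loadparams!` API with v0.13-style constructors; the exact behaviour is precisely what is under discussion in this thread):

```julia
using Flux

m_bias   = Dense(2 => 3)                # trainable weight and bias
m_nobias = Dense(2 => 3; bias = false)  # weight only, bias disabled

# Loading a numeric bias into a model whose bias is false: per the
# comment above, the numeric bias has nowhere to go and is ignored.
Flux.loadparams!(m_nobias, Flux.params(m_bias))

# The opposite direction: loading from the no-bias model into the
# biased one should, per the comment, zero out m_bias's bias entries
# without disabling them -- though whether tagged Flux actually does
# this is questioned further down in the thread.
Flux.loadparams!(m_bias, Flux.params(m_nobias))
```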

Member:

These are some special cases that could easily be handled in #1875.

mcabbott (Member, Author):

Ok, will fiddle a bit. But possibly test these facts in a less awful way.

mcabbott (Member, Author), Feb 19, 2022:

8acad3e adds some tests.

> loading a false (Zeros) into a numeric bias should zero-out the entries of the numeric bias without disabling it.

This does not happen on tagged flux, nor with this PR. I've marked it broken.

(The old tests have a comment "Load Zeros as bias to a model with ones as bias-> model gets zeros as bias" but, apparently, they don't actually test that. Which I take as evidence that nobody else can tell what on earth is being tested, or not tested, in these deeply nested special-cased loops. So we should replace them.)

> if the model already has the bias as false (Zeros), then loading a numeric bias into it should be ignored.

This is the current behaviour, and is unchanged here. But there is some chance it ought to be an error, at least when the source vector isn't zero.

As you say, these can be adjusted as desired with structural loadparams!, à la #1875.

Contributor:

Trying to make some amends for awful code :)

The tests are indeed testing some edge cases which one might want to error, e.g. loading params from a model without bias into one which has one. The behaviour of loadparams! was a hot topic in the previous iteration of no-bias (although I feel it might have been a proxy for some other, more ideological issue), and I just tried to test what the code kind of tried to do.

I think I was a bit worried that testing with just a Dense layer as the model would mask a lot of errors, just because the bias happened to be the last parameter in the list of parameters (and therefore easily ignored by zip). Using a Chain with multiple Dense layers makes the tests a bit more verbose, but I fully admit that I went overboard with helper functions (and "succinct" naming) here.

Furthermore, the test in 8acad3e which is marked as broken fails only because Params does not collect the bias, so I guess one can just replace it with e.g. [m0.weights, m0.bias].

I suppose this is all pretty moot if #1875 will anyway change the behaviour of loadparams! (hopefully to make it more fail-fast).

mcabbott (Member, Author):

Hah, well thanks for having a thick skin! Somehow I thought these tests were from the beginning of time... they certainly matched the surrounding code style perfectly.

I've changed the test to use a Chain. Now all the mismatched cases are errors, before and after the PR.

While Params claims to be an IdSet of parameters, in fact loadparams! relies on the order of its internal vector of parameters. I have no idea whether that's a bug or a feature; the whole thing is hopelessly under-specified. Nevertheless, Flux.params is clearly the public interface here, and using that, mismatched models in loadparams! gave an error before this PR, and still give an error after it.

You can make it do other things by constructing this internal vector of parameters yourself. You can put numbers in there:

```julia
julia> false in Params([[1 2; 3 4], false])
true
```

which was what #1731 wanted to make an error, since they don't fit with the IdSet picture at all. No doubt you can construct vectors which make loadparams! do almost anything, or fail at anything. Are we obliged to preserve any of that?

Contributor:

> they certainly matched the surrounding code style perfectly.

Nice save 🙂

> You can make it do other things by constructing this internal vector of parameters yourself.

Yeah, this is probably the key point w.r.t. this issue. While one can put false in there, it has no more relevance to "no bias" than putting in 13.44, and whether or not this should work in loadparams! is a question that does not belong here (I don't have a strong opinion, fwiw).

mcabbott (Member, Author):

Shall we do this? Maybe it's time.

codecov-commenter, Feb 19, 2022:

Codecov Report

Merging #1882 (d559685) into master (57ef5c0) will increase coverage by 0.13%.
The diff coverage is 64.28%.


```diff
@@            Coverage Diff             @@
##           master    #1882      +/-   ##
==========================================
+ Coverage   85.91%   86.04%   +0.13%
==========================================
  Files          19       18       -1
  Lines        1434     1419      -15
==========================================
- Hits         1232     1221      -11
+ Misses        202      198       -4
```

Impacted Files      | Coverage Δ
--------------------|---------------------------
src/deprecations.jl | 33.33% <0.00%> (-7.85%) ⬇️
src/layers/basic.jl | 79.03% <50.00%> (ø)
src/layers/conv.jl  | 80.32% <100.00%> (-0.22%) ⬇️
src/utils.jl        | 93.46% <100.00%> (ø)


@mcabbott mcabbott added this to the v0.13 milestone Feb 19, 2022
@mcabbott mcabbott force-pushed the nobias4 branch 2 times, most recently from f2f3c89 to 02f6594 Compare February 20, 2022 01:11
src/utils.jl (Outdated):

Return a bias parameter for a layer, based on the value given
to the constructor's keyword `bias=bias`.

  * `bias == true` creates a zero vector, of the same type as weights.
- * `bias == false` returns `Zeros()`, a special struct which exists only to encode the absence of bias.
+ * `bias == false` returns `false` now, which is understood by AD to be non-differentiable.
  * `bias::AbstractArray` uses the array provided, provided it has the correct size and eltype. If the type is wrong, it will be converted.
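A quick sketch of the three cases (hypothetical, using the Dense constructor's `bias` keyword as described above, in v0.13-style syntax):

```julia
using Flux

Dense(2 => 3)                           # bias = true (default): zero vector, same eltype as weight
Dense(2 => 3; bias = false)             # the literal false; AD treats it as non-differentiable
Dense(2 => 3; bias = ones(Float32, 3))  # explicit array; size (and, per the discussion below, eltype) must match
```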
mcabbott (Member, Author):

This line is a lie, it does not at present fix eltypes. I think I'd prefer it to insist they match, is there a use for differing types which isn't just a mistake?

mcabbott (Member, Author):

For now I edited the words to match the code.

@mcabbott mcabbott requested a review from darsnack March 5, 2022 04:52
```julia
@test sum(m1[1].weight) == 6 # written before error

# load into a model without bias -- should it ignore the parameter which has no home, or error?
@test_broken Flux.loadparams!(m0, Flux.params(m2))
```
Member:

I think being strict and erroring is fine. Also not worth thinking too much about this now as we phase out Params.

Co-authored-by: Carlo Lucibello <[email protected]>
darsnack (Member) left a comment:

Think it's better to deal with the loadparams! separately in #1875.

@darsnack darsnack mentioned this pull request Mar 5, 2022
@mcabbott mcabbott merged commit 0edf602 into FluxML:master Mar 5, 2022
@mcabbott mcabbott deleted the nobias4 branch March 5, 2022 16:39