allow non-tuple data in the new train! #2119
Conversation
I think this maybe-splat behaviour is a little odd, but I see the argument for keeping it in the name of making upgrades easier. The comments below are only on this.
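For concreteness, the maybe-splat rule under discussion can be sketched like this (a toy model of the old `train!`'s `batchmemaybe`, not Flux's actual source; the `loss1`/`loss2` names and data are invented for illustration):

```julia
# Toy model of the maybe-splat rule: a Tuple drawn from the data iterator
# is splatted into the loss, while any other item is passed through whole.
batchmemaybe(x) = x isa Tuple ? x : (x,)

loss2(x, y) = x + y        # written for splatted tuple data
loss1(d) = d.x * d.y       # written for one non-tuple item (a NamedTuple here)

tuple_data = [(1, 2), (3, 4)]
nt_data    = [(x = 2, y = 5)]

sum(d -> loss2(batchmemaybe(d)...), tuple_data)  # tuples are splatted: 3 + 7 = 10
sum(d -> loss1(batchmemaybe(d)...), nt_data)     # NamedTuple passed whole: 2 * 5 = 10
```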
Besides the headline change, this PR not only re-introduces callbacks but introduces a larger, more complicated callback scheme. Do we want this, or should we push people towards writing a for loop?
I simplified the scheme; compared to the old scheme it is more useful and has the same complexity.
I agree that a callback which gets some details is more useful; the old-style ones, where you wrap the global model, seem more in keeping with the implicit style. Passing this as one NamedTuple is probably a better choice than splatting it as keywords, as your function can then take what it needs. One downside is that many more things then need names, in code not just in docs. The present list is Maybe it would be clearer to call this
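The NamedTuple-versus-keywords trade-off above can be illustrated in miniature (field names like `:epoch` and `:loss` are assumptions for this sketch, not Flux's actual state):

```julia
# Invented loop state for illustration; not Flux's actual callback payload.
state = (epoch = 1, loss = 0.25)

# With one NamedTuple argument, a callback takes only what it needs:
cb_nt(s) = "epoch $(s.epoch): loss = $(s.loss)"

# Splatted as keywords, every callback must name (or slurp) each field:
cb_kw(; loss, kwargs...) = "loss = $loss"

cb_nt(state)        # "epoch 1: loss = 0.25"
cb_kw(; state...)   # "loss = 0.25"
```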
Regarding callbacks, I think it's more of a design/community issue than anything else. The proposed callback scheme does provide all the state in the loop to the user. But there is already a difficulty: as written, the callback happens before the optimizer update; wouldn't it make more sense to run it after?
This is why I describe it as a community issue: the real problem is preventing feature creep. (2) does this neatly, whereas (1) will always have users requesting more.
Re before/after, worth remembering the Whereas
My reasoning was that by executing the callback before the update, you are able to implement things such as gradient clipping.
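A sketch of that argument: a callback running before the update can mutate the gradient, e.g. to clip it. Plain arrays stand in for parameters and gradients here; nothing in this snippet is Flux API.

```julia
# "Callback" step: clip each gradient entry into [-thresh, thresh] in place.
clip!(g, thresh) = map!(x -> clamp(x, -thresh, thresh), g, g)

g = [0.5, -3.0, 10.0]
clip!(g, 1.0)            # g is now [0.5, -1.0, 1.0]

w = [1.0, 1.0, 1.0]
w .-= 0.1 .* g           # the update step then sees the clipped gradient
```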
For simple clipping we of course have `ClipGrad`. Do we have good examples of callbacks being useful in the wild? Every single use in the model zoo appears to be printing the loss: https://github.com/FluxML/model-zoo/search?q=cb. We could just build that in.
I removed the callback addition from this PR as it is controversial.
Co-authored-by: Kyle Daruwalla <[email protected]>
Follow up to #2082 with two changes to the new `train!`:

- allow non-tuple data (the `batchmemaybe` function of the old `train!`), according to Add explicit `train!`, unify `update!`, and auto-translate the two `Adam`s #2082 (comment)
- support callbacks

I can factor out the second change if deemed controversial. Edit: removed the callback addition.
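Putting the headline change together, a toy version of a `train!` loop that accepts both tuple and non-tuple items could look like the following. This is illustrative only, not Flux's implementation: `toytrain!` and `step!` are invented names, and the gradient/optimiser machinery is replaced by a plain callback.

```julia
# Toy train! loop showing the headline change: splat tuples into the
# loss, pass any other data item through whole.
function toytrain!(loss, model, data, step!)
    for d in data
        l = d isa Tuple ? loss(model, d...) : loss(model, d)
        step!(model, l)   # stand-in for the gradient/optimiser update
    end
end

loss(m, x, y) = m * x - y          # method for splatted tuple data
loss(m, d) = loss(m, d.x, d.y)     # method for a single non-tuple item

losses = Float64[]
toytrain!(loss, 2.0, [(1.0, 0.5), (x = 2.0, y = 1.0)],
          (m, l) -> push!(losses, l))
losses  # [1.5, 3.0]
```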