Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add all-keyword constructors, much like @kwdef #160

Merged
merged 5 commits into from
Feb 8, 2024

Conversation

mcabbott
Copy link
Member

@mcabbott mcabbott commented Sep 12, 2023

Allows this:

julia> AdamW(lambda=333e-3)
OptimiserChain(Adam(0.001, (0.9, 0.999), 1.0e-8), WeightDecay(0.333))

Not entirely sure it's worth the complexity & extra documentation.

(Note that adjust! already exposes field names, so in a sense they are already API.)

Checklist

  • Tests are added
  • Documentation, if applicable

@CarloLucibello
Copy link
Member

I would really like to see this through. For instance, I hate having to copy and paste the default values of beta1 and beta2 from the docstring of AdamW and write AdamW(1e-3, (0.9, 0.999), 1e-4) instead of AdamW(eta=1e-3, decay=1e-4)

@mcabbott
Copy link
Member Author

mcabbott commented Oct 9, 2023

Sorry I missed your comment here. But AdamW precisely my motivating case too. Maybe we should do it?

@ToucheSir
Copy link
Member

No objections from me, though I would put in pin in un-greeking some of the common shared hyperparam names for the next major release.

prominent direction, in effect dampening oscillations.
- Machine epsilon (`ϵ`): Constant to prevent division by zero
- Machine epsilon (`ϵ == epsilon`): Constant to prevent division by zero
(no need to change default)
- Keyword `centred` (or `centered`): Indicates whether to use centred variant
of the algorithm.
Copy link
Member Author

@mcabbott mcabbott Oct 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should these be documented?7706ffd is one suggestion.

We could also remove all greek glyphs from the docstrings, in favour of ascii field names. Some formulae would get a bit longer, or perhaps those few could show both.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe keep the ascii version in the part of the docstring that lists the actual parameter name, then use a latex-ified greek characters in the description of what the parameter does if applicable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow.

Goal of the current style is partly to be usable even if you don't know the names of greek letters. This was true before, as long as you can identify where the same scribble shows up.

Copy link
Member

@ToucheSir ToucheSir Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of how other libraries use embedded latex in their extended docstrings to describe what optimization rules are doing. e.g. https://pytorch.org/docs/stable/generated/torch.optim.AdamW.html#torch.optim.AdamW. Not something required for every field of every rule, but we don't have to purge greek characters from the docstrings completely.

Copy link
Member Author

@mcabbott mcabbott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should just do this.

Docstrings only mention it for a few. Perhaps we can revisit later how best to document this.

Tests fail because Yota is broken, apparently. Perhaps we can replace with Diffractor someday.

src/rules.jl Outdated Show resolved Hide resolved
@mcabbott mcabbott marked this pull request as ready for review February 2, 2024 17:09
@mcabbott mcabbott mentioned this pull request Feb 7, 2024
2 tasks
@mcabbott mcabbott changed the title RFC, add all-keyword constructors? Add all-keyword constructors, much like @kwdef Feb 7, 2024
@mcabbott mcabbott merged commit 1908a1c into FluxML:master Feb 8, 2024
3 of 4 checks passed
@mcabbott mcabbott deleted the all_kw branch February 8, 2024 04:08
mashu pushed a commit to mashu/Optimisers.jl that referenced this pull request Nov 14, 2024
* add all-keyword constructors

* update a few docstrings

* docstrings

* add tests

* one lost γ should be λ
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants