Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate walks out from fmap and add #39 to fcollect #43

Merged
merged 7 commits into from
Nov 4, 2022

Conversation

darsnack
Copy link
Member

@darsnack darsnack commented Oct 6, 2022

This is a non-breaking step towards separating all the keywords out of fmap. It was a side-effect of working on #41, so I separated the changes out to be evaluated independently. For #41, I am considering a more significant, but orthogonal, set of changes. Do we like this refactor?

Actually, this does seem to be breaking, since the walk keyword can no longer be an arbitrary function. I only introduced AbstractWalk to get around the method ambiguity error for fmap. Don't see another way around it.

@ToucheSir
Copy link
Member

Looks solid. Test fixes aside, perhaps we could wrap up #39 quickly and reuse the logic from that PR in CachedWalk/CollectWalk? It would be nice to finally decouple fmap and caching :)

@mcabbott
Copy link
Member

mcabbott commented Oct 8, 2022

Have not read this closely, but if I understand right on first look, it doesn't significantly change the logic of basic case?

Am curious whether you see such changes allowing all the weird cases in Optimisers.jl to be better abstracted.

Re breaking, I think you're saying that the method fmap(walk::AbstractWalk, f, x, ys...) wants a type to distinguish from fmap(f, x, ys...). But is there any need to make these both the same function? The first could be _fmap, maybe that's clearer anyway, and then it could allow walk::Any.

(But breaking isn't a huge deal, according to https://juliahub.com/ui/Packages/Functors/lER0H/0.3.0?page=2 only 5 direct deps, although that misses Flux somehow.)

@ToucheSir
Copy link
Member

ToucheSir commented Oct 9, 2022

But is there any need to make these both the same function? The first could be _fmap, maybe that's clearer anyway, and then it could allow walk::Any.

Thinking about the deprecation cycle for cache=<not Nothing> in fmap, I think it would make sense for this to be part of the public API. Could walks wrap the callback function (i.e. recurse)? Then both methods would have the same arity.

@darsnack
Copy link
Member Author

it doesn't significantly change the logic of basic case?

Yeah that was the goal here. I wanted to refactor the code as part of logical changes, so I pulled the refactoring out separately. Here the refactor is only a cleaner way of implementing the original logic.

Re breaking, I think you're saying that the method fmap(walk::AbstractWalk, f, x, ys...) wants a type to distinguish from fmap(f, x, ys...). But is there any need to make these both the same function? The first could be _fmap, maybe that's clearer anyway, and then it could allow walk::Any.

I actually want to move towards fmap(walk, f, x, ys...) and ditch the keywords in the long run. It's definitely longer to write it out this way, but I think the logic is clearer to understand than explaining how a bunch of keywords interact in the docstring of fmap. I only left fmap(f, x, ys...; kwargs...) for backwards compatibility.

I think it should be possible to maintain backwards compatibility and support both modes. We can have the walk keyword get wrapped in a AnonymousWalk struct (internal). Or we can just choose to be breaking. I think there are fewer uses of custom walk functions than dependents of Functors.jl. And porting existing ones is fairly simple since the interface is the same.

Could walks wrap the callback function (i.e. recurse)?

Part of what makes the walks composable is that DefaultWalk does not need to know that it's recursion calls back into CachedWalk. You could have an outer wrapper that holds a walk and a recursion function together. I think then we start getting to #32 which could be a good thing. I'd be willing to take this PR in that direction if it is desired. The initial goal here was to be as close to the original code as reasonable.

Am curious whether you see such changes allowing all the weird cases in Optimisers.jl to be better abstracted.

Only in part. The walk function here is a specification for starting at Node A in the tree and recursing into its children. I am thinking about it as a rule. You could certainly replace recurse with two functions—ascend and descend—that allows the walk function to choose to walk back up the tree. In this way, I think you could specify the rule for finding shared nodes which Optimisers.jl wants. I don't know how much clearer or more efficient that is compared walking the tree multiple times.

One thing that already works is writing a walk that holds external information (e.g. CachedWalk). Then the gradient cache from FluxML/Optimisers.jl#106 can be written more cleanly as a walk struct with a cache.

@mcabbott
Copy link
Member

mcabbott commented Oct 11, 2022

Re breaking, I marked #39 as being v0.4, although I'm not entirely sure we must.

Maybe 0.4 should contain both changes? No great rush for #39. And perhaps it's worth adding the AnonymousWalk idea, maybe it prints a depwarn... then 0.4 will in practice probably not break anyone's code.

Also, thoughts on whether FluxML/Flux.jl#1932 ought to wait for any changes here?

@darsnack
Copy link
Member Author

And perhaps it's worth adding the AnonymousWalk idea, maybe it prints a depwarn... then 0.4 will in practice probably not break anyone's code.

This is now in there. I think we can release this and #39 together in 0.4 if that's desired. Otherwise, if you would rather wait for #39, that's fine too.

Also, thoughts on whether FluxML/Flux.jl#1932 ought to wait for any changes here?

Not on this, but my next planned change will make functor opt-out instead of opt-in (via ConstructionBase defaults). This means that @layer will be restricted to just pretty printing, so a more accurate name would be nice. It also makes @layer optional then instead of required.

@darsnack darsnack force-pushed the walks branch 2 times, most recently from fb81199 to f788256 Compare October 31, 2022 16:32
@darsnack
Copy link
Member Author

Okay this now has #39 changes as well. Should be ready for another review.

@darsnack darsnack requested a review from ToucheSir November 1, 2022 18:44
src/maps.jl Outdated Show resolved Hide resolved
This type only exists for backwards compatability and should be directly used.
Attempting to wrap an existing `AbstractWalk` is a no-op (i.e. it is not wrapped).
"""
struct AnonymousWalk{F} <: AbstractWalk
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
struct AnonymousWalk{F} <: AbstractWalk
struct AnonymousWalk{F<:AbstractWalk} <: AbstractWalk

Is there any reason not to constrain these type parameters, at least for now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason this type exists is to wrap F that isn't a AbstractWalk but still satisfies the AbstractWalk interface.

src/walks.jl Outdated Show resolved Hide resolved
@darsnack
Copy link
Member Author

darsnack commented Nov 3, 2022

Okay should be good?

@darsnack darsnack changed the title Separate walks out from fmap Separate walks out from fmap and add #39 to fcollect Nov 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants