Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: different dispatch instances on same function #186

Closed
wants to merge 1 commit into from

Conversation

nstarman
Copy link
Contributor

No description provided.

@nstarman nstarman marked this pull request as ready for review July 20, 2024 16:38
@coveralls
Copy link

coveralls commented Jul 20, 2024

Pull Request Test Coverage Report for Build 10133091731

Details

  • 54 of 66 (81.82%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.9%) to 98.871%

Changes Missing Coverage Covered Lines Changed/Added Lines %
plum/dispatcher.py 54 66 81.82%
Totals Coverage Status
Change from base Build 9969956439: -0.9%
Covered Lines: 1314
Relevant Lines: 1329

💛 - Coveralls

@wesselb
Copy link
Member

wesselb commented Jul 22, 2024

Hey @nstarman!

Thanks for creating a PR. :) I'm not entirely sure about this change. For a function, function._f corresponds to the method that was given whenever the function was created. Here's an example:

from number import Number


@dispatch.abstract
def add(x: Number, y: Number):
    """Add two numbers.

    Generic docstring.
    """


@dispatch
def add(x: int, y: int):
    return x + y


# At this point, `add._f` is the function given in the abstract implementation.

I'm therefore worried that this might result in unexpected behaviour where people expect the attribute _f to be something that it isn't.

In general, I think passing a function to @dispatch is not correct. Instead, one should use @dispatch to define methods. Hopefully this makes sense, but what I'm trying to say is that a function is a generic function name, like add, and a method is a specific implementation of a function with a type signature, like add(x: int, y: int). In particular, a function does not have a type signature, because it chooses which signature to use whenever the function is called (the process of dispatching). Therefore, giving a function to @dispatch wouldn't be right, because there is no well-defined type signature.

Instead, I'm thinking that this instead we should give an error. What would you think about this?

@nstarman
Copy link
Contributor Author

nstarman commented Jul 25, 2024

I'm not sure I see what you mean. This PR is about making it possible to do

@dispatch2
@dispatch1
def f(x: int, y: float):
   return x + y

where dispatch1 and dispatch2 are different instances of Dispatcher.

I don't think this approach suffers from the problem you outline. The following should (correct me if I'm wrong) find the correct method.

I did discover a fault. Adding in two dispatch.abstract in the test suite...

dispatch1 = Dispatcher()
dispatch2 = Dispatcher()

@dispatch1.abstract
def f(x: Number, y: Number):
    return x - 2 * y

@dispatch2.abstract
def f(x: Number, y: Number):
    return x - y

@dispatch2
@dispatch1
def f(x: int, y: float):
    return x + y

@dispatch1
def f(x: str):
    return x

ns1 = SimpleNamespace(f=f)

@dispatch2
def f(x: int):
    return x

ns2 = SimpleNamespace(f=f)

assert ns1.f("a") == "a"
assert ns1.f(1, 1.0) == 2.0

assert ns2.f(1) == 1
assert ns2.f(1, 1.0) == 2.0

Means that method, _ = method.resolve_method(signature) raises an error because it's trying to find the abstract method.
What's a better way of getting dispatch2 to see the correct method (function with specific signature) when the dispatches are nested?

@wesselb
Copy link
Member

wesselb commented Jul 26, 2024

@nstarman What would you think of changing the pattern to something like this?

@dispatch1.chain(dispatch2)
def f(x: int, y: float):
    return x + y

Then, internally, the chain method would simply call dispatch1(f) and dispatch2(f), so there wouldn't be any unwrapping required. It's slightly less nice syntax wise, but perhaps acceptable?

@nstarman
Copy link
Contributor Author

nstarman commented Jul 26, 2024

Interesting! Or what about having a DispatcherBundle class such that it's

@(dispatch1 | dispatch2)
def func(...): ...

Where

class AbstractDispatcher: ...

class Dispatcher(AbstractDispatcher):
    def __or__(self, other: Dispatcher) -> DispatcherBundle: ...

class DispatcherBundle(AbstractDispatcher):
    def __call__(...):
        for dispatcher in self:
            ...
    

I think the syntax is more Pythonic. And then the bundle object is reusable and looks like the regular dispatcher. Moving most of the logic to a common base class would mean this is a smallish code addition!

Alternatively the or operator could just return a call chain function, though this prevents chaining e.g multi.

@wesselb
Copy link
Member

wesselb commented Jul 26, 2024

@nstarman Aha, I like that even better, since now the syntax is symmetric and indeed more Pythonic! Would be very happy to go for that. :)

@nstarman nstarman force-pushed the fix-dispatch-stacking branch from d24eee6 to 0793cf8 Compare July 28, 2024 15:35
@nstarman nstarman marked this pull request as draft July 28, 2024 15:35
@nstarman nstarman force-pushed the fix-dispatch-stacking branch from 0793cf8 to 2993d12 Compare July 28, 2024 15:38
@nstarman nstarman force-pushed the fix-dispatch-stacking branch from 2993d12 to 2583d45 Compare July 28, 2024 16:15
@nstarman
Copy link
Contributor Author

Just need to add a guard for python 3.8 since @(...) syntax is not supported until py3.9+.
As a side note, when were you planning to drop 3.8 support?

@PhilipVinc
Copy link
Collaborator

If I can share a thought about DispatcherBundle and the double registration of functions...

I would caution against including such features into plum itself, and would rather refactor and open up the internals in a reasonable way that makes it easy to build such things on top organically and without fragmenting the ecosystem.

One of the reasons is that multiple dispatch in python is already somewhat 'weird' and not native. Plum is trying to make it possible and reasonable, but there's already a long way to go to get there. While other languages offer great inspiration and carefully crafted ideas or frameworks, we lack a coherent view on how to get a 'reasonable MD that works and is intuitive' in Python.

However, I do not think that adding many features to plum itself is the best way to get there. What is proposed in this PR is an exotic use case that is intricated.
Already MD is 'hard to understand' to the average python user, and dispatcher bundles seem to me an even more complicated concept never seen in another language and that hides intent even more!
While I understand that it might make sense in your particular project* , I do not think that those untested ideas belong to plum itself, where they might confuse users even more and push them in directions I personally do not think we should endorse.

At the same time, I would like plum to foster experimentation of such 'crazy ideas', because maybe at some point we might land on the 'right approach' to MD in Python. To do that, I think that a reasonable part of plum's internal should be expandable and reusable such that you'd be able to build this feature in your own library, on top of plum.

In short, I think this is a strong idea about how MD should work, going in a step that does not belong to MD (multiple dispatchers for the same functions).

  • I really have never seen such an exotic idea and I think that if it was never attempted, there might be a reason. I would strongly suggest to think twice about such a complicated approach and try to take a few steps back and solve it differently.

@wesselb
Copy link
Member

wesselb commented Jul 30, 2024

@PhilipVinc, you make some solid arguments! @nstarman, what do you think?

In this case, I think that the functionality could be fairly easily implemented with a third-party function resulting in something like this:

@chain_dispatchers(dispatch1, dispatch2)
def f(…):
    …

@nstarman, would you also be happy with a solution like this, or do you think this functionality should belong in Plum? If the latter, possibly we could organise more experimental functions that are not to be used by the average user in e.g. plum.experimental, unless @PhilipVinc thinks that we should consider doing not even that.

@nstarman
Copy link
Contributor Author

nstarman commented Jul 30, 2024

I think @PhilipVinc raises good points!
Without language changes, MD in Python will be implemented by on and by objects of which there can be multiple instances. I agree this is not ideal (and hopefully not too common) and that we don't necessarily want to add functionality to make this more visible.
I'm happy for chain_dispatchers to be in my 3rd party code, but IMO given that this MD is in Python and done via objects, functionality for working with multiple of those objects would be good to have in a plum.experimental.

@wesselb
Copy link
Member

wesselb commented Jul 30, 2024

Happy to go with plum.experimental if @PhilipVinc is also happy with that!

@PhilipVinc
Copy link
Collaborator

Experimental is ok for me.

However I must say I disagree with this vision that 'MD' is implemented through 'objects'.
That's just an implementation detail in my eyes, and we could easily make the Dispatcher a singleton.

In fact, a question I never asked @wesselb is: why is not a singleton? What was the use case you had in mind for multiple dispatchers?

I fail to see in which case it would be useful to have multiple dispatcher objects, and in particular why you want to use it with the same functions?

For my own curiosity, could you let me know what is your use case?

@nstarman
Copy link
Contributor Author

@PhilipVinc. For me this is to support the multiple different versions of the same function that exist in NumPy and other array libraries during the Array API transition. For example jax.numpy.arange and jax.experimental.array_api have different signatures and requirements.
See unxt and quaxed/array_api and quaxed/numpy
This specific situation will eventually be resolved once array-api is fully adopted.

@wesselb
Copy link
Member

wesselb commented Jul 30, 2024

@PhilipVinc Multiple dispatchers are required to emulate something like how scoping works in Julia. For example, in Julia, if package A defines a function f but does not export it, then you can define your own function f without the methods defined in A. Only once A exports f become these methods visible in your local scope. With a singleton dispatcher, all methods for function f would be visible to everyone and you could accidentally override methods or extend functions that weren’t supposed to be public.

@nstarman If chain_dispatchers is only required temporarily, maybe its better to define it in your package. If you think it’s useful long term, then plum.experimental sounds great. :)

@nstarman nstarman closed this Nov 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants