-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pipe_input target #1173
pipe_input target #1173
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Reviewed everything up to 5441199 in 33 seconds
More details
- Looked at
838
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
4
drafted comments based on config settings.
1. hamilton/function_modifiers/macros.py:920
- Draft comment:
Ensureself.target
is notNone
before callingextend
on it. This can lead to aTypeError
ifself.target
isNone
. Consider initializingself.target
as an empty list if it'sNone
. - Reason this comment was not posted:
Comment did not seem useful.
2. hamilton/function_modifiers/macros.py:885
- Draft comment:
Initializeself.target
as an empty list ifon_input
isNone
to avoid issues when extending it later. - Reason this comment was not posted:
Marked as duplicate.
3. hamilton/function_modifiers/macros.py:939
- Draft comment:
Explicitly check ifself.namespace
is a string before using it in_resolve_namespace
method for clarity and to avoid potential issues. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The current implementation of_resolve_namespace
assumes that ifself.namespace
is not...
orNone
, it is a string. The suggestion to explicitly check if it is a string could add clarity and prevent potential issues ifself.namespace
is accidentally set to a non-string value. However, the current logic seems to handle the expected cases, and the suggestion might be considered overly cautious unless there is a known issue withself.namespace
being set incorrectly.
I might be overthinking the need for an explicit string check. The current logic seems to handle the expected cases, and the suggestion might be unnecessary unless there is a specific reason to expectself.namespace
could be set to a non-string value.
Explicitly checking for a string could prevent future bugs ifself.namespace
is accidentally set to an unexpected type, but it might be unnecessary if the current logic is sufficient.
The comment suggests a precautionary measure that might not be necessary given the current logic. However, it could prevent potential issues ifself.namespace
is set incorrectly. Consider keeping the comment if there's a history of such issues.
4. hamilton/function_modifiers/macros.py:857
- Draft comment:
Please add documentation for the newon_input
functionality in thepipe_input
decorator to the Sphinx documentation underdocs/
. This will help users understand how to use this feature effectively. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is about documentation, which is not directly related to the code changes in the diff. The rules specify not to ask for updates to the PR description or documentation unless it's part of the code changes. Since the comment is about documentation outside the code changes, it should be removed.
I might be overlooking the importance of having external documentation for new features, but the rules are clear about not commenting on documentation unless it's part of the code changes.
The rules are strict about not commenting on documentation unless it's part of the code changes, so the comment should be removed.
Remove the comment as it pertains to documentation outside the scope of the code changes in the diff.
Workflow ID: wflow_iN66poHVEvrsbEW2
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Left some thoughts -- I think this is valuable (E.G. the first parameter is a bit of an awkward assumption), but we might want to restrict the API a bit. Also some naming/consistency stuff.
""" | ||
|
||
def __init__( | ||
self, | ||
*transforms: Applicable, | ||
namespace: NamespaceType = ..., | ||
on_input: base.TargetType = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I'm not sure that on_input
makes sense for anything but a single string, and the semantics aren't quite the same. Some thoughts:
- By default, this applies to the first parameter
- Does
...
apply to everything?
IMO you should restrict it more -- either a string, or default (which is the first parameter). Also, @load_from
(which is similar) has the parameter named target
. But that's confusing, and probably not the best (maybe it should be called on_input
there?. Thoughts on naming?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I'm not sure that
on_input
makes sense for anything but a single string, and the semantics aren't quite the same. Some thoughts:
- By default, this applies to the first parameter
Yes, I left the default to chain the first parameter and keep the namespace as is for backwards compatibility 👍 .
- Does
...
apply to everything?
I think we can add ...
, but you could also just use the global on_input
and pass in a list with the parameters.
IMO you should restrict it more -- either a string, or default (which is the first parameter). Also,
@load_from
(which is similar) has the parameter namedtarget
. But that's confusing, and probably not the best (maybe it should be calledon_input
there?. Thoughts on naming?
Ah okay, so the idea I had in mind is to leave flexibility that you can run pipe_input
on more than one parameter: 3 different cases specifically
- You can set a global
on_input
and it will apply it to those input arguments, like if we want to run it on the second argument w-->_add_one --> _add_two --> 2nd arg of foo
def _add_one(v: int) -> int:
return v + 1
def _add_two(v: int) -> int:
return v + 2
@pipe_input(
step(_add_one).named("a"),
step(_add_two).named("b"),
on_input="w",
namespace="global",
)
def fool(v: int, w: int) -> int:
return w
- You can set a local
on_input
and choose which transform runs on which input argument, like v-->_add_one --> 1st arg of foo and w-->_add_two --> 2nd arg of foo.
def _add_one(v: int) -> int:
return v + 1
def _add_two(v: int) -> int:
return v + 2
@pipe_input(
step(_add_one).on_input("v").named("a"),
step(_add_two).on_input("w").named("b"),
namespace="local",
)
def fool(v: int, w: int) -> int:
return v+w
- You can set have a mixture
on_input
for global and local. The local adds specific transforms on top of what the global has. Here the whole pipeline gets applied tov
, but only _add_two tow
def _add_one(v: int) -> int:
return v + 1
def _add_two(v: int) -> int:
return v + 2
@pipe_input(
step(_add_one).named("a"),
step(_add_two).on_input("w").named("b"),
on_input="v"
namespace="local",
)
def fool(v: int, w: int) -> int:
return v+w
- The
None
defaults to first parameter and if someone setson_input
but not all thesteps
inpipe_input
get hit, we error out that is unclear what the user wants and he either needs to specify locally or globally or omit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jernejfrank -- this makes a lot of sense.
To sum up the offline conversation we had, we decided to restrict it to a single input (for now), knowing that we can turn it on for more. The reasoning is to reduce supported surface area until we get a request for this feature!
self.chain = _chain | ||
self.namespace = namespace | ||
|
||
if isinstance(on_input, str): # have to do extra since strings are collections in python | ||
self.target = [on_input] | ||
elif isinstance(on_input, Collection): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the use-case for a collection here? It's interesting, but feels a bit more complex than necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The collection was to apply it for a subset of params, here is an example of globally, but it also works locally:
Lets say we only want to pipe_input
first two params
def _add_one(v: int) -> int:
return v + 1
def _add_two(v: int) -> int:
return v + 2
@pipe_input(
step(_add_one).named("a"),
step(_add_two).named("b"),
on_input=["v","w"],
namespace="global",
)
def foo(v: int, w: int, z:str, .... kwargs) -> int:
return w
total_nodes = [] | ||
total_rename_maps = {} | ||
|
||
for param in parameters_transforms_map: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is getting complex -- we should probably break it out into a separate function and add a unit test or two for it...
5441199
to
9c79275
Compare
The API expanded substantially to leave as much flexibility in there as possible. It also is a complete dual to But I'm happy to constrain it more, like @elijahbenizzy suggested, to only allow only a single parameter and you pass it in as a string seems reasonable. In case you want more flexibility use def _transform():
...
def _other_transform():
...
@pipe_output(step(_transform)) # this won't work
def interesting_node():
...
@pipe_input(
step(_transform).on_input="interesting_node", # this should work
step(_other_transform).on_input("foo") # leaving flexibility to also transform other params
)
def downstream(foo, bar, baz, interesting_node,...):
...
def some_other_dependency(interesting_node):
# here you don't want to have the pipe transform happening just the original `interesting_node` input
... |
Maybe it would make sense to keep it like this because the subtle difference in the implementations between Specifically, for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Some small docs suggestions.
Also, let's create a ticket here. You've done so much for multiple already, we should track + link from the docs. Nice work!
@@ -345,7 +345,7 @@ def __init__( | |||
:param _resolvers: Resolvers to use for the function | |||
:param _name: Name of the node to be created | |||
:param _namespace: Namespace of the node to be created -- currently only single-level namespaces are supported | |||
:param _target: Selects which target nodes it will be appended onto. By default all. | |||
:param _target: Selects which target nodes it will be appended onto. Default None gets resolved on decorator level. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would explain this more -- E.G. reference the decorators/how they resolve it.
.. code-block:: python | ||
|
||
@pipe_input( | ||
step(_add_one).on_input(["p1","p3"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update docs -- we disabled this (for now), right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, I found a hack how to "comment out" things from rst
so Sphinx doesn't generate this part of the docstring.
""" | ||
|
||
def __init__( | ||
self, | ||
*transforms: Applicable, | ||
namespace: NamespaceType = ..., | ||
on_input: base.TargetType = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jernejfrank -- this makes a lot of sense.
To sum up the offline conversation we had, we decided to restrict it to a single input (for now), knowing that we can turn it on for more. The reasoning is to reduce supported surface area until we get a request for this feature!
The new name makes the functionality clearer.
Pipe_input can select which parameter to apply to - Added on_input to Applicable for user facing API - Corrected Applicable named and namespaced to pass through target - Global on_input for pipe_input
The implementation allows for global string input for a target. - Functionality is there for multiple global and local parameter targets, but we shortcircuit for the moment with an error if the arg is not a string. - At the moment all transforms in the pipe get applied to the same argument. - Test and docs have commented out the multiple parameter cases for easy enabling should we want this. - Added URL of the GitHub issue tracking interest to enable multiple params
2cdbb85
to
5188e08
Compare
Addresses #1161.
Default behaviour reverts back to first parameter chaining for backwards compatibility.
Changes
How I tested this
Notes
There are two differences to
on_output
:on_input
settings. I made it so that the local (on the level of Applicable) adds to the global one (on the level of pipe_input). The other option would be to allow the local to override the global one. I also added a safeguard that ifon_input
is used, all transforms need to have a target to have it clear.Checklist