parameterize decorator should add values to default_parameter_values #1124
Comments
To clarify, you wouldn't expect the following to work normally though, right?

```python
# this should not work
result = driver.execute(final_vars=['output_node'], inputs={"x": {"some": "value"}})
```

But I think I see where you're coming from now. The tricky part is: how would you handle this?

```python
x_value = {}
y_value = {}

@parameterize(
    output_node=dict(x=value(x_value)),
    output_node2=dict(x=value(y_value))
)
def fun(x: dict) -> dict:
    return x
```

Or is it the case that all of them have the same default value? In the interim you could do something functionally equivalent like this:

```python
@tag(marker="default")  # could identify this as such
def x() -> dict:
    return {}  # or reference some variable like you have above in the module

@tag(marker="parameterized")
@parameterize(output_node=dict(x=source("x")))
def fun(x: dict) -> dict:
    return x
```

Then:
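As a rough sketch, the markers could be read back from the variables' tags (my_module here stands in for the module holding the functions above):

```python
from hamilton import driver

import my_module  # placeholder for the module containing the tagged functions above

dr = driver.Builder().with_modules(my_module).build()
# pick out the nodes that were marked as carrying default values
defaults = [v for v in dr.list_available_variables() if v.tags.get("marker") == "default"]
print([v.name for v in defaults])
```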
|
This shouldn't work, indeed.
How do I handle it, or how would this be implemented? In terms of handling, this is precisely what I need: I have two nodes, output_node and output_node2, they return distinct values, and I can access those values via the respective node objects. Anyway, the default values are always different. If it helps, one can add ... By "compute on the fly" I mean, for example, the case where the callable, which is already stored on the node object, is a partial object and you can pull the info from there. In fact, if this is indeed the case and the callable is surfaced, I should be able to pull this info myself, at least in the interim. |
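For reference, this is the kind of inspection meant by pulling info from a partial object (plain Python, independent of Hamilton):

```python
from functools import partial

def fun(x: dict) -> dict:
    return x

bound = partial(fun, x={"some": "value"})
# a partial exposes both the underlying function and the bound arguments
print(bound.func)      # <function fun ...>
print(bound.keywords)  # {'x': {'some': 'value'}}
```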
So if you use:

```python
from hamilton.function_modifiers import parameterize, value

x_value = {"b": 2}
y_value = {"a": 1}

@parameterize(
    output_node=dict(x=value(x_value)),
    output_node2=dict(x=value(y_value))
)
def fun(x: dict, i: int) -> dict:
    return x

if __name__ == "__main__":
    from hamilton import driver
    import __main__ as main

    dr = driver.Builder().with_modules(main).build()
    # this is using internal APIs -- but yes, you have access to the bound variables
    list(dr.graph.nodes.values())[0].callable.keywords
    list(dr.graph.nodes.values())[1].callable.keywords
```

However, the more "blessed path" would be to go via:

```python
vars = dr.list_available_variables()
```

However, it seems that there's a 🐛 because the partial function isn't exposed in originating_functions.
|
We're inconsistent as to when originating_functions is populated, so this tries to fix it for the expander parameterize decorator at least.
With #1125, this would work: vars[1].originating_functions[1].keywords should have the partial function and what it is bound to. |
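Assuming the fix in #1125, that lookup could be generalized over all variables, roughly like this (dr is the driver built earlier):

```python
from functools import partial

for var in dr.list_available_variables():
    for fn in var.originating_functions or ():
        if isinstance(fn, partial):
            # the values that @parameterize bound for this node
            print(var.name, fn.keywords)
```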
OK, so it seems like the requirement is that we want to figure out the right way to expose this value. I think this clarifies that how we're doing it is not perfect IMO -- the concept of a "bound" variable is nice, and should be wired through. We do a ton of function-algebra (callables with other callables), but really in the case of parameterize/inject (and maybe even default values), having a bound kwarg concept would actually simplify the code. I'm uncomfortable with layering the keywords onto the function and inspecting that -- it's too much internals. Unless we make everything a partial. But the partial application could be done within the node's callable with the originating function (in most cases). That said, if each node had (1) a function and (2) its bound kwargs, we could pretty easily expose this all with a bit of refactoring -- can take a look soon. My take from Slack is that this is not an immediate blocker, more ergonomics, but let us know how it fits into your workflow! |
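As a rough illustration of the "bound kwarg" idea (purely hypothetical, not Hamilton's API), a node could carry its originating function and the fixed arguments explicitly instead of hiding them inside a partial:

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

@dataclass
class BoundNode:
    """Hypothetical sketch: a node that exposes its bound kwargs directly."""
    fn: Callable[..., Any]                                      # originating function
    bound_kwargs: Dict[str, Any] = field(default_factory=dict)  # values fixed by parameterize/defaults

    def __call__(self, **free_kwargs: Any) -> Any:
        # bound values are applied first; runtime inputs fill in the rest
        return self.fn(**self.bound_kwargs, **free_kwargs)
```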
Thank you guys. I appreciate your receptiveness and responsiveness. Take your time. A couple of notes just in case
|
Yep! We use |
But I wanted to know the name of the function parameter which the source binds to. This doesn't affect DAG execution, but it does affect the function-algebra stuff you mentioned earlier. |
Here is a specific example
Assume that 'fun_x_input' is an input node and it feeds into |
Hmm, interesting... So, on the algebra of what happens under the hood: in the current framework it's a bit hidden how exactly it wires to the underlying function. In fact, the only contract is that it will get passed in some way, not how it will get passed or that you can access it. But to get back to your point -- if you know the node you're after (fun here), then:

```python
all_vars = dr.list_available_variables()
fun_var, = [var for var in all_vars if var.name == "fun"]
deps = fun_var.required_dependencies | fun_var.optional_dependencies
user_inputs = [var for var in all_vars if var.is_external_input and var.name in deps]
# this should be the information you need, in the `source` case
```

Note this doesn't get the values bound with value(...). |
Are you saying there will be a |
Ahh, yes, sorry, didn't see that it was called o1. It should look like this:

```python
all_vars = dr.list_available_variables()
o1_var, = [var for var in all_vars if var.name == "o1"]
deps = o1_var.required_dependencies | o1_var.optional_dependencies
user_inputs = [var for var in all_vars if var.is_external_input and var.name in deps]
# this should be the information you need, in the `source` case
```

The fact that the function is named fun ... Curious -- does this solve your problem? Or do you need further inspection into the way it was defined... |
UPDATE: let me run your example to see what info I get. I still need to know the function name and signature and the exact argument it is wired to. One can argue that this info is encoded in the source name, but arguably this is a hack. Compare it to wrappers in Python, https://docs.python.org/3/library/functools.html#functools.update_wrapper. It does provide access to the wrapped function via __wrapped__. |
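For comparison, this is the standard-library behaviour being referenced: functools.wraps calls update_wrapper and leaves the original function reachable via __wrapped__ (plain Python):

```python
import functools

def deco(fn):
    @functools.wraps(fn)  # calls functools.update_wrapper under the hood
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper

@deco
def fun(x: dict) -> dict:
    return x

print(fun.__wrapped__)  # the original, undecorated function
print(fun.__name__)     # 'fun' -- metadata copied over by update_wrapper
```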
So I'm not 100% sure that these are analogous to standard decorators. The decorators actually all set an attribute on the function that is compiled later -- they don't modify the function at all. But the Hamilton graph does parse it. Some details here on how it is designed (for better or worse): by the time it gets to the Hamilton graph, it has gone through multiple transformations. There's a lot of stuff we do based on the different decorators -- you can see the general process here. This is the function (or series of functions, there are cases in which we may have multiple) that gets called eventually. See this field. So I think that does give you access to the originating function. But it's not as straightforward a mapping as a simple decorator/modifying wrapper -- quite a few other transformations can occur. This also doesn't have the bindings (as we discussed) -- those occur in the creation of the callable (used by the internal ...).

Otherwise, I'd be really curious as to how you're using the data. Most designs are built around a core set of assumptions that help us decide what should be internal/publicly exposed, and it's always interesting to know how those might not be true for everyone! So if you have an explanation of how you might use this/what consumes this, that would be very helpful 🙏 And curious if the code I showed you works (it should get at what you're attempting, I think...). |
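As a toy illustration of the "set an attribute on the function, compile it later" pattern described above (not Hamilton's actual internals; the names are made up):

```python
# a decorator that only annotates the function -- it does not wrap or modify it
def my_decorator(config):
    def annotate(fn):
        setattr(fn, "_my_config", config)
        return fn
    return annotate

@my_decorator({"expand_into": ["o1", "o2"]})
def fun(x: dict) -> dict:
    return x

# a separate "compile" step reads the attribute later and builds nodes from it
print(fun._my_config)  # {'expand_into': ['o1', 'o2']}
```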
I still need the function. What the code snippet gives me is the topology of the graph. The function is a property of the node. I might take different decisions depending on whether it's ...

Let me suggest an alternative approach. How about allowing users to attach free-form attributes to nodes? Much like tags, but free-form. Unlike tags, Hamilton should never be tasked with interpreting attributes, just with accumulating them along node expansion paths and eventually storing them on a HamiltonNode.

Example 1

```python
a_value = {}
b_value = dict(abc=[])
common_attrs = dict(a=a_value, b=b_value)

@parameterize(
    o1=dict(x=source('o1_x_input')),
    o2=dict(x=source('o2_x_input')),
)
@attr(common_attrs)
def fun(x: dict, i: int) -> dict:
    return x
```

Here I expect the HamiltonNodes o1.attributes and o2.attributes to each be {**common_attrs}.

Example 2

```python
o1_attrs = dict(name='o1')
o2_attrs = dict(name='o2')

@parameterize(
    o1=dict(x=source('o1_x_input'), None=value(o1_attrs)),
    o2=dict(x=source('o2_x_input'), None=value(o2_attrs)),
)
@attr(common_attrs)
def fun(x: dict, i: int) -> dict:
    return x
```

Note the None key in the parameterize args.

Example 3

Use attributes to capture the function.

```python
@capture_func_name
def fun(x: dict, i: int) -> dict:
    return x
```

which should be equivalent to

```python
def fun(x: dict, i: int) -> dict:
    return x

fun = attr(dict(func=fun))(fun)
```
|
So yeah, I don't know your use-case/what you need -- my general skepticism is that the contract should be at the level of the graph, not the function. E.g. if you have a node for ... That said, ...

Now, the really cool thing is that you can define whatever higher-level decorator you want. E.g. you can define your own parameterization that: ...

And you'd be using Hamilton decorators as building blocks. Customizing/wrapping decorators is highly encouraged, given that we can only support so much surface area but want everyone to be able to satisfy their use-case. |
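Along those lines, a thin wrapper over tag could approximate the free-form attributes idea today. This is only a sketch: attrs is not a Hamilton decorator, and tag values are plain strings, hence the JSON encoding (which is what "storing JSON in tags" amounts to):

```python
import json

from hamilton.function_modifiers import parameterize, source, tag

def attrs(**attributes):
    # hypothetical helper: stash free-form attributes as a JSON string inside a tag
    return tag(attributes_json=json.dumps(attributes))

@parameterize(
    o1=dict(x=source("o1_x_input")),
    o2=dict(x=source("o2_x_input")),
)
@attrs(a={}, b={"abc": []})
def fun(x: dict, i: int) -> dict:
    return x

# later, read the attributes back from a node's tags, e.g. for each var
# from dr.list_available_variables():
#     json.loads(var.tags["attributes_json"])
```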
I agree. I got carried away with the idea of parameterized nodes being kind of wrappers. There is no need for this. Using tags to store JSON is a good idea, thanks! |
To be clear, I think there might be room for more free-form tags/attrs -- that could be a valuable issue to open up! |
Is your feature request related to a problem? Please describe.
When a vanilla node is created, the default values of its parameters, if any, are available via the
node.default_parameter_values
attribute. However, when a node is created by @parameterize, the values passed to the decorator are not available via that attribute. In fact, they are not available from the node object at all. The point I am making is that those values are essentially default values for the expanded node and should be surfaced in the same fashion. This feature would make expanded nodes equivalent to vanilla nodes in this regard.
Describe the solution you'd like
Define a node
The following should hold
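A rough sketch of the behavior being requested, assuming the node.default_parameter_values attribute described above and the internal dr.graph.nodes access used earlier in the thread (names are illustrative):

```python
from hamilton.function_modifiers import parameterize, value

def vanilla(x: dict = {"a": 1}) -> dict:  # vanilla node: 'x' has a default value
    return x

@parameterize(output_node=dict(x=value({"a": 1})))
def fun(x: dict) -> dict:
    return x

if __name__ == "__main__":
    from hamilton import driver
    import __main__ as main

    dr = driver.Builder().with_modules(main).build()
    nodes = {n.name: n for n in dr.graph.nodes.values()}  # internal API, as above
    print(nodes["vanilla"].default_parameter_values)      # {'x': {'a': 1}} today
    # the request: the expanded node should surface its parameterized value the same way
    print(nodes["output_node"].default_parameter_values)  # should also be {'x': {'a': 1}}
```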