
Refactor to enable RayGraphAdapter and HamiltonTracker to work well together #1103

Merged: 33 commits merged into DAGWorks-Inc:main from the ray-remote-hooks branch on Sep 9, 2024

Conversation

@jernejfrank (Contributor) commented:

Hi Elijah,

I've made progress, but got stuck at what seems to be the final stage (unless I missed something). Please let me know if you have time this week to discuss.

Changes

  • added an inline function to execute lifecycle hooks in the remote environment (see the sketch after this list)
  • added a new lifecycle class for remote execution that passes the inline function to the adapters that handle remote environments
  • changed RayGraphAdapter to execute the full lifecycle (including pre- and post-hooks) within ray.remote
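For illustration, here is a minimal sketch of that inline-function idea; the name `execute_lifecycle_for_node` and the hook signatures are placeholders, not the actual `graph_functions` API:

```python
from typing import Any, Callable


def execute_lifecycle_for_node(
    node_callable: Callable[..., Any],
    pre_node_execute: Callable[..., None],
    post_node_execute: Callable[..., None],
    **node_kwargs: Any,
) -> Any:
    """Runs the pre-hook, the node, and the post-hook in whatever process
    invokes it -- locally, or inside ray.remote when an adapter ships it there."""
    result, error, success = None, None, True
    pre_node_execute(**node_kwargs)
    try:
        result = node_callable(**node_kwargs)
        return result
    except Exception as e:
        error, success = e, False
        raise
    finally:
        # The tracker sees success/error even when the node raises remotely.
        post_node_execute(result=result, error=error, success=success)
```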

How I tested this

  • There is a script, z_test_implementation.py, that executes a single node in ray.remote and waits 5 s, to test the telemetry in HamiltonUI

Notes

  • Unable to overcome: AssertionError: The @ray.remote decorator must be applied either with no arguments and no parentheses (see the snippet after these notes)
  • Unsure whether I handled errors, successes, results, and exceptions correctly with the inline function in graph_functions.
  • Added RayGraphAdapter.do_remote_execute(..., **future_kwargs) to collect things such as run_id and task_id that are not used in the function.
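For context, Ray raises that assertion when ray.remote is invoked with empty parentheses, which is easy to hit when wrapping a function at runtime as an adapter must. A quick sketch of the valid and invalid forms (function names here are illustrative):

```python
import ray


@ray.remote  # OK: bare decorator, no parentheses
def f(x: int) -> int:
    return x + 1


def g(x: int) -> int:
    return x * 2


g_remote = ray.remote(g)                 # OK: applied directly to the function
g_with_opts = ray.remote(num_cpus=1)(g)  # OK: options first, then the function
# bad = ray.remote()(g)                  # AssertionError: parentheses but no arguments

ray.init(ignore_reinit_error=True)
print(ray.get([f.remote(1), g_remote.remote(2), g_with_opts.remote(3)]))  # [2, 4, 6]
ray.shutdown()
```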

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

hamilton/plugins/h_ray.py (resolved review thread)
@jernejfrank (Contributor, Author) left a comment:

Ignore this; I made a local mistake where the Tracker was called from the pip-installed package rather than the executable directory...

z_test_implementation.py (two resolved review threads)
@skrawcz (Collaborator) commented Aug 25, 2024:

@jernejfrank I fixed the Ray object issue. Ray doesn't resolve nested object references, only top-level ones; see my commit.
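A small illustration of the behaviour described here (see also the Ray docs on passing object arguments, linked in the squash commit below): top-level ObjectRef arguments are resolved before the task body runs, nested ones are not:

```python
import ray

ray.init(ignore_reinit_error=True)


@ray.remote
def produce() -> int:
    return 41


@ray.remote
def consume(top_level, container):
    # `top_level` was passed as a top-level argument, so Ray resolved it to 41.
    # `container["ref"]` is nested inside a dict, so it is still an ObjectRef.
    return top_level, type(container["ref"]).__name__


ref = produce.remote()
value, nested_type = ray.get(consume.remote(ref, {"ref": ref}))
print(value, nested_type)  # 41 ObjectRef
ray.shutdown()
```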

@jernejfrank (Contributor, Author) commented:

Awesome. I moved the post-graph-execute hooks to run after the result builder, which fixed the issue of displaying correct graph-level telemetry. However, I don't have the bigger picture: could this mess things up for other adapters?

@jernejfrank (Contributor, Author) commented:

So the one thing still missing is this odd behaviour when the second node is executed with time.sleep(i) (a sketch of the setup follows this list):

  1. where i < 5 s (the first node's duration), the UI shows only the first node and keeps thinking it's running.
  2. where i = 5 s, the UI shows the first node as complete and the second node as running, without finishing.
  3. where i >> 5 s, the UI shows both nodes correctly.
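A hypothetical two-node Hamilton module matching that description; the node names and shapes are assumptions, mirroring the 5 s sleep in the test script:

```python
import time


def node_a() -> int:
    time.sleep(5)  # first node: fixed 5 s
    return 1


def node_b(node_a: int) -> int:
    i = 3  # vary i relative to 5 s to reproduce cases 1-3 above
    time.sleep(i)
    return node_a + 1
```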

@jernejfrank (Contributor, Author) commented:

The only thing I haven't done is write any tests beyond running the small script. Let me know if we should add some.

@skrawcz (Collaborator) commented Aug 28, 2024:

@jernejfrank I think there are a few test failures that need to be investigated. Some look like they're due to polars changing; others could be related to the changes here.

[edit] I pushed polars fix to main [/edit]

@skrawcz changed the title from "Partial implementation, got stuck ray.remote AssertionError" to "Refactor to enable RayGraphAdapter and HamiltonTracker to work well together" on Aug 30, 2024
@skrawcz (Collaborator) commented Aug 30, 2024:

@jernejfrank how much time do you have over the next week? I'd like to get this over the line.

The main thing is the unit tests; otherwise there are a few things to refactor around raw_execute that we should probably do. If you don't have time, one of us can take that part on.

@jernejfrank (Contributor, Author) commented:

Looking good! Only a few things to change; let me know if I misinterpreted anything.

Hi @skrawcz, let me know if there is anything I should add to finish the PR.

```
@@ -27,6 +29,8 @@ def node_1s_error() -> float:
username = "admin"

try:
    # ray.init()
    ray.shutdown()
```
Review comment (Collaborator):

I assume this was just to test something?

Review comment (Collaborator):

Typically we have a README, requirements.txt, and a notebook version of the script in an example. Would you mind adding those, please?

@skrawcz (Collaborator) left a comment:

@jernejfrank what's the rationale for initializing Ray in the adapter now?

My concern, now that I look at it more closely, is that any new behavior should be opt-in; this adds some specific behavior that I'm not sure we should be that opinionated about, e.g. always adding the ignore_reinit_error kwarg.

So with that in mind:

  1. I think initializing Ray is fine, but it should be opt-in.
  2. Stopping Ray should, again, be opt-in, or something you specify explicitly.

So my suggestion is:

  1. allow someone to pass in a Ray config;
  2. if that value is not None, call ray.init with that config, else assume the user has initialized Ray;
  3. have the flag shutdown_ray_on_completion=False as a default, and use it to know whether to shut down.

Thoughts?
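A minimal sketch of that suggestion, assuming illustrative parameter names (ray_init_config, shutdown_ray_on_completion) rather than whatever the PR finally settled on:

```python
from typing import Any, Dict, Optional

import ray


class RayGraphAdapterSketch:
    """Opt-in Ray lifecycle management: no hidden init, no hidden shutdown."""

    def __init__(
        self,
        ray_init_config: Optional[Dict[str, Any]] = None,
        shutdown_ray_on_completion: bool = False,
    ):
        self.shutdown_ray_on_completion = shutdown_ray_on_completion
        if ray_init_config is not None:
            # Opt-in: only initialize Ray when the user passes a config.
            ray.init(**ray_init_config)
        # Otherwise assume the user has already called ray.init() themselves.

    def post_graph_execute(self, *args: Any, **kwargs: Any) -> None:
        if self.shutdown_ray_on_completion:
            ray.shutdown()
```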

@skrawcz force-pushed the todo-for-ray-remote-hooks branch from 9ddc5ca to ed7967d on September 5, 2024
@skrawcz changed the base branch from todo-for-ray-remote-hooks to main on September 5, 2024
@skrawcz (Collaborator) commented Sep 5, 2024:

@jernejfrank I've tested locally -- functionally things seem to work without regressions! Nice work! I ran a few of the Hamilton examples to validate things.

So the only outstanding thing is the ray_config bit, plus tidying up the example.

@skrawcz (Collaborator) commented Sep 5, 2024:

I also changed the base branch to main, so we'd do a squash merge here as a single commit... unless you want to tidy up the commits to be more atomic.

@jernejfrank (Contributor, Author) commented:

> 3. have the flag shutdown_ray_on_completion=False as a default, and use it to know whether to shut down.

Fair point; we really shouldn't be adding a hidden init of a cluster. The initial idea was to abstract away from Ray and create a context manager that would:

  1. spin up the cluster
  2. execute the DAG
  3. clean up after itself

but I can see that it's too restrictive, and I've changed it as suggested.

@jernejfrank (Contributor, Author) commented:

> I also changed the base branch to main, so we'd do a squash merge here as a single commit... unless you want to tidy up the commits to be more atomic.

Happy with a squash commit!

hamilton/plugins/h_ray.py (resolved review thread)
@skrawcz (Collaborator) left a comment:

So just that one thing, but otherwise I think this LGTM. I'll ask @elijahbenizzy to take a pass, but we'll merge this weekend if there isn't anything else! Thanks @jernejfrank!

@elijahbenizzy (Collaborator) commented:

Will play with it tomorrow, then we can merge! I think this is the most surgery anyone who is not a core maintainer has done on the library -- good stuff! Special award to you 🥇

@elijahbenizzy (Collaborator) commented:

Looks good to me, thank you so much for this! Might want to make the example a little more compelling at some point (it reads like a unit test), but that is in no way a blocker to getting it out. Really appreciate all your good work!

@elijahbenizzy merged commit fd984cd into DAGWorks-Inc:main on Sep 9, 2024
27 checks passed
elijahbenizzy pushed a commit that referenced this pull request Sep 9, 2024
Refactor to enable RayGraphAdapter and HamiltonTracker to work well together

This is a squash commit:
- issue=#1079
- PR=#1103

All commits:
- Update graph_functions.py

Describes what to do in `graph_functions.py`

Adds comments to lifecycle base

Update h_ray.py with comments for ray tracking compatibility

Replicate previous error

Inline function, unsure if catching errors and exceptions should be handled differently

BaseDoRemoteExecute has the added Callable function that sandwiches lifecycle hooks

method fails, says AssertionError about ray.remote decorator

simple script for now to check telemetry, execution yields the ray.remote AssertionError

passing pointer through and arguments to lifecycle wrapper into ray.remote

post-execute hook for node not called

finally executed only when exception occurs, hamilton tracker not executed

atexit.register does not work, node keeps running in UI

added stop() method, but doesn't get called

Ray telemetry works for single node, problem with connected nodes

Ray telemetry works for single node, problem with connected nodes

Ray telemetry works for single node, problem with connected nodes

Fixes ray object dereferencing

Ray does not resolve nested arguments:
https://docs.ray.io/en/latest/ray-core/objects.html#passing-object-arguments

So one option is to make them all top level:

- one way to do that is to make the other arguments not clash with any
possible user parameters -- hence the `__` prefix. This is what I did.
- another way would be, in the ray adapter, to wrap the incoming function
and explicitly do a ray.get() on any ray object references in the
kwargs arguments, i.e. keep the nested structure, but when the ray
task starts, wait for all inputs... not sure which is best, but this
now works correctly.
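(Editorial illustration of the first option described in the commit message above; the wrapper name and kwargs are hypothetical. Everything the wrapper needs travels as `__`-prefixed top-level kwargs, so upstream ObjectRefs in the user kwargs also arrive top-level and get resolved:)

```python
import ray


@ray.remote
def remote_wrapper(__node_callable, __run_id, **user_kwargs):
    # All values, including upstream ObjectRefs in user_kwargs, were passed
    # top-level, so Ray resolved them before this body runs. The `__` prefix
    # keeps the bookkeeping kwargs from colliding with user parameter names.
    return __node_callable(**user_kwargs)
```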

ray works checkpoint, pre-commit fixed

fixed graph level telemetry proposal

pinned ruff

Correct output, added option to start ray cluster

Unit test mimics the DoNodeExecute unit test

Refactored driver so all tests pass

Refactored driver so all tests pass

Refactored driver so all tests pass

Refactored driver so all tests pass

Workaround to not break ray by calling init on an open cluster

raw_execute does not have post_graph_execute and is private now

Correct version for deprecation warning

all tests work

this looks better

ruff version comment

Refactored pre- and post-graph-execute hooks outside of raw_execute which now has deprecation warning

added readme, notebook and made script cli interactive

made cluster init optional through inserting config dict

User has option to shutdown ray cluster

Co-authored-by: Stefan Krawczyk <[email protected]>
@jernejfrank deleted the ray-remote-hooks branch on September 10, 2024