Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request: primary node or mix of manual/udp #510

Open
mr-deamon opened this issue Nov 26, 2024 · 1 comment
Open

Request: primary node or mix of manual/udp #510

mr-deamon opened this issue Nov 26, 2024 · 1 comment

Comments

@mr-deamon
Copy link

Hi
Great project, i really like how fast things are moving here!

We have the following scenario: Multiple servers with GPUs in the same network, but they are "unstable". Some are used for other tasks while others are idling, i changes from time to time.

Creating a VM (or a physical machine, if necessary) outside this network is easy, but it won't have GPUs. Also, they can't directly reach each other.

So the primary/secondary architecture other projects use is quite usefull: primary runns all the time, GPUs join and leave. It would be great to have the possibility to specify a primary- or non-gpu-node which stays in charge (or stays online, for the matter) and have the other hosts find each other automatically as they already do with UDP.

Do you think something like this could be feasable?

Chris

@AlexCheema
Copy link
Contributor

Hi Great project, i really like how fast things are moving here!

We have the following scenario: Multiple servers with GPUs in the same network, but they are "unstable". Some are used for other tasks while others are idling, i changes from time to time.

Creating a VM (or a physical machine, if necessary) outside this network is easy, but it won't have GPUs. Also, they can't directly reach each other.

So the primary/secondary architecture other projects use is quite usefull: primary runns all the time, GPUs join and leave. It would be great to have the possibility to specify a primary- or non-gpu-node which stays in charge (or stays online, for the matter) and have the other hosts find each other automatically as they already do with UDP.

Do you think something like this could be feasable?

Chris

If I understood your use case correctly, this should already work out of the box.

There's no "master" node in exo.
You would just use the endpoint from the "primary" node you described.
Any other nodes that join/leave will automatically have an effect on the topology.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants