Request: Training by Multi-GPU #25

Open
RGTails opened this issue Jan 26, 2023 · 2 comments

Comments

@RGTails

RGTails commented Jan 26, 2023

It's all in the title.

@yggdrasil75

Would be nice, but I don't know how well it would work if the two GPUs aren't linked. Training two different models on two different GPUs would be useful, though, and so would queuing that training: train 50 epochs of model 1 on GPU 0 and 50 epochs of model 2 on GPU 1; GPU 1 is faster, so it moves on to model 3 automatically; GPU 0 finishes, sees that model 2 isn't currently active, and continues training it; GPU 1 then moves on to model 1, and so on. Over a week the two GPUs train the models 50 epochs at a time and split the workload fairly evenly.
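
A minimal sketch of how that queuing might look, assuming a hypothetical `train_epochs()` helper stands in for the real trainer and each queued chunk is just a (model, epochs) pair; the model names, devices, and chunk sizes are placeholders, not anything from this repo:

```python
import threading
import time

# train_epochs() is a placeholder for whatever actually runs one
# 50-epoch chunk of training on the given device.
def train_epochs(model_name, epochs, device):
    print(f"[{device}] training {model_name} for {epochs} epochs")
    time.sleep(1)  # stand-in for real work

def worker(device, jobs, lock, active):
    while True:
        with lock:
            if not jobs:
                return  # queue drained, this GPU is done
            # Pick the first queued chunk whose model no other GPU is holding.
            idx = next((i for i, (m, _) in enumerate(jobs) if m not in active), None)
            if idx is None:
                job = None  # every remaining model is busy elsewhere
            else:
                job = jobs.pop(idx)
                active.add(job[0])
        if job is None:
            time.sleep(5)  # wait for the other GPU to release a model
            continue
        model_name, epochs = job
        try:
            train_epochs(model_name, epochs, device)
        finally:
            with lock:
                active.discard(model_name)

# Queue three 50-epoch chunks per model, as described above.
jobs = [(name, 50) for _ in range(3) for name in ("model1", "model2", "model3")]
lock, active = threading.Lock(), set()
threads = [threading.Thread(target=worker, args=(f"cuda:{i}", jobs, lock, active))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker skips chunks for a model the other GPU currently holds, so the faster GPU naturally picks up more chunks over time.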

One way this could possibly be done so that it actually trains one model with both GPUs: train one epoch on each, use the difference in training time to split the concepts so the faster GPU gets slightly more and the slower one slightly less, then merge the models every x epochs (5-10 for a small dataset, 1 or 2 for a large one). Maybe also randomly move concepts between the two to keep the balance while preventing either copy from missing part of the model because it never saw some of the concepts.
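
A rough sketch of that merge step, assuming two copies of the same PyTorch model are trained on separate GPUs (each on its own slice of the concepts) and their weights are simply averaged every x epochs; `train_one_epoch()` is a placeholder, and real diffusion checkpoints would likely need something smarter than plain averaging:

```python
import copy
import torch

def train_one_epoch(model, concepts):
    # Placeholder: the real trainer would run one epoch over `concepts` here.
    pass

def merge_state_dicts(sd_a, sd_b):
    # Simple merge: element-wise average of matching parameters, done on CPU
    # so tensors from different GPUs can be combined.
    return {k: (sd_a[k].detach().cpu().float() + sd_b[k].detach().cpu().float()) / 2
            for k in sd_a}

def train_two_gpus(model, concepts_gpu0, concepts_gpu1, epochs, merge_every):
    # Two copies of the same model, one per GPU, each with its own concept slice.
    model_a = copy.deepcopy(model).to("cuda:0")
    model_b = copy.deepcopy(model).to("cuda:1")
    for epoch in range(1, epochs + 1):
        train_one_epoch(model_a, concepts_gpu0)
        train_one_epoch(model_b, concepts_gpu1)
        if epoch % merge_every == 0:
            # Merge every x epochs so neither copy drifts too far.
            merged = merge_state_dicts(model_a.state_dict(), model_b.state_dict())
            model_a.load_state_dict(merged)
            model_b.load_state_dict(merged)
    # Final merge so any epochs after the last sync are included.
    model_a.load_state_dict(merge_state_dicts(model_a.state_dict(), model_b.state_dict()))
    return model_a.cpu()
```

Per the numbers above, `merge_every` would be around 5-10 for a small dataset and 1 or 2 for a large one.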

@rbbrdckybk
Owner

Adding training is something that's on my list, but I'll probably need a solid weekend to get a quick & dirty version implemented. Been busier than usual, so it's probably a few weeks off at least. Will leave this open as a reminder though!
