The training process was killed when loading data batch #45
Comments
I have the same issue. Can anyone help me?
Hi, it's a bit hard to figure out exactly what is causing the crash. My best guess is that your system runs out of RAM. We cache the data in memory, but it shouldn't really be that heavy to run. Are you running this in a container, or in a conda environment? Could you maybe track the memory consumption while training?
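If it helps, here is a minimal sketch for tracking host RAM from inside the training process. It assumes `psutil` is installed (`pip install psutil`); the one-second interval and the log format are arbitrary choices, not anything from neurad-studio:

```python
# Minimal RAM-tracking sketch (assumption: psutil is available).
# Run in a background thread alongside training to see whether resident memory
# keeps growing until the OS OOM-killer terminates the process.
import threading
import time

import psutil


def log_memory(interval_s: float = 1.0) -> None:
    proc = psutil.Process()
    while True:
        rss_gb = proc.memory_info().rss / 1024**3               # resident memory of this process
        avail_gb = psutil.virtual_memory().available / 1024**3  # remaining system RAM
        print(f"[mem] rss={rss_gb:.1f} GiB, available={avail_gb:.1f} GiB")
        time.sleep(interval_s)


# Start this before the dataloader spins up, e.g. at the top of the training script.
threading.Thread(target=log_memory, daemon=True).start()
```

If the resident memory climbs steadily during data loading and the available RAM hits zero right before the "Killed" message, that confirms an out-of-memory kill rather than a GPU issue.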
Hi, how did you solve this issue? I have a 4060 Ti 16 GB and I cannot load the data... it is killed every time. Is there a recommended amount of RAM?
Have you tried reducing the number of workers and/or the queue length? https://github.com/georghess/neurad-studio/blob/main/nerfstudio/data/datamanagers/image_lidar_datamanager.py#L83-L85
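For reference, a rough sketch of what that could look like in code. The class and field names below (`ImageLidarDataManagerConfig`, `num_processes`, `queue_size`) are my assumptions; the actual names are at the linked lines of `image_lidar_datamanager.py`, so adjust accordingly:

```python
# Rough sketch: reduce dataloading parallelism to cut peak RAM usage.
# NOTE: the class name and field names here are assumptions; check
# nerfstudio/data/datamanagers/image_lidar_datamanager.py#L83-L85 for the real ones.
from nerfstudio.data.datamanagers.image_lidar_datamanager import ImageLidarDataManagerConfig

datamanager_config = ImageLidarDataManagerConfig(
    num_processes=1,  # fewer worker processes -> fewer in-flight copies of the data
    queue_size=1,     # smaller prefetch queue -> fewer batches held in RAM at once
)
```

If neurad-studio follows the standard nerfstudio CLI conventions, the same fields should also be overridable from the command line via `--pipeline.datamanager.<field-name>`, without editing any code.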
Hi, I discovered that it takes approximately 50 GB of RAM during data loading. I tried lowering the number of workers and it still did not work, so I switched to another computer to solve this. Thank you.
Hi! Thanks for your great work!
When I tried to train on PandaSet according to your guide, the model couldn't load the full data batch; the process was killed as in the image below, even though I reduced the number of scenes in PandaSet (I used just one scene). My setup is 4x V100 GPUs. (Also, can your model be trained on multiple GPUs?)
How can I fix that?
Thanks.
P/s: I ran the NeuRAD tiny version and it was still killed like in the image below.
I don't know why; could the reason be that my GPU is a V100 32 GB, so the model cannot train?