CUDA error: an illegal memory access was encountered #166
Comments
I also ran into a similar problem recently and am waiting for a response from the authors.
Did you manage to solve it?
Decreasing the batch size doesn't help.
This is what I'm getting in the logs:
Starting training: Sun Aug 11 06:22:53 UTC 2024
W0811 06:22:55.518423 140052721798016 torch/distributed/run.py:779]
0%| | 0/2237 [00:00<?, ?it/s]
0%| | 0/4 [00:00<?, ?it/s]
0%| | 0/2237 [00:00<?, ?it/s]
2024-08-11 06:23:05.515 | INFO | data_utils:_filter:64 - Init dataset...
0%| | 0/2237 [00:00<?, ?it/s]
2024-08-11 06:23:05.520 | INFO | data_utils:_filter:64 - Init dataset...
0%| | 0/2237 [00:00<?, ?it/s]
100%|██████████| 2237/2237 [00:00<00:00, 44849.11it/s]
100%|██████████| 2237/2237 [00:00<00:00, 45839.05it/s]
0%| | 0/543 [00:00<?, ?it/s]
/app/melo/train.py:252: FutureWarning:
0%| | 0/543 [00:00<?, ?it/s]
/app/melo/train.py:252: FutureWarning:
0%| | 0/543 [00:00<?, ?it/s]
/app/melo/train.py:252: FutureWarning:
0%| | 0/543 [00:00<?, ?it/s]
/app/melo/data_utils.py:161: FutureWarning: You are using
0%| | 1/543 [00:11<1:43:48, 11.49s/it]
0%| | 1/543 [00:39<5:58:40, 39.71s/it]
/app/melo/train.py:342: FutureWarning:
0%| | 2/543 [00:44<3:36:09, 23.97s/it]
Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:43 (most recent call first):
W0811 06:24:22.025600 140052721798016 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 186579 closing signal SIGTERM
I hit the same error and found that it was caused by the audio sample rate. Make sure the training data has a sample rate of 44100 Hz.
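To check a dataset for this before training, something like the following might help. It is only a rough sketch: it assumes the training audio is stored as PCM `.wav` files under one directory, and `find_mismatched_wavs` is a made-up helper name, not part of the project. It uses Python's standard `wave` module, which raises `wave.Error` on non-PCM files (for example 32-bit float WAVs), so those would need a different reader.

```python
import wave
from pathlib import Path

# Sample rate reported as required in this thread.
EXPECTED_RATE = 44100


def find_mismatched_wavs(root, expected_rate=EXPECTED_RATE):
    """Return (path, rate) pairs for PCM WAV files under `root`
    whose sample rate differs from `expected_rate`."""
    mismatched = []
    for path in sorted(Path(root).rglob("*.wav")):
        # Only the header is read here; no audio data is decoded.
        with wave.open(str(path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected_rate:
            mismatched.append((str(path), rate))
    return mismatched


# Example (hypothetical directory name):
# for path, rate in find_mismatched_wavs("data/wavs"):
#     print(f"{path}: {rate} Hz")
```

Any file this reports would need to be resampled to 44100 Hz (with whatever audio tool you already use) before it goes into the training set.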
Training environment:
Python==3.9
CUDA==11.8
torch 2.0.0+cu118
torchaudio 2.0.1+cu118
torchvision 0.15.1+cu118
How can I resolve this error?