How much CUDA memory is required to run the example?
While running the example with the command "CUDA_VISIBLE_DEVICES=0 python examples/run_streaming_llama.py --enable_streaming",
the following error appears:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 136.00 MiB. GPU 0 has a total capacty of 7.92 GiB of which 131.69 MiB is free. Including non-PyTorch memory, this process has 7.79 GiB memory in use. Of the allocated memory 7.03 GiB is allocated by PyTorch, and 131.61 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
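For context on the numbers in that message: the GPU reports 7.92 GiB total, and PyTorch had already allocated 7.03 GiB when the 136 MiB request failed. A rough way to judge whether a model can fit at all is to estimate the memory its weights need. The sketch below is only a back-of-the-envelope calculation. The parameter counts and the 2-bytes-per-parameter (fp16) figure are illustrative assumptions, not values taken from the example script, and the estimate ignores activations, the KV cache, and CUDA context overhead.

```python
GIB = 1024 ** 3  # bytes in one GiB


def weight_memory_gib(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory for the model weights alone, in GiB.

    bytes_per_param=2 assumes fp16/bf16 weights (an assumption, not
    something stated in the issue). Activations and KV cache are excluded.
    """
    return num_params * bytes_per_param / GIB


def fits_on_gpu(num_params: int, total_gib: float, bytes_per_param: int = 2) -> bool:
    """Return True if the weights alone are smaller than the GPU's capacity."""
    return weight_memory_gib(num_params, bytes_per_param) < total_gib


if __name__ == "__main__":
    gpu_total_gib = 7.92  # total capacity reported in the error message
    # Hypothetical model sizes, chosen only for illustration.
    for label, n in [("7B params", 7_000_000_000), ("13B params", 13_000_000_000)]:
        need = weight_memory_gib(n)
        print(f"{label}: ~{need:.1f} GiB in fp16, fits in {gpu_total_gib} GiB: "
              f"{fits_on_gpu(n, gpu_total_gib)}")
```

Under these assumptions, a model with billions of parameters in fp16 would need more than the 7.92 GiB available before any inference state is allocated, which would be consistent with the error above.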