I am testing the 8B model with a simple custom API.
When the model is loaded, nvidia-smi shows 21684MiB of memory usage. However, after the first query the memory usage increases to 22356MiB. Is this normal? How can I free the memory after each query?
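For reference, the two readings above differ by 22356 - 21684 = 672 MiB. Below is a minimal sketch of how the same measurement could be taken programmatically before and after a query. It assumes `nvidia-smi` is on the PATH; the helper names `parse_used_mib`, `query_used_mib`, and `growth_mib` are hypothetical and not part of the API under test.

```python
import subprocess


def parse_used_mib(smi_output: str) -> list[int]:
    """Parse one memory.used value (in MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def query_used_mib() -> list[int]:
    """Ask nvidia-smi for the used memory of every visible GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_used_mib(result.stdout)


def growth_mib(before: list[int], after: list[int]) -> list[int]:
    """Per-GPU difference between two readings, in MiB."""
    return [a - b for b, a in zip(before, after)]


# Intended use (requires a machine with an NVIDIA driver installed):
#   before = query_used_mib()
#   ... send one query to the API ...
#   after = query_used_mib()
#   print(growth_mib(before, after))
```

Taking a reading after each of several consecutive queries would show whether the usage keeps growing or levels off after the first request.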