Thanks for your wonderful framework. May I know how much GPU memory is required to run the live-streaming demo? Do you have a small pre-trained model for low-memory inference? Thanks.
Hi, thank you for your interest in our work! If generation is not required, Vinci needs 18 GB of GPU memory. If the generation module is loaded, it needs an extra 6 GB, for a total of a little more than 24 GB.
In our experiments, we run the Vinci live-streaming demo on a single 4090 GPU.
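If you want to check whether your card has enough headroom before loading everything, a quick sanity check like the one below can help. This is a minimal sketch and not part of the Vinci codebase; the 18 GB / 6 GB figures are simply the numbers from the reply above.

```python
import torch

# Memory budget from the reply above (approximate figures, not
# measured from the Vinci codebase itself).
BASE_GB = 18  # Vinci without the generation module
GEN_GB = 6    # extra memory for the generation module

free_bytes, total_bytes = torch.cuda.mem_get_info(0)
free_gb = free_bytes / 1024**3

print(f"Free GPU memory: {free_gb:.1f} GB of {total_bytes / 1024**3:.1f} GB")
if free_gb >= BASE_GB + GEN_GB:
    print("Enough headroom to load Vinci with the generation module.")
elif free_gb >= BASE_GB:
    print("Enough for Vinci without generation; skip the generation module.")
else:
    print("Below the reported requirement; consider a smaller LLM.")
```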
Thanks for your prompt reply. Is there an alternative if I want to run it on a GPU with less than 18 GB, e.g., using InternLM2.5-1.8B as the fixed LLM?
Thank you for your suggestion! In fact, we originally used smaller models, but their performance did not seem optimal. We will add this option to our code soon, within this week, and I will notify you in this thread once it is done.
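For anyone who wants to experiment before that option lands, here is a hypothetical sketch of loading InternLM2.5-1.8B standalone via Hugging Face transformers. The checkpoint id and the fp16 footprint estimate are assumptions on my part, and wiring the model into Vinci will depend on the option described above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed public checkpoint id for InternLM2.5-1.8B (chat variant);
# verify the exact name on the Hugging Face hub before use.
model_id = "internlm/internlm2_5-1_8b-chat"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 should keep a 1.8B model around ~4 GB
    trust_remote_code=True,     # InternLM ships custom modeling code
    device_map="auto",
)

# Quick smoke test of the smaller LLM on its own, outside Vinci.
inputs = tokenizer(
    "Describe what is happening in the video.", return_tensors="pt"
).to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```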