How much GPU memory required to run live streaming demo? #5

Open
WangyiNTU opened this issue Feb 10, 2025 · 3 comments

Comments

@WangyiNTU

Thanks for your wonderful framework. May I know how much GPU memory is required to run the live-streaming demo? Do you have a smaller pre-trained model for low-memory inference? Thanks.

@hyf015
Collaborator

hyf015 commented Feb 10, 2025

Hi, thank you for your interest in our work! If generation is not required, Vinci needs about 18 GB of GPU memory. If the generation module is loaded, it needs an extra 6 GB, for a total of a little more than 24 GB.

In our experiments, we ran the Vinci live-streaming demo on a single 4090 GPU.
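
For anyone checking their own card against these numbers, the arithmetic can be written as a rough sketch. The function names and the 1 GB headroom are assumptions made for illustration; they are not part of the Vinci code, and the 18 GB and 6 GB figures are the approximate values quoted in this thread.

```python
# Approximate figures quoted in this thread, in GB.
BASE_GB = 18        # Vinci without the generation module
GENERATION_GB = 6   # extra memory when the generation module is loaded


def required_gb(with_generation: bool) -> int:
    """Approximate GPU memory needed, using the thread's numbers."""
    return BASE_GB + (GENERATION_GB if with_generation else 0)


def fits(gpu_total_gb: float, with_generation: bool,
         headroom_gb: float = 1.0) -> bool:
    """Return True if the card covers the requirement plus headroom.

    headroom_gb is a hypothetical safety margin, not a measured value.
    """
    return gpu_total_gb >= required_gb(with_generation) + headroom_gb


if __name__ == "__main__":
    for total in (16, 24, 32):
        print(total, fits(total, False), fits(total, True))
```

With PyTorch installed, the total memory of the first GPU could be read as `torch.cuda.get_device_properties(0).total_memory / 1024**3` and passed in as `gpu_total_gb`. Actual usage may vary with input resolution, precision, and driver overhead, so treat the result as an estimate only.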

@WangyiNTU
Author

Thanks for your prompt reply. Is there an alternative if I want to run it on a GPU with less than 18 GB of memory, for example by using InternLM2.5-1.8B as the fixed LLM?

@hyf015
Collaborator

hyf015 commented Feb 17, 2025

Thank you for your suggestion! In fact, we originally used smaller models, but their performance did not seem optimal. We will add this option to our code within this week, and I will notify you in this thread once it is done.
