Qwen2.5 VL new ViT #1
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Thanks for this great PR! I left a few comments; please take a look at them.

In addition, it would be great if you could show some results on:
- Correctness verification of the ViT: we don't need to add a unit test for it, but we should at least check that the embeddings generated from the same image match the `transformers` implementation, in both the TP=1 and TP>1 cases.
- Speed verification: our implementation should be at least no slower than the `transformers` implementation.
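As a rough illustration of the correctness check above, the sketch below compares two embedding tensors by max absolute difference and cosine similarity. The `compare_embeddings` helper and the tolerance values are assumptions for illustration, not part of vLLM; in practice the two tensors would come from running the same image through the vLLM ViT and the `transformers` ViT.

```python
import torch


def compare_embeddings(ours: torch.Tensor, ref: torch.Tensor,
                       rtol: float = 1e-3, atol: float = 1e-4):
    """Report max abs difference, cosine similarity, and an allclose flag
    for two embedding tensors of identical shape."""
    assert ours.shape == ref.shape, "embedding shapes must match"
    max_abs_diff = (ours - ref).abs().max().item()
    cos = torch.nn.functional.cosine_similarity(
        ours.flatten(), ref.flatten(), dim=0).item()
    close = torch.allclose(ours, ref, rtol=rtol, atol=atol)
    return max_abs_diff, cos, close


# Synthetic stand-in tensors; real usage would substitute the image
# embeddings produced by each implementation.
ref = torch.randn(16, 1280)
ours = ref + 1e-6 * torch.randn_like(ref)
diff, cos, close = compare_embeddings(ours, ref)
print(f"max_abs_diff={diff:.2e} cosine={cos:.6f} allclose={close}")
```

A loose cosine-similarity check alone can hide per-element drift, so reporting the max absolute difference alongside it gives a clearer picture, especially when TP>1 introduces different reduction orders.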
@yixqiao Thank you for the work! I will take it over from here.
Adds the new ViT class in vLLM to Qwen2.5 VL, removing the Hugging Face pretrained dependency.
Compared to Qwen2 VL, this includes changes to the MLP, window-based partial attention, and RMSNorm, and enables parallelized operations where appropriate.
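For reference, RMSNorm (one of the components listed above) normalizes by the root mean square of the hidden dimension instead of subtracting the mean as LayerNorm does. The sketch below is a generic minimal implementation, not vLLM's actual kernel; the hidden size and epsilon are illustrative.

```python
import torch
import torch.nn as nn


class RMSNorm(nn.Module):
    """Root-mean-square layer norm: rescales by the RMS of the last
    dimension (no mean subtraction, no bias)."""

    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Compute in float32 for numerical stability, then cast back.
        variance = x.float().pow(2).mean(dim=-1, keepdim=True)
        x_normed = x.float() * torch.rsqrt(variance + self.eps)
        return self.weight * x_normed.to(x.dtype)


norm = RMSNorm(8)
out = norm(torch.randn(2, 8))
print(out.shape)
```

Because it skips the mean subtraction and bias, RMSNorm is cheaper than LayerNorm and fuses well with tensor-parallel execution, which fits the parallelization goals of this PR.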