MLLM support #142
Comments
Thanks for your interest @ChingKwanCheung. I will merge the code and doc this weekend.
Hi @ChingKwanCheung, I have added the code and an initial doc at https://github.com/texttron/tevatron/tree/main/examples/dse
Thank you! This paper is really good work. I have tested the multi-modal retrieval model you released earlier (https://huggingface.co/Tevatron/dse-phi3-docmatix-v1) and found its English retrieval capability excellent. If I want to improve its Chinese retrieval capability, would you recommend continuing training from this model with Chinese data?
Thanks @ChingKwanCheung. I guess the Chinese capability largely depends on the LLM's capability in Chinese and on how well the visual encoder aligns with the language model. I am not sure whether Phi3 does well on Chinese. I feel https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5 might be a good choice of backbone for Chinese tasks.
Does this project support training and inference of multi-modal retrieval models, such as Phi-3-vision? I'd like to reproduce the experiments in the paper https://arxiv.org/abs/2406.11251 based on this project.