MLLM support #142

Open
ChingKwanCheung opened this issue Aug 2, 2024 · 4 comments

Comments

@ChingKwanCheung

Does this project support training and inference of multi-modal retrieval models, such as Phi-3-vision? I'd like to reproduce the experiments in the paper https://arxiv.org/abs/2406.11251 using this project.

@MXueguang
Contributor

Thanks for your interest @ChingKwanCheung. I will merge the code and doc this weekend.

@MXueguang
Contributor

Hi @ChingKwanCheung, I have added the code and an initial doc in https://github.com/texttron/tevatron/tree/main/examples/dse

@ChingKwanCheung
Author

Thank you! This paper is really good work. I have tested the multi-modal retrieval model (https://huggingface.co/Tevatron/dse-phi3-docmatix-v1) you released earlier and found its English retrieval capability to be excellent. If I want to enhance its Chinese retrieval capability, would you recommend continuing training on Chinese data starting from this model?

@MXueguang
Contributor

Thanks @ChingKwanCheung. I guess the Chinese capability largely depends on the underlying LLM's proficiency in Chinese and on how well the visual encoder aligns with the language model. I am not sure whether Phi-3 handles Chinese well. I feel https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5 might be a good choice of backbone for Chinese tasks.
