diff --git a/docs/mllm/mllm_papers.md b/docs/mllm/mllm_papers.md
index 908e6f5..14c46e9 100644
--- a/docs/mllm/mllm_papers.md
+++ b/docs/mllm/mllm_papers.md
@@ -3,6 +3,7 @@
 Multimodal discussion QQ group: 237976286
 
 ## Latest News
+- 2024.11 [MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs](https://arxiv.org/pdf/2411.02571) NVIDIA proposes universal multimodal retrieval built on multimodal LLMs.
 - 2024.11 [Attacking Vision-Language Computer Agents via Pop-ups](https://arxiv.org/abs/2411.02391)
 - 2024.11 [Know Where You're Uncertain When Planning with Multimodal Foundation Models: A Formal Framework](https://arxiv.org/abs/2411.01639) Improves how multimodal foundation models handle uncertainty, thereby making robots more reliable in planning tasks.
 - 2024.10 [Mini-Omni2: Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities](https://arxiv.org/abs/2410.11190)