chore: add jina-v3 evaluation result on mmteb multilingual #41

Merged
merged 7 commits into from
Nov 9, 2024

Conversation

bwanglzu
Contributor

@bwanglzu bwanglzu commented Nov 5, 2024

This PR adds jina-embeddings-v3's evaluation results (MMTEB, beta). The evaluation was conducted with the latest checkpoint, 215a6e121fa0183376388ac6b1ae230326bfeaed.

@Samoed
Contributor

Samoed commented Nov 5, 2024

Hi! Could you run MIRACLRetrieval (ru)? It's the only task missing in MTEB(ru). I tried running it myself but encountered unexpected CUDA OOM errors.

@bwanglzu
Contributor Author

bwanglzu commented Nov 5, 2024

@Samoed on it, am i missing any other tasks? this is what i'm doing:

import mteb

# Load the model by name.
model = mteb.get_model("jinaai/jina-embeddings-v3")

# Select the multilingual benchmark and run every task in it.
tasks = mteb.get_benchmark("MTEB(Multilingual, beta)")
evaluation = mteb.MTEB(tasks=tasks)
evaluation.run(model)

@Samoed
Contributor

Samoed commented Nov 5, 2024

This task is part of mteb.get_benchmark("MTEB(rus)"). You can iterate over benchmarks with mteb.get_benchmarks() to get all missing tasks.

Also, I'm currently fixing the tests.
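
A minimal sketch of that check for missing tasks, under stated assumptions: each finished task is stored as `<task name>.json` in a single results directory (the thread only confirms the file name MIRACLRetrieval.json), and `find_missing_tasks` and `expected_from_mteb` are hypothetical helper names, not part of mteb. The `name`, `tasks`, and `metadata.name` attributes used on the benchmark and task objects are also assumptions about the installed mteb version.

```python
from pathlib import Path


def find_missing_tasks(expected, results_dir):
    """Return {benchmark name: [task names with no result file]}.

    `expected` maps a benchmark name to its task names. A task counts as done
    when `<results_dir>/<task name>.json` exists (an assumed layout).
    """
    done = {p.stem for p in Path(results_dir).glob("*.json")}
    missing = {}
    for benchmark, task_names in expected.items():
        absent = [name for name in task_names if name not in done]
        if absent:
            missing[benchmark] = absent
    return missing


def expected_from_mteb():
    """Build the `expected` mapping from mteb (imported lazily; needs mteb installed)."""
    import mteb

    # Assumption: each benchmark exposes `.name` and `.tasks`,
    # and each task exposes `.metadata.name`.
    return {
        b.name: [t.metadata.name for t in b.tasks]
        for b in mteb.get_benchmarks()
    }
```

Calling `find_missing_tasks(expected_from_mteb(), results_dir)` with the directory holding the model's result files would then list, per benchmark, the tasks that still need to be run.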

@bwanglzu
Contributor Author

bwanglzu commented Nov 5, 2024

i think i can re-use what we submitted to MTEB on MIRACL; it is the same task and should give the same results.

@Samoed
Contributor

Samoed commented Nov 5, 2024

Currently, the jina-embeddings-v3 model card only has MIRACLReranking submitted, not MIRACLRetrieval.

@bwanglzu
Contributor Author

bwanglzu commented Nov 5, 2024

that's strange, now i'm re-running MIRACLRetrieval for all 18 languages

@KennethEnevoldsen
Contributor

Adding the fix for the tests (#42). Then I believe all tests should pass.

the evaluation was conducted with the latest checkpoint 215a6e121fa0183376388ac6b1ae230326bfeaed.

This is also logged in the model_meta.json as well as the file structure

@bwanglzu
Contributor Author

bwanglzu commented Nov 7, 2024

i wasn't aware it would take that much time (i'm only using 1 gpu), i'll patch the PR once it's ready

@bwanglzu
Contributor Author

bwanglzu commented Nov 8, 2024

@Samoed @KennethEnevoldsen @isaac-chung i added MIRACLRetrieval.json to the PR :)

@Samoed
Contributor

Samoed commented Nov 8, 2024

The only thing left: you should run the results.py file to update the paths.json file.

Contributor

@isaac-chung isaac-chung left a comment

@bwanglzu thanks for adding these! In terms of MMTEB multilingual results, they look complete. I'll merge this once the tests pass.

@Samoed this PR is not intended for the RU benchmark, so I do not expect those results to be added here. Also, after adding the jina-v3 model path and running results.py, the updated paths.json contains extra changes that are outside of jina-v3. Would you mind looking into that? We can add jina-v3 into paths in a separate PR.

@isaac-chung isaac-chung changed the title from "chore: add jina-v3 evaluation result on mmteb" to "chore: add jina-v3 evaluation result on mmteb multilingual" Nov 9, 2024
@isaac-chung isaac-chung enabled auto-merge (squash) November 9, 2024 10:26
@isaac-chung
Contributor

@KennethEnevoldsen @Muennighoff or whoever has admin rights, would you mind taking a look at changing the required checks? (ubuntu-latest, 3.9) is not run but it's marked as required (instead of 3.8).

@Samoed
Contributor

Samoed commented Nov 9, 2024

Maybe GitHub tries to merge different PRs when checking, because in #40 the tests only use Python 3.9.

@isaac-chung isaac-chung merged commit 1616d1b into embeddings-benchmark:main Nov 9, 2024
2 checks passed
@isaac-chung
Contributor

isaac-chung commented Nov 9, 2024

Looks like it merged. Thanks @bwanglzu again for the PR, and all for the discussion! Note that in this PR I've also updated tests to only run with Python 3.9 at the moment.

@bwanglzu bwanglzu deleted the chore-add-jina-v3 branch November 11, 2024 08:40