[WIP] adding mmbench dev evaluation (#75) #46

Luodian · 2024-04-04T17:34:06Z

LLaVA-v1.5-7B eval results

* WIP
* Update GPT evaluation model name and sys prompt
* 🛠️ Scale accuracy to percentage
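
The "Scale accuracy to percentage" item amounts to reporting a score on a 0-100 scale instead of a 0-1 fraction. A minimal sketch of that idea follows; `accuracy_percent` is a hypothetical helper written for illustration, not a function taken from this PR, and the zero-sample fallback is an assumption.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return accuracy on a 0-100 scale (hypothetical helper, not from the PR)."""
    if total == 0:
        # Assumption: report 0.0 rather than dividing by zero when nothing was scored.
        return 0.0
    return 100.0 * correct / total


print(accuracy_percent(3, 4))
```

Returning a plain float keeps the value easy to format later, for example with `f"{score:.2f}"`.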

Luodian · 2024-04-04T17:35:01Z

@pufanyi Could you please test whether this PR works for the LLaVA-v1.5 and LLaVA-v1.6 models (using the official repo code)?

* Refactor logging and model initialization
* Fix wandb_logger.online() method call
* Add error handling during evaluation
* Add wait time and error handling in get_chat_response function
* Update wait_time in get_chat_response function
* Refactor code for improved readability and maintainability
* Refactor doc_to_visual function to handle multiple images in ICON-QA tasks
* Refactor logging_utils.py and utils.py

  This commit refactors the `logging_utils.py` and `utils.py` files. It removes unused imports, adjusts code formatting, and updates the `get_chat_response` function to increase the `wait_time` parameter from 5 to 10.
* Refactor code for wandb logging and generation in OtterHD class
* Refactor prepare_report_by_task method in logging_utils.py
* Update generation parameters in OtterHD model
* Update generation parameters in OtterHD model
* Squashed commit of the following:

commit 4011e6c
Author: kcz358 <[email protected]>
Date: Tue Feb 13 18:50:37 2024 +0800

    Fix seedbench choices bugs (#45)

commit 16a6c1f
Author: XinrunDu <[email protected]>
Date: Tue Feb 13 18:50:23 2024 +0800

    add stvqa and multidocvqa (#46)

commit 515a7c4
Author: XinrunDu <[email protected]>
Date: Sun Feb 11 00:54:39 2024 +0800

    add cmmmu (#44)

    Co-authored-by: ygjin11 <[email protected]>

commit b3a013c
Author: kcz358 <[email protected]>
Date: Sun Feb 11 00:54:23 2024 +0800

    [Feat] Add qwen loglikelihood (#43)

    * Add qwen loglikelihood
    * Revise the pyproject dependency. Move tiktoken out from optional-dependencies
    * Add ferret-bench
    * Add seedbench 2, test on llava

commit 1b4a477
Author: JvThunder <[email protected]>
Date: Wed Feb 7 00:08:22 2024 +0800

    Joshua/vizwizvqa refactor (#42)

    * refactor vizwizvqa task
    * Merge commit '41d044cd287adcbcf095afb1a0ef5a96c88c3d9d'
    * Fix exact_match accuracy calculation in vizwiz_vqa_process_results
    * Update vizwiz_vqa tasks

    ---------

    Co-authored-by: Fanyi Pu <[email protected]>
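
The commit message above mentions adding a wait time and error handling to `get_chat_response`, with `wait_time` raised from 5 to 10. The sketch below shows one way such a wait-and-retry wrapper could look. It is an illustration under assumptions, not the repository's implementation: `call_with_retry`, `max_retries`, the `send` callable, and the injectable `sleep` are all hypothetical names; only `wait_time` and its value of 10 come from the commit message.

```python
import time
from typing import Callable


def call_with_retry(
    send: Callable[[], str],
    max_retries: int = 3,          # hypothetical parameter, not from the PR
    wait_time: float = 10,         # value taken from the commit message
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `send`, pausing `wait_time` seconds after each failed attempt."""
    last_error = None
    for attempt in range(max_retries):
        try:
            return send()
        except Exception as err:  # broad on purpose: any API failure triggers a retry
            last_error = err
            if attempt < max_retries - 1:
                sleep(wait_time)  # no pause after the final attempt
    raise RuntimeError(f"all {max_retries} attempts failed") from last_error
```

Passing `sleep` as an argument is a design choice that lets the wrapper be exercised in tests without actually waiting.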

Luodian · 2024-04-04T17:39:25Z

python -m accelerate.commands.launch \
    --main_process_port=12566 \
    --num_processes=8 \
    lmms_eval \
    --model=llava \
    --model_args=pretrained=liuhaotian/llava-v1.5-13b,conv_template=vicuna_v1 \
    --tasks=mmbench_en_dev,mmbench_cn_dev,mmbench_cn_cc \
    --batch_size=1 \
    --log_samples \
    --log_samples_suffix=debug \
    --output_path=./logs/ \
    --verbosity=DEBUG
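
To repeat this run for more than one checkpoint, the command above can be assembled programmatically. The following is a rough sketch only: `build_launch_command` is a hypothetical helper, and every flag in it is copied from the command above rather than checked against the CLI. It builds the argument list and prints it without launching anything.

```python
import shlex
from typing import List


def build_launch_command(
    pretrained: str,
    tasks: List[str],
    num_processes: int = 8,
    port: int = 12566,
) -> List[str]:
    """Assemble the launch invocation shown above as an argument list (hypothetical helper)."""
    return [
        "python", "-m", "accelerate.commands.launch",
        f"--main_process_port={port}",
        f"--num_processes={num_processes}",
        "lmms_eval",
        "--model=llava",
        f"--model_args=pretrained={pretrained},conv_template=vicuna_v1",
        f"--tasks={','.join(tasks)}",
        "--batch_size=1",
        "--log_samples",
        "--log_samples_suffix=debug",
        "--output_path=./logs/",
        "--verbosity=DEBUG",
    ]


cmd = build_launch_command(
    "liuhaotian/llava-v1.5-13b",
    ["mmbench_en_dev", "mmbench_cn_dev", "mmbench_cn_cc"],
)
# Print a shell-safe rendering; hand `cmd` to subprocess.run to actually execute it.
print(shlex.join(cmd))
```

Swapping the `pretrained` argument is enough to point the same run at a different checkpoint.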

[WIP] adding mmbench dev evaluation (#75)

1ea7a8f

Luodian requested a review from pufanyi April 4, 2024 17:35

Luodian pushed a commit that referenced this pull request Apr 4, 2024

add stvqa and multidocvqa (#46)

16a6c1f

Luodian pushed a commit that referenced this pull request Apr 4, 2024

add stvqa and multidocvqa (#46)

cf10a45

Luodian mentioned this pull request Apr 5, 2024

N/A scores when running mmbench #43

Closed

pufanyi approved these changes Apr 7, 2024

pufanyi merged commit bf4c78b into main Apr 7, 2024
2 checks passed

Luodian deleted the dev/public/mmbench branch April 16, 2024 13:32

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024

add stvqa and multidocvqa (EvolvingLMMs-Lab#46)

0175674

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024

add stvqa and multidocvqa (EvolvingLMMs-Lab#46)

12144a6

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024

add stvqa and multidocvqa (EvolvingLMMs-Lab#46)

8154867

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024

add stvqa and multidocvqa (EvolvingLMMs-Lab#46)

bf93c62

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024

add stvqa and multidocvqa (EvolvingLMMs-Lab#46)

015a8d2

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024

add stvqa and multidocvqa (EvolvingLMMs-Lab#46)

1cbc746

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024

add stvqa and multidocvqa (EvolvingLMMs-Lab#46)

c6d4d44

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024

add stvqa and multidocvqa (EvolvingLMMs-Lab#46)

23294e3

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024

Merge pull request EvolvingLMMs-Lab#46 from EvolvingLMMs-Lab/dev/publ…

61a33cd

…ic/mmbench [WIP] adding mmbench dev evaluation (EvolvingLMMs-Lab#75)
