[WIP] adding mmbench dev evaluation (#75) #46

Merged: 1 commit, Apr 7, 2024

@Luodian Luodian commented Apr 4, 2024

LLaVA-v1.5-7B eval results

[3 images]

* WIP

* Update GPT evaluation model name and sys prompt

* 🛠️ Scale accuracy to percentage
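
The "Scale accuracy to percentage" change above amounts to reporting accuracy on a 0-100 scale instead of 0-1. A minimal sketch of that conversion follows; the function name `accuracy_percent` and its arguments are hypothetical and are not taken from the PR's code.

```python
def accuracy_percent(num_correct: int, num_total: int) -> float:
    """Return accuracy as a percentage in the range [0, 100].

    Hypothetical helper for illustration; not the function used in the PR.
    """
    if num_total <= 0:
        raise ValueError("num_total must be positive")
    if not 0 <= num_correct <= num_total:
        raise ValueError("num_correct must be between 0 and num_total")
    # Multiply the 0-1 fraction by 100 to report a percentage.
    return 100.0 * num_correct / num_total


if __name__ == "__main__":
    print(accuracy_percent(3, 4))  # 75.0
```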

Luodian commented Apr 4, 2024

@pufanyi Please help us test whether this PR works for the LLaVA-v1.5 and LLaVA-v1.6 models (using the official repo code).

@Luodian Luodian requested a review from pufanyi April 4, 2024 17:35
Luodian pushed a commit that referenced this pull request Apr 4, 2024
Luodian added a commit that referenced this pull request Apr 4, 2024
* Refactor logging and model initialization

* Fix wandb_logger.online() method call

* Add error handling during evaluation

* Add wait time and error handling in get_chat_response function

* Update wait_time in get_chat_response function

* Refactor code for improved readability and maintainability

* Refactor doc_to_visual function to handle multiple images in ICON-QA tasks

* Refactor logging_utils.py and utils.py

This commit refactors the `logging_utils.py` and `utils.py` files. It removes unused imports, adjusts code formatting, and updates the `get_chat_response` function to increase the `wait_time` parameter from 5 to 10.

* Refactor code for wandb logging and generation in OtterHD class

* Refactor prepare_report_by_task method in logging_utils.py

* Update generation parameters in OtterHD model

* Update generation parameters in OtterHD model

* Squashed commit of the following:

commit 4011e6c
Author: kcz358 <[email protected]>
Date:   Tue Feb 13 18:50:37 2024 +0800

    Fix seedbench choices bugs (#45)

commit 16a6c1f
Author: XinrunDu <[email protected]>
Date:   Tue Feb 13 18:50:23 2024 +0800

    add stvqa and multidocvqa (#46)

commit 515a7c4
Author: XinrunDu <[email protected]>
Date:   Sun Feb 11 00:54:39 2024 +0800

    add cmmmu (#44)

    Co-authored-by: ygjin11 <[email protected]>

commit b3a013c
Author: kcz358 <[email protected]>
Date:   Sun Feb 11 00:54:23 2024 +0800

    [Feat] Add qwen loglikelihood (#43)

    * Add qwen loglikelihood

    * Revise the pyproject dependency. Move tiktoken out from optional-dependencies

    * Add ferret-bench

    * Add seedbench 2, test on llava

commit 1b4a477
Author: JvThunder <[email protected]>
Date:   Wed Feb 7 00:08:22 2024 +0800

    Joshua/vizwizvqa refactor (#42)

    * refactor vizwizvqa task

    * Merge commit '41d044cd287adcbcf095afb1a0ef5a96c88c3d9d'

    * Fix exact_match accuracy calculation in vizwiz_vqa_process_results

    * Update vizwiz_vqa tasks

    ---------

    Co-authored-by: Fanyi Pu <[email protected]>
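
The commit message above mentions adding a wait time and error handling to `get_chat_response`, with `wait_time` raised from 5 to 10. The actual implementation is not shown in this thread, so the following is only a sketch of the general retry-with-wait pattern. The helper name `call_with_retry`, the `max_retries` parameter, the injectable `sleep` argument, and the assumption that `wait_time` is in seconds are all illustrative.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(
    request: Callable[[], T],
    max_retries: int = 3,
    wait_time: float = 10.0,  # assumed to be seconds; 10 mirrors the commit message
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying after a fixed wait when it raises.

    Hypothetical helper; it is not the PR's get_chat_response.
    """
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(max_retries):
        try:
            return request()
        except Exception as err:  # broad on purpose: any failure triggers a retry
            last_error = err
            # Wait only if another attempt will follow.
            if attempt < max_retries - 1:
                sleep(wait_time)
    raise RuntimeError(f"request failed after {max_retries} attempts") from last_error
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; in normal use the default `time.sleep` applies.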

Luodian commented Apr 4, 2024

python -m accelerate.commands.launch \
    --main_process_port=12566 \
    --num_processes=8 \
    -m lmms_eval \
    --model=llava \
    --model_args=pretrained=liuhaotian/llava-v1.5-13b,conv_template=vicuna_v1 \
    --tasks=mmbench_en_dev,mmbench_cn_dev,mmbench_cn_cc \
    --batch_size=1 \
    --log_samples \
    --log_samples_suffix=debug \
    --output_path=./logs/ \
    --verbosity=DEBUG
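
The `--model_args` value in the command above is a comma-separated list of `key=value` pairs (`pretrained=...` and `conv_template=...`). As a rough illustration of how such a string could be turned into keyword arguments, here is a small sketch. `parse_model_args` is a hypothetical helper written for this example and may differ from how lmms_eval parses the flag.

```python
from typing import Dict


def parse_model_args(arg_string: str) -> Dict[str, str]:
    """Split 'k1=v1,k2=v2' into {'k1': 'v1', 'k2': 'v2'}.

    Hypothetical helper for illustration; values are kept as strings.
    """
    args: Dict[str, str] = {}
    for pair in arg_string.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments such as a trailing comma
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        args[key] = value
    return args


if __name__ == "__main__":
    parsed = parse_model_args(
        "pretrained=liuhaotian/llava-v1.5-13b,conv_template=vicuna_v1"
    )
    print(parsed["pretrained"])     # liuhaotian/llava-v1.5-13b
    print(parsed["conv_template"])  # vicuna_v1
```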

@pufanyi pufanyi merged commit bf4c78b into main Apr 7, 2024
2 checks passed
@Luodian Luodian deleted the dev/public/mmbench branch April 16, 2024 13:32
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
2 participants