Add new LMMS evaluation task for wild vision benchmark #247

Luodian · 2024-09-12T17:01:51Z

This pull request adds a new LMMS evaluation task, "wildvision_0630", for the wild vision benchmark. The task configuration specifies the dataset name, test split, output type, and prompt details. The task is added alongside the existing LMMS evaluation tasks.
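For readers unfamiliar with how such a task is described, here is a minimal sketch of the kinds of settings listed above, written as a plain Python dictionary with a small completeness check. Only the task name "wildvision_0630" comes from this PR; the field names, the placeholder values, and the helper `missing_fields` are assumptions made for illustration and are not the actual configuration file added here.

```python
# Hypothetical illustration only: field names and values below are
# placeholders, not the settings shipped in this PR.
REQUIRED_FIELDS = ("task", "dataset_path", "test_split", "output_type", "prompt")

example_task = {
    "task": "wildvision_0630",         # task name, taken from the PR description
    "dataset_path": "<dataset-name>",  # placeholder: the PR text does not give it
    "test_split": "test",              # assumed split name
    "output_type": "generate_until",   # assumed: free-form generation
    "prompt": {"pre_prompt": "", "post_prompt": ""},  # placeholder prompt details
}


def missing_fields(config):
    """Return the required fields that are absent or empty in `config`."""
    return [
        field
        for field in REQUIRED_FIELDS
        if field not in config or config[field] in (None, "")
    ]


if __name__ == "__main__":
    print(missing_fields(example_task))
```

A check like this is only a convenience for catching an incomplete task definition early; the real task file may use different keys.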

We also update the evaluation metrics (score, win rate, and others) for the wildvision benchmark to align with the benchmark's updated evaluation logic.
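As a rough illustration of what a win-rate metric measures, the sketch below computes the fraction of pairwise comparisons judged as wins. The verdict labels ("win", "tie", "loss") and the function `win_rate` are assumptions for this example; the PR text does not spell out the benchmark's actual verdict labels or how its score is weighted, so this is not the updated evaluation logic itself.

```python
from collections import Counter


def win_rate(verdicts):
    """Fraction of comparisons labeled "win".

    `verdicts` is an iterable of hypothetical labels: "win", "tie", or "loss".
    Returns 0.0 when there are no comparisons.
    """
    counts = Counter(verdicts)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return counts["win"] / total


if __name__ == "__main__":
    print(win_rate(["win", "loss", "tie", "win"]))
```

Ties are counted in the denominator here, which is one common convention; a benchmark could just as well split ties or exclude them.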

Luodian added 4 commits September 10, 2024 08:54

Update LMMS evaluation tasks for various subjects

c0ccd61

Merge remote-tracking branch 'origin/main' into dev/fix_tags

6c456ba

add group task to include more wildvision tasks

8541b16

Luodian merged commit e77fb31 into main Sep 13, 2024
2 checks passed

Luodian deleted the dev/fix_tags branch November 23, 2024 14:35
