Some benchmarks (such as ActivityNet, VideoChatGPT, and many others) use gpt-3.5-turbo-0613 for evaluation, but this model has been discontinued by OpenAI. One quick fix would be to switch to gpt-3.5-turbo, but I would also like to open a discussion about switching all uses of gpt-3.5-turbo to gpt-4o-mini, since its performance is better and it is roughly three times cheaper.
After the discussion, I'm happy to submit the PR to make the change. @Luodian @kcz358
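For illustration, here is a minimal sketch of the kind of change being proposed: hard-coded judge-model names in each benchmark's eval script would default to gpt-4o-mini, with an environment-variable override so users can still reproduce old scores. The names GPT_EVAL_MODEL_NAME and MODEL_VERSION below are assumptions for this sketch, not necessarily the identifiers actually used in lmms-eval.

```python
# Sketch only: the variable names here are illustrative, not the actual
# identifiers in lmms-eval.
import os

from openai import OpenAI

# Default to gpt-4o-mini per this proposal; allow an env-var override so
# users who want the legacy judge can still set MODEL_VERSION=gpt-3.5-turbo.
GPT_EVAL_MODEL_NAME = os.getenv("MODEL_VERSION", "gpt-4o-mini")

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def gpt_eval(question: str, answer: str, prediction: str) -> str:
    """Ask the judge model to score a prediction against the reference answer."""
    response = client.chat.completions.create(
        model=GPT_EVAL_MODEL_NAME,
        messages=[
            {"role": "system", "content": "You evaluate video-QA predictions."},
            {
                "role": "user",
                "content": (
                    f"Question: {question}\n"
                    f"Correct answer: {answer}\n"
                    f"Predicted answer: {prediction}\n"
                    "Reply 'yes' or 'no' and a score from 0 to 5."
                ),
            },
        ],
        temperature=0,
    )
    return response.choices[0].message.content
```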
We should inform users about the eval-model change once the PR lands, and we should probably list the affected datasets in this issue, with the eval models used before and after.
I'll pin this issue and link to the PR for visibility if the PR is created.