
Logprob based multiple-choice question evals (callback) #18

Merged 7 commits into main on Mar 5, 2025

Conversation

nielsrolf (Collaborator)
No description provided.

…le Choice Evaluation

 #### Log Probability Calculation (`logprobs.py`):
 - Refactored the results-appending logic to include the entire dataset entry along with the `messages` key, so that the output dictionary contains all necessary information from the dataset, not just the `messages` (sketched below).
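The refactor amounts to merging each dataset entry into the result record instead of keeping only its `messages`. A minimal sketch of that idea, where `compute_logprobs` and the field names are illustrative assumptions rather than the repo's actual API:

```python
# Hypothetical sketch of the refactored appending logic in logprobs.py.
# `compute_logprobs` and the field names are assumptions for illustration.
def collect_results(dataset, compute_logprobs):
    results = []
    for entry in dataset:
        scored_messages = compute_logprobs(entry["messages"])
        # Copy every field of the dataset entry, then attach the scored
        # messages, so the output carries all information from the dataset.
        results.append({**entry, "messages": scored_messages})
    return results
```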

 #### Multiple Choice Question Evaluation (`mc_question.py`):
 - Added support for custom templates and context in the `Question` class, allowing more flexibility when preparing questions for evaluation.
 - Improved the log probability summation to account for all messages and blocks, ensuring that the calculation is comprehensive and accurate.
 - Fixed the `logp_correct` value to correctly reflect the log probability of the correct answer.
 - Introduced default values for `choice_template`, `question_template`, and `answer_template` in the `MultipleChoiceEvalFreeform` class to streamline the creation of freeform multiple-choice evaluations (see the sketch after this list).
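A minimal sketch of how these pieces could fit together. The class and template names mirror the ones mentioned above, but the bodies, the default template strings, and the message/block structure are assumptions for illustration, not the actual implementation:

```python
from dataclasses import dataclass

# Assumed defaults; the real strings in mc_question.py may differ.
DEFAULT_CHOICE_TEMPLATE = "{letter}) {choice}"
DEFAULT_QUESTION_TEMPLATE = "{context}{question}\n{choices}\nAnswer:"
DEFAULT_ANSWER_TEMPLATE = " {letter}"


@dataclass
class Question:
    """Multiple-choice question with custom templates and optional context."""
    question: str
    choices: list[str]
    correct_index: int
    context: str = ""
    choice_template: str = DEFAULT_CHOICE_TEMPLATE
    question_template: str = DEFAULT_QUESTION_TEMPLATE
    answer_template: str = DEFAULT_ANSWER_TEMPLATE

    def render(self) -> tuple[str, list[str]]:
        """Return the rendered prompt and one candidate answer per choice."""
        letters = [chr(ord("A") + i) for i in range(len(self.choices))]
        choices_text = "\n".join(
            self.choice_template.format(letter=letter, choice=choice)
            for letter, choice in zip(letters, self.choices)
        )
        prompt = self.question_template.format(
            context=self.context, question=self.question, choices=choices_text
        )
        answers = [self.answer_template.format(letter=letter) for letter in letters]
        return prompt, answers


def sum_logprobs(messages: list[dict]) -> float:
    """Sum token log probabilities over all messages and all blocks,
    rather than over only a single message or block."""
    return sum(
        token["logprob"]
        for message in messages
        for block in message.get("blocks", [])
        for token in block.get("tokens", [])
    )
```

Under these assumptions, `logp_correct` would simply be the summed log probability of the candidate answer at `correct_index`, which is the value the fix above makes sure gets reported.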

 These changes enhance the functionality and reliability of the log probability calculations and multiple-choice question evaluations.
@nielsrolf merged commit 410931f into main on Mar 5, 2025
3 checks passed