Unify evaluation API of model-based and statistical metrics #6903

Closed
tholor opened this issue Feb 5, 2024 · 1 comment
Labels: 2.x (Related to Haystack v2.0), P1 (High priority, add to the next sprint), topic:eval

Comments

tholor (Member) commented Feb 5, 2024

We recently introduced two different ways to evaluate a pipeline:

  1. Statistical metrics
  2. Model-based metrics

Let's unify the interface so that the UX is consistent.
We already agreed to adopt the new "model-based interface":

  • A single StatisticalEvaluator component
    • One metric per instance
    • Inputs based on the metric
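
The shape proposed above could look roughly like the following. This is only a sketch in plain Python, not the Haystack API: the `Metric` enum, the two example metrics, the `run(**inputs)` signature, and the `{"score": ...}` output are all assumptions made for illustration. Only the name `StatisticalEvaluator` and the two design points (one metric per instance, inputs determined by the metric) come from this issue.

```python
from enum import Enum
from typing import Any, Callable, Dict, List, Tuple


class Metric(Enum):
    # Hypothetical metric names, chosen only for this sketch.
    EXACT_MATCH = "exact_match"
    HIT_RATE = "hit_rate"


def _exact_match(predictions: List[str], labels: List[str]) -> float:
    """Fraction of predictions that are identical to their label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not predictions:
        return 0.0
    return sum(p == l for p, l in zip(predictions, labels)) / len(predictions)


def _hit_rate(retrieved_ids: List[List[str]], relevant_ids: List[List[str]]) -> float:
    """Fraction of queries for which at least one relevant id was retrieved."""
    if len(retrieved_ids) != len(relevant_ids):
        raise ValueError("retrieved_ids and relevant_ids must have the same length")
    if not retrieved_ids:
        return 0.0
    hits = sum(1 for got, want in zip(retrieved_ids, relevant_ids) if set(got) & set(want))
    return hits / len(retrieved_ids)


# Each metric declares the input names it expects and the function computing it.
_METRICS: Dict[Metric, Tuple[Tuple[str, ...], Callable[..., float]]] = {
    Metric.EXACT_MATCH: (("predictions", "labels"), _exact_match),
    Metric.HIT_RATE: (("retrieved_ids", "relevant_ids"), _hit_rate),
}


class StatisticalEvaluator:
    """One evaluator class; each instance is bound to exactly one metric."""

    def __init__(self, metric: Metric):
        # Accept either a Metric member or its string value.
        self.metric = Metric(metric)
        self.expected_inputs, self._fn = _METRICS[self.metric]

    def run(self, **inputs: Any) -> Dict[str, float]:
        # The accepted inputs depend on the metric chosen at construction time.
        if set(inputs) != set(self.expected_inputs):
            raise ValueError(
                f"{self.metric.value} expects inputs {self.expected_inputs}, "
                f"got {tuple(sorted(inputs))}"
            )
        return {"score": self._fn(**inputs)}


if __name__ == "__main__":
    evaluator = StatisticalEvaluator(Metric.EXACT_MATCH)
    print(evaluator.run(predictions=["Paris", "Rome"], labels=["Paris", "Berlin"]))
```

Evaluating several metrics would then mean creating several instances, one per metric, each validating its own inputs.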
tholor added the topic:eval, P1, and 2.x labels on Feb 5, 2024
shadeMe (Contributor) commented Feb 7, 2024

The DeepEvalEvaluator implementation can serve as a blueprint for the statistical evaluator; see deepset-ai/haystack-core-integrations#346


3 participants