Welcome to Athina AI

Athina is building monitoring and evaluation tools for LLM developers.

Open-Source Framework for Evals

We have a library of preset evaluators, but you can also write custom evaluators within the Athina framework.

Context Contains Enough Information: Detect bad or insufficient retrievals.
Does Response Answer Query: Detect incomplete or irrelevant responses.
Response Faithfulness: Detect when responses are deviating from the provided context.
Summarization Accuracy: Detect hallucinations and mistakes in summaries.
Grading Criteria: If X, then fail. Otherwise pass.
Custom Prompt: Custom prompt for LLM-powered evaluation.
RAGAS: A set of evaluators that return RAGAS metrics.
and more...
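
To make the fail-if-X idea behind the Grading Criteria evaluator concrete, here is a minimal sketch in plain Python. It is not the Athina API: `EvalResult`, `grade`, and `mentions_competitor` are hypothetical names made up for illustration, and the condition is a simple string check standing in for what would likely be an LLM-powered judgment.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalResult:
    """Outcome of one evaluation (hypothetical structure, not Athina's)."""
    passed: bool
    reason: str


def grade(response: str, fails_if: Callable[[str], bool], criteria: str) -> EvalResult:
    """Fail when the condition holds for the response; otherwise pass."""
    if fails_if(response):
        return EvalResult(False, f"Failed: {criteria}")
    return EvalResult(True, f"Passed: condition not met ({criteria})")


def mentions_competitor(response: str) -> bool:
    # Stand-in condition; a real evaluator would probably ask an LLM instead.
    return "acme" in response.lower()


result = grade(
    "You could also try Acme's product.",
    mentions_competitor,
    "response mentions a competitor",
)
print(result.passed, "-", result.reason)
```

Keeping the condition as a separate callable means the same `grade` wrapper could sit in front of a keyword check, a regex, or a model call without changing how results are reported.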

Results can also be viewed and tracked on our platform.

UI for monitoring and visibility into your LLM inferences.
Run evals automatically against logged inferences in production.
Track cost, token usage, response times, feedback, pass rate and other eval metrics.
Analytics segmented by Customer ID, Model, Prompt, Environment, and more.
Topic Classification
Data Exports
... and more
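
As a rough illustration of what segmented eval metrics could look like, the sketch below groups a few logged inferences by model and computes pass rate, average response time, and total token usage. The record fields (`model`, `passed`, `tokens`, `latency_ms`) and the `summarize` helper are assumptions invented for this example; they do not describe the platform's actual schema or API.

```python
from collections import defaultdict

# Hypothetical logged inferences; field names are assumptions for this sketch.
logs = [
    {"model": "model-a", "passed": True, "tokens": 120, "latency_ms": 800},
    {"model": "model-a", "passed": False, "tokens": 200, "latency_ms": 1200},
    {"model": "model-b", "passed": True, "tokens": 90, "latency_ms": 500},
]


def summarize(records, key):
    """Group records by `key` and compute simple per-group metrics."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)

    summary = {}
    for name, rows in groups.items():
        summary[name] = {
            "pass_rate": sum(r["passed"] for r in rows) / len(rows),
            "avg_latency_ms": sum(r["latency_ms"] for r in rows) / len(rows),
            "total_tokens": sum(r["tokens"] for r in rows),
        }
    return summary


for model, metrics in summarize(logs, "model").items():
    print(model, metrics)
```

Passing a different `key` (for example a customer ID or environment field, if the records carried one) would segment the same metrics along that dimension instead.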

Contact [email protected] if you have any questions.
