[Observability AI Assistant] [Root Cause Analysis] Create an evaluation tool for LLM root cause analysis results #200064
Labels: Team:Obs AI Assistant (Observability AI Assistant), Team:obs-ux-management (Observability Management User Experience Team)
As we begin to evaluate LLM-assisted root cause analysis, we need a way to assess the validity and usefulness of the results. Historically, our process for evaluating these results has been quite manual. As we start testing more scenarios against LLM-assisted root cause analysis, we'd like a more automated and systematic way to evaluate the results.
We'd like to evaluate both controlled (known architecture, known root cause) and uncontrolled (unknown architecture, unknown root cause) scenarios. Here's some of what we envision.
We are aware of this framework for evaluating various scenarios with the LLM integration, but it's unclear whether it can be adapted to fit our use case. We are intrigued by the idea of using AI to help us evaluate the output of the investigation, for example providing the LLM with the final report and asking it how well that report reflects the known root cause. We'd be interested in a system that lets us provide a given diagnostic file and outputs how well it believes the LLM performed (a rough sketch of what this could look like is below).
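
A minimal sketch, assuming an LLM-as-judge approach for controlled scenarios where the root cause is known. All names here (`ChatClient`, `evaluateRcaReport`, the field names) are hypothetical and illustrative only; they are not part of the existing evaluation framework or any Kibana API:

```ts
// Hypothetical LLM-as-judge evaluator for root cause analysis reports.
// The ChatClient interface stands in for whatever inference connector we end up using.

interface ChatClient {
  complete(prompt: string): Promise<string>;
}

interface RcaEvaluationInput {
  scenarioId: string;
  knownRootCause: string; // ground truth, available in controlled scenarios
  finalReport: string;    // report produced by the LLM-assisted investigation
}

interface RcaEvaluationResult {
  scenarioId: string;
  score: number;     // 0-1: how well the report reflects the known root cause
  reasoning: string; // judge model's explanation for the score
}

async function evaluateRcaReport(
  client: ChatClient,
  input: RcaEvaluationInput
): Promise<RcaEvaluationResult> {
  // Ask a judge model to grade the report against the known root cause.
  const prompt = [
    'You are grading a root cause analysis report.',
    `Known root cause:\n${input.knownRootCause}`,
    `Final report:\n${input.finalReport}`,
    'Respond with JSON: {"score": <number between 0 and 1>, "reasoning": "<short explanation>"}.',
    'Score 1 if the report clearly identifies the known root cause, 0 if it misses it entirely.',
  ].join('\n\n');

  const raw = await client.complete(prompt);
  const parsed = JSON.parse(raw) as { score: number; reasoning: string };

  return {
    scenarioId: input.scenarioId,
    score: parsed.score,
    reasoning: parsed.reasoning,
  };
}
```

For uncontrolled scenarios (no known root cause) the same shape could work with a rubric-based prompt instead of a ground-truth comparison, but that is a separate design question.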