cdreetz/just-another-llm-evaluator

LLM Eval and Compare

Idk, I'll write this later. For now, just look at the 3 pages and read the code.

Compare models

Compare prompts

Compare chats

Transcribe audio

TODO v0.1.0

  • model/provider selection dropdown

  • remove selected model button

  • initial results table skeleton

  • info button top left on eval page

  • info button top left on compare page

  • info button top left on transcription page

  • header component with the nav dropdown and github icon link

  • ? or just add the github link to the dropdown?

  • parallel requests and streaming for compare page

  • results table outputs should be equal width per model

  • results table resizable

  • GitHub logo link to repo

  • fix margin of header icons

  • add all model options to the dual llm comparison chat

  • about page

  • feedback / submit bug / feature request

  • stt page with Groq whisper

  • transcription optional arguments

  • lambda chat api support

  • hermes 3 models added

  • upload 'vibe check list' button

  • Markdown rendering for responses

v0.2.0

  • auth / sign up / login pages

  • save prompt collection

  • saved 'vibe check list'

  • saved eval runs

v0.3.0

  • API access

  • batch audio processing

v1.0.0

  • create documentation