Deliver safe & effective language models
Python SDK for running evaluations on LLM-generated responses
A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.
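Several of the toolkits above follow the same LLM-as-judge pattern: prompt a strong model to score a response on a fixed rubric and return structured scores. Below is a minimal sketch of that pattern, assuming the official OpenAI Python client and a hypothetical three-aspect rubric; it does not reflect any specific toolkit's API.

```python
# Minimal LLM-as-judge sketch for multi-aspect evaluation.
# ASPECTS, the prompt wording, and the model choice are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ASPECTS = ["helpfulness", "factuality", "coherence"]  # hypothetical rubric

def judge(question: str, answer: str) -> dict:
    """Score an answer on each aspect (1-5) using a judge model, returned as JSON."""
    prompt = (
        f"Rate the answer to the question on these aspects: {', '.join(ASPECTS)}.\n"
        "Return only a JSON object mapping each aspect to an integer from 1 to 5.\n\n"
        f"Question: {question}\nAnswer: {answer}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},  # constrain the judge to JSON output
        temperature=0,
    )
    return json.loads(resp.choices[0].message.content)

scores = judge("What causes tides?", "Mainly the Moon's gravity, plus the Sun's.")
print(scores)  # e.g. {"helpfulness": 5, "factuality": 5, "coherence": 5}
```

Keeping the rubric explicit in the prompt and forcing JSON output is what makes this kind of assessment interpretable: each aspect gets its own score rather than one opaque number.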
Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)
[ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
scalexi is a versatile open-source Python library, optimized for Python 3.11+, that facilitates low-code development and fine-tuning of diverse Large Language Models (LLMs).
Template for an AI application that extracts job information from a job description using OpenAI functions and LangChain
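The extraction pattern that template describes is OpenAI function calling against a JSON Schema. Here is a self-contained sketch of that pattern using the OpenAI client directly rather than LangChain; the field names and the `record_job` function are illustrative assumptions, not the template's actual schema.

```python
# Sketch of structured extraction via OpenAI function calling.
# The schema and function name are hypothetical placeholders.
import json
from openai import OpenAI

client = OpenAI()

job_schema = {
    "type": "function",
    "function": {
        "name": "record_job",  # hypothetical function name
        "description": "Record structured fields extracted from a job description.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "company": {"type": "string"},
                "location": {"type": "string"},
                "skills": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["title"],
        },
    },
}

def extract_job(description: str) -> dict:
    """Force the model to call record_job and return its parsed arguments."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Extract the job details:\n{description}"}],
        tools=[job_schema],
        tool_choice={"type": "function", "function": {"name": "record_job"}},
    )
    call = resp.choices[0].message.tool_calls[0]
    return json.loads(call.function.arguments)

print(extract_job("Senior Python Engineer at Acme Corp, remote; Django and AWS required."))
```

Forcing the tool call via `tool_choice` guarantees the response arrives as schema-shaped arguments instead of free-form prose, which is what makes this approach reliable for templated extraction apps.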
TypeScript SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)
Indic-language evals for quantised models (AWQ / GPTQ / EXL2)