Repository contains some example tests to evaluate LLMs

rawar/ix-promptfoo

ix-promptfoo

This repository contains promptfoo examples. To run them, you need promptfoo, Python >= 3.9, and the Python framework langchain installed. Using OpenAI's gpt-3.5-turbo requires a valid API key. Export the key into your environment, for example with

$ export OPENAI_API_KEY=sk-......

To use local models from Ollama, you need to install Ollama and its models first.

Install promptfoo

On the command-line you can install promptfoo with

$ npm install -g promptfoo

or, if you would like to use npx

$ npx promptfoo@latest

Install Python virtual environment

If you already have Python installed, create and activate a virtual environment with

$ python3 -m venv .venv
$ source .venv/bin/activate

Install langchain

To install the necessary libraries run

$ pip install -r requirements.txt

on your command-line.

Run promptfoo examples

To run the different promptfoo examples use

$ npx promptfoo eval -c <promptfooconfig_XXX>.yaml

For example

$ npx promptfoo eval -c promptfooconfig_all_any.yaml
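To run every example in one go, the command above can be wrapped in a short script. The following is a minimal sketch, assuming the config files sit in the current directory and follow the promptfooconfig_*.yaml naming pattern. The helper names build_eval_command, find_configs and run_all are hypothetical and not part of this repository.

```python
import subprocess
from pathlib import Path


def build_eval_command(config_path):
    """Return the argument list for one `npx promptfoo eval -c <file>` run."""
    return ["npx", "promptfoo", "eval", "-c", str(config_path)]


def find_configs(directory="."):
    """Collect config files matching the assumed naming pattern, sorted by path."""
    return sorted(Path(directory).glob("promptfooconfig_*.yaml"))


def run_all(directory="."):
    """Run each config in turn and return a mapping of file name to exit code."""
    results = {}
    for config in find_configs(directory):
        # check=False: keep going even if one evaluation exits with an error.
        completed = subprocess.run(build_eval_command(config), check=False)
        results[config.name] = completed.returncode
    return results


if __name__ == "__main__":
    for name, code in run_all().items():
        print(f"{name}: exit code {code}")
```

Running the configs one after another, rather than in parallel, keeps the command-line output of each evaluation readable.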

View results

Promptfoo shows the result of a test run on the command line. To display the results in the browser you can use

$ npx promptfoo view

which starts the integrated server and displays the results in the web browser.

Clean promptfoo cache

By default, promptfoo caches results to avoid executing the same prompts multiple times against the providers (LLMs, scripts, etc.). To clear this cache, enter

$ npx promptfoo cache clear

Test RAG

To run the RAG evaluation promptfooconfig_rag.yaml, you need to initialize a Chroma vector database with the content of twenty-thousand-leagues-under-the-sea.txt from Project Gutenberg. The insert_vdb.py script does this. If you would like to save a few cents, you can use the existing vector database, which already contains all embeddings.
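Before a long text such as this book can be embedded and stored in a vector database, it is usually split into smaller, overlapping chunks. The following is a minimal sketch of that splitting step only, assuming fixed-size character windows. The function name split_into_chunks and the default sizes are hypothetical; insert_vdb.py may well use a different strategy, for example a langchain text splitter.

```python
def split_into_chunks(text, chunk_size=1000, overlap=100):
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # distance between the starts of two chunks
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "abcdefghij"
    for chunk in split_into_chunks(sample, chunk_size=4, overlap=1):
        print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears in full in at least one chunk, which tends to help retrieval.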

Test Hugging Face classifiers

The promptfoo example promptfooconfig_classifier.yaml uses text classifiers from Hugging Face. To run it, export a valid Hugging Face API token, for example with

$ export HF_API_TOKEN=.....
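Because a missing key usually only shows up as a failed provider call in the middle of an evaluation, it can help to check the environment first. The following is a minimal sketch, assuming the two variable names used in this README (OPENAI_API_KEY and HF_API_TOKEN); the helper name missing_variables is hypothetical.

```python
import os
import sys


def missing_variables(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables(["OPENAI_API_KEY", "HF_API_TOKEN"])
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
        sys.exit(1)
    print("All required environment variables are set.")
```

Only the examples that use the corresponding provider need each variable, so the list can be shortened to match the configs you plan to run.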
