GitHub - noah-art3mis/crucible: Develop better LLM apps by testing different models and prompts in bulk.

Crucible

Lightweight prompt evaluation package.

It can be used online or run locally through Streamlit. If necessary, Ollama can be used to run LLMs locally.

Cost estimation is very rough (input * 2).
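As an illustration only, one possible reading of "input * 2" is that the input is priced and the result is doubled to leave room for the output. The sketch below assumes that reading; the function name, the token count and the per-token price are hypothetical and are not taken from Crucible's code.

```python
def estimate_cost(input_tokens: int, price_per_input_token: float) -> float:
    """Rough estimate: price the input tokens, then double the result.

    Assumption: "input * 2" means the output is budgeted at the same cost
    as the input. This is a guess at the heuristic, not Crucible's code.
    """
    return input_tokens * price_per_input_token * 2


# Hypothetical numbers, chosen only to show the arithmetic.
estimate = estimate_cost(input_tokens=1_000, price_per_input_token=0.000005)
print(f"Estimated cost: ${estimate:.4f}")
```

Because the estimate scales only with the input, a run that produces long outputs (for example with "QUALITATIVE" grading) could cost noticeably more than this figure suggests.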

Instructions

1. Set the models, prompts and variables.
2. Set the grading style and temperature:
   - "EXACT": the answer is either right or wrong. Line breaks and spaces in the answer are ignored.
   - "QUALITATIVE": asks GPT-4o for feedback. Be mindful of the extra token usage.
3. Click compile. Check the price estimation. Click run.
4. Results are shown segmented by category.
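The "EXACT" grading style described above can be pictured with a minimal sketch. It assumes the whole response is compared against the expected values after whitespace is removed; `grade_exact` is a hypothetical name, and whether Crucible first extracts the answer from a longer response is not stated here.

```python
def grade_exact(response: str, expected: list[str]) -> bool:
    """Return True if the response equals any expected value, ignoring whitespace."""

    def normalize(text: str) -> str:
        # str.split() with no arguments splits on spaces, tabs and line breaks,
        # so joining the pieces removes all of them.
        return "".join(text.split())

    return normalize(response) in {normalize(value) for value in expected}


print(grade_exact("<<NÃO>>\n", ["<<NAO>>", "<<NÃO>>"]))  # True
print(grade_exact("<<SIM>>", ["<<NAO>>", "<<NÃO>>"]))    # False
```

Listing several spellings in `expected`, as with "<<NAO>>" and "<<NÃO>>", lets an exact comparison tolerate small variations in the model's output.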

Parameters

Model
- id (str): name as understood by Ollama. You might need to download the model first.
- source (str): "local", "openai" or "anthropic"
```
Model("llama3", "local")
```

Prompt

- id (str): name of the test case
- slot (str): name of the slot that will be replaced by the variable in the prompt
- content (str): the actual prompt text

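```
# The prompt text is in Portuguese. In English: "Your task is to analyze the text below and say
# whether it mentions the need to buy medicine or health items. [...] First, analyze the text
# carefully in a scratchpad. Then answer [...] with "<<SIM>>" (yes) or "<<NÃO>>" (no)."
Prompt(
    id="test_3",
    slot="{variable}",
    content="""Sua tarefa é analisar e responder se o texto a seguir menciona a necessidade de comprar remédios ou itens de saúde. Aqui está o texto:\n\n###\n\n{variable}\n\n###\n\n\nPrimeiro, analise cuidadosamente o texto em um rascunho. Depois, responda: a solicitação citada menciona a necessidade de comprar remédios ou itens de saúde? Responda "<<SIM>>" ou "<<NÃO>>".""",
)
```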
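To make the role of `slot` concrete, here is a minimal sketch of how a variable's content could be substituted into a prompt. It assumes plain string replacement; `fill_prompt` is a hypothetical helper, not a function from Crucible.

```python
def fill_prompt(content: str, slot: str, variable_content: str) -> str:
    """Replace every occurrence of the slot with the variable's content."""
    if slot not in content:
        # Fail loudly rather than send a prompt with nothing substituted.
        raise ValueError(f"slot {slot!r} not found in prompt content")
    return content.replace(slot, variable_content)


template = 'Here is the text:\n\n###\n\n{variable}\n\n###\n\nAnswer "<<SIM>>" or "<<NÃO>>".'
filled = fill_prompt(template, "{variable}", "Example snippet.")
print(filled)
```

The sketch uses `str.replace` rather than `str.format` so that other curly braces in a prompt (for example a JSON sample) are left untouched.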

Variable

- id (str): name of the test case
- content (str): text snippet to be inserted in the prompt
- expected (list of str): values that would be considered correct
- options (list of str): all values that the response could take. Leave empty if it does not apply.

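```
# The content is in Portuguese. In English: "Single-parent family consisting of Josefa and 5 children
# aged 1 to 17. Their only income comes from collecting recyclable material, and they report difficulty
# covering essential expenses. The vulnerability allowance is therefore requested."
Variable(
    id="despesas_essenciais",
    content="Família monoparental composta por Josefa e 5 filhos com idades entre 1 e 17 anos. Contam apenas com a renda de coleta de material reciclável e relatam dificuldade para manter as despesas essenciais. Solicita-se, portanto, o auxílio vulnerabilidade.",
    expected=["<<NAO>>", "<<NÃO>>"],
)
```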
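Since Crucible tests models and prompts in bulk, one plausible way to picture a run is as every combination of model, prompt and variable. The sketch below is an assumption about how the test cases multiply, not a description of Crucible's internals; `"gpt-4o"` and `"outro_caso"` are hypothetical ids added only to make the grid larger than one row.

```python
from itertools import product

model_ids = ["llama3", "gpt-4o"]                      # "gpt-4o" is a hypothetical second model
prompt_ids = ["test_3"]
variable_ids = ["despesas_essenciais", "outro_caso"]  # "outro_caso" is a hypothetical second variable

# Every (model, prompt, variable) combination becomes one request.
cases = list(product(model_ids, prompt_ids, variable_ids))

for model_id, prompt_id, variable_id in cases:
    print(f"{model_id} | {prompt_id} | {variable_id}")
print(f"{len(cases)} runs in total")
```

Under this assumption the number of requests grows multiplicatively, which is a good reason to check the price estimation before clicking run.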

TODO

- add tests
- add instructions
