feat: simplification of Document Search Evaluation interface #258

Merged (32 commits, Jan 17, 2025)

Changes from all commits

Commits (32)
- `41b576f` run from config interface for evaluator (kdziedzic68, Dec 18, 2024)
- `83ed7e7` optimize basic (kdziedzic68, Dec 18, 2024)
- `070548a` fix ruff (kdziedzic68, Dec 18, 2024)
- `5cd3f31` simplify examples (kdziedzic68, Dec 18, 2024)
- `6e4b2d6` uncomment (kdziedzic68, Jan 9, 2025)
- `33a6e1a` add info about neptune usage in opt (kdziedzic68, Jan 9, 2025)
- `605968e` fix ruff (kdziedzic68, Jan 9, 2025)
- `405fa35` inherit from with construction config (kdziedzic68, Jan 9, 2025)
- `02e1c94` fix linters (kdziedzic68, Jan 9, 2025)
- `80173a4` fix linters (kdziedzic68, Jan 9, 2025)
- `18c9675` fixes (kdziedzic68, Jan 9, 2025)
- `9ec00ae` ruff formatting (kdziedzic68, Jan 9, 2025)
- `7bcf23f` fix ruff linter (kdziedzic68, Jan 9, 2025)
- `a1c76bd` Merge branch 'main' into eval-interface-simpification (micpst, Jan 15, 2025)
- `6f78071` Merge branch 'main' into eval-interface-simpification (micpst, Jan 16, 2025)
- `98e8e7a` refactor eval (micpst, Jan 17, 2025)
- `d83a497` refactor advanced example (micpst, Jan 17, 2025)
- `06baf2e` optimizer fixes (micpst, Jan 17, 2025)
- `a0467a2` rename examples (micpst, Jan 17, 2025)
- `a171152` fix example deps (micpst, Jan 17, 2025)
- `b0a35e0` fix bug with rate limit (micpst, Jan 17, 2025)
- `f4eb227` fix formatting (micpst, Jan 17, 2025)
- `07f8216` fix optimization (micpst, Jan 17, 2025)
- `d53257b` evaluator type fixes (micpst, Jan 17, 2025)
- `8dfa696` revert change (micpst, Jan 17, 2025)
- `a824985` final version (micpst, Jan 17, 2025)
- `1afdee6` rename files (micpst, Jan 17, 2025)
- `799d3f3` fix config (micpst, Jan 17, 2025)
- `f8b882a` update examples (micpst, Jan 17, 2025)
- `e30616c` move loaders to dataloaders (micpst, Jan 17, 2025)
- `5c2bcae` fix sorting (micpst, Jan 17, 2025)
- `3f9a2fe` Merge branch 'main' into eval-interface-simpification (micpst, Jan 17, 2025)
docs/how-to/evaluate/custom_dataloader.md (1 addition, 1 deletion)
@@ -1,6 +1,6 @@
# How to create custom DataLoader for Ragbits evaluation

-Ragbits provides a base interface for data loading, `ragbits.evaluate.loaders.base.DataLoader`, designed specifically for evaluation purposes. A ready-to-use implementation, `ragbits.evaluate.loaders.hf.HFLoader`, is available for handling datasets in huggingface format.
+Ragbits provides a base interface for data loading, `ragbits.evaluate.dataloaders.base.DataLoader`, designed specifically for evaluation purposes. A ready-to-use implementation, `ragbits.evaluate.dataloaders.hf.HFLoader`, is available for handling datasets in huggingface format.

To create a custom DataLoader for your specific needs, you need to implement the `load` method in a class that inherits from the `DataLoader` interface.

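For context, a minimal sketch of a custom loader written against the renamed module path. It assumes the base class takes no required constructor arguments and exposes an asynchronous `load` method returning the loaded dataset; the JSONL file and its row format are hypothetical, so check `ragbits.evaluate.dataloaders.base.DataLoader` for the exact contract.

```python
# Hypothetical sketch (not part of this PR): a loader that reads evaluation
# records from a local JSONL file. The async `load` signature and the plain
# list-of-dicts return type are assumptions; consult the DataLoader base class
# for the exact return type it expects.
import json

from ragbits.evaluate.dataloaders.base import DataLoader


class JSONLDataLoader(DataLoader):
    """Loads evaluation records from a newline-delimited JSON file."""

    def __init__(self, path: str) -> None:
        self.path = path

    async def load(self) -> list[dict]:
        # Each line of the file is expected to hold one JSON object.
        with open(self.path, encoding="utf-8") as file:
            return [json.loads(line) for line in file]
```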
docs/how-to/evaluate/optimize.md (2 additions, 2 deletions)
@@ -5,7 +5,7 @@ Ragbits provides a feature that allows users to automatically configure hyperpar
- The optimized pipeline must inherit from `ragbits.evaluate.pipelines.base.EvaluationPipeline`.
- The definition of optimized metrics must adhere to the `ragbits.evaluate.metrics.base.Metric` interface.
- These metrics should be gathered into an instance of `ragbits.evaluate.metrics.base.MetricSet`.
-- An instance of a class inheriting from `ragbits.evaluate.metrics.loader.base.DataLoader` must be provided as the data source for optimization.
+- An instance of a class inheriting from `ragbits.evaluate.dataloaders.base.DataLoader` must be provided as the data source for optimization.

## Supported Parameter Types

@@ -69,7 +69,7 @@ Next, we define the data loader. We'll use Ragbits generation stack to create an


```python
-from ragbits.evaluate.loaders.base import DataLoader, DataT
+from ragbits.evaluate.dataloaders.base import DataLoader, DataT
from ragbits.core.llms.litellm import LiteLLM
from ragbits.core.prompt import Prompt
from pydantic import BaseModel
```
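The pieces listed above come together in a single call: the composed configuration (data loader, pipeline, and metrics, plus the optimizer settings) is converted to a plain dictionary and handed to `Optimizer.run_from_config`, which is exactly what the `optimize.py` example later in this diff does. A condensed sketch of that flow, using only calls that appear in this PR:

```python
# Condensed from the optimize.py example added in this PR: convert the composed
# Hydra config to a plain dict and pass it to the optimizer, which returns the
# evaluated configurations together with their scores.
from typing import cast

from omegaconf import DictConfig, OmegaConf

from ragbits.evaluate.optimizer import Optimizer
from ragbits.evaluate.utils import log_optimization_to_file


def run_optimization(config: DictConfig) -> None:
    """Runs an optimization study for an already composed Hydra config."""
    optimizer_config = cast(dict, OmegaConf.to_container(config))
    configs_with_scores = Optimizer.run_from_config(optimizer_config)

    # Persist the scored configurations, as the example script does.
    output_dir = log_optimization_to_file(configs_with_scores)
    print(f"Optimization results saved under directory: {output_dir}")
```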
@@ -1,35 +1,39 @@
# Document Search Evaluation

-## Ingest
+## Evaluation

+### Evaluation on ingested data

```sh
-uv run ingest.py
+uv run evaluate.py
```

```sh
-uv run ingest.py +experiments=chunking-250
+uv run evaluate.py +experiments=chunking-250
```

```sh
-uv run ingest.py --multirun +experiments=chunking-250,chunking-500,chunking-1000
+uv run evaluate.py --multirun +experiments=chunking-250,chunking-500,chunking-1000
```

-## Evaluate
+### Logging

```sh
-uv run evaluate.py
+uv run evaluate.py logger.local=True
```

```sh
-uv run evaluate.py +experiments=chunking-250
+uv run evaluate.py logger.neptune=True
```

+## Optimization

```sh
-uv run evaluate.py --multirun +experiments=chunking-250,chunking-500,chunking-1000
+uv run optimize.py
```

-### Log to Neptune
+### Monitoring

```sh
-uv run evaluate.py neptune.run=True
+uv run optimize.py neptune_callback=True
```
@@ -0,0 +1,4 @@
type: ragbits.evaluate.dataloaders.hf:HFDataLoader
config:
path: "micpst/hf-docs-retrieval"
split: "train"
@@ -0,0 +1,21 @@
# @package _global_

task:
  name: chunking-1000

pipeline:
  config:
    providers:
      txt:
        config:
          chunking_kwargs:
            max_characters: 1000
            new_after_n_chars: 200
      md:
        config:
          chunking_kwargs:
            max_characters: 1000
            new_after_n_chars: 200
    vector_store:
      config:
        index_name: chunk-1000
@@ -0,0 +1,21 @@
# @package _global_

task:
  name: chunking-250

pipeline:
  config:
    providers:
      txt:
        config:
          chunking_kwargs:
            max_characters: 250
            new_after_n_chars: 50
      md:
        config:
          chunking_kwargs:
            max_characters: 250
            new_after_n_chars: 50
    vector_store:
      config:
        index_name: chunk-250
@@ -0,0 +1,21 @@
# @package _global_

task:
  name: chunking-500

pipeline:
  config:
    providers:
      txt:
        config:
          chunking_kwargs:
            max_characters: 500
            new_after_n_chars: 100
      md:
        config:
          chunking_kwargs:
            max_characters: 500
            new_after_n_chars: 100
    vector_store:
      config:
        index_name: chunk-500
@@ -0,0 +1,7 @@
precision_recall_f1:
  type: ragbits.evaluate.metrics.document_search:DocumentSearchPrecisionRecallF1
  config:
    matching_strategy:
      type: RougeChunkMatch
      config:
        threshold: 0.5
@@ -0,0 +1,7 @@
ranked_retrieval:
  type: ragbits.evaluate.metrics.document_search:DocumentSearchRankedRetrievalMetrics
  config:
    matching_strategy:
      type: RougeChunkMatch
      config:
        threshold: 0.5
@@ -0,0 +1,14 @@
defaults:
  - [email protected]: hf
  - [email protected]: document_search_optimization
  - [email protected]:
      - precision_recall_f1
      - ranked_retrieval
  - _self_

optimizer:
  direction: maximize
  n_trials: 5
  max_retries_for_trial: 1

neptune_callback: False
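Because this is a regular Hydra config, individual values can also be overridden on the command line: `uv run optimize.py neptune_callback=True` (shown in the README above) enables the Neptune callback, and an override such as `optimizer.n_trials=10` (illustrative) would change the number of trials.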
@@ -0,0 +1,10 @@
defaults:
  - [email protected]: litellm
  - [email protected]_store: chroma
  - [email protected]: noop
  - [email protected]: noop
  - [email protected]: unstructured
  - [email protected]: hf
  - _self_

type: ragbits.evaluate.pipelines.document_search:DocumentSearchPipeline
@@ -0,0 +1,10 @@
defaults:
  - [email protected]: litellm_optimization
  - [email protected]_store: chroma
  - [email protected]: noop
  - [email protected]: noop
  - [email protected]: unstructured_optimization
  - [email protected]: hf
  - _self_

type: ragbits.evaluate.pipelines.document_search:DocumentSearchPipeline
@@ -1,6 +1,6 @@
type: ragbits.core.embeddings.litellm:LiteLLMEmbeddings
config:
model: "text-embedding-3-small"
options:
default_options:
dimensions: 768
encoding_format: float
@@ -3,15 +3,15 @@ config:
  optimize: true
  choices:
    - model: "text-embedding-3-small"
-      options:
+      default_options:
        dimensions:
          optimize: true
          range:
            - 32
            - 512
        encoding_format: float
    - model: "text-embedding-3-large"
-      options:
+      default_options:
        dimensions:
          optimize: true
          range:
@@ -0,0 +1,4 @@
type: ragbits.document_search.documents.sources:HuggingFaceSource
config:
path: "micpst/hf-docs"
split: "train[:5]"
@@ -1,10 +1,8 @@
type: ragbits.core.vector_stores.chroma:ChromaVectorStore
config:
  client:
-    type: PersistentClient
-    config:
-      path: chroma
-  index_name: default
+    type: EphemeralClient
+  index_name: baseline
  distance_method: l2
  default_options:
    k: 3
examples/evaluation/document-search/advanced/config/retrieval.yaml (11 additions)
@@ -0,0 +1,11 @@
defaults:
  - dataloader: hf
  - pipeline: document_search
  - metrics:
      - precision_recall_f1
      - ranked_retrieval
  - _self_

logger:
  local: True
  neptune: False
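This retrieval config composes the data loader, pipeline, and metric configs shown earlier and enables only local logging by default; `evaluate.py` below loads it through Hydra (`config_name="retrieval"`) and picks the logging backend from the `logger.local` and `logger.neptune` flags.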
examples/evaluation/document-search/advanced/evaluate.py (56 additions)
@@ -0,0 +1,56 @@
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "ragbits-core[chroma]",
#     "ragbits-document-search[huggingface]",
#     "ragbits-evaluate[relari]",
# ]
# ///
import asyncio
import logging
from typing import cast

import hydra
from omegaconf import DictConfig, OmegaConf

from ragbits.evaluate.evaluator import Evaluator
from ragbits.evaluate.utils import log_evaluation_to_file, log_evaluation_to_neptune

logging.getLogger("LiteLLM").setLevel(logging.ERROR)
logging.getLogger("httpx").setLevel(logging.ERROR)


async def evaluate(config: DictConfig) -> None:
    """
    Document search evaluation runner.

    Args:
        config: Hydra configuration.
    """
    print("Starting evaluation...")

    evaluator_config = cast(dict, OmegaConf.to_container(config))
    results = await Evaluator.run_from_config(evaluator_config)

    if config.logger.local:
        output_dir = log_evaluation_to_file(results)
        print(f"Evaluation results saved under directory: {output_dir}")

    if config.logger.neptune:
        log_evaluation_to_neptune(results, config)
        print("Evaluation results uploaded to Neptune")


@hydra.main(config_path="config", config_name="retrieval", version_base="3.2")
def main(config: DictConfig) -> None:
    """
    Runs the evaluation process.

    Args:
        config: Hydra configuration.
    """
    asyncio.run(evaluate(config))


if __name__ == "__main__":
    main()
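As the updated README earlier in this diff shows, this script replaces the previous separate `ingest.py` entry point: `uv run evaluate.py` runs the baseline evaluation, the `logger.local=True` and `logger.neptune=True` overrides choose where results are logged, and `--multirun +experiments=chunking-250,chunking-500,chunking-1000` sweeps the chunking experiments.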
examples/evaluation/document-search/advanced/optimize.py (40 additions)
@@ -0,0 +1,40 @@
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "ragbits-core[chroma]",
#     "ragbits-document-search[huggingface]",
#     "ragbits-evaluate[relari]",
# ]
# ///
import logging
from typing import cast

import hydra
from omegaconf import DictConfig, OmegaConf

from ragbits.evaluate.optimizer import Optimizer
from ragbits.evaluate.utils import log_optimization_to_file

logging.getLogger("LiteLLM").setLevel(logging.ERROR)
logging.getLogger("httpx").setLevel(logging.ERROR)


@hydra.main(config_path="config", config_name="optimization", version_base="3.2")
def main(config: DictConfig) -> None:
    """
    Runs the optimization process.

    Args:
        config: Hydra configuration.
    """
    print("Starting optimization...")

    optimizer_config = cast(dict, OmegaConf.to_container(config))
    configs_with_scores = Optimizer.run_from_config(optimizer_config)

    output_dir = log_optimization_to_file(configs_with_scores)
    print(f"Optimization results saved under directory: {output_dir}")


if __name__ == "__main__":
    main()
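Per the README earlier in this diff, the study is started with `uv run optimize.py`, and `uv run optimize.py neptune_callback=True` additionally reports trials to Neptune; the scored configurations are written to disk via `log_optimization_to_file`.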