chore: docs for eval (#203)
Co-authored-by: Mateusz Hordyński <[email protected]>
3 people authored Dec 9, 2024
1 parent 32ba29c commit 6dcb311
Showing 7 changed files with 74 additions and 3 deletions.
9 changes: 9 additions & 0 deletions docs/how-to/evaluate/custom_dataloader.md
@@ -0,0 +1,9 @@
# How to create custom DataLoader for Ragbits evaluation

Ragbits provides a base interface for data loading, `ragbits.evaluate.loaders.base.DataLoader`, designed specifically for evaluation purposes. A ready-to-use implementation, `ragbits.evaluate.loaders.hf.HFLoader`, is available for handling datasets in the Hugging Face format.

To create a custom DataLoader for your specific needs, you need to implement the `load` method in a class that inherits from the `DataLoader` interface.

Please find the [working example](optimize.md#define-the-data-loader) here.

**Note:** This interface is not to be confused with PyTorch's `DataLoader`, as it serves a distinct purpose within the Ragbits evaluation framework.
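
For illustration, below is a minimal sketch of a custom loader that reads evaluation records from a JSONL file. The class name, the constructor, and the exact `load` signature are assumptions made for this example; consult the `DataLoader` base class in your Ragbits version for the actual contract.

```python
import json
from pathlib import Path

from ragbits.evaluate.loaders.base import DataLoader


class JSONLinesLoader(DataLoader):
    """Hypothetical loader that reads evaluation records from a JSONL file."""

    def __init__(self, path: str) -> None:
        # The base class may expect a config object instead of a plain path;
        # adjust the constructor to match your Ragbits version.
        self.path = Path(path)

    async def load(self) -> list[dict]:
        # `load` is assumed here to be async and to return the collection of
        # records that the evaluation pipeline iterates over; check the
        # `DataLoader` base class for the exact signature and return type.
        with self.path.open() as file:
            return [json.loads(line) for line in file]
```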
8 changes: 8 additions & 0 deletions docs/how-to/evaluate/custom_evaluation_pipeline.md
@@ -0,0 +1,8 @@
# How to create custom Evaluation Pipeline for Ragbits evaluation

Ragbits provides a ready-to-use evaluation pipeline for document search, implemented as `ragbits.evaluate.document_search.DocumentSearchPipeline`.

To create a custom evaluation pipeline for your specific use case, implement the `__call__` method in a class that inherits from the `ragbits.evaluate.pipelines.base.EvaluationPipeline` interface.

Please find the [working example](optimize.md#define-the-optimized-pipeline-structure) here.
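
As a rough sketch, a custom pipeline could look like the following. The record format, the return value, and the `retriever` component are illustrative assumptions; the actual `__call__` contract is defined by the `EvaluationPipeline` base class.

```python
from ragbits.evaluate.pipelines.base import EvaluationPipeline


class SimpleRetrievalPipeline(EvaluationPipeline):
    """Hypothetical pipeline that wraps an arbitrary retrieval component."""

    def __init__(self, retriever) -> None:
        # `retriever` stands in for whatever component you want to evaluate.
        self.retriever = retriever

    async def __call__(self, data: dict) -> dict:
        # `__call__` is assumed to receive a single evaluation record and to
        # return an output that the metrics can score; check the base class
        # for the exact input and output types.
        retrieved = await self.retriever.search(data["question"])
        return {"question": data["question"], "retrieved": retrieved}
```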
7 changes: 7 additions & 0 deletions docs/how-to/evaluate/custom_metric.md
@@ -0,0 +1,7 @@
# How to create custom Metric for Ragbits evaluation

The `ragbits.evaluate` package provides metrics that measure the quality of a document search pipeline on your data, implemented in `ragbits.evaluate.metrics.document_search`. You are not limited to these: to implement a custom metric for your specific use case, inherit from the `ragbits.evaluate.metrics.base.Metric` abstract class and implement its `compute` method.

Please find the [working example](optimize.md#define-the-metrics-and-run-the-experiment) here.
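
A minimal sketch of such a metric is shown below. The structure of the pipeline results and the `compute` signature (which may be async in your Ragbits version) are assumptions; consult the `Metric` base class for the exact contract.

```python
from ragbits.evaluate.metrics.base import Metric


class ExactMatch(Metric):
    """Hypothetical metric: fraction of results whose prediction matches the reference."""

    def compute(self, results: list[dict]) -> dict:
        # Each result is assumed to carry a predicted and a reference answer;
        # the actual item structure depends on your evaluation pipeline.
        matches = [
            1.0 if item["predicted"].strip() == item["reference"].strip() else 0.0
            for item in results
        ]
        return {"exact_match": sum(matches) / len(matches) if matches else 0.0}
```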
41 changes: 41 additions & 0 deletions docs/how-to/evaluate/evaluate.md
@@ -0,0 +1,41 @@
# How to Evaluate with Ragbits

Ragbits provides an interface for evaluating pipelines using specified metrics. In general, you can plug in any evaluation pipeline and metrics that comply with this interface.

Before running the evaluation, ensure the following prerequisites are met:

1. Define the `EvaluationPipeline` structure class ([Example](optimize.md#define-the-optimized-pipeline-structure))
2. Define the `Metrics` and organize them into a `MetricSet` ([Example](optimize.md#define-the-metrics-and-run-the-experiment))
3. Define the `DataLoader` ([Example](optimize.md#define-the-data-loader))

The evaluator interface itself is straightforward and requires no additional configuration to instantiate. Once the three prerequisites are complete, running the evaluation is as simple as:


```python
import asyncio

from omegaconf import OmegaConf

from ragbits.evaluate.evaluator import Evaluator
from ragbits.evaluate.metrics.base import MetricSet


async def main():
    # YourPipelineClass, SomeMetric / your_metrics and YourDataLoaderClass are
    # placeholders for the pipeline, metric and data loader classes defined in
    # the prerequisites above; {...} stands for their respective configs.
    pipeline_config = OmegaConf.create({...})
    pipeline = YourPipelineClass(config=pipeline_config)

    metrics = [SomeMetric(OmegaConf.create({...})) for SomeMetric in your_metrics]
    metric_set = MetricSet(*metrics)

    dataloader = YourDataLoaderClass(OmegaConf.create({...}))

    evaluator = Evaluator()

    eval_results = await evaluator.compute(pipeline=pipeline, metrics=metric_set, dataloader=dataloader)
    print(eval_results)


asyncio.run(main())
```

After successful execution, your console should print a dictionary with keys corresponding to the components of each metric and values equal to the results aggregated over the data provided by the defined data loader.
@@ -1,4 +1,4 @@
# Generating a Dataset with Ragbits
# How to Generate a Dataset with Ragbits

Ragbits offers a convenient feature to generate artificial QA datasets for evaluating Retrieval-Augmented Generation (RAG) systems. You can choose between two different approaches:

File renamed without changes.
10 changes: 8 additions & 2 deletions mkdocs.yml
@@ -12,17 +12,23 @@ nav:
  - How-to Guides:
      - how-to/use_prompting.md
      - how-to/prompts_lab.md
      - how-to/optimize.md
      - how-to/use_guardrails.md
      - how-to/integrations/promptfoo.md
      - how-to/generate_dataset.md
      - Document Search:
          - how-to/document_search/async_processing.md
          - how-to/document_search/create_custom_execution_strategy.md
          - how-to/document_search/search_documents.md
          - how-to/document_search/use_rephraser.md
          - how-to/document_search/use_reranker.md
          - how-to/document_search/distributed_ingestion.md
      - Evaluate:
          - how-to/evaluate/optimize.md
          - how-to/evaluate/generate_dataset.md
          - how-to/evaluate/evaluate.md
          - how-to/evaluate/custom_metric.md
          - how-to/evaluate/custom_evaluation_pipeline.md
          - how-to/evaluate/custom_metric.md
          - how-to/evaluate/custom_dataloader.md
  - API Reference:
      - Core:
          - api_reference/core/prompt.md
