From 6dcb31150cda228b3d2d2d9ff9cf100b6e4265a5 Mon Sep 17 00:00:00 2001
From: kdziedzic68
Date: Mon, 9 Dec 2024 14:32:13 +0100
Subject: [PATCH] chore: docs for eval (#203)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: Mateusz Hordyński <26008518+mhordynski@users.noreply.github.com>
Co-authored-by: Mateusz Hordyński
---
 docs/how-to/evaluate/custom_dataloader.md     |  9 ++++
 .../evaluate/custom_evaluation_pipeline.md    |  7 ++++
 docs/how-to/evaluate/custom_metric.md         | 26 ++++++++++++
 docs/how-to/evaluate/evaluate.md              | 42 +++++++++++++++++++
 .../how-to/{ => evaluate}/generate_dataset.md |  2 +-
 docs/how-to/{ => evaluate}/optimize.md        |  0
 mkdocs.yml                                    |  9 +++++--
 7 files changed, 92 insertions(+), 3 deletions(-)
 create mode 100644 docs/how-to/evaluate/custom_dataloader.md
 create mode 100644 docs/how-to/evaluate/custom_evaluation_pipeline.md
 create mode 100644 docs/how-to/evaluate/custom_metric.md
 create mode 100644 docs/how-to/evaluate/evaluate.md
 rename docs/how-to/{ => evaluate}/generate_dataset.md (99%)
 rename docs/how-to/{ => evaluate}/optimize.md (100%)

diff --git a/docs/how-to/evaluate/custom_dataloader.md b/docs/how-to/evaluate/custom_dataloader.md
new file mode 100644
index 000000000..abfb0ccfd
--- /dev/null
+++ b/docs/how-to/evaluate/custom_dataloader.md
@@ -0,0 +1,9 @@
+# How to create a custom DataLoader for Ragbits evaluation
+
+Ragbits provides a base interface for data loading, `ragbits.evaluate.loaders.base.DataLoader`, designed specifically for evaluation purposes. A ready-to-use implementation, `ragbits.evaluate.loaders.hf.HFLoader`, is available for handling datasets in the Hugging Face format.
+
+To create a custom DataLoader for your specific needs, implement the `load` method in a class that inherits from the `DataLoader` interface.
+
+See the [working example](optimize.md#define-the-data-loader) in the optimization guide.
+
+**Note:** This interface is not to be confused with PyTorch's `DataLoader`; it serves a distinct purpose within the Ragbits evaluation framework.
diff --git a/docs/how-to/evaluate/custom_evaluation_pipeline.md b/docs/how-to/evaluate/custom_evaluation_pipeline.md
new file mode 100644
index 000000000..4e380ad91
--- /dev/null
+++ b/docs/how-to/evaluate/custom_evaluation_pipeline.md
@@ -0,0 +1,7 @@
+# How to create a custom Evaluation Pipeline for Ragbits evaluation
+
+Ragbits provides a ready-to-use evaluation pipeline for document search, implemented as `ragbits.evaluate.document_search.DocumentSearchPipeline`.
+
+To create a custom evaluation pipeline for your specific use case, implement the `__call__` method in a class that inherits from the `ragbits.evaluate.pipelines.base.EvaluationPipeline` interface.
+
+See the [working example](optimize.md#define-the-optimized-pipeline-structure) in the optimization guide.
\ No newline at end of file
diff --git a/docs/how-to/evaluate/custom_metric.md b/docs/how-to/evaluate/custom_metric.md
new file mode 100644
index 000000000..0e277fd9c
--- /dev/null
+++ b/docs/how-to/evaluate/custom_metric.md
@@ -0,0 +1,26 @@
+# How to create a custom Metric for Ragbits evaluation
+
+The `ragbits.evaluate` package provides metrics that measure the quality of a document search pipeline on your data, implemented within `ragbits.evaluate.metrics.document_search`.
+You are not limited to these, however: to implement a custom metric for your specific use case, inherit from
+the `ragbits.evaluate.metrics.base.Metric` abstract class and implement the `compute` method.
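+
+For illustration, here is a minimal sketch of such a metric. The exact
+signature of `compute` and the shape of the results it receives are
+assumptions made for this example; check the `Metric` base class in your
+installed version of Ragbits for the exact interface:
+
+```python
+from ragbits.evaluate.metrics.base import Metric
+
+
+class ExactMatch(Metric):
+    """Hypothetical metric: fraction of answers that exactly match the reference."""
+
+    def compute(self, results: list) -> dict:
+        # Assumes each result exposes `predicted_answer` and `reference_answer`;
+        # adapt the attribute names to your pipeline's result type.
+        matches = sum(r.predicted_answer == r.reference_answer for r in results)
+        return {"exact_match": matches / len(results)}
+```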
+
+See the [working example](optimize.md#define-the-metrics-and-run-the-experiment) in the optimization guide.
\ No newline at end of file
diff --git a/docs/how-to/evaluate/evaluate.md b/docs/how-to/evaluate/evaluate.md
new file mode 100644
index 000000000..176bccf0d
--- /dev/null
+++ b/docs/how-to/evaluate/evaluate.md
@@ -0,0 +1,42 @@
+# How to Evaluate with Ragbits
+
+Ragbits provides an interface for evaluating pipelines with specified metrics. You can plug in any evaluation pipeline and metrics that comply with this interface.
+
+Before running the evaluation, ensure the following prerequisites are met:
+
+1. Define the `EvaluationPipeline` structure class ([Example](optimize.md#define-the-optimized-pipeline-structure))
+2. Define the metrics and organize them into a `MetricSet` ([Example](optimize.md#define-the-metrics-and-run-the-experiment))
+3. Define the `DataLoader` ([Example](optimize.md#define-the-data-loader))
+
+The evaluator itself is straightforward and requires no additional configuration to instantiate. Once the three prerequisites are in place, running the evaluation is as simple as:
+
+```python
+import asyncio
+
+from omegaconf import OmegaConf
+
+from ragbits.evaluate.evaluator import Evaluator
+from ragbits.evaluate.metrics.base import MetricSet
+
+
+async def main() -> None:
+    # Replace the placeholder classes and the `{...}` configs below with your
+    # own pipeline, metric, and dataloader implementations and their settings.
+    pipeline_config = OmegaConf.create({...})
+    pipeline = YourPipelineClass(config=pipeline_config)
+
+    # Instantiate each metric with its config and bundle them into a MetricSet.
+    metrics = [SomeMetric(OmegaConf.create({...})) for SomeMetric in your_metrics]
+    metric_set = MetricSet(*metrics)
+
+    dataloader = YourDataLoaderClass(OmegaConf.create({...}))
+
+    evaluator = Evaluator()
+    eval_results = await evaluator.compute(pipeline=pipeline, metrics=metric_set, dataloader=dataloader)
+    print(eval_results)
+
+
+asyncio.run(main())
+```
+
+After a successful execution, your console should print a dictionary with keys corresponding to the components of each metric and values equal to the results aggregated over the defined dataloader.
\ No newline at end of file
diff --git a/docs/how-to/generate_dataset.md b/docs/how-to/evaluate/generate_dataset.md
similarity index 99%
rename from docs/how-to/generate_dataset.md
rename to docs/how-to/evaluate/generate_dataset.md
index 0df1034b9..63cd7fce6 100644
--- a/docs/how-to/generate_dataset.md
+++ b/docs/how-to/evaluate/generate_dataset.md
@@ -1,4 +1,4 @@
-# Generating a Dataset with Ragbits
+# How to Generate a Dataset with Ragbits
 
 Ragbits offers a convenient feature to generate artificial QA datasets for evaluating Retrieval-Augmented Generation (RAG) systems.
 You can choose between two different approaches:
diff --git a/docs/how-to/optimize.md b/docs/how-to/evaluate/optimize.md
similarity index 100%
rename from docs/how-to/optimize.md
rename to docs/how-to/evaluate/optimize.md
diff --git a/mkdocs.yml b/mkdocs.yml
index 29fa2b500..75f9d05aa 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -12,10 +12,8 @@ nav:
   - How-to Guides:
     - how-to/use_prompting.md
    - how-to/prompts_lab.md
-    - how-to/optimize.md
     - how-to/use_guardrails.md
     - how-to/integrations/promptfoo.md
-    - how-to/generate_dataset.md
     - Document Search:
       - how-to/document_search/async_processing.md
       - how-to/document_search/create_custom_execution_strategy.md
@@ -23,6 +21,13 @@
       - how-to/document_search/use_rephraser.md
       - how-to/document_search/use_reranker.md
       - how-to/document_search/distributed_ingestion.md
+    - Evaluate:
+      - how-to/evaluate/optimize.md
+      - how-to/evaluate/generate_dataset.md
+      - how-to/evaluate/evaluate.md
+      - how-to/evaluate/custom_metric.md
+      - how-to/evaluate/custom_evaluation_pipeline.md
+      - how-to/evaluate/custom_dataloader.md
   - API Reference:
     - Core:
       - api_reference/core/prompt.md