Advanced RAG Solution Accelerator

This solution accelerator supports advanced techniques for parsing, indexing, and querying unstructured data through a simple web interface, with the aim of achieving higher accuracy and better performance than a simple out-of-the-box Retrieval-Augmented Generation (RAG) solution.

To read more about the underlying design principles, architecture, and solution capabilities, please refer to the solution documentation here: Advanced RAG Solution Accelerator Documentation.

❗Important❗

Content in this repository is for demo purposes only and is not intended for production workloads. It focuses on showcasing advanced RAG techniques that improve on the accuracy and performance of standard RAG techniques.
In the context of a financial demo, it's important to understand the distinction between Microsoft's fiscal year and the calendar year. Microsoft's fiscal year runs from July 1st to June 30th of the following year, whereas the calendar year follows the traditional January 1st to December 31st timeline.
For best practices on evaluation, architecture, and validation of the solution design and outputs, see the 'Additional Resources' section in the solution documentation: Advanced RAG Solution Accelerator Documentation.
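The fiscal-year distinction above can be made concrete with a short sketch. It maps a calendar date to a fiscal year and quarter for a fiscal year that starts on July 1st. The function name and the labeling convention (a fiscal year is named after the calendar year in which it ends) are assumptions for illustration and are not taken from the repository.

```python
from datetime import date

FISCAL_YEAR_START_MONTH = 7  # fiscal year begins on July 1st


def fiscal_year_and_quarter(d: date) -> tuple[int, int]:
    """Map a calendar date to (fiscal_year, fiscal_quarter).

    Assumption: the fiscal year is labeled by the calendar year in which
    it ends, so July 1, 2023 through June 30, 2024 is fiscal year 2024.
    """
    # Months elapsed since the fiscal year started (0 for July, 11 for June).
    offset = (d.month - FISCAL_YEAR_START_MONTH) % 12
    fiscal_year = d.year + 1 if d.month >= FISCAL_YEAR_START_MONTH else d.year
    fiscal_quarter = offset // 3 + 1
    return fiscal_year, fiscal_quarter
```

Under these assumptions, a question about "Q1" in a financial report refers to July through September, not January through March, which is why a query layer needs to know which calendar it is working in.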

Use Case: Copilot for Financial Reports

A custom RAG application can be highly beneficial when dealing with large financial datasets like quarterly reports and multi-year company performance records. Here's why such a custom solution may be necessary:

Efficient Handling of Large Data Volumes: Financial data accumulated over multiple years can be massive. Standard off-the-shelf solutions might struggle with performance issues when indexing, retrieving, and processing such extensive datasets. A custom RAG application can be optimized to handle large volumes efficiently, ensuring quick response times.
Domain-Specific Knowledge and Contextual Understanding: Financial data is rich with industry-specific terminology, acronyms, and complex concepts. A custom RAG solution can be tuned on domain-specific corpora to better understand questions and generate accurate responses about financial statements, performance metrics, and regulatory filings.
Enhanced Query Capabilities: Financial professionals may need to ask complex queries that involve conditional logic, comparisons over time, or aggregations across different data dimensions. A custom solution can support advanced querying capabilities, natural language understanding tailored to financial contexts, and more accurate interpretation of user intent.

To showcase the solution’s capabilities, a pre-recorded voiceover demonstrates its functionality, ranging from simple queries to complex multimodal interactions. Watch the Demo Video below and follow along with the Demo Script.

Advanced_RAG_Techniques_Demo.mp4

Features

The repository includes a complete end-to-end solution, comprising:

A frontend application for seamless user interaction
Backend microservices to handle core functionalities
A dataset of financial reports to quickly set up and test the solution
A generic ingestion service that enhances chunks with metadata to drastically improve search results
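To illustrate what "enhancing chunks with metadata" can look like, here is a minimal sketch. The `Chunk` class, the `enrich_chunk` and `to_index_document` functions, and the metadata fields (`source`, `page`, `section`) are all hypothetical; the accelerator's actual schema and ingestion code may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A piece of a source document plus descriptive metadata (hypothetical schema)."""
    text: str
    metadata: dict[str, str] = field(default_factory=dict)


def enrich_chunk(text: str, source: str, page: int, section: str) -> Chunk:
    """Attach metadata so the index can filter and rank on more than raw text."""
    return Chunk(
        text=text,
        metadata={"source": source, "page": str(page), "section": section},
    )


def to_index_document(chunk_id: str, chunk: Chunk) -> dict[str, str]:
    """Flatten a chunk into one record; each metadata field becomes its own key."""
    # Prefix the searchable content with the section title so a query that
    # names the section can match even if the chunk body does not repeat it.
    content = f"{chunk.metadata.get('section', '')}\n{chunk.text}".strip()
    return {"id": chunk_id, "content": content, **chunk.metadata}
```

A record produced this way can be filtered by `source` or `page` and still matched on its text, which is one plausible way metadata helps retrieval.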

Architecture

The solution includes the following key components:

Ingestion Service: The ingestion service includes various enhancements to minimize information loss when raw content is ingested. Additional metadata is attached to the chunks stored in the index to make search results more relevant.
Enhanced User Interaction: This includes a frontend for users to interact with the bot and an implementation of a queuing layer in the backend. This allows users to send multiple questions to the copilot, and the copilot can produce multiple responses to a user query, making the overall experience more engaging.
Core Microservices and Skills: This includes the Orchestrator, which executes various skills to best address the user query. The core microservices handle different aspects of the solution, such as session management, data processing, runtime configuration, and orchestration. Specialized skills provide specific capabilities, such as a search skill designed specifically for the financial domain.
Testing and Evaluation: This includes the ability to simulate conversations with the copilot and run certain end-to-end tests on demand, plus an evaluation tool for end-to-end evaluation of the copilot.
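The interplay between the queuing layer and the Orchestrator described above can be sketched roughly as follows. This is an illustrative toy, not the accelerator's implementation: the class, the skill functions, and the keyword-based routing are all assumptions, and a real orchestrator would likely use a more capable way to select skills.

```python
from collections import deque
from typing import Callable

Skill = Callable[[str], str]


def financial_search_skill(query: str) -> str:
    """Placeholder for a search skill tuned to financial documents."""
    return f"[financial-search] results for: {query}"


def general_chat_skill(query: str) -> str:
    """Placeholder fallback skill for everything else."""
    return f"[chat] reply to: {query}"


class Orchestrator:
    """Queues incoming questions and routes each one to a skill (hypothetical design)."""

    def __init__(self) -> None:
        self.pending: deque[str] = deque()
        # (keywords, skill) pairs, checked in order; keyword matching is a stand-in.
        self.routes: list[tuple[tuple[str, ...], Skill]] = [
            (("revenue", "income", "fiscal", "quarter"), financial_search_skill),
        ]

    def submit(self, query: str) -> None:
        """Accept a question without blocking on earlier ones."""
        self.pending.append(query)

    def select_skill(self, query: str) -> Skill:
        lowered = query.lower()
        for keywords, skill in self.routes:
            if any(keyword in lowered for keyword in keywords):
                return skill
        return general_chat_skill

    def drain(self) -> list[str]:
        """Process queued questions in arrival order and collect the responses."""
        responses: list[str] = []
        while self.pending:
            query = self.pending.popleft()
            responses.append(self.select_skill(query)(query))
        return responses
```

The queue decouples submitting a question from answering it, which is what lets a user send several questions in a row; the routing table is the piece a real system would replace with its own skill-selection logic.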

For more information about these components, please refer to the solution guide documentation.

Getting Started

To set up and start using this project, follow our Getting Started Guide. It provides step-by-step instructions for both Azure resources and local environments.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct.

Resources:

Microsoft Open Source Code of Conduct
Microsoft Code of Conduct FAQ
Contact opencode@microsoft.com with questions or concerns

For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Responsible AI Guidelines

This project follows the responsible AI guidelines and best practices listed below. Please review them before using this project:

Microsoft Responsible AI Guidelines
Responsible AI practices for Azure OpenAI models
Safety evaluations transparency notes

Dataset License

This dataset is released under the Community Data License Agreement – Permissive, Version 2.0 (CDLA). See the LICENSE-DATA file for details.

License

This project is licensed under the MIT License. See the LICENSE file for details.
