This solution accelerator supports advanced techniques for parsing, indexing, and querying unstructured data through a simple web interface, with the goal of achieving better accuracy and performance than a simple out-of-the-box Retrieval-Augmented Generation (RAG) solution.
To read more about the underlying design principles, architecture, and solution capabilities, please refer to the solution documentation here: Advanced RAG Solution Accelerator Documentation.
- Content in this repository is for demo purposes only and is not intended for 'production-ready' workloads. It focuses on showcasing advanced RAG techniques that improve on the accuracy and performance of standard RAG techniques.
- In the context of a financial demo, it's important to understand the distinction between Microsoft's fiscal year and the calendar year. Microsoft's fiscal year runs from July 1st to June 30th of the following year, whereas the calendar year follows the traditional January 1st to December 31st timeline.
- For more information on best practices for evaluation, architecture, or validation of the solution design and outputs, please see the 'Additional Resources' section in the solution documentation: Advanced RAG Solution Accelerator Documentation.
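The fiscal-year note above can be made concrete with a small helper. This is a minimal sketch, not code from this repository: the function name `fiscal_year_and_quarter` and the `start_month` parameter are hypothetical, and it assumes the fiscal year is labeled by the calendar year in which it ends (so the year starting July 1, 2023 is FY2024).

```python
from datetime import date


def fiscal_year_and_quarter(d: date, start_month: int = 7) -> tuple[int, int]:
    """Map a calendar date to (fiscal_year, fiscal_quarter).

    Assumption: the fiscal year starts on the 1st of `start_month` and is
    named after the calendar year in which it ends.
    """
    # Number of whole months elapsed since the fiscal year began (0..11).
    months_into_year = (d.month - start_month) % 12
    quarter = months_into_year // 3 + 1
    # Dates on or after the start month belong to the fiscal year that ends
    # in the next calendar year. A January start is just the calendar year.
    rolls_over = start_month > 1 and d.month >= start_month
    fiscal_year = d.year + 1 if rolls_over else d.year
    return fiscal_year, quarter


if __name__ == "__main__":
    for sample in (date(2023, 7, 1), date(2023, 12, 31), date(2024, 6, 30)):
        fy, q = fiscal_year_and_quarter(sample)
        print(f"{sample.isoformat()} -> FY{fy} Q{q}")
```

Under these assumptions, a question about "Q1 of fiscal year 2024" refers to July through September 2023, which is worth keeping in mind when checking the demo's answers against calendar-year figures.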
- Use Case: Copilot for Financial Reports
- Features
- Architecture
- Getting Started
- Code of Conduct
- Responsible AI Guidelines
- Dataset License
- License
A custom RAG application can be highly beneficial when dealing with large financial datasets like quarterly reports and multi-year company performance records. Here's why such a custom solution may be necessary:
- Efficient Handling of Large Data Volumes: Financial data accumulated over multiple years can be massive. Standard off-the-shelf solutions might struggle with performance issues when indexing, retrieving, and processing such extensive datasets. A custom RAG application can be optimized to handle large volumes efficiently, ensuring quick response times.
- Domain-Specific Knowledge and Contextual Understanding: Financial data is rich with industry-specific terminology, acronyms, and complex concepts. A custom RAG solution could be optimized on domain-specific corpora to better understand and generate accurate responses related to financial statements, performance metrics, and regulatory filings.
- Enhanced Query Capabilities: Financial professionals may need to ask complex queries that involve conditional logic, comparisons over time, or aggregations across different data dimensions. A custom solution can support advanced querying capabilities, natural language understanding tailored to financial contexts, and more accurate interpretation of user intent.
To showcase the solution’s capabilities, a pre-recorded voiceover demonstrates its functionality, ranging from simple queries to complex multimodal interactions. Watch the Demo Video below and follow along with the Demo Script.
Advanced_RAG_Techniques_Demo.mp4
The repository includes a complete end-to-end solution, comprising:
- A frontend application for seamless user interaction
- Backend microservices to handle core functionalities
- A dataset of financial reports to quickly set up and test the solution
- A generic ingestion service that enhances chunks with metadata to drastically improve search results
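To illustrate what "enhancing chunks with metadata" can look like, here is a minimal sketch. It is not the repository's ingestion service: the `Chunk` class, the `chunk_with_metadata` function, the paragraph-based splitting rule, and the metadata keys (`source`, `page`, `chunk_id`) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_with_metadata(pages: list[str], source: str, max_chars: int = 500) -> list[Chunk]:
    """Split page texts into paragraph-aligned chunks and tag each chunk.

    Hypothetical scheme: paragraphs are packed greedily into a chunk until
    adding the next one would exceed `max_chars`. A single paragraph longer
    than `max_chars` is kept whole rather than cut mid-sentence.
    """
    chunks: list[Chunk] = []

    def emit(text: str, page: int) -> None:
        # Metadata recorded here can later be used to filter or boost results.
        metadata = {"source": source, "page": page, "chunk_id": len(chunks)}
        chunks.append(Chunk(text=text, metadata=metadata))

    for page_number, page_text in enumerate(pages, start=1):
        buffer = ""
        for paragraph in page_text.split("\n\n"):
            paragraph = paragraph.strip()
            if not paragraph:
                continue
            # +2 accounts for the blank line that joins two paragraphs.
            if buffer and len(buffer) + len(paragraph) + 2 > max_chars:
                emit(buffer, page_number)
                buffer = paragraph
            else:
                buffer = f"{buffer}\n\n{paragraph}" if buffer else paragraph
        if buffer:
            emit(buffer, page_number)
    return chunks
```

In a real pipeline the metadata would typically carry richer fields (for example a reporting period or section title) that the search index can filter on; the three keys above only show the shape of the idea.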
The solution includes the following key components:
- Ingestion Service: The ingestion service includes various enhancements to ensure that when raw content is ingested, there is minimal loss of information. Additional metadata is added to the chunks in the index to make search results more relevant.
- Enhanced User Interaction: This includes a frontend for users to interact with the bot and an implementation of a queuing layer in the backend. This allows users to send multiple questions to the copilot, and the copilot can produce multiple responses to a user query, making the overall experience more engaging.
- Core Microservices and Skills: This includes the Orchestrator, which executes various skills to best address the user query. The core microservices handle different aspects of the solution, such as session management, data processing, runtime configuration, and orchestration. Specialized skills provide specific capabilities, such as a search skill designed specifically for the financial domain.
- Testing and Evaluation: This includes the ability to simulate conversations with the copilot, run certain end-to-end tests on demand, and an evaluation tool to help perform end-to-end evaluation of the copilot.
For more information about these components, please refer to the solution guide documentation.
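As a rough mental model of how an orchestrator might dispatch a user query to a skill, consider the sketch below. It is an illustration only and does not reflect the repository's actual Orchestrator: the `Skill` and `Orchestrator` classes, the skill names, and the keyword-based routing rule are all hypothetical (a real orchestrator would more likely rely on a language model or a configured plan to choose skills).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Skill:
    name: str
    can_handle: Callable[[str], bool]  # decides whether the skill applies
    run: Callable[[str], str]          # produces the skill's response


class Orchestrator:
    """Tries each registered skill in order and falls back to a default."""

    def __init__(self, skills: list[Skill], fallback: Skill) -> None:
        self.skills = skills
        self.fallback = fallback

    def handle(self, query: str) -> str:
        for skill in self.skills:
            if skill.can_handle(query):
                return skill.run(query)
        return self.fallback.run(query)


# Hypothetical skills: a finance-aware search and a generic chat fallback.
FINANCE_TERMS = ("revenue", "income", "fiscal")

financial_search = Skill(
    name="financial_search",
    can_handle=lambda q: any(term in q.lower() for term in FINANCE_TERMS),
    run=lambda q: f"[financial_search] results for: {q}",
)
general_chat = Skill(
    name="general_chat",
    can_handle=lambda q: True,
    run=lambda q: f"[general_chat] {q}",
)

orchestrator = Orchestrator([financial_search], fallback=general_chat)

if __name__ == "__main__":
    print(orchestrator.handle("What was revenue in FY2023?"))
    print(orchestrator.handle("Hello"))
```

Keeping each skill behind a small, uniform interface like this is one way to add domain-specific capabilities without changing the dispatch logic.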
To set up and start using this project, follow our Getting Started Guide. It provides step-by-step instructions for both Azure resources and local environments.
This project has adopted the Microsoft Open Source Code of Conduct.
Resources:
- Microsoft Open Source Code of Conduct
- Microsoft Code of Conduct FAQ
- Contact [email protected] with questions or concerns
For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
This project follows the responsible AI guidelines and best practices listed below. Please review them before using this project:
- Microsoft Responsible AI Guidelines
- Responsible AI practices for Azure OpenAI models
- Safety evaluations transparency notes
This dataset is released under the Community Data License Agreement – Permissive, Version 2.0 (CDLA). See the LICENSE-DATA file for details.
This project is licensed under the MIT License. See the LICENSE file for details.