This project implements a Retrieval-Augmented Generation (RAG) conversational AI chatbot powered by Groq and Gradio. It uses Sentence Transformers for document retrieval and calls pre-trained large language models (LLMs) through Groq for question answering. The chatbot accepts several input modalities: text documents, audio files, URLs, images, and CSV/Excel files.
- Multimodal interaction: Text, audio, URL, image, and CSV/Excel file uploads
- Contextual awareness for improved understanding of user queries
- Retrieval of relevant document segments to support responses
- Integration with various LLM models via Groq API
- Confidence score for answer reliability (to be implemented)
- Python 3.9 (https://www.python.org/downloads/)
- Transformers library (https://huggingface.co/docs/transformers/en/index)
- SentenceTransformers library (https://huggingface.co/sentence-transformers)
- Groq account and API key (https://groq.com/)
- Gradio library (https://www.gradio.app/guides/quickstart)
- Additional libraries for specific modalities:
  - librosa for audio (`pip install librosa`)
  - PyMuPDF for PDF, imported in code as `fitz` (`pip install PyMuPDF`)
  - pandas for CSV/Excel (`pip install pandas`)

To install everything at once:

`pip install transformers sentence-transformers groq gradio librosa PyMuPDF pandas`

Alternatively, you can install only the required libraries based on the modalities you want to support.
- Clone this repository or download the project files.
- Install the required libraries (see Requirements).
- Configure your Groq API key (obtain from your Groq account).
  - You can set the `GROQ_API_KEY` environment variable.
  - Alternatively, you can modify the `answer_question` function in `logic.py` to provide the API key as an argument.
- Prepare your input documents (text files, audio files, URLs, images, or CSV/Excel files).
  - For text files, ensure each line is a separate document.
  - For audio files, ensure they are in a supported format (e.g., WAV, MP3, MP4).
  - For URLs, ensure they are valid links.
  - For images, ensure they are in a supported format (e.g., PNG, JPG, JPEG).
  - For CSV/Excel files, make sure the relevant text column is identified.
- (Optional) If using CSV/Excel files, modify the `csv_to_dict_list` function in `logic.py` to adjust column names or formatting as needed.
- Run the Gradio app: `python groq_app.py`
- Open http://localhost:1274 in your web browser to interact with the chatbot.
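
The API key step above offers two options: the `GROQ_API_KEY` environment variable or an explicit argument. A minimal sketch of how the two could be combined is shown here. The helper name `resolve_groq_api_key` and its signature are hypothetical and are not taken from `logic.py`; only the environment variable name comes from this README.

```python
import os
from typing import Mapping, Optional


def resolve_groq_api_key(explicit_key: Optional[str] = None,
                         env: Mapping[str, str] = os.environ) -> str:
    """Return the Groq API key, preferring an explicit argument.

    Hypothetical helper for illustration; it is not part of logic.py.
    """
    # Fall back to the GROQ_API_KEY environment variable when no key is passed.
    key = (explicit_key or env.get("GROQ_API_KEY", "")).strip()
    if not key:
        # Fail early with a readable message instead of a later API error.
        raise RuntimeError(
            "No Groq API key found. Set GROQ_API_KEY or pass the key explicitly."
        )
    return key
```

The returned string can then be handed to whatever creates the Groq client, for example `Groq(api_key=resolve_groq_api_key())` from the `groq` package. Failing early keeps a missing key from surfacing as a confusing error in the middle of a chat session.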
- `logic.py`: Contains the core logic for document processing, retrieval, and question answering using RAG and Groq, including handling of the different modalities.
- `groq_app.py`: Handles the Gradio app interface and its interaction with `logic.py`. It defines functions such as `print_like_dislike` and `add_message`, along with functions for processing audio, URLs, PDFs, images, and CSV/Excel files.
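
For the CSV/Excel step mentioned in the setup instructions, the sketch below shows one way a chosen text column could be turned into a list of document dictionaries. It is not the project's `csv_to_dict_list`; the function name `csv_rows_to_dicts`, the `text_column` parameter, and the output keys are assumptions made for illustration, and it uses only the standard-library `csv` module rather than pandas.

```python
import csv
import io
from typing import Dict, List, TextIO


def csv_rows_to_dicts(handle: TextIO, text_column: str) -> List[Dict[str, object]]:
    """Read CSV rows and keep the non-empty values of one text column.

    Hypothetical stand-in for illustration; not the project's csv_to_dict_list.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or text_column not in reader.fieldnames:
        raise ValueError(f"Column {text_column!r} not found in {reader.fieldnames}")

    documents: List[Dict[str, object]] = []
    for index, row in enumerate(reader, start=1):
        text = (row.get(text_column) or "").strip()
        if text:  # skip rows with no usable text
            documents.append({"text": text, "source_row": index})
    return documents


# Usage with an in-memory CSV; a real file would be opened with open(path, newline="").
sample = io.StringIO("id,body\n1,Hello world\n2,\n3,  Second doc  \n")
docs = csv_rows_to_dicts(sample, text_column="body")
```

Keeping the row number alongside the text makes it possible to point back to the source row when a retrieved segment is shown to the user.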
- The current implementation assumes pre-computed document embeddings. You may need to modify the code to generate embeddings from scratch if necessary.
- Consider adding error handling and logging for a more robust user experience.
- The confidence score for answer reliability is not yet implemented but can be integrated in the future.
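
To illustrate the note on pre-computed embeddings, here is a minimal sketch of the retrieval step: ranking documents by cosine similarity between a query vector and stored document vectors. The function names and the toy two-dimensional vectors are assumptions for illustration. In the actual project the vectors would presumably come from a Sentence Transformers model (its `encode` method), and `logic.py` may rank documents differently.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          doc_vecs: Sequence[Sequence[float]],
          k: int = 3) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs for the k most similar documents."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy pre-computed embeddings (illustrative values, not real model output).
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query_vec = [1.0, 0.1]
best = top_k(query_vec, doc_vecs, k=2)
best_indices = [index for index, _ in best]
```

The text of the top-ranked documents would then be placed in the prompt sent to the LLM. For larger collections, a vectorized or indexed search is usually preferable to this linear scan.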
Feel free to submit pull requests for improvements or bug fixes.