v0.1.0 updates
AstraBert committed Apr 5, 2024
1 parent 77410a4 commit 5a11aab
Showing 7 changed files with 333 additions and 8 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -0,0 +1,3 @@
flagged/
scripts/__pycache__
docker/build_command.sh
34 changes: 26 additions & 8 deletions README.md
@@ -4,17 +4,21 @@
_Go and give it a try [here](https://hf.co/chat/assistant/660d9a4f590a7924eed02a32)!_ 🤖

<div align="center">
<hr>
<img src="https://img.shields.io/github/languages/top/AstraBert/everything-rag" alt="GitHub top language">
<img src="https://img.shields.io/github/commit-activity/t/AstraBert/everything-rag" alt="GitHub commit activity">
<img src="https://img.shields.io/badge/everything_rag-partially stable-orange" alt="Static Badge">
<img src="https://img.shields.io/badge/Release-v0.0.0-blue" alt="Static Badge">
<img src="https://img.shields.io/badge/Release-v0.1.0-blue" alt="Static Badge">
<div>
<a href="https://astrabert.github.io/everything-rag/"><img src="./data/example_chat.png" alt="Example chat" align="center"></a>
<p><i>Example chat with everything-rag, mediated by google/flan-t5-base</i></p>
</div>
</div>


### Table of Contents

1. [Introduction](#introduction)
2. [Inspiration](#inspiration)
3. [Getting Started](#getting-started)
4. [Using the Chatbot](#using-the-chatbot)
5. [Troubleshooting](#troubleshooting)
@@ -38,22 +42,35 @@ While everything-rag offers many benefits, there are a couple of limitations to

In summary, everything-rag is a simple, customizable, and local chatbot assistant that offers a wide range of features and capabilities. By leveraging the power of RAG, everything-rag offers a unique and flexible chatbot experience that can be tailored to your specific needs and preferences. Whether you're looking for a simple chatbot to answer basic questions or a more advanced conversational AI to engage with your users, everything-rag has got you covered.😊
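The retrieval-augmented flow this summary alludes to can be sketched in a few lines of plain Python (a toy example: a bag-of-words cosine retriever stands in for the real embedding model and vector DB, and all function names here are illustrative, not taken from this repository):

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> Counter:
    # Lowercased word counts as a crude stand-in for an embedding
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and keep the top k
    return sorted(docs, key=lambda d: cosine(tokenize(query), tokenize(d)), reverse=True)[:k]

def augmented_prompt(query: str, docs: list[str]) -> str:
    # Prepend the retrieved passages to the question before generation
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "Data science combines statistics and programming.",
    "Bread is baked from flour and water.",
]
print(augmented_prompt("What is data science?", docs))
```

In the actual app the retriever is a persistent vector database built from your PDFs, but the shape of the pipeline (retrieve, then augment, then generate) is the same.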

## Inspiration

This project is a humble and modest carbon-copy of its main and true inspirations, i.e. [Jan.ai](https://jan.ai/), [Cheshire Cat AI](https://cheshirecat.ai/), [privateGPT](https://privategpt.io/) and many other projects that focus on making LLMs (and AI in general) open-source and easily accessible to everyone.

## Getting Started

You can do two things:

- Play with generation on [Kaggle](https://www.kaggle.com/code/astrabertelli/gemma-for-datasciences)
- Clone this repository, head over to [the python script](./scripts/gemma_for_datasciences.py) and modify everything to your needs!
- Docker installation (⚠️**NOT YET FULLY IMPLEMENTED**): in the near future, you will be able to install everything-rag as a Docker image and run it, thanks to Docker, with these really simple commands:

```bash
docker pull ghcr.io/AstraBert/everything-rag:latest
docker run -e "model=microsoft/phi-2" -e "task=text-generation" ghcr.io/AstraBert/everything-rag:latest
```
- As you can see, you just need to specify the LLM model and its task. Keep in mind that, as of v0.1.0, everything-rag supports only *text-generation* and *text2text-generation* (and specifying the task is not even mandatory: if omitted, the image directly employs google/flan-t5-base, a text2text-generation model). For these two tasks you can use virtually *any* model from the HuggingFace Hub; the sole recommendation is to watch your disk space, RAM and CPU power, as LLMs can be quite resource-consuming!

## Using the Chatbot

### GUI

The chatbot has a simple GUI built using tkinter. The GUI displays the chat history and allows the user to input queries. The user can send a message by pressing the "Send" button.
The chatbot has a brand-new Gradio-based interface that runs on a local server. You can interact with it by uploading your PDF files and/or sending messages, all (for now) by running:

```bash
python3 scripts/chat.py -m provider/modelname -t task
```
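The `-m`/`-t` flags shown above are presumably parsed along these lines (a hypothetical sketch of the CLI, not the repository's actual code; the defaults mirror the ones documented for the Docker image):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the CLI shown above: -m provider/modelname, -t task
    parser = argparse.ArgumentParser(description="everything-rag chat (sketch)")
    parser.add_argument("-m", "--model", default="google/flan-t5-base",
                        help="HuggingFace Hub model id, e.g. microsoft/phi-2")
    parser.add_argument("-t", "--task", default="text2text-generation",
                        choices=["text-generation", "text2text-generation"],
                        help="the only two tasks supported in v0.1.0")
    return parser

args = build_parser().parse_args(["-m", "microsoft/phi-2", "-t", "text-generation"])
print(args.model, args.task)
```

Passing an unsupported task would fail fast at the `choices` check rather than deep inside the model-loading code.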

### Code breakdown
### Code breakdown - notebook

Everything is explained in [the dedicated notebook](./scripts/gemma-for-datasciences.ipynb), but here's a brief breakdown of the code:

@@ -66,7 +83,6 @@ Everything is explained in [the dedicated notebook

Et voilà, your chatbot is up and running!🦿


## Troubleshooting

### Common Issues Q&A
@@ -77,8 +93,10 @@ Et voilà, your chatbot is up and running!🦿
> A: This is quite common in resource-limited environments dealing with models that are too large or too small: large models require **at least** 32 GB of RAM and a >8-core CPU, whereas small models can easily hallucinate, producing responses that are endless repetitions of the same thing! Check the *penalty_score* parameter to avoid this, and **try rephrasing the query, being as specific as possible**.
* Q: My model is hallucinating and/or repeating the same sentence over and over again😵‍💫
> A: This is quite common with small or old models: check the *penalty_score* and *temperature* parameters to avoid this.
* The chatbot is giving incorrect/non-meaningful answers🤥
* Q: The chatbot is giving incorrect/non-meaningful answers🤥
> A: Check that the PDF document is relevant and up-to-date. Also, **try rephrasing the query and be as specific as possible**.
* Q: An error occurred while generating the answer💔
> A: This frequently occurs when your (small) LLM has a limited maximum hidden size (generally 512 or 1024) and the context produced by the retrieval-augmented chain goes beyond that maximum. You could, potentially, modify the model's configuration, but that would mean dramatically increasing its resource consumption, and your small laptop is not prepared to take it, trust me! If you have enough RAM and CPU power, a solution is to switch to a larger LLM: those do not have problems in this sense.

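Another cheap mitigation, short of swapping models, is to cap the retrieved context before it reaches the chain. A minimal sketch (this helper is illustrative, not part of the repository; whitespace splitting stands in for the model's real tokenizer, and the 512-word budget matches the hidden sizes mentioned above):

```python
def clip_context(chunks: list[str], budget: int = 512) -> str:
    """Greedily keep whole retrieved chunks until the (rough) token budget is hit."""
    kept, used = [], 0
    for chunk in chunks:
        n = len(chunk.split())  # crude proxy for the tokenizer's length
        if used + n > budget:
            break  # dropping a whole chunk beats truncating mid-sentence
        kept.append(chunk)
        used += n
    return "\n".join(kept)

# Three 300-word chunks: only the first fits the 512-word budget
chunks = [("tok " * 300).strip()] * 3
clipped = clip_context(chunks)
print(len(clipped.split()))  # → 300
```

A real implementation would count tokens with the model's own tokenizer, since word counts undershoot subword counts.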
## Contributing

@@ -92,10 +110,10 @@ Contributions are welcome! If you would like to improve the chatbot's functional
* [Langchain-community](https://github.com/langchain-community/langchain-community)
* [Tkinter](https://docs.python.org/3/library/tkinter.html)
* [PDF document about data science](https://www.kaggle.com/datasets/astrabertelli/what-is-datascience-docs)
* [Gradio](https://www.gradio.app/)

## License


This project is licensed under the Apache 2.0 License.

If you use this work for your projects, please consider citing the author [Astra Bertelli](http://astrabert.vercel.app).
Binary file added data/example_chat.png
44 changes: 44 additions & 0 deletions docker/Dockerfile
@@ -0,0 +1,44 @@
# Use an official Python runtime as a parent image
FROM python:3.10-slim-buster

# Set the working directory in the container to /app
WORKDIR /app

# Add the current directory contents into the container at /app
ADD . /app

# Update and install system dependencies
RUN apt-get update && apt-get install -y \
build-essential \
libpq-dev \
libffi-dev \
libssl-dev \
musl-dev \
libxml2-dev \
libxslt1-dev \
zlib1g-dev \
&& rm -rf /var/lib/apt/lists/*

# Install Python dependencies (purge any pre-existing wheel cache; "|| true" makes this a harmless no-op on a fresh image)
RUN python3 -m pip cache purge || true
RUN python3 -m pip install --no-cache-dir -r requirements.txt

# Set version
ARG version="v0.1.0"

# Echo version
RUN echo "Started everything-rag ${version}"

# Set default model and task
ENV model="google/flan-t5-base"

ENV task="text2text-generation"

# Run script to install and build the model for the first time
RUN python3 utils.py -m ${model} -t ${task}

# Expose the port that the application will run on
EXPOSE 7860

# Set the entrypoint with defaults the user can override via -e; shell form is
# required so that ${model} and ${task} are expanded at run time (the JSON exec
# form does not perform variable substitution)
ENTRYPOINT python3 chat.py -m ${model} -t ${task}
78 changes: 78 additions & 0 deletions docker/chat.py
@@ -0,0 +1,78 @@
import gradio as gr
import os
import time
from utils import *  # brings in model, tokenizer, tsk, pipeline, just_chatting, merge_pdfs, create_a_persistent_db, convert_none_to_str

vectordb = ""

def generate_welcome_message():
    return (None, "Hello! Welcome to the chatbot. You can enter a message or upload a file.")

def print_like_dislike(x: gr.LikeData):
    print(x.index, x.value, x.liked)

def add_message(history, message):
    # Uploaded files and text each become their own history entry
    if len(message["files"]) > 0:
        history.append((message["files"], None))
    if message["text"] is not None and message["text"] != "":
        history.append((message["text"], None))
    return history, gr.MultimodalTextbox(value=None, interactive=False)


def bot(history):
    global vectordb
    if type(history[-1][0]) != tuple:
        # Last entry is a text message
        if vectordb == "":
            # No document uploaded yet: fall back to plain generation
            pipe = pipeline(tsk, tokenizer=tokenizer, model=model)
            response = pipe(history[-1][0])[0]["generated_text"]
        else:
            try:
                response = just_chatting(model=model, tokenizer=tokenizer, query=history[-1][0], vectordb=vectordb, chat_history=[convert_none_to_str(his) for his in history])["answer"]
            except Exception as e:
                # Surface the error to the user instead of silently dropping it
                response = f"Sorry, the error '{e}' occurred while generating the response; check [troubleshooting documentation](https://astrabert.github.io/everything-rag/#troubleshooting) for more"
    else:
        # Last entry is a file upload: merge the PDFs and build the vector DB
        filelist = list(history[-1][0])
        finalpdf = merge_pdfs(filelist)
        vectordb = create_a_persistent_db(finalpdf, os.path.dirname(finalpdf)+"_localDB", os.path.dirname(finalpdf)+"_embcache")
        response = "VectorDB was successfully created, now you can ask me anything about the document you uploaded!😊"
    # Stream the reply character by character
    history[-1][1] = ""
    for character in response:
        history[-1][1] += character
        time.sleep(0.05)
        yield history

with gr.Blocks() as demo:
    chatbot = gr.Chatbot(
        [[None, "Hi, I'm **everything-rag**🤖.\nI'm here to assist you and let you chat with _your_ pdfs!\nCheck [my website](https://astrabert.github.io/everything-rag/) for troubleshooting and documentation reference\nHave fun!😊"]],
        label="everything-rag",
        elem_id="chatbot",
        bubble_full_width=False,
    )

    chat_input = gr.MultimodalTextbox(interactive=True, file_types=["pdf"], placeholder="Enter message or upload file...", show_label=False)

    chat_msg = chat_input.submit(add_message, [chatbot, chat_input], [chatbot, chat_input])
    bot_msg = chat_msg.then(bot, chatbot, chatbot, api_name="bot_response")
    bot_msg.then(lambda: gr.MultimodalTextbox(interactive=True), None, [chat_input])

    chatbot.like(print_like_dislike, None, None)
    gr.ClearButton(chatbot)

demo.queue()
if __name__ == "__main__":
    demo.launch()


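The character-by-character streaming used in `bot()` above can be isolated into a small reusable generator (a sketch; the per-character delay is parameterized and set to 0 here so the example runs instantly, whereas the app uses 0.05 s):

```python
import time

def stream_reply(history, response, delay=0.0):
    """Grow the last assistant message one character at a time, yielding the
    whole history after each step, as Gradio's Chatbot expects from a
    generator callback."""
    history[-1][1] = ""
    for ch in response:
        history[-1][1] += ch
        time.sleep(delay)
        yield history

history = [["hi", None]]
for state in stream_reply(history, "ok"):
    pass  # each yielded state is the same list object, mutated in place
print(history[-1][1])  # → ok
```

Because the generator mutates `history` in place, the UI framework sees a progressively longer reply on every yield; consumers that need independent snapshots would have to copy each state.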
10 changes: 10 additions & 0 deletions docker/requirements.txt
@@ -0,0 +1,10 @@
langchain-community==0.0.13
langchain==0.1.1
pypdf==3.17.4
sentence_transformers==2.2.2
chromadb==0.4.22
gradio
torch
transformers
trl
peft
