A self-hosted, privacy-focused RAG (Retrieval-Augmented Generation) interface for intelligent document interaction. Turn any document into a knowledge base you can chat with.
DocuChat transforms your documents into an interactive knowledge base. Its AI models run entirely on-premises, so you can hold natural-language conversations with your documents while your data stays inside your own environment.
- Complete Data Isolation: All documents and conversations stay within your network
- On-Premises Processing: AI models run locally, ensuring no data leaves your secure environment
- Local Vector Storage: Document embeddings are stored in your local Milvus instance
- Network Control: No external API dependencies for core functionality
The system uses the following model configurations by default:
- LLM Model: ibm-granite/granite-3.1-1b-a400m-instruct
- Embedding Model: ibm-granite/granite-embedding-30m-english
You can configure different models based on your needs:
- Smaller models for faster responses and lower resource usage
- Larger models for higher quality responses when compute resources are available
- Balance between model size and performance based on your hardware capabilities
- Fully on-premises deployment for maximum security and privacy
- All documents and embeddings stored locally in your secure environment
- No external API calls - all processing happens within your network
- Self-contained AI models running locally
- Interactive web interface for document Q&A
- Support for loading content from:
  - Local files
  - Local directories (recursive scanning)
  - URLs
- Support for multiple document formats:
  - PDF documents
  - HTML pages
  - Markdown files
  - Plain text files
- Flexible model selection to balance performance and resource usage
- Configurable AI models to match your hardware capabilities
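The source types and formats listed above can be illustrated with a minimal sketch of how an input might be resolved into a list of loadable documents. This is not the project's actual loader: the function names `is_url` and `discover_sources` and the extension set are assumptions made for illustration only.

```python
from pathlib import Path
from typing import List
from urllib.parse import urlparse

# Assumed mapping of the listed formats (PDF, HTML, Markdown, plain text)
# to file extensions; the project itself may recognize a different set.
SUPPORTED_EXTENSIONS = {".pdf", ".html", ".htm", ".md", ".txt"}


def is_url(source: str) -> bool:
    """Return True if the source looks like an http(s) URL."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def discover_sources(source: str) -> List[str]:
    """Resolve a URL, file, or directory into a list of loadable sources.

    Hypothetical helper: directories are scanned recursively and only
    files with a supported extension are returned.
    """
    if is_url(source):
        return [source]

    path = Path(source)
    if path.is_dir():
        return sorted(
            str(p)
            for p in path.rglob("*")
            if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
        )
    if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS:
        return [str(path)]
    return []
```

Calling `discover_sources("./docs")` would return every supported file under `./docs`, while a URL is passed through unchanged so a downloader can fetch it later.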
- Python 3.8+
- GPU (recommended) or CPU for model inference
- Clone the repository:
git clone https://github.com/yaacov/rag-chat-interface.git
cd rag-chat-interface
- Install dependencies:
# Optional: create and activate a virtual environment
python3.10 -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
- Start the server:
python main.py \
[--source INITIAL_SOURCE] \
[--host HOST] \
[--port PORT] \
[--db-path DB_PATH] \
[--models-cache-dir CACHE_DIR] \
[--downloads-dir DOWNLOADS_DIR] \
[--chunk_size CHUNK_SIZE] \
[--chunk_overlap CHUNK_OVERLAP]
Arguments:
- --source: Optional initial document or directory to load
- --host: Host to bind the server to (default: 0.0.0.0)
- --port: Port to bind the server to (default: 8000)
- --db-path: Path to the Milvus database file (default: ./rag_milvus.db)
- --models-cache-dir: Directory to store downloaded models (default: ./models_cache)
- --downloads-dir: Directory to store downloaded files (default: ./downloads)
- --chunk_size: Maximum size of each document chunk (default: 1000 characters)
- --chunk_overlap: Overlap between chunks (default: 200 characters)
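To make the options above concrete, here is a minimal sketch of how the documented flags could be parsed with `argparse` and how the chunk size and overlap might drive a simple character-based splitter. The function names `build_parser` and `chunk_text` are hypothetical, and the splitting strategy is an assumption; the project may split on different boundaries.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--source", default=None)
    parser.add_argument("--host", default="0.0.0.0")
    parser.add_argument("--port", type=int, default=8000)
    parser.add_argument("--db-path", default="./rag_milvus.db")
    parser.add_argument("--models-cache-dir", default="./models_cache")
    parser.add_argument("--downloads-dir", default="./downloads")
    parser.add_argument("--chunk_size", type=int, default=1000)
    parser.add_argument("--chunk_overlap", type=int, default=200)
    return parser


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into character chunks that share `chunk_overlap` characters.

    Assumed strategy: a fixed-size sliding window over characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    # Explicit argument list so the demo does not depend on the real command line.
    args = build_parser().parse_args(["--chunk_size", "500", "--chunk_overlap", "100"])
    pieces = chunk_text("x" * 1200, args.chunk_size, args.chunk_overlap)
    print(len(pieces), [len(p) for p in pieces])
```

With the defaults, each chunk starts 800 characters after the previous one, so neighbouring chunks share 200 characters. A larger overlap preserves more context across chunk boundaries at the cost of storing more embeddings.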
- Open your browser and navigate to http://localhost:8000
MIT License
Contributions are welcome! Please feel free to submit a Pull Request.