IAssistant is a demo project for getting started with a customizable AI voice assistant. It is built by adapting the core functionality of the AlwaysReddy project. AlwaysReddy is designed for multi-OS environments and offers a couple of clients to choose from. It simplifies a variety of tasks through speech and text-based interaction, featuring enhanced extensibility and user-friendly hotkey controls via a terminal-based app.
IAssistant is meant for Linux: the Windows and Mac code has been stripped out. Remote AI clients have been removed as well in favor of open-source local tools, which in turn requires a GPU with sufficient VRAM to keep latency low.
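Since local models depend on available VRAM, a quick check before installing can help. The following is a minimal sketch, assuming an NVIDIA GPU with the `nvidia-smi` tool on the PATH; the function names `parse_vram_mib` and `query_vram_mib` are hypothetical and not part of this repository.

```python
import shutil
import subprocess


def parse_vram_mib(output: str) -> list:
    """Parse one integer (MiB) per line, skipping blank or non-numeric lines."""
    values = []
    for line in output.splitlines():
        text = line.strip()
        if text.isdigit():
            values.append(int(text))
    return values


def query_vram_mib() -> list:
    """Return total VRAM in MiB for each detected GPU, or [] if unavailable.

    Assumption: an NVIDIA driver is installed and nvidia-smi is on the PATH.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_vram_mib(result.stdout)


if __name__ == "__main__":
    gpus = query_vram_mib()
    if not gpus:
        print("No NVIDIA GPU detected (or nvidia-smi is not installed).")
    for index, mib in enumerate(gpus):
        print(f"GPU {index}: {mib} MiB total VRAM")
```

How much VRAM counts as sufficient depends on the Whisper and Ollama models selected, so the sketch only reports the totals and leaves the threshold to the reader.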
Disclaimer: This project is a starter project, currently under construction. It provides basic logic and simple use cases; it is far from perfect and subject to occasional bugs. Have you ever received an awkward answer from an LLM due to a messy prompt? Imagine when your assistant speaks it out loud! While it’s a fun and half-functional demonstration, it’s still a work in progress, designed to explore possibilities and not polished for production by any means. Use with curiosity and a pinch of patience! 😅
- Clipboard Integration: Leverages clipboard content for completing tasks based on user requests.
- PDF Handling: Enables querying and interaction with content in PDF documents.
- Customizable Components:
  - Transcriber engine: Use Faster Whisper or OpenAI Whisper.
  - Transcriber model: Choose from available models like openai/whisper-large-v3-turbo, tiny.en, etc.
  - LLM: Integrate any compatible LLM via Ollama with adaptable prompts.
  - System prompts: Tailor prompts to suit specific use cases.
  - Text-to-speech voice: Utilize Piper to set the desired voice.
- Speech Output: Conversational answers are spoken aloud.
- Clipboard Outputs: Outputs formatted text or code directly to the clipboard for further use.
- Python GTK Interface: A graphical interface replaces terminal-based logic, enabling intuitive settings adjustment (e.g., model, voice, prompt).
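The features above suggest a simple flow: transcribe speech, query the LLM, then either speak the reply or copy it to the clipboard. The following is a minimal sketch of that flow and not the project's actual code. The names `AssistantConfig` and `handle_request`, the `llama3` model name, and the rule that replies containing a fenced code block go to the clipboard are all assumptions made for illustration. The real engines (Whisper, Ollama, Piper) are passed in as plain callables so they can be swapped.

```python
from dataclasses import dataclass
from typing import Callable

FENCE = "`" * 3  # markdown code fence marker, built this way to keep the example readable


@dataclass
class AssistantConfig:
    """Hypothetical settings mirroring the customizable components."""
    transcriber_model: str = "tiny.en"
    llm_model: str = "llama3"  # assumed name of a model served by Ollama
    system_prompt: str = "You are a helpful voice assistant."
    voice: str = "voice_en_1"


def handle_request(
    audio_path: str,
    config: AssistantConfig,
    transcribe: Callable[[str, str], str],
    generate: Callable[[str, str, str], str],
    speak: Callable[[str, str], None],
    copy_to_clipboard: Callable[[str], None],
) -> str:
    """Run one request through the pipeline and return where the reply went."""
    # 1. Speech to text with the configured transcriber model.
    question = transcribe(audio_path, config.transcriber_model)
    # 2. Ask the LLM, passing the system prompt and the transcribed question.
    reply = generate(config.llm_model, config.system_prompt, question)
    # 3. Route the reply (assumed rule): code goes to the clipboard, the rest is spoken.
    if FENCE in reply:
        copy_to_clipboard(reply)
        return "clipboard"
    speak(reply, config.voice)
    return "speech"


if __name__ == "__main__":
    # Stub engines stand in for Whisper, Ollama, and Piper.
    destination = handle_request(
        "question.wav",
        AssistantConfig(),
        transcribe=lambda path, model: "What time is it?",
        generate=lambda model, system, question: "I cannot see a clock.",
        speak=lambda text, voice: print(f"[{voice}] {text}"),
        copy_to_clipboard=lambda text: print(f"[clipboard] {text}"),
    )
    print(destination)
```

Keeping the engines behind callables means that changing the transcriber, model, or voice only touches the configuration and the function passed in, which matches the idea of adjusting them from a settings interface.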
Ensure your environment is set up for Python-based AI development tasks. Then, follow these steps:
- Clone the repository:
    git clone https://github.com/sdelahaies/IAssistant.git
    cd IAssistant
- Install the assistant using the setup script:
    python setup.py
- Start the assistant:
    ./run_IAssistant.sh
A default English voice is included in the repo at piper_tss/voices/voice_en_1/. To use other Piper voices, download them, place them in piper_tss/voices/, and update voice_list in main.py.
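Instead of editing the list by hand, the installed voices could be discovered from the folder. The following is a minimal sketch, assuming each voice lives in its own subfolder (as voice_en_1/ suggests) and contains a Piper `.onnx` model file; the helper `discover_voices` is hypothetical, and the actual structure of voice_list in main.py may differ.

```python
from pathlib import Path


def discover_voices(voices_dir: str) -> list:
    """Return sorted names of subfolders that contain at least one .onnx file."""
    root = Path(voices_dir)
    if not root.is_dir():
        # A missing folder yields an empty list rather than an error.
        return []
    return sorted(
        folder.name
        for folder in root.iterdir()
        if folder.is_dir() and any(folder.glob("*.onnx"))
    )


if __name__ == "__main__":
    # Path taken from the README; run from the repository root.
    print(discover_voices("piper_tss/voices"))
```

Folders without a model file are skipped, so a partially downloaded voice does not show up in the result.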
- Update system prompts
- Allow the use of uv instead of virtualenv
- Add specific use cases
- Keep cleaning up the repo!
- Vision Model Integration: Enable handling of images copied to the clipboard for document-related queries.
- Image Generation: Add functionality to generate images using diffuser models (e.g., Stable Diffusion XL).
- Using AI to Solve Real-World Problems
- AlwaysReddy: Terminal-based AI Assistant
- Ollama: Get up and running with large language models
- OpenAI Whisper Large v3 Turbo
- Faster Whisper
- Piper: Neural Text-to-Speech
- The Python GTK+ 3 Tutorial
- Need a logo for your app? Check out this Hugging Face Space. Prompt: [Style: Minimalist] [Color: black and White] [Concept: AI] [Text: 'IA'] [Background: circular vortex]