Skip to content

sdelahaies/IAssistant

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

IAssistant: An Adaptable AI Voice Assistant

IAssistant is a demo project to start with a customizable AI voice assistant. It is build by adapting the core functionalities of the AlwaysReddy project. AlwaysReddy is designed for multi-OS environments, with a couple of clients to choose from, it simplifies a variety of tasks through speech and text-based interaction, featuring enhanced extensibility and user-friendly controls via a terminal based app with hotkeys controls.

IAssistant is meant for linux, Windows and Mac stuff has been striped out, remote AI clients have been removed as well to prefer opensource local tools but hence requires a GPU with sufficient VRAM to allow for small latency.

Disclaimer: This project is a starter project, currently under construction. It provides basic logic and simple use cases, far from perfect and subject to occasional bugs. Have you ever received an awkward answer from an LLM due to a messy prompt? Imagine when your assistant speaks it out loud! While it’s a fun and half-functional demonstration, it’s still a work in progress—designed to explore possibilities, not polished for production by any means. Use with curiosity and a pinch of patience! 😅

Features

  • Clipboard Integration: Leverages clipboard content for completing tasks based on user requests.
  • PDF Handling: Enables querying and interaction with content in PDF documents.
  • Customizable Components:
    • Transcriber engine: Use Faster Whisper or OpenAI Whisper.
    • Transcriber model: Choose from available models like openai/whisper-large-v3-turbo, tiny.en, etc.
    • LLM: Integrate any compatible LLM via Ollama with adaptable prompts.
    • System prompts: Tailor prompts to suit specific use cases.
    • Text-to-speech voice: Utilize Piper to set the desired voice.
  • Speech Output: Conversational answers are spoken aloud.
  • Clipboard Outputs: Outputs formatted text or code directly to the clipboard for further use.
  • Python GTK Interface: A graphical interface replaces terminal-based logic, enabling intuitive settings adjustment (e.g., model, voice, prompt).

Installation

Ensure your environment is set up for Python-based AI development tasks. Then, follow these steps:

  1. Clone the repository:

    git clone https://github.com/sdelahaies/IAssistant.git
    cd IAssistant
  2. Install the assistant using the setup script:

    python setup.py
  3. Start the assistant:

    ./run_IAssistant.sh

install voices

A default english voice is present in the repo piper_tss/voices/voice_en_1/, download other voices here, place them in piper_tss/voices/ and update voice_list in main.py.

TODO

  • update system prompts
  • allow for the use of uv instead of virtualenv
  • add specific use cases
  • keep up cleaning up the repo!
  • Vision Model Integration: Enable handling of images copied to the clipboard for document-related queries.
  • Image Generation: Add functionality to generate images using diffuser models (e.g., Stable Diffusion XL).

References

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published