Verbilobot is a Telegram bot written in Go that transcribes voice messages, video notes, and other media files. It uses the Groq API for transcription and FFmpeg to convert incoming audio into a format that Groq handles well.
Important
However you plan to run the bot, make sure to rename the .env.example file to .env and fill in your Telegram Bot Token and Groq API Token.
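The bot cannot do anything useful without both tokens, so a startup check along the following lines can be handy. This is only a sketch: the variable names `TELEGRAM_BOT_TOKEN` and `GROQ_API_KEY` are hypothetical (use whatever names .env.example actually lists), and it assumes the values from .env have already been loaded into the process environment.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// config holds the two secrets the bot needs.
type config struct {
	TelegramToken string
	GroqToken     string
}

// loadConfig reads both tokens through the supplied lookup function and
// reports which one is missing. The variable names are hypothetical and
// are not taken from the Verbilobot source.
func loadConfig(getenv func(string) string) (config, error) {
	cfg := config{
		TelegramToken: getenv("TELEGRAM_BOT_TOKEN"),
		GroqToken:     getenv("GROQ_API_KEY"),
	}
	if cfg.TelegramToken == "" {
		return config{}, errors.New("TELEGRAM_BOT_TOKEN is not set")
	}
	if cfg.GroqToken == "" {
		return config{}, errors.New("GROQ_API_KEY is not set")
	}
	return cfg, nil
}

func main() {
	// os.Getenv only sees the process environment, so .env must have
	// been loaded beforehand (for example by Docker Compose).
	if _, err := loadConfig(os.Getenv); err != nil {
		fmt.Fprintln(os.Stderr, "configuration error:", err)
		os.Exit(1)
	}
	fmt.Println("configuration looks complete")
}
```

Passing the lookup function as a parameter keeps the check easy to test without touching the real environment.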
To build and run the project locally, you will need to have Go and FFmpeg installed on your machine.
On Linux:
git clone "https://github.com/bytebone/verbilobot.git"
cd verbilobot
go build -v -o verbilobot .
./verbilobot
Or on Windows:
git clone "https://github.com/bytebone/verbilobot.git"
cd verbilobot
go build -v -o verbilobot.exe .
start verbilobot.exe
To build and run the project with Docker, you will need to have Docker installed on your machine.
git clone "https://github.com/bytebone/verbilobot.git"
cd verbilobot/docker
cp ../.env.example ./.env   # then fill in both tokens in .env before starting
docker compose up --build -d
Thanks to Docker being awesome, this works the same on any platform.
The bot usually takes around 2 seconds to come online. Once it is running, forward any audio or video file to it to start a transcription. Thanks to Groq's high speed, a minute of incoming audio takes only a few moments to transcribe and return to your chat. The main bottleneck is the local transcoding, which can take a noticeable amount of time.
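To give a feel for the local transcoding step, here is a minimal sketch of calling FFmpeg from Go. The function names and the target format (16 kHz mono FLAC) are assumptions for illustration and may differ from what Verbilobot actually does; it also assumes `ffmpeg` is on your PATH.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// transcodeArgs builds the ffmpeg arguments for converting any input file
// to 16 kHz mono FLAC. These settings are an assumption, not taken from
// the Verbilobot source.
func transcodeArgs(input, output string) []string {
	return []string{
		"-y",        // overwrite the output file if it already exists
		"-i", input, // input media file (audio or video)
		"-vn",       // drop any video stream
		"-ac", "1",  // downmix to a single channel
		"-ar", "16000", // resample to 16 kHz
		"-c:a", "flac", // encode the audio as FLAC
		output,
	}
}

// transcode runs ffmpeg and returns its combined output on failure.
func transcode(input, output string) error {
	cmd := exec.Command("ffmpeg", transcodeArgs(input, output)...)
	if out, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("ffmpeg failed: %w: %s", err, out)
	}
	return nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: transcode <input> <output.flac>")
		os.Exit(2)
	}
	if err := transcode(os.Args[1], os.Args[2]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Because FFmpeg has to decode the whole file before anything is sent to Groq, long videos spend most of their time in this step.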