Speaker Diarization with Pyannote and Whisper.cpp

Uses Whisper.cpp to transcribe audio, and then performs speaker diarization with Pyannote.
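
The two outputs have to be combined at some point: transcription yields timed text segments, and diarization yields timed speaker turns. The following is a minimal sketch of one common way to do that, labelling each segment with the speaker whose turns overlap it the most. It is not taken from main.py. The names Segment, Turn, overlap and assign_speakers are hypothetical, and the timestamps and speaker labels are made up for illustration.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    """A transcribed span of audio (times in seconds). Hypothetical type."""
    start: float
    end: float
    text: str


class Turn(NamedTuple):
    """One speaker talking from start to end. Hypothetical type."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Return the length in seconds shared by two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    segments: List[Segment], turns: List[Turn]
) -> List[Tuple[Optional[str], str]]:
    """Label each segment with the speaker that overlaps it the longest."""
    labeled = []
    for seg in segments:
        # Sum the overlap per speaker, since one speaker may have several turns.
        totals: Dict[str, float] = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        # None means no diarization turn overlaps this segment at all.
        speaker = max(totals, key=totals.get) if totals else None
        labeled.append((speaker, seg.text))
    return labeled


# Illustrative data only, not real tool output.
segments = [
    Segment(0.0, 2.0, "Hello there."),
    Segment(2.0, 5.0, "Hi, how are you?"),
]
turns = [
    Turn(0.0, 2.2, "SPEAKER_00"),
    Turn(2.2, 5.0, "SPEAKER_01"),
]

for speaker, text in assign_speakers(segments, turns):
    print(f"{speaker}: {text}")
```

Summing overlap per speaker, rather than picking the single longest turn, keeps the label stable when a speaker's turn is split by a short pause.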

Usage

Place video or audio files in input/, then run docker compose up, which runs main.py.

Diarization results seem to improve when the Whisper.cpp segment length is reduced, for example with --max-len 50 (the maximum segment length in characters).
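
This README does not say why shorter segments help. One plausible explanation, stated here as an assumption, is that a segment spanning a speaker change can only carry one speaker label, so part of it is necessarily attributed to the wrong speaker. The toy calculation below illustrates that idea with made-up numbers. The function mislabeled_time is hypothetical and is not part of this project.

```python
from typing import List, Tuple


def mislabeled_time(segments: List[Tuple[float, float]], change_at: float) -> float:
    """Seconds attributed to the wrong speaker if each segment gets one label.

    Assumes exactly two speakers, with a single change at `change_at`, and
    that each segment is labelled with whichever speaker covers more of it.
    """
    total = 0.0
    for start, end in segments:
        before = max(0.0, min(end, change_at) - start)  # first speaker's share
        after = max(0.0, end - max(start, change_at))   # second speaker's share
        total += min(before, after)                     # the minority share is mislabeled
    return total


# A 10-second clip with the speaker change at 4 seconds (illustrative values).
one_long_segment = [(0.0, 10.0)]
four_short_segments = [(0.0, 2.5), (2.5, 5.0), (5.0, 7.5), (7.5, 10.0)]

print(mislabeled_time(one_long_segment, change_at=4.0))     # 4.0
print(mislabeled_time(four_short_segments, change_at=4.0))  # 1.0
```

In this toy case only the one short segment that straddles the change is affected, so the mislabeled time drops from 4 seconds to 1 second.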