Riboflavin Kinase (Source)
brew install yt-dlp
brew install ffmpeg
cookies.txt, exported with a tool such as Get cookies.txt LOCALLY
Python 3.10 (required until the Ray library supports Python 3.11 or later)
pip install -r requirements.txt
The goal of the project is to design a processing pipeline that converts entire lecture series into 10-minute mini-lecture segments. The pipeline will consist of four scripts, covering everything from downloading the lecture series from YouTube to uploading the resulting segments back to YouTube.
The first script will use yt-dlp to download a lecture series from YouTube and store the downloaded videos in mp4 format. It will also use YouTube's automatic caption feature or a speech-to-text service to generate transcripts in srt format.
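As a rough sketch of the download step, the yt-dlp invocation could be assembled in Python as follows. The function name, the `videos` output folder, the `original_video` file name, and the choice of English captions are assumptions made for illustration; the flags themselves are standard yt-dlp options.

```python
from pathlib import Path


def build_download_command(url, out_dir="videos", cookies="cookies.txt"):
    """Return the yt-dlp argument list for one video or playlist URL."""
    # One folder per video, named after the video title (assumed layout).
    output_template = str(Path(out_dir) / "%(title)s" / "original_video.%(ext)s")
    return [
        "yt-dlp",
        "--cookies", cookies,             # browser cookies exported to a file
        "--merge-output-format", "mp4",   # merge video and audio into mp4
        "--write-auto-subs",              # fetch YouTube's automatic captions
        "--sub-langs", "en",              # assumption: English captions
        "--convert-subs", "srt",          # convert captions to srt
        "-o", output_template,
        url,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Building the command separately from running it keeps the step easy to inspect and test without network access.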
The second script will process the downloaded videos and transcripts, splitting them into 10-minute segments. The video can be split with a tool such as FFmpeg, and the transcript can be split according to the timestamps in the srt files.
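The segmentation step might look roughly like the sketch below. It builds an FFmpeg command that uses the segment muxer and splits an srt transcript by cue start time, shifting timestamps so each segment starts at zero. All function names are hypothetical. Two simplifications are assumed: with `-c copy` FFmpeg cuts at keyframes, so segments are only approximately 10 minutes long, and a cue that straddles a boundary is kept whole in the segment where it starts.

```python
import re
from collections import defaultdict

SEGMENT_SECONDS = 600  # 10 minutes
TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def build_ffmpeg_split_command(src, out_pattern, segment_seconds=SEGMENT_SECONDS):
    """Return an ffmpeg argument list that splits src without re-encoding."""
    return ["ffmpeg", "-i", src, "-c", "copy", "-map", "0",
            "-f", "segment", "-segment_time", str(segment_seconds),
            "-reset_timestamps", "1", out_pattern]


def to_seconds(stamp):
    """Convert an srt timestamp such as 00:10:05,500 to seconds."""
    h, m, s, ms = map(int, TIME_RE.fullmatch(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def to_stamp(seconds):
    """Convert seconds back to an srt timestamp."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def parse_srt(text):
    """Yield (start_seconds, end_seconds, caption) for each cue."""
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            if "-->" in line:
                start, end = line.split("-->")
                yield to_seconds(start), to_seconds(end), "\n".join(lines[i + 1:])
                break


def split_srt(text, segment_seconds=SEGMENT_SECONDS):
    """Return {segment_index: srt_text}, grouping cues by start time."""
    groups = defaultdict(list)
    for start, end, caption in parse_srt(text):
        index = int(start // segment_seconds)
        offset = index * segment_seconds
        groups[index].append((start - offset, end - offset, caption))
    result = {}
    for index, cues in sorted(groups.items()):
        blocks = [f"{n}\n{to_stamp(s)} --> {to_stamp(e)}\n{c}"
                  for n, (s, e, c) in enumerate(cues, start=1)]
        result[index] = "\n\n".join(blocks) + "\n"
    return result
```

For example, `build_ffmpeg_split_command("lecture.mp4", "segment_%03d.mp4")` yields a command that writes `segment_000.mp4`, `segment_001.mp4`, and so on, and `split_srt` returns one transcript per segment index.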
The third script will use the ChatGPT model to generate titles, summaries, and tags for each segment. It will take the segmented transcript as input and use the natural language processing capabilities of ChatGPT to extract the most salient points and generate a coherent summary.
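One way to structure the metadata step is sketched below. The prompt wording, the JSON reply format, the function names, and the `max_chars` cutoff are all assumptions for illustration. The actual ChatGPT call is left as a caller-supplied function `ask_model` (hypothetical) that takes a prompt string and returns the model's text reply, so the sketch does not depend on any particular client library.

```python
import json

MAX_TITLE_CHARS = 100  # YouTube limits video titles to 100 characters


def build_prompt(transcript, max_chars=12000):
    """Build the instruction sent to the model for one segment transcript."""
    excerpt = transcript[:max_chars]  # assumption: crude length cap
    return (
        "You are given the transcript of a 10-minute lecture segment.\n"
        'Reply with a JSON object with the keys "title" (string), '
        '"summary" (string), and "tags" (list of strings).\n\n'
        "Transcript:\n" + excerpt
    )


def parse_metadata(reply):
    """Validate and normalize the model's JSON reply."""
    data = json.loads(reply)
    tags = [str(t).strip() for t in data.get("tags", []) if str(t).strip()]
    return {
        "title": str(data["title"]).strip()[:MAX_TITLE_CHARS],
        "summary": str(data["summary"]).strip(),
        "tags": tags,
    }


def generate_metadata(transcript, ask_model):
    """Ask the model for metadata; ask_model maps a prompt to a reply string."""
    return parse_metadata(ask_model(build_prompt(transcript)))
```

Passing the model call in as a function also makes it straightforward to test the script with a stub before spending API credits.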
The fourth script will upload each segmented video, along with its associated title, summary, and tags, back to YouTube using the YouTube API. This script will handle any necessary YouTube authentication and provide an interface for entering the required metadata (such as channel ID and playlist).
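A minimal sketch of the request bodies for the upload step is shown below, following the YouTube Data API v3 resource shapes for `videos.insert` and `playlistItems.insert`. The function names, the input dictionary with `title`, `summary`, and `tags` keys, the default `private` privacy status, and category `"27"` (Education) are assumptions for illustration; authentication is not shown.

```python
def build_video_body(metadata, privacy="private", category_id="27"):
    """Build the videos.insert body from a segment's metadata dictionary."""
    return {
        "snippet": {
            "title": metadata["title"],
            "description": metadata["summary"],
            "tags": list(metadata["tags"]),
            "categoryId": category_id,  # assumption: "27" is Education
        },
        "status": {"privacyStatus": privacy},
    }


def build_playlist_item_body(playlist_id, video_id):
    """Build the playlistItems.insert body that adds a video to a playlist."""
    return {
        "snippet": {
            "playlistId": playlist_id,
            "resourceId": {"kind": "youtube#video", "videoId": video_id},
        }
    }
```

With the google-api-python-client library, these dictionaries would typically be passed as `body` to `youtube.videos().insert(part="snippet,status", body=..., media_body=...)` and `youtube.playlistItems().insert(part="snippet", body=...)`, where `youtube` is an authenticated service object.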
/videos
|
|--- /<video>
| |--- original_video
| |--- renamed_video
| |--- /segment
| | |--- segment_video
| | |--- segment_transcript
| | |--- segment_metadata
| |--- /segment
| | |--- segment_video
| | |--- segment_transcript
| | |--- segment_metadata
|--- /<video>
| |--- original_video
| |--- renamed_video
| |--- /segment
| | |--- segment_video
| | |--- segment_transcript
| | |--- segment_metadata
| |--- /segment
| | |--- segment_video
| | |--- segment_transcript
| | |--- segment_metadata
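The layout above could be produced by a small path helper along these lines. The numbered `segment_NNN` folder names and the file extensions (`.mp4`, `.srt`, `.json`) are assumptions, since the tree does not specify them.

```python
from pathlib import Path


def segment_paths(root, video_name, segment_index):
    """Return the expected paths for one segment of one video."""
    # Assumption: segment folders are numbered segment_000, segment_001, ...
    segment_dir = Path(root) / video_name / f"segment_{segment_index:03d}"
    return {
        "dir": segment_dir,
        "video": segment_dir / "segment_video.mp4",
        "transcript": segment_dir / "segment_transcript.srt",
        "metadata": segment_dir / "segment_metadata.json",
    }
```

Each script can call `segment_paths("videos", name, i)["dir"].mkdir(parents=True, exist_ok=True)` before writing, so all four scripts agree on where files live.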