Project: Toddler subtitles

Overview

This project brings to life the imaginative stories of a 3-year-old by using the OpenAI API to create videos with AI-generated images and subtitles. Recordings of a child's voice are transcribed, subtitled, and then transformed into captivating videos with visually appealing background images.

Initial results here

Setup

export OPENAI_API_KEY=...
mkdir sources
# Place the mp3 files in the sources folder

Workflow

Audio Segmentation

Split long audio files into smaller segments (up to 20MB each) for efficient processing.

segment.py sources/eric1.mp3
# Results: sources/eric1_0.mp3, sources/eric1_1.mp3, ...

Transcription

Convert audio segments to text and subtitles.

transcribe.py sources/eric1
concatenate.py sources/eric1_
transcribe_srt.py sources/eric1
concatenate_srt.py sources/eric1_
# Results: eric1.txt, eric1.srt

Text Improvement with GPT-4 (optional)

Refine the transcribed text using GPT-4 for more coherent storytelling.

prompt.py sources/eric1
# Results: eric1_0_corrected.txt, eric1_1_corrected.txt, ...

Simple Video Creation

Generate a basic video with audio and subtitles.

video.py sources/eric1

Enhanced Video with DALL-E Images

Create a video with DALL-E generated images for every minute of audio.

# creates a new srt with 1 minute accumulation of text.
python3 video_dallee_accumulated.py 

# creates a new srt with prompts instead of accumulationm of text
python3 video_dallee_gpt4.py

# creates the images from the prompts into a folder
python3 video_dallee_dalle.py

# creates the video using the mp3, the subtitles, and the images (from the folder, using the timestamps in the srt)
python3 video_dalle.py

TODO (in progress)

TODO (future)

Interactive Web Interface: Create a web application allowing users to upload audio and customize the video generation process.
Interactive Web Interface: Create a web application allowing users to upload audio and customize the video generation process.
Narrative Enhancement: Implement more advanced NLP techniques to enrich the storytelling aspect.
Custom Image Styles: Integrate options for different illustration styles in DALL-E image generation.

References

OpenAI Speech-to-Text Quickstart

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Project: Toddler subtitles

Overview

Setup

Workflow

Audio Segmentation

Transcription

Text Improvement with GPT-4 (optional)

Simple Video Creation

Enhanced Video with DALL-E Images

TODO (in progress)

TODO (future)

References

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
.gitignore		.gitignore
README.md		README.md
concatenate.py		concatenate.py
concatenate_srt.py		concatenate_srt.py
prompt.py		prompt.py
segment.py		segment.py
transcribe.py		transcribe.py
transcribe_srt.py		transcribe_srt.py
video.py		video.py
video_dallee.py		video_dallee.py
video_dallee_accumulated.srt.py		video_dallee_accumulated.srt.py
video_dallee_dallee.py		video_dallee_dallee.py
video_dallee_gpt4.py		video_dallee_gpt4.py

one1zero1one/EricSpeaks

Folders and files

Latest commit

History

Repository files navigation

Project: Toddler subtitles

Overview

Setup

Workflow

Audio Segmentation

Transcription

Text Improvement with GPT-4 (optional)

Simple Video Creation

Enhanced Video with DALL-E Images

TODO (in progress)

TODO (future)

References

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages