HeyGenClone

The project is no longer supported.

Welcome to the HeyGenClone, an open-source analogue of the HeyGen system.

I am a developer from Moscow 🇷🇺 who devotes his free time to studying new technologies. The project is in an active development phase, but I hope it will help you achieve your goals!

Currently, translation support is enabled only from English 🇬🇧!

Installation 🥸

Clone this repo
Install conda
Create environment with Python 3.10 (for macOS refer to link)
Activate environment
Install requirements:
```
cd path_to_project
sh install.sh
```
In config.json, set the HF_TOKEN argument to your HuggingFace token. Visit the speaker-diarization and segmentation model pages and accept their user conditions
Download the weights from drive and unzip the downloaded file into the weights folder
Install ffmpeg
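
The steps above leave three things that are easy to forget: ffmpeg on the PATH, a populated weights folder, and a filled-in HF_TOKEN. The following is a minimal preflight sketch, not part of the project. The function name `preflight` and its `which` parameter are hypothetical, and it assumes config.json and the weights folder sit at the project root.

```python
import json
import shutil
from pathlib import Path


def preflight(project_dir, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready.

    `which` is injectable only so the check can be exercised without ffmpeg.
    """
    root = Path(project_dir)
    problems = []

    # ffmpeg must be reachable on PATH (last installation step).
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")

    # Assumption: the downloaded archive is unzipped into <root>/weights.
    weights = root / "weights"
    if not weights.is_dir() or not any(weights.iterdir()):
        problems.append("weights folder is missing or empty")

    # config.json must exist and carry a non-empty HF_TOKEN.
    config_path = root / "config.json"
    if not config_path.is_file():
        problems.append("config.json not found")
    else:
        config = json.loads(config_path.read_text(encoding="utf-8"))
        if not str(config.get("HF_TOKEN", "")).strip():
            problems.append("HF_TOKEN is empty in config.json")

    return problems
```

Calling `preflight(".")` from the project root and printing the returned list should show what is still missing before the first run.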

Configurations (config.json) 🧙‍♂️

Key	Description
DET_TRESH	Face detection threshold [0.0:1.0]
DIST_TRESH	Face embeddings distance threshold [0.0:1.0]
HF_TOKEN	Your HuggingFace token (see Installation)
USE_ENHANCER	Whether to enhance faces using GFPGAN
ADD_SUBTITLES	Whether to add subtitles to the output video
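
A small validation sketch for the keys in the table, in case a typo in config.json is easier to catch before a long run. This is not the project's own loader. The function names are hypothetical, and it assumes that USE_ENHANCER and ADD_SUBTITLES are stored as JSON booleans, which the table does not state.

```python
import json

THRESHOLD_KEYS = ("DET_TRESH", "DIST_TRESH")
FLAG_KEYS = ("USE_ENHANCER", "ADD_SUBTITLES")  # assumed to be true/false


def validate_config(config):
    """Raise ValueError if a key from the table is missing or out of range."""
    for key in THRESHOLD_KEYS:
        value = config.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{key} must be a number")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{key} must be within [0.0, 1.0], got {value}")
    for key in FLAG_KEYS:
        if not isinstance(config.get(key), bool):
            raise ValueError(f"{key} must be true or false")
    token = config.get("HF_TOKEN")
    if not isinstance(token, str) or not token:
        raise ValueError("HF_TOKEN must be a non-empty string")
    return config


def load_config(path="config.json"):
    """Read config.json and validate it before use."""
    with open(path, encoding="utf-8") as handle:
        return validate_config(json.load(handle))
```

The threshold values used to try this out are up to you; the sketch only checks that they fall inside the documented [0.0:1.0] range.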

Supported languages 🙂

English (en), Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Polish (pl), Turkish (tr), Russian (ru), Dutch (nl), Czech (cs), Arabic (ar), Chinese (zh-cn), Japanese (ja), Hungarian (hu) and Korean (ko)

Usage 🤩

Activate your environment:

  conda activate your_env_name

Change to the project directory:

  cd path_to_project

At the root of the project there is a translate script that translates the video you specify.

video_filename - the filename of your input video (.mp4)
output_language - the language code to translate into, one of the codes listed under Supported languages (you can also find them in my code)
output_filename - the filename of the output video (.mp4)

  python translate.py video_filename output_language -o output_filename
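
If you want to call the translate script from another Python program, one option is to build the same command line and check the arguments first. This is a sketch under assumptions: `build_translate_command` is a hypothetical helper, the language set is copied from the Supported languages list, and the command is expected to run from the project root.

```python
import subprocess
import sys

# Codes copied from the "Supported languages" section of this README.
SUPPORTED_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr",
    "ru", "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko",
}


def build_translate_command(video_filename, output_language, output_filename):
    """Return the argv list for translate.py after basic argument checks."""
    if output_language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {output_language!r}")
    for name in (video_filename, output_filename):
        if not name.lower().endswith(".mp4"):
            raise ValueError(f"expected an .mp4 file, got {name!r}")
    # sys.executable keeps the call inside the active conda environment.
    return [sys.executable, "translate.py", video_filename,
            output_language, "-o", output_filename]


def run_translate(video_filename, output_language, output_filename):
    """Run translate.py and raise CalledProcessError on a non-zero exit."""
    command = build_translate_command(
        video_filename, output_language, output_filename)
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with filenames that contain spaces.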

I also added a script that overlays a voice track on a video with lip sync, which lets you create a video of a person pronouncing your speech. Currently it works only for videos with one person.

voice_filename - the filename of your speech (.wav)
video_filename - the filename of your input video (.mp4)
output_filename - the filename of output video (.mp4)

  python speech_changer.py voice_filename video_filename -o output_filename
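
The argument order here differs from translate.py (voice first, then video), which is easy to mix up. A minimal sketch that checks the file extensions before building the command could look like this. `build_speech_changer_command` is a hypothetical helper, not something the project ships.

```python
import sys
from pathlib import Path


def build_speech_changer_command(voice_filename, video_filename, output_filename):
    """Return the argv list for speech_changer.py after extension checks."""
    expected = (
        (voice_filename, ".wav"),   # speech to overlay
        (video_filename, ".mp4"),   # input video with one person
        (output_filename, ".mp4"),  # resulting video
    )
    for name, suffix in expected:
        if Path(name).suffix.lower() != suffix:
            raise ValueError(f"{name!r} should be a {suffix} file")
    return [sys.executable, "speech_changer.py", voice_filename,
            video_filename, "-o", output_filename]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the project root.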

How it works 😱

Detecting scenes (PySceneDetect)
Face detection (yolov8-face)
Reidentification (deepface)
Speech enhancement (MDXNet)
Speakers transcriptions and diarization (whisperX)
Text translation (googletrans)
Voice cloning (TTS)
Lip sync (lipsync)
Face restoration (GFPGAN)
[Need to fix] Search for talking faces, determining what this person is saying
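
To illustrate how a distance threshold such as DIST_TRESH can drive the reidentification step, here is a simplified sketch. It does not call deepface and is not the project's implementation. It assumes face embeddings are plain non-zero vectors compared by cosine distance, and the function names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def assign_identities(embeddings, dist_thresh):
    """Give each embedding the label of its nearest known identity.

    An embedding starts a new identity when no stored reference is
    closer than dist_thresh. The first embedding of each identity
    serves as its reference.
    """
    references = []
    labels = []
    for embedding in embeddings:
        best_label, best_distance = None, dist_thresh
        for label, reference in enumerate(references):
            distance = cosine_distance(embedding, reference)
            if distance < best_distance:
                best_label, best_distance = label, distance
        if best_label is None:
            references.append(embedding)
            best_label = len(references) - 1
        labels.append(best_label)
    return labels
```

With this scheme a lower threshold splits the same person into several identities more often, while a higher one merges different people, which is the trade-off the DIST_TRESH setting controls.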

Translation results 🥺

Note that this example was created without GFPGAN usage!

Destination language	Source video	Output video
🇷🇺 (Russian)

Contributors 🫵🏻

To-Do List 🤷🏼‍♂️

Full GPU support
Multithreading support (optimizations)
Detecting talking faces (improvement)

Other 🤘🏻

Tested on macOS
⚠️ The project is under development!
