This project still needs many improvements! So far I've focused on building the individual parts and connecting them into a whole project that works, so the deep learning side has probably not received enough attention. I will make those improvements over the long term.
There is so much that could be improved that I will only list the main ideas here:
- Add more training clips.
- Try building the dataset with a shorter clip length.
- Implement hooks to recognize long clips that need to be broken up and get the timestamps of the breakpoints. - IN DEVELOPMENT
- Increase the threshold.
- Find more clips of animal sounds when building the dataset.
- Use a lighter model, such as ResNeSt, DenseNet or MobileNet.
- Ensemble models.
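
The threshold and ensembling ideas in the list above could be combined roughly as follows. This is only a minimal sketch, not ASFG's actual code: it assumes each model outputs one probability per clip, and the names `ensemble_probabilities` and `apply_threshold` are hypothetical.

```python
from statistics import fmean
from typing import List, Sequence


def ensemble_probabilities(per_model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-clip probabilities predicted by several models.

    per_model_probs[m][c] is the (assumed) probability that model m
    assigns to clip c. Simple averaging is one common ensembling choice.
    """
    if not per_model_probs:
        raise ValueError("need predictions from at least one model")
    n_clips = len(per_model_probs[0])
    if any(len(probs) != n_clips for probs in per_model_probs):
        raise ValueError("every model must score the same clips")
    # zip(*...) regroups the scores so each tuple holds one clip's scores.
    return [fmean(clip_scores) for clip_scores in zip(*per_model_probs)]


def apply_threshold(probs: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Mark a clip as positive only if its probability reaches the threshold."""
    return [p >= threshold for p in probs]


if __name__ == "__main__":
    model_a = [0.9, 0.2, 0.6]
    model_b = [0.7, 0.4, 0.5]
    averaged = ensemble_probabilities([model_a, model_b])
    # A higher threshold keeps fewer clips, so it trades recall for precision.
    print(apply_threshold(averaged, threshold=0.5))
    print(apply_threshold(averaged, threshold=0.75))
```

Raising `threshold` makes the decision stricter, which is the effect the "increase the threshold" item is after; the averaging step is independent of it and works with any number of models.
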
ASFG should be able to take your translations, break them into slices, and put them into the subtitle file it generates. This feature would be very useful and will be the next goal once the current functionality is stable and effective.
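
As a rough illustration of what this planned feature might look like, the sketch below splits a translated text across a list of time spans in proportion to their duration and writes the result in SRT format. Everything here is an assumption made for illustration: the function names are hypothetical, the spans are taken to be `(start, end)` pairs in seconds, and splitting on whitespace only makes sense for languages that separate words with spaces.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start_seconds, end_seconds), assumed format


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def slice_translation(text: str, spans: List[Span]) -> List[str]:
    """Split text into one word group per span, sized by span duration."""
    words = text.split()
    total = sum(end - start for start, end in spans)
    if total <= 0:
        raise ValueError("spans must cover a positive duration")
    slices, elapsed, taken = [], 0.0, 0
    for i, (start, end) in enumerate(spans):
        elapsed += end - start
        # The last span always receives whatever words remain.
        if i == len(spans) - 1:
            upto = len(words)
        else:
            upto = round(len(words) * elapsed / total)
        slices.append(" ".join(words[taken:upto]))
        taken = upto
    return slices


def to_srt(slices: List[str], spans: List[Span]) -> str:
    """Render the slices as numbered SRT cues."""
    cues = []
    for index, (text, (start, end)) in enumerate(zip(slices, spans), start=1):
        cues.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    spans = [(0.0, 2.0), (2.0, 3.0)]
    slices = slice_translation("one two three four five six", spans)
    print(to_srt(slices, spans))
```

Splitting by duration is the simplest possible rule; a real implementation would probably prefer to break at sentence or punctuation boundaries.
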