loganhart02/Deep-Learning-Projects

Deep Learning Paper Implementations

I originally made this back in 2021 for myself. I am currently updating the code for all the papers and publishing them as I get them working.

This is a repo of deep learning papers I implemented to practice reading, understanding, and implementing deep learning papers. Every model I train will have a demo you can load up locally to try it out. This is pretty much my main repo for practicing building ML pipelines, from data collection to MLOps. Each model has its own branch. To see the template I used for each project, see the project_template branch.

Machine setup

Here is the machine I used to train all these models:

  • cpu: Threadripper 1920X
  • gpu: dual RTX 3090s
  • ram: 120 GB of DDR4 (RGB to make it faster)
  • os: Ubuntu 22.04 LTS/20.04 LTS
  • editor: VS Code

Papers in repo

Image

  • "Gradient-Based Learning Applied to Document Recognition (LeNet-5)" by Yann LeCun et al. (1998): https://ieeexplore.ieee.org/document/726791
  • "ImageNet Classification with Deep Convolutional Neural Networks (AlexNet)" by Alex Krizhevsky et al. (2012): https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
  • "Very Deep Convolutional Networks for Large-Scale Image Recognition (VGGNet)" by Karen Simonyan and Andrew Zisserman (2014)
  • "Going Deeper with Convolutions (GoogLeNet/Inception)" by Christian Szegedy et al. (2014)
  • "Deep Residual Learning for Image Recognition (ResNet)" by Kaiming He et al. (2015)
  • "U-Net: Convolutional Networks for Biomedical Image Segmentation" by Olaf Ronneberger et al. (2015)
  • "You Only Look Once: Unified, Real-Time Object Detection (YOLO)" by Joseph Redmon et al. (2016)
  • "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications" by Andrew G. Howard et al. (2017)
  • "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks" by Mingxing Tan and Quoc V. Le (2019)
  • "Generative Adversarial Nets (GANs)" by Ian Goodfellow et al. (2014)
  • "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT)" by Alexey Dosovitskiy et al. (2020)
  • "Learning Transferable Visual Models From Natural Language Supervision (CLIP)" by Alec Radford et al. (2021)

Audio

  • "WaveNet: A Generative Model for Raw Audio" by Aäron van den Oord et al. (2016)
  • "Deep Speech: Scaling up end-to-end speech recognition" by Awni Hannun et al. (2014)
  • "Tacotron: Towards End-to-End Speech Synthesis" by Yuxuan Wang et al. (2017)
  • "Conformer: Convolution-augmented Transformer for Speech Recognition" by Anmol Gulati et al. (2020)
  • "HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis" by Jungil Kong et al. (2020)
  • valle...

Text

  • "Sequence to Sequence Learning with Neural Networks" by Ilya Sutskever et al. (2014)
  • "Attention Is All You Need (Transformer)" by Ashish Vaswani et al. (2017)
  • "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" by Jacob Devlin et al. (2018)
  • "Language Models are Unsupervised Multitask Learners (GPT-2)" by Alec Radford et al. (2019)
  • "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context" by Zihang Dai et al. (2019)
  • "XLNet: Generalized Autoregressive Pretraining for Language Understanding" by Zhilin Yang et al. (2019)
  • "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter" by Victor Sanh et al. (2019)
  • "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (T5)" by Colin Raffel et al. (2019)

Video

About

All my personal deep learning model implementations
