Hardware : GTX 1060 (took roughly 4.5 hours to complete 28k iterations / 7 epochs). P.S. : Reading the dataset from disk was horribly slow during the first epoch, due to what I believe to be TLB misses. After the first epoch, cached disk reads and file I/O buffers made training go way faster.
CASIA (training) : ~0.98 training accuracy, ~1.92 training loss, 10575 (CASIA) + 1 (me) classes
LFW (evaluation) : 0.9902 accuracy
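For context on the LFW number: evaluation works on face pairs compared by cosine similarity. The snippet below is a minimal sketch of that idea, not the code in lfw_eval.py. The function names, the toy 2-D embeddings, and the fixed threshold of 0.5 are all assumptions made for illustration; how the real script picks its threshold is not shown here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verification_accuracy(pairs, threshold):
    """Fraction of pairs classified correctly.

    pairs: list of (embedding_a, embedding_b, is_same_person).
    A pair is predicted 'same person' when its similarity exceeds threshold.
    """
    correct = 0
    for emb_a, emb_b, is_same in pairs:
        predicted_same = cosine_similarity(emb_a, emb_b) > threshold
        correct += predicted_same == is_same
    return correct / len(pairs)

# Toy 2-D embeddings (hypothetical; real embeddings come from the trained model).
toy_pairs = [
    ([1.0, 0.0], [0.9, 0.1], True),
    ([0.0, 1.0], [0.1, 0.8], True),
    ([1.0, 0.0], [0.0, 1.0], False),
    ([1.0, 0.2], [-0.5, 1.0], False),
]
toy_accuracy = verification_accuracy(toy_pairs, threshold=0.5)
print(toy_accuracy)
```

Because cosine similarity ignores vector length, only the direction of an embedding matters, which matches what a cosine-margin loss optimizes during training.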
├── live_inference.py '(Live inference with the model using the laptop camera)'
├── cosFace
│   ├── casia_train.py '(MAIN TRAINING SCRIPT)'
│   ├── casia_train_summary_CosFace_20200625-115034 '(Tensorboard metrics)'
│   │   └── events.out.tfevents.1593111034.joshuakang-Alienware
│   ├── checkpoint
│   │   ├── accuracy.npy
│   │   ├── casia_train.py
│   │   ├── dataLoader.py
│   │   ├── faceNet.py
│   │   ├── images.png
│   │   ├── inference.py
│   │   ├── lfw_eval.py
│   │   ├── loss.npy
│   │   ├── matlab_cp2tform.py
│   │   ├── net_1.pth '(model after nth epoch)'
│   │   ├── net_2.pth
│   │   ├── net_3.pth
│   │   ├── net_4.pth
│   │   ├── net_5.pth
│   │   ├── net_6.pth
│   │   ├── net_7.pth
│   │   ├── netFinal_8.pth
│   │   ├── trainingLog_0.txt '(model's training loss during nth epoch)'
│   │   ├── trainingLog_1.txt
│   │   ├── trainingLog_2.txt
│   │   ├── trainingLog_3.txt
│   │   ├── trainingLog_4.txt
│   │   ├── trainingLog_5.txt
│   │   ├── trainingLog_6.txt
│   │   └── trainingLog_7.txt
│   ├── data
│   │   ├── casia_landmarkMTCNN.txt '(CASIA face landmarks generated from /mtcnn/get_landmarks.py + Josh's faces)'
│   │   ├── casia_landmark.txt '(CASIA face landmarks as generated from another source)'
│   │   ├── josh_landmarksMTCNN.txt '(Facial landmarks for my face)'
│   │   ├── lfw_landmark.txt '(LFW dataset facial landmarks generated by /mtcnn/get_landmarks.py)'
│   │   └── pairs.txt '(LFW-provided pairs for face cosine similarity in evaluation)'
│   ├── dataLoader.py '(Dataloader for the model. Landmarks file with labels required)'
│   ├── faceNet.py '(Models)'
│   ├── random_10_inference.py '(Creates embeddings for random faces. Mostly for t-SNE visualization)'
│   ├── lfw_eval.py '(Evaluation script on the LFW dataset)'
│   ├── matlab_cp2tform.py
│   ├── plot_graphs.ipynb
│   └── __pycache__
│       ├── dataLoader.cpython-37.pyc
│       ├── faceNet.cpython-37.pyc
│       └── matlab_cp2tform.cpython-37.pyc
├── mtcnn
│   ├── example.jpg
│   ├── get_landmarks.py '(get landmarks for images in dataset and label)'
│   ├── LICENSE
│   └── src
│       ├── box_utils.py
│       ├── detector.py
│       ├── first_stage.py
│       ├── get_nets.py
│       ├── __init__.py
│       ├── __pycache__
│       │   ├── box_utils.cpython-37.pyc
│       │   ├── detector.cpython-37.pyc
│       │   ├── first_stage.cpython-37.pyc
│       │   ├── get_nets.cpython-37.pyc
│       │   ├── __init__.cpython-37.pyc
│       │   └── visualization_utils.cpython-37.pyc
│       ├── visualization_utils.py
│       └── weights
│           ├── onet.npy
│           ├── pnet.npy
│           └── rnet.npy
└── README.md
No out-of-the-box implementations (apart from mtcnn, with some changes) and no OpenCV (cuz I'm a boss), other than cv2 for camera access.
Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks.
[CosFace: Large Margin Cosine Loss for Deep Face Recognition](https://arxiv.org/pdf/1801.09414.pdf)
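To give a feel for what the large margin cosine loss from the CosFace paper does, here is a rough single-sample sketch in plain Python. It is not the code in faceNet.py or casia_train.py. The function names and the default scale `s=64.0` and margin `m=0.35` are assumptions for illustration, and the settings actually used for training may differ.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosface_loss(feature, class_weights, label, s=64.0, m=0.35):
    """Large margin cosine loss for one sample.

    feature:       embedding of the input face
    class_weights: one weight vector per class (identity)
    label:         index of the true class
    s, m:          scale and cosine margin (illustrative defaults)
    """
    x = normalize(feature)
    logits = []
    for j, w in enumerate(class_weights):
        # Cosine between the normalized feature and the normalized class weight.
        cos_theta = sum(a * b for a, b in zip(x, normalize(w)))
        if j == label:
            cos_theta -= m  # the margin is subtracted only for the true class
        logits.append(s * cos_theta)
    # Cross-entropy over the scaled cosines, with a stable log-sum-exp.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[label]

# Two hypothetical classes with orthogonal weights; the feature matches class 0.
weights = [[1.0, 0.0], [0.0, 1.0]]
loss_no_margin = cosface_loss([1.0, 0.0], weights, label=0, s=1.0, m=0.0)
loss_with_margin = cosface_loss([1.0, 0.0], weights, label=0, s=1.0, m=0.35)
print(loss_no_margin, loss_with_margin)
```

The margin makes the true class harder to satisfy, so the loss stays above zero even for correctly classified samples. That is one reason a training loss can look high next to a high training accuracy.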
Fully trained model on CASIA + 215 images of me (could have done transfer learning, but meh, this is way cooler). It can then be fine-tuned with more images of me.
Model : (https://drive.google.com/file/d/1UJW8chHcD8KEl28yGSy3vwD2KzOMb1em/view?usp=sharing)