In this project I build a simple Conv2d network and implement some beneficial modifications on top of it. I also investigate the network architecture and the training process; ultimately my best model achieves 96.68% test accuracy with a cross-entropy test loss of 0.0012. The explorations are as follows (a minimal architecture sketch follows the list):
- Increasing the batch size;
- Implementing dropout;
- Implementing residual connections;
- Trying different numbers of neurons/filters;
- Trying different loss functions;
- Trying different activation functions;
- Trying different optimizers from torch.optim;
- Interpreting the network.
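
Since the exact architecture is not listed here, the following is a minimal PyTorch sketch of what such a Conv2d network with dropout and a residual connection might look like. The class names (`SimpleConvNet`, `ResidualBlock`), the layer widths, and the assumed 3x32x32 CIFAR-style input are illustrative assumptions, not the actual model.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: two 3x3 convs plus a skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)  # skip connection

class SimpleConvNet(nn.Module):
    """Hypothetical Conv2d network with dropout and one residual block."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),            # 32x32 -> 16x16
            ResidualBlock(32),
            nn.Dropout(0.25),           # dropout between conv stages
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),            # 16x16 -> 8x8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.5),
            nn.Linear(64 * 8 * 8, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SimpleConvNet()
criterion = nn.CrossEntropyLoss()                          # one candidate loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # one torch.optim choice
```

Swapping `nn.ReLU` for another activation, `nn.CrossEntropyLoss` for another criterion, or `Adam` for another `torch.optim` optimizer is how the list items above would be exercised on this kind of skeleton.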
Batch Normalization (BN) is an algorithmic method that makes the training of deep neural networks (DNNs) faster and more stable. It normalizes the activation vectors of hidden layers using the first and second statistical moments (mean and variance) of the current batch, and this normalization step is applied right before (or right after) the nonlinearity. Here I mainly compare VGG-A with and without BN and plot the loss landscape.
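
To make the placement concrete, here is a minimal sketch of a VGG-style conv block where `nn.BatchNorm2d` sits between the convolution and the ReLU, followed by a manual check of the per-channel normalization BN performs in training mode. The helper name `vgg_block` and the tensor shapes are illustrative assumptions, not the report's actual VGG-A code.

```python
import torch
import torch.nn as nn

def vgg_block(in_ch, out_ch, batch_norm=True):
    """VGG-style conv block; BN (when enabled) sits between conv and ReLU."""
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)]
    if batch_norm:
        layers.append(nn.BatchNorm2d(out_ch))
    layers.append(nn.ReLU(inplace=True))
    return nn.Sequential(*layers)

# In training mode, BN normalizes each channel with the current batch's
# mean and (biased) variance, then applies a learnable affine transform:
#   y = gamma * (x - mean) / sqrt(var + eps) + beta
bn = nn.BatchNorm2d(16)          # training mode by default
x = torch.randn(8, 16, 32, 32)

manual = (x - x.mean(dim=(0, 2, 3), keepdim=True)) / torch.sqrt(
    x.var(dim=(0, 2, 3), unbiased=False, keepdim=True) + bn.eps
)
# gamma is initialized to 1 and beta to 0, so the module output should
# match the hand-computed normalization.
assert torch.allclose(bn(x), manual, atol=1e-5)
```

Building the two VGG-A variants then amounts to constructing the same stack of blocks with `batch_norm=True` versus `batch_norm=False` and comparing their training curves and loss landscapes.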