This project classifies neuron images produced by the Psychiatry Department of the University of Oxford. The classification task is to learn to distinguish whether neurons have been treated with the compound Amyloid-β. The images are taken after staining with the Cy5 dye, a marker for MAP2, a protein found in the neuronal cytoskeleton.
Amyloid-β is thought to induce synapse loss. Verifying that this is the case, and building a robust classifier, will allow researchers to test different compounds for their ability to reduce the effects of treatment with Amyloid-β.
An example of the visualisation produced by the code in this repository: here red indicates "Treated" and blue indicates "Untreated". Regions are coloured green if the model hasn't produced a confident enough prediction for either class.[1]
To clone this repository run:
git clone https://github.com/wfbstone/Neuron-Image-Classification.git
cd Neuron-Image-Classification
To install the requirements, run:
pip install -r requirements.txt
This project was written using IPython 3 on a Kaggle kernel. The original kernels can be found here.
Any suggestions for improvement are greatly appreciated, and I encourage the use and adaptation of my code for other projects. Unfortunately, however, the dataset cannot be made public at this time, so running the code on the neuron images is not currently possible.
The model used for classification is VGG19 with weights pre-trained on the ImageNet dataset, as provided by Keras, but with three fully-connected layers of sizes 2048, 2048 and 2 on top of the convolutional layers. The model was fine-tuned for the task of classifying the neurons on a training set of 632 images, with data augmentation including random cropping and random flipping both horizontally and vertically. After 60 epochs through the training set (15 epochs for each combination of random flips), the model performed extremely well on the unaugmented test set, achieving an F1 score of 0.96.
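The architecture described above can be sketched in Keras as follows. The layer sizes come from the text; the input shape and compile settings are illustrative assumptions, and `weights=None` stands in for the ImageNet weights the project actually loads, so that the sketch runs without downloading them:

```python
# Sketch of the VGG19-based classifier: convolutional base plus a new
# fully-connected head of sizes 2048, 2048 and 2.
# Assumptions: 224x224 RGB inputs; the project uses weights="imagenet",
# replaced here with weights=None to keep the sketch self-contained.
from tensorflow.keras.applications import VGG19
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

base = VGG19(weights=None, include_top=False, input_shape=(224, 224, 3))
x = Flatten()(base.output)
x = Dense(2048, activation="relu")(x)
x = Dense(2048, activation="relu")(x)
out = Dense(2, activation="softmax")(x)  # "Treated" vs "Untreated"

model = Model(inputs=base.input, outputs=out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Fine-tuning then proceeds by training this model on the augmented crops, optionally freezing some of the early convolutional blocks.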
To verify that this performance isn't due to random artifacts in the data, both Grad-CAM and a Saliency Map were implemented.
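A saliency map of this kind can be sketched with `tf.GradientTape`: the absolute gradient of the top class score with respect to each input pixel, reduced over colour channels. The tiny demo model below is an illustrative stand-in for the fine-tuned VGG19:

```python
# Gradient-based saliency sketch: which pixels most affect the score?
import numpy as np
import tensorflow as tf

def saliency_map(model, image):
    """image: (H, W, C) float array -> (H, W) saliency array."""
    x = tf.convert_to_tensor(image[np.newaxis], dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        preds = model(x)
        score = tf.reduce_max(preds[0])  # score of the predicted class
    grads = tape.gradient(score, x)       # (1, H, W, C)
    return tf.reduce_max(tf.abs(grads), axis=-1)[0].numpy()

# Demo on a small random "image" with a stand-in model.
demo_model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(4, 3, padding="same", activation="relu",
                           input_shape=(32, 32, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),
])
sal = saliency_map(demo_model, np.random.rand(32, 32, 3))
```

Grad-CAM works analogously but takes gradients with respect to the last convolutional feature maps rather than the input pixels, which is why it produces coarser, region-level heatmaps.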
For visualising the effects of treatment with Amyloid-β, neither Grad-CAM nor the saliency map is particularly insightful on its own: Grad-CAM is great for highlighting regions of interest but tends to highlight the entire network of neurons, while the saliency map is great at highlighting individual nuclei and neurites but gives no indication of how they were classified.
The two techniques were combined by colouring the saliency map according to the classification the corresponding region received from the Sliding Windows algorithm. Using Youden's J statistic to determine where to set the confidence thresholds, the saliency map is coloured red or blue where a region is confidently predicted as "Treated" or "Untreated", and green where the prediction is uncertain.
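The thresholding step can be sketched in NumPy: Youden's J statistic (J = sensitivity + specificity − 1, i.e. TPR − FPR) is maximised over candidate thresholds on a labelled validation set, and the resulting cut-offs decide a region's colour. The two-cut-off uncertainty band below is an assumption about how "not confident enough" is defined:

```python
import numpy as np

def youden_threshold(scores, labels):
    """Pick the threshold maximising Youden's J = TPR - FPR.
    scores: predicted P(treated); labels: 1 = treated, 0 = untreated."""
    best_t, best_j = 0.5, -1.0
    for t in np.unique(scores):
        pred = scores >= t
        tpr = (pred & (labels == 1)).sum() / max((labels == 1).sum(), 1)
        fpr = (pred & (labels == 0)).sum() / max((labels == 0).sum(), 1)
        if tpr - fpr > best_j:
            best_j, best_t = tpr - fpr, t
    return best_t

def colour_regions(scores, t_treated, t_untreated):
    """Red where confidently treated, blue where confidently untreated,
    green in the uncertainty band between the two cut-offs."""
    colours = np.full(scores.shape, "green", dtype=object)
    colours[scores >= t_treated] = "red"
    colours[scores <= t_untreated] = "blue"
    return colours

t = youden_threshold(np.array([0.1, 0.2, 0.8, 0.9]),
                     np.array([0, 0, 1, 1]))
```

Each saliency-map pixel then inherits the colour of the sliding window it falls in, producing the red/blue/green visualisation described above.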
- Experiment with a convolutional implementation of the Sliding Windows algorithm, as described in the OverFeat paper, in an attempt to reduce computation time.
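For reference, the naive Sliding Windows pass that the convolutional trick would replace can be sketched as follows. The window size and stride are illustrative; the point of the OverFeat-style implementation is that overlapping windows share convolutional computation instead of each crop being classified independently:

```python
import numpy as np

def sliding_windows(image, window, stride):
    """Yield (row, col, patch) for every window position.
    In the naive scheme each patch is run through the CNN separately,
    which is the repeated work a convolutional implementation avoids."""
    h, w = image.shape[:2]
    for r in range(0, h - window + 1, stride):
        for c in range(0, w - window + 1, stride):
            yield r, c, image[r:r + window, c:c + window]

patches = list(sliding_windows(np.zeros((64, 64, 3)), window=32, stride=16))
```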