🏆 A Comparative Study on Handwritten Digits Recognition using Classifiers like K-NN, Multiclass Perceptron and SVM
For the full report, refer to the file named Detailed Report.pdf.
The task at hand is to classify handwritten digits using supervised machine learning methods. The digits belong to classes of 0 to 9.
“Given a query instance (a digit) in the form of an image, our machine learning model must correctly classify its appropriate class.”
The MNIST Handwritten Digits dataset is used for this task. It contains images of digits taken from a variety of scanned documents, normalized in size and centered. Each image is a 28 by 28 pixel square (784 pixels total). The dataset contains 60,000 images for training the model and 10,000 images for evaluating it.
We use supervised machine learning models to predict the digits. Since this is a comparative study, we first describe the K-Nearest Neighbors classifier as the baseline method, which is then compared to the Multiclass Perceptron classifier and the SVM classifier.
k-Nearest Neighbors (k-NN) is an algorithm that:
- finds a group of k objects in the training set that are closest to the test object, and
- bases the assignment of a label on the predominance of a class in this neighborhood.
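The two steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the repository's implementation: the function name `knn_predict`, the Euclidean distance, and the toy 2-pixel "images" are assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training example.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    ]
    # Keep the k closest examples.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most common label in the neighborhood wins.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-pixel "images" standing in for flattened 784-pixel digits.
train_x = [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 0.9), (1.0, 1.0)]
train_y = [0, 0, 1, 1, 1]

prediction = knn_predict(train_x, train_y, (0.8, 0.8), k=3)
print(prediction)  # 1
```

On real MNIST data each point would be a 784-value vector, and a vectorized library such as NumPy would be the practical choice for the distance computation.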
When we used the K-NN method, the following pros and cons were observed:
Pros:
- K-NN executes quickly for small training data sets.
- No assumptions about the data, which is useful, for example, for nonlinear data
- Simple algorithm that is easy to explain and interpret
- Versatile: useful for classification or regression
- Training phase is extremely quick because the algorithm only stores the data and learns no parameters
Cons:
- Computationally expensive, because the algorithm compares the test instance with every example in the training data before finalizing the label
- The best value of K is not known in advance and has to be chosen, for example with cross-validation
- High memory requirement, because all the training data is stored
- Prediction stage might be slow if training data is large
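Choosing K with cross-validation, as mentioned in the list above, might look like the following sketch. It uses leave-one-out cross-validation on made-up toy data; the helper names, the candidate values of K, and the data are all assumptions for illustration and are not taken from the project code.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to `query`."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def loo_accuracy(xs, ys, k):
    """Leave-one-out accuracy: each point is predicted from all the others."""
    correct = 0
    for i, (x, y) in enumerate(zip(xs, ys)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        correct += knn_predict(rest_x, rest_y, x, k) == y
    return correct / len(xs)


# Toy data: two well-separated clusters of four points each.
xs = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
      (1.0, 1.0), (0.9, 1.1), (1.1, 0.9), (0.95, 0.95)]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

candidates = [1, 3, 7]
scores = {k: loo_accuracy(xs, ys, k) for k in candidates}
best_k = max(candidates, key=scores.get)
print(scores, best_k)
```

With K = 7 every held-out point is outvoted by the four points of the other cluster, which shows why a K that is too large for the data can hurt accuracy.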
A multiclass perceptron classifier can be built from multiple binary classifiers trained with the one-vs-all strategy. In this strategy, the training labels are remapped for each perceptron. For example, for the 2-vs-all classifier, examples of the digit 2 are labeled 1 and all other examples are labeled 0 when a sigmoid unit is used, while for Rosenblatt’s perceptron the labels are 1 and -1 for positive and negative examples respectively.
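The relabeling for one binary classifier can be written as a one-line helper. The function name `one_vs_all_labels` and its parameters are hypothetical, chosen only for this example.

```python
def one_vs_all_labels(labels, positive_digit, negative_value=0):
    """Binary targets for the `positive_digit`-vs-all classifier.

    negative_value=0 suits a sigmoid unit; negative_value=-1 suits
    Rosenblatt's perceptron.
    """
    return [1 if y == positive_digit else negative_value for y in labels]


labels = [2, 7, 2, 0, 9]
sigmoid_targets = one_vs_all_labels(labels, positive_digit=2)
rosenblatt_targets = one_vs_all_labels(labels, positive_digit=2, negative_value=-1)

print(sigmoid_targets)     # [1, 0, 1, 0, 0]
print(rosenblatt_targets)  # [1, -1, 1, -1, -1]
```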
Now all we have to do is train (learn the weights for) 10 classifiers separately and then feed the query instance to all of them (as shown in the figure above). The label of the classifier with the highest confidence is then assigned to the query instance.
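A rough sketch of this train-then-vote procedure, using Rosenblatt's update rule on three toy classes instead of ten digits. The function names, the learning rate, the epoch count, and the data are assumptions for illustration and may differ from the project's code.

```python
def train_perceptron(xs, targets, epochs=20, lr=1.0):
    """Rosenblatt perceptron; targets are +1 / -1. Returns (weights, bias)."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(xs, targets):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if t * score <= 0:  # misclassified or on the boundary: update
                w = [wi + lr * t * xi for wi, xi in zip(w, x)]
                b += lr * t
    return w, b


def train_one_vs_all(xs, labels, classes):
    """Train one binary perceptron per class (class c vs all the rest)."""
    models = {}
    for c in classes:
        targets = [1 if y == c else -1 for y in labels]
        models[c] = train_perceptron(xs, targets)
    return models


def predict(models, x):
    """Return the class whose perceptron gives the highest score."""
    scores = {
        c: sum(wi * xi for wi, xi in zip(w, x)) + b
        for c, (w, b) in models.items()
    }
    return max(scores, key=scores.get)


# Toy 3-feature data with three classes (hypothetical values).
xs = [(1, 0, 0), (0.9, 0.1, 0), (0, 1, 0),
      (0.1, 0.9, 0), (0, 0, 1), (0, 0.1, 0.9)]
labels = [0, 0, 1, 1, 2, 2]

models = train_one_vs_all(xs, labels, classes=[0, 1, 2])
print(predict(models, (0.2, 0.7, 0.1)))  # 1
```

Here the raw score before any activation function is used as the confidence. With sigmoid units the ranking would be the same, because the sigmoid is monotonic.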
As already discussed, K-NN stores all the training data, and when a new query instance arrives it compares the instance with every training example, which makes it expensive both computationally and memory-wise. There is no learning involved as such. The Multiclass Perceptron, on the other hand, takes some time in its learning phase, but once training is done the learned weights can be saved and reused. When a query instance arrives, it only has to take the dot product of that instance with the learned weights and apply the activation function to produce the output.
- The prediction phase is extremely fast as compared to that of K-NN.
- Also, it’s a lot more efficient in terms of computation (during prediction phase) and memory (because now it only has to store the weights instead of all the training data).
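A back-of-the-envelope count makes the difference concrete. The dataset sizes come from the description above; brute-force k-NN (every query compared with every training image) and one weight vector plus a bias per class are assumptions of this sketch, and the counts are rough operation counts rather than measured timings.

```python
# Dataset sizes as described for MNIST in this README.
n_train = 60_000
n_pixels = 28 * 28          # 784
n_classes = 10

# Per query: brute-force k-NN touches every pixel of every stored image,
# while the perceptron takes one dot product per class.
knn_ops_per_query = n_train * n_pixels
perceptron_ops_per_query = n_classes * n_pixels

# Values that must be kept in memory at prediction time.
knn_stored_values = n_train * n_pixels
perceptron_stored_values = n_classes * (n_pixels + 1)   # weights + bias

print(knn_ops_per_query)                               # 47040000
print(perceptron_ops_per_query)                        # 7840
print(knn_ops_per_query // perceptron_ops_per_query)   # 6000
print(perceptron_stored_values)                        # 7850
```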
Just for comparison purposes, we have also used a third supervised machine learning technique, the Support Vector Machine classifier. This model is not implemented from scratch; it is imported directly from Python's scikit-learn module and used.
In K-NN and Multiclass Perceptron Classifier we trained our models on raw images directly instead of computing some features from the input image and training the model on those computed measurements/features.
A feature descriptor is a representation of an image that simplifies the image by extracting useful information and throwing away extraneous information. Now we are going to compute the Histogram of Oriented Gradients as features from the digit images and we will train the SVM Classifier on that. The HOG descriptor technique counts occurrences of gradient orientation in localized portions of an image - detection window.
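To give a feel for what such a descriptor computes, here is a deliberately simplified, HOG-like sketch in plain Python. It is not the HOG implementation used in this project: it only builds magnitude-weighted orientation histograms per cell and omits the block normalization of the full descriptor. The function name `cell_histograms` and the `cell` and `bins` values are assumptions for the example.

```python
import math


def cell_histograms(image, cell=4, bins=9):
    """Per-cell histograms of gradient orientation, weighted by magnitude.

    `image` is a list of rows of grayscale values. Returns a flat feature
    list with `bins` values for each cell-by-cell block of pixels.
    """
    h, w = len(image), len(image[0])
    cells_y, cells_x = h // cell, w // cell
    hist = [[[0.0] * bins for _ in range(cells_x)] for _ in range(cells_y)]
    bin_width = 180.0 / bins
    for y in range(cells_y * cell):
        for x in range(cells_x * cell):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = min(int(angle // bin_width), bins - 1)
            hist[y // cell][x // cell][b] += magnitude
    return [v for row in hist for cell_hist in row for v in cell_hist]


# 8x8 toy image with a vertical edge: left half dark, right half bright.
image = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
features = cell_histograms(image)
print(len(features))  # 36
```

With a 28 by 28 digit and these example settings the feature vector would have 7 × 7 × 9 = 441 values, which would then be passed to the SVM in place of the 784 raw pixels.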
Now for the final phase. After running the experiments with the different algorithms, the results are summarized. First, we compare the techniques on the basis of accuracy:
When we compare K-NN with the Multiclass Perceptron and SVM on the basis of accuracy, its accuracy is similar to that of the other two classifiers, which means that despite its simplicity K-NN is a really good classifier.
One of the main limitations of K-NN is that it is computationally expensive. Its prediction time is large because, whenever a new query instance arrives, it has to compare the instance with all the training data, sort the neighbors by similarity, take the top k neighbors, and choose the most frequent label among them. This whole process takes a considerable amount of time.
For the Multiclass Perceptron classifier, we observed that it mitigates this limitation: its prediction time is short because it only computes dot products in the prediction phase. The majority of the time is spent only once, in its learning phase, after which it is ready to predict test instances.
When the prediction times of K-NN, Multiclass Perceptron and SVM were measured, the Multiclass Perceptron clearly stood out with the shortest prediction time, while K-NN took a long time to predict the test instances. Hence the Multiclass Perceptron clearly leaves K-NN behind in prediction-time efficiency as well as in computation and memory load. Thus, it mitigates the limitations of our baseline method, K-NN.
The code files are in running condition and are directly executable.
(To install all the necessary packages at once, install Anaconda)
You can get in touch with me on my LinkedIn Profile:
You can also follow my GitHub Profile to stay updated about my latest projects:
If you liked the repo then kindly support it by giving it a star ⭐ and share in your circles so more people can benefit from the effort.
If you find any bugs, have suggestions, or face issues:
- Open an Issue in the Issues Tab to discuss them.
- Submit a Pull Request to propose fixes or improvements.
- Review Pull Requests from other contributors to help maintain the project's quality and progress.
This project thrives on community collaboration! Members are encouraged to take the initiative, support one another, and actively engage in all aspects of the project. Whether it’s debugging, fixing issues, or brainstorming new ideas, your contributions are what keep this project moving forward.
With modern AI tools like ChatGPT, solving challenges and contributing effectively is easier than ever. Let’s work together to make this project the best it can be! 🚀
Copyright (c) 2018-present, harismuneer
Hey there, I'm Haris Muneer 👨🏻‍💻
- 🕸️ Founder of Cyfy Labs: At Cyfy Labs, we provide advanced social media scraping tools to help businesses, researchers, and marketers extract actionable data from platforms like Facebook, Instagram, and X (formerly Twitter). Our tools support lead generation, sentiment analysis, market research, and various other use cases. To learn more, visit: www.cyfylabs.com
- 🌟 Open Source Advocate: I’m passionate about making tech accessible. I’ve open-sourced several projects that you can explore on my GitHub profile and on the Open Source Software PK page.
- 📫 How to Reach Me: You can learn more about my skills/work at LinkedIn. You can also reach out via email for collaboration or inquiries. For Cyfy Labs related queries, please reach out through the company website.