Feature Scaling on Mobile Price Classification problem

This work covers two distance-based models, KNN and SVM, and examines how feature scaling affects their performance. The experiments use the Mobile Price Classification dataset from Kaggle.

During visualization, I found that ram is the feature most strongly correlated with the target price. The results show that:

For the KNN classifier, the model performs much better when the features are not scaled. The likely reason is that the ram feature contributes heavily to the target. The scale of ram is also much larger than that of the other features (based on median, min, max and std values). Because KNN compares samples by distance, a feature with large values dominates that distance, so scaling all features to the same range (0-1) reduces the weight of ram.
For the SVM classifier with a linear kernel, training takes a very long time when the features are not scaled. On the other hand, with scaled features, SVM trains faster and performs much better (compared to KNN). The non-linear SVM performs slightly worse than the linear-kernel one.
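
The effect described above can be illustrated with a minimal sketch. It does not use the Kaggle dataset or reproduce the notebook's numbers: the data is synthetic, with one large-scale column standing in for ram that fully determines a four-class target, plus ten small-scale noise columns. The thresholds, sample size, n_neighbors=5 and the choice of MinMaxScaler are all assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real data (assumption, not the Kaggle dataset):
# one large-scale feature that drives the target, ten small-scale noise features.
rng = np.random.default_rng(0)
n_samples = 2000
ram = rng.uniform(256, 4000, size=n_samples)
noise = rng.uniform(0, 1, size=(n_samples, 10))
X = np.column_stack([ram, noise])
y = np.digitize(ram, bins=[1200, 2100, 3000])  # four classes derived from ram only

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    # KNN on raw features: distances are dominated by the large-scale column.
    "knn_raw": KNeighborsClassifier(n_neighbors=5),
    # KNN after min-max scaling: every column now has equal weight in the distance.
    "knn_scaled": make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=5)),
    # Linear SVM on scaled features.
    "svm_linear_scaled": make_pipeline(MinMaxScaler(), SVC(kernel="linear")),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on the held-out split
    print(f"{name}: {scores[name]:.3f}")
```

On this toy data, scaling puts the noise columns on equal footing with the informative one, which is the mechanism suggested above for the drop in KNN accuracy. An unscaled linear SVM is left out of the sketch because, as noted, it can take a long time to train.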

For more details, feel free to refer to the .ipynb file and give feedback.

tranctan/Feature-Scaling-on-Mobile-Price-Classification
