R-code

Mall Customers Segmentation in R. The project was a form of assessment in the Data Science course.

A Kaggle dataset of 200 records was used.

The data was pre-processed and then visualized. Here are some of the plots:

The following algorithms were used:

- Support Vector Regression (SVR)
- Multiple Linear Regression (MLR)
- K-means Clustering
- K-Nearest Neighbours (KNN) for the classification problem
- Support Vector Machines (SVM) for the classification problem
- Random Forest (RF) for the classification problem

MLR and SVR were used, for exploration purposes, to determine which variable has the strongest linear relationship with the Spending Score (here, Age).
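The exploratory step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, the column names (`Age`, `AnnualIncome`, `SpendingScore`) are assumptions, and the simulated link between age and score exists only so the example has something to find. SVR is left out because it needs an add-on package.

```r
# Synthetic stand-in for the customer data; column names are assumptions.
set.seed(42)
n <- 200
customers <- data.frame(
  Age = sample(18:70, n, replace = TRUE),
  AnnualIncome = round(runif(n, 15, 140))
)
# Simulated score that declines with age, plus noise (illustrative only),
# clipped to the 1..100 range.
raw_score <- round(80 - 0.7 * customers$Age + rnorm(n, sd = 15))
customers$SpendingScore <- pmin(pmax(raw_score, 1), 100)

# Multiple linear regression of the score on the candidate predictors.
mlr_fit <- lm(SpendingScore ~ Age + AnnualIncome, data = customers)
print(summary(mlr_fit)$coefficients)

# Pearson correlation of each predictor with the score.
predictors <- c("Age", "AnnualIncome")
cors <- sapply(customers[predictors], function(x) cor(x, customers$SpendingScore))
print(cors)

# Predictor with the largest absolute correlation.
strongest <- names(cors)[which.max(abs(cors))]
print(strongest)
```

Comparing absolute correlations alongside the regression coefficients is one simple way to pick out the most linearly related variable; on the real dataset the ranking would of course depend on the actual values.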

The elbow method for finding the optimal number of clusters indicated that 5 is the optimal number, as shown.

Here are the visualized clusters:
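As a rough sketch of how the elbow method works, the total within-cluster sum of squares (WCSS) can be computed for a range of k values with base R's `kmeans` and plotted. The data below is synthetic, with five deliberately separated groups, and the feature names are assumptions rather than the project's actual columns.

```r
set.seed(1)
# Synthetic 2-D data with five well-separated groups (illustrative only).
centers <- matrix(c(20, 20,
                    20, 80,
                    55, 50,
                    90, 20,
                    90, 80), ncol = 2, byrow = TRUE)
features <- do.call(rbind, lapply(seq_len(nrow(centers)), function(i) {
  cbind(rnorm(40, centers[i, 1], 4), rnorm(40, centers[i, 2], 4))
}))
colnames(features) <- c("AnnualIncome", "SpendingScore")

# Total within-cluster sum of squares for k = 1..10.
# nstart = 25 reruns k-means from several random starts and keeps the best fit.
ks <- 1:10
wcss <- sapply(ks, function(k) {
  kmeans(features, centers = k, nstart = 25)$tot.withinss
})

# The "elbow" is the k after which the curve stops dropping sharply.
plot(ks, wcss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

On data like this the curve flattens after k = 5, which is the pattern the elbow method looks for. Reading the elbow is a visual judgment, so it is usually worth checking the resulting clusters as well.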

The original dataset was then modified by dividing the Spending Score into three levels ('Minimal', 'Medium' and 'Excess'), which created a classification problem. The training dataset was fed to three models: KNN, SVM and RF. RF had the best accuracy and was therefore used to make predictions on the test dataset, with a final accuracy of 87.5%.
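The binning and evaluation steps could look roughly like the sketch below. Several things here are assumptions, not taken from the project: the break points for the three levels, the 80/20 split, the column names, and the synthetic data. Only KNN is shown, using `knn()` from the `class` package that ships with standard R installations; SVM and RF need add-on packages. Because the scores are random, the accuracy it prints carries no meaning beyond demonstrating the calculation.

```r
library(class)  # provides knn()

# Synthetic stand-in for the customer data; column names are assumptions.
set.seed(7)
n <- 200
customers <- data.frame(
  Age = sample(18:70, n, replace = TRUE),
  AnnualIncome = round(runif(n, 15, 140)),
  SpendingScore = sample(1:100, n, replace = TRUE)
)

# Bin the score into three levels; these break points are assumptions.
# Intervals are (0, 33], (33, 66], (66, 100].
customers$SpendLevel <- cut(
  customers$SpendingScore,
  breaks = c(0, 33, 66, 100),
  labels = c("Minimal", "Medium", "Excess")
)

# 80/20 train/test split (the ratio is an assumption).
train_idx <- sample(seq_len(n), size = 0.8 * n)
predictors <- c("Age", "AnnualIncome")

# Standardize predictors; the test set reuses the training means and SDs.
train_x <- scale(customers[train_idx, predictors])
test_x <- scale(customers[-train_idx, predictors],
                center = attr(train_x, "scaled:center"),
                scale = attr(train_x, "scaled:scale"))

# Classify each test row by majority vote among its 5 nearest training rows.
pred <- knn(train_x, test_x, cl = customers$SpendLevel[train_idx], k = 5)

# Accuracy is the share of test rows whose predicted level matches the true one.
actual <- customers$SpendLevel[-train_idx]
accuracy <- mean(pred == actual)
print(table(Predicted = pred, Actual = actual))
print(accuracy)
```

Fitting each candidate model on the same training rows and comparing this accuracy figure is the kind of comparison the paragraph above describes.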

Contributors: Tanishq Deshpande contributed to the SVR model.
