This is a binary classification problem: given information about a sample of applicants, we need to predict whether or not to grant each of them a loan. We'll cover:
- Data visualization
- Feature selection and feature engineering
- Some techniques for data processing
- Handling missing data
- Handling categorical and numerical data
- Outlier detection
- Model evaluation

We'll use the following libraries: sklearn, matplotlib, numpy, pandas, seaborn, and scipy.

To handle missing data, we'll use backward fill (the 'bfill' method) for numerical data and the most frequent value (the mode) for categorical data.
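
As a rough illustration of that imputation strategy, here is a minimal sketch with pandas. The toy DataFrame and its column names (`LoanAmount`, `Gender`) are made up for the example and are not taken from the actual dataset. Note that backward fill cannot fill a missing value in the last row, since there is nothing after it to copy from.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration only.
df = pd.DataFrame({
    "LoanAmount": [120.0, np.nan, 150.0, np.nan, 90.0],
    "Gender": ["Male", None, "Female", "Male", "Male"],
})

# Split the columns by type so each group gets its own strategy.
num_cols = df.select_dtypes(include="number").columns
cat_cols = df.select_dtypes(exclude="number").columns

# Backward fill: each missing numeric value takes the next valid value below it.
df[num_cols] = df[num_cols].bfill()

# Most frequent value: fill each categorical column with its mode.
for col in cat_cols:
    df[col] = df[col].fillna(df[col].mode()[0])

print(df)
```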

We'll try 4 different models:
a) Logistic Regression
b) KNeighborsClassifier
c) SVC
d) DecisionTreeClassifier
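
To give an idea of how these four sklearn classifiers can be trained and compared side by side, here is a small sketch. It runs on synthetic data from `make_classification` (an assumption, standing in for the loan data), uses default hyperparameters, and scores with plain accuracy, so treat it as a starting point rather than the final setup.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (placeholder for the real applicants table).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The four models, all with (mostly) default settings.
models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNeighborsClassifier": KNeighborsClassifier(),
    "SVC": SVC(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```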
Here we go!