This repository analyzes loan data to identify the factors that contribute to loan defaults and to predict which loans are likely to default.
The analysis includes exploratory data analysis (EDA), feature engineering, and model building.
- Loan Grade: Lower-quality loan grades (e.g., F, G) have a significantly higher default rate.
- Loan Term: Longer loan terms (60 months) exhibit a higher default rate compared to shorter terms (36 months).
- Debt-to-Income Ratio (DTI): Applicants with higher DTI values have a greater tendency to default.
- Purpose of Loan: Loans for specific purposes (e.g., small business) show a higher default rate.
- Public Records: Borrowers with a history of derogatory remarks or bankruptcies have an increased risk of default.
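
Findings like the ones above typically come from comparing default rates across groups. The snippet below is a minimal sketch of that comparison with Pandas, not the project's actual code. The column names (`grade`, `term`, `loan_status`), the rule that "Charged Off" counts as a default, the helper `default_rate_by`, and the eight sample rows are all illustrative assumptions.

```python
import pandas as pd

# Tiny made-up sample; column names and values are assumptions for illustration only
loans = pd.DataFrame({
    "grade": ["A", "A", "B", "B", "F", "F", "G", "G"],
    "term": [36, 36, 36, 60, 60, 60, 36, 60],
    "loan_status": ["Fully Paid", "Fully Paid", "Fully Paid", "Charged Off",
                    "Charged Off", "Fully Paid", "Charged Off", "Charged Off"],
})

# Assumption: a loan counts as a default when its status is "Charged Off"
loans["defaulted"] = (loans["loan_status"] == "Charged Off").astype(int)


def default_rate_by(df, column):
    """Return the share of defaulted loans for each value of `column` (hypothetical helper)."""
    return df.groupby(column)["defaulted"].mean().sort_values(ascending=False)


rate_by_grade = default_rate_by(loans, "grade")
rate_by_term = default_rate_by(loans, "term")
print(rate_by_grade)
print(rate_by_term)
```

The same helper can be pointed at other columns, such as loan purpose or a binned DTI column, to compare default rates along the remaining dimensions.
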
- Python: Programming language for data analysis and model building.
- Pandas: Data manipulation and analysis library.
- NumPy: Numerical computing library.
- Matplotlib & Seaborn: Data visualization libraries.
- Scikit-learn: Machine learning library for model building and evaluation.
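
As a rough illustration of how these libraries fit together for model building and evaluation, here is a small sketch. It does not reproduce the project's model: the choice of logistic regression, the feature names `dti` and `term_60`, and the synthetic data (generated from a made-up relationship) are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500

# Synthetic features (assumed names): debt-to-income ratio and a 60-month-term flag
data = pd.DataFrame({
    "dti": rng.uniform(0, 40, n),
    "term_60": rng.integers(0, 2, n),
})

# Synthetic label: default probability rises with DTI and with a 60-month term.
# The coefficients are invented for this sketch, not estimated from real loans.
logit = -3.0 + 0.08 * data["dti"] + 1.0 * data["term_60"]
prob = 1 / (1 + np.exp(-logit))
data["defaulted"] = (rng.uniform(0, 1, n) < prob).astype(int)

# Hold out 25% of the rows, keeping the default share similar in both splits
X_train, X_test, y_train, y_test = train_test_split(
    data[["dti", "term_60"]],
    data["defaulted"],
    test_size=0.25,
    random_state=0,
    stratify=data["defaulted"],
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate with ROC AUC using the predicted probability of default
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"ROC AUC on held-out data: {test_auc:.3f}")
```

ROC AUC is used here because defaults are usually the minority class, so plain accuracy can look good even for a model that never predicts a default.
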
- upGrad course content.
- Live sessions from upGrad.
- The dataset used in this project is sourced from LendingClub.
Created by swapnilrdx - feel free to contact me!