
This project is for understanding the concepts of univariate and bivariate analysis on the basics of a data model.


zuhair30/Lending_Club_Case_Study_SYEDandNAAZ


Lending_Club_Case_Study_SYEDandNAAZ

Identify risky loan applicants, or the drivers that lead to risky loan applications, so that such loans can be reduced, thereby cutting down the amount of credit loss. Identifying such applicants using EDA is the aim of this case study. By doing this exercise we will understand the concepts of univariate and bivariate analysis on the basics of a data model.

Table of Contents

  • [General Info]
  • [Technologies Used]
  • [Conclusions]
  • [Acknowledgements]

General Information

  • A finance company provides loans to various users, but many are unable to repay, resulting in financial loss.
  • The objective of this study is to identify the variables that contribute to loans being charged off or defaulted on.
  • For this analysis, we initially filtered the data down to charged-off loans to identify trends within that group.
  • After that, we included the entire data set for all loan statuses (Charge Off, Current, and Fully Paid).
  • Removing the rows with the Current loan status did not add to or change any of the results.
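
The filtering steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names `loan_status` and `loan_amnt`, the status label `"Charged Off"`, and the toy rows are all assumptions made for the example. It also shows one reason why dropping the Current rows may leave per-status summaries unchanged: a group-wise summary of the remaining statuses does not depend on rows from another status.

```python
import pandas as pd

# Toy rows for illustration only; they are NOT taken from the Lending Club data.
# Column names and status labels are assumptions.
loans = pd.DataFrame({
    "loan_status": ["Charged Off", "Fully Paid", "Current",
                    "Fully Paid", "Charged Off", "Current"],
    "loan_amnt": [12000, 5000, 8000, 7000, 15000, 9000],
})

# Step 1: look only at charged-off loans to study trends within that group.
charged_off = loans[loans["loan_status"] == "Charged Off"]

# Step 2: use every status, and separately build a copy without "Current" rows.
without_current = loans[loans["loan_status"] != "Current"]

# Per-status summaries with and without the "Current" rows.
median_all = loans.groupby("loan_status")["loan_amnt"].median()
median_closed = without_current.groupby("loan_status")["loan_amnt"].median()

print(median_all)
print(median_closed)
```

In this sketch the medians for the charged-off and fully paid groups are the same in both outputs; only the Current row disappears from the second one.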

Conclusions Drawn

• Candidates with high loan amounts are more likely to be charged off

• Grades E, F, and G have a higher tendency to be charged off

• Grades E, F, and G have higher interest rates than the other grades

• Grades E, F, and G have a higher debt-to-income ratio

• Candidates with high income are less likely to default

• Candidates with high interest rates are more likely to default

• The debt-to-income ratio (DTI) is better for candidates who fully paid their loans

• Candidates with a shorter employment length (experience) are more likely not to repay the loan

• Charge-offs are higher for high installments

• Customers who rent or who are not verified are more likely to default and are therefore high-risk customers

• Candidates most often want loans for home and small business purposes, and they take them for the shorter term

• Installments for credit card, debt consolidation, small_business, and house loans are higher than for other purposes

• Candidates who take a loan for a small business are more likely to default, as are those who take one for debt consolidation

• The largest share of defaulters is in the range below 10k, among candidates who rent or have a mortgage

• The most charge-offs come from California, which suggests that better checks are needed in CA, as well as in FL and NY

• DTI is negatively correlated with annual income and loan amount; recoveries is only very weakly correlated with most variables and does not give much insight

• Comparing interest rate against loan amount shows a higher median loan amount at higher interest rates

-- Hence loan grade, home ownership, interest rate, purpose, address state, and employee income are some of the variables that help identify high-risk customers
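
As a rough sketch of how bivariate conclusions like those above can be checked with pandas, the example below computes a charge-off rate per grade, a median interest rate per grade, and a correlation with a charge-off flag. The column names (`grade`, `int_rate`, `loan_status`), the status label `"Charged Off"`, and the helper column `charged_off` are assumptions, and the rows are invented toy values that prove nothing about the real data.

```python
import pandas as pd

# Invented toy rows for illustration only; not the Lending Club data.
loans = pd.DataFrame({
    "grade": ["A", "A", "B", "B", "E", "E", "G", "G"],
    "int_rate": [6.0, 7.0, 10.0, 11.0, 17.0, 18.0, 21.0, 22.0],
    "loan_status": ["Fully Paid", "Fully Paid", "Fully Paid", "Charged Off",
                    "Charged Off", "Fully Paid", "Charged Off", "Charged Off"],
})

# Flag charged-off loans as 1 and every other status as 0 (hypothetical column).
loans["charged_off"] = (loans["loan_status"] == "Charged Off").astype(int)

# Bivariate view: share of charged-off loans within each grade.
charge_off_rate = loans.groupby("grade")["charged_off"].mean()

# Bivariate view: median interest rate within each grade.
median_rate = loans.groupby("grade")["int_rate"].median()

# Correlation between a numeric variable and the charge-off flag.
corr = loans["int_rate"].corr(loans["charged_off"])

print(charge_off_rate)
print(median_rate)
print(f"correlation of int_rate with charged_off: {corr:.2f}")
```

The same `groupby` pattern could be applied to other categorical columns such as home ownership, purpose, or state, if they are named accordingly in the data.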

Technologies Used

  • Anaconda, Python

Libraries

  • numpy - version 1.24.3
  • pandas - version 2.1.1
  • seaborn - version 0.12.2
  • matplotlib - version 3.7.1
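
To compare a local environment against the versions listed above, something like the following could be used. It is only a sketch: the `check_versions` helper is hypothetical and not part of the repository, and it relies on the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# Versions listed in this README.
expected = {
    "numpy": "1.24.3",
    "pandas": "2.1.1",
    "seaborn": "0.12.2",
    "matplotlib": "3.7.1",
}


def check_versions(expected_versions):
    """Return {package: (expected, installed or None)} for each package."""
    report = {}
    for name, wanted in expected_versions.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            installed = None
        report[name] = (wanted, installed)
    return report


report = check_versions(expected)
for name, (wanted, installed) in report.items():
    print(f"{name}: expected {wanted}, installed {installed}")
```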

Acknowledgements

Give credit here.

Contact

Created by [@zuhair30]. Feel free to contact me, including by email at [email protected]

Supported by Shagufta Naaz Shaikh, reachable at [email protected]

The GitHub link of the repository is:

https://github.com/zuhair30/Lending_Club_Case_Study_SYEDandNAAZ

If the above link does not work, or the PPT or Python file does not render because of GitHub slowness or a size issue, please use the URL below, which downloads the above repository in zip format to your local machine:

https://github.com/zuhair30/Lending_Club_Case_Study_SYEDandNAAZ/zipball/master
