Ola Drivers Churn Analysis

📚 Table of Contents

Project Overview
Objectives
Dataset Overview
Methodology
Insights and Recommendations
Results
Conclusions
Acknowledgments

🚀 Project Overview

The Ola Driver Churn Analysis project examines driver retention on the Ola platform, a leading ride-hailing service in India. Frequent driver churn disrupts operational consistency and drives up costs related to recruiting, training, and onboarding. This project focuses on identifying drivers likely to churn and delivering actionable insights to improve Ola’s retention strategy.

🎯 Objectives

The project’s main objectives include:

Predicting which drivers are at risk of leaving.
Profiling drivers to identify demographic, performance, or behavior patterns linked to churn.
Enabling data-driven retention by identifying priority drivers for Ola’s retention initiatives, helping to stabilize the driver base and reduce operational costs.

📊 Dataset Overview

The dataset contains demographic, tenure, and performance-related details for Ola’s driver-partners, forming the basis for understanding patterns and training models to identify drivers at risk.

Key Features:

Driver_ID: Unique identifier for each driver.
Age, Gender, and City: Demographic details, essential for driver segmentation.
Income: Monthly income level of drivers.
Joining Date: Date of joining Ola.
Last Working Date: Date of last engagement with Ola (null values indicate active drivers).
Quarterly Rating: Driver performance rating (scale of 1-5).
Total Business Value: Monthly revenue contribution, adjusted for cancellations or refunds.
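
Because a null Last Working Date marks an active driver, the churn label can be derived directly from that column. The sketch below is a minimal illustration, not the notebook's code: the sample rows are made up, the column names are assumed from the feature list above, and it assumes one row per driver.

```python
import pandas as pd

# Hypothetical sample rows; column names are assumed from the feature list.
df = pd.DataFrame({
    "Driver_ID": [1, 2, 3],
    "Joining Date": ["2019-01-15", "2019-06-01", "2020-02-10"],
    "Last Working Date": ["2019-11-30", None, None],
})

# Parse the date columns; missing values become NaT.
for col in ["Joining Date", "Last Working Date"]:
    df[col] = pd.to_datetime(df[col])

# A driver counts as churned when a last working date is recorded.
df["Churn"] = df["Last Working Date"].notna().astype(int)

print(df[["Driver_ID", "Churn"]])
```

If the raw data holds several monthly rows per driver, the rows would first need to be aggregated to one record per Driver_ID before applying this rule.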

🧠 Methodology

The analysis proceeds in four stages: data preprocessing, exploratory data analysis, modeling and evaluation, and feature importance analysis.

1. Data Preprocessing

Objective: Clean and prepare data to optimize model performance and accuracy.

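Missing Value Handling: Handled missing values in Last Working Date and Income. Note that a null Last Working Date marks an active driver, so it carries the churn signal rather than being an ordinary gap to fill.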
Encoding Categorical Variables: Transformed Gender and City fields to numerical representations.
Feature Engineering: Created derived features such as Service Duration and Income-to-Business Ratio.
Normalization: Scaled continuous variables like Income and Total Business Value so that large-valued features do not dominate scale-sensitive models such as Logistic Regression.
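
The preprocessing steps above can be sketched as follows. This is an illustration under stated assumptions rather than the notebook's implementation: the sample rows, the snapshot date used for active drivers, the one-hot encoding choice, the use of StandardScaler, and the `Income_to_Business_Ratio` column name are all hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical sample rows; column names follow the dataset overview.
df = pd.DataFrame({
    "Gender": ["M", "F", "M", "F"],
    "City": ["C1", "C2", "C1", "C3"],
    "Income": [40000.0, 55000.0, 62000.0, 48000.0],
    "Total Business Value": [200000.0, 0.0, 350000.0, 120000.0],
    "Joining Date": pd.to_datetime(
        ["2019-01-15", "2019-06-01", "2020-02-10", "2018-09-20"]),
    "Last Working Date": pd.to_datetime(
        ["2019-11-30", None, None, "2020-03-31"]),
})

# Service Duration in days. Active drivers have no last working date,
# so an assumed snapshot date stands in for "today".
snapshot = pd.Timestamp("2020-12-31")
end_date = df["Last Working Date"].fillna(snapshot)
df["Service Duration"] = (end_date - df["Joining Date"]).dt.days

# Income-to-Business Ratio; a zero business value would divide by zero,
# so it is masked to NaN first and the ratio is then set to 0 (assumption).
business = df["Total Business Value"].where(df["Total Business Value"] != 0)
df["Income_to_Business_Ratio"] = (df["Income"] / business).fillna(0.0)

# One-hot encode the categorical fields (one possible encoding).
encoded = pd.get_dummies(df, columns=["Gender", "City"], dtype=int)

# Standardize continuous columns to zero mean and unit variance.
num_cols = ["Income", "Total Business Value"]
encoded[num_cols] = StandardScaler().fit_transform(encoded[num_cols])

print(encoded.drop(columns=["Joining Date", "Last Working Date"]).round(2))
```

In a real pipeline the scaler would be fitted on the training split only and then applied to the test split, to avoid leaking test statistics into training.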

2. Exploratory Data Analysis (EDA)

Objective: Reveal trends, patterns, and key relationships in the data.

Key Visualizations:

Income vs. Churn: Scatter plot illustrating income levels among churned and retained drivers to highlight income impact on churn.
City-wise Churn Distribution: Churn rates visualized across cities to examine region-specific trends.
Quarterly Ratings Distribution: Distribution of driver ratings to analyze performance impact on churn.
Correlation Heatmap: Feature correlation matrix to assess relationships and potential multicollinearity.
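
The numbers behind the city-wise churn chart and the correlation heatmap can be computed with a few pandas calls. The sketch below uses made-up rows and an assumed 0/1 `Churn` column; it shows the calculation only and leaves the plotting out.

```python
import pandas as pd

# Hypothetical sample; "Churn" is an assumed 0/1 label column.
df = pd.DataFrame({
    "City": ["C1", "C1", "C2", "C2", "C2", "C3"],
    "Income": [40000, 62000, 55000, 30000, 35000, 48000],
    "Quarterly Rating": [1, 3, 2, 1, 1, 4],
    "Churn": [1, 0, 0, 1, 1, 0],
})

# City-wise churn rate: the mean of a 0/1 label is the churned share.
city_churn = df.groupby("City")["Churn"].mean().sort_values(ascending=False)

# Pairwise Pearson correlations among numeric columns (heatmap input).
corr = df[["Income", "Quarterly Rating", "Churn"]].corr()

print(city_churn)
print(corr.round(2))
```

The resulting `corr` frame is the kind of matrix a heatmap would display; strongly correlated feature pairs in it are the candidates to check for multicollinearity.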

3. Modeling and Evaluation

Models Implemented:

Logistic Regression: Baseline classifier for interpretability.
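Random Forest Classifier: Ensemble of decision trees that captures non-linear patterns and provides feature importance scores.
XGBoost: Gradient-boosted decision trees that model complex feature interactions; class weighting (for example via scale_pos_weight) helps with imbalanced data.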
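
A minimal training sketch for the first two models is shown below. It runs on synthetic data from `make_classification`, not on the Ola dataset, and the hyperparameters and split ratio are illustrative assumptions rather than the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, mildly imbalanced stand-in for the driver feature matrix.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4,
    weights=[0.7, 0.3], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42,
)

models = {
    # Scaling lives inside the pipeline so it is fitted on training data only.
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}
# xgboost.XGBClassifier exposes the same fit/predict_proba interface and
# could be added to this dict when the xgboost package is installed.

fitted = {name: model.fit(X_train, y_train) for name, model in models.items()}

# Predicted churn probability (positive class) for each test row.
test_proba = {name: m.predict_proba(X_test)[:, 1] for name, m in fitted.items()}

for name, proba in test_proba.items():
    print(name, "mean predicted churn probability:", round(proba.mean(), 3))
```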

Evaluation Metrics:

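Accuracy: Share of all predictions that are correct; it can be misleading when churned and retained classes are imbalanced.
Precision, Recall & F1 Score: Recall measures how many true churn cases are caught, precision measures how many predicted churners actually churn, and F1 is their harmonic mean.
ROC-AUC Score: Measures how well the model ranks churned drivers above retained ones across all decision thresholds.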
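
To make the metric definitions concrete, here is a small worked example with scikit-learn. The labels and scores are invented for illustration, and the 0.5 decision threshold is an assumption.

```python
from sklearn.metrics import (
    accuracy_score, f1_score, precision_score, recall_score, roc_auc_score,
)

# Hypothetical ground truth (1 = churned) and predicted churn probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1]

# Hard predictions at an assumed 0.5 threshold.
y_pred = [int(s >= 0.5) for s in y_score]

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    # ROC-AUC uses the scores, not the thresholded predictions.
    "roc_auc": roc_auc_score(y_true, y_score),
}

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Here three of four churners are caught (recall 0.75) and three of four flagged drivers truly churned (precision 0.75), while ROC-AUC is threshold-free and depends only on how the scores rank the two classes.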

4. Feature Importance

Feature importance analysis via Random Forest and XGBoost highlighted factors most relevant to churn prediction.

Findings:

Income: Lower-income drivers are more likely to churn.
Service Duration: Shorter tenure correlates with churn.
Quarterly Rating: Lower ratings increase churn likelihood.
City: Regional differences affect churn rates.
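
The sketch below shows how impurity-based feature importances can be read from a fitted Random Forest. The data is synthetic and the label is deliberately constructed from Income alone, so the ranking it prints illustrates the mechanics and does not reproduce the project's findings; the `Noise` column is a hypothetical control feature.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500

# Synthetic features; names mirror the dataset, values are made up.
X = pd.DataFrame({
    "Income": rng.normal(50000, 10000, n),
    "Service Duration": rng.integers(30, 1500, n),
    "Quarterly Rating": rng.integers(1, 6, n),  # integers 1 through 5
    "Noise": rng.normal(0, 1, n),               # hypothetical control column
})

# Synthetic label: low (noisy) income means churn, by construction.
y = (X["Income"] + rng.normal(0, 3000, n) < 45000).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances sum to 1 across features.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)

print(importances.round(3))
```

Impurity-based importances can favor high-cardinality features, so a cross-check such as scikit-learn's `permutation_importance` on held-out data is a reasonable complement.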

🔍 Insights and Recommendations

Key Insights

Income and Churn Correlation: Lower-income drivers are more prone to churn.
City-Specific Patterns: Certain cities show higher churn rates, possibly due to competition or operational challenges.
Performance-Linked Churn: Lower-rated drivers are likelier to churn, potentially due to reduced ride assignments or customer preference.
Feature Correlations: Service Duration and Quarterly Rating are positively correlated, suggesting longer-serving drivers receive better ratings.

Recommendations for Ola

Incentivize Low-Income Drivers: Provide incentives to improve earnings and reduce churn.
City-Specific Retention Strategies: Implement regional retention strategies in high-churn cities.
Performance Improvement Programs: Offer training for lower-rated drivers to boost performance and satisfaction.
Tenure-Based Rewards: Introduce rewards based on service duration to promote long-term retention.
Driver Feedback: Regular feedback can reveal underlying churn factors for proactive improvements.

⚙️ Results

| Model | Precision | Recall | F1 Score | Train ROC-AUC | Test ROC-AUC |
| --- | --- | --- | --- | --- | --- |
| Logistic Regression | 85.53% | 81.35% | 83.39% | 86.76% | 86.59% |
| Random Forest | 88.39% | 83.79% | 86.03% | 91.41% | 88.48% |
| XGBoost | 84.06% | 88.69% | 86.31% | 91.66% | 89.38% |
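
As a quick consistency check, the F1 scores in the table can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The snippet only uses the figures from the table; the helper name `f1_from` is made up for this illustration.

```python
# (precision %, recall %, reported F1 %) copied from the results table.
results = {
    "Logistic Regression": (85.53, 81.35, 83.39),
    "Random Forest": (88.39, 83.79, 86.03),
    "XGBoost": (84.06, 88.69, 86.31),
}


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


recomputed = {
    name: round(f1_from(p, r), 2) for name, (p, r, _) in results.items()
}

for name, (_, _, reported) in results.items():
    print(f"{name}: recomputed F1 = {recomputed[name]:.2f}%, "
          f"reported = {reported:.2f}%")
```

All three recomputed values agree with the reported F1 scores to two decimal places.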

Model Summary

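Logistic Regression: Solid baseline; train and test ROC-AUC are nearly identical (86.76% vs. 86.59%), so it shows little overfitting.
Random Forest: Highest precision (88.39%) and a higher F1 score than Logistic Regression (86.03% vs. 83.39%); test ROC-AUC is about 2.9 points below train (88.48% vs. 91.41%).
XGBoost: Highest test ROC-AUC (89.38%), recall (88.69%), and F1 score (86.31%), making it the best fit when catching as many churners as possible is the priority.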

XGBoost Evaluation Visualizations

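Confusion Matrix: Shows 290 true positives and 55 false positives.
ROC Curve: AUC of 0.89 (the 89.38% test ROC-AUC in the results table) indicates good separation between churned and retained drivers.
Precision-Recall Curve: Plots precision against recall across decision thresholds; the reported operating point is 84.06% precision at 88.69% recall.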
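
The confusion matrix counts can be tied back to the results table with a little arithmetic. In the sketch below, the true positive and false positive counts and the recall come from this README; the false negative count is not reported, so the value derived here is an estimate implied by the rounded recall, not a figure from the notebook.

```python
# Counts reported for the XGBoost confusion matrix.
tp, fp = 290, 55

# Recall reported for XGBoost in the results table.
reported_recall = 0.8869

# Precision = TP / (TP + FP).
precision = tp / (tp + fp)

# Recall = TP / (TP + FN), so TP / recall estimates the number of
# actual churners; the remainder is the implied false negative count.
implied_actual_positives = round(tp / reported_recall)
implied_fn = implied_actual_positives - tp

print(f"precision from counts: {precision:.2%}")
print(f"implied false negatives (estimate): {implied_fn}")
```

The precision computed from the counts matches the 84.06% reported for XGBoost, which suggests the confusion matrix and the table describe the same evaluation run.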

🔑 Conclusions

Reliable Churn Prediction: The XGBoost model identifies at-risk drivers with 88.69% recall and a test ROC-AUC of 89.38%.
Insight-Driven Retention: Churn profiles help Ola focus on at-risk demographics and regions.
Operational Improvements: Proactive retention reduces onboarding costs and supports consistent service.

🙏 Acknowledgments

Thanks to Ola for data access and project support. Special gratitude to Pandas, Matplotlib, Scikit-learn, and XGBoost contributors for their invaluable tools.
