This repository contains the code, SQL queries, and Python scripts for a comprehensive data analysis project. The project involved data extraction, transformation, and statistical analysis using SQL and Python, aiming to derive valuable insights and make data-driven decisions.
This project aims to answer in-depth statistical questions about the dataset. SQL was used to clean the data, while Python was used for statistical analysis and to obtain meaningful insights from the data.
The dataset used in this project is a modified version of this dataset.
The dataset contains information about customers’ demand rate between January 2017 and August 2018. The data were collected on an hourly basis and include time data such as date, hour, and season, as well as weather data such as the weather condition, temperature, humidity, and wind speed. The ‘demand’ column represents customers’ willingness to rent a car at a specific time. A higher demand rate means that customers are more willing to rent a car, and a lower rate means they are less willing.
The SQL dialect used in this project is SQLite (solely for the purpose of practice in Jupyter Notebook). Some of the SQL keywords used in this project include:
- CREATE TABLE
- DELETE
- INSERT
- ALTER
- UPDATE
- DROP
- SELECT
- WHERE
- CASE statement
- ORDER BY
- GROUP BY
- LIMIT, among others.
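As a rough illustration of how several of these keywords can combine in a cleaning step, here is a minimal sketch that runs SQLite from Python through the standard `sqlite3` module. The table name `rentals`, its columns, the sample rows, and the cleaning rules are all hypothetical and chosen only for the example; they are not taken from the project's notebooks.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE TABLE: hypothetical schema, loosely modelled on the dataset description.
cur.execute("""
    CREATE TABLE rentals (
        timestamp   TEXT,
        season      TEXT,
        temperature REAL,
        demand      REAL
    )
""")

# INSERT: a few made-up rows, two of them with missing values.
rows = [
    ("2017-01-01 00:00", "winter", 3.5, 2.1),
    ("2017-01-01 01:00", "winter", None, 1.8),
    ("2017-07-01 12:00", "summer", 29.0, 4.6),
    ("2017-07-01 13:00", "summer", 30.5, None),
    ("2017-07-01 14:00", "summer", 31.0, 5.0),
]
cur.executemany("INSERT INTO rentals VALUES (?, ?, ?, ?)", rows)

# DELETE + WHERE: drop rows where the target column is missing.
cur.execute("DELETE FROM rentals WHERE demand IS NULL")

# UPDATE: fill a missing temperature with the average for the same season.
cur.execute("""
    UPDATE rentals
    SET temperature = (
        SELECT AVG(r2.temperature)
        FROM rentals AS r2
        WHERE r2.season = rentals.season
    )
    WHERE temperature IS NULL
""")

# SELECT with CASE, GROUP BY, ORDER BY and LIMIT: label each season's demand.
summary = cur.execute("""
    SELECT season,
           CASE WHEN AVG(demand) >= 4 THEN 'high' ELSE 'low' END AS demand_level,
           COUNT(*) AS n_rows
    FROM rentals
    GROUP BY season
    ORDER BY n_rows DESC, season
    LIMIT 5
""").fetchall()

missing_temperatures = cur.execute(
    "SELECT COUNT(*) FROM rentals WHERE temperature IS NULL"
).fetchone()[0]

print(summary)  # [('summer', 'high', 2), ('winter', 'low', 2)]
conn.close()
```

The threshold of 4 in the CASE expression is arbitrary; in practice it would be chosen from the distribution of the demand column.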
Some of the libraries and classes used in this project are numpy, pandas, matplotlib, scipy, KMeans, MLPRegressor, DecisionTreeClassifier, ARIMA, GradientBoostingClassifier, MinMaxScaler, RandomForestRegressor, etc.
Hypotheses tested:
- Whether there is a significant relationship between each column (except the timestamp column) and the demand rate.
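A test of this kind might look like the sketch below, which uses `scipy.stats`. It assumes a Pearson correlation test for a numerical column and a one-way ANOVA for a categorical column; the tests actually used in the project may differ. The data are synthetic, and the variable names and the 0.05 significance level are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
alpha = 0.05                    # assumed significance level

# Numerical column vs. demand: Pearson correlation test.
# Synthetic data: demand rises slightly with temperature, plus noise.
temperature = rng.uniform(0, 35, size=200)
demand = 2.0 + 0.05 * temperature + rng.normal(0, 0.3, size=200)

r, p_corr = stats.pearsonr(temperature, demand)
reject_corr_null = p_corr < alpha  # True -> evidence of a linear relationship

# Categorical column vs. demand: one-way ANOVA across groups.
# Synthetic data: each season gets a different mean demand.
demand_by_season = {
    "winter": rng.normal(2.0, 0.3, size=50),
    "spring": rng.normal(3.0, 0.3, size=50),
    "summer": rng.normal(5.0, 0.3, size=50),
    "fall": rng.normal(4.0, 0.3, size=50),
}
f_stat, p_anova = stats.f_oneway(*demand_by_season.values())
reject_anova_null = p_anova < alpha  # True -> group means differ

print(f"Pearson r = {r:.2f}, reject H0: {reject_corr_null}")
print(f"ANOVA F = {f_stat:.1f}, reject H0: {reject_anova_null}")
```

In both cases the null hypothesis is that the column has no relationship with demand. A p-value below the chosen level is read as evidence against that hypothesis, not as proof of a causal link.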
The requirements.txt file contains the libraries needed for this project to run on a PC.
Run pip install -r requirements.txt
in a terminal to install the necessary libraries.
The key findings, insights, visualisations, graphs, and conclusions from this data analysis are documented in Report.pdf.
For any questions or feedback, you can reach me at [email protected].