This repository contains an R script for data cleaning, a SQL script for data loading, Hive and Pig scripts for data analysis, and a Python script that uses linear regression to predict car sale prices. It also contains a shell script that automates the data loading and analysis process.
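
To give a rough idea of what the linear regression step involves, here is a minimal sketch of an ordinary least squares fit on a single feature. It is not the script from this repository: the function names `fit_line` and `predict`, the choice of car age as the feature, and the sample values are all assumptions made up for illustration, and the actual script may use different features or a library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of x-y co-deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted price for a feature value x."""
    return slope * x + intercept


# Hypothetical, made-up values (NOT taken from the repository's dataset):
# car age in years and sale price.
ages = [1, 2, 3, 4, 5]
prices = [18500, 17000, 15500, 14000, 12500]

slope, intercept = fit_line(ages, prices)
estimate = predict(slope, intercept, 6)  # price estimate for a 6-year-old car
```

With more than one feature, the same idea generalizes to multiple linear regression, which is usually fitted with a numerical library rather than by hand.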