# expedia_hotel_recommendations: Big data problem
This is a Kaggle competition. The goal is to predict the booking outcome for a user event based on the user's search and other attributes associated with that event.
To get information about the data please see here.
This is a big data problem, so I used PySpark for the analysis and for building the ML models. To speed up the analysis, I also ran it on Amazon's distributed computing service, Elastic MapReduce (EMR).