GeoSpatial Analysis using Hadoop and Spark.

This project was part of the coursework for Distributed Database Systems (CSE 512). The first two phases involved setting up a Hadoop Distributed File System (HDFS) and Spark clusters, then running extensive tests while monitoring resource utilization across all the clusters with the Ganglia monitor.

The uploaded source code is from the third phase, which performed hotspot analysis on our distributed system. The problem was adapted from the ACM SIGSPATIAL Cup 2016; however, our input included only the data from January.

A detailed report is also uploaded. For instructions on running and testing the code, please follow the installation_run_instructions file.

Algorithm

The following flowchart describes the algorithm used.

Visualizations of the Hotspots obtained.

We were tasked with finding the top 50 hotspots based on analysis and calculation of a score.
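
As a rough illustration of the scoring and ranking step, here is a minimal single-machine sketch in plain Python. It is not the project's Spark code. This README does not name the score, so the sketch assumes the Getis-Ord Gi* statistic with binary neighbor weights over a space-time cube, as in the ACM SIGSPATIAL Cup 2016 problem statement. The function names (`getis_ord_scores`, `top_hotspots`), the `(x, y, t)` cell indexing, and the cube dimensions are hypothetical choices made for this example.

```python
import heapq
import math
from itertools import product


def getis_ord_scores(counts, shape):
    """Compute an assumed Getis-Ord Gi* score for every cell of a cube.

    counts: dict mapping (x, y, t) cell indices to an event count;
            cells missing from the dict count as zero.
    shape:  (nx, ny, nt), the cube dimensions (hypothetical layout).
    """
    nx, ny, nt = shape
    n = nx * ny * nt
    if n < 2:
        return {}
    mean = sum(counts.values()) / n
    variance = sum(v * v for v in counts.values()) / n - mean * mean
    if variance <= 0:
        return {}  # every cell is identical, so no cell stands out
    std = math.sqrt(variance)

    scores = {}
    offsets = list(product((-1, 0, 1), repeat=3))  # the cell plus 26 neighbors
    for x, y, t in product(range(nx), range(ny), range(nt)):
        weight_sum = 0    # neighbors inside the cube, each with weight 1
        neighbor_sum = 0  # total count over those neighbors
        for dx, dy, dt in offsets:
            cx, cy, ct = x + dx, y + dy, t + dt
            if 0 <= cx < nx and 0 <= cy < ny and 0 <= ct < nt:
                weight_sum += 1
                neighbor_sum += counts.get((cx, cy, ct), 0)
        spread = (n * weight_sum - weight_sum ** 2) / (n - 1)
        if spread <= 0:
            continue  # cube too small for this cell to have a defined score
        scores[(x, y, t)] = (neighbor_sum - mean * weight_sum) / (std * math.sqrt(spread))
    return scores


def top_hotspots(scores, k=50):
    """Return the k highest-scoring (cell, score) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])
```

Calling `top_hotspots(getis_ord_scores(counts, shape))` returns up to 50 `(cell, score)` pairs in descending score order. In a distributed setting, the per-cell neighbor sums would more likely be computed with a join or aggregation than with the nested loop used here.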

Do not use unless you have obtained permission.

Copyright 2017

@authors