Welcome to the DARA Big Data project hackathon at the Namibia University of Science and Technology!
The DARA Big Data hackathon is designed to help you improve your data science skills in a friendly and supportive environment. At the hackathon, you will be grouped into teams of four, and each team will work on completing the hack challenge. At the end of the hack, each team will give a 5-minute (3-slide) presentation on the results of their challenge. These presentations will be judged by the organisers and there will be a prize for the winning team. The presentations will be judged on (1) the accuracy of the results predicted via machine learning and (2) the visualisation/presentation of the data and results.
The DARA Big Data hack challenges will be run in Python 3 using the IDIA Cloud. Students should have a basic working knowledge of Python (including the scipy and numpy libraries), but you do not have to be an expert to take part and enjoy yourself!
In this challenge you are tasked with building a classifier to separate real astronomical signals from man-made radio frequency interference (RFI). The astronomical signals that you're looking for come from pulsars, the ultra-dense relics of exploded stars. You can read more about them in this introduction to pulsars and how to classify them using random forests. The dataset is available to you in two formats: (i) as a set of eight numerical features per sample, suitable for classification using random forests, SVMs, neural nets, etc., and (ii) as a set of images that show the data that the numerical features are drawn from, suitable for CNN-based classification. Your hack challenge is to build the best classifier that you possibly can, remembering that we want as many correctly classified pulsars as possible, with as little contamination from RFI as possible...
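The two goals in the challenge statement map onto two standard metrics: recall (the fraction of real pulsars that your classifier finds) and precision (the fraction of your predicted pulsars that are real rather than RFI). The sketch below is one way to compute both in plain Python. The function name, the label encoding (1 = pulsar, 0 = RFI) and the example labels are all assumptions made for illustration; they are not taken from the hackathon dataset or repository.

```python
def pulsar_recall_and_precision(true_labels, predicted_labels):
    """Return (recall, precision) for the pulsar class.

    Assumed encoding (hypothetical, for illustration): 1 = pulsar, 0 = RFI.
    """
    pairs = list(zip(true_labels, predicted_labels))

    # Pulsars correctly identified as pulsars.
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    # Pulsars that were missed (labelled as RFI).
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    # RFI that contaminates the predicted pulsar sample.
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)

    # Guard against division by zero when a class is absent.
    found = true_pos + false_neg
    claimed = true_pos + false_pos
    recall = true_pos / found if found else 0.0
    precision = true_pos / claimed if claimed else 0.0
    return recall, precision


# Made-up labels, for illustration only.
example_truth = [1, 1, 1, 0, 0, 0, 0, 1]
example_predictions = [1, 0, 1, 0, 1, 0, 0, 1]

recall, precision = pulsar_recall_and_precision(example_truth, example_predictions)
print(f"recall = {recall:.2f}, precision = {precision:.2f}")
```

A classifier that labels everything as a pulsar reaches perfect recall but poor precision, so it is worth tracking both numbers rather than overall accuracy alone.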
To get started, clone the hackathon repository:
> git clone https://github.com/darabigdata/WindhoekHack.git
Then make sure you have the right Python libraries for the tutorials. They can all be installed using pip and the requirements.txt file in the repo:
> pip install -r requirements.txt