
Introduction

Welcome to the DARA Big Data project International Data Week 2018 hackathon at the University of Botswana!

The DARA Big Data hackathon is designed to help you improve your data science skills in a friendly and supportive environment. At the hackathon, you will be grouped into teams of four, and each team will choose one of the DARA Big Data hack challenges to work on. At the end of the hack, each team will give a 5-minute (3-slide) presentation on the results of their challenge. The presentations will be judged by the organisers, and there will be a prize for the winning team. Judging is based on (1) the accuracy of the results predicted via machine learning and (2) the visualisation/presentation of the data and results.

The DARA Big Data hack challenges will be run in Python 3 using the IDIA Cloud. Students should have a basic working knowledge of Python (including the scipy and numpy libraries), but you do not have to be an expert to take part and enjoy yourself!


Challenge 1: Build a machine learning recommendation engine web application

Machine learning powers many widely used applications and services, such as the recommender systems on Netflix and Amazon, and even Google reverse image search. However, building such an application isn't limited to big tech firms. In this challenge you will build a web application of your choice. To help you get started we've put together a tutorial that shows you how to build a movie recommender based on a simple machine learning approach.
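As a rough illustration of what a "simple machine learning approach" to movie recommendation can look like, here is a minimal sketch of user-based collaborative filtering with cosine similarity. It is not taken from the tutorial: the `RATINGS` data, the user names and the `cosine`/`recommend` helpers are all hypothetical, and the tutorial may use a different method entirely.

```python
from math import sqrt

# Hypothetical toy ratings (assumption): user -> {movie: rating on a 1-5 scale}.
RATINGS = {
    "ann":   {"Alien": 5, "Blade Runner": 4, "Casablanca": 1},
    "ben":   {"Alien": 4, "Blade Runner": 5, "Amelie": 2},
    "chloe": {"Casablanca": 5, "Amelie": 4, "Alien": 1},
    "dan":   {"Alien": 5, "Blade Runner": 5},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (0.0 if no overlap)."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[m] * v[m] for m in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user, ratings, top_n=3):
    """Rank movies the user has not rated by the similarity-weighted
    average rating given by the other users."""
    seen = ratings[user]
    scores, weights = {}, {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, their)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for movie, rating in their.items():
            if movie in seen:
                continue
            scores[movie] = scores.get(movie, 0.0) + sim * rating
            weights[movie] = weights.get(movie, 0.0) + sim
    ranked = sorted(((scores[m] / weights[m], m) for m in scores), reverse=True)
    return [movie for _, movie in ranked[:top_n]]


if __name__ == "__main__":
    print(recommend("dan", RATINGS))
```

In a web application, a function like `recommend` would typically sit behind a single endpoint that takes a user identifier and returns the ranked list.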

Challenge 2: Build a machine learning image classifier using web-scraped data

An image paints a thousand words... but with so many images out there how do you know which one is which? In this challenge you will learn how to web-scrape images from Google and use them to train/test an image-based machine learning classifier. The aim is to come up with an image classification problem (cats vs. dogs, people vs. trees, Trump vs. an orange cheeto... etc), build your own database for training the algorithm by web-scraping the images, and then use machine learning to classify the subjects of the images. Depending on how successful you are, you might want to extend the challenge and include some location, recognition or saliency analysis. To help you get started we've put together a tutorial that shows you how to web-scrape images and then how to perform a simple image-based classification.
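To make the train/test idea concrete, here is a minimal sketch of an image classifier: each image is reduced to its average colour, and a nearest-centroid rule assigns the closest class. It is an illustration under stated assumptions, not the tutorial's method. The synthetic "sky" and "grass" images stand in for scraped data, and every function name here is hypothetical.

```python
import random


def mean_colour(image):
    """Feature vector: average (R, G, B) over all pixels of an image."""
    n = len(image)
    return tuple(sum(px[c] for px in image) / n for c in range(3))


def fit_centroids(images, labels):
    """Average the feature vectors of each class: one centroid per label."""
    sums, counts = {}, {}
    for image, label in zip(images, labels):
        feat = mean_colour(image)
        acc = sums.setdefault(label, [0.0, 0.0, 0.0])
        for c in range(3):
            acc[c] += feat[c]
        counts[label] = counts.get(label, 0) + 1
    return {lab: tuple(v / counts[lab] for v in acc) for lab, acc in sums.items()}


def predict(image, centroids):
    """Return the label whose centroid is closest to the image's mean colour."""
    feat = mean_colour(image)

    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(feat, centroid))

    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def accuracy(images, labels, centroids):
    """Fraction of images whose predicted label matches the true label."""
    hits = sum(predict(img, centroids) == lab for img, lab in zip(images, labels))
    return hits / len(labels)


def synthetic_image(base, rng, n_pixels=64, noise=30):
    """Stand-in for a scraped image (assumption): noisy pixels around a base colour."""
    return [
        tuple(min(255, max(0, ch + rng.randint(-noise, noise))) for ch in base)
        for _ in range(n_pixels)
    ]


def make_dataset(n_per_class, rng):
    """Build a labelled toy dataset with two hypothetical classes."""
    bases = {"sky": (80, 140, 220), "grass": (60, 170, 70)}
    images, labels = [], []
    for label, base in bases.items():
        for _ in range(n_per_class):
            images.append(synthetic_image(base, rng))
            labels.append(label)
    return images, labels


if __name__ == "__main__":
    rng = random.Random(0)
    train_x, train_y = make_dataset(20, rng)
    test_x, test_y = make_dataset(10, rng)  # held out, never used for fitting
    centroids = fit_centroids(train_x, train_y)
    print("test accuracy:", accuracy(test_x, test_y, centroids))
```

Real photographs are far less separable than these toy images, so expect to need richer features or a stronger model. The split into training and held-out test images stays the same.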

Challenge 3: Build a machine learning music classification system

Music recommendation systems are all over the internet, from Spotify to iTunes. But how do they know what music you will like? For this challenge you will build a machine learning application that classifies music using the content of the individual tracks. Your application could make recommendations for individuals, or it could suggest musical tracks that would be good in films, or it could automatically identify artists, or it could do something else! The choice is up to you. To help you get started we've put together a tutorial that shows you how to extract machine learning features from audio files and then use them with a variety of machine learning algorithms.
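As a small illustration of what "extracting features from audio" means, the sketch below computes two common hand-crafted features, zero-crossing rate and RMS energy, from raw samples. The choice of features is an assumption rather than the tutorial's actual feature set, and the generated sine waves are a hypothetical stand-in for samples decoded from real audio files.

```python
import math


def sine_wave(freq_hz, duration_s=0.5, sample_rate=8000, amplitude=1.0):
    """Generate a pure tone (assumption: stands in for decoded audio samples)."""
    n = int(duration_s * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n)
    ]


def zero_crossing_rate(samples):
    """Fraction of consecutive sample pairs where the signal changes sign.
    Higher-pitched or noisier audio tends to give a higher rate."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def rms_energy(samples):
    """Root-mean-square amplitude, a simple loudness measure."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def extract_features(samples):
    """Bundle the features into a dict that a classifier could consume."""
    return {"zcr": zero_crossing_rate(samples), "rms": rms_energy(samples)}


if __name__ == "__main__":
    for freq in (220, 880):
        print(freq, "Hz ->", extract_features(sine_wave(freq)))
```

Feature vectors like these, computed for every track, can then be passed to any standard classifier together with labels such as genre or artist.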


Not at the hackathon, but want to test your code-building skills? Feel free!

> git clone https://github.com/darabigdata/IDWBotswana.git

Then make sure you have the right Python libraries for the tutorials. They can all be installed using pip and the requirements.txt file in the repo:

> pip install -r requirements.txt

DARA Big Data is supported by: