Collecting, making sense of, and classifying data has never been more important. Using Python, students will learn how to:
- Collect dispersed but publicly available data using tools such as Selenium for web scraping, and application programming interfaces (APIs).
- Analyse data using dataframes and arrays, with special attention to how to optimise code for big-data problems using just-in-time compilers such as Numba.
- Make sense of, and perform quality control on, high-dimensional data sets, through dimensionality-reduction methods such as Principal Component Analysis (PCA) and multi-dimensional scaling (MDS).
- Exploit recent advances in Artificial Intelligence (AI) software, such as Keras, to classify data using neural networks.
By the end of the module, students will be able to:
- Refer to, and adapt, a code base of taught examples covering data collection, analysis, dimensionality reduction and classification.
- Build their own code base around a personally selected problem touching on the areas taught.
- Demonstrate a good understanding of the ethics of, and common problems encountered in, data collection, processing and classification.
- Experience in Python will be a bonus. For those new to Python, I recommend working through the Learn Python “Learn the Basics” and “Data Science Tutorials” chapters before the module: https://www.learnpython.org/
- Optional reading for a fuller background to Data Science in Python: https://jakevdp.github.io/PythonDataScienceHandbook/
- Day 1: Python, version control and collecting publicly available data.
- Day 2: Data science in Python and machine learning
- Day 3: Large, multidimensional datasets
- Day 4: Classification of data using neural networks
- Anaconda3 (for Python 3 notebooks). We will mainly be using Python notebooks for this module, with specific packages to install for each day, listed on that day's 'To do before the start' page.
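Once Anaconda3 is installed, extra packages can be added from the command line. The package names below are illustrative examples only, not the official per-day lists, which are given on each day's 'To do before the start' page.

```shell
# Example only: install packages of the kind used in the module
conda install numpy pandas
conda install -c conda-forge numba keras

# Launch the notebook interface
jupyter notebook
```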