The aim of today's session is to take you through a 'typical' data science workflow. We won't go into much detail; instead we aim to show you all the parts and how they fit together, and as the course progresses we'll fill in the details.
By the end you should have an understanding of the basic parts of a data science project, and be starting to get familiar with using git, Python and Jupyter notebooks.
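To make the stages of that 'typical' workflow concrete, here is a minimal sketch in plain Python. The dataset, column names, and numbers are entirely made up for illustration; a real project would pull data from a file, database, or API, and would use libraries like pandas for these steps.

```python
# A made-up sketch of the stages in a 'typical' data science workflow.
import statistics

# 1. Acquire: in a real project this might come from a CSV, database, or API.
raw_records = [
    {"site": "A", "tonnes": 120.0},
    {"site": "B", "tonnes": None},   # missing measurement to clean out
    {"site": "C", "tonnes": 95.5},
    {"site": "D", "tonnes": 110.2},
]

# 2. Clean: drop records with missing measurements.
clean_records = [r for r in raw_records if r["tonnes"] is not None]

# 3. Explore: summary statistics guide later modelling choices.
tonnage = [r["tonnes"] for r in clean_records]
summary = {
    "n": len(tonnage),
    "mean": statistics.mean(tonnage),
    "stdev": statistics.stdev(tonnage),
}

# 4. Model, then 5. Communicate: even a trivial baseline (predict the mean)
# gives stakeholders something concrete to react to and improve on.
baseline_prediction = summary["mean"]
print(summary)
```

Each stage here is deliberately tiny; over the coming weeks we'll expand every one of them into its own set of tools and techniques.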
Throughout the course we will stick some interesting links in this document for you to follow up at your leisure. These are not compulsory (we realise everyone is busy!) but if you've got some downtime on the train they might be interesting to look at.
- Building a data culture in your enterprise: "You Don’t Need a Data Scientist, You Need a Data Culture"; the same group at the MIT Media Lab has published a number of other good articles in a similar vein.
- There are a number of interesting articles about data science in the resources industry specifically; for example, take a look at how companies are using data science as part of a broader automation push.
We will also be talking a bit about building momentum when delivering data science projects; here are a few links on the topic:
- Examples of enterprise-level data science workflows, e.g. the Apteo Data Science Workflow and similar write-ups available online.