This is the repo for my "A Beginner's Guide to Machine Learning with Scikit-Learn" talk, given at the PyData NYC 2013 and PyTennessee 2014 conferences. The material here is more comprehensive than what I covered in the talk. A video of the talk from PyData is available at http://vimeo.com/79517341, and the slides are at http://www.slideshare.net/SarahGuido/a-beginners-guide-to-machine-learning-with-scikitlearn
Abstract:
Scikit-learn is one of the best-known Python modules for machine learning. But how does it work, and what, for that matter, is machine learning? This talk gives a beginner-level overview of how machine learning can be useful, introduces important concepts such as supervised and unsupervised learning, and shows how to implement them with scikit-learn using real-world data.
The main notebook for the presentation can be found here: http://nbviewer.ipython.org/gist/sarguido/9008376
Other materials you may find helpful:
- Data preprocessing: http://nbviewer.ipython.org/gist/sarguido/7423289
- Supervised learning: http://nbviewer.ipython.org/gist/sarguido/8969870
- Unsupervised learning: http://nbviewer.ipython.org/gist/sarguido/8969887
- Testing and validation: http://nbviewer.ipython.org/gist/sarguido/8969894
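
For a quick feel of how the four topics above fit together, here is a minimal sketch. It is not taken from the notebooks: it assumes a recent scikit-learn release (where `train_test_split` lives in `sklearn.model_selection`), and the choice of the bundled iris dataset, a k-nearest-neighbors classifier, and k-means clustering is purely illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Illustrative data: the iris dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Testing and validation: hold out a quarter of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Data preprocessing: fit the scaler on the training rows only,
# then apply the same transformation to the held-out rows.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Supervised learning: the classifier sees the labels during fitting.
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train_scaled, y_train)
accuracy = clf.score(X_test_scaled, y_test)

# Unsupervised learning: k-means groups the rows without using the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(scaler.transform(X))

print("Held-out accuracy:", accuracy)
print("First ten cluster assignments:", cluster_labels[:10])
```

Older scikit-learn versions, such as those current when the talk was given, exposed `train_test_split` from a different module, so the import may need adjusting if you run this alongside the original notebooks.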