Applied Data Analysis and Machine Learning

This site contains all material relevant for the course on Applied Data Analysis and Machine Learning.

Introduction

Probability theory and statistical methods play a central role in Science. Nowadays we are surrounded by huge amounts of data. For example, there are more than one trillion web pages; more than one hour of video is uploaded to YouTube every second, amounting to years of content every day; the genomes of 1000s of people, each of which has a length of more than a billion base pairs, have been sequenced by various labs and so on. This deluge of data calls for automated methods of data analysis, which is exactly what machine learning aims at providing.

Learning outcomes

This course aims at giving you insights and knowledge about many of the central algorithms used in Data Analysis and Machine Learning. The course is project based and through various numerical projects and weekly exercises you will be exposed to fundamental research problems in these fields, with the aim to reproduce state of the art scientific results. Both supervised and unsupervised methods will be covered. The approach is mainly frequentist, with an emphasis on predictions and correlations. However, we will try, where appropriate, to link our machine learning models with a Bayesian approach as well. You will learn to develop and structure large codes for studying different cases where Machine Learning is applied, get acquainted with computing facilities and learn to handle large scientific projects. Good scientific and ethical conduct is emphasized throughout the course. More specifically, after this course you will

Learn about basic data analysis, statistical analysis, Bayesian statistics, Monte Carlo sampling, data optimization and machine learning;
Be capable of extending the acquired knowledge to other systems and cases;
Have an understanding of central algorithms used in data analysis and machine learning;
Understand linear methods for regression and classification, from ordinary least squares, via Lasso and Ridge to Logistic regression and Kernel regression;
Learn about neural networks and deep learning methods for supervised and unsupervised learning. Emphasis on feed forward neural networks, convolutional and recurrent neural networks;
Learn about decision trees, random forests, bagging and boosting methods;
Learn about support vector machines and kernel transformations;
Reduction of data sets and unsupervised learning, from PCA to clustering;
Autoencoders and Reinforcement Learning;
Work on numerical projects to illustrate the theory. The projects play a central role and you are expected to know modern programming languages like Python or C++ and/or Fortran (Fortran2003 or later).

Prerequisites and background

Basic knowledge in programming and mathematics, with an emphasis on linear algebra. Knowledge of Python and/or C++ as programming languages is strongly recommended and experience with Jupyter notebooks is recommended. Required courses are the equivalents of the University of Oslo mathematics courses MAT1100, MAT1110, MAT1120 and at least one of the corresponding computing and programming courses INF1000/INF1110 or MAT-INF1100/MAT-INF1100L/BIOS1100/KJM-INF1100. Most universities nowadays offer a basic programming course (often compulsory) where Python is the recurring programming language. We also recommend refreshing your knowledge of Statistics and Probability theory. The lecture notes at https://compphysics.github.io/MachineLearning/doc/LectureNotes/_build/html/intro.html offer a review of Statistics and Probability theory.

The course has two central parts

Statistical analysis and optimization of data
Machine learning

Statistical analysis and optimization of data

The following topics will be covered

Basic concepts, expectation values, variance, covariance, correlation functions and errors;
Simpler models, binomial distribution, the Poisson distribution, simple and multivariate normal distributions;
Central elements of Bayesian statistics and modeling;
Gradient methods for data optimization;
Monte Carlo methods, Markov chains, Gibbs sampling and Metropolis-Hastings sampling;
Estimation of errors and resampling techniques such as the cross-validation, blocking, bootstrapping and jackknife methods;
Principal Component Analysis (PCA) and its mathematical foundation

Machine learning

The following topics will be covered:

Linear Regression and Logistic Regression;
Neural networks and deep learning, including convolutional and recurrent neural networks
Decision trees, Random Forests, Bagging and Boosting
Support vector machines
Bayesian linear and logistic regression
Boltzmann Machines
Unsupervised learning: dimensionality reduction, PCA, k-means and clustering
Autoencoders

Hands-on demonstrations, exercises and projects aim at deepening your understanding of these topics.

Computational aspects play a central role and you are expected to work on numerical examples and projects which illustrate the theory and various algorithms discussed during the lectures. We strongly recommend forming small project groups of 2-3 participants, if possible.

Instructor information

Name: Morten Hjorth-Jensen
Email: [email protected]
Phone: +47-48257387
Office: Department of Physics, University of Oslo, Eastern wing, room FØ470
Office hours: Anytime! In Fall Semester 2021 we hope to be able to meet in person, if allowed by the University of Oslo's COVID-19 instructions (see below for links). Individual or group office hours can be held either in person or via Zoom. Feel free to send an email for planning.

Teaching Assistants FS21

Øyvind Sigmundson Schøyen, [email protected]
Stian Bilek, [email protected]
Linus Ekstrøm, [email protected], [email protected]
Nicholas Karlsen, [email protected], [email protected]
Bendik Steinsvåg Dalen, [email protected]
Philip Karim Sørli Niane, [email protected]

Practicalities

Four lectures per week, Fall semester, 10 ECTS. The lectures will be recorded and linked to this site and the official University of Oslo website for the course;
Two hours of laboratory sessions for work on computational projects and exercises for each group. Due to social distancing, at most 15 participants can attend. There will also be fully digital laboratory sessions for those who cannot attend;
Three projects which are graded and count 1/3 each of the final grade;
A selected number of weekly assignments;
The course is part of the CS Master of Science program, but is open to other bachelor and Master of Science students at the University of Oslo;
The course is offered as a so-called cloned course, FYS-STK4155 at the Master of Science level and FYS-STK3155 as a senior undergraduate course;
Videos of teaching material are available via the links at https://compphysics.github.io/MachineLearning/doc/web/course.html;
Weekly email with summary of activities will be mailed to all participants;

Grading

Grading scale: Grades are awarded on a scale from A to F, where A is the best grade and F is a fail. There are three projects which are graded and each project counts 1/3 of the final grade. The total score is thus the average from all three projects.

The final number of points is based on the average of all projects (including any additional points) and the grade is determined by the following table:

92-100 points: A
77-91 points: B
58-76 points: C
46-57 points: D
40-45 points: E
0-39 points: F-failed
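
As a small illustration of the arithmetic, the sketch below averages three equally weighted project scores and looks up the letter grade in the table above. The function name `final_grade` is hypothetical, and comparing a fractional average directly against the lower bound of each interval (without rounding) is an assumption of this sketch, not a stated course rule.

```python
# Lower bounds of each grade interval, copied from the grading table.
GRADE_BOUNDS = [(92, "A"), (77, "B"), (58, "C"), (46, "D"), (40, "E")]


def final_grade(project_scores):
    """Return (average, letter) for three equally weighted project scores.

    Assumption: a fractional average is compared to the lower bounds
    as-is, so for example 91.5 maps to B. The course may round differently.
    """
    if len(project_scores) != 3:
        raise ValueError("expected exactly three project scores")
    average = sum(project_scores) / 3
    for lower_bound, letter in GRADE_BOUNDS:
        if average >= lower_bound:
            return average, letter
    return average, "F"


if __name__ == "__main__":
    average, letter = final_grade([85, 90, 80])
    print(f"average = {average:.1f}, grade = {letter}")
```

With project scores of 85, 90 and 80 the average is 85 points, which falls in the 77-91 interval.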

Required Technologies

Course participants are expected to have their own laptops/PCs. We use Git as version control software, and the use of providers like GitHub, GitLab or similar is strongly recommended. If you are not familiar with Git as version control software, the following video may be of interest: https://www.youtube.com/watch?v=RGOj5yH7evk&ab_channel=freeCodeCamp.org

We will make extensive use of Python as programming language and its myriad of available libraries. You will find Jupyter notebooks invaluable in your work. You can run R codes in the Jupyter/IPython notebooks, with the immediate benefit of visualizing your data. You can also use compiled languages like C++, Rust, Julia, Fortran etc if you prefer. The focus in these lectures will be on Python.

If you have Python installed and you feel pretty familiar with installing different packages, we recommend that you install the following Python packages via pip as

pip install numpy scipy matplotlib ipython scikit-learn mglearn sympy pandas pillow
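
After running the pip command, a quick way to confirm that the packages are visible to your Python interpreter is sketched below. It only uses the standard library; the helper name `check_packages` is hypothetical, and the pip-name to import-name mapping reflects the usual import names of these packages.

```python
import importlib.util

# Import names differ from the pip names for some packages
# (ipython -> IPython, scikit-learn -> sklearn, pillow -> PIL).
PACKAGES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "ipython": "IPython",
    "scikit-learn": "sklearn",
    "mglearn": "mglearn",
    "sympy": "sympy",
    "pandas": "pandas",
    "pillow": "PIL",
}


def check_packages(packages=PACKAGES):
    """Map each pip name to True if its import name can be located."""
    return {
        pip_name: importlib.util.find_spec(module_name) is not None
        for pip_name, module_name in packages.items()
    }


if __name__ == "__main__":
    for pip_name, found in check_packages().items():
        print(f"{pip_name:15s} {'ok' if found else 'MISSING'}")
```

The check only locates each package without importing it, so it runs quickly even for large libraries.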

For OSX users we recommend, after having installed Xcode, to install brew. Brew allows for a seamless installation of additional software via for example

brew install python3

For Linux users, on for example the widely popular Ubuntu distribution, you can use pip as well and simply install Python as

sudo apt-get install python3

Python installers

If you don't want to perform these operations separately and venture into the hassle of exploring how to set up dependencies and paths, we recommend two widely used distributions which set up all relevant dependencies for Python, namely

Anaconda:https://docs.anaconda.com/,

which is an open source distribution of the Python and R programming languages for large-scale data processing, predictive analytics, and scientific computing, that aims to simplify package management and deployment. Package versions are managed by the package management system conda.

Enthought canopy:https://www.enthought.com/product/canopy/

which is a Python distribution and analysis environment for scientific and analytic computing, available for free and under a commercial license.

Furthermore, Google's Colab:https://colab.research.google.com/notebooks/welcome.ipynb is a free Jupyter notebook environment that requires no setup and runs entirely in the cloud. Try it out!

Useful Python libraries

Here we list several useful Python libraries we strongly recommend (if you use anaconda many of these are already there)

NumPy:https://www.numpy.org/ is a highly popular library for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays
The pandas:https://pandas.pydata.org/ library provides high-performance, easy-to-use data structures and data analysis tools
Xarray:http://xarray.pydata.org/en/stable/ is a Python package that makes working with labelled multi-dimensional arrays simple, efficient, and fun!
Scipy:https://www.scipy.org/ (pronounced “Sigh Pie”) is a Python-based ecosystem of open-source software for mathematics, science, and engineering.
Matplotlib:https://matplotlib.org/ is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms.
Autograd:https://github.com/HIPS/autograd can automatically differentiate native Python and Numpy code. It can handle a large subset of Python's features, including loops, ifs, recursion and closures, and it can even take derivatives of derivatives of derivatives
SymPy:https://www.sympy.org/en/index.html is a Python library for symbolic mathematics.
scikit-learn:https://scikit-learn.org/stable/ has simple and efficient tools for machine learning, data mining and data analysis
TensorFlow:https://www.tensorflow.org/ is a Python library for fast numerical computing created and released by Google
Keras:https://keras.io/ is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano
And many more such as pytorch:https://pytorch.org/, Theano:https://pypi.org/project/Theano/ etc

Textbooks

Recommended textbooks: The lecture notes are collected as a jupyter-book at https://compphysics.github.io/MachineLearning/doc/LectureNotes/_build/html/intro.html. In addition to the lecture notes, we recommend the books of Bishop and Goodfellow et al. We will follow these texts closely and the weekly reading assignments refer to these two texts.

Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, https://www.springer.com/gp/book/9780387310732. This is the main textbook and this course covers chapters 1-7, 11 and 12. You can download for free the textbook in PDF format at https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf
Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press. The different chapters are available for free at https://www.deeplearningbook.org/. Chapters 2-14 are highly recommended. The lectures follow this text to a good extent.

The weekly plans will include reading suggestions from these two textbooks. In addition, you may find the following textbooks interesting.

Additional textbooks:

Trevor Hastie, Robert Tibshirani, Jerome H. Friedman, The Elements of Statistical Learning, Springer, https://www.springer.com/gp/book/9780387848570. This is a well-known text and serves as additional literature.
Aurelien Geron, Hands‑On Machine Learning with Scikit‑Learn and TensorFlow, O'Reilly, https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/. This text is very useful since it contains many code examples and hands-on applications of all algorithms discussed in this course.

General learning book on statistical analysis:

Christian Robert and George Casella, Monte Carlo Statistical Methods, Springer
Peter Hoff, A first course in Bayesian statistical models, Springer

General Machine Learning Books:

Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press
David J.C. MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press
David Barber, Bayesian Reasoning and Machine Learning, Cambridge University Press

Links to relevant courses at the University of Oslo

The link here https://www.mn.uio.no/english/research/about/centre-focus/innovation/data-science/studies/ gives an excellent overview of courses on Machine learning at UiO.

STK2100 Machine learning and statistical methods for prediction and classification http://www.uio.no/studier/emner/matnat/math/STK2100/index-eng.html.
IN3050 Introduction to Artificial Intelligence and Machine Learning https://www.uio.no/studier/emner/matnat/ifi/IN3050/index-eng.html. Introductory course in machine learning and AI with an algorithmic approach.
STK-INF3000/4000 Selected Topics in Data Science http://www.uio.no/studier/emner/matnat/math/STK-INF3000/index-eng.html. The course provides insight into selected contemporary relevant topics within Data Science.
IN4080 Natural Language Processing https://www.uio.no/studier/emner/matnat/ifi/IN4080/index.html. Probabilistic and machine learning techniques applied to natural language processing.
STK-IN4300 Statistical learning methods in Data Science https://www.uio.no/studier/emner/matnat/math/STK-IN4300/index-eng.html. An advanced introduction to statistical and machine learning. For students with a good mathematics and statistics background.
INF4490 Biologically Inspired Computing http://www.uio.no/studier/emner/matnat/ifi/INF4490/. An introduction to self-adapting methods also called artificial intelligence or machine learning.
IN-STK5000 Adaptive Methods for Data-Based Decision Making https://www.uio.no/studier/emner/matnat/ifi/IN-STK5000/index-eng.html. Methods for adaptive collection and processing of data based on machine learning techniques.
IN5400/INF5860 Machine Learning for Image Analysis https://www.uio.no/studier/emner/matnat/ifi/IN5400/. An introduction to deep learning with particular emphasis on applications within Image analysis, but useful for other application areas too.
TEK5040 Deep learning for autonomous systems https://www.uio.no/studier/emner/matnat/its/TEK5040/. The course addresses advanced algorithms and architectures for deep learning with neural networks. The course provides an introduction to how deep-learning techniques can be used in the construction of key parts of advanced autonomous systems that exist in physical environments and cyber environments.
STK4051 Computational Statistics https://www.uio.no/studier/emner/matnat/math/STK4051/index-eng.html
STK4021 Applied Bayesian Analysis and Numerical Methods https://www.uio.no/studier/emner/matnat/math/STK4021/

Teaching schedule with links to material

This course will be delivered in a hybrid mode, with online lectures and on site or online laboratory sessions.

Four lectures per week, Fall semester, 10 ECTS. The lectures are in person but will be recorded and linked to this site and the official University of Oslo website for the course;
Two hours of laboratory sessions for work on computational projects and exercises for each group. There will also be fully digital laboratory sessions for those who cannot attend;
Three projects which are graded and count 1/3 each of the final grade;
A selected number of weekly assignments;
The course is part of the CS Master of Science program, but is open to other bachelor and Master of Science students at the University of Oslo;
The course is offered as a FYS-STK4155 (Master of Science level) and a FYS-STK3155 (senior undergraduate) course;
Videos of teaching material are available via the links at https://compphysics.github.io/MachineLearning/doc/web/course.html;
Weekly emails with summary of activities will be mailed to all participants;

Communication channels

Chat and communications via canvas.uio.no, GDPR safe
Slack channel: machinelearninguio.slack.com
Piazza: enlist at https://piazza.com/uio.no/fall2021/fysstk4155

Weekly Schedule

For the reading assignments we use the following abbreviations:

GBC: Goodfellow, Bengio, and Courville, Deep Learning
CMB: Christopher M. Bishop, Pattern Recognition and Machine Learning
HTF: Hastie, Tibshirani, and Friedman, The Elements of Statistical Learning
AG: Aurelien Geron, Hands‑On Machine Learning with Scikit‑Learn and TensorFlow

Recommended prereading: Chapters 1-2 (linear algebra) and chapter 3 (statistics) of Goodfellow et al. and Bishop chapter 1 and chapter 2. These chapters give a relevant background to the basic mathematical and statistical foundations of the course. Parts of these chapters will be covered during the lectures the first three weeks.

Week 34 August 23-27

Lab Wednesday: Introduction to software and repetition of Python Programming
Lecture Thursday: Introduction to the course, what is Machine Learning and introduction to Linear Regression.
Video of Lecture August 26, 2021 at https://www.uio.no/studier/emner/matnat/fys/FYS-STK4155/h21/forelesningsvideoer/LectureThursdayAugust26.mp4?vrtx=view-as-webpage
Lecture Friday: Basics of Linear Regression
Video of Lecture August 27, 2021 at https://www.uio.no/studier/emner/matnat/fys/FYS-STK4155/h21/forelesningsvideoer/LectureThursdayAugust27.mp4?vrtx=view-as-webpage
Reading recommendations:
- Refresh linear algebra, GBC chapters 1 and 2.
- CMB sections 1.1 and 3.1.
- HTF chapters 2 and 3.
- See lecture notes for week 34 at https://compphysics.github.io/MachineLearning/doc/web/course.html

Week 35 August 30-September 3

Lab Wednesday: Work on exercises 1-3 for week 35
Thursday: Review of ordinary Least Squares with applications and discussion of Ridge Regression and Singular Value Decomposition
Video of lecture Thursday at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureSeptember2.mp4?vrtx=view-as-webpage.
Friday: Analysis of Ridge and Lasso Regression and links with Singular Value Decomposition
Video of lecture Friday at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureSeptember3.mp4?vrtx=view-as-webpage
Reading recommendations:
- See lecture notes for week 35 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- HTF chapter 3. GBC chapters 1 and sections 3.1-3.11 and 5.1
- CMB sections 1.1 and 3.1

Week 36 September 6-10

Lab Wednesday: Exercises 1 and 2 from week 36
Lecture Thursday: Summary from last week on SVD, Statistics, probability theory and linear regression
Video of Lecture https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureSeptember9.mp4?vrtx=view-as-webpage
Friday: Linear Regression and links with Statistics, Resampling methods and presentation of first project.
Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK4155/h21/forelesningsvideoer/LectureSeptember10.mp4?vrtx=view-as-webpage
Reading recommendations:
- Lectures on Regression for week 36 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- Bishop 1.1, 1.2, 2.1, 2.2, 2.3 and 3.1
- Hastie et al chapter 3

Week 37 September 13-17

Lab Wednesday: Work on Project 1
Lecture Thursday: Resampling methods, cross-validation and Bootstrap
- Video of Lecture, first part at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureSeptember16Firstpart.mp4?vrtx=view-as-webpage
- Video of Lecture, second part at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureSeptember16SecondPart.mp4?vrtx=view-as-webpage
Lecture Friday: More on Resampling methods and summary of linear regression
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureSeptember17.mp4?vrtx=view-as-webpage
Reading recommendations:
- Lectures on Resampling methods for week 37 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- Bishop 1.3 (cross-validation) and 3.2 (bias-variance tradeoff)
- Hastie et al Chapter 7, here we recommend 7.1-7.5 and 7.10 (cross-validation) and 7.11 (bootstrap)
- Goodfellow et al discuss some of these topics in sections 5.2-5.5.

Week 38 September 20-24

Lab Wednesday: Work on Project 1
Lecture Thursday: Classification problems and Logistic Regression, from binary cases to several categories
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK4155/h21/forelesningsvideoer/LectureSeptember23.mp4?vrtx=view-as-webpage
Lecture Friday: Logistic Regression and start discussions of gradient optimization
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureSeptember24.mp4?vrtx=view-as-webpage
Reading recommendations:
- See lecture notes for week 38 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- Bishop 4.1, 4.2 and 4.3. Not all the material is relevant or will be covered. Section 4.3 is the most relevant, but 4.1 and 4.2 give interesting background readings for logistic regression
- Hastie et al 4.1, 4.2 and 4.3 on logistic regression
- For a good discussion on gradient methods, see Goodfellow et al section 4.3-4.5 and chapter 8. We will come back to the latter chapter in our discussion of Neural networks as well.

Week 39 September 27- October 1

Lab Wednesday: Work on Project 1
Lecture Thursday: Gradient Optimization methods
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureSeptember30.mp4?vrtx=view-as-webpage
Lecture Friday: Gradient methods
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureOctober1.mp4?vrtx=view-as-webpage
Reading recommendations:
- See lecture notes for week 39 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- For a good discussion on gradient methods, see Goodfellow et al section 4.3-4.5 and chapter 8. We will come back to the latter chapter in our discussion of Neural networks as well.

Week 40 October 4-8

Lab Wednesday: Wrap up project 1
Lecture Thursday: Stochastic gradient methods and start discussion of neural networks
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureOctober7.mp4?vrtx=view-as-webpage
Lecture Friday: Deep Learning and Neural Networks
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureOctober8.mp4?vrtx=view-as-webpage
Reading recommendations:
- See lecture notes for week 40 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- For neural networks we recommend Goodfellow et al chapters 6 and 7 and Bishop 5.1-5.4

Week 41 October 11-15

Lab Wednesday: Work on project 2
Lecture Thursday: Deep learning and Neural Networks
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureOctober14.mp4?vrtx=view-as-webpage
Lecture Friday: Tensorflow and the mathematics of neural networks
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK4155/h21/forelesningsvideoer/LectureOctober15.mp4?vrtx=view-as-webpage
Reading recommendations:
- See lecture notes for week 41 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- For neural networks we recommend Goodfellow et al chapters 6 and 7. For CNNs, see Goodfellow et al chapter 9. See also chapters 11 and 12 on practicalities and applications, and Aurelien Geron's chapters 10-11 at https://github.com/CompPhysics/MachineLearning/blob/master/doc/Textbooks/TensorflowML.pdf.

Week 42 October 18-22

Lab Wednesday: Work on project 2
Lecture Thursday: Solving differential equations with neural networks and start Convolutional Neural Networks and classification problems
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureOctober21.mp4?vrtx=view-as-webpage
Lecture Friday: Convolutional Neural Networks and classification problems
- Video of lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK4155/h21/forelesningsvideoer/LectureOctober22.mp4?vrtx=view-as-webpage
Reading recommendations:
- See lecture notes for week 42 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- For neural networks we recommend Goodfellow et al chapters 6 and 7. For CNNs, see Goodfellow et al chapter 9. See also chapter 11 and 12 on practicalities and applications

Week 43 October 25-29

Lab Wednesday: Work on project 2
Lecture Thursday: Recurrent Neural Networks
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureOctober28.mp4?vrtx=view-as-webpage
Lecture Friday: Recurrent Neural Networks and time series and principal component analysis (PCA)
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureOctober29.mp4?vrtx=view-as-webpage
Reading recommendations:
- See lecture notes for week 43 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- For RNNs, see Goodfellow et al chapter 10 and discussions in chapter 11 and 12 on practicalities and applications
- For PCA, see lecture notes chapter 11 and Geron's text chapter 8

Week 44 November 1-5

Lab Wednesday: Work on project 2
Lecture Thursday: Summary on PCA and discussion of Clustering for unsupervised learning. Decision trees, classification and regression
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureNovember4.mp4?vrtx=view-as-webpage
Lecture Friday: Decision trees, basic algorithms
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureNovember5.mp4?vrtx=view-as-webpage
Reading recommendations:
- See lecture notes for week 44 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- Hastie et al sections 9.1 and 9.2. Geron's text chapter 6 (Decision trees) and chapter 8 on PCA and Clustering

Week 45 November 8-12

Lab Wednesday: Work on project 2, project 3 available Friday 12th. Deadline project 2 is November 20.
Lecture Thursday: Decision Trees and Ensemble methods, Bagging and Voting
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureNovember11.mp4?vrtx=view-as-webpage
Lecture Friday: Ensemble Methods, Random Forests, Boosting and gradient boosting
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureNovember12.mp4?vrtx=view-as-webpage
Reading recommendations:
- See lecture notes for week 45 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- Decision Trees: Geron's chapter 6 covers decision trees while ensemble models, voting and bagging are discussed in chapter 7.
- See also STK-IN4300, lecture 7, at https://www.uio.no/studier/emner/matnat/math/STK-IN4300/h20/slides/lecture_7.pdf.
- Section 9.2 of Hastie et al also contains a good discussion.

Week 46 November 15-19

Lab Wednesday: Work on project 3
Lecture Thursday: Support Vector machines. Summary Ensemble Methods, Random Forests, Boosting and gradient boosting
Lecture Friday: Workshop on project 3
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK4155/h21/forelesningsvideoer/LectureNovember19.mp4?vrtx=view-as-webpage
Reading recommendations:
- See lecture notes for week 46 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- Hastie et al chapter 12
- Bishop chapter 7.1 and 7.2

Week 47 November 22-26

Lab Wednesday: Work on project 3
Lecture Thursday: Support Vector Machines
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK4155/h21/forelesningsvideoer/LectureNovember25.mp4?vrtx=view-as-webpage
Lecture Friday: Support Vector Machines and Summary of Course
- Video of Lecture at https://www.uio.no/studier/emner/matnat/fys/FYS-STK3155/h21/forelesningsvideoer/LectureNovember26.mp4?vrtx=view-as-webpage
Reading recommendations:
- See lecture notes for week 47 at https://compphysics.github.io/MachineLearning/doc/web/course.html.
- Geron's chapter 5.
- Hastie et al Chapter 12 (sections 12.1-12.3 are the most relevant ones)
- Bishop chapter 7, with sections 7.1 and 7.2 as the essential ones
