title | shorttitle | layout |
---|---|---|
Topic Index |
Topics |
default |
These are not listed in the order of when they will be covered, or even the depth in which they will be covered (this is one course, after all). But these are all topics we will be touching on. Some will be covered in class, some in homework, some in lab, and some you will be expected to read on your own.
Expect this list to either shrink, or for some topics to be replaced, as the semester goes on!
- why this course
- problems you can solve
- the Box loop
- representing models graphically
- the basics of Probability
- probability mass functions, densities and cumulative functions
- distributions example
- expectations and integration
- Gaussian Distribution
- Bernoulli Distribution
- Binomial Distribution
- Poisson Distribution
- Exponential Distribution
- law of large numbers
- pdf's vs sampling
- monte-carlo for integrals
- sampling and central limit theorem
- Frequentist principles with sampling distributions and bootstrap
- frequentist example
- Sampling distributions and bootstrap
- p-values and confidence intervals
- sampling vs simulation
- the inverse method from the cdf
- rejection sampling
- importance sampling
- SIR
- 2D and marginals from a sampling perspective
- maximum likelihood and log-likelihood
- density estimation vs supervised learning
- covariates and linear regression: decision risk
- logistic regression
- approximation (ERM) vs Statistics
- bias and variance
- cross-validation
- regularization
- classification via decision risk
- basic optimization
- gradient free methods
- gradient based methods
- stochastic gradient descent(SGD)
- convexity and Jensen's inequality
- theano and automatic differentiation
- SGD using Theano for logistic regression
- entropy and cross-entropy
- KL divergence and deviance
- model comparison with likelihood ratios and AIC
- maximum entropy distributions: binomial and normal
- the exponential family of distributions
- statistical mechanics: stationarity and the ensembles
- the boltzmann distribution
- combinatoric optimization methods
- markov chains
- simulated annealing
- the simulated annealing markov chain
- the traveling salesman problem
Hidden variables and learning
- hidden variables
- mixture models and unsupervised learning
- generative vs discriminative models
- missing data and Data Augmentation
- the expectation maximization algorithm
- EM algorithm, statistical version
- Applications of EM
- the meaning of bayes theorem
- MLE of a binomial and beta-binomial bayesian updating
- the formal structure of bayesian inference and the globe throw example
- posteriors, marginal posteriors and posterior predictives
- frequentist equivalences to bayesian stats
- priors and their choice
- MAP, plugin predictive, and point estimates
- posterior predictive intervals
- shrinkage and regularization
- empirical bayes and the (ever more) bayesian hierarchy
- hierarchical models and regularization: using empirical bayes
- combining multiple experiments: bayesian meta-analysis
- point estimates from decision theory: decision risk
- the bayesian structure of machine learning through posterior predictives
- generative models revisited and LDA
- hyper-parameters in a bayesian setup.
- are we playing with parameters or with models?
- multistage decision analysis
- when is MCMC needed? (why not always use importance sampling)
- details of the markov chain and the proposal distribution
- how to write a Metropolis-Hastings (MH) sampler
- MCMC convergence tuning and diagnostics: burnin, thinning, and autocorrelation
- the structure of pymc
- gibbs sampling, a simpler version of MCMC
- different kinds of gibbs
- the relationship of gibbs to Data Augmentation and EM
- Hierarchical model full bayesian: alternating MH and gibbs for different posteriors (rats)
- Missing data from a sampling perspective
- Convergence problems with MCMC and gibbs: correlations and efficiency
- Gelman Rubin and Gewecke tests
- External Validation of models using holdout sets
- Posterior predictive checking, posterior replications
- Posterior predictive p-values
- Interesting ideas to fix convergence issues
- Slice sampler
- Mechanics and Statistical Mechanics for HMC
- Hamiltonian Monte Carlo
- NUTS and other improvements on HMC
- HMC convergence vs others
- regression as bayesian updating
- normal prior as ridge regression
- exponential family and glms with a link function
- a bayesian glm example
- exposure and zero-inflation in glms
- overdispersion in glms
- hierarchical GLMs: radon example
- out of sample performance
- evidence
- bayes ratios
- cross validation (LOO) for model selection
- BIC/WAIC/DIC etc measures: KL and deviance out of sample.
- model averaging and ensembles
- normal approximation
- marginal posterior modes with EM
- variational inference
- expectation propagation
- ADVI
- time series and dealing with conditional dependence on previous times
- Hidden Markov Models (HMM)
- viterbi and other algorithms
- stochastic processes
- Kalman filters
- Sequential Monte Carlo
- Particle Filters
- glms with a covariance in intercepts and slopes
- spatial autocorrelation in glms
- gaussian processes
- gaussian processes for regression
- the capacity of models
- bayesian non-parametrics
- Rat Tumors
- Kidney Cancer
- Oceanic tools
- Radon in houses
- Chimpanzees
- Drinking Monks