Commit d9fa6a9

add probabilistic programming lecture
1 parent 129c5e8 commit d9fa6a9

6 files changed: +225 -290 lines changed

README.md

Lines changed: 17 additions & 0 deletions
@@ -505,11 +505,28 @@ compilation for such hardware.
- [PDEs, Convolutions, and the Mathematics of Locality (Lecture)](https://youtu.be/apkyk8n0vBo)
- [PDEs, Convolutions, and the Mathematics of Locality (Notes)](https://mitmath.github.io/18337/lecture14/pdes_and_convolutions)

### Additional Readings

- [Deep Neural Networks Motivated by Partial Differential Equations](https://arxiv.org/abs/1804.04272)

In this lecture we will continue to relate the methods of machine learning to
those in scientific computing by looking at the relationship between convolutional
neural networks and partial differential equations. It turns out they are more
than just similar: the two are both stencil computations on spatial data!
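
To make the stencil claim concrete, here is a minimal Julia sketch (a toy example on a hypothetical 1D grid, not code from the lecture) showing that the centered finite-difference approximation of a second derivative is the same computation as a convolution with a fixed kernel:

```julia
# A toy sketch in plain Julia: the centered second-difference stencil for u''
# on a 1D grid is the same operation as a convolution with the fixed kernel
# [1, -2, 1] / dx^2 (grid and data here are illustrative).

n  = 101
dx = 1 / (n - 1)
x  = range(0, 1, length = n)
u  = sin.(2π .* x)                      # smooth data sampled on the grid

# PDE view: finite-difference Laplacian at the interior points
lap_fd = [(u[i-1] - 2 * u[i] + u[i+1]) / dx^2 for i in 2:n-1]

# CNN view: sweep a 3-point convolution kernel w over the same data
w = [1.0, -2.0, 1.0] ./ dx^2
lap_conv = [sum(w .* u[i-1:i+1]) for i in 2:n-1]

@assert lap_fd ≈ lap_conv               # identical up to floating-point roundoff
```

The only difference in a convolutional neural network is that the kernel entries are learned parameters rather than values fixed by the discretization.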

## Lecture 16: Probabilistic Programming

- [From Optimization to Probabilistic Programming (Lecture)](https://youtu.be/32rAwtTAGdU)
- [From Optimization to Probabilistic Programming (Notes)](https://mitmath.github.io/18337/lecture16/probabilistic_programming)

All of our previous discussions lived in a deterministic world. Not this one. Here we turn to a probabilistic view and allow programs to have random variables. Forward simulation of a random program is simple: just run it repeatedly, i.e. Monte Carlo sampling. However, parameter estimation is now much more involved, since we need to estimate not just values but probability distributions. Bayes' rule gives a framework for performing such estimations. Classical parameter estimation falls out as a maximization of probability with the "simplest" form of distributions, so the Bayesian view generalizes standard parameter estimation and justifies the use of L2 loss functions and regularization (as a perturbation by a prior). Next we turn to estimating the distributions themselves, which is possible for small problems using Metropolis-Hastings, but for larger problems we develop Hamiltonian Monte Carlo. Hamiltonian Monte Carlo has strong ties to both ODEs and differentiable programming: it is defined by solving ODEs that arise from a Hamiltonian, and it requires derivatives of the likelihood, which is essentially the same idea as derivatives of cost functions! We then describe an alternative approach, Automatic Differentiation Variational Inference (ADVI), which once again uses the tools of differentiable programming to estimate the distributions of probabilistic programs.
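
As a small illustration of the sampling side of this lecture, here is a minimal random-walk Metropolis-Hastings sketch in plain Julia (a hypothetical toy model, not the lecture's code), estimating the posterior of a single mean parameter:

```julia
# A toy random-walk Metropolis-Hastings sampler in plain Julia: estimate the
# posterior of a single mean parameter μ given Gaussian data, with a Gaussian
# prior on μ (illustrative model; not the lecture's implementation).

using Random, Statistics
Random.seed!(1)

data = 2.0 .+ randn(50)                      # hypothetical observations, true mean 2

# log posterior up to an additive constant: log p(data | μ) + log p(μ)
logpost(μ) = -0.5 * sum((data .- μ).^2) - 0.5 * μ^2 / 10^2

function metropolis_hastings(logp, μ0; iters = 10_000, stepsize = 0.25)
    chain = Vector{Float64}(undef, iters)
    μ, lp = μ0, logp(μ0)
    for i in 1:iters
        μnew  = μ + stepsize * randn()       # symmetric random-walk proposal
        lpnew = logp(μnew)
        if log(rand()) < lpnew - lp          # accept with prob. min(1, exp(lpnew - lp))
            μ, lp = μnew, lpnew
        end
        chain[i] = μ
    end
    return chain
end

chain = metropolis_hastings(logpost, 0.0)
println("posterior mean ≈ ", mean(chain[2001:end]))   # discard burn-in samples
```

Maximizing `logpost` instead of sampling from it recovers exactly the regularized least-squares estimate described above, while Hamiltonian Monte Carlo replaces the random-walk proposal with simulated Hamiltonian dynamics driven by the gradient of `logpost`.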

## Lecture 17: Global Sensitivity Analysis

- [Global Sensitivity Analysis (Notes)](https://mitmath.github.io/18337/lecture17/global_sensitivity)

Our previous analysis of sensitivities was all local. What does it mean to examine the sensitivities of a model globally? It turns out the probabilistic programming viewpoint gives us a solid way of describing how we expect outputs to change over larger sets of parameters, via the random variables that describe the program's inputs. This means we can decompose the output variance into sensitivity indices, computable via various quadrature and Monte Carlo approximations, which turn statements like "variable x has no effect on the mean solution" into tractable measurements.
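
As a rough sketch of what such a variance decomposition looks like in practice, here is a plain-Julia Monte Carlo estimate of first-order Sobol indices for a hypothetical three-input model (the model `f` and sample sizes are illustrative, not taken from the notes):

```julia
# A toy Monte Carlo estimate of first-order Sobol indices in plain Julia: the
# fraction of the output variance explained by each input on its own, computed
# with the classic two-sample-matrix estimator (model and sizes are illustrative).

using Random, Statistics
Random.seed!(1)

f(x) = x[1] + 2 * x[2] + 0.1 * x[3]^2        # hypothetical model with 3 uncertain inputs

N, d = 100_000, 3
A = rand(N, d)                               # two independent sets of input samples,
B = rand(N, d)                               # each input ~ Uniform(0, 1)

yA = [f(A[i, :]) for i in 1:N]
yB = [f(B[i, :]) for i in 1:N]
V  = var(yA)                                 # total output variance

S = zeros(d)
for j in 1:d
    ABj = copy(A); ABj[:, j] = B[:, j]       # resample only input j
    yABj = [f(ABj[i, :]) for i in 1:N]
    S[j] = mean(yB .* (yABj .- yA)) / V      # Var(E[y | x_j]) / Var(y)
end

println("first-order Sobol indices ≈ ", round.(S, digits = 2))
```

For this additive toy model the second input should dominate the variance while the third contributes almost nothing, which is exactly the kind of statement that global sensitivity indices quantify.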

## Lecture 18: Code Profiling and Optimization

- [Code Profiling and Optimization (Lecture)](https://youtu.be/h-xVBD2Pk9o)
