Commit d9fa6a9

add probabilistic programming lecture
1 parent 129c5e8 commit d9fa6a9

6 files changed: +225 -290 lines changed

README.md

Lines changed: 17 additions & 0 deletions
@@ -505,11 +505,28 @@ compilation for such hardware.
- [PDEs, Convolutions, and the Mathematics of Locality (Lecture)](https://youtu.be/apkyk8n0vBo)
- [PDEs, Convolutions, and the Mathematics of Locality (Notes)](https://mitmath.github.io/18337/lecture14/pdes_and_convolutions)

### Additional Readings

- [Deep Neural Networks Motivated by Partial Differential Equations](https://arxiv.org/abs/1804.04272)

In this lecture we will continue to relate the methods of machine learning to
those in scientific computing by looking at the relationship between convolutional
neural networks and partial differential equations. It turns out they are more
than just similar: the two are both stencil computations on spatial data!
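
To make the stencil claim concrete, here is a minimal Julia sketch (a toy example on a hypothetical 1D grid, not code from the lecture) showing that the centered finite-difference approximation of a second derivative is the same computation as a convolution with a fixed kernel:

```julia
# A toy sketch in plain Julia: the centered second-difference stencil for u''
# on a 1D grid is the same operation as a convolution with the fixed kernel
# [1, -2, 1] / dx^2 (grid and data here are illustrative).

n  = 101
dx = 1 / (n - 1)
x  = range(0, 1, length = n)
u  = sin.(2π .* x)                      # smooth data sampled on the grid

# PDE view: finite-difference Laplacian at the interior points
lap_fd = [(u[i-1] - 2 * u[i] + u[i+1]) / dx^2 for i in 2:n-1]

# CNN view: sweep a 3-point convolution kernel w over the same data
w = [1.0, -2.0, 1.0] ./ dx^2
lap_conv = [sum(w .* u[i-1:i+1]) for i in 2:n-1]

@assert lap_fd ≈ lap_conv               # identical up to floating-point roundoff
```

The only difference in a convolutional neural network is that the kernel entries are learned parameters rather than values fixed by the discretization.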

## Lecture 16: Probabilistic Programming

- [From Optimization to Probabilistic Programming (Lecture)](https://youtu.be/32rAwtTAGdU)
- [From Optimization to Probabilistic Programming (Notes)](https://mitmath.github.io/18337/lecture16/probabilistic_programming)

All of our previous discussions lived in a deterministic world. Not this one. Here we turn to a probabilistic view and allow programs to have random variables. Forward simulation of a random program is simple: just run it repeatedly, i.e. Monte Carlo sampling. However, parameter estimation is now much more involved, since we need to estimate not just values but probability distributions. Bayes' rule gives a framework for performing such estimations. Classical parameter estimation falls out as a maximization of probability with the "simplest" form of distributions, so the Bayesian view generalizes standard parameter estimation and justifies the use of L2 loss functions and regularization (as a perturbation by a prior). Next we turn to estimating the distributions themselves, which is possible for small problems using Metropolis-Hastings, but for larger problems we develop Hamiltonian Monte Carlo. Hamiltonian Monte Carlo has strong ties to both ODEs and differentiable programming: it is defined by solving ODEs that arise from a Hamiltonian, and it requires derivatives of the likelihood, which is essentially the same idea as derivatives of cost functions! We then describe an alternative approach, Automatic Differentiation Variational Inference (ADVI), which once again uses the tools of differentiable programming to estimate the distributions of probabilistic programs.
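
As a small illustration of the sampling side of this lecture, here is a minimal random-walk Metropolis-Hastings sketch in plain Julia (a hypothetical toy model, not the lecture's code), estimating the posterior of a single mean parameter:

```julia
# A toy random-walk Metropolis-Hastings sampler in plain Julia: estimate the
# posterior of a single mean parameter μ given Gaussian data, with a Gaussian
# prior on μ (illustrative model; not the lecture's implementation).

using Random, Statistics
Random.seed!(1)

data = 2.0 .+ randn(50)                      # hypothetical observations, true mean 2

# log posterior up to an additive constant: log p(data | μ) + log p(μ)
logpost(μ) = -0.5 * sum((data .- μ).^2) - 0.5 * μ^2 / 10^2

function metropolis_hastings(logp, μ0; iters = 10_000, stepsize = 0.25)
    chain = Vector{Float64}(undef, iters)
    μ, lp = μ0, logp(μ0)
    for i in 1:iters
        μnew  = μ + stepsize * randn()       # symmetric random-walk proposal
        lpnew = logp(μnew)
        if log(rand()) < lpnew - lp          # accept with prob. min(1, exp(lpnew - lp))
            μ, lp = μnew, lpnew
        end
        chain[i] = μ
    end
    return chain
end

chain = metropolis_hastings(logpost, 0.0)
println("posterior mean ≈ ", mean(chain[2001:end]))   # discard burn-in samples
```

Maximizing `logpost` instead of sampling from it recovers exactly the regularized least-squares estimate described above, while Hamiltonian Monte Carlo replaces the random-walk proposal with simulated Hamiltonian dynamics driven by the gradient of `logpost`.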

## Lecture 17: Global Sensitivity Analysis

- [Global Sensitivity Analysis (Notes)](https://mitmath.github.io/18337/lecture17/global_sensitivity)

Our previous analysis of sensitivities was all local. What does it mean to examine the sensitivities of a model globally? It turns out the probabilistic programming viewpoint gives us a solid way of describing how we expect outputs to change over larger sets of parameters, via the random variables that describe the program's inputs. This means we can decompose the output variance into sensitivity indices, computable via various quadrature and Monte Carlo approximations, which turn statements like "variable x has no effect on the mean solution" into tractable measurements.
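
As a rough sketch of what such a variance decomposition looks like in practice, here is a plain-Julia Monte Carlo estimate of first-order Sobol indices for a hypothetical three-input model (the model `f` and sample sizes are illustrative, not taken from the notes):

```julia
# A toy Monte Carlo estimate of first-order Sobol indices in plain Julia: the
# fraction of the output variance explained by each input on its own, computed
# with the classic two-sample-matrix estimator (model and sizes are illustrative).

using Random, Statistics
Random.seed!(1)

f(x) = x[1] + 2 * x[2] + 0.1 * x[3]^2        # hypothetical model with 3 uncertain inputs

N, d = 100_000, 3
A = rand(N, d)                               # two independent sets of input samples,
B = rand(N, d)                               # each input ~ Uniform(0, 1)

yA = [f(A[i, :]) for i in 1:N]
yB = [f(B[i, :]) for i in 1:N]
V  = var(yA)                                 # total output variance

S = zeros(d)
for j in 1:d
    ABj = copy(A); ABj[:, j] = B[:, j]       # resample only input j
    yABj = [f(ABj[i, :]) for i in 1:N]
    S[j] = mean(yB .* (yABj .- yA)) / V      # Var(E[y | x_j]) / Var(y)
end

println("first-order Sobol indices ≈ ", round.(S, digits = 2))
```

For this additive toy model the second input should dominate the variance while the third contributes almost nothing, which is exactly the kind of statement that global sensitivity indices quantify.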

## Lecture 18: Code Profiling and Optimization

- [Code Profiling and Optimization (Lecture)](https://youtu.be/h-xVBD2Pk9o)
