10 changes: 5 additions & 5 deletions Chapter2_MorePyMC/MorePyMC.ipynb
@@ -608,7 +608,7 @@
"It is okay that our fictional dataset does not look like our observed dataset: the probability is incredibly small it indeed would. PyMC's engine is designed to find good parameters, $\\lambda_i, \\tau$, that maximize this probability. \n",
"\n",
"\n",
"The ability to generate artificial dataset is an interesting side effect of our modeling, and we will see that this ability is a very important method of Bayesian inference. We produce a few more datasets below:"
"The ability to generate artificial datasets is an interesting side effect of our modeling, and we will see that this ability is a very important method of Bayesian inference. We produce a few more datasets below:"
]
},
{
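For readers who want to reproduce the prior sampling this passage describes, a minimal sketch follows, assuming the chapter's `alpha` rate for the exponential priors and an 80-day observation window (the default values here are hypothetical stand-ins):

    import numpy as np
    import pymc as pm

    # Prior-predictive simulation for the SMS example: draw each unknown
    # from its prior, then generate the counts those draws imply.
    def artificial_dataset(alpha=0.05, n_days=80):
        tau = pm.rdiscrete_uniform(0, n_days)           # switchpoint drawn from its prior
        lambda_1, lambda_2 = pm.rexponential(alpha, 2)  # the two rates drawn from their priors
        # Poisson counts before and after the switchpoint
        return np.r_[pm.rpoisson(lambda_1, tau), pm.rpoisson(lambda_2, n_days - tau)]

Each call returns one artificial dataset; calling it repeatedly produces the varied datasets the text refers to.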
@@ -2110,9 +2110,9 @@
"\n",
"The skeptical reader will say \"You deliberately chose the logistic function for $p(t)$ and the specific priors. Perhaps other functions or priors will give different results. How do I know I have chosen a good model?\" This is absolutely true. To consider an extreme situation, what if I had chosen the function $p(t) = 1,\\; \\forall t$, which guarantees a defect always occurring: I would have again predicted disaster on January 28th. Yet this is clearly a poorly chosen model. On the other hand, if I did choose the logistic function for $p(t)$, but specified all my priors to be very tight around 0, likely we would have very different posterior distributions. How do we know our model is an expression of the data? This encourages us to measure the model's **goodness of fit**.\n",
"\n",
"We can think: *how can we test whether our model is a bad fit?* An idea is to compare observed data (which if we recall is a *fixed* stochastic variable) with artificial dataset which we can simulate. The rationale is that if the simulated dataset does not appear similar, statistically, to the observed dataset, then likely our model is not accurately represented the observed data. \n",
"We can think: *how can we test whether our model is a bad fit?* An idea is to compare observed data (which if we recall is a *fixed* stochastic variable) with an artificial dataset which we can simulate. The rationale is that if the simulated dataset does not appear similar, statistically, to the observed dataset, then likely our model is not accurately represented the observed data. \n",
"\n",
"Previously in this Chapter, we simulated artificial dataset for the SMS example. To do this, we sampled values from the priors. We saw how varied the resulting datasets looked like, and rarely did they mimic our observed dataset. In the current example, we should sample from the *posterior* distributions to create *very plausible datasets*. Luckily, our Bayesian framework makes this very easy. We only need to create a new `Stochastic` variable, that is exactly the same as our variable that stored the observations, but minus the observations themselves. If you recall, our `Stochastic` variable that stored our observed data was:\n",
"Previously in this Chapter, we simulated artificial datasets for the SMS example. To do this, we sampled values from the priors. We saw how varied the resulting datasets looked like, and rarely did they mimic our observed dataset. In the current example, we should sample from the *posterior* distributions to create *very plausible datasets*. Luckily, our Bayesian framework makes this very easy. We only need to create a new `Stochastic` variable, that is exactly the same as our variable that stored the observations, but minus the observations themselves. If you recall, our `Stochastic` variable that stored our observed data was:\n",
"\n",
" observed = pm.Bernoulli( \"bernoulli_obs\", p, value=D, observed=True)\n",
"\n",
@@ -2344,7 +2344,7 @@
"\n",
"It is much more informative to compare this to separation plots for other models. Below we compare our model (top) versus three others:\n",
"\n",
"1. the perfect model, which predicts the posterior probability to be equal 1 if a defect did occur.\n",
"1. the perfect model, which predicts the posterior probability to be equal to 1 if a defect did occur.\n",
"2. a completely random model, which predicts random probabilities regardless of temperature.\n",
"3. a constant model: where $P(D = 1 \\; | \\; t) = c, \\;\\; \\forall t$. The best choice for $c$ is the observed frequency of defects, in this case 7/23. \n"
]
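The three baseline models above reduce to three probability vectors. A sketch, assuming `D` is the binary defect array from the model (length 23, with 7 defects):

    import numpy as np

    # Baseline probability vectors for the comparison separation plots.
    perfect_probs  = D.astype(float)                # equals 1 exactly when a defect occurred
    random_probs   = np.random.rand(D.shape[0])     # ignores temperature entirely
    constant_probs = np.full(D.shape[0], D.mean())  # c = observed defect frequency, 7/23

Each vector can then be fed to the same separation-plot routine used for the model's posterior probabilities.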
@@ -2613,4 +2613,4 @@
"metadata": {}
}
]
-}
+}
2 changes: 1 addition & 1 deletion Chapter3_MCMC/IntroMCMC.ipynb
@@ -262,7 +262,7 @@
"source": [
"The plot on the left is the deformed landscape with the $\\text{Uniform}(0,5)$ priors, and the plot on the right is the deformed landscape with the exponential priors. Notice that the posterior landscapes look different from one another, though the data observed is identical in both cases. The reason is as follows. Notice the exponential-prior landscape, bottom right figure, puts very little *posterior* weight on values in the upper right corner of the figure: this is because *the prior does not put much weight there*. On the other hand, the uniform-prior landscape is happy to put posterior weight in the upper-right corner, as the prior puts more weight there. \n",
"\n",
"Notice also the highest-point, corresponding the the darkest red, is biased towards (0,0) in the exponential case, which is the result from the exponential prior putting more prior weight in the (0,0) corner.\n",
"Notice also the highest-point, corresponding to the darkest red, is biased towards (0,0) in the exponential case, which is the result from the exponential prior putting more prior weight in the (0,0) corner.\n",
"\n",
"The black dot represents the true parameters. Even with 1 sample point, the mountains attempts to contain the true parameter. Of course, inference with a sample size of 1 is incredibly naive, and choosing such a small sample size was only illustrative. \n",
"\n",