From 684bd45c9f92a17c51b77bfe9ae48d5784cde23b Mon Sep 17 00:00:00 2001
From: williamscott
Date: Sat, 16 Nov 2013 11:01:46 +1100
Subject: [PATCH 1/3] Typo correction

Replace "more clear be" with "more clear by"
---
 Chapter4_TheGreatestTheoremNeverTold/LawOfLargeNumbers.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Chapter4_TheGreatestTheoremNeverTold/LawOfLargeNumbers.ipynb b/Chapter4_TheGreatestTheoremNeverTold/LawOfLargeNumbers.ipynb
index 87da9f9c..237d035f 100644
--- a/Chapter4_TheGreatestTheoremNeverTold/LawOfLargeNumbers.ipynb
+++ b/Chapter4_TheGreatestTheoremNeverTold/LawOfLargeNumbers.ipynb
@@ -43,7 +43,7 @@
 "source": [
 "### Intuition \n",
 "\n",
- "If the above Law is somewhat surprising, it can be made more clear be examining a simple example. \n",
+ "If the above Law is somewhat surprising, it can be made more clear by examining a simple example. \n",
 "\n",
 "Consider a random variable $Z$ that can take only two values, $c_1$ and $c_2$. Suppose we have a large number of samples of $Z$, denoting a specific sample $Z_i$. The Law says that we can approximate the expected value of $Z$ by averaging over all samples. Consider the average:\n",
 "\n",
@@ -1235,4 +1235,4 @@
 "metadata": {}
 }
 ]
-}
+}
\ No newline at end of file

From 5ab2748a0c6c4ad0aa82b950bb9cc7b0af114706 Mon Sep 17 00:00:00 2001
From: williamscott
Date: Sat, 16 Nov 2013 11:59:23 +1100
Subject: [PATCH 2/3] Typo and phrasing change

1. Remove full stop between $X$ and :
2. Replace "computational-simpler" with "computationally-simpler"
---
 Chapter1_Introduction/Chapter1_Introduction.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Chapter1_Introduction/Chapter1_Introduction.ipynb b/Chapter1_Introduction/Chapter1_Introduction.ipynb
index dea3e4b8..1fb9e0d2 100644
--- a/Chapter1_Introduction/Chapter1_Introduction.ipynb
+++ b/Chapter1_Introduction/Chapter1_Introduction.ipynb
@@ -70,7 +70,7 @@
 "\n",
 "To align ourselves with traditional probability notation, we denote our belief about event $A$ as $P(A)$. We call this quantity the *prior probability*.\n",
 "\n",
- "John Maynard Keynes, a great economist and thinker, said \"When the facts change, I change my mind. What do you do, sir?\" This quote reflects the way a Bayesian updates his or her beliefs after seeing evidence. Even — especially — if the evidence is counter to what was initially believed, the evidence cannot be ignored. We denote our updated belief as $P(A |X )$, interpreted as the probability of $A$ given the evidence $X$. We call the updated belief the *posterior probability* so as to contrast it with the prior probability. For example, consider the posterior probabilities (read: posterior beliefs) of the above examples, after observing some evidence $X$.:\n",
+ "John Maynard Keynes, a great economist and thinker, said \"When the facts change, I change my mind. What do you do, sir?\" This quote reflects the way a Bayesian updates his or her beliefs after seeing evidence. Even — especially — if the evidence is counter to what was initially believed, the evidence cannot be ignored. We denote our updated belief as $P(A |X )$, interpreted as the probability of $A$ given the evidence $X$. We call the updated belief the *posterior probability* so as to contrast it with the prior probability. For example, consider the posterior probabilities (read: posterior beliefs) of the above examples, after observing some evidence $X$:\n",
 "\n",
 "1\. $P(A): \;\;$ the coin has a 50 percent chance of being Heads.
$P(A | X):\\;\\;$ You look at the coin, observe a Heads has landed, denote this information $X$, and trivially assign probability 1.0 to Heads and 0.0 to Tails.\n", "\n", @@ -110,7 +110,7 @@ "\n", "Denote $N$ as the number of instances of evidence we possess. As we gather an *infinite* amount of evidence, say as $N \\rightarrow \\infty$, our Bayesian results (often) align with frequentist results. Hence for large $N$, statistical inference is more or less objective. On the other hand, for small $N$, inference is much more *unstable*: frequentist estimates have more variance and larger confidence intervals. This is where Bayesian analysis excels. By introducing a prior, and returning probabilities (instead of a scalar estimate), we *preserve the uncertainty* that reflects the instability of statistical inference of a small $N$ dataset. \n", "\n", - "One may think that for large $N$, one can be indifferent between the two techniques since they offer similar inference, and might lean towards the computational-simpler, frequentist methods. An individual in this position should consider the following quote by Andrew Gelman (2005)[1], before making such a decision:\n", + "One may think that for large $N$, one can be indifferent between the two techniques since they offer similar inference, and might lean towards the computationally-simpler, frequentist methods. An individual in this position should consider the following quote by Andrew Gelman (2005)[1], before making such a decision:\n", "\n", "> Sample sizes are never large. If $N$ is too small to get a sufficiently-precise estimate, you need to get more data (or make more assumptions). But once $N$ is \"large enough,\" you can start subdividing the data to learn more (for example, in a public opinion poll, once you have a good estimate for the entire country, you can estimate among men and women, northerners and southerners, different age groups, etc.). $N$ is never enough because if it were \"enough\" you'd already be on to the next problem for which you need more data.\n", "\n", From dfc565c1561bb05b71a316d2073efeb7c6f50e66 Mon Sep 17 00:00:00 2001 From: williamscott Date: Sat, 16 Nov 2013 20:59:42 +1100 Subject: [PATCH 3/3] Add acronym definition and syntax correction 1. Add (MCMC) after first mention of Markov Chain Monte Carlo 2. Correct $\\lambda$s to $\lambda$s. --- Chapter1_Introduction/Chapter1_Introduction.ipynb | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Chapter1_Introduction/Chapter1_Introduction.ipynb b/Chapter1_Introduction/Chapter1_Introduction.ipynb index 1fb9e0d2..a94410ab 100644 --- a/Chapter1_Introduction/Chapter1_Introduction.ipynb +++ b/Chapter1_Introduction/Chapter1_Introduction.ipynb @@ -734,7 +734,7 @@ "source": [ "The variable `observation` combines our data, `count_data`, with our proposed data-generation scheme, given by the variable `lambda_`, through the `value` keyword. We also set `observed = True` to tell PyMC that this should stay fixed in our analysis. Finally, PyMC wants us to collect all the variables of interest and create a `Model` instance out of them. This makes our life easier when we retrieve the results.\n", "\n", - "The code below will be explained in Chapter 3, but I show it here so you can see where our results come from. One can think of it as a *learning* step. The machinery being employed is called *Markov Chain Monte Carlo*, which I also delay explaining until Chapter 3. 
This technique returns thousands of random variables from the posterior distributions of $\\lambda_1, \\lambda_2$ and $\\tau$. We can plot a histogram of the random variables to see what the posterior distributions look like. Below, we collect the samples (called *traces* in the MCMC literature) into histograms." + "The code below will be explained in Chapter 3, but I show it here so you can see where our results come from. One can think of it as a *learning* step. The machinery being employed is called *Markov Chain Monte Carlo* (MCMC), which I also delay explaining until Chapter 3. This technique returns thousands of random variables from the posterior distributions of $\\lambda_1, \\lambda_2$ and $\\tau$. We can plot a histogram of the random variables to see what the posterior distributions look like. Below, we collect the samples (called *traces* in the MCMC literature) into histograms." ] }, { @@ -838,7 +838,7 @@ "source": [ "### Interpretation\n", "\n", - "Recall that Bayesian methodology returns a *distribution*. Hence we now have distributions to describe the unknown $\\lambda$s and $\\tau$. What have we gained? Immediately, we can see the uncertainty in our estimates: the wider the distribution, the less certain our posterior belief should be. We can also see what the plausible values for the parameters are: $\\lambda_1$ is around 18 and $\\lambda_2$ is around 23. The posterior distributions of the two $\\\\lambda$s are clearly distinct, indicating that it is indeed likely that there was a change in the user's text-message behaviour.\n", + "Recall that Bayesian methodology returns a *distribution*. Hence we now have distributions to describe the unknown $\lambda$s and $\\tau$. What have we gained? Immediately, we can see the uncertainty in our estimates: the wider the distribution, the less certain our posterior belief should be. We can also see what the plausible values for the parameters are: $\\lambda_1$ is around 18 and $\\lambda_2$ is around 23. The posterior distributions of the two $\\\\lambda$s are clearly distinct, indicating that it is indeed likely that there was a change in the user's text-message behaviour.\n", "\n", "What other observations can you make? If you look at the original data again, do these results seem reasonable? \n", "\n",
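Note for readers reviewing these patches without the surrounding notebook: the passages touched by PATCH 3/3 refer to the Chapter 1 text-message example, where MCMC draws thousands of posterior samples (traces) for $\lambda_1$, $\lambda_2$ and $\tau$ and those traces are then collected into histograms. The sketch below illustrates that sampling step only. It assumes PyMC 2.x (the library the notebook's prose refers to), reuses the names mentioned in the patched text (`count_data`, `lambda_`, `observation`, plus `lambda_1`, `lambda_2`, `tau`), and substitutes synthetic count data and illustrative iteration counts for the book's actual dataset and cell; it is a minimal approximation, not the notebook's exact code.

```python
import numpy as np
import pymc as pm                 # PyMC 2.x, as referenced by the notebook prose
import matplotlib.pyplot as plt

# Synthetic stand-in for the notebook's text-message counts (assumption:
# the real notebook loads its own count_data; these values are made up).
np.random.seed(42)
count_data = np.r_[np.random.poisson(18, 40), np.random.poisson(23, 34)]
n_count_data = len(count_data)

# Priors: exponential rates for the two candidate lambdas and a discrete
# uniform switchpoint tau, mirroring the chapter's model structure.
alpha = 1.0 / count_data.mean()
lambda_1 = pm.Exponential("lambda_1", alpha)
lambda_2 = pm.Exponential("lambda_2", alpha)
tau = pm.DiscreteUniform("tau", lower=0, upper=n_count_data)

@pm.deterministic
def lambda_(tau=tau, lambda_1=lambda_1, lambda_2=lambda_2):
    # Rate is lambda_1 before the switchpoint and lambda_2 after it.
    out = np.zeros(n_count_data)
    out[:tau] = lambda_1
    out[tau:] = lambda_2
    return out

# Tie the observed counts to the model; observed=True keeps them fixed,
# as the patched prose describes for the `observation` variable.
observation = pm.Poisson("obs", lambda_, value=count_data, observed=True)
model = pm.Model([observation, lambda_1, lambda_2, tau])

# The "learning" step: MCMC returns thousands of posterior samples.
mcmc = pm.MCMC(model)
mcmc.sample(40000, 10000)         # iterations, burn-in (illustrative values)

# The traces are what the text says we collect into histograms.
lambda_1_samples = mcmc.trace("lambda_1")[:]
lambda_2_samples = mcmc.trace("lambda_2")[:]
tau_samples = mcmc.trace("tau")[:]

plt.hist(lambda_1_samples, bins=30, histtype="stepfilled", alpha=0.6, label=r"$\lambda_1$")
plt.hist(lambda_2_samples, bins=30, histtype="stepfilled", alpha=0.6, label=r"$\lambda_2$")
plt.legend()
plt.show()
```

With the synthetic rates used above, the two histograms should come out centred near 18 and 23 and barely overlap, which is the pattern the patched Interpretation paragraph describes as the posterior distributions of the two $\lambda$s being clearly distinct.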