Chapter2_MorePyMC/MorePyMC.ipynb
18 additions & 14 deletions
@@ -574,13 +574,13 @@
 "\n",
 "3. Do we know $\\lambda$? No. In fact, we have a suspicion that there are *two* $\\lambda$ values, one for the earlier behaviour and one for the latter behaviour. We don't know when the behaviour switches though, but call the switchpoint $\\tau$.\n",
 "\n",
-"4. What is a good distribution for the two $\\lambda$s? The exponential is good, as it assigns probabilites to positive real numbers. Well the exponential distribution has a parameter too, call it $\\alpha$.\n",
+"4. What is a good distribution for the two $\\lambda$s? The exponential is good, as it assigns probabilities to positive real numbers. Well the exponential distribution has a parameter too, call it $\\alpha$.\n",
 "\n",
 "5. Do we know what the parameter $\\alpha$ might be? No. At this point, we could continue and assign a distribution to $\\alpha$, but it's better to stop once we reach a set level of ignorance: whereas we have a prior belief about $\\lambda$, (\"it probably changes over time\", \"it's likely between 10 and 30\", etc.), we don't really have any strong beliefs about $\\alpha$. So it's best to stop here. \n",
 "\n",
 " What is a good value for $\\alpha$ then? We think that the $\\lambda$s are between 10-30, so if we set $\\alpha$ really low (which corresponds to larger probability on high values) we are not reflecting our prior well. Similar, a too-high alpha misses our prior belief as well. A good idea for $\\alpha$ as to reflect our belief is to set the value so that the mean of $\\lambda$, given $\\alpha$, is equal to our observed mean. This was shown in the last chapter.\n",
 "\n",
-"6. We have no expert opinion of when $\\tau$ might have occured. So we will suppose $\\tau$ is from a discrete uniform distribution over the entire timespan.\n",
+"6. We have no expert opinion of when $\\tau$ might have occurred. So we will suppose $\\tau$ is from a discrete uniform distribution over the entire timespan.\n",
 "\n",
 "\n",
 "Below we give a graphical visualization of this, where arrows denote `parent-child` relationships. (provided by the [Daft Python library](http://daft-pgm.org/) )\n",
"There is another connection between Bernoulli and Binomial random variables. If we have $X_1, X_2, ... , X_N$ Bernoulli random variables with the same $p$, then $Z = X_1 + X_2 + ... + X_N \\sim \\text{Binomial}(N, p )$.\n",
825
824
"\n",
826
-
"The expected value of a Bernoulli random variable is $p$. This can be seen by noting the the more general Binomial random variable has expected value $Np$ and setting $N=1$."
825
+
"The expected value of a Bernoulli random variable is $p$. This can be seen by noting the more general Binomial random variable has expected value $Np$ and setting $N=1$."
"The skeptical reader will say \"You delibrately chose the logistic function for $p(t)$ and the specific priors. Perhaps other functions or priors will give different results. How do I know I have chosen a good model?\" This is absolutely true. To consider an extreme situation, what if I had chosen the function $p(t) = 1,\\; \\forall t$, which guarantees a defect always occuring: I would have again predicted disaster on January 28th. Yet this is clearly a poorly chosen model. On the other hand, if I did choose the logistic function for $p(t)$, but specificed all my priors to be very tight around 0, likely we would have very different posterior distributions. How do we know our model is an expression of the data? This encourages us to measure the model's **goodness of fit**.\n",
1751
+
"The skeptical reader will say \"You deliberately chose the logistic function for $p(t)$ and the specific priors. Perhaps other functions or priors will give different results. How do I know I have chosen a good model?\" This is absolutely true. To consider an extreme situation, what if I had chosen the function $p(t) = 1,\\; \\forall t$, which guarantees a defect always occurring: I would have again predicted disaster on January 28th. Yet this is clearly a poorly chosen model. On the other hand, if I did choose the logistic function for $p(t)$, but specificed all my priors to be very tight around 0, likely we would have very different posterior distributions. How do we know our model is an expression of the data? This encourages us to measure the model's **goodness of fit**.\n",
1750
1752
"\n",
1751
1753
"We can think: *how can we test whether our model is a bad fit?* An idea is to compare observed data (which if we recall is a *fixed* stochastic variable) with artifical dataset which we can simulate. The rational is that if the simulated dataset does not appear similar, statistically, to the observed dataset, then likely our model is not accurately represented the observed data. \n",
"We wish to assess how good our model is. \"Good\" is a subjective term of course, so results must be relative to other models. \n",
1835
1837
"\n",
1836
-
"We will be doing this graphically as well, which may seem like an even less objective method. The alternative is to use *Bayesian p-values*. These are still subjective, as the proper cutoff between good and bad is artibitrary. Gelman emphasises that the graphical tests are more illuminating [7] than p-value tests. We agree.\n",
1838
+
"We will be doing this graphically as well, which may seem like an even less objective method. The alternative is to use *Bayesian p-values*. These are still subjective, as the proper cutoff between good and bad is arbitrary. Gelman emphasises that the graphical tests are more illuminating [7] than p-value tests. We agree.\n",
1837
1839
"\n",
1838
1840
"The following graphical test is a novel data-viz approach to logistic regression. The plots are called *separation plots*[8]. For a suite of models we wish to compare, each model is plotted on an individual separation plot. I leave most of the technical details about separation plots to the very accessible [original paper](http://mdwardlab.com/sites/default/files/GreenhillWardSacks.pdf), but I'll summarize their use here.\n",
1839
1841
"\n",
@@ -1962,9 +1964,10 @@
 "cell_type": "code",
 "collapsed": false,
 "input": [
-"figsize( 11., 1.5 )\n",
 "from separation_plot import separation_plot\n",
 "\n",
+"figsize( 11., 1.5 )\n",
+"\n",
 "separation_plot(posterior_probability, D )"
 ],
 "language": "python",
@@ -2046,7 +2049,7 @@
 "source": [
 "In the random model, we can see that as the probability increases there is no clustering of defects to the right-hand side. Similarly for the constant model.\n",
 "\n",
-"The the perfect model, the probability line is not well shown, as it is stuck to the bottom and top of the figure. Of course the perfect model is only for demonstration, and we cannot infer any scientific inference from it."
+"The perfect model, the probability line is not well shown, as it is stuck to the bottom and top of the figure. Of course the perfect model is only for demonstration, and we cannot infer any scientific inference from it."