Prologue/Prologue.ipynb
38 additions & 51 deletions
@@ -22,22 +22,27 @@
22
22
"cell_type": "markdown",
23
23
"metadata": {},
24
24
"source": [
25
-
"The Bayesian method is the natural approach to inference, yet it is hidden from readers behind chapters of slow, mathematical analysis. The typical text on Bayesian inference involves two to three chapters on probability theory, then enters what Bayesian inference is. Unfortunately, due to mathematical intractability of most Bayesian models, the reader is only shown simple, artificial examples. This can leave the user with a *so-what* feeling about Bayesian inference. In fact, this was the author's own prior opinion.\n",
25
+
"###Probabilistic Programming & Bayesian Methods for Hackers \n",
26
+
"#### *Using Python and PyMC*\n",
27
+
"\n",
28
+
"\n",
26
29
"\n",
27
30
"\n",
28
-
"<div style=\"float: right; margin-left:30px\"><img title=\"created by Stef Gibson at StefGibson.com\"style=\"float: right;\" src=\"http://i.imgur.com/6DKYbPb.png?1\" align=right height = 390 /></div>\n",
31
+
"The Bayesian method is the natural approach to inference, yet it is hidden from readers behind chapters of slow, mathematical analysis. The typical text on Bayesian inference involves two to three chapters on probability theory, then enters what Bayesian inference is. Unfortunately, due to mathematical intractability of most Bayesian models, the reader is only shown simple, artificial examples. This can leave the user with a *so-what* feeling about Bayesian inference. In fact, this was the author's own prior opinion.\n",
32
+
"\n",
29
33
"\n",
34
+
"<div style=\"float: right; margin-left: 30px;\"><img title=\"created by Stef Gibson at StefGibson.com\"style=\"float: right;margin-left: 30px;\" src=\"http://i.imgur.com/6DKYbPb.png?1\" align=right height = 390 /></div>\n",
30
35
"\n",
31
36
"After some recent success of Bayesian methods in machine-learning competitions, I decided to investigate the subject again. Even with my mathematical background, it took me three straight-days of reading examples and trying to put the pieces together to understand the methods. There was simply not enough literature bridging theory to practice. The problem with my misunderstanding was the disconnect between Bayesian mathematics and probabilistic programming. That being said, I suffered then so the reader would not have to now. This book attempts to bridge the gap.\n",
32
37
"\n",
33
-
"If Bayesian inference is the destination, then mathematical analysis is a particular path to it. On the other hand, computing power is cheap enough that we can afford to take an alternate route via probabilistic programming. The path is much more useful, as it denies the necessity of mathematical intervention at each step, that is, we remove often-intractable mathematical analysis as a prerequisite to Bayesian inference. Simply put, this computational path proceeds via small intermediate jumps from beginning to end, where as the first path proceeds by enormous leaps, often landing far away from our target. Furthermore, without a strong mathematical background, the analysis required by the first path cannot even take place.\n",
38
+
"If Bayesian inference is the destination, then mathematical analysis is a particular path to towards it. On the other hand, computing power is cheap enough that we can afford to take an alternate route via probabilistic programming. The latter path is much more useful, as it denies the necessity of mathematical intervention at each step, that is, we remove often-intractable mathematical analysis as a prerequisite to Bayesian inference. Simply put, this latter computational path proceeds via small intermediate jumps from beginning to end, where as the first path proceeds by enormous leaps, often landing far away from our target. Furthermore, without a strong mathematical background, the analysis required by the first path cannot even take place.\n",
34
39
"\n",
35
-
"*Bayesian Methods for Hackers* is designed as a introduction to Bayesian inference from a computational/understanding-first, and mathematics-second, point of view. Of course as an introductory book, we can only leave it at that: an introductory book. For the mathematically trained, they may cure their curiosity this text generates with other texts designed with mathematical analysis in mind. For the enthusiast with less mathematical-background, or one who is not interested in the mathematics but simply the practice of Bayesian methods, this text should be sufficient and entertaining.\n",
40
+
"*Bayesian Methods for Hackers* is designed as a introduction to Bayesian inference from a computational/understanding-first, and mathematics-second, point of view. Of course as an introductory book, we can only leave it at that: an introductory book. For the mathematically trained, they may cure the curiosity this text generates with other texts designed with mathematical analysis in mind. For the enthusiast with less mathematical-background, or one who is not interested in the mathematics but simply the practice of Bayesian methods, this text should be sufficient and entertaining.\n",
36
41
"\n",
37
42
"\n",
38
43
"The choice of PyMC as the probabilistic programming language is two-fold. As of this writing, there is currently no central resource for examples and explanations in the PyMC universe. The official documentation assumes prior knowledge of Bayesian inference and probabilistic programming. We hope this book encourages users at every level to look at PyMC. Secondly, with recent core developments and popularity of the scientific stack in Python, PyMC is likely to become a core component soon enough.\n",
39
44
"\n",
40
-
"PyMC does have dependencies to run, namely NumPy and (optionally) SciPy. To not limit the user, the examples in this book will rely only on PyMC, NumPyand SciPy only.\n",
45
+
"PyMC does have dependencies to run, namely NumPy and (optionally) SciPy. To not limit the user, the examples in this book will rely only on PyMC, NumPy, SciPy and Matplotlib only.\n",
41
46
"\n",
42
47
"\n",
43
48
"Contents\n",
@@ -48,7 +53,7 @@
48
53
"Interactive notebooks + examples can be downloaded by cloning! )\n",
49
54
"\n",
50
55
"\n",
51
-
"* [**Prologue.**](http://nbviewer.ipython.org/urls/raw.github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/master/Prologue/Prologue.ipynb) Why we do it.\n",
56
+
"* [**Prologue:**](http://nbviewer.ipython.org/urls/raw.github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/master/Prologue/Prologue.ipynb) Why we do it.\n",
52
57
"\n",
53
58
"* [**Chapter 1: Introduction to Bayesian Methods**](http://nbviewer.ipython.org/urls/raw.github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/master/Chapter1_Introduction/Chapter1_Introduction.ipynb)\n",
54
59
" Introduction to the philosophy and practice of Bayesian methods and answering the question \"What is probabilistic programming?\" Examples include:\n",
@@ -69,23 +74,23 @@
69
74
" - How to sort Reddit comments from best to worst (not as easy as you think)\n",
70
75
"\n",
71
76
"* [**Chapter 5: Would you rather loss an arm or a leg?**](http://nbviewer.ipython.org/urls/raw.github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/master/Chapter5_LossFunctions/LossFunctions.ipynb)\n",
72
-
" The introduction of Loss functions and there (awesome) use in Bayesian methods. Examples include:\n",
77
+
" The introduction of Loss functions and their (awesome) use in Bayesian methods. Examples include:\n",
73
78
" - Solving the Price is Right's Showdown\n",
74
79
" - Optimizing financial predictions\n",
75
80
" - Winning solution to the Kaggle Dark World's competition.\n",
" Probably the most important chapter. We draw on expert opinions to answer questions. Examples include:\n",
84
+
" - Multi-Armed Bandits and the Bayesian Bandit solution.\n",
81
85
" - what is the relationship between data sample size and prior?\n",
86
+
" - estimating financial unknowns using expert priors.\n",
82
87
"\n",
83
88
" We explore useful tips to be objective in analysis, and common pitfalls of priors. \n",
84
89
"\n",
85
90
"* **Chapter X1: Bayesian Markov Models**\n",
86
91
"\n",
87
92
"* **Chapter X2: Bayesian methods in Machine Learning** \n",
88
-
" We explore how to resolve the overfitting problem plus popular ML methods. Also included are probabilistic explanations of Ridge Regression and LASSO Regression.\n",
93
+
" We explore how to resolve the overfitting problem plus popular ML methods. Also included are probablistic explainations of Ridge Regression and LASSO Regression.\n",
89
94
" - Bayesian spam filtering plus *how to defeat Bayesian spam filtering*\n",
90
95
" - Tim Saliman's winning solution to Kaggle's *Don't Overfit* problem \n",
91
96
"\n",
@@ -97,72 +102,45 @@
97
102
"\n",
98
103
"\n",
99
104
"**More questions about PyMC?**\n",
100
-
"Please post your modeling, convergence, or any other PyMC question on [cross-validated](http://stats.stackexchange.com/), the statistics stack-exchange.\n",
105
+
"Please post your modeling, convergence, or any other PyMC question on [cross-validated](http://stats.stackexchange.com/), the statistcs stack-exchange.\n",
101
106
"\n",
102
107
"\n",
103
108
"Using the book\n",
104
109
"-------\n",
105
110
"\n",
106
111
"The book can be read in three different ways, starting from most recommended to least recommended: \n",
107
112
"\n",
108
-
"1. The most recommended option is to clone the repository and download the .ipynb files to your local machine. If you have IPython installed, you can view the \n",
113
+
"1. The most recommended option is to clone the repository to download the .ipynb files to your local machine. If you have IPython installed, you can view the \n",
109
114
"chapters in your browser *plus* edit and run the code provided (and try some practice questions). This is the preferred option to read\n",
110
115
"this book, though it comes with some dependencies. \n",
111
-
" - IPython 0.13 is a requirement to view the ipynb files. It can be downloaded [here](http://ipython.org/ipython-doc/dev/install/index.html)\n",
112
-
" - For Linux users, you should not have a problem installing Numpy, Scipy and PyMC. For Windows users, check out [pre-compiled versions](http://www.lfd.uci.edu/~gohlke/pythonlibs/) if you have difficulty. \n",
113
-
" - In the styles/ directory are a number of files that used to make things pretty. These are not only designed for the book, but they offer many improvements over the default settings of matplotlib and the IPython notebook.\n",
114
-
"\n",
116
+
" - IPython 0.13 is a requirement to view the ipynb files. It can be downloaded [here](http://ipython.org/)\n",
117
+
" - For Linux users, you should not have a problem installing Numpy, Scipy, Matplotlib and PyMC. For Windows users, check out [pre-compiled versions](http://www.lfd.uci.edu/~gohlke/pythonlibs/) if you have difficulty. \n",
118
+
" - In the styles/ directory are a number of files (.matplotlirc) that used to make things pretty. These are not only designed for the book, but they offer many improvements over the default settings of matplotlib and the IPython notebook.\n",
119
+
" - while technically not required, it may help to run the IPython notebook with `--pylab inline` if you encounter runtime errors.\n",
115
120
"2. The second, preferred, option is to use the nbviewer.ipython.org site, which display IPython notebooks in the browser ([example](http://nbviewer.ipython.org/urls/raw.github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/master/Chapter1_Introduction/Chapter1_Introduction.ipynb)).\n",
116
121
"The contents are updated synchronously as commits are made to the book. You can use the Contents section above to link to the chapters.\n",
117
122
"\n",
118
-
"3. The most traditional approach, but also not recommended, is to read the chapters as PDFs contained in the `previews` folder. The content\n",
119
-
"in these PDFs is not guaranteed to be the most recent content as the PDFs are only compiled periodically. Similarly, the book will not be\n",
120
-
"interactive.\n",
123
+
"3. **PDF versions are coming.** PDFs are the least-prefered method to read the book, as pdf's are static and non-interactive. If PDFs are desired, they can be created dynamically using Chrome's builtin print-to-pdf feature.\n",
121
124
"\n",
122
125
"\n",
123
126
"Installation and configuration\n",
124
127
"------\n",
125
128
"If you would like to run the IPython notebooks locally, (option 1. above), you'll need to install the following:\n",
129
+
"\n",
126
130
"- IPython 0.13 is a requirement to view the ipynb files. It can be downloaded [here](http://ipython.org/ipython-doc/dev/install/index.html)\n",
127
131
"- For Linux users, you should not have a problem installing Numpy, Scipy and PyMC. For Windows users, check out [pre-compiled versions](http://www.lfd.uci.edu/~gohlke/pythonlibs/) if you have difficulty. \n",
132
+
" - also recommended, for data-mining exercises, are [PRAW](https://github.com/praw-dev/praw) and [requests](https://github.com/kennethreitz/requests). \n",
128
133
"- In the styles/ directory are a number of files that are customized for the notebook. \n",
129
134
"These are not only designed for the book, but they offer many improvements over the \n",
130
135
"default settings of matplotlib and the IPython notebook. The in notebook style has not been finalized yet.\n",
131
-
"- Currently the formatting of the style is not set, so try to follow what has been used so far, but inconsistencies are fine. \n",
132
-
"\n",
133
-
"\n",
134
-
"\n",
135
-
"Development\n",
136
-
"------\n",
137
-
"\n",
138
-
"This book has an unusual development design. The content is open-sourced, meaning anyone can be an author. \n",
139
-
"Authors submit content or revisions using the GitHub interface. After a major revision or addition, we collect all the content, compile it to a \n",
140
-
"PDF, and increment the version of *Probabilistic Programming and Bayesian Methods for Hackers*. \n",
141
-
"\n",
142
-
"\n",
143
-
"\n",
144
-
"### How to contribute\n",
145
-
"\n",
146
-
"####What to contribute?\n",
147
-
"\n",
148
-
"- The current chapter list is not finalized. If you see something that is missing (MCMC, MAP, Bayesian networks, good prior choices, Potential classes etc.),\n",
149
-
"feel free to start there. \n",
150
-
"- Cleaning up Python code and making code more PyMC-esque.\n",
151
-
"- Giving better explanations\n",
152
-
"- Contributing to the IPython notebook styles.\n",
153
-
"\n",
154
-
"\n",
155
-
"####Commiting\n",
156
-
"\n",
157
-
"- All commits are welcome, even if they are minor ;)\n",
158
-
"- If you are unfamiliar with Github, you can email me contributions to the email below.\n",
159
136
"\n",
160
137
"\n",
161
138
"Contributions and Thanks\n",
162
139
"-----\n",
163
140
"\n",
164
141
"\n",
165
142
"Thanks to all our contributing authors, including (in chronological order):\n",