@@ -123,14 +123,17 @@ get:
     \sum_h Q(h^{(1)}|x)p(h^{(1)})

 Optimizing this with respect to :math:`W^{(2)}` amounts to training a second-stage
-RBM, using the output of :math:`Q(h^{(1)}|x)` as the training distribution.
+RBM, using the output of :math:`Q(h^{(1)}|x)` as the training distribution,
+when :math:`x` is sampled from the training distribution for the first RBM.

 Implementation
 ++++++++++++++

 To implement DBNs in Theano, we will use the class defined in the :doc:`rbm`
-tutorial. As an observation, the code for the DBN is very similar with the one
-for SdA. The main difference is that we use the RBM class instead of the dA
+tutorial. One can also observe that the code for the DBN is very similar to
+that of the SdA, because both follow the same principle of unsupervised
+layer-wise pre-training followed by supervised fine-tuning as a deep MLP.
+The main difference is that we use the RBM class instead of the dA
 class.

 We start off by defining the DBN class which will store the layers of the
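
To make the greedy step described at the top of this hunk concrete, the sketch
below shows, in plain NumPy rather than the tutorial's Theano code, how the
first RBM's :math:`Q(h^{(1)}|x)` activations become the training data for the
second RBM. All shapes and parameters are illustrative stand-ins, not values
from the tutorial.

.. code-block:: python

    # Minimal NumPy sketch of the greedy step: the second RBM is trained on
    # Q(h^{(1)}|x), i.e. the first RBM's hidden activations computed on the
    # original training data.  Shapes and parameters are stand-ins.
    import numpy

    rng = numpy.random.RandomState(123)

    def sigmoid(a):
        return 1.0 / (1.0 + numpy.exp(-a))

    n_visible, n_hidden1 = 784, 500
    W1 = rng.normal(scale=0.01, size=(n_visible, n_hidden1))  # first RBM weights
    b1 = numpy.zeros(n_hidden1)                               # first RBM hidden bias

    x = rng.binomial(1, 0.5, size=(20, n_visible))            # stand-in data batch
    q_h1 = sigmoid(numpy.dot(x, W1) + b1)                     # Q(h^{(1)}|x)
    h1_sample = rng.binomial(1, q_h1)                         # samples from Q(h^{(1)}|x)

    # h1_sample (or q_h1 itself) now plays the role of the visible data when
    # running contrastive-divergence updates for the second RBM's W^{(2)}.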
@@ -181,7 +184,7 @@ classification.
         self.y = T.ivector('y') # the labels are presented as 1D vector of
                                 # [int] labels

-``self.sigmoid_layers`` will store the feed-forward graph which together form
+``self.sigmoid_layers`` will store the feed-forward graphs which together form
 the MLP, while ``self.rbm_layers`` will store the RBMs used to pretrain each
 layer of the MLP.

@@ -191,7 +194,7 @@ that we replaced the non-linearity from ``tanh`` to the logistic function
 :math:`s(x) = \frac{1}{1+e^{-x}}`) and ``n_layers`` RBMs, where ``n_layers``
 is the depth of our model. We link the sigmoid layers such that they form an
 MLP, and construct each RBM such that they share the weight matrix and the
-bias with its corresponding sigmoid layer.
+hidden bias with its corresponding sigmoid layer.


 .. code-block:: python
@@ -214,10 +217,10 @@ bias with its corresponding sigmoid layer.
                 layer_input = self.sigmoid_layers[-1].output

             sigmoid_layer = HiddenLayer(rng = numpy_rng,
-                                   input = layer_input,
-                                   n_in = input_size,
-                                   n_out = hidden_layers_sizes[i],
-                                   activation = T.nnet.sigmoid)
+                                        input = layer_input,
+                                        n_in = input_size,
+                                        n_out = hidden_layers_sizes[i],
+                                        activation = T.nnet.sigmoid)

             # add the layer to our list of layers
             self.sigmoid_layers.append(sigmoid_layer)
@@ -229,11 +232,11 @@ bias with its corresponding sigmoid layer.

             # Construct an RBM that shared weights with this layer
             rbm_layer = RBM(numpy_rng = numpy_rng, theano_rng = theano_rng,
-                       input = layer_input,
-                       n_visible = input_size,
-                       n_hidden = hidden_layers_sizes[i],
-                       W = sigmoid_layer.W,
-                       hbias = sigmoid_layer.b)
+                            input = layer_input,
+                            n_visible = input_size,
+                            n_hidden = hidden_layers_sizes[i],
+                            W = sigmoid_layer.W,
+                            hbias = sigmoid_layer.b)
             self.rbm_layers.append(rbm_layer)


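The weight sharing set up in the two hunks above is what makes greedy
pretraining initialize the MLP: the RBM and its sigmoid layer are built on the
same Theano shared variables, so contrastive-divergence updates move the MLP's
weights directly. Below is a minimal sketch of that mechanism with plain shared
variables; it does not use the tutorial's ``HiddenLayer`` or ``RBM`` classes,
and the shapes and update are illustrative only.

.. code-block:: python

    # Sketch only: two symbolic graphs built on the *same* shared variables.
    import numpy
    import theano
    import theano.tensor as T

    W = theano.shared(numpy.zeros((784, 500), dtype=theano.config.floatX), name='W')
    hbias = theano.shared(numpy.zeros(500, dtype=theano.config.floatX), name='hbias')

    x = T.matrix('x')
    mlp_hidden = T.nnet.sigmoid(T.dot(x, W) + hbias)   # the sigmoid layer's graph
    rbm_propup = T.nnet.sigmoid(T.dot(x, W) + hbias)   # the RBM's "propup" graph

    # Stand-in for one update applied while pretraining the RBM:
    pretrain_step = theano.function([], [], updates=[(W, W + 0.01)])
    pretrain_step()
    # The MLP graph now sees the updated W, because both graphs reference
    # the same shared variable object.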
@@ -251,6 +254,7 @@ form an MLP. We will use the ``LogisticRegression`` class introduced in

         # construct a function that implements one step of fine-tuning compute the cost for
         # second phase of training, defined as the negative log likelihood
+        # of the logistic regression (output) layer
         self.finetune_cost = self.logLayer.negative_log_likelihood(self.y)

         # compute the gradients with respect to the model parameters
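
The fine-tuning cost introduced above is the negative log-likelihood of the
logistic regression output layer, and the lines that follow in the class
differentiate it with respect to the model parameters. The self-contained
sketch below illustrates that cost and the resulting SGD updates with plain
Theano variables; the shapes and learning rate are stand-ins, not the DBN
class attributes.

.. code-block:: python

    # Sketch of the fine-tuning objective: negative log-likelihood of a softmax
    # output layer, differentiated with T.grad and turned into SGD updates.
    import numpy
    import theano
    import theano.tensor as T

    x = T.matrix('x')
    y = T.ivector('y')

    W = theano.shared(numpy.zeros((500, 10), dtype=theano.config.floatX), name='W')
    b = theano.shared(numpy.zeros(10, dtype=theano.config.floatX), name='b')

    p_y_given_x = T.nnet.softmax(T.dot(x, W) + b)
    finetune_cost = -T.mean(T.log(p_y_given_x)[T.arange(y.shape[0]), y])

    params = [W, b]
    gparams = T.grad(finetune_cost, params)
    learning_rate = 0.1
    updates = [(p, p - learning_rate * g) for p, g in zip(params, gparams)]

    # One call performs a single SGD step on the output-layer parameters.
    train_fn = theano.function([x, y], finetune_cost, updates=updates)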