Commit c7a473d

Author: Yoshua Bengio
Commit message: comment clarification
1 parent 4e9fce9 commit c7a473d

2 files changed: 19 additions & 15 deletions

code/DBN.py (1 addition & 1 deletion)
@@ -118,7 +118,7 @@ def __init__(self, numpy_rng, theano_rng = None, n_ins = 784,
         self.params.extend(self.logLayer.params)

         # compute the cost for second phase of training, defined as the
-        # negative log likelihood
+        # negative log likelihood of the logistic regression (output) layer
         self.finetune_cost = self.logLayer.negative_log_likelihood(self.y)

         # compute the gradients with respect to the model parameters
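For readers following along, the quantity the clarified comment names is cheap to write down: the mean negative log likelihood of the softmax (logistic regression) output layer. A minimal, self-contained sketch, with assumed MNIST-like sizes and illustrative names rather than code from this commit:

.. code-block:: python

    # Hedged sketch: the mean negative log likelihood of a softmax
    # (logistic regression) output layer. Shapes and names are
    # illustrative, not taken from this commit.
    import numpy
    import theano
    import theano.tensor as T

    x = T.matrix('x')   # minibatch of inputs to the output layer
    y = T.ivector('y')  # integer class labels

    n_in, n_out = 784, 10  # assumed sizes (MNIST-like)
    W = theano.shared(numpy.zeros((n_in, n_out), dtype=theano.config.floatX), name='W')
    b = theano.shared(numpy.zeros((n_out,), dtype=theano.config.floatX), name='b')

    p_y_given_x = T.nnet.softmax(T.dot(x, W) + b)
    # pick out the log-probability of the correct class for each example
    nll = -T.mean(T.log(p_y_given_x)[T.arange(y.shape[0]), y])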

doc/DBN.txt (18 additions & 14 deletions)
@@ -123,14 +123,17 @@ get:
     \sum_h Q(h^{(1)}|x)p(h^{(1)})

 Optimizing this with respect to :math:`W^{(2)}` amounts to training a second-stage
-RBM, using the output of :math:`Q(h^{(1)}|x)` as the training distribution.
+RBM, using the output of :math:`Q(h^{(1)}|x)` as the training distribution,
+when :math:`x` is sampled from the training distribution for the first RBM.

 Implementation
 ++++++++++++++

 To implement DBNs in Theano, we will use the class defined in the :doc:`rbm`
-tutorial. As an observation, the code for the DBN is very similar with the one
-for SdA. The main difference is that we use the RBM class instead of the dA
+tutorial. One can also observe that the code for the DBN is very similar with the one
+for SdA, because both involve the principle of unsupervised layer-wise
+pre-training followed by supervised fine-tuning as a deep MLP.
+The main difference is that we use the RBM class instead of the dA
 class.

 We start off by defining the DBN class which will store the layers of the
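The sentence amended in this hunk compresses the greedy recipe: the second-stage RBM treats :math:`Q(h^{(1)}|x)` as its visible data, with :math:`x` drawn from the first RBM's training set. A small numpy sketch of one CD-1 step under that recipe; the data, layer sizes, and learning rate are stand-ins, not values from the tutorial:

.. code-block:: python

    import numpy as np

    rng = np.random.RandomState(0)

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    n_vis, n_h1, n_h2 = 784, 500, 500                          # assumed layer sizes
    X = rng.binomial(1, 0.5, (100, n_vis)).astype('float64')   # stand-in data

    # first RBM, assumed already trained: defines Q(h1 | x)
    W1, b1 = 0.01 * rng.randn(n_vis, n_h1), np.zeros(n_h1)

    # second-stage RBM parameters (its visible bias c2 lives on the h1 layer)
    W2, c2, b2 = 0.01 * rng.randn(n_h1, n_h2), np.zeros(n_h1), np.zeros(n_h2)

    # the second RBM's training distribution: Q(h1 | x) on x ~ training data
    h1 = sigmoid(X.dot(W1) + b1)

    # one CD-1 step, treating h1 as the visible data
    pos_h = sigmoid(h1.dot(W2) + b2)
    h_sample = rng.binomial(1, pos_h)
    v_recon = sigmoid(h_sample.dot(W2.T) + c2)
    neg_h = sigmoid(v_recon.dot(W2) + b2)

    lr = 0.1                                                   # assumed learning rate
    W2 += lr * (h1.T.dot(pos_h) - v_recon.T.dot(neg_h)) / X.shape[0]
    b2 += lr * (pos_h - neg_h).mean(axis=0)
    c2 += lr * (h1 - v_recon).mean(axis=0)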
@@ -181,7 +184,7 @@ classification.
         self.y = T.ivector('y') # the labels are presented as 1D vector of
                                 # [int] labels

-``self.sigmoid_layers`` will store the feed-forward graph which together form
+``self.sigmoid_layers`` will store the feed-forward graphs which together form
 the MLP, while ``self.rbm_layers`` will store the RBMs used to pretrain each
 layer of the MLP.

@@ -191,7 +194,7 @@ that we replaced the non-linearity from ``tanh`` to the logistic function
 :math:`s(x) = \frac{1}{1+e^{-x}}`) and ``n_layers`` RBMs, where ``n_layers``
 is the depth of our model. We link the sigmoid layers such that they form an
 MLP, and construct each RBM such that they share the weight matrix and the
-bias with its corresponding sigmoid layer.
+hidden bias with its corresponding sigmoid layer.


 .. code-block:: python
@@ -214,10 +217,10 @@ bias with its corresponding sigmoid layer.
                 layer_input = self.sigmoid_layers[-1].output

             sigmoid_layer = HiddenLayer(rng = numpy_rng,
-                                        input = layer_input,
-                                        n_in = input_size,
-                                        n_out = hidden_layers_sizes[i],
-                                        activation = T.nnet.sigmoid)
+                                    input = layer_input,
+                                    n_in = input_size,
+                                    n_out = hidden_layers_sizes[i],
+                                    activation = T.nnet.sigmoid)

             # add the layer to our list of layers
             self.sigmoid_layers.append(sigmoid_layer)
@@ -229,11 +232,11 @@ bias with its corresponding sigmoid layer.

             # Construct an RBM that shared weights with this layer
             rbm_layer = RBM(numpy_rng = numpy_rng, theano_rng = theano_rng,
-                            input = layer_input,
-                            n_visible = input_size,
-                            n_hidden = hidden_layers_sizes[i],
-                            W = sigmoid_layer.W,
-                            hbias = sigmoid_layer.b)
+                        input = layer_input,
+                        n_visible = input_size,
+                        n_hidden = hidden_layers_sizes[i],
+                        W = sigmoid_layer.W,
+                        hbias = sigmoid_layer.b)
             self.rbm_layers.append(rbm_layer)

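The ``W = sigmoid_layer.W`` and ``hbias = sigmoid_layer.b`` keyword arguments above are the sharing that the earlier "weight matrix and the hidden bias" sentence describes: the RBM and the sigmoid layer hold the same Theano shared variables (each RBM still owns its visible bias). A toy sketch of that aliasing, with assumed sizes and stand-in containers instead of the real ``HiddenLayer``/``RBM`` classes:

.. code-block:: python

    import numpy
    import theano

    n_in, n_hidden = 784, 500  # assumed sizes
    W = theano.shared(numpy.zeros((n_in, n_hidden), dtype=theano.config.floatX), name='W')
    b = theano.shared(numpy.zeros((n_hidden,), dtype=theano.config.floatX), name='b')

    # stand-ins for HiddenLayer(..., W=W, b=b) and RBM(..., W=W, hbias=b):
    # both objects keep references to the *same* shared variables
    mlp_params = {'W': W, 'b': b}
    rbm_params = {'W': W, 'hbias': b}

    # an update applied through one handle is visible through the other,
    # which is how CD pretraining moves the MLP's parameters
    rbm_params['W'].set_value(rbm_params['W'].get_value() + 0.1)
    assert mlp_params['W'] is rbm_params['W']
    print(mlp_params['W'].get_value().max())  # 0.1 -- the MLP sees the change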
@@ -251,6 +254,7 @@ form an MLP. We will use the ``LogisticRegression`` class introduced in

         # construct a function that implements one step of fine-tuning compute the cost for
         # second phase of training, defined as the negative log likelihood
+        # of the logistic regression (output) layer
         self.finetune_cost = self.logLayer.negative_log_likelihood(self.y)

         # compute the gradients with respect to the model parameters
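The context line closing this hunk ("compute the gradients with respect to the model parameters") is the other half of the second phase: symbolic gradients of the fine-tuning cost feed plain SGD updates. A minimal sketch of that Theano pattern with a toy stand-in cost (names and the learning rate are illustrative):

.. code-block:: python

    import numpy
    import theano
    import theano.tensor as T

    x = T.vector('x')
    W = theano.shared(numpy.zeros(3, dtype=theano.config.floatX), name='W')

    cost = T.sum((x - W) ** 2)   # stand-in for self.finetune_cost
    params = [W]

    gparams = T.grad(cost, params)           # gradients w.r.t. model parameters
    learning_rate = 0.1                      # assumed value
    updates = [(p, p - learning_rate * g) for p, g in zip(params, gparams)]

    train_step = theano.function([x], cost, updates=updates)
    print(train_step(numpy.ones(3, dtype=theano.config.floatX)))  # cost before the step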
