Commit c7a473d

Author: Yoshua Bengio
Commit message: comment clarification
1 parent 4e9fce9 commit c7a473d

2 files changed: 19 additions & 15 deletions

code/DBN.py (1 addition & 1 deletion)
@@ -118,7 +118,7 @@ def __init__(self, numpy_rng, theano_rng = None, n_ins = 784,
         self.params.extend(self.logLayer.params)

         # compute the cost for second phase of training, defined as the
-        # negative log likelihood
+        # negative log likelihood of the logistic regression (output) layer
         self.finetune_cost = self.logLayer.negative_log_likelihood(self.y)

         # compute the gradients with respect to the model parameters
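For readers following along, the quantity the clarified comment names is cheap to write down: the mean negative log likelihood of the softmax (logistic regression) output layer. A minimal, self-contained sketch, with assumed MNIST-like sizes and illustrative names rather than code from this commit:

.. code-block:: python

    # Hedged sketch: the mean negative log likelihood of a softmax
    # (logistic regression) output layer. Shapes and names are
    # illustrative, not taken from this commit.
    import numpy
    import theano
    import theano.tensor as T

    x = T.matrix('x')   # minibatch of inputs to the output layer
    y = T.ivector('y')  # integer class labels

    n_in, n_out = 784, 10  # assumed sizes (MNIST-like)
    W = theano.shared(numpy.zeros((n_in, n_out), dtype=theano.config.floatX), name='W')
    b = theano.shared(numpy.zeros((n_out,), dtype=theano.config.floatX), name='b')

    p_y_given_x = T.nnet.softmax(T.dot(x, W) + b)
    # pick out the log-probability of the correct class for each example
    nll = -T.mean(T.log(p_y_given_x)[T.arange(y.shape[0]), y])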

doc/DBN.txt (18 additions & 14 deletions)
@@ -123,14 +123,17 @@ get:
     \sum_h Q(h^{(1)}|x)p(h^{(1)})

 Optimizing this with respect to :math:`W^{(2)}` amounts to training a second-stage
-RBM, using the output of :math:`Q(h^{(1)}|x)` as the training distribution.
+RBM, using the output of :math:`Q(h^{(1)}|x)` as the training distribution,
+when :math:`x` is sampled from the training distribution for the first RBM.

 Implementation
 ++++++++++++++

 To implement DBNs in Theano, we will use the class defined in the :doc:`rbm`
-tutorial. As an observation, the code for the DBN is very similar with the one
-for SdA. The main difference is that we use the RBM class instead of the dA
+tutorial. One can also observe that the code for the DBN is very similar with the one
+for SdA, because both involve the principle of unsupervised layer-wise
+pre-training followed by supervised fine-tuning as a deep MLP.
+The main difference is that we use the RBM class instead of the dA
 class.

 We start off by defining the DBN class which will store the layers of the
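The sentence amended in this hunk compresses the greedy recipe: the second-stage RBM treats :math:`Q(h^{(1)}|x)` as its visible data, with :math:`x` drawn from the first RBM's training set. A small numpy sketch of one CD-1 step under that recipe; the data, layer sizes, and learning rate are stand-ins, not values from the tutorial:

.. code-block:: python

    import numpy as np

    rng = np.random.RandomState(0)

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    n_vis, n_h1, n_h2 = 784, 500, 500                          # assumed layer sizes
    X = rng.binomial(1, 0.5, (100, n_vis)).astype('float64')   # stand-in data

    # first RBM, assumed already trained: defines Q(h1 | x)
    W1, b1 = 0.01 * rng.randn(n_vis, n_h1), np.zeros(n_h1)

    # second-stage RBM parameters (its visible bias c2 lives on the h1 layer)
    W2, c2, b2 = 0.01 * rng.randn(n_h1, n_h2), np.zeros(n_h1), np.zeros(n_h2)

    # the second RBM's training distribution: Q(h1 | x) on x ~ training data
    h1 = sigmoid(X.dot(W1) + b1)

    # one CD-1 step, treating h1 as the visible data
    pos_h = sigmoid(h1.dot(W2) + b2)
    h_sample = rng.binomial(1, pos_h)
    v_recon = sigmoid(h_sample.dot(W2.T) + c2)
    neg_h = sigmoid(v_recon.dot(W2) + b2)

    lr = 0.1                                                   # assumed learning rate
    W2 += lr * (h1.T.dot(pos_h) - v_recon.T.dot(neg_h)) / X.shape[0]
    b2 += lr * (pos_h - neg_h).mean(axis=0)
    c2 += lr * (h1 - v_recon).mean(axis=0)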
@@ -181,7 +184,7 @@ classification.
         self.y = T.ivector('y') # the labels are presented as 1D vector of
                                 # [int] labels

-``self.sigmoid_layers`` will store the feed-forward graph which together form
+``self.sigmoid_layers`` will store the feed-forward graphs which together form
 the MLP, while ``self.rbm_layers`` will store the RBMs used to pretrain each
 layer of the MLP.

@@ -191,7 +194,7 @@ that we replaced the non-linearity from ``tanh`` to the logistic function
 :math:`s(x) = \frac{1}{1+e^{-x}}`) and ``n_layers`` RBMs, where ``n_layers``
 is the depth of our model. We link the sigmoid layers such that they form an
 MLP, and construct each RBM such that they share the weight matrix and the
-bias with its corresponding sigmoid layer.
+hidden bias with its corresponding sigmoid layer.


 .. code-block:: python
@@ -214,10 +217,10 @@ bias with its corresponding sigmoid layer.
                 layer_input = self.sigmoid_layers[-1].output

             sigmoid_layer = HiddenLayer(rng = numpy_rng,
-                                        input = layer_input,
-                                        n_in = input_size,
-                                        n_out = hidden_layers_sizes[i],
-                                        activation = T.nnet.sigmoid)
+                                    input = layer_input,
+                                    n_in = input_size,
+                                    n_out = hidden_layers_sizes[i],
+                                    activation = T.nnet.sigmoid)

             # add the layer to our list of layers
             self.sigmoid_layers.append(sigmoid_layer)
@@ -229,11 +232,11 @@ bias with its corresponding sigmoid layer.

             # Construct an RBM that shared weights with this layer
             rbm_layer = RBM(numpy_rng = numpy_rng, theano_rng = theano_rng,
-                            input = layer_input,
-                            n_visible = input_size,
-                            n_hidden = hidden_layers_sizes[i],
-                            W = sigmoid_layer.W,
-                            hbias = sigmoid_layer.b)
+                        input = layer_input,
+                        n_visible = input_size,
+                        n_hidden = hidden_layers_sizes[i],
+                        W = sigmoid_layer.W,
+                        hbias = sigmoid_layer.b)
             self.rbm_layers.append(rbm_layer)

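The ``W = sigmoid_layer.W`` and ``hbias = sigmoid_layer.b`` keyword arguments above are the sharing that the earlier "weight matrix and the hidden bias" sentence describes: the RBM and the sigmoid layer hold the same Theano shared variables (each RBM still owns its visible bias). A toy sketch of that aliasing, with assumed sizes and stand-in containers instead of the real ``HiddenLayer``/``RBM`` classes:

.. code-block:: python

    import numpy
    import theano

    n_in, n_hidden = 784, 500  # assumed sizes
    W = theano.shared(numpy.zeros((n_in, n_hidden), dtype=theano.config.floatX), name='W')
    b = theano.shared(numpy.zeros((n_hidden,), dtype=theano.config.floatX), name='b')

    # stand-ins for HiddenLayer(..., W=W, b=b) and RBM(..., W=W, hbias=b):
    # both objects keep references to the *same* shared variables
    mlp_params = {'W': W, 'b': b}
    rbm_params = {'W': W, 'hbias': b}

    # an update applied through one handle is visible through the other,
    # which is how CD pretraining moves the MLP's parameters
    rbm_params['W'].set_value(rbm_params['W'].get_value() + 0.1)
    assert mlp_params['W'] is rbm_params['W']
    print(mlp_params['W'].get_value().max())  # 0.1 -- the MLP sees the change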
@@ -251,6 +254,7 @@ form an MLP. We will use the ``LogisticRegression`` class introduced in

         # construct a function that implements one step of fine-tuning compute the cost for
         # second phase of training, defined as the negative log likelihood
+        # of the logistic regression (output) layer
         self.finetune_cost = self.logLayer.negative_log_likelihood(self.y)

         # compute the gradients with respect to the model parameters
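The context line closing this hunk ("compute the gradients with respect to the model parameters") is the other half of the second phase: symbolic gradients of the fine-tuning cost feed plain SGD updates. A minimal sketch of that Theano pattern with a toy stand-in cost (names and the learning rate are illustrative):

.. code-block:: python

    import numpy
    import theano
    import theano.tensor as T

    x = T.vector('x')
    W = theano.shared(numpy.zeros(3, dtype=theano.config.floatX), name='W')

    cost = T.sum((x - W) ** 2)   # stand-in for self.finetune_cost
    params = [W]

    gparams = T.grad(cost, params)           # gradients w.r.t. model parameters
    learning_rate = 0.1                      # assumed value
    updates = [(p, p - learning_rate * g) for p, g in zip(params, gparams)]

    train_step = theano.function([x], cost, updates=updates)
    print(train_step(numpy.ones(3, dtype=theano.config.floatX)))  # cost before the step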
