
Commit 5110d5f

committed
more comments to code
1 parent 69c8ee9 commit 5110d5f

1 file changed

Lines changed: 25 additions & 15 deletions

doc/SdA.txt
@@ -13,7 +13,7 @@ tutorial with a short digression on :ref:`autoencoders`
 and then move on to how classical
 autoencoders are extended to denoising autoencoders (:ref:`dA`).
 Throughout the following subchapters we will stick as close as possible to
-the original paper ( [Vincent08]_ ).
+the original paper ( [Vincent08] ).
 
 
 .. _autoencoders:
@@ -103,9 +103,15 @@ signal :
 
 .. code-block:: python
 
-    self.y = T.nnet.sigmoid(T.dot(x, self.W) + self.b)
-    z = T.nnet.sigmoid(T.dot(self.y, self.W_prime) + self.b_prime)
-    self.L = - T.sum( x*T.log(z) + (1-x)*T.log(1-z), axis=1 )
+    self.y = T.nnet.sigmoid(T.dot(self.x, self.W) + self.b)
+    self.z = T.nnet.sigmoid(T.dot(self.y, self.W_prime) + self.b_prime)
+    # note : we sum over the size of a datapoint; if we are using minibatches,
+    #        L will be a vector, with one entry per example in the minibatch
+    self.L = - T.sum( self.x*T.log(self.z) + (1-self.x)*T.log(1-self.z), axis=1 )
+    # note : L is now a vector, where each element is the cross-entropy cost
+    #        of the reconstruction of the corresponding example of the
+    #        minibatch. We need to compute the average of all these to get
+    #        the cost of the minibatch
     self.cost = T.mean(self.L)
 
 Training the autoencoder now consists in updating the parameters ``W``,
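The reconstruction cost added in the hunk above can be sketched in plain NumPy (a hypothetical stand-in for the Theano graph; the sizes, seed, and tied-weights choice here are illustrative, not part of the commit):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

n_visible, n_hidden, batch = 8, 4, 5
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
b = np.zeros(n_hidden)
W_prime = W.T                       # tied weights, as in the tutorial's class
b_prime = np.zeros(n_visible)

x = rng.random((batch, n_visible))  # minibatch: one example per row
y = sigmoid(x @ W + b)              # hidden code
z = sigmoid(y @ W_prime + b_prime)  # reconstruction
# sum over the size of a datapoint -> one cross-entropy entry per example
L = -np.sum(x * np.log(z) + (1 - x) * np.log(1 - z), axis=1)
cost = L.mean()                     # scalar cost of the whole minibatch
```

This shows why the diff sums with ``axis=1`` and then takes ``T.mean``: ``L`` has one entry per minibatch row, and the scalar cost is their average.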
@@ -121,7 +127,7 @@ cost is minimized.
 
 Note that for the stacked denoising autoencoder we will not use the
 ``train`` function as defined here; it is here just to illustrate how
-the autoencoder would work. In [Bengio07]_ autoencoders are used to
+the autoencoder would work. In [Bengio07] autoencoders are used to
 build deep networks.
 
 
@@ -136,7 +142,7 @@ This can be understood from different perspectives
 stochastic operator perspective,
 bottom-up -- information theoretic perspective,
 top-down -- generative model perspective ), all of which are explained in
-[Vincent08]_.
+[Vincent08].
 
 
 To convert the autoencoder class into a denoising autoencoder one, all we
@@ -192,14 +198,14 @@ The final denoising autoencoder class becomes :
         if input == None :
             # we use a matrix because we expect a minibatch of several examples,
             # each example being a row
-            x = T.dmatrix(name = 'input')
+            self.x = T.dmatrix(name = 'input')
         else:
-            x = input
+            self.x = input
 
-        tilde_x = theano_rng.binomial( x.shape, 1, 0.9) * x
-        self.y = T.nnet.sigmoid(T.dot(tilde_x, self.W) + self.b)
-        z = T.nnet.sigmoid(T.dot(self.y, self.W_prime) + self.b_prime)
-        self.L = - T.sum( x*T.log(z) + (1-x)*T.log(1-z), axis=1 )
+        self.tilde_x = theano_rng.binomial( self.x.shape, 1, 0.9) * self.x
+        self.y = T.nnet.sigmoid(T.dot(self.tilde_x, self.W) + self.b)
+        self.z = T.nnet.sigmoid(T.dot(self.y, self.W_prime) + self.b_prime)
+        self.L = - T.sum( self.x*T.log(self.z) + (1-self.x)*T.log(1-self.z), axis=1 )
         # note : L is now a vector, where each element is the cross-entropy cost
         #        of the reconstruction of the corresponding example of the
         #        minibatch. We need to compute the average of all these to get
@@ -209,7 +215,7 @@ The final denoising autoencoder class becomes :
         # we will need the hidden layer obtained from the uncorrupted
         # input when for example we will pass this as input to the layer
         # above
-        self.hidden_values = T.nnet.sigmoid( T.dot(x, self.W) + self.b)
+        self.hidden_values = T.nnet.sigmoid( T.dot(self.x, self.W) + self.b)
 
 
 
@@ -433,7 +439,11 @@ TODO
 References
 ++++++++++
 
-.. [Vincent08] Vincent, P., Larochelle H., Bengio Y. and Manzagol P.A. (2008). Extracting and Composing Robust Features with Denoising Autoencoders. ICML'08, pp. 1096-1103
+.. [Vincent08] Vincent, P., Larochelle H., Bengio Y. and Manzagol P.A. `Extracting and Composing Robust Features with Denoising Autoencoders`_. Proceedings of the Twenty-fifth International Conference on Machine Learning (ICML'08), pages 1096-1103, ACM, 2008
 
-.. [Bengio07] Bengio Y., Lamblin P., Popovici D. and Larochelle H. (2007). Greedy Layer-Wise Training of Deep Networks. NIPS'06, pp. 153-160
+.. [Bengio07] Bengio Y., Lamblin P., Popovici D. and Larochelle H. `Greedy Layer-Wise Training of Deep Networks`_. Advances in Neural Information Processing Systems 19 (NIPS'06), pages 153-160, MIT Press, 2007
 
+
+.. _Extracting and Composing Robust Features with Denoising Autoencoders: http://www.iro.umontreal.ca/~lisa/publications2/index.php/publications/show/217
+
+.. _Greedy Layer-Wise Training of Deep Networks: http://www.iro.umontreal.ca/~lisa/publications2/index.php/publications/show/190
