Commit 3da5edb

Merge pull request lisa-lab#21 from lisa-lab/fix_iter_index
Change reported script outputs after changing scripts
2 parents: a8ec4be + 6a6ab11

File tree

1 file changed (+31, -31 lines)

doc/lenet.txt

Lines changed: 31 additions & 31 deletions
@@ -6,7 +6,7 @@ Convolutional Neural Networks (LeNet)
 .. note::
     This section assumes the reader has already read through :doc:`logreg` and
     :doc:`mlp`. Additionally, it uses the following new Theano functions and concepts:
-    `T.tanh`_, `shared variables`_, `basic arithmetic ops`_, `T.grad`_,
+    `T.tanh`_, `shared variables`_, `basic arithmetic ops`_, `T.grad`_,
     `floatX`_, `downsample`_ , `conv2d`_, `dimshuffle`_. If you intend to run the
     code on GPU also read `GPU`_.

@@ -33,7 +33,7 @@ Convolutional Neural Networks (LeNet)

 .. _floatX: http://deeplearning.net/software/theano/library/config.html#config.floatX

-.. _GPU: http://deeplearning.net/software/theano/tutorial/using_gpu.html
+.. _GPU: http://deeplearning.net/software/theano/tutorial/using_gpu.html

 .. _downsample: http://deeplearning.net/software/theano/library/tensor/signal/downsample.html

@@ -84,7 +84,7 @@ contiguous receptive fields. We can illustrate this graphically as follows:
 Imagine that layer **m-1** is the input retina.
 In the above, units in layer **m**
 have receptive fields of width 3 with respect to the input retina and are thus only
-connected to 3 adjacent neurons in the layer below (the retina).
+connected to 3 adjacent neurons in the layer below (the retina).
 Units in layer **m** have
 a similar connectivity with the layer below. We say that their receptive
 field with respect to the layer below is also 3, but their receptive field
@@ -135,7 +135,7 @@ feature map :math:`h^k` is obtained as follows (for :math:`tanh` non-linearities
 .. math::
     h^k_{ij} = \tanh ( (W^k * x)_{ij} + b_k ).

-.. Note::
+.. Note::
     Recall the following definition of convolution for a 1D signal.
     :math:`o[n] = f[n]*g[n] = \sum_{u=-\infty}^{\infty} f[u] g[u-n] = \sum_{u=-\infty}^{\infty} f[n-u] g[u]`.

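As an aside (not part of this change): a minimal NumPy sketch of the 1D convolution quoted in that note, using its second form :math:`o[n] = \sum_u f[n-u] g[u]` on finite signals; ``numpy.convolve`` is only there as a cross-check, and the signals are made up:

    import numpy as np

    f = np.array([1.0, 2.0, 3.0])   # signal
    g = np.array([0.5, -0.5])       # filter

    # o[n] = sum_u f[n - u] * g[u], treating f as zero outside its support
    o = np.array([
        sum(f[n - u] * g[u]
            for u in range(len(g))
            if 0 <= n - u < len(f))
        for n in range(len(f) + len(g) - 1)
    ])

    assert np.allclose(o, np.convolve(f, g))   # same "full" convolution
    print(o)                                   # [ 0.5  0.5  0.5 -1.5]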
@@ -144,10 +144,10 @@ feature map :math:`h^k` is obtained as follows (for :math:`tanh` non-linearities

 To form a richer representation of the data, hidden layers are composed of
 a set of multiple feature maps, :math:`\{h^{(k)}, k=0..K\}`.
-The weights :math:`W` of this layer can be parametrized as a 4D tensor
+The weights :math:`W` of this layer can be parametrized as a 4D tensor
 (destination feature map index, source feature map index, source vertical position index, source horizontal position index)
 and
-the biases :math:`b` as a vector (one element per destination feature map index).
+the biases :math:`b` as a vector (one element per destination feature map index).
 We illustrate this graphically as follows:

 .. figure:: images/cnn_explained.png
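As an aside (not part of this change): a small sketch of that 4D parametrization with made-up sizes, assuming 4 destination feature maps, 2 source feature maps, and 3x3 receptive fields:

    import numpy as np

    rng = np.random.RandomState(1234)
    n_dst, n_src, filt_h, filt_w = 4, 2, 3, 3

    # (destination feature map, source feature map, source row, source column)
    W = rng.uniform(-0.1, 0.1, size=(n_dst, n_src, filt_h, filt_w))

    # one bias per destination feature map
    b = np.zeros(n_dst)

    # W[k, l, i, j] weights the pixel at offset (i, j) of source map l
    # when computing destination map k
    print(W.shape, b.shape)    # (4, 2, 3, 3) (4,)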
@@ -167,7 +167,7 @@ input feature maps, while the other two refer to the pixel coordinates.

 Putting it all together, :math:`W^{kl}_{ij}` denotes the weight connecting
 each pixel of the k-th feature map at layer m, with the pixel at coordinates
-(i,j) of the l-th feature map of layer (m-1).
+(i,j) of the l-th feature map of layer (m-1).


 The ConvOp
@@ -208,7 +208,7 @@ one of Figure 1. The input consists of 3 features maps (an RGB color image) of s
                 high=1.0 / w_bound,
                 size=w_shp),
             dtype=input.dtype), name ='W')
-
+
     # initialize shared variable for bias (1D tensor) with random values
     # IMPORTANT: biases are usually initialized to zero. However in this
     # particular application, we simply apply the convolutional layer to
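As an aside (not part of this change): the zero initialization that this comment calls the usual choice would look roughly like the sketch below, with the demo's random variant next to it; ``b_shp`` and ``rng`` are assumed names matching the surrounding tutorial code, and the two-filter shape is illustrative:

    import numpy
    import theano

    rng = numpy.random.RandomState(23455)
    b_shp = (2,)    # one bias per output feature map (two filters assumed here)

    # the usual choice: start all biases at zero
    b_zero = theano.shared(numpy.zeros(b_shp, dtype=theano.config.floatX), name='b')

    # what this particular demo does instead: small random biases
    b_rand = theano.shared(numpy.asarray(
        rng.uniform(low=-.5, high=.5, size=b_shp),
        dtype=theano.config.floatX), name='b')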
@@ -223,10 +223,10 @@ one of Figure 1. The input consists of 3 features maps (an RGB color image) of s
     conv_out = conv.conv2d(input, W)

     # build symbolic expression to add bias and apply activation function, i.e. produce neural net layer output
-    # A few words on ``dimshuffle`` :
+    # A few words on ``dimshuffle`` :
     # ``dimshuffle`` is a powerful tool in reshaping a tensor;
-    # what it allows you to do is to shuffle dimension around
-    # but also to insert new ones along which the tensor will be
+    # what it allows you to do is to shuffle dimension around
+    # but also to insert new ones along which the tensor will be
     # broadcastable;
     # dimshuffle('x', 2, 'x', 0, 1)
     # This will work on 3d tensors with no broadcastable
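As an aside (not part of this change): a small runnable sketch of the ``dimshuffle`` pattern this comment is building up to, namely inserting broadcastable axes so a per-feature-map bias vector lines up with a 4D (batch, feature map, row, column) tensor; the shapes are made up:

    import numpy
    import theano
    import theano.tensor as T

    b = theano.shared(numpy.arange(2, dtype=theano.config.floatX), name='b')

    # 'x' inserts a broadcastable dimension; 0 keeps the existing one,
    # so a length-2 vector becomes a (1, 2, 1, 1)-patterned tensor
    b4 = b.dimshuffle('x', 0, 'x', 'x')

    x = T.tensor4('x')
    add_bias = theano.function([x], x + b4)

    img = numpy.zeros((1, 2, 3, 3), dtype=theano.config.floatX)
    print(add_bias(img)[0, 1])    # every pixel of feature map 1 is now 1.0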
@@ -268,7 +268,7 @@ Let's have a little bit of fun with this...
     # plot original image and first and second components of output
     pylab.subplot(1, 3, 1); pylab.axis('off'); pylab.imshow(img)
     pylab.gray();
-    # recall that the convOp output (filtered image) is actually a "minibatch",
+    # recall that the convOp output (filtered image) is actually a "minibatch",
     # of size 1 here, so we take index 0 in the first dimension:
     pylab.subplot(1, 3, 2); pylab.axis('off'); pylab.imshow(filtered_img[0, 0, :, :])
     pylab.subplot(1, 3, 3); pylab.axis('off'); pylab.imshow(filtered_img[0, 1, :, :])
@@ -280,7 +280,7 @@ This should generate the following output.
 .. image:: images/3wolfmoon_output.png
     :align: center

-Notice that a randomly initialized filter acts very much like an edge detector!
+Notice that a randomly initialized filter acts very much like an edge detector!

 Also of note, remark that we use the same weight initialization formula as
 with the MLP. Weights are sampled randomly from a uniform distribution in the
@@ -384,7 +384,7 @@ The lower-layers are composed to alternating convolution and max-pooling
 layers. The upper-layers however are fully-connected and correspond to a
 traditional MLP (hidden layer + logistic regression). The input to the
 first fully-connected layer is the set of all features maps at the layer
-below.
+below.

 From an implementation point of view, this means lower-layers operate on 4D
 tensors. These are then flattened to a 2D matrix of rasterized feature maps,
@@ -458,7 +458,7 @@ layer.
 Notice that when initializing the weight values, the fan-in is determined by
 the size of the receptive fields and the number of input feature maps.

-Finally, using the LogisticRegression class defined in :doc:`logreg` and
+Finally, using the LogisticRegression class defined in :doc:`logreg` and
 the HiddenLayer class defined in :doc:`mlp` , we can
 instantiate the network as follows.

@@ -504,7 +504,7 @@ instantiate the network as follows.
     layer2_input = layer1.output.flatten(2)

     # construct a fully-connected sigmoidal layer
-    layer2 = HiddenLayer(rng, input=layer2_input,
+    layer2 = HiddenLayer(rng, input=layer2_input,
                          n_in=50 * 4 * 4, n_out=500,
                          activation=T.tanh )

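As an aside (not part of this change): ``flatten(2)`` above keeps the batch axis and collapses the rest, which is why ``n_in`` is ``50 * 4 * 4``, i.e. 50 feature maps of 4x4 pixels per example at this point. A NumPy sketch of the same reshape, with the batch size assumed:

    import numpy as np

    batch_size = 500
    layer1_out = np.zeros((batch_size, 50, 4, 4))   # (batch, feature maps, rows, cols)

    # flatten(2): keep the first axis, collapse everything after it
    layer2_in = layer1_out.reshape(batch_size, -1)

    print(layer2_in.shape)    # (500, 800), and 50 * 4 * 4 == 800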
@@ -523,7 +523,7 @@ instantiate the network as follows.

     # create a list of gradients for all model parameters
     grads = T.grad(cost, params)
-
+
     # train_model is a function that updates the model parameters by SGD
     # Since this model has many parameters, it would be tedious to manually
     # create an update rule for each model parameter. We thus create the updates
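As an aside (not part of this change): the update list that this comment describes boils down to one ``(shared_variable, new_value)`` pair per parameter. A self-contained toy sketch with a single parameter and a quadratic cost standing in for the real ``params`` and ``cost``:

    import numpy
    import theano
    import theano.tensor as T

    # toy stand-in for the real model: one parameter, quadratic cost
    w = theano.shared(numpy.float64(5.0), name='w')
    params = [w]
    cost = (w - 2.0) ** 2

    grads = T.grad(cost, params)
    learning_rate = 0.1

    # one (parameter, updated value) pair per parameter; Theano applies
    # all of them on every call of the compiled function
    updates = [(p, p - learning_rate * g) for p, g in zip(params, grads)]

    train_model = theano.function([], cost, updates=updates)
    for _ in range(50):
        train_model()
    print(w.get_value())    # close to 2.0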
@@ -548,36 +548,36 @@ Running the Code
 The user can then run the code by calling:

 .. code-block:: bash
-
+
     python code/convolutional_mlp.py

-The following output was obtained with the default parameters on a Xeon E5450
-CPU clocked at 3.00GHz and using flags 'floatX=float32':
+The following output was obtained with the default parameters on a Core i7-2600K
+CPU clocked at 3.40GHz and using flags 'floatX=float32':

 .. code-block:: bash

     Optimization complete.
-    Best validation score of 0.910000 % obtained at iteration 16099,with test
-    performance 0.930000 %
-    The code for file convolutional_mlp.py ran for 755.32m
+    Best validation score of 0.910000 % obtained at iteration 17800,with test
+    performance 0.920000 %
+    The code for file convolutional_mlp.py ran for 380.28m

 Using a GeForce GTX 285, we obtained the following:

 .. code-block:: bash

     Optimization complete.
-    Best validation score of 0.910000 % obtained at iteration 20099,with test
+    Best validation score of 0.910000 % obtained at iteration 15500,with test
     performance 0.930000 %
-    The code for file convolutional_mlp.py ran for 47.96m
+    The code for file convolutional_mlp.py ran for 46.76m

 And similarly on a GeForce GTX 480:

 .. code-block:: bash

     Optimization complete.
-    Best validation score of 0.910000 % obtained at iteration 18499,with test
-    performance 0.910000 %
-    The code for file convolutional_mlp.py ran for 43.09m
+    Best validation score of 0.910000 % obtained at iteration 16400,with test
+    performance 0.930000 %
+    The code for file convolutional_mlp.py ran for 32.52m

 Note that the discrepancies in validation and test error (as well as iteration
 count) are due to different implementations of the rounding mechanism in
@@ -598,10 +598,10 @@ Number of filters
 *****************
 When choosing the number of filters per layer, keep in mind that computing the
 activations of a single convolutional filter is much more expensive than with
-traditional MLPs !
+traditional MLPs !

 Assume layer :math:`(l-1)` contains :math:`K^{l-1}` feature
-maps and :math:`M \times N` pixel positions (i.e.,
+maps and :math:`M \times N` pixel positions (i.e.,
 number of positions times number of feature maps),
 and there are :math:`K^l` filters at layer :math:`l` of shape :math:`m \times n`.
 Then computing a feature map (applying an :math:`m \times n` filter
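As an aside (not part of this change): a worked instance of the cost estimate in this paragraph, counting one multiply-accumulate per filter tap per valid position; the layer sizes are hypothetical:

    # hypothetical sizes, just to put numbers on the formula
    K_prev, M, N = 64, 32, 32    # feature maps and pixel grid at layer l-1
    K_l, m, n = 96, 5, 5         # number and shape of filters at layer l

    # one destination feature map: an m x n patch from each of the K_prev
    # source maps at every one of the (M - m) x (N - n) positions
    per_map = (M - m) * (N - n) * m * n * K_prev

    # all K_l feature maps at layer l
    total = K_l * per_map
    print(per_map, total)        # 1166400 and 111974400 multiply-accumulates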
@@ -625,7 +625,7 @@ keeping the total number of activations (number of feature maps times
 number of pixel positions) to be non-decreasing from one layer to the next
 (of course we could hope to get away with less when we are doing supervised
 learning). The number of feature maps directly controls capacity and so
-that depends on the number of available examples and the complexity of
+that depends on the number of available examples and the complexity of
 the task.

