
Commit 6a6ab11

Remove trailing whitespace
1 parent 3fbfc21 commit 6a6ab11

1 file changed (+20 -20 lines)


doc/lenet.txt

Lines changed: 20 additions & 20 deletions
@@ -6,7 +6,7 @@ Convolutional Neural Networks (LeNet)
 .. note::
     This section assumes the reader has already read through :doc:`logreg` and
     :doc:`mlp`. Additionally, it uses the following new Theano functions and concepts:
-    `T.tanh`_, `shared variables`_, `basic arithmetic ops`_, `T.grad`_,
+    `T.tanh`_, `shared variables`_, `basic arithmetic ops`_, `T.grad`_,
     `floatX`_, `downsample`_ , `conv2d`_, `dimshuffle`_. If you intend to run the
     code on GPU also read `GPU`_.

@@ -20,7 +20,7 @@ Convolutional Neural Networks (LeNet)

 .. _floatX: http://deeplearning.net/software/theano/library/config.html#config.floatX

-.. _GPU: http://deeplearning.net/software/theano/tutorial/using_gpu.html
+.. _GPU: http://deeplearning.net/software/theano/tutorial/using_gpu.html

 .. _downsample: http://deeplearning.net/software/theano/library/tensor/signal/downsample.html

@@ -71,7 +71,7 @@ contiguous receptive fields. We can illustrate this graphically as follows:
 Imagine that layer **m-1** is the input retina.
 In the above, units in layer **m**
 have receptive fields of width 3 with respect to the input retina and are thus only
-connected to 3 adjacent neurons in the layer below (the retina).
+connected to 3 adjacent neurons in the layer below (the retina).
 Units in layer **m** have
 a similar connectivity with the layer below. We say that their receptive
 field with respect to the layer below is also 3, but their receptive field
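
As this hunk notes, receptive fields compose across layers: a unit that sees a width-3 window of the layer below, which itself sees width-3 windows of the retina, effectively sees a wider patch of the retina. A minimal sketch of that arithmetic (assuming stride-1, width-3 fields throughout, as in the figure):

    def effective_receptive_field(widths):
        """Width of the input window seen after stacking stride-1 layers."""
        field = 1
        for w in widths:
            field += w - 1          # each layer widens the field by (w - 1)
        return field

    print(effective_receptive_field([3]))      # 3: layer m w.r.t. the retina
    print(effective_receptive_field([3, 3]))   # 5: layer m+1 w.r.t. the retina
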
@@ -122,7 +122,7 @@ feature map :math:`h^k` is obtained as follows (for :math:`tanh` non-linearities
 .. math::
     h^k_{ij} = \tanh ( (W^k * x)_{ij} + b_k ).

-.. Note::
+.. Note::
     Recall the following definition of convolution for a 1D signal.
     :math:`o[n] = f[n]*g[n] = \sum_{u=-\infty}^{\infty} f[u] g[u-n] = \sum_{u=-\infty}^{\infty} f[n-u] g[u]`.

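
The formula in this hunk, h^k_{ij} = tanh((W^k * x)_{ij} + b_k), can be checked with a small numpy sketch: a naive "valid" convolution with a flipped kernel, followed by the bias and tanh (the sizes below are made up):

    import numpy as np

    def conv2d_valid(x, w):
        """Naive 2D 'valid' convolution: flip the kernel, then slide it over x."""
        kh, kw = w.shape
        wf = w[::-1, ::-1]                      # kernel flip, as in the 1D definition
        oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
        out = np.empty((oh, ow))
        for i in range(oh):
            for j in range(ow):
                out[i, j] = np.sum(x[i:i + kh, j:j + kw] * wf)
        return out

    x = np.random.rand(5, 5)                    # one input feature map
    W_k = np.random.randn(3, 3)                 # one 3x3 filter
    b_k = 0.1
    h_k = np.tanh(conv2d_valid(x, W_k) + b_k)   # h^k = tanh((W^k * x) + b_k)
    print(h_k.shape)                            # (3, 3) feature map
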
@@ -131,10 +131,10 @@ feature map :math:`h^k` is obtained as follows (for :math:`tanh` non-linearities

 To form a richer representation of the data, hidden layers are composed of
 a set of multiple feature maps, :math:`\{h^{(k)}, k=0..K\}`.
-The weights :math:`W` of this layer can be parametrized as a 4D tensor
+The weights :math:`W` of this layer can be parametrized as a 4D tensor
 (destination feature map index, source feature map index, source vertical position index, source horizontal position index)
 and
-the biases :math:`b` as a vector (one element per destination feature map index).
+the biases :math:`b` as a vector (one element per destination feature map index).
 We illustrate this graphically as follows:

 .. figure:: images/cnn_explained.png
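
A quick shape sketch of the parametrization this hunk describes; the concrete sizes below are invented and only illustrate the axis ordering:

    import numpy as np

    K_dst, K_src, kh, kw = 4, 3, 5, 5       # invented sizes
    # (destination feature map, source feature map, filter rows, filter cols)
    W = np.zeros((K_dst, K_src, kh, kw))
    b = np.zeros(K_dst)                     # one bias per destination feature map
    print(W.shape, b.shape)                 # (4, 3, 5, 5) (4,)
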
@@ -154,7 +154,7 @@ input feature maps, while the other two refer to the pixel coordinates.

 Putting it all together, :math:`W^{kl}_{ij}` denotes the weight connecting
 each pixel of the k-th feature map at layer m, with the pixel at coordinates
-(i,j) of the l-th feature map of layer (m-1).
+(i,j) of the l-th feature map of layer (m-1).


 The ConvOp
@@ -195,7 +195,7 @@ one of Figure 1. The input consists of 3 features maps (an RGB color image) of s
                 high=1.0 / w_bound,
                 size=w_shp),
             dtype=input.dtype), name ='W')
-
+
 # initialize shared variable for bias (1D tensor) with random values
 # IMPORTANT: biases are usually initialized to zero. However in this
 # particular application, we simply apply the convolutional layer to
@@ -210,10 +210,10 @@ one of Figure 1. The input consists of 3 features maps (an RGB color image) of s
 conv_out = conv.conv2d(input, W)

 # build symbolic expression to add bias and apply activation function, i.e. produce neural net layer output
-# A few words on ``dimshuffle`` :
+# A few words on ``dimshuffle`` :
 # ``dimshuffle`` is a powerful tool in reshaping a tensor;
-# what it allows you to do is to shuffle dimension around
-# but also to insert new ones along which the tensor will be
+# what it allows you to do is to shuffle dimension around
+# but also to insert new ones along which the tensor will be
 # broadcastable;
 # dimshuffle('x', 2, 'x', 0, 1)
 # This will work on 3d tensors with no broadcastable
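
A numpy sketch of the axis manipulation those comments describe; ``dimshuffle`` is Theano's symbolic counterpart, so the shapes below are only an illustration:

    import numpy as np

    a = np.zeros((7, 8, 9))                 # a 3d tensor
    # numpy equivalent of a.dimshuffle('x', 2, 'x', 0, 1):
    # bring axis 2 to the front, then insert two broadcastable axes.
    b = a.transpose(2, 0, 1)[np.newaxis, :, np.newaxis, :, :]
    print(b.shape)                          # (1, 9, 1, 7, 8)
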
@@ -255,7 +255,7 @@ Let's have a little bit of fun with this...
 # plot original image and first and second components of output
 pylab.subplot(1, 3, 1); pylab.axis('off'); pylab.imshow(img)
 pylab.gray();
-# recall that the convOp output (filtered image) is actually a "minibatch",
+# recall that the convOp output (filtered image) is actually a "minibatch",
 # of size 1 here, so we take index 0 in the first dimension:
 pylab.subplot(1, 3, 2); pylab.axis('off'); pylab.imshow(filtered_img[0, 0, :, :])
 pylab.subplot(1, 3, 3); pylab.axis('off'); pylab.imshow(filtered_img[0, 1, :, :])
@@ -267,7 +267,7 @@ This should generate the following output.
 .. image:: images/3wolfmoon_output.png
     :align: center

-Notice that a randomly initialized filter acts very much like an edge detector!
+Notice that a randomly initialized filter acts very much like an edge detector!

 Also of note, remark that we use the same weight initialization formula as
 with the MLP. Weights are sampled randomly from a uniform distribution in the
@@ -371,7 +371,7 @@ The lower-layers are composed to alternating convolution and max-pooling
 layers. The upper-layers however are fully-connected and correspond to a
 traditional MLP (hidden layer + logistic regression). The input to the
 first fully-connected layer is the set of all features maps at the layer
-below.
+below.

 From an implementation point of view, this means lower-layers operate on 4D
 tensors. These are then flattened to a 2D matrix of rasterized feature maps,
@@ -445,7 +445,7 @@ layer.
 Notice that when initializing the weight values, the fan-in is determined by
 the size of the receptive fields and the number of input feature maps.

-Finally, using the LogisticRegression class defined in :doc:`logreg` and
+Finally, using the LogisticRegression class defined in :doc:`logreg` and
 the HiddenLayer class defined in :doc:`mlp` , we can
 instantiate the network as follows.

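
The fan-in rule mentioned in this hunk can be written out numerically. The uniform bound below follows the tanh initialization used in the MLP tutorial and is only an illustrative sketch: the filter_shape values are invented, and the tutorial's actual conv+pool layer may also fold pooling into fan_out.

    import numpy

    filter_shape = (50, 20, 5, 5)   # invented: 50 output maps, 20 input maps, 5x5 filters
    # fan-in of one hidden unit = (num input feature maps) * (filter height) * (filter width)
    fan_in = numpy.prod(filter_shape[1:])                      # 20 * 5 * 5 = 500
    fan_out = filter_shape[0] * numpy.prod(filter_shape[2:])   # ignoring pooling here
    W_bound = numpy.sqrt(6. / (fan_in + fan_out))              # uniform init bound for tanh
    print(fan_in, fan_out, W_bound)
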
@@ -491,7 +491,7 @@ instantiate the network as follows.
     layer2_input = layer1.output.flatten(2)

     # construct a fully-connected sigmoidal layer
-    layer2 = HiddenLayer(rng, input=layer2_input,
+    layer2 = HiddenLayer(rng, input=layer2_input,
                          n_in=50 * 4 * 4, n_out=500,
                          activation=T.tanh )

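
For context on ``flatten(2)`` in this hunk: it keeps the leading (batch) dimension and collapses the rest, which is why ``n_in`` is 50 * 4 * 4. A numpy equivalent, with an assumed minibatch size:

    import numpy as np

    batch_size = 500                                     # assumed minibatch size
    layer1_output = np.zeros((batch_size, 50, 4, 4))     # (batch, feature maps, h, w)
    # Theano's .flatten(2) keeps the first dimension and flattens the remaining ones:
    layer2_input = layer1_output.reshape(batch_size, -1)
    print(layer2_input.shape)                            # (500, 800) == (batch, 50 * 4 * 4)
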
@@ -510,7 +510,7 @@ instantiate the network as follows.

     # create a list of gradients for all model parameters
     grads = T.grad(cost, params)
-
+
     # train_model is a function that updates the model parameters by SGD
     # Since this model has many parameters, it would be tedious to manually
     # create an update rule for each model parameter. We thus create the updates
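
The update rule those comments refer to is just one (param, param - learning_rate * grad) pair per parameter. A minimal standalone sketch with toy numpy values (the learning rate and the parameter values are invented):

    import numpy as np

    learning_rate = 0.1
    params = [np.array([1.0, 2.0]), np.array([[0.5]])]   # toy "model parameters"
    grads = [np.array([0.1, -0.2]), np.array([[0.3]])]   # toy gradients
    # SGD: pair each parameter with its updated value
    updates = [(p, p - learning_rate * g) for p, g in zip(params, grads)]
    print(updates[0][1])                                 # [0.99 2.02]
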
@@ -585,10 +585,10 @@ Number of filters
 *****************
 When choosing the number of filters per layer, keep in mind that computing the
 activations of a single convolutional filter is much more expensive than with
-traditional MLPs !
+traditional MLPs !

 Assume layer :math:`(l-1)` contains :math:`K^{l-1}` feature
-maps and :math:`M \times N` pixel positions (i.e.,
+maps and :math:`M \times N` pixel positions (i.e.,
 number of positions times number of feature maps),
 and there are :math:`K^l` filters at layer :math:`l` of shape :math:`m \times n`.
 Then computing a feature map (applying an :math:`m \times n` filter
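
A back-of-the-envelope version of the cost argument this hunk starts (the expression continues past the hunk boundary; the layer sizes below are invented):

    # multiply-accumulate count for one convolutional layer, 'valid' mode
    K_prev, K_l = 20, 50            # feature maps in layer l-1 and layer l
    M, N = 12, 12                   # pixel positions per input feature map
    m, n = 5, 5                     # filter shape
    positions = (M - m + 1) * (N - n + 1)        # places the filter can be applied
    macs_per_map = positions * m * n * K_prev    # cost of one output feature map
    total_macs = macs_per_map * K_l              # all K^l output feature maps
    print(positions, macs_per_map, total_macs)   # 64 32000 1600000
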
@@ -612,7 +612,7 @@ keeping the total number of activations (number of feature maps times
 number of pixel positions) to be non-decreasing from one layer to the next
 (of course we could hope to get away with less when we are doing supervised
 learning). The number of feature maps directly controls capacity and so
-that depends on the number of available examples and the complexity of
+that depends on the number of available examples and the complexity of
 the task.


