Commit 39333dd (1 parent: d068e5d)

    added tips and trick section for tuning hyperparams of CNNs

1 file changed: doc/lenet.txt (51 additions, 0 deletions)

@@ -539,6 +539,57 @@ Tips and Tricks
Choosing Hyperparameters
------------------------

CNNs are especially tricky to train, as they add even more hyper-parameters than
a standard MLP. While the usual rules of thumb for learning rates and
regularization constants still apply, the following should be kept in mind when
optimizing CNNs.

Number of filters
*****************
When choosing the number of filters per layer, keep in mind that computing the
activations of a single convolutional filter is much more expensive than with
traditional MLPs!

Assume layer :math:`(l-1)` contains :math:`S^{(l-1)}` pixels (across all feature
maps), and that feature maps at layer :math:`l` are of shape :math:`m \times n`.
Computing the activations of a single convolutional filter requires :math:`m
\times n \times S^{(l-1)}` multiplications, compared to :math:`S^{(l-1)}` for a
standard MLP. As such, the number of filters used in CNNs is typically much
smaller than the number of hidden units in MLPs and depends on the size of the
feature maps (itself a function of input image size and filter shapes).

Since feature map size decreases with depth, layers close to the input will
tend to have fewer filters, while deeper layers can have many more.
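The multiplication counts above can be made concrete with a small sketch (plain
Python; the layer sizes below are illustrative assumptions, not values from the
tutorial):

```python
def conv_filter_mults(m, n, s_prev):
    """Multiplications needed to compute one convolutional filter's
    m x n output map, when layer l-1 holds s_prev pixels in total."""
    return m * n * s_prev

def mlp_unit_mults(s_prev):
    """Multiplications needed for one fully-connected hidden unit."""
    return s_prev

# Assumed example: layer l-1 has 3 feature maps of 24x24 pixels,
# and layer l produces 20x20 feature maps.
s_prev = 3 * 24 * 24  # 1728 pixels across all feature maps
print(conv_filter_mults(20, 20, s_prev))  # 691200 multiplications
print(mlp_unit_mults(s_prev))             # 1728 multiplications
```

The 400x gap for these sizes is why CNNs use far fewer filters per layer than
MLPs use hidden units.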
Filter Shape
************
Common filter shapes found in the literature vary greatly, and usually depend
on the dataset. The best results on MNIST-sized images (28x28) are usually
obtained with filters in the 5x5 range, while natural image datasets (often
with hundreds of pixels in each dimension) tend to use larger filters, of
shape 12x12 or 15x15.

When optimizing filter shapes, it is good to keep in mind, however, that there
is a relationship between the size of the input image, the filter shape, and
the number of hidden units. Filters which are too large with respect to the
input will project the input onto a very low-dimensional space. Creating a
useful high-level abstraction will thus require many hidden units, as in the
case of fully connected MLPs. Smaller filter shapes (with respect to the
input) can get away with fewer hidden units (i.e. as few as 6 in the case of
LeNet-5), as they project into a high-dimensional space which preserves more
of the information content of the input signal.

The trick is thus to find the right level of "granularity" (i.e. filter
shape) in order to create abstractions at the proper scale, given a
particular dataset.

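The size of the space a filter projects into can be seen from the output map
size of a "valid" convolution, sketched below (plain Python; the 28x28 input
matches the MNIST example above, the filter sizes are illustrative):

```python
def valid_conv_output(in_size, filt_size):
    """Output side length of a 'valid' convolution along one dimension."""
    return in_size - filt_size + 1

# On a 28x28 input, a large filter leaves few output pixels per feature
# map (a low-dimensional projection), while a small one preserves more.
print(valid_conv_output(28, 15) ** 2)  # 14*14 = 196 outputs per map
print(valid_conv_output(28, 5) ** 2)   # 24*24 = 576 outputs per map
```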
Max Pooling Shape
*****************
Typical values are 2x2 or no max-pooling. Very large input images may warrant
4x4 pooling in the lower layers. Keep in mind, however, that this will reduce
the dimension of the signal by a factor of 16 and may result in throwing away
too much information.
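The factor-of-16 reduction follows directly from non-overlapping pooling
arithmetic, sketched below (plain Python; the 24x24 map size is an assumed
example, not taken from the text):

```python
def pooled_size(in_size, pool):
    """Output side length after non-overlapping pool x pool max-pooling."""
    return in_size // pool

m = 24  # assumed feature map side length
print(pooled_size(m, 2) ** 2)  # 144 values remain: factor-of-4 reduction
print(pooled_size(m, 4) ** 2)  # 36 values remain: factor-of-16 reduction
```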
References
++++++++++
