
Commit 86eab7d

Author: Yoshua Bengio
Merge branch 'master' of github.com:lisa-lab/DeepLearningTutorials
2 parents 1e9844e + 4169ca8 · commit 86eab7d

1 file changed

Lines changed: 136 additions & 0 deletions

File tree

doc/deep.txt

@@ -0,0 +1,136 @@
.. _deep:

Deep Learning
=============

The breakthrough to effective training strategies for deep architectures came in
2006 with the algorithms for training deep belief networks
(DBN) [Hinton07]_ and stacked auto-encoders [Ranzato07]_, [Bengio07]_.
All these methods are based on a similar approach: **greedy layer-wise unsupervised
pre-training** followed by **supervised fine-tuning**.

The pretraining strategy consists of using unsupervised learning to guide the
training of intermediate levels of representation. Each layer is pre-trained
with an unsupervised learning algorithm, which attempts to learn a nonlinear
transformation of its input in order to capture its main variations. Higher
levels of abstraction are created by feeding the output of one layer to the
input of the subsequent layer.
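
To make the stacking idea concrete, the following is a minimal NumPy sketch
(not part of this tutorial's code) of greedy layer-wise pre-training: each
layer is a tiny tied-weight autoencoder trained on the output of the layer
below it, and its transformed output becomes the training input of the next
layer. All names here (``TinyAutoencoder``, ``layer_sizes``, ...) are
illustrative assumptions; the per-layer learner could just as well be an RBM
or a denoising autoencoder.

.. code-block:: python

    import numpy

    class TinyAutoencoder(object):
        """A one-hidden-layer, tied-weight autoencoder trained by plain
        gradient descent on the mean squared reconstruction error."""

        def __init__(self, n_in, n_hidden, rng):
            self.W = rng.uniform(-0.1, 0.1, size=(n_in, n_hidden))
            self.b = numpy.zeros(n_hidden)
            self.b_prime = numpy.zeros(n_in)

        def transform(self, x):
            # the nonlinear representation fed to the next layer
            return numpy.tanh(numpy.dot(x, self.W) + self.b)

        def pretrain_step(self, x, lr):
            h = self.transform(x)                          # encode
            x_rec = numpy.dot(h, self.W.T) + self.b_prime  # decode (tied weights)
            err = x_rec - x                                # reconstruction error
            dh = numpy.dot(err, self.W) * (1.0 - h ** 2)   # backprop through tanh
            self.W -= lr * (numpy.dot(x.T, dh) + numpy.dot(err.T, h)) / len(x)
            self.b -= lr * dh.mean(axis=0)
            self.b_prime -= lr * err.mean(axis=0)
            return (err ** 2).mean()

    rng = numpy.random.RandomState(0)
    X = rng.rand(256, 64)              # stand-in for the training inputs
    layer_sizes = [64, 32, 16]

    layers, inp = [], X
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        layer = TinyAutoencoder(n_in, n_out, rng)
        for epoch in range(10):        # a few unsupervised epochs per layer
            layer.pretrain_step(inp, lr=0.1)
        inp = layer.transform(inp)     # this layer's output feeds the next layer
        layers.append(layer)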

The resulting architecture can then be seen in two lights:

* the pre-trained deep network can be used to initialize the weights of all but
  the last layer of a deep neural network. The weights are then further adapted
  to a supervised task (such as classification) through traditional gradient
  descent (see :ref:`Multilayer perceptron <mlp>`). This is referred to as the
  fine-tuning step.

* the pre-trained deep network can also serve solely as a feature extractor. The
  output of the last layer is fed to a classifier, such as logistic regression,
  which is trained independently. Better results can be obtained by
  concatenating the output of the last layer with the hidden representations of
  all intermediate layers [Lee09]_ (a small sketch of this use follows the list).
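
As a concrete illustration of this second reading, here is a small,
hypothetical NumPy sketch of using a pre-trained stack purely as a feature
extractor; the weights ``W1``, ``W2`` below are random stand-ins for
pre-trained parameters, and the resulting features would be handed to an
independently trained classifier such as the logistic regression of the
earlier tutorial.

.. code-block:: python

    import numpy

    rng = numpy.random.RandomState(0)
    X = rng.rand(100, 784)                    # stand-in design matrix

    # pretend these came out of unsupervised pre-training
    W1, b1 = 0.01 * rng.randn(784, 500), numpy.zeros(500)
    W2, b2 = 0.01 * rng.randn(500, 200), numpy.zeros(200)

    h1 = numpy.tanh(numpy.dot(X, W1) + b1)    # first-level representation
    h2 = numpy.tanh(numpy.dot(h1, W2) + b2)   # second-level representation

    # either use the top layer alone ...
    features_top = h2
    # ... or, as in [Lee09], concatenate all intermediate representations
    features_all = numpy.concatenate([h1, h2], axis=1)

    # features_top / features_all can now be fed to any classifier that is
    # trained on its own, independently of the pre-trained weights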

For the purposes of this tutorial, we will focus on the first interpretation,
as that is what was first proposed in [Hinton06]_.

Deep Coding
+++++++++++

Since Deep Belief Networks (DBN) and Stacked Denoising-AutoEncoders (SDA) share
much of the same architecture and have very similar training algorithms (in
terms of pretraining and fine-tuning stages), it makes sense to implement them
in a similar fashion, as part of a "Deep Learning" framework.

We thus define a generic interface, which both of these architectures will
share.

.. code-block:: python

    class DeepLayerwiseModel(object):

        def layerwise_pretrain(self, layer_fns, pretrain_amounts):
            """Greedy layer-wise unsupervised pre-training.
            """

        def finetune(self, datasets, lr, batch_size):
            """Supervised fine-tuning of the whole model.
            """

    class DBN(DeepLayerwiseModel):
        """
        """

    class StackedDAA(DeepLayerwiseModel):
        """
        """

.. code-block:: python

    def deep_main(learning_rate=0.1,
                  pretraining_epochs=20,
                  pretrain_lr=0.1,
                  training_epochs=1000,
                  batch_size=20,
                  mnist_file='mnist.pkl.gz'):

        n_train_examples, train_valid_test = load_mnist(mnist_file)

        # instantiate model
        deep_model = ...

        ####
        #### Phase 1: Pre-training
        ####

        # create an array of functions, which will be used for the greedy
        # layer-wise unsupervised training procedure

        pretrain_functions = deep_model.pretrain_functions(
            batch_size=batch_size,
            train_set_x=train_valid_test[0][0],   # training set inputs
            learning_rate=pretrain_lr,
            ...
            )

        # loop over all the layers in our network
        for layer_idx, pretrain_fn in enumerate(pretrain_functions):

            # iterate over a certain number of epochs
            for i in xrange(pretraining_epochs * n_train_examples / batch_size):

                # follow one step in the gradient of the unsupervised cost
                # function, at the given layer
                pretrain_fn(i)

.. code-block:: python

        ####
        #### Phase 2: Fine Tuning
        ####

        # create theano functions for fine-tuning, as well as
        # validation and testing our model.

        train_fn, valid_scores, test_scores = \
            deep_model.finetune_functions(
                train_valid_test[0][0],        # training dataset
                learning_rate=learning_rate,   # the fine-tuning learning rate
                batch_size=batch_size)         # number of examples to use at once

        # use these functions as part of the generic early-stopping procedure;
        # `patience` and `patience_max` are set and updated by that procedure,
        # which is elided here
        for i in xrange(patience_max):

            if i >= patience:
                break

            cost_i = train_fn(i)

            ...
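
The ``patience`` and ``patience_max`` variables above belong to the generic
early-stopping procedure, which this skeleton leaves out. A minimal,
self-contained sketch of such a patience-based loop is given below; all names
and constants (``patience_increase``, ``improvement_threshold``,
``validation_frequency``, the stand-in ``train_fn`` and ``valid_error``) are
illustrative assumptions, not code from this commit.

.. code-block:: python

    import random

    def train_fn(i):
        # stand-in for one minibatch update; returns the training cost
        return random.random()

    def valid_error():
        # stand-in for evaluating the model on the validation set
        return random.random()

    patience = 5000                # look at at least this many minibatches
    patience_increase = 2          # extend patience by this factor on improvement
    improvement_threshold = 0.995  # a relative improvement this big is "significant"
    validation_frequency = 100     # validate every this many minibatches
    best_valid = float('inf')

    for i in range(100000):        # an upper bound on the number of updates
        if i >= patience:
            break                  # ran out of patience: stop training

        cost_i = train_fn(i)

        if (i + 1) % validation_frequency == 0:
            this_valid = valid_error()
            if this_valid < best_valid * improvement_threshold:
                # significant improvement: be willing to train for longer
                patience = max(patience, i * patience_increase)
            best_valid = min(best_valid, this_valid)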
