@@ -138,7 +138,7 @@ layer on top.
 
 The initial values for the weights of a hidden layer :math:`i` should be uniformly
 sampled from a symmetric interval that depends on the activation function. For
-:math:`tanh` activation function results obtained in [Xavier10] show that the
+the :math:`tanh` activation function, results obtained in [Xavier10]_ show that the
 interval should be
 :math:`[-\sqrt{\frac{6}{fan_{in}+fan_{out}}},\sqrt{\frac{6}{fan_{in}+fan_{out}}}]`, where
 :math:`fan_{in}` is the number of units in the :math:`(i-1)`-th layer,
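Sampling from this interval can be sketched in plain NumPy (an illustration only, not the tutorial's Theano code; the layer sizes ``n_in`` and ``n_out`` are assumed values):

```python
import numpy as np

# assumed layer sizes, for illustration only
n_in, n_out = 784, 500

rng = np.random.RandomState(1234)
# symmetric bound sqrt(6 / (fan_in + fan_out)) for tanh units
bound = np.sqrt(6.0 / (n_in + n_out))
W = rng.uniform(low=-bound, high=bound, size=(n_in, n_out))
```

Every entry of ``W`` then lies inside the symmetric interval ``[-bound, bound]``.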
@@ -154,11 +154,11 @@ both upward (activations flowing from inputs to outputs) and backward
         # `W` is initialized with `W_values` which is uniformly sampled
         # from -sqrt(6./(n_in+n_hidden)) to sqrt(6./(n_in+n_hidden))
         # for the tanh activation function
-        # the output of uniform if converted using asarray to dtype
+        # the output of uniform is converted using asarray to dtype
         # theano.config.floatX so that the code is runnable on GPU
         # Note: optimal initialization of weights is dependent on the
         # activation function used (among other things).
-        # For example, results presented in [Xavier10] suggest that you
+        # For example, results presented in [Xavier10]_ suggest that you
         # should use 4 times larger initial weights for sigmoid
         # compared to tanh
         if activation == theano.tensor.tanh:
@@ -207,7 +207,7 @@ the ``MLP`` class :
 
     A multilayer perceptron is a feedforward artificial neural network model
     that has one layer or more of hidden units and nonlinear activations.
-    Intermidiate layers usually have as activation function thanh or the
+    Intermediate layers usually have as activation function tanh or the
     sigmoid function (defined here by a ``HiddenLayer`` class) while the
     top layer is a softmax layer (defined here by a ``LogisticRegression``
     class).
@@ -412,7 +412,7 @@ Under some assumptions, a compromise between these two constraints leads to the
 initialization: :math:`uniform[-\sqrt{\frac{6}{fan_{in}+fan_{out}}},\sqrt{\frac{6}{fan_{in}+fan_{out}}}]`
 for tanh and :math:`uniform[-4\sqrt{\frac{6}{fan_{in}+fan_{out}}},4\sqrt{\frac{6}{fan_{in}+fan_{out}}}]`
 for sigmoid, where :math:`fan_{in}` is the number of inputs and :math:`fan_{out}` the number of hidden units.
-For mathematical considerations please refer to [Xavier10].
+For mathematical considerations, please refer to [Xavier10]_.
 
 Learning rate
 --------------
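The two intervals above, including the 4x larger bound suggested for sigmoid units, can be sketched as a small helper. This is a hypothetical NumPy function for illustration, not part of the tutorial's code; ``init_weights`` and its parameters are assumed names:

```python
import numpy as np

def init_weights(n_in, n_out, activation="tanh", rng=None):
    # hypothetical helper illustrating the intervals above:
    # sqrt(6 / (fan_in + fan_out)) for tanh, 4x larger for sigmoid
    if rng is None:
        rng = np.random.RandomState(1234)
    bound = np.sqrt(6.0 / (n_in + n_out))
    if activation == "sigmoid":
        bound *= 4.0
    return rng.uniform(low=-bound, high=bound, size=(n_in, n_out))
```

For a layer with 20 inputs and 30 hidden units, ``init_weights(20, 30, "sigmoid")`` draws from an interval four times wider than ``init_weights(20, 30, "tanh")``.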