Commit dde46d8

some typos
Author: Razvan Pascanu
1 parent 51df900, commit dde46d8

2 files changed

Lines changed: 8 additions & 5 deletions

doc/gettingstarted.txt

Lines changed: 7 additions & 4 deletions
@@ -141,9 +141,12 @@ and then cast it to int.
 .. note::
 
 If you are running your code on the GPU and the dataset you are using
-is too large to fit in memory the code will crash. In such a case, do
-not store the data in a shared variable. You can however copy a larger chunk
-of it at once (several minibatches) to reduce the overhead of data transfer.
+is too large to fit in memory the code will crash. In such a case you
+should store the data in a shared variable. You can however store a
+sufficiently small chunk of your data (several minibatches) in a shared
+variable and use that during trianing. One you got through the chunk,
+update the values it stores. This way you minimize the number of data
+transfers between CPU memory and GPU memory.
 
 
 
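Note on the hunk above: the chunking idea it describes can be sketched in plain Python. This is only an illustration under stated assumptions, not code from the repository. The names `iter_chunks`, `train_in_chunks`, `train_step` and `device_buffer` are hypothetical, and a Python list stands in for the shared variable; in Theano, refreshing the buffer would presumably correspond to calling `set_value` on the shared variable.

```python
def iter_chunks(dataset, chunk_size):
    """Yield consecutive slices of `dataset`, each holding at most `chunk_size` examples."""
    for start in range(0, len(dataset), chunk_size):
        yield dataset[start:start + chunk_size]


def train_in_chunks(dataset, batch_size, batches_per_chunk, train_step):
    """Call `train_step` on every minibatch, refreshing the buffer once per chunk.

    Returns the number of buffer refreshes, a stand-in for the number of
    CPU-to-GPU copies.
    """
    chunk_size = batch_size * batches_per_chunk
    transfers = 0
    for chunk in iter_chunks(dataset, chunk_size):
        # Hypothetical stand-in for updating the values held by the shared variable.
        device_buffer = list(chunk)
        transfers += 1
        # Walk through the chunk one minibatch at a time without further copies.
        for start in range(0, len(device_buffer), batch_size):
            train_step(device_buffer[start:start + batch_size])
    return transfers
```

With 1000 examples, a minibatch size of 20 and 10 minibatches per chunk, this sketch refreshes the buffer 5 times while still visiting all 50 minibatches, instead of copying once per minibatch.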
@@ -170,7 +173,7 @@ use superscripts to distinguish training set examples: :math:`x^{(i)} \in
 \mathcal{R}^D` is thus the i-th training example of dimensionality :math:`D`. Similarly,
 :math:`y^{(i)} \in \{0, ..., L\}` is the i-th label assigned to input
 :math:`x^{(i)}`. It is straightforward to extend these examples to
-:math:`y^{(i)}` that has other types (e.g. Gaussian for regression,
+ones where :math:`y^{(i)}` has other types (e.g. Gaussian for regression,
 or groups of multinomials for predicting multiple symbols).
 
 .. index:: Math Convetions

doc/mlp.txt

Lines changed: 1 addition & 1 deletion
@@ -410,7 +410,7 @@ This allows information to flow well upward and downward in the network and
 reduces discrepancies between layers.
 Under some assumptions, a compromise between these two constraints leads to the following
 initialization: :math:`uniform[-\frac{6}{\sqrt{fan_{in}+fan_{out}}},\frac{6}{\sqrt{fan_{in}+fan_{out}}}]`
-for tanh and :math:`uniform[-\4*frac{6}{\sqrt{fan_{in}+fan_{out}}},\4*frac{6}{\sqrt{fan_{in}+fan_{out}}}]`
+for tanh and :math:`uniform[-4*\frac{6}{\sqrt{fan_{in}+fan_{out}}},4*\frac{6}{\sqrt{fan_{in}+fan_{out}}}]`
 for sigmoid. Where :math:`fan_{in}` is the number of inputs and :math:`fan_{out}` the number of hidden units.
 For mathematical considerations please refer to [Xavier10].
 
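Note on the hunk above: a minimal sketch of the initialization it describes, using only the Python standard library. The function names `init_bound` and `init_weights` are hypothetical and not part of the repository. One assumption to flag loudly: the text in the diff writes the bound as 6/sqrt(fan_in + fan_out), whereas this sketch uses sqrt(6 / (fan_in + fan_out)), which is the form usually attributed to [Xavier10]. Swap the expression if the literal formula from the diff is wanted.

```python
import math
import random


def init_bound(fan_in, fan_out, activation="tanh"):
    """Return the half-width of the uniform interval for the given layer shape."""
    # Assumption: sqrt(6 / (fan_in + fan_out)), not the 6 / sqrt(...) written in the diff.
    bound = math.sqrt(6.0 / (fan_in + fan_out))
    if activation == "sigmoid":
        # The documentation scales the tanh interval by 4 for sigmoid units.
        bound *= 4.0
    elif activation != "tanh":
        raise ValueError("activation must be 'tanh' or 'sigmoid'")
    return bound


def init_weights(fan_in, fan_out, activation="tanh", rng=None):
    """Sample a fan_in x fan_out weight matrix (list of rows) from uniform[-bound, bound]."""
    rng = rng or random.Random()
    bound = init_bound(fan_in, fan_out, activation)
    return [[rng.uniform(-bound, bound) for _ in range(fan_out)]
            for _ in range(fan_in)]
```

Passing a seeded `random.Random` instance as `rng` makes the sampled weights reproducible, which can help when comparing runs.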