forked from lisa-lab/DeepLearningTutorials
-
Notifications
You must be signed in to change notification settings - Fork 1
Expand file tree
/
Copy pathdatasets.txt
More file actions
50 lines (37 loc) · 1.84 KB
/
datasets.txt
File metadata and controls
50 lines (37 loc) · 1.84 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
Datasets
========
MNIST Dataset
+++++++++++++
The `MNIST <http://yann.lecun.com/exdb/mnist>`_ dataset consists of handwritten
digit images and it is divided in 60 000 examples for the training set and
10 000 examples for testing. All examples have been size-normalized and
centered in a fixed size image of 28 x 28 pixels. In the original dataset
each pixel of the image is represented by a value between 0 and 255, where
0 is black, 255 is white and anything in between is a different shade of grey.
Here are some examples of MNIST digits:
|0| |1| |2| |3| |4| |5|
.. |0| image:: images/mnist_0.png
.. |1| image:: images/mnist_1.png
.. |2| image:: images/mnist_2.png
.. |3| image:: images/mnist_3.png
.. |4| image:: images/mnist_4.png
.. |5| image:: images/mnist_5.png
For convenience we pickled the dataset to make it easier to use in python.
It is available for download `here <http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz>`_.
The pickled file represents a tuple of 3 lists : the training set, the
validation set and the testing set. Each element of any of the three lists
represents a minibatch of 20 examples. Such an element is a tuple composed
of the list of 20 images and the list of class labels for each of the
images. An image is represented as numpy 1-dimensional array of 784 (28 x 28) float
values between 0 and 1 ( 0 stands for black, 1 for white). The labels
are numbers between 0 and 9 indicating which digit the image
represents. Loading and accessing the dataset in the python can be done as
follows:
.. code-block:: python
import cPickle, gzip, numpy
f = gzip.open('mnist.pkl.gz','rb')
(training_set, validation_set, testing_set) = cPickle.load(f)
f.close()
# accessing training example i of minibatch j
image = training_set[j][0][i]
label = training_set[j][1][i]