show preliminary results

boulanni · boulanni · commit f47bb92c7417 · 2013-01-03T20:31:51.000-05:00
diff --git a/code/rnnrbm.py b/code/rnnrbm.py
@@ -175,7 +175,7 @@ def __init__(self, n_hidden=150, n_hidden_recurrent=100, lr=0.001, r=(21, 109),
     self.generate_function = theano.function([], v_t, updates=updates_generate)
 
 
-  def train(self, files, batch_size=100, num_epochs=150):
+  def train(self, files, batch_size=100, num_epochs=200):
     '''Train the RNN-RBM via stochastic gradient descent (SGD) using MIDI files converted to piano-rolls.
   
   files : list of strings
@@ -217,16 +217,17 @@ def generate(self, filename, show=True):
     midiwrite(filename, piano_roll, self.r, self.dt)
     if show:
       extent = (0, self.dt * len(piano_roll)) + self.r
+      pylab.figure()
       pylab.imshow(piano_roll.T, origin='lower', aspect='auto', interpolation='nearest', cmap=pylab.cm.gray_r, extent=extent)
       pylab.xlabel('time (s)')
       pylab.ylabel('MIDI note number')
       pylab.title('generated piano-roll')
-      pylab.show()
 
 
 if __name__ == '__main__':
   model = RnnRbm()
   model.train(glob.glob('Nottingham/train/*.mid'))
   model.generate('sample1.mid')
   model.generate('sample2.mid')
+  pylab.show()
 
diff --git a/doc/rnnrbm.txt b/doc/rnnrbm.txt
@@ -17,6 +17,11 @@ Modeling and generating sequences of polyphonic music with the RNN-RBM
   The script also assumes that the content of the `Nottingham Database of folk tunes <http://www-etud.iro.umontreal.ca/~boulanni/Nottingham.zip>`_ has been extracted in the working directory.
   Alternative MIDI datasets are available `here <http://www-etud.iro.umontreal.ca/~boulanni/icml2012>`_.
 
+.. caution::
+  Depending on your locally installed Theano version, you may have problems running this script.
+  If this is the case, please use the 'bleeding-edge' developer version from GitHub and check out the 0.6rc1 release:
+  ``git checkout e7a68bd9b39d722fef933425e0593cdb8998e141``.
+
 
 The RNN-RBM
 +++++++++++++++++++++++++
@@ -241,7 +246,7 @@ We now have all the necessary ingredients to start training our network on real
       self.generate_function = theano.function([], v_t, updates=updates_generate)
 
 
-    def train(self, files, batch_size=100, num_epochs=150):
+    def train(self, files, batch_size=100, num_epochs=200):
       '''Train the RNN-RBM via stochastic gradient descent (SGD) using MIDI files converted to piano-rolls.
     
     files : list of strings
@@ -283,57 +288,68 @@ We now have all the necessary ingredients to start training our network on real
       midiwrite(filename, piano_roll, self.r, self.dt)
       if show:
         extent = (0, self.dt * len(piano_roll)) + self.r
+        pylab.figure()
         pylab.imshow(piano_roll.T, origin='lower', aspect='auto', interpolation='nearest', cmap=pylab.cm.gray_r, extent=extent)
         pylab.xlabel('time (s)')
         pylab.ylabel('MIDI note number')
         pylab.title('generated piano-roll')
-        pylab.show()
 
 
   if __name__ == '__main__':
     model = RnnRbm()
     model.train(glob.glob('Nottingham/train/*.mid'))
     model.generate('sample1.mid')
     model.generate('sample2.mid')
+    pylab.show()
 
 
 Results
 ++++++++
 
-We ran the code on the Nottingham database for 150 epochs; training took approximately 48 hours.
+We ran the code on the Nottingham database for 200 epochs; training took approximately 24 hours.
 
 The output was the following:
 
 .. code-block:: text
 
-    Epoch 1/150 -15.0308940028
-    Epoch 2/150 -10.4892606673
-    Epoch 3/150 -10.2394696138
-    Epoch 4/150 -10.1431669994
-    Epoch 5/150 -9.7005382843
-    Epoch 6/150 -8.5985647524
-    Epoch 7/150 -8.35115428534
-    Epoch 8/150 -8.26453580552
-    Epoch 9/150 -8.21208991542
-    Epoch 10/150 -8.16847274143
-    Epoch 11/150 -8.03408036522
-    Epoch 12/150 -7.72139097234
-    Epoch 13/150 -7.56626080635
+  Epoch 1/150 -15.0154373583
+  Epoch 2/150 -10.4948703701
+  Epoch 3/150 -10.2507567848
+  Epoch 4/150 -10.1417621708
+  Epoch 5/150 -9.69403756276
+  Epoch 6/150 -8.6036962785
+  Epoch 7/150 -8.35180803953
+  Epoch 8/150 -8.26202621624
+  Epoch 9/150 -8.21526214665
+  Epoch 10/150 -8.16552397791
+
+  ... truncated for brevity ...
 
-    ... truncated for brevity ...
+  Epoch 140/150 -5.09668220315
+  Epoch 141/150 -5.08657006002
+  Epoch 142/150 -5.09776776338
+  Epoch 143/150 -5.10151042486
+  Epoch 144/150 -5.07677377181
+  Epoch 145/150 -5.07374453388
+  Epoch 146/150 -inf
+  Epoch 147/150 -5.06393939067
+  Epoch 148/150 -5.07493685431
+  Epoch 149/150 -5.06504525246
+  Epoch 150/150 -5.04567771601
 
-    Epoch ..
 
 
-The figures below show the piano-roll of two sample sequences and the corresponding MIDI files are provided:
+The figures below show the piano-rolls of two sample sequences; the corresponding MIDI files are linked in the captions:
 
-.. figure:: images/rnnrbm.png
+.. figure:: images/sample1.png
+  :scale: 60%
 
-  `sample1.mid <sample1.mid>`_
+  Listen to `sample1.mid <http://www-etud.iro.umontreal.ca/~boulanni/sample1.mid>`_
 
-.. figure:: images/rnnrbm.png
+.. figure:: images/sample2.png
+  :scale: 60%
 
-  `sample2.mid <sample2.mid>`_
+  Listen to `sample2.mid <http://www-etud.iro.umontreal.ca/~boulanni/sample2.mid>`_
 
 
 How to improve this code
@@ -343,7 +359,7 @@ The code shown in this tutorial is a stripped-down version that can be improved
 
 * Pretraining techniques: initialize the :math:`W,b_v,b_h` parameters with independent RBMs with fully shuffled frames (i.e. :math:`W_{uh}=W_{uv}=W_{uu}=W_{vu}=0`); initialize the :math:`W_{uv},W_{uu},W_{vu},b_u` parameters of the RNN with the auxiliary cross-entropy objective via either SGD or, preferably, Hessian-free optimization [BoulangerLewandowski12]_.
 * Optimization techniques: gradient clipping, Nesterov momentum and the use of NADE for conditional density estimation.
-* Preprocessing: transposing the sequences in common tonality (e.g. C major / minor) and normalizing the tempo in beats (quarternotes) per minute can yield important improvement in the generative quality of the model.
+* Preprocessing: transposing the sequences into a common key (e.g. C major / minor) and normalizing the tempo in beats (quarter notes) per minute can yield substantial improvement in the generative quality of the model.
 * Hyperparameter search: learning rate (separately for the RBM and RNN parts), learning rate schedules, batch size, number of hidden units (recurrent and RBM), momentum coefficient, momentum schedule, Gibbs chain length :math:`k` and early stopping.
 * Learn the initial condition :math:`u^{(0)}` as a model parameter.
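
As a rough illustration of the "gradient clipping" item in the list above, the sketch below rescales a set of gradients so that their joint L2 norm does not exceed a threshold before a plain SGD update. It is not part of ``rnnrbm.py``: the names ``clip_by_global_norm`` and ``sgd_step`` and the ``max_norm=5.0`` default are hypothetical, and plain Python lists stand in for the Theano shared variables used by the tutorial.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so that their joint L2 norm is at most max_norm.

    grads    : list of lists of floats, one inner list per parameter
    max_norm : positive float (hypothetical threshold, to be tuned)
    """
    # Joint L2 norm over every component of every gradient.
    total = math.sqrt(sum(g * g for grad in grads for g in grad))
    if total <= max_norm:
        # Already within bounds: return an unchanged copy.
        return [list(grad) for grad in grads]
    scale = max_norm / total
    return [[g * scale for g in grad] for grad in grads]


def sgd_step(params, grads, lr=0.001, max_norm=5.0):
    """One plain SGD update that uses the clipped gradients."""
    clipped = clip_by_global_norm(grads, max_norm)
    return [[p - lr * g for p, g in zip(param, grad)]
            for param, grad in zip(params, clipped)]
```

Clipping by the joint norm keeps the direction of the update and only shortens its length; clipping each component separately is a common alternative that does change the direction.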