This document outlines the training scripts and external resources available for the rnn package.
This section lists advanced training scripts that train RNNs on real-world datasets.
- `recurrent-language-model.lua`: trains a stack of LSTM, GRU, MuFuRu, or Simple RNN layers on the Penn Treebank dataset, with or without dropout.
- `recurrent-visual-attention.lua`: training script used in Recurrent Model for Visual Attention. Implements the REINFORCE learning rule to learn an attention mechanism for classifying MNIST digits, sometimes translated. Showcases `nn.RecurrentAttention`, `nn.SpatialGlimpse` and `nn.Reinforce`.
- `noise-contrastive-estimate.lua`: one of two training scripts used in Language modeling a billion words. Single-GPU script for training recurrent language models on the Google Billion Words dataset. This example showcases version 2 zero-masking. Version 2 is more efficient than version 1 because the `zeroMask` is interpolated only once.
- `multigpu-nce-rnnlm.lua`: 4-GPU version of `noise-contrastive-estimate.lua` for training larger multi-GPU models. Two of two training scripts used in Language modeling a billion words. This script trains multi-layer `SeqLSTM` language models on the Google Billion Words dataset. The example uses `MaskZero` to train independent variable-length sequences using the `NCEModule` and `NCECriterion`. This script is our fastest yet, boasting speeds of 20,000 words/second (on an NVIDIA Titan X) with a 2-layer LSTM having 250 hidden units, a batch size of 128 and a sequence length of 100. Note that you will need to have Torch installed with Lua instead of LuaJIT.
- `twitter-sentiment-rnn.lua`: trains a stack of RNNs for Twitter sentiment analysis. The task is a text classification problem that uses a sequence-to-one architecture, in which only the last RNN's last time-step output is used for classification.
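The sequence-to-one pattern mentioned above can be sketched as follows. This is a minimal, hypothetical sketch (not code from the scripts themselves), assuming Torch and the rnn package are installed; all sizes are illustrative:

```lua
require 'rnn'

-- hypothetical sizes for illustration
local vocabSize, embedSize, hiddenSize, nClass = 10000, 50, 100, 2

-- sequence-to-one: embed each token, run an LSTM over the sequence,
-- then classify using only the last time-step's output
local lm = nn.Sequential()
lm:add(nn.LookupTable(vocabSize, embedSize)) -- seqlen x batch -> seqlen x batch x embedSize
lm:add(nn.SplitTable(1))                     -- tensor -> table of seqlen tensors
lm:add(nn.Sequencer(nn.RecLSTM(embedSize, hiddenSize)))
lm:add(nn.SelectTable(-1))                   -- keep only the last time-step
lm:add(nn.Linear(hiddenSize, nClass))
lm:add(nn.LogSoftMax())

-- forward a batch of 3 sequences of length 5
local input = torch.LongTensor(5, 3):random(1, vocabSize)
local output = lm:forward(input)             -- 3 x nClass log-probabilities
```

`nn.SelectTable(-1)` is what makes this sequence-to-one: the `Sequencer` outputs one hidden state per time-step, and only the last is passed on to the classifier.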
This section lists simple training scripts that train RNNs on dummy datasets. These scripts showcase the fundamental principles of the package.
- `simple-recurrent-network.lua`: uses the `nn.LookupRNN` module to instantiate a Simple RNN. Illustrates the first `AbstractRecurrent` instance in action. It has since been surpassed by the more flexible `nn.Recursor` and `nn.Recurrence`. The `nn.Recursor` class decorates any module to make it conform to the `nn.AbstractRecurrent` interface. `nn.Recurrence` implements the recursive `h[t] <- forward(h[t-1], x[t])`. Together, `nn.Recursor` and `nn.Recurrence` can be used to implement a wide range of experimental recurrent architectures.
- `simple-sequencer-network.lua`: uses the `nn.Sequencer` module to accept a batch of sequences as `input` of size `seqlen x batchsize x ...`. Both tables and tensors are accepted as input and produce the same type of output (table in, table out; tensor in, tensor out). The `Sequencer` class abstracts away the implementation of back-propagation through time. It also provides a `remember(['neither','both'])` method for controlling what the `Sequencer` remembers between iterations (`forward`, `backward`, `update`).
- `simple-recurrence-network.lua`: uses the `nn.Recurrence` module to define the `h[t] <- sigmoid(h[t-1], x[t])` Simple RNN. Decorates it using `nn.Sequencer` so that an entire batch of sequences (`input`) can be forward and backward propagated per update.
- `simple-bisequencer-network.lua`: uses a `nn.BiSequencerLM` and two `nn.LookupRNN`s to implement a simple bi-directional language model.
- `simple-bisequencer-network-variable.lua`: uses `nn.RecLSTM`, `nn.LookupTableMaskZero`, `nn.ZipTable`, `nn.MaskZero` and `nn.MaskZeroCriterion` to implement a simple bi-directional LSTM language model. This example uses version 1 zero-masking, where the `zeroMask` is automatically interpolated from the `input`.
- `sequence-to-one.lua`: a simple sequence-to-one example that uses `Recurrence` to build an RNN and `SelectTable(-1)` to select the last time-step for discriminating the sequence.
- `encoder-decoder-coupling.lua`: uses two stacks of `nn.SeqLSTM` to implement an encoder and a decoder. The final hidden state of the encoder initializes the hidden state of the decoder. An example of sequence-to-sequence learning.
- `nested-recurrence-lstm.lua`: demonstrates how RNNs can be nested to form complex RNNs.
- `recurrent-time-series.lua`: demonstrates how to train a simple RNN for multi-variate time-series prediction.
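The `nn.Recurrence` plus `nn.Sequencer` pattern described in the list above can be sketched as follows. This is a minimal, hypothetical sketch (not code from the scripts themselves), assuming Torch and the rnn package are installed; sizes are illustrative:

```lua
require 'rnn'

local inputSize, hiddenSize = 10, 20

-- step module: maps {x[t], h[t-1]} to h[t] = sigmoid(W_x*x[t] + W_h*h[t-1])
local stepmodule = nn.Sequential()
   :add(nn.ParallelTable()
      :add(nn.Linear(inputSize, hiddenSize))   -- applied to x[t]
      :add(nn.Linear(hiddenSize, hiddenSize))) -- applied to h[t-1]
   :add(nn.CAddTable())
   :add(nn.Sigmoid())

-- nn.Recurrence handles the recursion; the trailing 1 is the number
-- of non-batch dimensions of each input x[t]
local rnn = nn.Recurrence(stepmodule, hiddenSize, 1)

-- nn.Sequencer lets the RNN consume an entire seqlen x batchsize x inputSize tensor
local seq = nn.Sequencer(rnn)

local input = torch.randn(5, 3, inputSize) -- seqlen=5, batchsize=3
local output = seq:forward(input)          -- 5 x 3 x hiddenSize
```

Because the step module is an ordinary `nn` container, swapping `nn.Sigmoid` for another non-linearity, or the `nn.Linear` branches for other transforms, yields new recurrent architectures without touching the recursion itself.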
- `rnn-benchmarks`: benchmarks comparing Torch (using this library), Theano and TensorFlow;
- `dataload`: a collection of Torch dataset loaders;
- A brief (1 hour) overview of Torch7, which includes some details about the rnn package (at the end), is available via this NVIDIA GTC Webinar video. In any case, this presentation gives a nice overview of Logistic Regression, Multi-Layer Perceptrons, Convolutional Neural Networks and Recurrent Neural Networks using Torch7.