Skip to content

umhau/cluster-rnn

Repository files navigation

word-rnn_with_mpit

Implementing a complex Torch7 RNN implementation over a cluster with MPIT.

These are both complex projects, but the key is in adding the code from the core word-rnn script to the mpit execution script. If the variables match, the EAMSGD optimizer should be able to use the available cluster to accelerate the training process.

Assuming all dependencies are installed, run the program like this:

mpirun -np 11 -f ../machinefile th mlaunch.lua

Where '11' is the number of available cores in the cluster, '../machinefile' points to the MPI machinefile, and mlaunch.lua is configured with the specific Torch script you care to run.

To use/test the trained model, run this code: th sample.lua cv/some_checkpoint.t7 -gpuid -1

About

Implementing a complex Torch7 RNN over a cluster with MPIT.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages