
SimProcTC
A Simulation Processing Tool-Chain for OMNeT++ Simulations
https://www.nntb.no/~dreibh/omnetpp

💡 What is SimProcTC (Simulation Processing Tool-Chain)?

SimProcTC (Simulation Processing Tool-Chain) is a flexible and powerful tool-chain for the setup, parallel run execution, results aggregation, data analysis and debugging of discrete event simulations based on the OMNeT++ discrete event simulator. It is used, for example, for simulations with the RSPSIM RSerPool simulation model and with the NetPerfMeter/CMT-SCTP models in the INET Framework. However, SimProcTC is model-independent and can easily be adapted to other simulation models.

Further details about SimProcTC can be found in Appendix B of «Evaluation and Optimisation of Multi-Path Transport using the Stream Control Transmission Protocol»!

💾 Build from Sources

SimProcTC is released under the GNU General Public Licence (GPL).

Please use the issue tracker at https://github.com/dreibh/simproctc/issues to report bugs and issues!

Development Version

The Git repository of the SimProcTC sources can be found at https://github.com/dreibh/simproctc:

git clone https://github.com/dreibh/simproctc
cd simproctc
cd toolchain/tools && make && cd ../..


Release Versions

See https://www.nntb.no/~dreibh/omnetpp/#current-stable-release for release packages!

📦 Installation of OMNeT++, SimProcTC and a Demo Simulation

The following is a step-by-step installation guide for SimProcTC.

Install OMNeT++

Get the latest version of OMNeT++ from https://omnetpp.org and install it under Linux. See the OMNeT++ Installation Guide for details!

If you do not have Linux installed already, you may find my Little Ubuntu Linux Installation Guide helpful. This installation guide also provides help on how to install OMNeT++ on an Ubuntu system. Note that while OMNeT++ also works under Microsoft Windows, my tool-chain has not been tested under this operating system, yet. In particular, run distribution using RSerPool will not work under Windows unless you port the RSPLIB RSerPool implementation to Windows.

After installing OMNeT++, make sure that it is working properly.

Install GNU R

Install GNU R. Usually, it is available as a package for your Linux distribution. If you prefer to build it from source, you can download the sources from the GNU R project website. In most cases, it can be installed via the operating system's package management:

  • Ubuntu/Debian: sudo apt-get install r-base
  • Fedora: sudo dnf install R
  • FreeBSD: sudo pkg install R

After installation, you can start GNU R by:

R --vanilla

You can quit GNU R using Ctrl+D.

Install libbz2

The simulation tool-chain requires libbz2 for compression and decompression of files. The development headers (include files) of this library are also required to compile the tool-chain. Usually, it is available as a package for your Linux distribution. If you prefer to build it from source, you can download the sources from bzip2: Home. In most cases, it can be installed via the operating system's package management:

  • Ubuntu/Debian: sudo apt-get install bzip2 libbz2-dev
  • Fedora: sudo dnf install bzip2 bzip2-devel
  • FreeBSD: sudo pkg install bzip2
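A quick round-trip check confirms that the bzip2 tools work; the tool-chain uses the same underlying library:

```shell
# Compress and decompress a short string; the output should match the input
printf 'SimProcTC' | bzip2 -c | bunzip2 -c
```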

Install chrpath

chrpath is a command-line tool to modify the search path for shared libraries (rpath) embedded in executables, which is needed for run distribution. If not already installed, it can be installed via the operating system's package management:

  • Ubuntu/Debian: sudo apt-get install chrpath
  • Fedora: sudo dnf install chrpath
  • FreeBSD: sudo pkg install chrpath

Install the Simulation Tool-Chain

Get the simulation tool-chain package from the Build from Sources section and unpack it, or clone the Git repository. Also, take a look at the description paper in the docs directory; it provides important information on what the tool-chain actually does! The tool-chain archive includes the files of the tool-chain itself as well as a small example simulation. The files have the following purposes:

  • toolchain directory: This directory contains the tool-chain scripts.
    • simulation.R: Generic simulation tool-chain code.
    • simulate-version1.R: Model-specific simulation tool-chain code.
    • hashfunctions.R: GNU R functions to calculate MD5 and SHA1 hashes.
    • plotter.R: GNU R functions for plotting.
    • make-environment: Shell script to collect all files to create the environment file.
    • get-libs: Shell script to collect all shared libraries needed by the model.
    • get-neds: Shell script to collect all NED files needed by the model.
    • test1.R: Example simulation script.
    • plot-test1.R: Plotting script for the example.
    • ssdistribute: Shell script to distribute runs in a computation pool.
    • ssrun: Shell script to perform a simulation run (on a remote pool PC).
  • toolchain/tools directory: This directory contains the createsummary tool for scalar file processing. For performance reasons, it is written in C++, and therefore has to be compiled.
  • example-simulation directory: This directory contains the simple example model "example-simulation" for OMNeT++ 5.x/6.x.
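The idea behind a get-libs-style script can be sketched with ldd, which lists the shared libraries a dynamically linked binary needs (shown here on /bin/sh as a stand-in for the simulation binary):

```shell
# Print the file system paths of all shared libraries /bin/sh depends on;
# a get-libs-style script collects such paths for the simulation binary
ldd /bin/sh | awk '/=>/ { print $3 }'
```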

In order to compile tool-chain and examples, call the following commands in the SimProcTC main directory:

cd toolchain/tools && make && cd ../.. && \
cd example-simulation && \
opp_makemake -I . -f && \
make

Notes:

  • Make sure to compile in the OMNeT++ Python environment (see the OMNeT++ Installation Guide), i.e.:

    source <PATH_TO_OMNET++_DIRECTORY>/setenv
    

    If opp_makemake is not found, this step is likely missing!

  • Make sure that everything compiles successfully. Otherwise, the tool-chain will not work properly!

After compilation, you can start the demo simulation by calling:

./example-simulation

🏃 Running the Demo Simulation

The example simulation packaged with SimProcTC simply shows the effects of fragmenting large packets into cells and forwarding them: the delays are significantly reduced, at the price of increased overhead. Take a look at scenario.ned to see the parameters of the model:

  • fragmenterScenario.fragmenter.cellHeaderSize
  • fragmenterScenario.fragmenter.cellPayloadSize
  • fragmenterScenario.intermediateNodeOutputRate
  • fragmenterScenario.sourceHeaderSize
  • fragmenterScenario.sourcePayloadSize
  • fragmenterScenario.sourceInterarrivalTime

An example simulation for this model is defined in test1.R: for each parameter of the model, the list simulationConfigurations contains a list with the parameter name as first element and its value(s) as further elements. For example, list("sourcePayloadSize", 1000, 2500) means that the parameter sourcePayloadSize should be used with the values 1000 bytes and 2500 bytes. For each parameter combination, a separate run will be created. Furthermore, the variable simulationRuns specifies how many different seeds should be used. That is, for simulationRuns=3, runs for each parameter combination are created with 3 different seeds (i.e. tripling the number of runs!).
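As a quick sanity check on the resulting run count: the number of runs is the product of the value counts of all parameters, multiplied by the number of seeds. With the two sourcePayloadSize values above and simulationRuns=3 (and all other parameters at a single value), this gives:

```shell
# Hypothetical run-count arithmetic: 2 parameter values × 3 seeds
echo $((2 * 3))   # → 6 runs
```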

The actual output of .ini files is realized in simulate-version1.R. Take a look over this file first; it should be quite self-explanatory! In the function demoWriteParameterSection(), the actual lines for the parameters above are written for each simulation run. simCreatorAdditionalActiveVariables defines the variables for which a table row should always be written. For example, if you always use cellHeaderSize=4, the createsummary tool would omit this parameter in the output table, because it always has the same value. Since it may be useful for your post-processing, you can add it to simCreatorAdditionalActiveVariables. Note that simCreatorWriteParameterSection is set to demoWriteParameterSection: the generic simulation.R script always uses the simCreator* names instead of the demo* ones. In order to remain model-independent, these variables must be set to the actual model-dependent functions in simulate-version1.R! When you adapt the tool-chain to your own model, you only have to create your own simulate-versionX.R script and leave the other scripts unmodified.

The variables distributionPool and distributionProcs in test1.R are used to control the request distribution. They will be explained later. For now, make sure that distributionProcs is set to 0! This setting means that all runs are processed on the local machine.

Now, in order to perform the simulation defined in test1.R, simply execute test1.R using R:

R --vanilla < test1.R

The script will now create an .ini file for each run and a Makefile containing all runs. Finally, make will be called to process the created Makefile. make will already be called with the -j parameter corresponding to your number of CPUs/cores, so that it fully utilises the computation power of your machine. You can observe the progress of the simulation processing by monitoring the log file:

tail -f test1/make.log

You can abort the simulation processing and continue later. Only the run(s) currently in progress are lost and have to be re-processed upon resumption. Already completed runs are saved and no re-processing is necessary.

📈 Plotting the Results

After processing the simulation defined by test1.R, you can plot the results using plot-test1.R:

R --vanilla < plot-test1.R

The results will be written to test1.pdf (the file name is the simulation output directory plus .pdf). You can view it with any PDF reader, e.g. Okular. The plotter settings at the head of plot-test1.R should be almost self-explanatory. For colorMode, you can also use cmBlackAndWhite or cmGreyScale. Setting plotOwnOutput to TRUE creates a separate output file for each plot (instead of a single PDF file). plotConfigurations contains the definitions for each plot, in particular the title, the output file name for plotOwnOutput, the x- and y-axis ticks, the legend position, and the results data for each axis given by a template. A set of model-specific templates is already defined in simulate-version1.R; you can add additional ones there or to plotVariables in plot-test1.R. See also the paper for more details on templates.

🚀 Run Distribution to a Pool of PCs

Make sure that the previous steps (performing simulations and plotting) work; if they do not, the run distribution will fail as well! First, it is necessary to install the RSPLIB RSerPool implementation. On an Ubuntu/Debian system, RSPLIB can be installed directly:

sudo apt-get install rsplib-registrar rsplib-services rsplib-tools

If a manual installation is needed, see the RSPLIB documentation!

On one computer, run the CSP monitor to display the status of the other components:

cspmonitor

Note the IP address of this system. The CSP monitor runs on UDP port 2960.

For the other components to be started, define environment variables:

export CSP_SERVER=<IP_OF_CSP_MONITOR>:2960
export CSP_INTERVAL=333

You can put these commands e.g. into $HOME/.bashrc, so that the variables are available in all new shell sessions!
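A quick check confirms that the variables are visible in the current shell; 192.0.2.1 below is only a placeholder documentation address standing in for the real CSP monitor IP:

```shell
# Set the CSP variables (placeholder address) and verify they are exported
export CSP_SERVER=192.0.2.1:2960
export CSP_INTERVAL=333
env | grep '^CSP_'
```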

In your network, start at least one RSerPool Pool Registrar (PR):

rspregistrar

With the environment variables above set correctly, the CSP monitor should show the registrar.

Then, start a Scripting Service Pool Element (PE) in another shell.

rspserver -scripting -policy=LeastUsed -ssmaxthreads=4

The parameter -ssmaxthreads specifies the number of parallel sessions; use the number of cores/CPUs in your machine. The output of rspserver should look as follows:

Starting service ...
Scripting Server - Version 2.0
==============================

General Parameters:
   Pool Handle             = ScriptingPool
   Reregistration Interval = 30.000s
   Local Addresses         = { all }
   Runtime Limit           = off
   Max Threads             = 4
   Policy Settings
      Policy Type          = LeastUsed
      Load Degradation     = 0.000%
      Load DPF             = 0.000%
      Weight               = 0
      Weight DPF           = 0.000%
Scripting Parameters:
   Keep Temp Dirs          = no
   Verbose Mode            = no
   Transmit Timeout        = 30000 [ms]
   Keep-Alive Interval     = 15000 [ms]
   Keep-Alive Timeout      = 10000 [ms]
   Cache Max Size          = 131072 [KiB]
   Cache Max Entries       = 16
   Cache Directory         =
   Keyring                 =
   Trust DB                =
Registration:
   Identifier              = $249c7176

In particular, pay attention to the "Identifier" line: this is the ID under which the pool element has been registered. If there are error messages saying that the registration has failed, take a look into the RSPLIB documentation. Usually, this indicates a small configuration problem which can be solved easily! It may also be helpful to use Wireshark for debugging network issues; it has dissectors for the RSerPool protocols as well as for CSP and the Scripting Service protocols!

With the environment variables above set correctly, the CSP monitor should show the PE.

Take a look into the script ssdistribute. Ensure that the variable setting for SIMULATION_POOLUSER points to the program scriptingclient of the RSPLIB package (if installed from the Ubuntu/Debian package: /usr/bin/scriptingclient).

SIMULATION_POOLUSER=/usr/bin/scriptingclient

If scriptingclient is located elsewhere, e.g. in $HOME/src/rsplib-3.5.4/src, the line should be:

SIMULATION_POOLUSER=~/src/rsplib-3.5.4/src/scriptingclient
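A small sketch of how this setting could be made robust: pick up scriptingclient from the PATH if it is installed, and otherwise fall back to a source directory (the fallback path below is the hypothetical location from the example above):

```shell
# Use scriptingclient from PATH if available, otherwise a source-tree fallback
SIMULATION_POOLUSER="$(command -v scriptingclient || echo "$HOME/src/rsplib-3.5.4/src/scriptingclient")"
echo "$SIMULATION_POOLUSER"
```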

In test1.R, set distributionProcs to the maximum number of simultaneous sessions (at least 1; if you later start 5 pool elements with 2 cores each, you should use 10). It is safe to use 1 for the following test. After modifying distributionProcs, also increase simulationRuns, e.g. by 1; otherwise, since test1.R has already been run before, no new runs would be necessary (their results are already there!). Now, run test1.R again:

R --vanilla < test1.R

Take a look at the output of rspserver: it should receive jobs and process them. Also, take a look at the log output:

tail -f test1/make.log

When the job distribution is working properly, you can start more pool elements and set up your simulation computation pool. Do not forget to increase distributionProcs accordingly!

With the environment variables above set correctly, the CSP monitor should show the status of pool users and pool elements during the simulation processing.

The workload distribution system works as follows:

  • First, the Makefile calls make-environment to generate the Tar/BZip2 file simulation-environment.tar.bz2 in the simulation directory. It contains the simulation binary, all shared libraries it needs (determined by the get-libs script), all .ned files it needs (determined by the get-neds script), and the script simulation.config-stage0, which sets two environment variables: SIMULATION_PROGRAM contains the name of the binary, and SIMULATION_LIBS contains the location of the libraries. If your simulation needs additional files, they can be specified via the variable simulationMiscFiles in simulate-version1.R.

  • ssdistribute – which is called to actually distribute a run to a pool – creates the Tar/GZip file for the run. This file includes the environment file (i.e. usually simulation-environment.tar.bz2) specified by the variable SIMULATION_ENVIRONMENT and additional configuration files like simulation.config-stage0 (but named simulation.config-stage1, simulation.config-stage2, ...) specified by the environment variable SIMULATION_CONFIGS. You have to set these two environment variables in the ssdistribute script. Furthermore, the Tar/GZip file of the run contains the .ini file for the run.

  • ssrun performs a run on a (usually) remote node. First, it finds all simulation.config-stage* scripts and executes them in alphabetical order. That is, simulation.config-stage1 may overwrite settings of simulation.config-stage0, and so on. After that, it looks for .ini files. For each .ini file, it runs the program specified by the environment variable SIMULATION_PROGRAM. If the variable SIMULATION_LIBS is set, it does not call the binary directly but tells the shared library loader to do so, using the specified set of shared libraries. If everything went well, a status file is created; the existence of this status file means that the run has been successful.

  • Finding out what is going wrong with the remote execution can be difficult sometimes. In such a case, only start a single instance of rspserver and use the parameter -sskeeptempdirs. This parameter results in not deleting the temporary session directory after shutdown of the session. That is, you can dissect the directory's contents for troubleshooting. The name of the directory for each session is shown in the output of rspserver.
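The packaging and staged-configuration idea above can be sketched in miniature; the file names mirror the description, while the file contents are only hypothetical stand-ins:

```shell
# Miniature of the make-environment packing step:
workdir=$(mktemp -d) && cd "$workdir"
printf 'export SIMULATION_PROGRAM=example-simulation\n' > simulation.config-stage0
printf 'export SIMULATION_LIBS=libs\n'                  > simulation.config-stage1
tar cjf simulation-environment.tar.bz2 simulation.config-stage*
# Stage files are listed (and, on the remote node, executed) in alphabetical
# order, so stage1 can overwrite settings made by stage0:
tar tjf simulation-environment.tar.bz2
```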

🔧 Adapting SimProcTC to Your Own Simulation

In order to use SimProcTC with your own model, you essentially only need to write your own simulate-versionX.R script, as described above, and leave the other scripts unmodified.

Before using the RSerPool-based run distribution, first test your simulation on your local machine! This makes finding problems much easier. If everything works, you can continue with run distribution.

🖋️ Citing SimProcTC in Publications

SimProcTC and related BibTeX entries can be found in AllReferences.bib!

