Skip to content
Victor Buendía edited this page Jul 20, 2020 · 10 revisions

Welcome to the CNetwork wiki! This page has all the information you need to use this class in your project.

Description and installation

CNetwork is a C++ class for network analysis, focused to be used as a structure to run dynamics over networks. The internal structure of CNetwork is a SparseMatrix, a special class able to operate with matrices stored in sparse format. The user should employ the CNet.cpp and CNet_Dir.cpp files in his/her projects.

CNet.cpp includes all the information relative to undirected networks, while CNet_Dir.cpp stores the directed networks. The user should include whatever fits his/her needs. The best is to copy the complete the full content of the cnet folder and then include it in your project as

#include"cnet/CNet_Dir.cpp"

All the dependences should be correct. It is not necessary to compile the files separately. Also, if you include the files with the other header files of your compiler,

#include<cnet/CNet.cpp>

The code has to be compiled for C++11, since it makes heavy use of STL functions.

Network construction

To create undirected networks, nclude the CNet.cpp into your code. To declare a CNetwork object using the constructor, you should use the following syntax:

CNetwork<node_class, edge_class> net(int max_network_size, bool is_weighted)

Take in account that the max_network_size is limited to 106 inside the class. You can tweak this value (MAX_SIZE, which is defined at the beginning of the code) if you need, but take in account the amount of memory needed for this.

The node_class and edge_class are used to indicate a data type for the nodes and edges. They can store any class you want. This is used to run dynamics over the network. If you don't worry about it, you can leave it as CNetwork<> in which case we are creating a CNetwork<void, bool>, i.e. an unweighted network with no information stored at the nodes. There are some shortcuts for the undirected network, that you may use.

CNb - Undirected, unweighted CNetwork storing booleans in the nodes CNd - Stores double in the nodes CNi - Stores ints in the nodes WCN - Undirected, weighted CNetwork that stores no information in nodes WCNd, WCNi, WCNb - Same as above, but storing information in nodes

Directed networks work exactly the same -you simply have to include the CNet_Dir file. There are also some shortcuts to directed network creation, such as WDCNd: weighted, directed network that stores doubles in the nodes.

Pre-built Models

You can create different well-known network models using the class. These are:

net.create_albert_barabasi(int nodes, int m0, int m, unsigned int random_seed);
net.create_configurational(int nodes, int kmin, double gamma, unsigned int random_seed);
net.create_wats_strogatz(int nodes, int regular_connections, double p, unsigned int random_seed);
net.create_erdos_renyi(int nodes, double mean_k, unsigned int random_seed, unsigned int n0, unsigned int nf);
  • For the Albert Barabasi network, n is the number of nodes to create, m0 is the initial fully-connected set of nodes, and m the number of links that a new node will have.
  • For the configuration network, the first argument gives the number of nodes needed, and gamma is the exponent of the scale-free network (which has to be greater than 2).
  • For the Watts-Strogatz model, the number of regular connections is half of the degree of each node in the perfect regular, p=0, network.
  • Finally the Erdos-Renyi network is construction using the average degree and the number of nodes. Apart from the random seed, the user can also provide two indices to generate the random connectivity in a part of the network only (see comment below).

Note that all the methods require a random seed for the random number generator. Using always the same seed will lead to the same network, which is useful for testing. The argument is optional and set by default to 123456789. If you want different networks, be careful to change this number.

For the Erdos-Renyi network two additional arguments can be provided. These are the initial and final indices of nodes where to build the network, so only a sub-graph will present the random connectivity. This is to ease the construction of modular random networks, which is a very common task. The following code

net.create_erdos_renyi(100, 20, 123456789, 0, 40);
net.create_erdos_renyi(100, 7.5, 987654321, 40, 100);

creates a network composed by two disconnected subgraphs, where each one is a different Erdos-Renyi network with different connectivity).

Adding/Removing nodes and links

In order to create nodes, you first want to allocate the space for the new nodes:

net.add_nodes(int number_of_nodes);

Even if you network has a maximum size of one million nodes, the amount of memory used will be only the used by the total number of nodes added and the links between them. You can add links as:

net.add_link(int from, int to);
net.add_link(int from, int to, edge_class w);

where from and to are the indices of the nodes. When you add N nodes to the network, you have nodes labelled as 0,1,2...N-1. You can also put a weight to the link, using the class specified for edges.

Important remark: add_link works for both undirected and directed networks, and it works differently: in the first one, add_link(0,1) creates a link from 0 to 1 and from 1 to 0. In the second, only the link from 0 to 1 is created.

It is also possible to remove nodes and links. The corresponding functions are:

net.remove_node(int index);
net.remove_link(int from, int to);

Both functions returns true if it the node exists, and false otherwise. This is useful in order to check if the link or node was existent or not.

As before, note that the behaviour of remove_link is different in both cases. In a directed network, if 0-1 exists and you try to erase 1-0, the function will return false.

Joining networks

Another way to create a network is by joining several existing graphs. This is particularly useful to create modular networks with heterogeneous architectures for each module. This is done directly via the CNetwork constructor, passing a vector of smaller CNetworks, as

DirectedCNetwork<int, bool> net(); //Initial size 0

vector<DirectedCNetwork<int, bool>> submodules; //To store several networks

//Let's make manually each submodule and then add it to the vector.
DirectedCNetwork<int, bool> subm1(2);
DirectedCNetwork<int, bool> subm2(3);

//Create module 1
subm1.add_nodes(2);
subm1.add_link(0, 1);

//Create module 2
subm2.add_nodes(3);
subm2.add_link(0,1);
subm2.add_link(0,2);

//Add them to the vector
submodules.push_back(subm1);
submodules.push_back(subm2);

//Assign them to the new network, joining them
net = DirectedCNetwork<int, bool>(submodules);

In this example, net will contain 5 nodes. Node 0 will point to 1, while node 2 will point to 3 and 4. Notice that the modules are cleared by default during the join process to avoid big memory allocations. Otherwise, if you had 3 modules taking up 1GB each, during the copy 6GB of RAM are necessary. In this way, there would be just a peak of 4GB (immediately reduced to 3GB again). However, if you still need the individual modules, it can be indicated as

bool erase_modules = false;
net = DirectedCNetwork<int, bool>(submodules, erase_modules);

Finally, another caveat: this constructor is designed to create modular networks in an easy way, but not for making dynamical changes to the networks or remembering your cluster structure. It does not copy the information stored in each node (nor the values or the custom properties). Then, initial conditions for dynamics must be fed after the complete network is built, not on the modules. Any information stored in the modules in lost when the complete network is done.

Exploring the Network

Getting Nodes and Links

The class has some function to compute network properties. For single nodes:

net.degree(int node_index);
net.clustering_coef(int node_index); //Not available in directed networks
net.bread_first_search(int node, vector<int> &node_indices, vector<int> &dist);

First one gives the degree of the node, second is a computation of the clustering coefficient, and last one computes the distance to all other nodes in the network. Distance to node j is in dist[j]. This will have value -1 if the node is not accesible from the source node. In the case of the directed network, the functions in_degree and out_degree are also available.

It is possible to get properties from links, knowing the link index:

net.get_weight(int link_index); //returns edge_class
net.get_link_index(int from, int to);

The function get_link_index will return a -1 if the link does not exist.

It is also possible to get the total number of current elements in the network with net.get_node_count() and net.get_link_count(), respectively.

Analyzing global properties

Usually one wants to know also some global properties of the network, so there functions for computing the average clustering coefficient, as well as analyzing the degree distribution:

net.mean_clustering_coef(); //Not available in directed networks
net.average_pathlenght();
net.component_size(vector<int> &node_in_this_component, vector<int> &size_of_components);
net.degree_distribution(vector<int> &distribution, bool normalized = false);
net.mean_degree();
net.degree_correlation(vector<int> &distribution, vector<double> &correlation, bool normalized = false);

The component_size function is useful to know if an undirected network is connected or not. It will return a vector with a representative node of every component, as well as the size of the component. For example, in a network like:

1-2-3   4-5  6-7-8-9

The algorithm would return something like:

node_in_this_component = {2, 4, 7}
size_of_components = {3,2,4}

If we want to recover a the second component in a full way, we can use the breadth first search function. Take in account that in directed networks, the network may be completely connected, but not nodes are reachable: 1->2<-3->1, for example, makes 2 an isolated component, but the full network is reachable from 3.

The degree distribution function returns a vector, whose element j contains the number of nodes with degree j (or the probability to find one of these nodes, if normalized is set to true). The degree correlation function has to compute this distribution also. Calling degree_correlation(distribution) is then equivalent (but slower) to make distribution = degree_distribution(). If you need both vectors, calling only the correlation function will save time. In addition to that, in directed networks one can get the in-degree or out-degree distribution, as:

net.degree_distribution(dist, DirectedCNetwork::IN_DEGREE, true); //In_degree distribution
net.degree_distribution(dist, DirectedCNetwork::OUT_DEGREE, true);
net.degree_distribution(dist, DirectedCNetwork::TOTAL_DEGREE, true); //Total degree of the network

Working with neighbours

A very common task is to get neighbours of a node. In the most general case, with a directed network, this is done with the following functions:

net.get_neighs_out(int node_index);
net.get_neighs_in(int node_index);

net.get_out(int node_index, int k);
net.get_in((int node_index, int k))

There are two functions, in order to get neighbours I am pointing to (out), or neighbours that point to me (int). First pair of functions return a vector filled with the neighbours' indices of node_index to work with them. The second one gives the k-th neighbour of the node_index. Since a node has a number of neighbours equal to its degree, a common way to access and work with this nodes is

for (i=0; i < net.degree(node_index); i++)
{
       cout << "The " << i << "-th neighbour of " << node_index << " is " << get_out(node_index, i) << endl;
}

The order of get_out is only O(1) since CNetwork architecture stores the neighbours of every node. For undirected network, functions with in and out do exactly the same. Even when it makes no sense to differentiate both in an undirected network, this is provided in order to port easily algorithms developed with directed networks to undirected ones.

Dynamics

To run dynamics over the network, it is often useful to store some kind of information in the nodes or links. This is provided by the template type. For example,

CNetwork<int> net_i(1000); //Each node has an int property, each link is 0 or 1 -connected or not-
CNetwork<vector<double> > net_v(1000); //Every node has a vector<double> property

If you don't want to specify a type you can use CNetwork<>, which is equivalent to the void type. Also, it is possible to feed a pair of types to the declaration in order to set also a custom type for the links:

CNetwork<int, double> net_i(1000, 1.0); //Each node has an int property. The network is weighted

Any type, custom struct or class is adequate to be stored at the nodes, which makes CNetwork adequate to integrate easily systems of coupled dynamical systems in a network. For example, a network of coupled oscillators could be done as

//Create your own struct or class with the information needed for the system of ODEs that characterize this unit
struct dynamical_unit 
{
    double phase = 0.0;
    double frequency = 1.0;
    bool is_active = false;
};

//Creating a weighted network composed of dynamical units
CNetwork<dynamical_unit, double> net(1000, 1.0); 

bool is_myunit_active = net[42].is_active; //false, due to default constructor

In the case of the edges, the property associated can be edited using the functions get_weight(link_index) and set_weight(link_index). Note that the index used is associated directly with the link, which can be retrieved with get_link_index(int from, int to).

Important: on the other hand, when setting a custom type for the links, take in account that CNetwork has a SparseMatrix implementation under the hood in order to compute, e.g. network eigenvalues. For this reason, the types bool, int, float, double or complex are the only ones encouraged. If you want to use your own type, you would have to ensure yourself that is compatible with the SparseMatrix implementation. In principle, it should enough to overload the multiplication operator (*) but correct work is not ensured in this case.

These properties will not be exported with the network when using the write_mtx or write_graphml functions, and it is implemented purely for the dynamics.

Import and Export

Finally, it is important to be able to export and import data. To import data, this should be saved in MTX format. This is the one employed at the network repository. It is defined as:

%Comments of first lines
%Then we write: number of nodes, number of nodes, number of links
%And after that we write: from, to, weight (optional)
100 100 3
0 1 0.23
0 5 0.56
16 58 0.89

This format was created to represent a general NxM sparse matrix. Although most programs don't have a MTX exporter, there are many similar formats which write data as from-to-weight or simply from-to. Simply eliminating commas and adding the number of nodes and links will be enough.

Then we can do:

CNetwork<> net(100);
net.read_mtx("my_network.mtx");

in order to import all the data in the CNetwork class. In addition to that, it is also possible to export:

net.write_graphml("graph_export", labels);
net.write_mtx("mtx_export");

where labels is a string vector which is used to store additional information. In particular, labels are an alternative name for the node, instead of the index.

Adding properties to the net

In CNetwork is possible to add additional properties to the nodes and links. This properties are intended to be useful for the visualization and export of the data of the graph, in contrast with the node values explained in the Dynamics section.

A new property can be created like this:

net.define_property("double_prop", "double", true);
net.define_property("int_prop", "int", true);
net.define_property("bool_prop", "bool", true);
net.define_property("string_prop", "string", true);

where only this four kinds of properties are the only ones allowed. First argument is the name of the property, and last one is true if it is a property for nodes. If false, it will be assigned to edges.

Take in account that you have to create the properties after the creation of the network. Any change of the network topology (such as adding a node or removing a link) will result in errors when dealing with the properties. For this reason, properties are not suitable to do network dynamics, since if the network changes they will not be able to take it in account.

You can easily set the values for the properties:

for (i=0; i < net.get_node_count(); i++)
{
    net.set_value("double_prop", i, 0.02*i);
    net.set_value("int_prop", i, i);
    net.set_value("bool_prop", i, bool(i%2));
    net.set_value("string_prop", i, (string)"hola");
}

where the first argument is the name of an existing property, next, the index of the node, and then the property. In the case of string variables it is very important to add the cast in order to avoid confusion with the boolean version of the function. It is possible to retrieve the values. Since different return types cannot be overloaded in C++, the name of the getters is slightly different for each type:

net.get_value_d("double_prop", i);
net.get_value_i("int_prop", i);
net.get_value_b("bool_prop", i);
net.get_value_s("string_prop", i);

All the defined properties will be exported to the Graphml format when the method write_graphml is called. You can use them to to export some information.