Writing extensions in C++#

It’s possible to extend graph-tool with code written in C++, as we explain in this text. This is useful for project-specific code that is not available in the library, but nevertheless needs to run faster than what is possible with the pure Python API.

Note

In order to write C++ extensions, a general background on C++ with templates, the Boost Graph Library (BGL) and boost.Python is strongly recommended.

A C++ extension to graph-tool consists of a compiled object file that is imported as a Python module, and it contains one or more functions that operate on Graph and PropertyMap objects. We demonstrate with an example that computes k-cores using the algorithm of [1] (this is already implemented in kcore_decomposition(), but serves as a good example).

The first part is a file kcore.hh that defines a function template that implements the desired algorithm:

#include "graph_tool.hh"

using namespace graph_tool;
using namespace boost;
using namespace std;

// This template function takes a graph 'g' and a vertex property map 'core_map'
// where the core values will be stored. Their exact types are unspecified at
// this point.

template <typename Graph, typename CoreMap>
void kcore_decomposition(Graph& g, CoreMap core_map)
{
    // Create some auxiliary property maps
    typedef typename vprop_map_t<size_t>::unchecked_t vmap_t;

    vmap_t deg(num_vertices(g));  // Remaining degree
    vmap_t pos(num_vertices(g));  // Position in bin (core)

    typedef typename graph_traits<Graph>::vertex_descriptor vertex_t;

    vector<vector<vertex_t>> bins; // Each bin stores the set of vertices of a
                                   // core

    // Put each vertex to the bin corresponding to its degree
    for (auto v : vertices_range(g))
    {
        size_t k = out_degree(v, g);
        deg[v] = k;
        if (k >= bins.size())
            bins.resize(k + 1);
        bins[k].push_back(v);
        pos[v] = bins[k].size() - 1;
    }

    // Proceed from smallest bin to largest. For each vertex in bin, check the
    // neighbours; if any of them have a larger remaining degree, reduce it by
    // one, and put it in the correct bin.
    for (size_t k = 0; k < bins.size(); ++k)
    {
        auto& bins_k = bins[k];
        while (!bins_k.empty())
        {
            auto v = bins_k.back();
            bins_k.pop_back();
            core_map[v] = k;
            for (auto e : out_edges_range(v, g))
            {
                auto u = target(e, g);
                auto& ku = deg[u];
                if (ku > deg[v])
                {
                    auto& bins_ku = bins[ku];
                    auto w = bins_ku.back();
                    auto pos_w = pos[w] = pos[u];
                    bins_ku[pos_w] = w;
                    bins_ku.pop_back();
                    auto& bins_ku_m = bins[ku - 1];
                    bins_ku_m.push_back(u);
                    pos[u] = bins_ku_m.size() - 1;
                    --ku;
                }
            }
        }
    }
}

The second part is a compilation unit kcore.cc that sets up the Python-C++ interface:

#include <boost/python.hpp>
#include "kcore.hh"

// The function below will take the graph that comes from graph-tool (always an
// instance of GraphInterface) and the property map (always an instance of
// boost::any).

void kcore_bind(GraphInterface& gi, std::any core_map)
{
    // We don't know the actual type of the graph represented by 'gi' and the
    // property map 'core_map'. We need to evoke the appropriate instance of the
    // algorithm at run-time using the gt_dispatch<>() function.
    //
    // The gt_dispatch<>() function takes as first argument the template
    // function object that will be called with the correct types, and the
    // remaining (variadic) arguments are the list of types that will be
    // considered for every argument of the function passed in the first
    // argument. In the case below 'all_graph_views()' represents all possible
    // graph views (directed, undirected, filtered, reversed, etc.) and
    // 'writable_vertex_scalar_properties' represents all vertex property maps
    // with scalar types that are writable (this excludes the vertex index
    // map). If we had more graphs or property maps to pass, we would simply
    // increase the parameter list accordingly.
    //
    // The gt_dispatch<>() function returns an object that needs to be called
    // with the specific objects that should be used for the dispatched call. In
    // this case we extract the actual view from `gi.get_graph_view()` and pass
    // the `core_map`.

    gt_dispatch<>()
        ([&](auto& g, auto core){ kcore_decomposition(g, core); },
         all_graph_views, writable_vertex_scalar_properties)
        (gi.get_graph_view(), core_map);
};

// The lines below setup a Python module called 'libkcore' that reflects the
// function 'kcore_bind' above as 'kcore' when imported from Python.

BOOST_PYTHON_MODULE(libkcore)
{
    using namespace boost::python;
    def("kcore", kcore_bind);
}

Releasing the Global Interpreter Lock (GIL)

The function gt_dispatch<>() will automatically release Python’s GIL before the code dispatch, so that multiple algorithms can run in parallel (see Sec. Parallel algorithms). If you wish to use the Python interpreter from inside your dispatch you need first to acquire the GIL, or alternatively disable the GIL release by calling instead gt_dispatch<>(false).

When using the gt_dispatch<>() function, we can make use of the following hana type lists for the graph views:

List of graph views	Description
`all_graph_views`	All possible graph views
`always_directed`	All directed graph views
`never_directed`	All undirected graph views
`always_reversed`	All directed but reversed graph views
`never_reversed`	All undirected and directed but not reversed graph views
`always_directed_never_reversed`	All directed but not reversed graph views
`never_filtered`	All unfiltered graph views
`never_filtered_never_reversed`	All unfiltered and not reversed graph views

Likewise, for the types of property maps we have:

List of property maps	Description
`vertex_properties`	All vertex property maps
`edge_properties`	All edge property maps
`writable_vertex_properties`	All writable vertex property maps
`writable_edge_properties`	All writable edge property maps
`vertex_scalar_properties`	All scalar-valued vertex property maps
`edge_scalar_properties`	All scalar-valued edge property maps
`writable_vertex_scalar_properties`	All writable scalar-valued vertex property maps
`writable_edge_scalar_properties`	All writable scalar-valued edge property maps
`vertex_integer_properties`	All integer-valued vertex property maps
`edge_integer_properties`	All integer-valued edge property maps
`vertex_floating_properties`	All floating-point-valued vertex property maps
`edge_floating_properties`	All floating-point-valued edge property maps
`vertex_scalar_vector_properties`	All vertex property maps with vector of scalar types
`edge_scalar_vector_properties`	All edge property maps with vector of scalar types
`vertex_integer_vector_properties`	All vertex property maps with vector of integer types
`edge_integer_vector_properties`	All edge property maps with vector of integer types
`vertex_floating_vector_properties`	All vertex property maps with vector of floating-point types
`edge_floating_vector_properties`	All edge property maps with vector of floating-point types

Finally, we need to bind things from the Python side with the file kcore.py:

import graph_tool

# We import the C++ module (called libkcore.so)
import libkcore

# The function below is what will be used from Python, and dispatch to the the
# C++ module.

def kcore_decomposition(g, core=None):
    if core is None:
        core = g.new_vertex_property("int")

    # For graph objects wee need to pass the internal GraphInterface which is
    # assessed via g._Graph__graph.

    # Property maps need to be passed as boost::any objects, which is done via
    # the _get_any() method.

    libkcore.kcore(g._Graph__graph, core._get_any())
    return core

Naturally, before we can use the extension, we need to compile it. To do this, we need to inform the compiler where to find the graph-tool include files. For that we use the pkg-config infrastructure, by calling pkg-config --cflags --libs graph-tool-py (in case graph-tool was compiled for Python 3.7). To keep things organized, we create a Makefile:

CXX=g++
CXXFLAGS=-O3 -fopenmp -std=gnu++17 -Wall -fPIC `pkg-config --cflags --libs graph-tool-py` -shared -DBOOST_ALLOW_DEPRECATED_HEADERS
ALL: libkcore.so

libkcore.so: kcore.hh kcore.cc
	${CXX} ${CXXFLAGS} kcore.cc -o libkcore.so

clean:
	rm -f libkcore.so

(If you use MacOS, you may want to use a different compiler such as clang++.) After compiling by typing make, we can import and use the kcore.py module:

>>> from kcore import kcore_decomposition
>>> g = gt.collection.data["football"]
>>> c = kcore_decomposition(g)
>>> print(c.a)
[8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
 8 8 8 8 8 7 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
 8 8 8 8]

Range-based iteration over vertices and edges#

Users familiar with the BGL may have noticed that iteration over vertices and edges was done in a peculiar way in the example above. The standard BGL way of iterating is something like:

auto rg = vertices(g);
for (auto vi = rg.first; vi != rg.second; ++vi)
{
    auto v = *vi;  // get the vertex descriptor
}

In graph-tool iteration can also be done in a more convenient way using range-based iteration:

for (auto v : vertices_range(g))
{
    // v is the vertex descriptor
}

Both approaches above are equivalent. Graph-tool also provides edges_range(), out_edges_range(), in_edges_range(), etc., in an analogous fashion.

Extracting specific property maps#

Sometimes it’s more convenient to extract property map of known types directly instead of relying on gt_dispatch<>(). This can be done simply by calling std::any_cast as follows:

void kcore_bind(GraphInterface& gi, std::any core_map)
{
    // Vertex property map of type int32_t
    typedef typename vprop_map_t<int32_t>::type vprop_t;
    vprop_t c = std::any_cast<vprop_t>(core_map);

    gt_dispatch<>()
        ([&](auto& g){ kcore_decomposition(g, c.get_unchecked()); },
         all_graph_views)(gi.get_graph_view());
};

Checked and unchecked property maps#

Property maps in graph-tool can either be ‘checked’ or ‘unchecked’, when they are seen from the C++ side. A checked property map automatically resizes itself if the underlying graph changes (via the addition of nodes or edges). However this involves bounds checking for every lookup, which can result in performance hit. For static graphs, it’s better to use unchecked property maps, which do not perform bounds checking (but do not adapt to a changing graph).

The default property map types are by default checked:

// vprop_t and eprop_t below are checked vertex and edge
// property maps of type int32_t

typedef vprop_map_t<int32_t> vprop_t;
typedef eprop_map_t<int32_t> eprop_t;

The type hidden in a std::any that comes from Python is a checked property map. However, the types propagated by gt_dispatch<>() are by default ‘unchecked’. It’s easy to obtain checked and unchecked versions of each kind of property map:

Note

The member functions get_checked() and get_unchecked() perform nothing more than a reinterpret_cast, and therefore are very cheap operations, incurring no run-time performance overhead when called.

// core_map is of type std::any

typedef vprop_map_t<int32_t> vprop_t;
vprop_t c = std::any_cast<vprop_t>(core_map);

// c is a checked property map. We can obtain an unchecked version
// as follows:

auto& uc = c.get_unchecked();   // The type of uc is vprop_map_t<int32_t>::unchecked_t

// If the underlying graph has changed size in the mean-time, we
// need to specify the new size when creating the unchecked map:

auto& uc2 = c.get_unchecked(10000);

// From a checked property map, we can get the checked type in a similar fashion:

auto& c2 = uc.get_checked();

// c and c2 have the same type.

References#

[1]

Vladimir Batagelj and Matjaž Zaveršnik. Fast algorithms for determining (generalized) core groups in social networks. Advances in Data Analysis and Classification, 5(2):129–145, July 2011. doi:10.1007/s11634-010-0079-y, [sci-hub].