Writing extensions in C++

It’s possible to extend graph-tool with code written in C++, as we explain in this text. This is useful for project-specific code that is not available in the library, but nevertheless needs to run faster than what is possible with the pure Python API.

Note

In order to write C++ extensions, a general background on C++ with templates, the Boost Graph Library (BGL) and boost.Python is strongly recommended.

A C++ extension to graph-tool consists of a compiled object file that is imported as a Python module, and it contains one or more functions that operate on Graph and PropertyMap objects. We demonstrate with an example that computes k-cores using the algorithm of [batagelj] (this is already implemented in kcore_decomposition(), but serves as a good example).

The first part is a file kcore.hh that defines a function template that implements the desired algorithm:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
#include "graph_tool.hh"

using namespace graph_tool;
using namespace boost;
using namespace std;

// This template function takes a graph 'g' and a vertex property map 'core_map'
// where the core values will be stored. Their exact types are unspecified at
// this point.

template <typename Graph, typename CoreMap>
void kcore_decomposition(Graph& g, CoreMap core_map)
{
    // The vertex index is an internal property map that every graph possesses
    typedef typename property_map<Graph, vertex_index_t>::type
        vertex_index_map_t;
    vertex_index_map_t vertex_index = get(vertex_index_t(), g);

    // Create some auxiliary property maps
    typedef typename vprop_map_t<size_t>::type::unchecked_t vmap_t;

    vmap_t deg(vertex_index, num_vertices(g));  // Remaining degree
    vmap_t pos(vertex_index, num_vertices(g));  // Position in bin (core)

    typedef typename graph_traits<Graph>::vertex_descriptor vertex_t;

    vector<vector<vertex_t>> bins; // Each bin stores the set of vertices of a
                                   // core

    // Put each vertex to the bin corresponding to its degree
    for (auto v : vertices_range(g))
    {
        size_t k = out_degree(v, g);
        deg[v] = k;
        if (k >= bins.size())
            bins.resize(k + 1);
        bins[k].push_back(v);
        pos[v] = bins[k].size() - 1;
    }

    // Proceed from smallest bin to largest. For each vertex in bin, check the
    // neighbours; if any of them have a larger remaining degree, reduce it by
    // one, and put it in the correct bin.
    for (size_t k = 0; k < bins.size(); ++k)
    {
        auto& bins_k = bins[k];
        while (!bins_k.empty())
        {
            auto v = bins_k.back();
            bins_k.pop_back();
            core_map[v] = k;
            for (auto e : out_edges_range(v, g))
            {
                auto u = target(e, g);
                auto& ku = deg[u];
                if (ku > deg[v])
                {
                    auto& bins_ku = bins[ku];
                    auto w = bins_ku.back();
                    auto pos_w = pos[w] = pos[u];
                    bins_ku[pos_w] = w;
                    bins_ku.pop_back();
                    auto& bins_ku_m = bins[ku - 1];
                    bins_ku_m.push_back(u);
                    pos[u] = bins_ku_m.size() - 1;
                    --ku;
                }
            }
        }
    }
}

The second part is a compilation unit kcore.cc that setups the Python-C++ interface:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
#include <boost/python.hpp>
#include "kcore.hh"

// The function below will take the graph that comes from graph-tool (always an
// instance of GraphInterface) and the property map (always an instance of
// boost::any).

void kcore_bind(GraphInterface& gi, boost::any core_map)
{
    // We don't know the actual type of the graph represented by 'gi' and the
    // property map 'core_map'. We need to evoke the appropriate instance of the
    // algorithm at run-time using the gt_dispatch<>() function.
    //
    // The gt_dispatch<>() function takes as first argument the template
    // function object that will be called with the correct types, and the
    // remaining (variadic) arguments are the list of types that will be
    // considered for every argument of the function passed in the first
    // argument. In the case below 'all_graph_views()' represents all possible
    // graph views (directed, undirected, filtered, reversed, etc.) and
    // 'writable_vertex_scalar_properties' represents all vertex property maps
    // with scalar types that are writable (this excludes the vertex index
    // map). If we had more graphs or property maps to pass, we would simply
    // increase the parameter list accordingly.
    //
    // The gt_dispatch<>() function returns an object that needs to be called
    // with the specific objects that should be used for the dispatched call. In
    // this case we extract the actual view from `gi.get_graph_view()` and pass
    // the `core_map`.

    gt_dispatch<>()
        ([&](auto& g, auto core){ kcore_decomposition(g, core); },
         all_graph_views(), writable_vertex_scalar_properties())
        (gi.get_graph_view(), core_map);
};

// The lines below setup a Python module called 'libkcore' that reflects the
// function 'kcore_bind' above as 'kcore' when imported from Python.

BOOST_PYTHON_MODULE(libkcore)
{
    using namespace boost::python;
    def("kcore", kcore_bind);
}

When using the gt_dispatch<>() function, we can make use of the following type lists for the graph views:

List of graph views

Description

all_graph_views

All possible graph views

always_directed

All directed graph views

never_directed

All undirected graph views

always_reversed

All directed but reversed graph views

never_reversed

All undirected, and directed but not reversed graph views

always_directed_never_reversed

All directed but not reversed graph views

never_filtered

All unfiltered graph views

never_filtered_never_reversed

All unfiltered and not reversed graph views

Likewise, for the types of property maps we have:

List of property maps

Description

vertex_properties

All vertex property maps

edge_properties

All edge property maps

writable_vertex_properties

All writable vertex property maps

writable_edge_properties

All writable edge property maps

vertex_scalar_properties

All scalar-valued vertex property maps

edge_scalar_properties

All scalar-valued edge property maps

writable_vertex_scalar_properties

All writable scalar-valued vertex property maps

writable_edge_scalar_properties

All writable scalar-valued edge property maps

vertex_integer_properties

All integer-valued vertex property maps

edge_integer_properties

All integer-valued edge property maps

vertex_floating_properties

All floating-point-valued vertex property maps

edge_floating_properties

All floating-point-valued edge property maps

vertex_scalar_vector_properties

All vertex property maps with vector of scalar types

edge_scalar_vector_properties

All edge property maps with vector of scalar types

vertex_integer_vector_properties

All vertex property maps with vector of integer types

edge_integer_vector_properties

All edge property maps with vector of integer types

vertex_floating_vector_properties

All vertex property maps with vector of floating-point types

edge_floating_vector_properties

All edge property maps with vector of floating-point types

Finally, we need to bind things from the Python side with the file kcore.py:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
import graph_tool

# We import the C++ module (called libkcore.so)
import libkcore

# The function below is what will be used from Python, and dispatch to the the
# C++ module.

def kcore_decomposition(g, core=None):
    if core is None:
        core = g.new_vertex_property("int")

    # For graph objects wee need to pass the internal GraphInterface which is
    # assessed via g._Graph__graph.

    # Property maps need to be passed as boost::any objects, which is done via
    # the _get_any() method.

    libkcore.kcore(g._Graph__graph, core._get_any())
    return core

Naturally, before we can use the extension, we need to compile it. To do this, we need to inform the compiler where to find the graph-tool include files. For that we use the pkg-config infrastructure, by calling pkg-config --cflags --libs graph-tool-py3.7 (in case graph-tool was compiled for Python 3.7). To keep things organized, we create a Makefile:

1
2
3
4
5
6
CXX=g++
CXXFLAGS=-O3 -fopenmp -std=gnu++17 -Wall -fPIC `pkg-config --cflags --libs graph-tool-py3.7` -shared
ALL: libkcore.so

libkcore.so: kcore.hh kcore.cc
	${CXX} ${CXXFLAGS} kcore.cc -o libkcore.so 

(If you use MacOS, you may want to use a different compiler such as clang++.) After compiling by typing make, we can import and use the kcore.py module:

>>> from kcore import kcore_decomposition
>>> g = gt.collection.data["football"]
>>> c = kcore_decomposition(g)
>>> print(c.a)
[8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
 8 8 8 8 8 7 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
 8 8 8 8]

Range-based iteration over vertices and edges

Users familiar with the BGL may have noticed that iteration over vertices and edges was done in a peculiar way in the example above. The standard BGL way of iterating is something like:

auto rg = vertices(g);
for(auto vi = rg.first; vi != rg.second; ++vi)
{
    auto v = *vi;  // get the vertex descriptor
}

In graph-tool iteration can also be done in a more convenient way using range-based iteration:

for(auto v : vertices_range(g))
{
    // v is the vertex descriptor
}

Both approaches above are equivalent. Graph-tool also provides edges_range(), out_edges_range(), in_edges_range(), etc. in an analogous fashion.

Extracting specific property maps

Sometimes it’s more convenient to extract property map of known types directly instead of relying on gt_dispatch<>(). This can be done simply by calling boost::any_cast as follows:

void kcore_bind(GraphInterface& gi, boost::any core_map)
{
    // Vertex property map of type int32_t
    typedef typename vprop_map_t<int32_t>::type vprop_t;
    vprop_t c = boost::any_cast<vprop_t>(core_map);

    gt_dispatch<>()
        ([&](auto& g){ kcore_decomposition(g, c.get_unchecked()); },
         all_graph_views()) (gi.get_graph_view());
};

Checked and unchecked property maps

Property maps in graph-tool can either be ‘checked’ or ‘unchecked’, when they are seen from the C++ side. A checked property map automatically resizes itself if the underlying graph changes (via the addition of nodes or edges). However this involves bounds checking for every lookup, which can result in performance hit. For static graphs, it’s better to use unchecked property maps, which do not perform bounds checking (but do not adapt to a changing graph).

The default property map types are by default checked:

// vprop_t and eprop_t below are checked vertex and edge
// property maps of type int32_t

typedef typename vprop_map_t<int32_t>::type vprop_t;
typedef typename eprop_map_t<int32_t>::type eprop_t;

The type hidden in a boost::any that comes from Python is a checked property map. However the types propagated by gt_dispatch<>() are by default ‘unchecked’. It’s easy to obtain checked and unchecked versions of each kind of property map:

// core_map is of type boost::any

typedef typename vprop_map_t<int32_t>::type vprop_t;
vprop_t c = boost::any_cast<vprop_t>(core_map);

// c is a checked property map. We can obtain an unchecked version
// as follows:

auto uc = c.get_unchecked();   // The type of uc is vprop_map_t<int32_t>::type::unchecked_t

// If the underlying graph has changed size in the mean-time, we
// need to specify the new size when creating the unchecked map:

auto uc2 = c.get_unchecked(10000);

// From a checked property map, we can get the checked type in a similar fashion:

auto c2 = uc.get_checked();

// c and c2 have the same type.

References

batagelj

Vladimir Batagelj, Matjaž Zaveršnik, “Fast algorithms for determining (generalized) core groups in social networks”, Advances in Data Analysis and Classification Volume 5, Issue 2, pp 129-145 (2011), DOI: 10.1007/s11634-010-0079-y [sci-hub, @tor], arXiv: cs/0310049