Writing extensions in C++#
It’s possible to extend graph-tool with code written in C++, as we explain in this text. This is useful for project-specific code that is not available in the library, but nevertheless needs to run faster than what is possible with the pure Python API.
Note
In order to write C++ extensions, a general background on C++ with templates, the Boost Graph Library (BGL) and boost.Python is strongly recommended.
A C++ extension to graph-tool consists of a compiled object file that is
imported as a Python module, and it contains one or more functions that
operate on Graph
and
PropertyMap
objects. We demonstrate with an example
that computes k-cores using the
algorithm of [batagelj] (this is already implemented in
kcore_decomposition()
, but serves as a good
example).
The first part is a file kcore.hh
that defines a
function template that implements the desired algorithm:
1#include "graph_tool.hh"
2
3using namespace graph_tool;
4using namespace boost;
5using namespace std;
6
7// This template function takes a graph 'g' and a vertex property map 'core_map'
8// where the core values will be stored. Their exact types are unspecified at
9// this point.
10
11template <typename Graph, typename CoreMap>
12void kcore_decomposition(Graph& g, CoreMap core_map)
13{
14 // Create some auxiliary property maps
15 typedef typename vprop_map_t<size_t>::unchecked_t vmap_t;
16
17 vmap_t deg(num_vertices(g)); // Remaining degree
18 vmap_t pos(num_vertices(g)); // Position in bin (core)
19
20 typedef typename graph_traits<Graph>::vertex_descriptor vertex_t;
21
22 vector<vector<vertex_t>> bins; // Each bin stores the set of vertices of a
23 // core
24
25 // Put each vertex to the bin corresponding to its degree
26 for (auto v : vertices_range(g))
27 {
28 size_t k = out_degree(v, g);
29 deg[v] = k;
30 if (k >= bins.size())
31 bins.resize(k + 1);
32 bins[k].push_back(v);
33 pos[v] = bins[k].size() - 1;
34 }
35
36 // Proceed from smallest bin to largest. For each vertex in bin, check the
37 // neighbours; if any of them have a larger remaining degree, reduce it by
38 // one, and put it in the correct bin.
39 for (size_t k = 0; k < bins.size(); ++k)
40 {
41 auto& bins_k = bins[k];
42 while (!bins_k.empty())
43 {
44 auto v = bins_k.back();
45 bins_k.pop_back();
46 core_map[v] = k;
47 for (auto e : out_edges_range(v, g))
48 {
49 auto u = target(e, g);
50 auto& ku = deg[u];
51 if (ku > deg[v])
52 {
53 auto& bins_ku = bins[ku];
54 auto w = bins_ku.back();
55 auto pos_w = pos[w] = pos[u];
56 bins_ku[pos_w] = w;
57 bins_ku.pop_back();
58 auto& bins_ku_m = bins[ku - 1];
59 bins_ku_m.push_back(u);
60 pos[u] = bins_ku_m.size() - 1;
61 --ku;
62 }
63 }
64 }
65 }
66}
The second part is a compilation unit kcore.cc
that sets up the Python-C++ interface:
1#include <boost/python.hpp>
2#include "kcore.hh"
3
4// The function below will take the graph that comes from graph-tool (always an
5// instance of GraphInterface) and the property map (always an instance of
6// boost::any).
7
8void kcore_bind(GraphInterface& gi, std::any core_map)
9{
10 // We don't know the actual type of the graph represented by 'gi' and the
11 // property map 'core_map'. We need to evoke the appropriate instance of the
12 // algorithm at run-time using the gt_dispatch<>() function.
13 //
14 // The gt_dispatch<>() function takes as first argument the template
15 // function object that will be called with the correct types, and the
16 // remaining (variadic) arguments are the list of types that will be
17 // considered for every argument of the function passed in the first
18 // argument. In the case below 'all_graph_views()' represents all possible
19 // graph views (directed, undirected, filtered, reversed, etc.) and
20 // 'writable_vertex_scalar_properties' represents all vertex property maps
21 // with scalar types that are writable (this excludes the vertex index
22 // map). If we had more graphs or property maps to pass, we would simply
23 // increase the parameter list accordingly.
24 //
25 // The gt_dispatch<>() function returns an object that needs to be called
26 // with the specific objects that should be used for the dispatched call. In
27 // this case we extract the actual view from `gi.get_graph_view()` and pass
28 // the `core_map`.
29
30 gt_dispatch<>()
31 ([&](auto& g, auto core){ kcore_decomposition(g, core); },
32 all_graph_views, writable_vertex_scalar_properties)
33 (gi.get_graph_view(), core_map);
34};
35
36// The lines below setup a Python module called 'libkcore' that reflects the
37// function 'kcore_bind' above as 'kcore' when imported from Python.
38
39BOOST_PYTHON_MODULE(libkcore)
40{
41 using namespace boost::python;
42 def("kcore", kcore_bind);
43}
Releasing the Global Interpreter Lock (GIL)
The function gt_dispatch<>()
will automatically release Python’s GIL before the code
dispatch, so that multiple algorithms can run in parallel (see Sec.
Parallel algorithms). If you wish to use the Python interpreter from
inside your dispatch you need first to acquire the GIL, or alternatively
disable the GIL release by calling instead gt_dispatch<>(false)
.
When using the gt_dispatch<>()
function, we can make use of the following
hana type lists for the graph views:
List of graph views |
Description |
---|---|
|
All possible graph views |
|
All directed graph views |
|
All undirected graph views |
|
All directed but reversed graph views |
|
All undirected, and directed but not reversed graph views |
|
All directed but not reversed graph views |
|
All unfiltered graph views |
|
All unfiltered and not reversed graph views |
Likewise, for the types of property maps we have:
List of property maps |
Description |
---|---|
|
All vertex property maps |
|
All edge property maps |
|
All writable vertex property maps |
|
All writable edge property maps |
|
All scalar-valued vertex property maps |
|
All scalar-valued edge property maps |
|
All writable scalar-valued vertex property maps |
|
All writable scalar-valued edge property maps |
|
All integer-valued vertex property maps |
|
All integer-valued edge property maps |
|
All floating-point-valued vertex property maps |
|
All floating-point-valued edge property maps |
|
All vertex property maps with vector of scalar types |
|
All edge property maps with vector of scalar types |
|
All vertex property maps with vector of integer types |
|
All edge property maps with vector of integer types |
|
All vertex property maps with vector of floating-point types |
|
All edge property maps with vector of floating-point types |
Finally, we need to bind things from the Python side with the file
kcore.py
:
1import graph_tool
2
3# We import the C++ module (called libkcore.so)
4import libkcore
5
6# The function below is what will be used from Python, and dispatch to the the
7# C++ module.
8
9def kcore_decomposition(g, core=None):
10 if core is None:
11 core = g.new_vertex_property("int")
12
13 # For graph objects wee need to pass the internal GraphInterface which is
14 # assessed via g._Graph__graph.
15
16 # Property maps need to be passed as boost::any objects, which is done via
17 # the _get_any() method.
18
19 libkcore.kcore(g._Graph__graph, core._get_any())
20 return core
Naturally, before we can use the extension, we need to compile it. To do
this, we need to inform the compiler where to find the graph-tool
include files. For that we use the pkg-config
infrastructure, by calling pkg-config --cflags --libs graph-tool-py3.7
(in case graph-tool was compiled for Python 3.7). To keep things
organized, we create a Makefile
:
1PY3DOTVERSION=$(shell python -c "import sys; print(sys.version_info[0],sys.version_info[1],sep='.')")
2CXX=g++
3CXXFLAGS=-O3 -fopenmp -std=gnu++17 -Wall -fPIC `pkg-config --cflags --libs graph-tool-py$(PY3DOTVERSION)` -shared
4ALL: libkcore.so
5
6libkcore.so: kcore.hh kcore.cc
7 ${CXX} ${CXXFLAGS} kcore.cc -o libkcore.so
(If you use MacOS, you may want to use a different compiler such as
clang++
.) After compiling by typing make
, we can import and use
the kcore.py
module:
>>> from kcore import kcore_decomposition
>>> g = gt.collection.data["football"]
>>> c = kcore_decomposition(g)
>>> print(c.a)
[8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
8 8 8 8 8 7 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
8 8 8 8]
Range-based iteration over vertices and edges#
Users familiar with the BGL may have noticed that iteration over vertices and edges was done in a peculiar way in the example above. The standard BGL way of iterating is something like:
auto rg = vertices(g);
for (auto vi = rg.first; vi != rg.second; ++vi)
{
auto v = *vi; // get the vertex descriptor
}
In graph-tool iteration can also be done in a more convenient way using range-based iteration:
for (auto v : vertices_range(g))
{
// v is the vertex descriptor
}
Both approaches above are equivalent. Graph-tool also provides
edges_range()
, out_edges_range()
, in_edges_range()
, etc., in
an analogous fashion.
Extracting specific property maps#
Sometimes it’s more convenient to extract property map of known types
directly instead of relying on gt_dispatch<>()
. This can be done
simply by calling std::any_cast
as follows:
void kcore_bind(GraphInterface& gi, std::any core_map)
{
// Vertex property map of type int32_t
typedef typename vprop_map_t<int32_t>::type vprop_t;
vprop_t c = std::any_cast<vprop_t>(core_map);
gt_dispatch<>()
([&](auto& g){ kcore_decomposition(g, c.get_unchecked()); },
all_graph_views)(gi.get_graph_view());
};
Checked and unchecked property maps#
Property maps in graph-tool can either be ‘checked’ or ‘unchecked’, when they are seen from the C++ side. A checked property map automatically resizes itself if the underlying graph changes (via the addition of nodes or edges). However this involves bounds checking for every lookup, which can result in performance hit. For static graphs, it’s better to use unchecked property maps, which do not perform bounds checking (but do not adapt to a changing graph).
The default property map types are by default checked:
// vprop_t and eprop_t below are checked vertex and edge
// property maps of type int32_t
typedef vprop_map_t<int32_t> vprop_t;
typedef eprop_map_t<int32_t> eprop_t;
The type hidden in a std::any
that comes from Python is a checked
property map. However, the types propagated by gt_dispatch<>()
are by default ‘unchecked’. It’s easy to obtain checked and unchecked
versions of each kind of property map:
Note
The member functions get_checked()
and get_unchecked()
perform
nothing more than a reinterpret_cast, and
therefore are very cheap operations, incurring no run-time performance
overhead when called.
// core_map is of type std::any
typedef vprop_map_t<int32_t> vprop_t;
vprop_t c = std::any_cast<vprop_t>(core_map);
// c is a checked property map. We can obtain an unchecked version
// as follows:
auto& uc = c.get_unchecked(); // The type of uc is vprop_map_t<int32_t>::unchecked_t
// If the underlying graph has changed size in the mean-time, we
// need to specify the new size when creating the unchecked map:
auto& uc2 = c.get_unchecked(10000);
// From a checked property map, we can get the checked type in a similar fashion:
auto& c2 = uc.get_checked();
// c and c2 have the same type.
References#
Vladimir Batagelj, Matjaž Zaveršnik, “Fast algorithms for determining (generalized) core groups in social networks”, Advances in Data Analysis and Classification Volume 5, Issue 2, pp 129-145 (2011), DOI: 10.1007/s11634-010-0079-y [sci-hub, @tor], arXiv: cs/0310049