Parallel algorithms#
graph-tool
has support for shared memory parallelization via OpenMP for many algorithms, as is indicated in
their docstring.
OpenMP support is optional, and must be enabled during compilation. To check
whether it is available, the function openmp_enabled()
can be
used.
- graph_tool.openmp_enabled()[source]
Return
True
if OpenMP was enabled during compilation.
Several parallelization parameters can be controlled at runtime, including the number of threads, the work sharing schedule, and the minimum number of nodes required for parallelization to be enabled.
- graph_tool.openmp_get_num_threads()[source]
Return the number of OpenMP threads.
- graph_tool.openmp_set_num_threads(n)[source]
Set the number of OpenMP threads.
- graph_tool.openmp_get_schedule()[source]
Return the runtime OpenMP schedule and chunk size. The schedule can by any of:
"static"
,"dynamic"
,"guided"
,"auto"
.
- graph_tool.openmp_set_schedule(schedule, chunk=0)[source]
Set the runtime OpenMP schedule and chunk size. The schedule can by any of:
"static"
,"dynamic"
,"guided"
,"auto"
.
- graph_tool.openmp_get_thresh()[source]
Return the minimum number of vertices necessary to enable parallelization.
- graph_tool.openmp_set_thresh(n)[source]
Set the the minimum number of vertices necessary to enable parallelization.
It’s possible to set these parameter temporarily using a context manager:
- graph_tool.openmp_context(nthreads=None, schedule=None, chunk=0, thresh=None)[source]
Return a context manager that sets the tuntime OpenMP parameters, and restores the original values when exited.
For example, to constrain temporarily the number of threads to 3 and use a “guided” scheduling one could do:
>>> g = gt.collection.data["polblogs"]
>>> with gt.openmp_context(nthreads=3, schedule="guided"):
... ret = gt.pagerank(g)
Parallelization can be disabled altogether in the same way, but using nthreads=1
.
Warning
By default, graph-tool
will configure the OpenMP wait policy to
“passive”, since this usually results in improved performance for the
majority of algorithms.
In order for change this behavor, the following environment variable should be set before the Python interpreted is evoked:
export OMP_WAIT_POLICY=active
or alternatively from Python before graph-tool is imported:
import os
os.environ["OMP_WAIT_POLICY"] = "active"
Due to an OpenMP API limitation, this can no longer be changed after graph-tool has been imported.
The global interpreter lock (GIL)#
graph-tool
releases Python’s GIL as soon as the C++
implementations are reached, even for algorithms that are not implemented in
parallel with OpenMP. This means that Python’s threading
functionally can
be used with many functions to achieve parallelism. For example, the following
code will run several calls of subgraph_isomorphism()
in parallel using 16 threads, which each individual call running sequentially:
>>> from concurrent.futures import ThreadPoolExecutor
>>> g = gt.collection.data["netscience"]
>>> def find_sub():
... u = gt.random_graph(11, lambda: 4, directed=False, model="erdos")
... gt.subgraph_isomorphism(u, g, max_n=100)
>>> with ThreadPoolExecutor(max_workers=16) as executor:
... futures = [executor.submit(find_sub) for i in range(16)]
... for future in futures:
... future.result()
The same kind of functionality can be achieved with multiprocessing
with
a nearly identical interface, but the above offers smaller overhead, since no
inter-process communication is needed.