motif_significance#

graph_tool.clustering.motif_significance(g, k, n_shuffles=100, p=1.0, motif_list=None, threshold=0, self_loops=False, parallel_edges=False, full_output=False, shuffle_model='configuration')[source]#

Obtain the motif significance profile, for subgraphs with k vertices. A tuple with two lists is returned: the list of motifs found, and their respective z-scores.

Parameters:

gGraph: Graph to be used.
kint: Number of vertices of the motifs
n_shufflesint (optional, default: 100): Number of shuffled networks to consider for the z-score
pfloat or float list (optional, default: 1.0): Uniform fraction of the motifs to be sampled. If a float list is provided, it will be used as the fraction at each depth \([1,\dots,k]\) in the algorithm. See [wernicke-efficient-2006] for more details.
motif_listlist of Graph objects (optional, default: None): If supplied, the algorithms will only search for the motifs in this list (isomorphisms)
thresholdint (optional, default: 0): If a given motif count is below this level, it is not considered.
self_loopsbool (optional, default: False): Whether or not the shuffled graphs are allowed to contain self-loops
parallel_edgesbool (optional, default: False): Whether or not the shuffled graphs are allowed to contain parallel edges.
full_outputbool (optional, default: False): If set to True, three additional lists are returned: the count of each motif, the average count of each motif in the shuffled networks, and the standard deviation of the average count of each motif in the shuffled networks.
shuffle_modelstring (optional, default: “configuration”): Shuffle model to use. See random_rewire() for details.

Returns:

motifslist of Graph objects: List of motifs of size k found in the Graph. Graphs are grouped according to their isomorphism class, and only one of each class appears in this list. The list is sorted according to in-degree sequence, out-degree-sequence, and number of edges (in this order).
z-scoreslist of floats: The z-score of the respective motives. See below for the definition of the z-score.

See also

motifs: motif counting or sampling
local_clustering: local clustering coefficient
global_clustering: global clustering coefficient
extended_clustering: extended (generalized) clustering coefficient

Notes

The z-score \(z_i\) of motif i is defined as

\[z_i = \frac{N_i - \left<N^s_i\right>} {\sqrt{\left<(N^s_i)^2\right> - \left<N^s_i\right>^2}},\]

where \(N_i\) is the number of times motif i found, and \(N^s_i\) is the count of the same motif but on a shuffled network. It measures how many standard deviations is each motif count, in respect to an ensemble of randomly shuffled graphs with the same degree sequence.

The z-score values are not normalized.

Parallel implementation.

If enabled during compilation, this algorithm will run in parallel using OpenMP. See the parallel algorithms section for information about how to control several aspects of parallelization.

References

[wernicke-efficient-2006]

S. Wernicke, “Efficient detection of network motifs”, IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), Volume 3, Issue 4, Pages 347-359, 2006. DOI: 10.1109/TCBB.2006.51 [sci-hub, @tor]

Examples

>>> from numpy import random
>>> random.seed(10)
>>> g = gt.random_graph(100, lambda: (3,3))
>>> motifs, zscores = gt.motif_significance(g, 3)
>>> print(len(motifs))
11
>>> print(zscores)
[0.6047878414563249, 0.6094680990819686, 1.0229305974010572, 0.39872798688563355, -0.6865659282127776, -0.7003194037662325, -1.1702999575943225, -0.15, -0.15, -0.29, -0.15]

motif_significance

Contents

motif_significance#