graph_tool.clustering.motif_significance#

graph_tool.clustering.motif_significance(g, k, n_shuffles=100, p=1.0, motif_list=None, threshold=0, self_loops=False, parallel_edges=False, full_output=False, shuffle_model='configuration')[source]#

Obtain the motif significance profile, for subgraphs with k vertices. A tuple with two lists is returned: the list of motifs found, and their respective z-scores.

Parameters:
gGraph

Graph to be used.

kint

Number of vertices of the motifs

n_shufflesint (optional, default: 100)

Number of shuffled networks to consider for the z-score

pfloat or float list (optional, default: 1.0)

Uniform fraction of the motifs to be sampled. If a float list is provided, it will be used as the fraction at each depth \([1,\dots,k]\) in the algorithm. See [wernicke-efficient-2006] for more details.

motif_listlist of Graph objects (optional, default: None)

If supplied, the algorithms will only search for the motifs in this list (isomorphisms)

thresholdint (optional, default: 0)

If a given motif count is below this level, it is not considered.

self_loopsbool (optional, default: False)

Whether or not the shuffled graphs are allowed to contain self-loops

parallel_edgesbool (optional, default: False)

Whether or not the shuffled graphs are allowed to contain parallel edges.

full_outputbool (optional, default: False)

If set to True, three additional lists are returned: the count of each motif, the average count of each motif in the shuffled networks, and the standard deviation of the average count of each motif in the shuffled networks.

shuffle_modelstring (optional, default: “configuration”)

Shuffle model to use. See random_rewire() for details.

Returns:
motifslist of Graph objects

List of motifs of size k found in the Graph. Graphs are grouped according to their isomorphism class, and only one of each class appears in this list. The list is sorted according to in-degree sequence, out-degree-sequence, and number of edges (in this order).

z-scoreslist of floats

The z-score of the respective motives. See below for the definition of the z-score.

See also

motifs

motif counting or sampling

local_clustering

local clustering coefficient

global_clustering

global clustering coefficient

extended_clustering

extended (generalized) clustering coefficient

Notes

The z-score \(z_i\) of motif i is defined as

\[z_i = \frac{N_i - \left<N^s_i\right>} {\sqrt{\left<(N^s_i)^2\right> - \left<N^s_i\right>^2}},\]

where \(N_i\) is the number of times motif i found, and \(N^s_i\) is the count of the same motif but on a shuffled network. It measures how many standard deviations is each motif count, in respect to an ensemble of randomly shuffled graphs with the same degree sequence.

The z-scores values are not normalized.

If enabled during compilation, this algorithm runs in parallel.

References

[wernicke-efficient-2006]

S. Wernicke, “Efficient detection of network motifs”, IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), Volume 3, Issue 4, Pages 347-359, 2006. DOI: 10.1109/TCBB.2006.51 [sci-hub, @tor]

Examples

>>> from numpy import random
>>> random.seed(10)
>>> g = gt.random_graph(100, lambda: (3,3))
>>> motifs, zscores = gt.motif_significance(g, 3)
>>> print(len(motifs))
12
>>> print(zscores)
[0.9356448791848487, 0.8189579649509503, 0.2720861155829026, -0.5086069434054314, -1.091630710611107, -0.8379737597626044, 1.092607157633067, 0.87, 0.89, -0.34, -0.16, -0.01]