graph_tool.clustering.motif_significance#
- graph_tool.clustering.motif_significance(g, k, n_shuffles=100, p=1.0, motif_list=None, threshold=0, self_loops=False, parallel_edges=False, full_output=False, shuffle_model='configuration')[source]#
Obtain the motif significance profile, for subgraphs with k vertices. A tuple with two lists is returned: the list of motifs found, and their respective z-scores.
- Parameters:
- g
Graph
Graph to be used.
- kint
Number of vertices of the motifs
- n_shufflesint (optional, default: 100)
Number of shuffled networks to consider for the z-score
- pfloat or float list (optional, default: 1.0)
Uniform fraction of the motifs to be sampled. If a float list is provided, it will be used as the fraction at each depth \([1,\dots,k]\) in the algorithm. See [wernicke-efficient-2006] for more details.
- motif_listlist of
Graph
objects (optional, default: None) If supplied, the algorithms will only search for the motifs in this list (isomorphisms)
- thresholdint (optional, default: 0)
If a given motif count is below this level, it is not considered.
- self_loopsbool (optional, default: False)
Whether or not the shuffled graphs are allowed to contain self-loops
- parallel_edgesbool (optional, default: False)
Whether or not the shuffled graphs are allowed to contain parallel edges.
- full_outputbool (optional, default: False)
If set to True, three additional lists are returned: the count of each motif, the average count of each motif in the shuffled networks, and the standard deviation of the average count of each motif in the shuffled networks.
- shuffle_modelstring (optional, default: “configuration”)
Shuffle model to use. See
random_rewire()
for details.
- g
- Returns:
- motifslist of
Graph
objects List of motifs of size k found in the Graph. Graphs are grouped according to their isomorphism class, and only one of each class appears in this list. The list is sorted according to in-degree sequence, out-degree-sequence, and number of edges (in this order).
- z-scoreslist of floats
The z-score of the respective motives. See below for the definition of the z-score.
- motifslist of
See also
motifs
motif counting or sampling
local_clustering
local clustering coefficient
global_clustering
global clustering coefficient
extended_clustering
extended (generalized) clustering coefficient
Notes
The z-score \(z_i\) of motif i is defined as
\[z_i = \frac{N_i - \left<N^s_i\right>} {\sqrt{\left<(N^s_i)^2\right> - \left<N^s_i\right>^2}},\]where \(N_i\) is the number of times motif i found, and \(N^s_i\) is the count of the same motif but on a shuffled network. It measures how many standard deviations is each motif count, in respect to an ensemble of randomly shuffled graphs with the same degree sequence.
The z-scores values are not normalized.
If enabled during compilation, this algorithm runs in parallel.
References
[wernicke-efficient-2006]S. Wernicke, “Efficient detection of network motifs”, IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), Volume 3, Issue 4, Pages 347-359, 2006. DOI: 10.1109/TCBB.2006.51 [sci-hub, @tor]
Examples
>>> from numpy import random >>> random.seed(10) >>> g = gt.random_graph(100, lambda: (3,3)) >>> motifs, zscores = gt.motif_significance(g, 3) >>> print(len(motifs)) 11 >>> print(zscores) [-0.002447784231335365, 0.11542965095109282, 0.5455110559583692, 0.4457583812533446, -0.220159937740175, -0.4920799893836054, -0.7529294261980236, 0.9299999999999999, -0.11, -0.44, -0.28]