graph_tool.inference.nested_partition_overlap#

graph_tool.inference.nested_partition_overlap(x, y, norm=True)[source]#

Returns the hierarchical maximum overlap between nested partitions, according to an optimal recursive label alignment.

Parameters:
xiterable of iterables of int values

First partition.

yiterable of iterables of int values

Second partition.

norm(optional, default: True)

If True, the result will be normalized in the range \([0,1]\).

Returns:
wfloat or int

Maximum hierarchical overlap value.

Notes

The maximum overlap between partitions \(\bar{\boldsymbol x}\) and \(\bar{\boldsymbol y}\) is defined as

\[\omega(\bar{\boldsymbol x},\bar{\boldsymbol y}) = \sum_l\underset{\boldsymbol\mu_l}{\max}\sum_i\delta_{x_i^l,\mu_l(\tilde y_i^l)},\]

where \(\boldsymbol\mu_l\) is a bijective mapping between group labels at level \(l\), and \(\tilde y_i^l = y^i_{\mu_{l-1}(i)}\) are the nodes reordered according to the lower level. It corresponds to solving an instance of the maximum weighted bipartite matching problem for every hierarchical level, which is done with the Kuhn-Munkres algorithm [kuhn_hungarian_1955] [munkres_algorithms_1957].

If norm == True, the normalized value is returned:

\[1 - \frac{\left(\sum_lN_l\right) - \omega(\bar{\boldsymbol x}, \bar{\boldsymbol y})}{\sum_l\left(N_l - 1\right)}\]

which lies in the unit interval \([0,1]\), where \(N_l=\max(N_{{\boldsymbol x}^l}, N_{{\boldsymbol y}^l})\) is the number of nodes in level l.

This algorithm runs in time \(O[\sum_l N_l + (B_x^l+B_y^l)E_m^l]\) where \(B_x^l\) and \(B_y^l\) are the number of labels in partitions \(\bar{\boldsymbol x}\) and \(\bar{\boldsymbol y}\) at level \(l\), respectively, and \(E_m^l \le B_x^lB_y^l\) is the number of nonzero entries in the contingency table between both partitions.

References

[peixoto-revealing-2021]

Tiago P. Peixoto, “Revealing consensus and dissensus between network partitions”, Phys. Rev. X 11 021003 (2021) DOI: 10.1103/PhysRevX.11.021003 [sci-hub, @tor], arXiv: 2005.13977

[kuhn_hungarian_1955]

H. W. Kuhn, “The Hungarian method for the assignment problem,” Naval Research Logistics Quarterly 2, 83–97 (1955) DOI: 10.1002/nav.3800020109 [sci-hub, @tor]

[munkres_algorithms_1957]

James Munkres, “Algorithms for the Assignment and Transportation Problems,” Journal of the Society for Industrial and Applied Mathematics 5, 32–38 (1957). DOI: 10.1137/0105003 [sci-hub, @tor]

Examples

>>> x = [np.random.randint(0, 100, 1000), np.random.randint(0, 10, 100), np.random.randint(0, 3, 10)]
>>> y = [np.random.randint(0, 100, 1000), np.random.randint(0, 10, 100), np.random.randint(0, 3, 10)]
>>> gt.nested_partition_overlap(x, y)
0.140018...