partition_overlap#
- graph_tool.inference.partition_overlap(x, y, norm=True)[source]#
Returns the maximum overlap between partitions, according to an optimal label alignment.
- Parameters:
- xiterable of
int
values orPropertyMap
First partition.
- yiterable of
int
values orPropertyMap
Second partition.
- norm(optional, default:
True
) If
True
, the result will be normalized in the range \([0,1]\).
- xiterable of
- Returns:
- w
float
orint
Maximum overlap value.
- w
Notes
The maximum overlap between partitions \(\boldsymbol x\) and \(\boldsymbol y\) is defined as
\[\omega(\boldsymbol x,\boldsymbol y) = \underset{\boldsymbol\mu}{\max}\sum_i\delta_{x_i,\mu(y_i)},\]where \(\boldsymbol\mu\) is a bijective mapping between group labels. It corresponds to solving an instance of the maximum weighted bipartite matching problem, which is done with the Kuhn-Munkres algorithm [kuhn_hungarian_1955] [munkres_algorithms_1957].
If
norm == True
, the normalized value is returned:\[\frac{\omega(\boldsymbol x,\boldsymbol y)}{N}\]which lies in the unit interval \([0,1]\).
This algorithm runs in time \(O[N + (B_x+B_y)E_m]\) where \(N\) is the length of \(\boldsymbol x\) and \(\boldsymbol y\), \(B_x\) and \(B_y\) are the number of labels in partitions \(\boldsymbol x\) and \(\boldsymbol y\), respectively, and \(E_m \le B_xB_y\) is the number of nonzero entries in the contingency table between both partitions.
References
[peixoto-revealing-2021]Tiago P. Peixoto, “Revealing consensus and dissensus between network partitions”, Phys. Rev. X 11 021003 (2021) DOI: 10.1103/PhysRevX.11.021003 [sci-hub, @tor], arXiv: 2005.13977
[kuhn_hungarian_1955]H. W. Kuhn, “The Hungarian method for the assignment problem,” Naval Research Logistics Quarterly 2, 83–97 (1955) DOI: 10.1002/nav.3800020109 [sci-hub, @tor]
[munkres_algorithms_1957]James Munkres, “Algorithms for the Assignment and Transportation Problems,” Journal of the Society for Industrial and Applied Mathematics 5, 32–38 (1957). DOI: 10.1137/0105003 [sci-hub, @tor]
Examples
>>> x = np.random.randint(0, 10, 1000) >>> y = np.random.randint(0, 10, 1000) >>> print(gt.partition_overlap(x, y)) 0.143