reduced_mutual_information
- graph_tool.inference.reduced_mutual_information(x, y, norm=False)
Returns the reduced mutual information between two partitions.
- Parameters:
  - x : iterable of int values
    First partition.
  - y : iterable of int values
    Second partition.
  - norm : bool (optional, default: False)
    If True, the result will be normalized such that the maximum possible value is one.
- Returns:
  - RMI : float
    Reduced mutual information value.
Notes
The reduced mutual information (RMI) [newman_improved_2020] is defined as

\[\text{RMI}(\boldsymbol x,\boldsymbol y) = \frac{1}{N}\left[\ln \frac{N!\prod_{rs}m_{rs}!}{\prod_rn_r!\prod_sn_s'!} -\ln\Omega(\boldsymbol n, \boldsymbol n')\right],\]

where \(m_{rs}=\sum_i\delta_{x_i,r}\delta_{y_i,s}\) is the contingency table between \(\boldsymbol x\) and \(\boldsymbol y\), \(n_r=\sum_sm_{rs}\) and \(n'_s=\sum_rm_{rs}\) are the group sizes in both partitions, and \(\Omega(\boldsymbol n, \boldsymbol n')\) is the total number of contingency tables with fixed row and column sums.
If norm == True, the normalized value is returned:

\[\frac{2\ln \frac{N!\prod_{rs}m_{rs}!}{\prod_rn_r!\prod_sn_s'!} -2\ln\Omega(\boldsymbol n, \boldsymbol n')} {\ln\frac{N!}{\prod_rn_r!} + \ln\frac{N!}{\prod_rn'_r!} -\ln\Omega(\boldsymbol n, \boldsymbol n) -\ln\Omega(\boldsymbol n', \boldsymbol n')}\]

which can take a maximum value of one.
Note that both the normalized and unnormalized RMI values can be negative.
This algorithm runs in time \(O(N)\) where \(N\) is the length of \(\boldsymbol x\) and \(\boldsymbol y\).
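The first term of the definition above is straightforward to compute directly from the contingency table; it is only the \(\ln\Omega\) correction that requires counting (or approximating) the number of contingency tables with fixed margins, which is what this function handles internally. As a minimal illustration of the first term only, here is a standard-library sketch; the function name `log_count_term` is chosen here for illustration and is not part of graph-tool's API:

```python
import math
from collections import Counter

def log_count_term(x, y):
    """Compute (1/N) * ln[ N! * prod_rs m_rs! / (prod_r n_r! * prod_s n_s'!) ],
    i.e. the first term of the RMI, *without* the ln(Omega) correction."""
    N = len(x)
    m = Counter(zip(x, y))   # contingency table m_rs
    n_r = Counter(x)         # row sums (group sizes of x)
    n_s = Counter(y)         # column sums (group sizes of y)

    def lnfact(k):
        # ln(k!) via the log-gamma function, avoiding huge integers
        return math.lgamma(k + 1)

    val = (lnfact(N)
           + sum(lnfact(c) for c in m.values())
           - sum(lnfact(c) for c in n_r.values())
           - sum(lnfact(c) for c in n_s.values()))
    return val / N
```

For identical two-group partitions of four items, the contingency table is diagonal and the term reduces to \((\ln 4! - 2\ln 2!)/4 = \ln(6)/4\); the full RMI would then subtract \(\ln\Omega(\boldsymbol n, \boldsymbol n')/N\) from this value.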
References
[newman_improved_2020] M. E. J. Newman, G. T. Cantwell and J.-G. Young, “Improved mutual information measure for classification and community detection”, Phys. Rev. E 101, 042304 (2020), DOI: 10.1103/PhysRevE.101.042304, arXiv: 1907.12581
Examples
>>> x = np.random.randint(0, 10, 1000)
>>> y = np.random.randint(0, 10, 1000)
>>> gt.reduced_mutual_information(x, y)
-0.069938...