variation_information
- graph_tool.inference.variation_information(x, y, norm=False)
Returns the variation of information between two partitions.
- Parameters:
  - x : iterable of int values
    First partition.
  - y : iterable of int values
    Second partition.
  - norm : (optional, default: False)
    If True, the result will be normalized in the range [0, 1].
- Returns:
  - VI : float
    Variation of information value.
Notes
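The variation of information [meila_comparing_2003] is defined as

$$\text{VI}(\boldsymbol x, \boldsymbol y) = -\frac{1}{N}\sum_{rs} m_{rs}\left[\ln\frac{m_{rs}}{n_r} + \ln\frac{m_{rs}}{n'_s}\right],$$

with $m_{rs} = \sum_i \delta_{x_i,r}\,\delta_{y_i,s}$ being the contingency table between $\boldsymbol x$ and $\boldsymbol y$, and $n_r = \sum_s m_{rs}$ and $n'_s = \sum_r m_{rs}$ being the group sizes in both partitions.

If norm == True, the normalized value is returned:

$$\frac{\text{VI}(\boldsymbol x, \boldsymbol y)}{\ln N},$$

which lies in the unit interval $[0, 1]$.

This algorithm runs in time $O(N)$, where $N$ is the length of $\boldsymbol x$ and $\boldsymbol y$.

References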
[meila_comparing_2003] Marina Meilă, “Comparing Clusterings by the Variation of Information,” in Learning Theory and Kernel Machines, Lecture Notes in Computer Science No. 2777, edited by Bernhard Schölkopf and Manfred K. Warmuth (Springer Berlin Heidelberg, 2003), pp. 173–187. DOI: 10.1007/978-3-540-45167-9_14
Examples
>>> import numpy as np
>>> import graph_tool.all as gt
>>> x = np.random.randint(0, 10, 1000)
>>> y = np.random.randint(0, 10, 1000)
>>> gt.variation_information(x, y)
4.525389...
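As a cross-check on the definition in the Notes, the quantity can also be computed directly with NumPy. The following is a minimal sketch, not graph-tool's implementation: it assumes natural logarithms, a contingency table of co-occurrence counts, and normalization by ln N when `norm` is set. The helper name `vi_reference` is hypothetical and introduced only for illustration.

```python
import numpy as np


def vi_reference(x, y, norm=False):
    """Variation of information between two label sequences (natural log).

    Hypothetical reference helper for illustration; not part of graph-tool.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if x.ndim != 1 or x.shape != y.shape or x.size == 0:
        raise ValueError("x and y must be non-empty 1-D sequences of equal length")
    N = x.size

    # Relabel groups to 0..R-1 and 0..S-1 so they can index a dense table.
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)

    # Contingency table: m[r, s] counts items in group r of x and group s of y.
    m = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(m, (xi, yi), 1)

    n_x = m.sum(axis=1)  # group sizes in the first partition
    n_y = m.sum(axis=0)  # group sizes in the second partition

    # Sum only over non-empty cells; empty cells contribute nothing.
    r, s = np.nonzero(m)
    m_rs = m[r, s]
    vi = -(m_rs * (np.log(m_rs / n_x[r]) + np.log(m_rs / n_y[s]))).sum() / N

    # Assumed normalization: divide by ln N (undefined for N == 1, where VI is 0).
    if norm and N > 1:
        vi /= np.log(N)
    return float(vi)


same = vi_reference([0, 0, 1, 1], [5, 5, 3, 3])         # same grouping, different labels
crossed = vi_reference([0, 0, 1, 1], [0, 1, 0, 1])      # groupings that cut across each other
crossed_norm = vi_reference([0, 0, 1, 1], [0, 1, 0, 1], norm=True)
```

Here `same` is 0 because the two partitions group the items identically regardless of label values, `crossed` equals 2 ln 2, and `crossed_norm` equals 1. The sketch builds a dense R × S table, so unlike the O(N) running time stated in the Notes it uses memory proportional to the product of the group counts; it is meant for checking small inputs, not as a replacement for `gt.variation_information`.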