# graph_tool.inference - Statistical inference of generative network models¶

This module contains algorithms for the identification of large-scale network structure via the statistical inference of generative models.

Note

An introduction to the concepts used here, as well as a basic HOWTO is included in the cookbook section: Inferring modular network structure.

## Nonparametric stochastic block model inference¶

### High-level functions¶

 minimize_blockmodel_dl Fit the stochastic block model, by minimizing its description length using an agglomerative heuristic. minimize_nested_blockmodel_dl Fit the nested stochastic block model, by minimizing its description length using an agglomerative heuristic.

### State classes¶

 BlockState The stochastic block model state of a given graph. OverlapBlockState The overlapping stochastic block model state of a given graph. LayeredBlockState The (possibly overlapping) block state of a given graph, where the edges are divided into discrete layers. NestedBlockState The nested stochastic block model state of a given graph. PPBlockState Obtain the partition of a network according to the Bayesian planted partition model. RankedBlockState Obtain the ordered partition of a network according to the ranked stochastic block model. ModularityState Obtain the partition of a network according to Newman's modularity. NormCutState Obtain the partition of a network according to the normalized cut. TemperingState This class aggregates several state classes and corresponding inverse-temperature values to implement parallel tempering MCMC. CliqueState The state of a clique decomposition of a given graph.

### Abstract base classes¶

 MCMCState Base state that implements single-flip MCMC sweeps MultiflipMCMCState Base state that implements multiflip (merge-split) MCMC sweeps MultilevelMCMCState Base state that implements multilevel agglomerative MCMC sweeps GibbsMCMCState Base state that implements single flip MCMC sweeps MulticanonicalMCMCState Base state that implements multicanonical MCMC sweeps ExhaustiveSweepState Base state that implements exhaustive enumerative sweeps

### Sampling and minimization¶

 mcmc_equilibrate Equilibrate a MCMC with a given starting state. mcmc_anneal Equilibrate a MCMC at a specified target temperature by performing simulated annealing. multicanonical_equilibrate Equilibrate a multicanonical Monte Carlo sampling using the Wang-Landau algorithm. MulticanonicalState The density of states of a multicanonical Monte Carlo algorithm.

### Comparing and manipulating partitions¶

 PartitionModeState The random label model state for a set of labelled partitions, which attempts to align them with a common group labelling. ModeClusterState The mixed random label model state for a set of labelled partitions, which attempts to align them inside clusters with a common group labelling. PartitionCentroidState Obtain the center of a set of partitions, according to the variation of information metric or reduced mutual information. partition_overlap Returns the maximum overlap between partitions, according to an optimal label alignment. nested_partition_overlap Returns the hierarchical maximum overlap between nested partitions, according to an optimal recursive label alignment. variation_information Returns the variation of information between two partitions. mutual_information Returns the mutual information between two partitions. reduced_mutual_information Returns the reduced mutual information between two partitions. contingency_graph Returns the contingency graph between both partitions. shuffle_partition_labels Returns a copy of partition x, with the group labels randomly shuffled. order_partition_labels Returns a copy of partition x, with the group labels ordered decreasingly according to group size. order_nested_partition_labels Returns a copy of nested partition x, with the group labels ordered decreasingly according to group size at each level. align_partition_labels Returns a copy of partition x, with the group labels aligned as to maximize the overlap with y. align_nested_partition_labels Returns a copy of nested partition x, with the group labels aligned as to maximize the overlap with y. partition_overlap_center Find a partition with a maximal overlap to all items of the list of partitions given. nested_partition_overlap_center Find a nested partition with a maximal overlap to all items of the list of nested partitions given. nested_partition_clear_null Returns a copy of nested partition x where the null values -1 are replaced with 0. contiguous_map Remap the values of prop in the contiguous range $$[0, N-1]$$. nested_contiguous_map Remap the values of the nested partition bs in the contiguous range $$[0, N_l-1]$$ for each level $$l$$.

### Auxiliary functions¶

 mf_entropy Compute the "mean field" entropy given the vertex block membership marginals. bethe_entropy Compute the Bethe entropy given the edge block membership marginals. microstate_entropy Compute microstate entropy given a histogram of partitions. marginal_multigraph_entropy Compute the entropy of the marginal latent multigraph distribution. half_edge_graph Generate a half-edge graph, where each half-edge is represented by a node, and an edge connects the half-edges like in the original graph. get_block_edge_gradient Get edge gradients corresponding to the block membership at the endpoints of the edges given by the be edge property map.

### Auxiliary classes¶

 PartitionHist Histogram of partitions, implemented in C++. BlockPairHist Histogram of block pairs, implemented in C++.

## Nonparametric network reconstruction¶

### State classes¶

 LatentMultigraphBlockState Inference state of an erased Poisson multigraph, using the stochastic block model as a prior. LatentClosureBlockState Inference state of the stochastic block model with latent triadic closure edges. MeasuredBlockState Inference state of a measured graph, using the stochastic block model as a prior. MeasuredClosureBlockState Inference state of a measured graph, using the stochastic block model with triadic closure as a prior. MixedMeasuredBlockState Inference state of a measured graph with heterogeneous errors, using the stochastic block model as a prior. UncertainBlockState Inference state of an uncertain graph, using the stochastic block model as a prior. UncertainBaseState Base state for uncertain network inference. DynamicsBlockStateBase Base state for network reconstruction based on dynamical data, using the stochastic block model as a prior. EpidemicsBlockState Inference state for network reconstruction based on epidemic dynamics, using the stochastic block model as a prior. IsingBaseBlockState Base state for network reconstruction based on the Ising model, using the stochastic block model as a prior. IsingGlauberBlockState State for network reconstruction based on the Glauber dynamics of the Ising model, using the stochastic block model as a prior. CIsingGlauberBlockState State for network reconstruction based on the Glauber dynamics of the continuous Ising model, using the stochastic block model as a prior. PseudoIsingBlockState State for network reconstruction based on the equilibrium configurations of the Ising model, using the Pseudolikelihood approximation and the stochastic block model as a prior. PseudoCIsingBlockState State for network reconstruction based on the equilibrium configurations of the continuous Ising model, using the Pseudolikelihood approximation and the stochastic block model as a prior.

### Expectation-maximization Inference¶

 latent_multigraph Infer latent Poisson multigraph model given an "erased" simple graph.

## Semiparametric stochastic block model inference¶

### State classes¶

 EMBlockState The parametric, undirected stochastic block model state of a given graph.

### Expectation-maximization Inference¶

 em_infer Infer the model parameters and latent variables using the expectation-maximization (EM) algorithm with initial state given by state.

## Large-scale descriptors¶

 modularity Calculate Newman's (generalized) modularity of a network partition.

## Contents¶

graph_tool.inference.minimize_blockmodel_dl(g, state=<class 'graph_tool.inference.blockmodel.BlockState'>, state_args={}, multilevel_mcmc_args={})[source]

Fit the stochastic block model, by minimizing its description length using an agglomerative heuristic.

Parameters
gGraph

The graph.

stateSBM-like state class (optional, default: BlockState)

Type of model that will be used. Must be derived from MultilevelMCMCState.

state_argsdict (optional, default: {})

Arguments to be passed to appropriate state constructor (e.g. BlockState)

multilevel_mcmc_argsdict (optional, default: {})

Arguments to be passed to multilevel_mcmc_sweep().

Returns
min_statetype given by parameter state

State with minimum description length.

Notes

This function is a convenience wrapper around multilevel_mcmc_sweep().

See [peixoto-efficient-2014] for details on the algorithm.

This algorithm has a complexity of $$O(V \ln^2 V)$$, where $$V$$ is the number of nodes in the network.

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378.

Examples

>>> g = gt.collection.data["polbooks"]
>>> state = gt.minimize_blockmodel_dl(g)
>>> state.draw(pos=g.vp["pos"], vertex_shape=state.get_blocks(),
...            output="polbooks_blocks_mdl.svg")
<...>


Block partition of a political books network, which minimizes the description length of the network according to the degree-corrected stochastic blockmodel.

>>> g = gt.collection.data["polbooks"]
>>> state = gt.minimize_blockmodel_dl(g, state=gt.OverlapBlockState)
>>> state.draw(pos=g.vp["pos"], output="polbooks_overlap_blocks_mdl.svg")
<...>


Overlapping partition of a political books network, which minimizes the description length of the network according to the overlapping degree-corrected stochastic blockmodel.

>>> g = gt.collection.data["celegansneural"]
>>> state = gt.minimize_blockmodel_dl(g, state=gt.PPBlockState)
>>> state.draw(output="celegans_mdl_pp.pdf")
<...>


Assortative partition of the C. elegans neural network, which minimizes the description length of the network according to the degree-corrected planted-partition blockmodel.

graph_tool.inference.minimize_nested_blockmodel_dl(g, init_bs=None, state=<class 'graph_tool.inference.nested_blockmodel.NestedBlockState'>, state_args={}, multilevel_mcmc_args={})[source]

Fit the nested stochastic block model, by minimizing its description length using an agglomerative heuristic.

Parameters
gGraph

The graph.

init_bsiterable of iterable of ints (optional, default: None)

Initial hierarchical partition.

B_minint (optional, default: 1)

The minimum number of blocks.

B_maxint (optional, default: numpy.iinfo(numpy.int64).max)

The maximum number of blocks.

b_minVertexPropertyMap (optional, default: None)

The partition to be used with the minimum number of blocks.

b_maxVertexPropertyMap (optional, default: None)

The partition to be used with the maximum number of blocks.

stateSBM state class (optional, default: NestedBlockState)

Type of model that will be used.

state_argsdict (optional, default: {})

Arguments to be passed to appropriate state constructor (e.g. BlockState)

multilevel_mcmc_argsdict (optional, default: {})

Arguments to be passed to multilevel_mcmc_sweep().

Returns
min_statetype given by parameter state

State with minimum description length.

Notes

This function is a convenience wrapper around multilevel_mcmc_sweep().

See [peixoto-hierarchical-2014] for details on the algorithm.

This algorithm has a complexity of $$O(E \ln^2 V)$$, where $$E$$ and $$V$$ are the number of edges and nodes in the network, respectively.

References

peixoto-hierarchical-2014

Tiago P. Peixoto, “Hierarchical block structures and high-resolution model selection in large networks “, Phys. Rev. X 4, 011047 (2014), DOI: 10.1103/PhysRevX.4.011047 [sci-hub, @tor], arXiv: 1310.4377.

Examples

>>> g = gt.collection.data["power"]
>>> state = gt.minimize_nested_blockmodel_dl(g)
>>> state.draw(output="power_nested_mdl.pdf")
(...)


Hierarchical Block partition of a power-grid network, which minimizes the description length of the network according to the nested (degree-corrected) stochastic blockmodel.

>>> g = gt.collection.data["celegansneural"]
>>> state = gt.minimize_nested_blockmodel_dl(g, state_args=dict(overlap=True))
>>> state.draw(output="celegans_nested_mdl_overlap.pdf")
(...)


Overlapping block partition of the C. elegans neural network, which minimizes the description length of the network according to the nested overlapping degree-corrected stochastic blockmodel.

class graph_tool.inference.BlockState(g, b=None, B=None, eweight=None, vweight=None, recs=[], rec_types=[], rec_params=[], clabel=None, pclabel=None, bfield=None, Bfield=None, deg_corr=True, dense_bg=False, **kwargs)[source]

The stochastic block model state of a given graph.

Parameters
gGraph

Graph to be modelled.

bVertexPropertyMap (optional, default: None)

Initial block labels on the vertices. If not supplied, it will be randomly sampled.

Bint (optional, default: None)

Number of blocks (or vertex groups). If not supplied it will be obtained from the parameter b.

eweightEdgePropertyMap (optional, default: None)

Edge multiplicities (for multigraphs or block graphs).

vweightVertexPropertyMap (optional, default: None)

Vertex multiplicities (for block graphs).

recslist of EdgePropertyMap instances (optional, default: [])

List of real or discrete-valued edge covariates.

rec_typeslist of edge covariate types (optional, default: [])

List of types of edge covariates. The possible types are: "real-exponential", "real-normal", "discrete-geometric", "discrete-poisson" or "discrete-binomial".

rec_paramslist of dict (optional, default: [])

Model hyperparameters for edge covariates. This should be a list of dict instances, or the string “microcanonical” (the default if nothing is specified). The keys depend on the type of edge covariate:

"real-exponential" or "discrete-poisson"

The parameter list is ["r", "theta"], corresponding to the parameters of the Gamma prior distribution. If unspecified, the default is the “empirical Bayes” choice: r = 1.0 and theta is the global average of the edge covariate.

"discrete-geometric"

The parameter list is ["alpha", "beta"], corresponding to the parameters of the Beta prior distribution. If unspecified, the default is the noninformative choice: alpha = beta = 1.0

"discrete-binomial"

The parameter list is ["N", "alpha", "beta"], corresponding to the number of trials N and the parameters of the Beta prior distribution. If unspecified, the default is the noninformative choice, alpha = beta = 1.0, and N is taken to be the maximum edge covarite value.

"real-normal"

The parameter list is ["m0", "k0", "v0", "nu0"] corresponding to the normal-inverse-chi-squared prior. If unspecified, the defaults are: m0 = rec.fa.mean(), k0 = 1, v0 = rec.fa.std() ** 2, and nu0 = 3, where rec is the corresponding edge covariate property map.

clabelVertexPropertyMap (optional, default: None)

Constraint labels on the vertices. If supplied, vertices with different label values will not be clustered in the same group.

pclabelVertexPropertyMap (optional, default: None)

Partition constraint labels on the vertices. This has the same interpretation as clabel, but will be used to compute the partition description length.

bfieldVertexPropertyMap (optional, default: None)

Local field acting as a prior for the node partition. This should be a vector property map of type vector<double>, and contain the log-probability for each node to be placed in each group.

deg_corrbool (optional, default: True)

If True, the degree-corrected version of the blockmodel ensemble will be assumed, otherwise the traditional variant will be used.

dense_bgbool (optional, default: False)

If True a dense matrix is used for the block graph, otherwise a sparse matrix will be used.

get_rec_params()[source]

Get model hyperparameters for edge covariates.

set_rec_params(params)[source]

Update model hyperparameters for edge covariates.

copy(g=None, eweight=None, vweight=None, b=None, B=None, deg_corr=None, clabel=None, overlap=False, pclabel=None, bfield=None, dense_bg=None, **kwargs)[source]

Copies the block state. The parameters override the state properties, and have the same meaning as in the constructor.

get_block_state(b=None, vweight=False, **kwargs)[source]

Returns a BlockState corresponding to the block graph (i.e. the blocks of the current state become the nodes). The parameters have the same meaning as the in the constructor. If vweight == True the nodes of the block state are weighted with the node counts.

get_E()[source]

Returns the total number of edges.

get_N()[source]

Returns the total number of nodes.

get_B()[source]

Returns the total number of blocks.

get_nonempty_B()[source]

Returns the total number of nonempty blocks.

get_Be()[source]

Returns the effective number of blocks, defined as $$e^{H}$$, with $$H=-\sum_r\frac{n_r}{N}\ln \frac{n_r}{N}$$, where $$n_r$$ is the number of nodes in group r.

get_bclabel(clabel=None)[source]

Returns a VertexPropertyMap corresponding to constraint labels for the block graph.

get_bpclabel()[source]

Returns a VertexPropertyMap corresponding to partition constraint labels for the block graph.

get_blocks()[source]

Returns the property map which contains the block labels for each vertex.

get_state()[source]

Alias to get_blocks().

set_state(b)[source]

Sets the internal partition of the state.

get_bg()[source]

Returns the block graph.

get_ers()[source]

Returns the edge property map of the block graph which contains the $$e_{rs}$$ matrix entries. For undirected graphs, the diagonal values (self-loops) contain $$e_{rr}/2$$.

get_er()[source]

Returns the vertex property map of the block graph which contains the number $$e_r$$ of half-edges incident on block $$r$$. If the graph is directed, a pair of property maps is returned, with the number of out-edges $$e^+_r$$ and in-edges $$e^-_r$$, respectively.

get_nr()[source]

Returns the vertex property map of the block graph which contains the block sizes $$n_r$$.

entropy(adjacency=True, dl=True, partition_dl=True, degree_dl=True, degree_dl_kind='distributed', edges_dl=True, dense=False, multigraph=True, deg_entropy=True, recs=True, recs_dl=True, beta_dl=1.0, Bfield=True, exact=True, **kwargs)[source]

Calculate the entropy (a.k.a. negative log-likelihood) associated with the current block partition.

Parameters
adjacencybool (optional, default: True)

If True, the adjacency term of the description length will be included.

dlbool (optional, default: True)

If True, the description length for the parameters will be included.

partition_dlbool (optional, default: True)

If True, and dl == True the partition description length will be included.

degree_dlbool (optional, default: True)

If True, and dl == True the degree sequence description length will be included (for degree-corrected models).

degree_dl_kindstr (optional, default: "distributed")

This specifies the prior used for the degree sequence. It must be one of: "uniform", "distributed" (default) or "entropy".

edges_dlbool (optional, default: True)

If True, and dl == True the edge matrix description length will be included.

densebool (optional, default: False)

If True, the “dense” variant of the entropy will be computed.

multigraphbool (optional, default: True)

If True, the multigraph entropy will be used.

deg_entropybool (optional, default: True)

If True, the degree entropy term that is independent of the network partition will be included (for degree-corrected models).

recsbool (optional, default: True)

If True, the likelihood for real or discrete-valued edge covariates is computed.

recs_dlbool (optional, default: True)

If True, and dl == True the edge covariate description length will be included.

beta_dldouble (optional, default: 1.)

Prior inverse temperature.

exactbool (optional, default: True)

If True, the exact expressions will be used. Otherwise, Stirling’s factorial approximation will be used for some terms.

Notes

The “entropy” of the state is the negative log-likelihood of the microcanonical SBM, that includes the generated graph $$\boldsymbol{A}$$ and the model parameters $$\boldsymbol{\theta}$$,

$\begin{split}\Sigma &= - \ln P(\boldsymbol{A},\boldsymbol{\theta}) \\ &= - \ln P(\boldsymbol{A}|\boldsymbol{\theta}) - \ln P(\boldsymbol{\theta}).\end{split}$

This value is also called the description length of the data, and it corresponds to the amount of information required to describe it (in nats).

For the traditional blockmodel (deg_corr == False), the model parameters are $$\boldsymbol{\theta} = \{\boldsymbol{e}, \boldsymbol{b}\}$$, where $$\boldsymbol{e}$$ is the matrix of edge counts between blocks, and $$\boldsymbol{b}$$ is the partition of the nodes into blocks. For the degree-corrected blockmodel (deg_corr == True), we have an additional set of parameters, namely the degree sequence $$\boldsymbol{k}$$.

For the traditional blockmodel, the model likelihood is

$\begin{split}P(\boldsymbol{A}|\boldsymbol{e},\boldsymbol{b}) &= \frac{\prod_{r<s}e_{rs}!\prod_re_{rr}!!}{\prod_rn_r^{e_r}}\times \frac{1}{\prod_{i<j}A_{ij}!\prod_iA_{ii}!!},\\ P(\boldsymbol{A}|\boldsymbol{e},\boldsymbol{b}) &= \frac{\prod_{rs}e_{rs}!}{\prod_rn_r^{e_r}}\times \frac{1}{\prod_{ij}A_{ij}!},\end{split}$

for undirected and directed graphs, respectively, where $$e_{rs}$$ is the number of edges from block $$r$$ to $$s$$ (or the number of half-edges for the undirected case when $$r=s$$), and $$n_r$$ is the number of vertices in block $$r$$ .

For the degree-corrected variant the equivalent expressions are

$\begin{split}P(\boldsymbol{A}|\boldsymbol{e},\boldsymbol{b},\boldsymbol{k}) &= \frac{\prod_{r<s}e_{rs}!\prod_re_{rr}!!}{\prod_re_r!}\times \frac{\prod_ik_i!}{\prod_{i<j}A_{ij}!\prod_iA_{ii}!!},\\ P(\boldsymbol{A}|\boldsymbol{e},\boldsymbol{b},\boldsymbol{k}) &= \frac{\prod_{rs}e_{rs}!}{\prod_re_r^+!\prod_re_r^-!}\times \frac{\prod_ik_i^+!\prod_ik_i^-!}{\prod_{ij}A_{ij}!},\end{split}$

where $$e_r = \sum_se_{rs}$$ is the number of half-edges incident on block $$r$$, and $$e^+_r = \sum_se_{rs}$$ and $$e^-_r = \sum_se_{sr}$$ are the numbers of out- and in-edges adjacent to block $$r$$, respectively.

If exact == False, Stirling’s approximation is used in the above expression.

If dense == True, the likelihood for the non-degree-corrected model becomes instead

$\begin{split}P(\boldsymbol{A}|\boldsymbol{e},\boldsymbol{b})^{-1} &= \prod_{r<s}{n_rn_s\choose e_{rs}}\prod_r{{n_r\choose 2}\choose e_{rr}/2},\\ P(\boldsymbol{A}|\boldsymbol{e},\boldsymbol{b})^{-1} &= \prod_{rs}{n_rn_s\choose e_{rs}}\end{split}$

if multigraph == False, otherwise we replace $${n\choose m}\to\left(\!\!{n\choose m}\!\!\right)$$ above, where $$\left(\!\!{n\choose m}\!\!\right) = {n+m-1\choose m}$$. A “dense” entropy for the degree-corrected model is not available, and if requested will raise a NotImplementedError.

If dl == True, the description length $$\mathcal{L} = -\ln P(\boldsymbol{\theta})$$ of the model will be returned as well. The terms $$P(\boldsymbol{e})$$ and $$P(\boldsymbol{b})$$ are described in described as follows.

For an undirected graph, the number of distinct $$e_{rs}$$ matrices is given by,

$\Omega_m = \left(\!\!{B(B+1)/2 \choose E}\!\!\right)$

and for a directed graph,

$\Omega_m = \left(\!\!{B^2 \choose E}\!\!\right)$

where $$\left(\!{n \choose k}\!\right) = {n+k-1\choose k}$$ is the number of $$k$$ combinations with repetitions from a set of size $$n$$. Hence, we have the description length of the edge counts

$-\ln P(\boldsymbol{e}) = \ln \Omega_m.$

For the node partition $$\boldsymbol{b}$$ we assume a two-level Bayesian hierarchy, where first the group size histogram is generated, and conditioned on it the partition, which leads to a description length:

$-\ln P(\boldsymbol{b}) = \ln {N - 1 \choose B - 1} + \ln N! - \sum_r \ln n_r!.$

where $$n_r$$ is the number of nodes in block $$r$$.

The total information necessary to describe the model is then,

$-\ln P(\boldsymbol{e}, \boldsymbol{b}) = -\ln P(\boldsymbol{e}) - \ln P(\boldsymbol{b}).$

If nr is None, it is assumed $$n_r=N/B$$. If nr is False, the partition term $$-\ln P(\boldsymbol{b})$$ is omitted entirely.

For the degree-corrected model we need to specify the prior $$P(\boldsymbol{k})$$ for the degree sequence as well. Here there are three options:

1. degree_dl_kind == "uniform"

$P(\boldsymbol{k}|\boldsymbol{e},\boldsymbol{b}) = \prod_r\left(\!\!{n_r\choose e_r}\!\!\right)^{-1}.$

This corresponds to a noninformative prior, where the degrees are sampled from a uniform distribution.

2. degree_dl_kind == "distributed" (default)

$P(\boldsymbol{k}|\boldsymbol{e},\boldsymbol{b}) = \prod_r\frac{\prod_k\eta_k^r!}{n_r!} \prod_r q(e_r, n_r)^{-1}$

with $$\eta_k^r$$ being the number of nodes with degree $$k$$ in group $$r$$, and $$q(n,m)$$ being the number of partitions of integer $$n$$ into at most $$m$$ parts.

This corresponds to a prior for the degree sequence conditioned on the degree frequencies, which are themselves sampled from a uniform hyperprior. This option should be preferred in most cases.

3. degree_dl_kind == "entropy"

$P(\boldsymbol{k}|\boldsymbol{e},\boldsymbol{b}) \approx \prod_r\exp\left(-n_rH(\boldsymbol{k}_r)\right)$

where $$H(\boldsymbol{k}_r) = -\sum_kp_r(k)\ln p_r(k)$$ is the entropy of the degree distribution inside block $$r$$.

Note that, differently from the other two choices, this represents only an approximation of the description length. It is meant to be used only for comparison purposes, and should be avoided in practice.

For the directed case, the above expressions are duplicated for the in- and out-degrees.

References

peixoto-nonparametric-2017

Tiago P. Peixoto, “Nonparametric Bayesian inference of the microcanonical stochastic block model”, Phys. Rev. E 95 012317 (2017), DOI: 10.1103/PhysRevE.95.012317 [sci-hub, @tor], arXiv: 1610.02703

peixoto-hierarchical-2014

Tiago P. Peixoto, “Hierarchical block structures and high-resolution model selection in large networks “, Phys. Rev. X 4, 011047 (2014), DOI: 10.1103/PhysRevX.4.011047 [sci-hub, @tor], arXiv: 1310.4377.

peixoto-weighted-2017

Tiago P. Peixoto, “Nonparametric weighted stochastic block models”, Phys. Rev. E 97, 012306 (2018), DOI: 10.1103/PhysRevE.97.012306 [sci-hub, @tor], arXiv: 1708.01432

get_matrix()[source]

Returns the block matrix (as a sparse csr_matrix), which contains the number of edges between each block pair.

Warning

This corresponds to the adjacency matrix of the block graph, which by convention includes twice the amount of edges in the diagonal entries if the graph is undirected.

Examples

>>> g = gt.collection.data["polbooks"]
>>> state = gt.BlockState(g, B=5, deg_corr=True)
>>> state.mcmc_sweep(niter=1000)
(...)
>>> m = state.get_matrix()
>>> figure()
<...>
>>> matshow(m.todense())
<...>
>>> savefig("bloc_mat.svg")


A 5x5 block matrix.

virtual_vertex_move(v, s, **kwargs)[source]

Computes the entropy difference if vertex v is moved to block s. The remaining parameters are the same as in graph_tool.inference.BlockState.entropy().

move_vertex(v, s)[source]

Move vertex v to block s.

This optionally accepts a list of vertices and blocks to move simultaneously.

remove_vertex(v)[source]

Remove vertex v from its current group.

This optionally accepts a list of vertices to remove.

Warning

This will leave the state in an inconsistent state before the vertex is returned to some other group, or if the same vertex is removed twice.

Add vertex v to block r.

This optionally accepts a list of vertices and blocks to add.

Warning

This can leave the state in an inconsistent state if a vertex is added twice to the same group.

sample_vertex_move(v, c=1.0, d=0.1)[source]

Sample block membership proposal of vertex v according to real-valued sampling parameters c and d: For $$c\to 0$$ the blocks are sampled according to the local neighborhood and their connections; for $$c\to\infty$$ the blocks are sampled randomly. With a probability d, a new (empty) group is sampled.

get_move_prob(v, s, c=1.0, d=0.1, reverse=False)[source]

Compute the log-probability of a move proposal for vertex v to block s according to sampling parameters c and d, as obtained with graph_tool.inference.BlockState.sample_vertex_move(). If reverse == True, the reverse probability of moving the node back from block s to its current one is obtained.

get_edges_prob(missing, spurious=[], entropy_args={})[source]

Compute the joint log-probability of the missing and spurious edges given by missing and spurious (a list of (source, target) tuples, or Edge() instances), together with the observed edges.

More precisely, the log-likelihood returned is

$\ln \frac{P(\boldsymbol G + \delta \boldsymbol G | \boldsymbol b)}{P(\boldsymbol G| \boldsymbol b)}$

where $$\boldsymbol G + \delta \boldsymbol G$$ is the modified graph (with missing edges added and spurious edges deleted).

The values in entropy_args are passed to entropy() to calculate the log-probability.

draw(**kwargs)

Convenience wrapper to graph_draw() that draws the state of the graph as colors on the vertices and edges.

exhaustive_sweep(entropy_args={}, callback=None, density=None, vertices=None, initial_partition=None, max_iter=None)

Perform an exhaustive loop over all possible network partitions.

Parameters
entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

callbackcallable object (optional, default: None)

Function to be called for each partition, with three arguments (S, S_min, b_min) corresponding to the the current entropy value, the minimum entropy value so far, and the corresponding partition, respectively. If not provided, and hist is None an iterator over the same values will be returned instead.

densitytuple (optional, default: None)

If provided, it should contain a tuple with values (S_min, S_max, n_bins), which will be used to obtain the density of states via a histogram of size n_bins. This parameter is ignored unless callback is None.

verticesiterable of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

initial_partitioniterable of ints (optional, default: None)

If provided, this will provide the initial partition for the iteration.

max_iterint (optional, default: None)

If provided, this will limit the total number of iterations.

Returns
statesiterator over (S, S_min, b_min)

If callback is None and hist is None, the function will return an iterator over (S, S_min, b_min) corresponding to the the current entropy value, the minimum entropy value so far, and the corresponding partition, respectively.

Ss, countspair of numpy.ndarray

If callback is None and hist is not None, the function will return the values of each bin (Ss) and the state count of each bin (counts).

b_minVertexPropertyMap

If callback is not None or hist is not None, the function will also return partition with smallest entropy.

Notes

This algorithm has an $$O(B^N)$$ complexity, where $$B$$ is the number of groups, and $$N$$ is the number of vertices.

gibbs_sweep(beta=1.0, niter=1, entropy_args={}, allow_new_group=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a rejection-free Gibbs MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_new_groupbool (optional, default: True)

Allow the number of groups to increase and decrease.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\times B)$$ complexity, where $$B$$ is the number of groups, and $$E$$ is the number of edges.

mcmc_sweep(beta=1.0, c=0.5, d=0.01, niter=1, entropy_args={}, allow_vacate=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given move.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_vacatebool (optional, default: True)

Allow groups to be vacated.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

multicanonical_sweep(m_state, multiflip=False, **kwargs)

Perform niter sweeps of a non-Markovian multicanonical sampling using the Wang-Landau algorithm.

Parameters
m_stateMulticanonicalState

MulticanonicalState instance containing the current state of the Wang-Landau run.

multiflipbool (optional, default: False)

If True, multiflip_mcmc_sweep() will be used, otherwise mcmc_sweep().

**kwargsKeyword parameter list

The remaining parameters will be passed to multiflip_mcmc_sweep() or mcmc_sweep().

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

wang-efficient-2001

Fugao Wang, D. P. Landau, “An efficient, multiple range random walk algorithm to calculate the density of states”, Phys. Rev. Lett. 86, 2050 (2001), DOI: 10.1103/PhysRevLett.86.2050 [sci-hub, @tor], arXiv: cond-mat/0011174

multiflip_mcmc_sweep(beta=1.0, c=0.5, psingle=None, psplit=1, pmerge=1, pmergesplit=1, pmovelabel=0, d=0.01, gibbs_sweeps=10, niter=1, entropy_args={}, accept_stats=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multiple simultaneous moves (i.e. merges and splits) to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

psplitfloat (optional, default: 1)

Relative probability of proposing a group split.

pmergesplitfloat (optional, default: 1)

Relative probability of proposing a marge-split move.

pmovelabelfloat (optional, default: 0)

Relative probability of proposing a group label move.

dfloat (optional, default: 1)

Probability of selecting a new (i.e. empty) group for a given single-node move.

gibbs_sweepsint (optional, default: 10)

Number of sweeps of Gibbs sampling to be performed (i.e. each node is attempted once per sweep) to refine a split proposal.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-merge-split-2020

Tiago P. Peixoto, “Merge-split Markov chain Monte Carlo for community detection”, Phys. Rev. E 102, 012305 (2020), DOI: 10.1103/PhysRevE.102.012305 [sci-hub, @tor], arXiv: 2003.07070

multilevel_mcmc_sweep(niter=1, beta=1.0, c=0.5, psingle=None, pmultilevel=1, d=0.01, r=0.9, random_bisect=True, merge_sweeps=10, mh_sweeps=10, init_r=0.99, init_beta=1.0, gibbs=False, B_min=1, B_max=18446744073709551615, b_min=None, b_max=None, M=None, cache_states=True, entropy_args={}, verbose=False, **kwargs)

Perform niter sweeps of a multilevel agglomerative acceptance-rejection MCMC to sample network partitions, that uses a bisection search on the number of groups, together with group merges and singe-node moves.

Parameters
niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

pmultilevelfloat (optional, default: 1)

Relative probability of proposing a multilevel move.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given single-node move.

rfloat (optional, default: 0.9)

Group shrink ratio. The number of groups is reduced by this fraction at each merge sweep.

random_bisectbool (optional, default: True)

If True, bisections are done at randomly chosen intervals. Otherwise a Fibonacci sequence is used.

merge_sweepsint (optional, default: 10)

Number of sweeps spent to find good merge proposals.

mh_sweepsint (optional, default: 10)

Number of single-node Metropolis-Hastings sweeps between merge splits.

init_rdouble (optional, default: 0.99)

Stopping criterion for the intialization phase, after each node is put in their own group, to set the initial upper bound of the bisection search. A number of single-node Metropolis-Hastings sweeps is done until the number of groups is shrunk by a factor that is larger than this parameter.

init_betafloat (optional, default: 1.)

Inverse temperature to be used for the very first sweep of the initialization phase.

gibbsbool (optional, default: False)

If True, the single node moves use (slower) Gibbs sampling, rather than Metropolis-Hastings.

B_minint (optional, default: 1)

Minimum number of groups to be considered in the search.

b_minVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_min.

B_maxint (optional, default: 1)

Maximum number of groups to be considered in the search.

b_maxVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_max.

Mint (optional, default: None)

Maximum number of groups to select for the multilevel move. If None is provided, then all groups are always elected.

cache_statesbool (optional, default: True)

If True, intermediary states will be cached during the bisection search.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\ln^2 N)$$ complexity, where $$E$$ is the number of edges and $$N$$ is the number of nodes (independently of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

collect_edge_marginals(p=None, update=1)[source]

Collect the edge marginal histogram, which counts the number of times the endpoints of each node have been assigned to a given block pair.

This should be called multiple times, e.g. after repeated runs of the graph_tool.inference.BlockState.mcmc_sweep() function.

Parameters
pEdgePropertyMap (optional, default: None)

Edge property map with edge marginals to be updated. If not provided, an empty histogram will be created.

updatefloat (optional, default: 1)

Each call increases the current count by the amount given by this parameter.

Returns
pEdgePropertyMap

Edge property map with updated edge marginals.

Examples

>>> np.random.seed(42)
>>> gt.seed_rng(42)
>>> g = gt.collection.data["polbooks"]
>>> state = gt.BlockState(g, B=4, deg_corr=True)
>>> pe = None
>>> state.mcmc_sweep(niter=1000)   # remove part of the transient
(...)
>>> for i in range(1000):
...     ret = state.mcmc_sweep(niter=10)
...     pe = state.collect_edge_marginals(pe)
>>> gt.bethe_entropy(g, pe)[0]
0.571383...

collect_vertex_marginals(p=None, b=None, unlabel=False, update=1)[source]

Collect the vertex marginal histogram, which counts the number of times a node was assigned to a given block.

This should be called multiple times, e.g. after repeated runs of the graph_tool.inference.BlockState.mcmc_sweep() function.

Parameters
pVertexPropertyMap (optional, default: None)

Vertex property map with vector-type values, storing the previous block membership counts. If not provided, an empty histogram will be created.

bVertexPropertyMap (optional, default: None)

Vertex property map with group partition. If not provided, the state’s partition will be used.

unlabelbool (optional, default: False)

If True, a canonical labelling of the groups will be used, so that each partition is uniquely represented.

updateint (optional, default: 1)

Each call increases the current count by the amount given by this parameter.

Returns
pVertexPropertyMap

Vertex property map with vector-type values, storing the accumulated block membership counts.

Examples

>>> np.random.seed(42)
>>> gt.seed_rng(42)
>>> g = gt.collection.data["polbooks"]
>>> state = gt.BlockState(g, B=4, deg_corr=True)
>>> pv = None
>>> state.mcmc_sweep(niter=1000)   # remove part of the transient
(...)
>>> for i in range(1000):
...     ret = state.mcmc_sweep(niter=10)
...     pv = state.collect_vertex_marginals(pv)
>>> gt.mf_entropy(g, pv)
15.037062...
>>> gt.graph_draw(g, pos=g.vp["pos"], vertex_shape="pie",
...               vertex_pie_fractions=pv, output="polbooks_blocks_soft_B4.svg")
<...>


“Soft” block partition of a political books network with $$B=4$$.

collect_partition_histogram(h=None, update=1, unlabel=True)[source]

Collect a histogram of partitions.

This should be called multiple times, e.g. after repeated runs of the graph_tool.inference.BlockState.mcmc_sweep() function.

Parameters
hPartitionHist (optional, default: None)

Partition histogram. If not provided, an empty histogram will be created.

updatefloat (optional, default: 1)

Each call increases the current count by the amount given by this parameter.

unlabelbool (optional, default: True)

If True, a canonical labelling of the groups will be used, so that each partition is uniquely represented.

Returns
hPartitionHist (optional, default: None)

Updated Partition histogram.

Examples

>>> np.random.seed(42)
>>> gt.seed_rng(42)
>>> g = gt.collection.data["polbooks"]
>>> state = gt.BlockState(g, B=4, deg_corr=True)
>>> ph = None
>>> state.mcmc_sweep(niter=1000)   # remove part of the transient
(...)
>>> for i in range(1000):
...     ret = state.mcmc_sweep(niter=10)
...     ph = state.collect_partition_histogram(ph)
>>> gt.microstate_entropy(ph)
126.171367...

sample_graph(canonical=False, multigraph=True, self_loops=True, sample_params=False, max_ent=False, n_iter=1000)[source]

Sample a new graph from the fitted model.

Parameters
canonicalbool (optional, default: False)

If canonical == True, the graph will be sampled from the maximum-likelihood estimate of the canonical stochastic block model. Otherwise, it will be sampled from the microcanonical model.

multigraphbool (optional, default: True)

If True, parallel edges will be allowed.

self-loopsbool (optional, default: True)

If True, self-loops will be allowed.

sample_paramsbool (optional, default: True)

If True, and canonical == True and max_ent == False, the count parameters (edges between groups and node degrees) will be sampled from their posterior distribution conditioned on the actual state. Otherwise, their maximum-likelihood values will be used.

max_entbool (optional, default: False)

If True, maximum-entropy model variants will be used.

n_iterint (optional, default: 1000)

Number of iterations used (only relevant if canonical == False and max_ent == True).

Returns
gGraph

Generated graph.

Notes

This function is just a convenience wrapper to generate_sbm(). However, if max_ent==True and canonical == False it wraps random_rewire() instead.

Examples

>>> g = gt.collection.data["polbooks"]
>>> state = gt.minimize_blockmodel_dl(g, multilevel_mcmc_args=dict(B_max=3))
>>> u = state.sample_graph(canonical=True, self_loops=False, multigraph=False)
>>> ustate = gt.BlockState(u, b=state.b)
>>> state.draw(pos=g.vp.pos, output="polbooks-sbm.svg")
<...>
>>> ustate.draw(pos=u.own_property(g.vp.pos), output="polbooks-sbm-sampled.svg")
<...>


Left: Political books network. Right: Sample from the degree-corrected SBM fitted to the original network.

class graph_tool.inference.OverlapBlockState(g, b=None, B=None, recs=[], rec_types=[], rec_params=[], clabel=None, pclabel=None, deg_corr=True, dense_bg=False, **kwargs)[source]

The overlapping stochastic block model state of a given graph.

Parameters
gGraph

Graph to be modelled.

bVertexPropertyMap or numpy.ndarray (optional, default: None)

Initial block labels on the vertices or half-edges. If not supplied, it will be randomly sampled. If the value passed is a vertex property map, it will be assumed to be a non-overlapping partition of the vertices. If it is an edge property map, it should contain a vector for each edge, with the block labels at each end point (sorted according to their vertex index, in the case of undirected graphs, otherwise from source to target). If the value is an numpy.ndarray, it will be assumed to correspond directly to a partition of the list of half-edges.

Bint (optional, default: None)

Number of blocks (or vertex groups). If not supplied it will be obtained from the parameter b.

recslist of EdgePropertyMap instances (optional, default: [])

List of real or discrete-valued edge covariates.

rec_typeslist of edge covariate types (optional, default: [])

List of types of edge covariates. The possible types are: "real-exponential", "real-normal", "discrete-geometric", "discrete-poisson" or "discrete-binomial".

rec_paramslist of dict (optional, default: [])

Model hyperparameters for edge covariates. This should a list of dict instances. See BlockState for more details.

clabelVertexPropertyMap (optional, default: None)

Constraint labels on the vertices. If supplied, vertices with different label values will not be clustered in the same group.

deg_corrbool (optional, default: True)

If True, the degree-corrected version of the blockmodel ensemble will be assumed, otherwise the traditional variant will be used.

dense_bgbool (optional, default: False)

If True a dense matrix is used for the block graph, otherwise a sparse matrix will be used.

copy(g=None, b=None, B=None, deg_corr=None, clabel=None, pclabel=None, **kwargs)[source]

Copies the block state. The parameters override the state properties, and have the same meaning as in the constructor. If overlap=False an instance of BlockState is returned. This is by default a shallow copy.

get_E()[source]

Returns the total number of edges.

get_N()[source]

Returns the total number of nodes.

get_B()[source]

Returns the total number of blocks.

get_nonempty_B()[source]

Returns the total number of nonempty blocks.

get_edge_blocks()[source]

Returns an edge property map which contains the block labels pairs for each edge.

get_overlap_blocks()[source]

Returns the mixed membership of each vertex.

Returns
bvVertexPropertyMap

A vector-valued vertex property map containing the block memberships of each node.

bc_inVertexPropertyMap

The labelled in-degrees of each node, i.e. how many in-edges belong to each group, in the same order as the bv property above.

bc_outVertexPropertyMap

The labelled out-degrees of each node, i.e. how many out-edges belong to each group, in the same order as the bv property above.

bc_totalVertexPropertyMap

The labelled total degrees of each node, i.e. how many incident edges belong to each group, in the same order as the bv property above.

get_nonoverlap_blocks()[source]

Returns a scalar-valued vertex property map with the block mixture represented as a single number.

get_majority_blocks()[source]

Returns a scalar-valued vertex property map with the majority block membership of each node.

get_bclabel(clabel=None)[source]

Returns a VertexPropertyMap corresponding to constraint labels for the block graph.

entropy(adjacency=True, dl=True, partition_dl=True, degree_dl=True, degree_dl_kind='distributed', edges_dl=True, dense=False, multigraph=True, deg_entropy=True, recs=True, recs_dl=True, beta_dl=1.0, exact=True, **kwargs)[source]

Calculate the entropy associated with the current block partition.

Parameters
adjacencybool (optional, default: True)

If True, the adjacency term of the description length will be included.

dlbool (optional, default: True)

If True, the description length for the parameters will be included.

partition_dlbool (optional, default: True)

If True, and dl == True the partition description length will be included.

degree_dlbool (optional, default: True)

If True, and dl == True the degree sequence description length will be included (for degree-corrected models).

degree_dl_kindstr (optional, default: "distributed")

This specifies the prior used for the degree sequence. It must be one of: "uniform", "distributed" (default) or "entropy".

edges_dlbool (optional, default: True)

If True, and dl == True the edge matrix description length will be included.

densebool (optional, default: False)

If True, the “dense” variant of the entropy will be computed.

multigraphbool (optional, default: True)

If True, the multigraph entropy will be used.

deg_entropybool (optional, default: True)

If True, the degree entropy term that is independent of the network partition will be included (for degree-corrected models).

recsbool (optional, default: True)

If True, the likelihood for real or discrete-valued edge covariates is computed.

recs_dlbool (optional, default: True)

If True, and dl == True the edge covariate description length will be included.

beta_dldouble (optional, default: 1.)

Prior inverse temperature.

exactbool (optional, default: True)

If True, the exact expressions will be used. Otherwise, Stirling’s factorial approximation will be used for some terms.

Notes

The “entropy” of the state is minus the log-likelihood of the microcanonical SBM, that includes the generated graph $$\boldsymbol{A}$$ and the model parameters $$\boldsymbol{\theta}$$,

$\begin{split}\mathcal{S} &= - \ln P(\boldsymbol{A},\boldsymbol{\theta}) \\ &= - \ln P(\boldsymbol{A}|\boldsymbol{\theta}) - \ln P(\boldsymbol{\theta}).\end{split}$

This value is also called the description length of the data, and it corresponds to the amount of information required to describe it (in nats).

For the traditional blockmodel (deg_corr == False), the model parameters are $$\boldsymbol{\theta} = \{\boldsymbol{e}, \boldsymbol{b}\}$$, where $$\boldsymbol{e}$$ is the matrix of edge counts between blocks, and $$\boldsymbol{b}$$ is the overlapping partition of the nodes into blocks. For the degree-corrected blockmodel (deg_corr == True), we have an additional set of parameters, namely the labelled degree sequence $$\boldsymbol{k}$$.

The model likelihood $$P(\boldsymbol{A}|\theta)$$ is given analogously to the non-overlapping case, as described in graph_tool.inference.BlockState.entropy().

If dl == True, the description length $$\mathcal{L} = -\ln P(\boldsymbol{\theta})$$ of the model will be returned as well. The edge-count prior $$P(\boldsymbol{e})$$ is described in described in entropy(). For the overlapping partition $$P(\boldsymbol{b})$$, we have

$-\ln P(\boldsymbol{b}) = \ln\left(\!\!{D \choose N}\!\!\right) + \sum_d \ln {\left(\!\!{{B\choose d}\choose n_d}\!\!\right)} + \ln N! - \sum_{\vec{b}}\ln n_{\vec{b}}!,$

where $$d \equiv |\vec{b}|_1 = \sum_rb_r$$ is the mixture size, $$n_d$$ is the number of nodes in a mixture of size $$d$$, $$D$$ is the maximum value of $$d$$, $$n_{\vec{b}}$$ is the number of nodes in mixture $$\vec{b}$$.

For the degree-corrected model we need to specify the prior $$P(\boldsymbol{k})$$ for the labelled degree sequence as well:

$-\ln P(\boldsymbol{k}) = \sum_r\ln\left(\!\!{m_r \choose e_r}\!\!\right) - \sum_{\vec{b}}\ln P(\boldsymbol{k}|{\vec{b}}),$

where $$m_r$$ is the number of non-empty mixtures which contain type $$r$$, and $$P(\boldsymbol{k}|{\vec{b}})$$ is the likelihood of the labelled degree sequence inside mixture $$\vec{b}$$. For this term we have three options:

1. degree_dl_kind == "uniform"

$P(\boldsymbol{k}|\vec{b}) = \prod_r\left(\!\!{n_{\vec{b}}\choose e^r_{\vec{b}}}\!\!\right)^{-1}.$
2. degree_dl_kind == "distributed"

$P(\boldsymbol{k}|\vec{b}) = \prod_{\vec{b}}\frac{\prod_{\vec{k}}\eta_{\vec{k}}^{\vec{b}}!}{n_{\vec{b}}!} \prod_r q(e_{\vec{b}}^r - n_{\vec{b}}, n_{\vec{b}})$

where $$n^{\vec{b}}_{\vec{k}}$$ is the number of nodes in mixture $$\vec{b}$$ with labelled degree $$\vec{k}$$, and $$q(n,m)$$ is the number of partitions of integer $$n$$ into at most $$m$$ parts.

3. degree_dl_kind == "entropy"

$P(\boldsymbol{k}|\vec{b}) = \prod_{\vec{b}}\exp\left(-n_{\vec{b}}H(\boldsymbol{k}_{\vec{b}})\right)$

where $$H(\boldsymbol{k}_{\vec{b}}) = -\sum_{\vec{k}}p_{\vec{b}}(\vec{k})\ln p_{\vec{b}}(\vec{k})$$ is the entropy of the labelled degree distribution inside mixture $$\vec{b}$$.

Note that, differently from the other two choices, this represents only an approximation of the description length. It is meant to be used only for comparison purposes, and should be avoided in practice.

For the directed case, the above expressions are duplicated for the in- and out-degrees.

mcmc_sweep(bundled=False, **kwargs)[source]

Perform sweeps of a Metropolis-Hastings rejection sampling MCMC to sample network partitions. If bundled == True, the half-edges incident of the same node that belong to the same group are moved together. All remaining parameters are passed to graph_tool.inference.BlockState.mcmc_sweep().

Add vertex v to block r.

This optionally accepts a list of vertices and blocks to add.

Warning

This can leave the state in an inconsistent state if a vertex is added twice to the same group.

collect_edge_marginals(p=None, update=1)

Collect the edge marginal histogram, which counts the number of times the endpoints of each node have been assigned to a given block pair.

This should be called multiple times, e.g. after repeated runs of the graph_tool.inference.BlockState.mcmc_sweep() function.

Parameters
pEdgePropertyMap (optional, default: None)

Edge property map with edge marginals to be updated. If not provided, an empty histogram will be created.

updatefloat (optional, default: 1)

Each call increases the current count by the amount given by this parameter.

Returns
pEdgePropertyMap

Edge property map with updated edge marginals.

Examples

>>> np.random.seed(42)
>>> gt.seed_rng(42)
>>> g = gt.collection.data["polbooks"]
>>> state = gt.BlockState(g, B=4, deg_corr=True)
>>> pe = None
>>> state.mcmc_sweep(niter=1000)   # remove part of the transient
(...)
>>> for i in range(1000):
...     ret = state.mcmc_sweep(niter=10)
...     pe = state.collect_edge_marginals(pe)
>>> gt.bethe_entropy(g, pe)[0]
0.571383...

collect_partition_histogram(h=None, update=1, unlabel=True)

Collect a histogram of partitions.

This should be called multiple times, e.g. after repeated runs of the graph_tool.inference.BlockState.mcmc_sweep() function.

Parameters
hPartitionHist (optional, default: None)

Partition histogram. If not provided, an empty histogram will be created.

updatefloat (optional, default: 1)

Each call increases the current count by the amount given by this parameter.

unlabelbool (optional, default: True)

If True, a canonical labelling of the groups will be used, so that each partition is uniquely represented.

Returns
hPartitionHist (optional, default: None)

Updated Partition histogram.

Examples

>>> np.random.seed(42)
>>> gt.seed_rng(42)
>>> g = gt.collection.data["polbooks"]
>>> state = gt.BlockState(g, B=4, deg_corr=True)
>>> ph = None
>>> state.mcmc_sweep(niter=1000)   # remove part of the transient
(...)
>>> for i in range(1000):
...     ret = state.mcmc_sweep(niter=10)
...     ph = state.collect_partition_histogram(ph)
>>> gt.microstate_entropy(ph)
126.171367...

collect_vertex_marginals(p=None, b=None, unlabel=False, update=1)

Collect the vertex marginal histogram, which counts the number of times a node was assigned to a given block.

This should be called multiple times, e.g. after repeated runs of the graph_tool.inference.BlockState.mcmc_sweep() function.

Parameters
pVertexPropertyMap (optional, default: None)

Vertex property map with vector-type values, storing the previous block membership counts. If not provided, an empty histogram will be created.

bVertexPropertyMap (optional, default: None)

Vertex property map with group partition. If not provided, the state’s partition will be used.

unlabelbool (optional, default: False)

If True, a canonical labelling of the groups will be used, so that each partition is uniquely represented.

updateint (optional, default: 1)

Each call increases the current count by the amount given by this parameter.

Returns
pVertexPropertyMap

Vertex property map with vector-type values, storing the accumulated block membership counts.

Examples

>>> np.random.seed(42)
>>> gt.seed_rng(42)
>>> g = gt.collection.data["polbooks"]
>>> state = gt.BlockState(g, B=4, deg_corr=True)
>>> pv = None
>>> state.mcmc_sweep(niter=1000)   # remove part of the transient
(...)
>>> for i in range(1000):
...     ret = state.mcmc_sweep(niter=10)
...     pv = state.collect_vertex_marginals(pv)
>>> gt.mf_entropy(g, pv)
15.037062...
>>> gt.graph_draw(g, pos=g.vp["pos"], vertex_shape="pie",
...               vertex_pie_fractions=pv, output="polbooks_blocks_soft_B4.svg")
<...>


“Soft” block partition of a political books network with $$B=4$$.

exhaustive_sweep(entropy_args={}, callback=None, density=None, vertices=None, initial_partition=None, max_iter=None)

Perform an exhaustive loop over all possible network partitions.

Parameters
entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

callbackcallable object (optional, default: None)

Function to be called for each partition, with three arguments (S, S_min, b_min) corresponding to the the current entropy value, the minimum entropy value so far, and the corresponding partition, respectively. If not provided, and hist is None an iterator over the same values will be returned instead.

densitytuple (optional, default: None)

If provided, it should contain a tuple with values (S_min, S_max, n_bins), which will be used to obtain the density of states via a histogram of size n_bins. This parameter is ignored unless callback is None.

verticesiterable of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

initial_partitioniterable of ints (optional, default: None)

If provided, this will provide the initial partition for the iteration.

max_iterint (optional, default: None)

If provided, this will limit the total number of iterations.

Returns
statesiterator over (S, S_min, b_min)

If callback is None and hist is None, the function will return an iterator over (S, S_min, b_min) corresponding to the the current entropy value, the minimum entropy value so far, and the corresponding partition, respectively.

Ss, countspair of numpy.ndarray

If callback is None and hist is not None, the function will return the values of each bin (Ss) and the state count of each bin (counts).

b_minVertexPropertyMap

If callback is not None or hist is not None, the function will also return partition with smallest entropy.

Notes

This algorithm has an $$O(B^N)$$ complexity, where $$B$$ is the number of groups, and $$N$$ is the number of vertices.

get_Be()

Returns the effective number of blocks, defined as $$e^{H}$$, with $$H=-\sum_r\frac{n_r}{N}\ln \frac{n_r}{N}$$, where $$n_r$$ is the number of nodes in group r.

get_bg()

Returns the block graph.

get_block_state(b=None, vweight=False, **kwargs)

Returns a BlockState corresponding to the block graph (i.e. the blocks of the current state become the nodes). The parameters have the same meaning as the in the constructor. If vweight == True the nodes of the block state are weighted with the node counts.

get_blocks()

Returns the property map which contains the block labels for each vertex.

get_bpclabel()

Returns a VertexPropertyMap corresponding to partition constraint labels for the block graph.

get_edges_prob(missing, spurious=[], entropy_args={})

Compute the joint log-probability of the missing and spurious edges given by missing and spurious (a list of (source, target) tuples, or Edge() instances), together with the observed edges.

More precisely, the log-likelihood returned is

$\ln \frac{P(\boldsymbol G + \delta \boldsymbol G | \boldsymbol b)}{P(\boldsymbol G| \boldsymbol b)}$

where $$\boldsymbol G + \delta \boldsymbol G$$ is the modified graph (with missing edges added and spurious edges deleted).

The values in entropy_args are passed to entropy() to calculate the log-probability.

get_er()

Returns the vertex property map of the block graph which contains the number $$e_r$$ of half-edges incident on block $$r$$. If the graph is directed, a pair of property maps is returned, with the number of out-edges $$e^+_r$$ and in-edges $$e^-_r$$, respectively.

get_ers()

Returns the edge property map of the block graph which contains the $$e_{rs}$$ matrix entries. For undirected graphs, the diagonal values (self-loops) contain $$e_{rr}/2$$.

get_matrix()

Returns the block matrix (as a sparse csr_matrix), which contains the number of edges between each block pair.

Warning

This corresponds to the adjacency matrix of the block graph, which by convention includes twice the amount of edges in the diagonal entries if the graph is undirected.

Examples

>>> g = gt.collection.data["polbooks"]
>>> state = gt.BlockState(g, B=5, deg_corr=True)
>>> state.mcmc_sweep(niter=1000)
(...)
>>> m = state.get_matrix()
>>> figure()
<...>
>>> matshow(m.todense())
<...>
>>> savefig("bloc_mat.svg")


A 5x5 block matrix.

get_move_prob(v, s, c=1.0, d=0.1, reverse=False)

Compute the log-probability of a move proposal for vertex v to block s according to sampling parameters c and d, as obtained with graph_tool.inference.BlockState.sample_vertex_move(). If reverse == True, the reverse probability of moving the node back from block s to its current one is obtained.

get_nr()

Returns the vertex property map of the block graph which contains the block sizes $$n_r$$.

get_rec_params()

Get model hyperparameters for edge covariates.

get_state()

Alias to get_blocks().

gibbs_sweep(beta=1.0, niter=1, entropy_args={}, allow_new_group=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a rejection-free Gibbs MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_new_groupbool (optional, default: True)

Allow the number of groups to increase and decrease.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\times B)$$ complexity, where $$B$$ is the number of groups, and $$E$$ is the number of edges.

move_vertex(v, s)

Move vertex v to block s.

This optionally accepts a list of vertices and blocks to move simultaneously.

multicanonical_sweep(m_state, multiflip=False, **kwargs)

Perform niter sweeps of a non-Markovian multicanonical sampling using the Wang-Landau algorithm.

Parameters
m_stateMulticanonicalState

MulticanonicalState instance containing the current state of the Wang-Landau run.

multiflipbool (optional, default: False)

If True, multiflip_mcmc_sweep() will be used, otherwise mcmc_sweep().

**kwargsKeyword parameter list

The remaining parameters will be passed to multiflip_mcmc_sweep() or mcmc_sweep().

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

wang-efficient-2001

Fugao Wang, D. P. Landau, “An efficient, multiple range random walk algorithm to calculate the density of states”, Phys. Rev. Lett. 86, 2050 (2001), DOI: 10.1103/PhysRevLett.86.2050 [sci-hub, @tor], arXiv: cond-mat/0011174

multiflip_mcmc_sweep(beta=1.0, c=0.5, psingle=None, psplit=1, pmerge=1, pmergesplit=1, pmovelabel=0, d=0.01, gibbs_sweeps=10, niter=1, entropy_args={}, accept_stats=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multiple simultaneous moves (i.e. merges and splits) to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

psplitfloat (optional, default: 1)

Relative probability of proposing a group split.

pmergesplitfloat (optional, default: 1)

Relative probability of proposing a marge-split move.

pmovelabelfloat (optional, default: 0)

Relative probability of proposing a group label move.

dfloat (optional, default: 1)

Probability of selecting a new (i.e. empty) group for a given single-node move.

gibbs_sweepsint (optional, default: 10)

Number of sweeps of Gibbs sampling to be performed (i.e. each node is attempted once per sweep) to refine a split proposal.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-merge-split-2020

Tiago P. Peixoto, “Merge-split Markov chain Monte Carlo for community detection”, Phys. Rev. E 102, 012305 (2020), DOI: 10.1103/PhysRevE.102.012305 [sci-hub, @tor], arXiv: 2003.07070

multilevel_mcmc_sweep(niter=1, beta=1.0, c=0.5, psingle=None, pmultilevel=1, d=0.01, r=0.9, random_bisect=True, merge_sweeps=10, mh_sweeps=10, init_r=0.99, init_beta=1.0, gibbs=False, B_min=1, B_max=18446744073709551615, b_min=None, b_max=None, M=None, cache_states=True, entropy_args={}, verbose=False, **kwargs)

Perform niter sweeps of a multilevel agglomerative acceptance-rejection MCMC to sample network partitions, that uses a bisection search on the number of groups, together with group merges and singe-node moves.

Parameters
niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

pmultilevelfloat (optional, default: 1)

Relative probability of proposing a multilevel move.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given single-node move.

rfloat (optional, default: 0.9)

Group shrink ratio. The number of groups is reduced by this fraction at each merge sweep.

random_bisectbool (optional, default: True)

If True, bisections are done at randomly chosen intervals. Otherwise a Fibonacci sequence is used.

merge_sweepsint (optional, default: 10)

Number of sweeps spent to find good merge proposals.

mh_sweepsint (optional, default: 10)

Number of single-node Metropolis-Hastings sweeps between merge splits.

init_rdouble (optional, default: 0.99)

Stopping criterion for the intialization phase, after each node is put in their own group, to set the initial upper bound of the bisection search. A number of single-node Metropolis-Hastings sweeps is done until the number of groups is shrunk by a factor that is larger than this parameter.

init_betafloat (optional, default: 1.)

Inverse temperature to be used for the very first sweep of the initialization phase.

gibbsbool (optional, default: False)

If True, the single node moves use (slower) Gibbs sampling, rather than Metropolis-Hastings.

B_minint (optional, default: 1)

Minimum number of groups to be considered in the search.

b_minVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_min.

B_maxint (optional, default: 1)

Maximum number of groups to be considered in the search.

b_maxVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_max.

Mint (optional, default: None)

Maximum number of groups to select for the multilevel move. If None is provided, then all groups are always elected.

cache_statesbool (optional, default: True)

If True, intermediary states will be cached during the bisection search.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\ln^2 N)$$ complexity, where $$E$$ is the number of edges and $$N$$ is the number of nodes (independently of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

remove_vertex(v)

Remove vertex v from its current group.

This optionally accepts a list of vertices to remove.

Warning

This will leave the state in an inconsistent state before the vertex is returned to some other group, or if the same vertex is removed twice.

sample_graph(canonical=False, multigraph=True, self_loops=True, sample_params=False, max_ent=False, n_iter=1000)

Sample a new graph from the fitted model.

Parameters
canonicalbool (optional, default: False)

If canonical == True, the graph will be sampled from the maximum-likelihood estimate of the canonical stochastic block model. Otherwise, it will be sampled from the microcanonical model.

multigraphbool (optional, default: True)

If True, parallel edges will be allowed.

self-loopsbool (optional, default: True)

If True, self-loops will be allowed.

sample_paramsbool (optional, default: True)

If True, and canonical == True and max_ent == False, the count parameters (edges between groups and node degrees) will be sampled from their posterior distribution conditioned on the actual state. Otherwise, their maximum-likelihood values will be used.

max_entbool (optional, default: False)

If True, maximum-entropy model variants will be used.

n_iterint (optional, default: 1000)

Number of iterations used (only relevant if canonical == False and max_ent == True).

Returns
gGraph

Generated graph.

Notes

This function is just a convenience wrapper to generate_sbm(). However, if max_ent==True and canonical == False it wraps random_rewire() instead.

Examples

>>> g = gt.collection.data["polbooks"]
>>> state = gt.minimize_blockmodel_dl(g, multilevel_mcmc_args=dict(B_max=3))
>>> u = state.sample_graph(canonical=True, self_loops=False, multigraph=False)
>>> ustate = gt.BlockState(u, b=state.b)
>>> state.draw(pos=g.vp.pos, output="polbooks-sbm.svg")
<...>
>>> ustate.draw(pos=u.own_property(g.vp.pos), output="polbooks-sbm-sampled.svg")
<...>


Left: Political books network. Right: Sample from the degree-corrected SBM fitted to the original network.

sample_vertex_move(v, c=1.0, d=0.1)

Sample block membership proposal of vertex v according to real-valued sampling parameters c and d: For $$c\to 0$$ the blocks are sampled according to the local neighborhood and their connections; for $$c\to\infty$$ the blocks are sampled randomly. With a probability d, a new (empty) group is sampled.

set_rec_params(params)

Update model hyperparameters for edge covariates.

set_state(b)

Sets the internal partition of the state.

virtual_vertex_move(v, s, **kwargs)

Computes the entropy difference if vertex v is moved to block s. The remaining parameters are the same as in graph_tool.inference.BlockState.entropy().

draw(**kwargs)[source]

Convenience wrapper to graph_draw() that draws the state of the graph as colors on the vertices and edges.

class graph_tool.inference.LayeredBlockState(g, ec, eweight=None, vweight=None, recs=[], rec_types=[], rec_params=[], b=None, B=None, clabel=None, pclabel=None, layers=False, deg_corr=True, overlap=False, **kwargs)[source]

The (possibly overlapping) block state of a given graph, where the edges are divided into discrete layers.

Parameters
gGraph

Graph to be modelled.

ecEdgePropertyMap

Edge property map containing discrete edge covariates that will split the network in discrete layers.

recslist of EdgePropertyMap instances (optional, default: [])

List of real or discrete-valued edge covariates.

rec_typeslist of edge covariate types (optional, default: [])

List of types of edge covariates. The possible types are: "real-exponential", "real-normal", "discrete-geometric", "discrete-poisson" or "discrete-binomial".

rec_paramslist of dict (optional, default: [])

Model hyperparameters for edge covariates. This should a list of dict instances. See BlockState for more details.

eweightEdgePropertyMap (optional, default: None)

Edge multiplicities (for multigraphs or block graphs).

vweightVertexPropertyMap (optional, default: None)

Vertex multiplicities (for block graphs).

bVertexPropertyMap or numpy.ndarray (optional, default: None)

Initial block labels on the vertices or half-edges. If not supplied, it will be randomly sampled.

Bint (optional, default: None)

Number of blocks (or vertex groups). If not supplied it will be obtained from the parameter b.

clabelVertexPropertyMap (optional, default: None)

Constraint labels on the vertices. If supplied, vertices with different label values will not be clustered in the same group.

pclabelVertexPropertyMap (optional, default: None)

Partition constraint labels on the vertices. This has the same interpretation as clabel, but will be used to compute the partition description length.

layersbool (optional, default: False)

If layers == True, the “independent layers” version of the model is used, instead of the “edge covariates” version.

deg_corrbool (optional, default: True)

If True, the degree-corrected version of the blockmodel ensemble will be assumed, otherwise the traditional variant will be used.

overlapbool (optional, default: False)

If True, the overlapping version of the model will be used.

get_N()[source]

Returns the total number of edges.

get_E()[source]

Returns the total number of nodes.

get_B()[source]

Returns the total number of blocks.

get_nonempty_B()[source]

Returns the total number of nonempty blocks.

copy(g=None, eweight=None, vweight=None, b=None, B=None, deg_corr=None, clabel=None, pclabel=None, bfield=None, overlap=None, layers=None, ec=None, **kwargs)[source]

Copies the block state. The parameters override the state properties, and have the same meaning as in the constructor.

get_bg()[source]

Returns the block graph.

get_block_state(b=None, vweight=False, deg_corr=False, overlap=False, layers=None, **kwargs)[source]

Returns a LayeredBlockState corresponding to the block graph. The parameters have the same meaning as the in the constructor.

get_edge_blocks()[source]

Returns an edge property map which contains the block labels pairs for each edge.

get_overlap_blocks()[source]

Returns the mixed membership of each vertex.

Returns
bvVertexPropertyMap

A vector-valued vertex property map containing the block memberships of each node.

bc_inVertexPropertyMap

The labelled in-degrees of each node, i.e. how many in-edges belong to each group, in the same order as the bv property above.

bc_outVertexPropertyMap

The labelled out-degrees of each node, i.e. how many out-edges belong to each group, in the same order as the bv property above.

bc_totalVertexPropertyMap

The labelled total degrees of each node, i.e. how many incident edges belong to each group, in the same order as the bv property above.

get_nonoverlap_blocks()[source]

Returns a scalar-valued vertex property map with the block mixture represented as a single number.

get_majority_blocks()[source]

Returns a scalar-valued vertex property map with the majority block membership of each node.

entropy(adjacency=True, dl=True, partition_dl=True, degree_dl=True, degree_dl_kind='distributed', edges_dl=True, dense=False, multigraph=True, deg_entropy=True, exact=True, **kwargs)[source]

Calculate the entropy associated with the current block partition. The meaning of the parameters are the same as in graph_tool.inference.BlockState.entropy().

get_edges_prob(missing, spurious=[], entropy_args={})[source]

Compute the joint log-probability of the missing and spurious edges given by missing and spurious (a list of (source, target, layer) tuples, or Edge() instances), together with the observed edges.

More precisely, the log-likelihood returned is

$\ln \frac{P(\boldsymbol G + \delta \boldsymbol G | \boldsymbol b)}{P(\boldsymbol G| \boldsymbol b)}$

where $$\boldsymbol G + \delta \boldsymbol G$$ is the modified graph (with missing edges added and spurious edges deleted).

The values in entropy_args are passed to graph_tool.inference.BlockState.entropy() to calculate the log-probability.

Add vertex v to block r.

This optionally accepts a list of vertices and blocks to add.

Warning

This can leave the state in an inconsistent state if a vertex is added twice to the same group.

collect_edge_marginals(p=None, update=1)

Collect the edge marginal histogram, which counts the number of times the endpoints of each node have been assigned to a given block pair.

This should be called multiple times, e.g. after repeated runs of the graph_tool.inference.BlockState.mcmc_sweep() function.

Parameters
pEdgePropertyMap (optional, default: None)

Edge property map with edge marginals to be updated. If not provided, an empty histogram will be created.

updatefloat (optional, default: 1)

Each call increases the current count by the amount given by this parameter.

Returns
pEdgePropertyMap

Edge property map with updated edge marginals.

Examples

>>> np.random.seed(42)
>>> gt.seed_rng(42)
>>> g = gt.collection.data["polbooks"]
>>> state = gt.BlockState(g, B=4, deg_corr=True)
>>> pe = None
>>> state.mcmc_sweep(niter=1000)   # remove part of the transient
(...)
>>> for i in range(1000):
...     ret = state.mcmc_sweep(niter=10)
...     pe = state.collect_edge_marginals(pe)
>>> gt.bethe_entropy(g, pe)[0]
0.571383...

collect_partition_histogram(h=None, update=1, unlabel=True)

Collect a histogram of partitions.

This should be called multiple times, e.g. after repeated runs of the graph_tool.inference.BlockState.mcmc_sweep() function.

Parameters
hPartitionHist (optional, default: None)

Partition histogram. If not provided, an empty histogram will be created.

updatefloat (optional, default: 1)

Each call increases the current count by the amount given by this parameter.

unlabelbool (optional, default: True)

If True, a canonical labelling of the groups will be used, so that each partition is uniquely represented.

Returns
hPartitionHist (optional, default: None)

Updated Partition histogram.

Examples

>>> np.random.seed(42)
>>> gt.seed_rng(42)
>>> g = gt.collection.data["polbooks"]
>>> state = gt.BlockState(g, B=4, deg_corr=True)
>>> ph = None
>>> state.mcmc_sweep(niter=1000)   # remove part of the transient
(...)
>>> for i in range(1000):
...     ret = state.mcmc_sweep(niter=10)
...     ph = state.collect_partition_histogram(ph)
>>> gt.microstate_entropy(ph)
126.171367...

collect_vertex_marginals(p=None, b=None, unlabel=False, update=1)

Collect the vertex marginal histogram, which counts the number of times a node was assigned to a given block.

This should be called multiple times, e.g. after repeated runs of the graph_tool.inference.BlockState.mcmc_sweep() function.

Parameters
pVertexPropertyMap (optional, default: None)

Vertex property map with vector-type values, storing the previous block membership counts. If not provided, an empty histogram will be created.

bVertexPropertyMap (optional, default: None)

Vertex property map with group partition. If not provided, the state’s partition will be used.

unlabelbool (optional, default: False)

If True, a canonical labelling of the groups will be used, so that each partition is uniquely represented.

updateint (optional, default: 1)

Each call increases the current count by the amount given by this parameter.

Returns
pVertexPropertyMap

Vertex property map with vector-type values, storing the accumulated block membership counts.

Examples

>>> np.random.seed(42)
>>> gt.seed_rng(42)
>>> g = gt.collection.data["polbooks"]
>>> state = gt.BlockState(g, B=4, deg_corr=True)
>>> pv = None
>>> state.mcmc_sweep(niter=1000)   # remove part of the transient
(...)
>>> for i in range(1000):
...     ret = state.mcmc_sweep(niter=10)
...     pv = state.collect_vertex_marginals(pv)
>>> gt.mf_entropy(g, pv)
15.037062...
>>> gt.graph_draw(g, pos=g.vp["pos"], vertex_shape="pie",
...               vertex_pie_fractions=pv, output="polbooks_blocks_soft_B4.svg")
<...>


“Soft” block partition of a political books network with $$B=4$$.

exhaustive_sweep(entropy_args={}, callback=None, density=None, vertices=None, initial_partition=None, max_iter=None)

Perform an exhaustive loop over all possible network partitions.

Parameters
entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

callbackcallable object (optional, default: None)

Function to be called for each partition, with three arguments (S, S_min, b_min) corresponding to the the current entropy value, the minimum entropy value so far, and the corresponding partition, respectively. If not provided, and hist is None an iterator over the same values will be returned instead.

densitytuple (optional, default: None)

If provided, it should contain a tuple with values (S_min, S_max, n_bins), which will be used to obtain the density of states via a histogram of size n_bins. This parameter is ignored unless callback is None.

verticesiterable of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

initial_partitioniterable of ints (optional, default: None)

If provided, this will provide the initial partition for the iteration.

max_iterint (optional, default: None)

If provided, this will limit the total number of iterations.

Returns
statesiterator over (S, S_min, b_min)

If callback is None and hist is None, the function will return an iterator over (S, S_min, b_min) corresponding to the the current entropy value, the minimum entropy value so far, and the corresponding partition, respectively.

Ss, countspair of numpy.ndarray

If callback is None and hist is not None, the function will return the values of each bin (Ss) and the state count of each bin (counts).

b_minVertexPropertyMap

If callback is not None or hist is not None, the function will also return partition with smallest entropy.

Notes

This algorithm has an $$O(B^N)$$ complexity, where $$B$$ is the number of groups, and $$N$$ is the number of vertices.

get_Be()

Returns the effective number of blocks, defined as $$e^{H}$$, with $$H=-\sum_r\frac{n_r}{N}\ln \frac{n_r}{N}$$, where $$n_r$$ is the number of nodes in group r.

get_bclabel(clabel=None)

Returns a VertexPropertyMap corresponding to constraint labels for the block graph.

get_blocks()

Returns the property map which contains the block labels for each vertex.

get_bpclabel()

Returns a VertexPropertyMap corresponding to partition constraint labels for the block graph.

get_er()

Returns the vertex property map of the block graph which contains the number $$e_r$$ of half-edges incident on block $$r$$. If the graph is directed, a pair of property maps is returned, with the number of out-edges $$e^+_r$$ and in-edges $$e^-_r$$, respectively.

get_ers()

Returns the edge property map of the block graph which contains the $$e_{rs}$$ matrix entries. For undirected graphs, the diagonal values (self-loops) contain $$e_{rr}/2$$.

get_matrix()

Returns the block matrix (as a sparse csr_matrix), which contains the number of edges between each block pair.

Warning

This corresponds to the adjacency matrix of the block graph, which by convention includes twice the amount of edges in the diagonal entries if the graph is undirected.

Examples

>>> g = gt.collection.data["polbooks"]
>>> state = gt.BlockState(g, B=5, deg_corr=True)
>>> state.mcmc_sweep(niter=1000)
(...)
>>> m = state.get_matrix()
>>> figure()
<...>
>>> matshow(m.todense())
<...>
>>> savefig("bloc_mat.svg")


A 5x5 block matrix.

get_move_prob(v, s, c=1.0, d=0.1, reverse=False)

Compute the log-probability of a move proposal for vertex v to block s according to sampling parameters c and d, as obtained with graph_tool.inference.BlockState.sample_vertex_move(). If reverse == True, the reverse probability of moving the node back from block s to its current one is obtained.

get_nr()

Returns the vertex property map of the block graph which contains the block sizes $$n_r$$.

get_rec_params()

Get model hyperparameters for edge covariates.

get_state()

Alias to get_blocks().

gibbs_sweep(beta=1.0, niter=1, entropy_args={}, allow_new_group=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a rejection-free Gibbs MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_new_groupbool (optional, default: True)

Allow the number of groups to increase and decrease.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\times B)$$ complexity, where $$B$$ is the number of groups, and $$E$$ is the number of edges.

mcmc_sweep(bundled=False, **kwargs)[source]

Perform sweeps of a Metropolis-Hastings rejection sampling MCMC to sample network partitions. If bundled == True and the state is an overlapping one, the half-edges incident of the same node that belong to the same group are moved together. All remaining parameters are passed to graph_tool.inference.BlockState.mcmc_sweep().

move_vertex(v, s)

Move vertex v to block s.

This optionally accepts a list of vertices and blocks to move simultaneously.

multicanonical_sweep(m_state, multiflip=False, **kwargs)

Perform niter sweeps of a non-Markovian multicanonical sampling using the Wang-Landau algorithm.

Parameters
m_stateMulticanonicalState

MulticanonicalState instance containing the current state of the Wang-Landau run.

multiflipbool (optional, default: False)

If True, multiflip_mcmc_sweep() will be used, otherwise mcmc_sweep().

**kwargsKeyword parameter list

The remaining parameters will be passed to multiflip_mcmc_sweep() or mcmc_sweep().

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

wang-efficient-2001

Fugao Wang, D. P. Landau, “An efficient, multiple range random walk algorithm to calculate the density of states”, Phys. Rev. Lett. 86, 2050 (2001), DOI: 10.1103/PhysRevLett.86.2050 [sci-hub, @tor], arXiv: cond-mat/0011174

multiflip_mcmc_sweep(beta=1.0, c=0.5, psingle=None, psplit=1, pmerge=1, pmergesplit=1, pmovelabel=0, d=0.01, gibbs_sweeps=10, niter=1, entropy_args={}, accept_stats=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multiple simultaneous moves (i.e. merges and splits) to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

psplitfloat (optional, default: 1)

Relative probability of proposing a group split.

pmergesplitfloat (optional, default: 1)

Relative probability of proposing a marge-split move.

pmovelabelfloat (optional, default: 0)

Relative probability of proposing a group label move.

dfloat (optional, default: 1)

Probability of selecting a new (i.e. empty) group for a given single-node move.

gibbs_sweepsint (optional, default: 10)

Number of sweeps of Gibbs sampling to be performed (i.e. each node is attempted once per sweep) to refine a split proposal.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-merge-split-2020

Tiago P. Peixoto, “Merge-split Markov chain Monte Carlo for community detection”, Phys. Rev. E 102, 012305 (2020), DOI: 10.1103/PhysRevE.102.012305 [sci-hub, @tor], arXiv: 2003.07070

multilevel_mcmc_sweep(niter=1, beta=1.0, c=0.5, psingle=None, pmultilevel=1, d=0.01, r=0.9, random_bisect=True, merge_sweeps=10, mh_sweeps=10, init_r=0.99, init_beta=1.0, gibbs=False, B_min=1, B_max=18446744073709551615, b_min=None, b_max=None, M=None, cache_states=True, entropy_args={}, verbose=False, **kwargs)

Perform niter sweeps of a multilevel agglomerative acceptance-rejection MCMC to sample network partitions, that uses a bisection search on the number of groups, together with group merges and singe-node moves.

Parameters
niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

pmultilevelfloat (optional, default: 1)

Relative probability of proposing a multilevel move.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given single-node move.

rfloat (optional, default: 0.9)

Group shrink ratio. The number of groups is reduced by this fraction at each merge sweep.

random_bisectbool (optional, default: True)

If True, bisections are done at randomly chosen intervals. Otherwise a Fibonacci sequence is used.

merge_sweepsint (optional, default: 10)

Number of sweeps spent to find good merge proposals.

mh_sweepsint (optional, default: 10)

Number of single-node Metropolis-Hastings sweeps between merge splits.

init_rdouble (optional, default: 0.99)

Stopping criterion for the intialization phase, after each node is put in their own group, to set the initial upper bound of the bisection search. A number of single-node Metropolis-Hastings sweeps is done until the number of groups is shrunk by a factor that is larger than this parameter.

init_betafloat (optional, default: 1.)

Inverse temperature to be used for the very first sweep of the initialization phase.

gibbsbool (optional, default: False)

If True, the single node moves use (slower) Gibbs sampling, rather than Metropolis-Hastings.

B_minint (optional, default: 1)

Minimum number of groups to be considered in the search.

b_minVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_min.

B_maxint (optional, default: 1)

Maximum number of groups to be considered in the search.

b_maxVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_max.

Mint (optional, default: None)

Maximum number of groups to select for the multilevel move. If None is provided, then all groups are always elected.

cache_statesbool (optional, default: True)

If True, intermediary states will be cached during the bisection search.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\ln^2 N)$$ complexity, where $$E$$ is the number of edges and $$N$$ is the number of nodes (independently of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

remove_vertex(v)

Remove vertex v from its current group.

This optionally accepts a list of vertices to remove.

Warning

This will leave the state in an inconsistent state before the vertex is returned to some other group, or if the same vertex is removed twice.

sample_graph(canonical=False, multigraph=True, self_loops=True, sample_params=False, max_ent=False, n_iter=1000)

Sample a new graph from the fitted model.

Parameters
canonicalbool (optional, default: False)

If canonical == True, the graph will be sampled from the maximum-likelihood estimate of the canonical stochastic block model. Otherwise, it will be sampled from the microcanonical model.

multigraphbool (optional, default: True)

If True, parallel edges will be allowed.

self-loopsbool (optional, default: True)

If True, self-loops will be allowed.

sample_paramsbool (optional, default: True)

If True, and canonical == True and max_ent == False, the count parameters (edges between groups and node degrees) will be sampled from their posterior distribution conditioned on the actual state. Otherwise, their maximum-likelihood values will be used.

max_entbool (optional, default: False)

If True, maximum-entropy model variants will be used.

n_iterint (optional, default: 1000)

Number of iterations used (only relevant if canonical == False and max_ent == True).

Returns
gGraph

Generated graph.

Notes

This function is just a convenience wrapper to generate_sbm(). However, if max_ent==True and canonical == False it wraps random_rewire() instead.

Examples

>>> g = gt.collection.data["polbooks"]
>>> state = gt.minimize_blockmodel_dl(g, multilevel_mcmc_args=dict(B_max=3))
>>> u = state.sample_graph(canonical=True, self_loops=False, multigraph=False)
>>> ustate = gt.BlockState(u, b=state.b)
>>> state.draw(pos=g.vp.pos, output="polbooks-sbm.svg")
<...>
>>> ustate.draw(pos=u.own_property(g.vp.pos), output="polbooks-sbm-sampled.svg")
<...>


Left: Political books network. Right: Sample from the degree-corrected SBM fitted to the original network.

sample_vertex_move(v, c=1.0, d=0.1)

Sample block membership proposal of vertex v according to real-valued sampling parameters c and d: For $$c\to 0$$ the blocks are sampled according to the local neighborhood and their connections; for $$c\to\infty$$ the blocks are sampled randomly. With a probability d, a new (empty) group is sampled.

set_rec_params(params)

Update model hyperparameters for edge covariates.

set_state(b)

Sets the internal partition of the state.

virtual_vertex_move(v, s, **kwargs)

Computes the entropy difference if vertex v is moved to block s. The remaining parameters are the same as in graph_tool.inference.BlockState.entropy().

draw(**kwargs)[source]

Convenience function to draw the current state. All keyword arguments are passed to graph_tool.inference.BlockState.draw() or graph_tool.inference.OverlapBlockState.draw(), as appropriate.

class graph_tool.inference.NestedBlockState(g, bs=None, base_type=<class 'graph_tool.inference.blockmodel.BlockState'>, state_args={}, hstate_args={}, hentropy_args={}, **kwargs)[source]

Bases: object

The nested stochastic block model state of a given graph.

Parameters
gGraph

Graph to be modeled.

bslist of VertexPropertyMap or numpy.ndarray (optional, default: None)

Hierarchical node partition. If not provided it will correspond to a single-group hierarchy of length $$\lceil\log_2(N)\rceil$$.

base_typetype (optional, default: BlockState)

State type for lowermost level (e.g. BlockState, OverlapBlockState or LayeredBlockState)

hstate_argsdict (optional, default: {})

Keyword arguments to be passed to the constructor of the higher-level states.

hentropy_argsdict (optional, default: {})

Keyword arguments to be passed to the entropy() method of the higher-level states.

state_argsdict (optional, default: {})

Keyword arguments to be passed to base type constructor.

**kwargskeyword arguments

Keyword arguments to be passed to base type constructor. The state_args parameter overrides this.

copy(**kwargs)[source]

Copies the block state. The parameters override the state properties, and have the same meaning as in the constructor.

get_bs()[source]

Get hierarchy levels as a list of numpy.ndarray objects with the group memberships at each level.

get_state()[source]

Alias to get_bs().

set_state(bs)[source]

Sets the internal nested partition of the state.

get_levels()[source]

Get hierarchy levels as a list of BlockState instances.

project_partition(j, l)[source]

Project partition of level j onto level l, and return it.

propagate_clabel(l)[source]

Project base clabel to level l.

get_clabel(l)[source]

Get clabel for level l.

level_entropy(l, bstate=None, **kwargs)[source]

Compute the entropy of level l.

entropy(**kwargs)[source]

Compute the entropy of whole hierarchy.

The keyword arguments are passed to the entropy() method of the underlying state objects (e.g. graph_tool.inference.BlockState.entropy, graph_tool.inference.OverlapBlockState.entropy, or graph_tool.inference.LayeredBlockState.entropy).

move_vertex(v, s)[source]

Move vertex v to block s.

remove_vertex(v)[source]

Remove vertex v from its current group.

This optionally accepts a list of vertices to remove.

Warning

This will leave the state in an inconsistent state before the vertex is returned to some other group, or if the same vertex is removed twice.

Add vertex v to block r.

This optionally accepts a list of vertices and blocks to add.

Warning

This can leave the state in an inconsistent state if a vertex is added twice to the same group.

get_edges_prob(missing, spurious=[], entropy_args={})[source]

Compute the joint log-probability of the missing and spurious edges given by missing and spurious (a list of (source, target) tuples, or Edge() instances), together with the observed edges.

More precisely, the log-likelihood returned is

$\ln \frac{P(\boldsymbol G + \delta \boldsymbol G | \boldsymbol b)}{P(\boldsymbol G| \boldsymbol b)}$

where $$\boldsymbol G + \delta \boldsymbol G$$ is the modified graph (with missing edges added and spurious edges deleted).

The values in entropy_args are passed to graph_tool.inference.BlockState.entropy() to calculate the log-probability.

get_bstack()[source]

Return the nested levels as individual graphs.

This returns a list of Graph instances representing the inferred hierarchy at each level. Each graph has two internal vertex and edge property maps named “count” which correspond to the vertex and edge counts at the lower level, respectively. Additionally, an internal vertex property map named “b” specifies the block partition.

project_level(l)[source]

Project the partition at level l onto the lowest level, and return the corresponding state.

print_summary()[source]

Print a hierarchy summary.

mcmc_sweep(**kwargs)[source]

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC to sample hierarchical network partitions.

The arguments accepted are the same as in graph_tool.inference.BlockState.mcmc_sweep().

If the parameter c is a scalar, the values used at each level are c * 2 ** l for l in the range [0, L-1]. Optionally, a list of values may be passed instead, which specifies the value of c[l] to be used at each level.

Warning

This function performs niter sweeps at each hierarchical level once. This means that in order for the chain to equilibrate, we need to call this function several times, i.e. it is not enough to call it once with a large value of niter.

multiflip_mcmc_sweep(**kwargs)[source]

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multiple moves to sample hierarchical network partitions.

The arguments accepted are the same as in graph_tool.inference.BlockState.multiflip_mcmc_sweep().

If the parameter c is a scalar, the values used at each level are c * 2 ** l for l in the range [0, L-1]. Optionally, a list of values may be passed instead, which specifies the value of c[l] to be used at each level.

Warning

This function performs niter sweeps at each hierarchical level once. This means that in order for the chain to equilibrate, we need to call this function several times, i.e. it is not enough to call it once with a large value of niter.

multilevel_mcmc_sweep(**kwargs)[source]

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multilevel moves to sample hierarchical network partitions.

The arguments accepted are the same as in graph_tool.inference.BlockState.multilevel_mcmc_sweep().

If the parameter c is a scalar, the values used at each level are c * 2 ** l for l in the range [0, L-1]. Optionally, a list of values may be passed instead, which specifies the value of c[l] to be used at each level.

Warning

This function performs niter sweeps at each hierarchical level once. This means that in order for the chain to equilibrate, we need to call this function several times, i.e. it is not enough to call it once with a large value of niter.

gibbs_sweep(**kwargs)[source]

Perform niter sweeps of a rejection-free Gibbs sampling MCMC to sample network partitions.

The arguments accepted are the same as in graph_tool.inference.BlockState.gibbs_sweep().

Warning

This function performs niter sweeps at each hierarchical level once. This means that in order for the chain to equilibrate, we need to call this function several times, i.e. it is not enough to call it once with a large value of niter.

multicanonical_sweep(m_state, **kwargs)[source]

Perform niter sweeps of a non-Markovian multicanonical sampling using the Wang-Landau algorithm.

The arguments accepted are the same as in graph_tool.inference.BlockState.multicanonical_sweep().

collect_partition_histogram(h=None, update=1)[source]

Collect a histogram of partitions.

This should be called multiple times, e.g. after repeated runs of the graph_tool.inference.NestedBlockState.mcmc_sweep() function.

Parameters
hPartitionHist (optional, default: None)

Partition histogram. If not provided, an empty histogram will be created.

updatefloat (optional, default: 1)

Each call increases the current count by the amount given by this parameter.

Returns
hPartitionHist (optional, default: None)

Updated Partition histogram.

draw(**kwargs)[source]

Convenience wrapper to draw_hierarchy() that draws the hierarchical state.

class graph_tool.inference.PPBlockState(g, b=None)[source]

Obtain the partition of a network according to the Bayesian planted partition model.

Parameters
gGraph

Graph to be modelled.

bPropertyMap (optional, default: None)

Initial partition. If not supplied, a partition into a single group will be used.

References

lizhi-statistical-2020

Lizhi Zhang, Tiago P. Peixoto, “Statistical inference of assortative community structures”, Phys. Rev. Research 2 043271 (2020), DOI: 10.1103/PhysRevResearch.2.043271 [sci-hub, @tor], arXiv: 2006.14493

copy(g=None, b=None)[source]

Copies the state. The parameters override the state properties, and have the same meaning as in the constructor.

get_blocks()[source]

Returns the property map which contains the block labels for each vertex.

get_state()[source]

Alias to get_blocks().

get_B()[source]

Returns the total number of blocks.

get_nonempty_B()[source]

Alias to get_B().

get_Be()[source]

Returns the effective number of blocks, defined as $$e^{H}$$, with $$H=-\sum_r\frac{n_r}{N}\ln \frac{n_r}{N}$$, where $$n_r$$ is the number of nodes in group r.

virtual_vertex_move(v, s, **kwargs)[source]

Computes the entropy difference if vertex v is moved to block s. The remaining parameters are the same as in entropy().

move_vertex(v, s)[source]

Move vertex v to block s.

entropy(uniform=False, degree_dl_kind='distributed', **kwargs)[source]

Return the model entropy (negative log-likelihood).

Parameters
uniformbool (optional, default: False)

If True, the uniform planted partition model is used, otherwise a non-uniform version is used.

degree_dl_kindstr (optional, default: "distributed")

This specifies the prior used for the degree sequence. It must be one of: "uniform" or "distributed" (default).

Notes

The “entropy” of the state is the negative log-likelihood of the microcanonical SBM, that includes the generated graph $$\boldsymbol{A}$$ and the model parameters $$e_{\text{in}}$$, $$e_{\text{out}}$$, $$\boldsymbol{k}$$ and $$\boldsymbol{b}$$,

$\begin{split}\Sigma &= - \ln P(\boldsymbol{A},e_{\text{in}},e_{\text{out}},\boldsymbol{k},\boldsymbol{b}) \\ &= - \ln P(\boldsymbol{A}|e_{\text{in}},e_{\text{out}},\boldsymbol{k},\boldsymbol{b}) - \ln P(e_{\text{in}},e_{\text{out}},\boldsymbol{k},\boldsymbol{b}).\end{split}$

This value is also called the description length of the data, and it corresponds to the amount of information required to describe it (in nats).

For the uniform version of the model, the likelihood is

$P(\boldsymbol{A}|\boldsymbol{k},\boldsymbol{b}) = \frac{e_{\text{in}}!e_{\text{out}}!} {\left(\frac{B}{2}\right)^{e_{\text{in}}}{B\choose 2}^{e_{\text{out}}}(E+1)^{1-\delta_{B,1}}\prod_re_r!}\times \frac{\prod_ik_i!}{\prod_{i<j}A_{ij}!\prod_i A_{ii}!!}.$

where $$e_{\text{in}}$$ and $$e_{\text{out}}$$ are the number of edges inside and outside communities, respectively, and $$e_r$$ is the sum of degrees in group $$r$$.

For the non-uniform model we have instead:

$P(\boldsymbol{A}|\boldsymbol{k},\boldsymbol{b}) = \frac{e_{\text{out}}!\prod_re_{rr}!!} {{B\choose 2}^{e_{\text{out}}}(E+1)^{1-\delta_{B,1}}\prod_re_r!}\times{B + e_{\text{in}} - 1 \choose e_{\text{in}}}^{-1}\times \frac{\prod_ik_i!}{\prod_{i<j}A_{ij}!\prod_i A_{ii}!!}.$

Here there are two options for the prior on the degrees:

1. degree_dl_kind == "uniform"

$P(\boldsymbol{k}|\boldsymbol{e},\boldsymbol{b}) = \prod_r\left(\!\!{n_r\choose e_r}\!\!\right)^{-1}.$

This corresponds to a noninformative prior, where the degrees are sampled from a uniform distribution.

2. degree_dl_kind == "distributed" (default)

$P(\boldsymbol{k}|\boldsymbol{e},\boldsymbol{b}) = \prod_r\frac{\prod_k\eta_k^r!}{n_r!} \prod_r q(e_r, n_r)^{-1}$

with $$\eta_k^r$$ being the number of nodes with degree $$k$$ in group $$r$$, and $$q(n,m)$$ being the number of partitions of integer $$n$$ into at most $$m$$ parts.

This corresponds to a prior for the degree sequence conditioned on the degree frequencies, which are themselves sampled from a uniform hyperprior. This option should be preferred in most cases.

For the partition prior $$P(\boldsymbol{b})$$ please refer to entropy().

References

lizhi-statistical-2020

Lizhi Zhang, Tiago P. Peixoto, “Statistical inference of assortative community structures”, Phys. Rev. Research 2 043271 (2020), DOI: 10.1103/PhysRevResearch.2.043271 [sci-hub, @tor], arXiv: 2006.14493

draw(**kwargs)

Convenience wrapper to graph_draw() that draws the state of the graph as colors on the vertices and edges.

gibbs_sweep(beta=1.0, niter=1, entropy_args={}, allow_new_group=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a rejection-free Gibbs MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_new_groupbool (optional, default: True)

Allow the number of groups to increase and decrease.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\times B)$$ complexity, where $$B$$ is the number of groups, and $$E$$ is the number of edges.

mcmc_sweep(beta=1.0, c=0.5, d=0.01, niter=1, entropy_args={}, allow_vacate=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given move.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_vacatebool (optional, default: True)

Allow groups to be vacated.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

multiflip_mcmc_sweep(beta=1.0, c=0.5, psingle=None, psplit=1, pmerge=1, pmergesplit=1, pmovelabel=0, d=0.01, gibbs_sweeps=10, niter=1, entropy_args={}, accept_stats=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multiple simultaneous moves (i.e. merges and splits) to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

psplitfloat (optional, default: 1)

Relative probability of proposing a group split.

pmergesplitfloat (optional, default: 1)

Relative probability of proposing a marge-split move.

pmovelabelfloat (optional, default: 0)

Relative probability of proposing a group label move.

dfloat (optional, default: 1)

Probability of selecting a new (i.e. empty) group for a given single-node move.

gibbs_sweepsint (optional, default: 10)

Number of sweeps of Gibbs sampling to be performed (i.e. each node is attempted once per sweep) to refine a split proposal.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-merge-split-2020

Tiago P. Peixoto, “Merge-split Markov chain Monte Carlo for community detection”, Phys. Rev. E 102, 012305 (2020), DOI: 10.1103/PhysRevE.102.012305 [sci-hub, @tor], arXiv: 2003.07070

multilevel_mcmc_sweep(niter=1, beta=1.0, c=0.5, psingle=None, pmultilevel=1, d=0.01, r=0.9, random_bisect=True, merge_sweeps=10, mh_sweeps=10, init_r=0.99, init_beta=1.0, gibbs=False, B_min=1, B_max=18446744073709551615, b_min=None, b_max=None, M=None, cache_states=True, entropy_args={}, verbose=False, **kwargs)

Perform niter sweeps of a multilevel agglomerative acceptance-rejection MCMC to sample network partitions, that uses a bisection search on the number of groups, together with group merges and singe-node moves.

Parameters
niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

pmultilevelfloat (optional, default: 1)

Relative probability of proposing a multilevel move.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given single-node move.

rfloat (optional, default: 0.9)

Group shrink ratio. The number of groups is reduced by this fraction at each merge sweep.

random_bisectbool (optional, default: True)

If True, bisections are done at randomly chosen intervals. Otherwise a Fibonacci sequence is used.

merge_sweepsint (optional, default: 10)

Number of sweeps spent to find good merge proposals.

mh_sweepsint (optional, default: 10)

Number of single-node Metropolis-Hastings sweeps between merge splits.

init_rdouble (optional, default: 0.99)

Stopping criterion for the intialization phase, after each node is put in their own group, to set the initial upper bound of the bisection search. A number of single-node Metropolis-Hastings sweeps is done until the number of groups is shrunk by a factor that is larger than this parameter.

init_betafloat (optional, default: 1.)

Inverse temperature to be used for the very first sweep of the initialization phase.

gibbsbool (optional, default: False)

If True, the single node moves use (slower) Gibbs sampling, rather than Metropolis-Hastings.

B_minint (optional, default: 1)

Minimum number of groups to be considered in the search.

b_minVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_min.

B_maxint (optional, default: 1)

Maximum number of groups to be considered in the search.

b_maxVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_max.

Mint (optional, default: None)

Maximum number of groups to select for the multilevel move. If None is provided, then all groups are always elected.

cache_statesbool (optional, default: True)

If True, intermediary states will be cached during the bisection search.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\ln^2 N)$$ complexity, where $$E$$ is the number of edges and $$N$$ is the number of nodes (independently of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

class graph_tool.inference.RankedBlockState(g, b=None, u=None, **kwargs)[source]

Obtain the ordered partition of a network according to the ranked stochastic block model.

Parameters
gGraph

Graph to be modelled.

bPropertyMap (optional, default: None)

Initial partition. If not supplied, a partition into a single group will be used.

uPropertyMap or iterable (optional, default: None)

Ordering of the group labels. It should contain a map from each group label to the unit interval $$[0,1]$$, inidicating how the groups should be ordered. If not supplied, the numeric values of the group lalbels will be used to initialize the ordering.

References

peixoto-ordered-2022

Tiago P. Peixoto, “Ordered community detection in directed networks”, arXiv: 2203.16460

copy(**kwargs)[source]

Copies the state. The parameters override the state properties, and have the same meaning as in the constructor.

get_block_state(b=None, vweight=None, deg_corr=False, **kwargs)[source]

Returns a ReciprocalBlockState corresponding to the block graph (i.e. the blocks of the current state become the nodes).

get_blocks()[source]

Returns the property map which contains the block labels for each vertex.

get_state()[source]

Alias to get_blocks().

get_block_order()[source]

Returns an array indexed by the group label containing its rank order.

get_vertex_order()[source]

Returns a vertex PropertyMap with the rank order for every vertex.

get_vertex_position()[source]

Returns a vertex PropertyMap with vertex positions in the unit interval $$[0,1]$$.

collect_vertex_marginals(p=None, b=None, update=1)[source]

Collect the vertex marginal histogram, which counts the number of times a node was assigned to a given block.

Parameters
pVertexPropertyMap (optional, default: None)

Vertex property map with vector-type values, storing the previous block membership counts. If not provided, an empty histogram will be created.

bVertexPropertyMap (optional, default: None)

Vertex property map with group partition. If not provided, the state’s partition will be used.

updateint (optional, default: 1)

Each call increases the current count by the amount given by this parameter.

Returns
pVertexPropertyMap

Vertex property map with vector-type values, storing the accumulated block membership counts.

get_edge_dir()[source]

Return an edge PropertyMap containing the edge direction: -1 (downstream), 0 (lateral), +1 (upstream).

get_N()[source]

Return the number of nodes.

get_E()[source]

Return the number of edges.

get_Es()[source]

Return the number of dowstream, lateral, and upstream edges.

get_B()[source]

Returns the total number of blocks.

get_nonempty_B()[source]

Alias to get_B().

get_Be()[source]

Returns the effective number of blocks, defined as $$e^{H}$$, with $$H=-\sum_r\frac{n_r}{N}\ln \frac{n_r}{N}$$, where $$n_r$$ is the number of nodes in group r.

virtual_vertex_move(v, s, **kwargs)[source]

Computes the entropy difference if vertex v is moved to block s. The remaining parameters are the same as in entropy().

move_vertex(v, s)[source]

Move vertex v to block s.

entropy(adjacency=True, dl=True, partition_dl=True, degree_dl=True, degree_dl_kind='distributed', edges_dl=True, dense=False, multigraph=True, deg_entropy=True, recs=True, recs_dl=True, beta_dl=1.0, Bfield=True, exact=True, **kwargs)[source]

Return the model entropy (negative log-likelihood). See BlockState.entropy() for documentation.

sample_graph(**kwargs)[source]

Sample a new graph from the fitted model. See BlockState.sample_graph() for documentation.

draw(**kwargs)[source]

Convenience wrapper to graph_draw() that draws the state of the graph as colors on the vertices and edges.

gibbs_sweep(beta=1.0, niter=1, entropy_args={}, allow_new_group=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a rejection-free Gibbs MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_new_groupbool (optional, default: True)

Allow the number of groups to increase and decrease.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\times B)$$ complexity, where $$B$$ is the number of groups, and $$E$$ is the number of edges.

mcmc_sweep(beta=1.0, c=0.5, d=0.01, niter=1, entropy_args={}, allow_vacate=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given move.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_vacatebool (optional, default: True)

Allow groups to be vacated.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

multiflip_mcmc_sweep(beta=1.0, c=0.5, psingle=None, psplit=1, pmerge=1, pmergesplit=1, pmovelabel=0, d=0.01, gibbs_sweeps=10, niter=1, entropy_args={}, accept_stats=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multiple simultaneous moves (i.e. merges and splits) to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

psplitfloat (optional, default: 1)

Relative probability of proposing a group split.

pmergesplitfloat (optional, default: 1)

Relative probability of proposing a marge-split move.

pmovelabelfloat (optional, default: 0)

Relative probability of proposing a group label move.

dfloat (optional, default: 1)

Probability of selecting a new (i.e. empty) group for a given single-node move.

gibbs_sweepsint (optional, default: 10)

Number of sweeps of Gibbs sampling to be performed (i.e. each node is attempted once per sweep) to refine a split proposal.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-merge-split-2020

Tiago P. Peixoto, “Merge-split Markov chain Monte Carlo for community detection”, Phys. Rev. E 102, 012305 (2020), DOI: 10.1103/PhysRevE.102.012305 [sci-hub, @tor], arXiv: 2003.07070

multilevel_mcmc_sweep(niter=1, beta=1.0, c=0.5, psingle=None, pmultilevel=1, d=0.01, r=0.9, random_bisect=True, merge_sweeps=10, mh_sweeps=10, init_r=0.99, init_beta=1.0, gibbs=False, B_min=1, B_max=18446744073709551615, b_min=None, b_max=None, M=None, cache_states=True, entropy_args={}, verbose=False, **kwargs)

Perform niter sweeps of a multilevel agglomerative acceptance-rejection MCMC to sample network partitions, that uses a bisection search on the number of groups, together with group merges and singe-node moves.

Parameters
niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

pmultilevelfloat (optional, default: 1)

Relative probability of proposing a multilevel move.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given single-node move.

rfloat (optional, default: 0.9)

Group shrink ratio. The number of groups is reduced by this fraction at each merge sweep.

random_bisectbool (optional, default: True)

If True, bisections are done at randomly chosen intervals. Otherwise a Fibonacci sequence is used.

merge_sweepsint (optional, default: 10)

Number of sweeps spent to find good merge proposals.

mh_sweepsint (optional, default: 10)

Number of single-node Metropolis-Hastings sweeps between merge splits.

init_rdouble (optional, default: 0.99)

Stopping criterion for the intialization phase, after each node is put in their own group, to set the initial upper bound of the bisection search. A number of single-node Metropolis-Hastings sweeps is done until the number of groups is shrunk by a factor that is larger than this parameter.

init_betafloat (optional, default: 1.)

Inverse temperature to be used for the very first sweep of the initialization phase.

gibbsbool (optional, default: False)

If True, the single node moves use (slower) Gibbs sampling, rather than Metropolis-Hastings.

B_minint (optional, default: 1)

Minimum number of groups to be considered in the search.

b_minVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_min.

B_maxint (optional, default: 1)

Maximum number of groups to be considered in the search.

b_maxVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_max.

Mint (optional, default: None)

Maximum number of groups to select for the multilevel move. If None is provided, then all groups are always elected.

cache_statesbool (optional, default: True)

If True, intermediary states will be cached during the bisection search.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\ln^2 N)$$ complexity, where $$E$$ is the number of edges and $$N$$ is the number of nodes (independently of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

class graph_tool.inference.PartitionCentroidState(bs, b=None, RMI=False)[source]

Obtain the center of a set of partitions, according to the variation of information metric or reduced mutual information.

Parameters
bsiterable of iterable of int

List of partitions.

blist or numpy.ndarray (optional, default: None)

Initial partition. If not supplied, a partition into a single group will be used.

RMIbool (optional, default: False)

If True, the reduced mutual information will be used, otherwise the variation of information metric will be used instead.

copy(bs=None, b=None, RMI=None)[source]

Copies the state. The parameters override the state properties, and have the same meaning as in the constructor.

get_B()[source]

Returns the total number of blocks.

get_Be()[source]

Returns the effective number of blocks, defined as $$e^{H}$$, with $$H=-\sum_r\frac{n_r}{N}\ln \frac{n_r}{N}$$, where $$n_r$$ is the number of nodes in group r.

entropy()[source]
mcmc_sweep(beta=1.0, c=0.5, d=0.01, niter=1, entropy_args={}, allow_vacate=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given move.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_vacatebool (optional, default: True)

Allow groups to be vacated.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

multiflip_mcmc_sweep(beta=1.0, c=0.5, psingle=None, psplit=1, pmerge=1, pmergesplit=1, pmovelabel=0, d=0.01, gibbs_sweeps=10, niter=1, entropy_args={}, accept_stats=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multiple simultaneous moves (i.e. merges and splits) to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

psplitfloat (optional, default: 1)

Relative probability of proposing a group split.

pmergesplitfloat (optional, default: 1)

Relative probability of proposing a marge-split move.

pmovelabelfloat (optional, default: 0)

Relative probability of proposing a group label move.

dfloat (optional, default: 1)

Probability of selecting a new (i.e. empty) group for a given single-node move.

gibbs_sweepsint (optional, default: 10)

Number of sweeps of Gibbs sampling to be performed (i.e. each node is attempted once per sweep) to refine a split proposal.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-merge-split-2020

Tiago P. Peixoto, “Merge-split Markov chain Monte Carlo for community detection”, Phys. Rev. E 102, 012305 (2020), DOI: 10.1103/PhysRevE.102.012305 [sci-hub, @tor], arXiv: 2003.07070

multilevel_mcmc_sweep(niter=1, beta=1.0, c=0.5, psingle=None, pmultilevel=1, d=0.01, r=0.9, random_bisect=True, merge_sweeps=10, mh_sweeps=10, init_r=0.99, init_beta=1.0, gibbs=False, B_min=1, B_max=18446744073709551615, b_min=None, b_max=None, M=None, cache_states=True, entropy_args={}, verbose=False, **kwargs)

Perform niter sweeps of a multilevel agglomerative acceptance-rejection MCMC to sample network partitions, that uses a bisection search on the number of groups, together with group merges and singe-node moves.

Parameters
niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

pmultilevelfloat (optional, default: 1)

Relative probability of proposing a multilevel move.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given single-node move.

rfloat (optional, default: 0.9)

Group shrink ratio. The number of groups is reduced by this fraction at each merge sweep.

random_bisectbool (optional, default: True)

If True, bisections are done at randomly chosen intervals. Otherwise a Fibonacci sequence is used.

merge_sweepsint (optional, default: 10)

Number of sweeps spent to find good merge proposals.

mh_sweepsint (optional, default: 10)

Number of single-node Metropolis-Hastings sweeps between merge splits.

init_rdouble (optional, default: 0.99)

Stopping criterion for the intialization phase, after each node is put in their own group, to set the initial upper bound of the bisection search. A number of single-node Metropolis-Hastings sweeps is done until the number of groups is shrunk by a factor that is larger than this parameter.

init_betafloat (optional, default: 1.)

Inverse temperature to be used for the very first sweep of the initialization phase.

gibbsbool (optional, default: False)

If True, the single node moves use (slower) Gibbs sampling, rather than Metropolis-Hastings.

B_minint (optional, default: 1)

Minimum number of groups to be considered in the search.

b_minVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_min.

B_maxint (optional, default: 1)

Maximum number of groups to be considered in the search.

b_maxVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_max.

Mint (optional, default: None)

Maximum number of groups to select for the multilevel move. If None is provided, then all groups are always elected.

cache_statesbool (optional, default: True)

If True, intermediary states will be cached during the bisection search.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\ln^2 N)$$ complexity, where $$E$$ is the number of edges and $$N$$ is the number of nodes (independently of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

class graph_tool.inference.PartitionModeState(bs, relabel=True, nested=False, converge=False, **kwargs)[source]

Bases: object

The random label model state for a set of labelled partitions, which attempts to align them with a common group labelling.

Parameters
bslist of iterables

List of partitions to be aligned. If nested=True, these should be hierarchical partitions, composed each as a list of partitions.

relabelbool (optional, default: True)

If True, an initial alignment of the partitions will be attempted during instantiation, otherwise they will be incorporated as they are.

nestedbool (optional, default: False)

If True, the partitions will be assumed to be hierarchical.

convergebool (optional, default: False)

If True, the label alignment will be iterated until convergence upon initialization (otherwise replace_partitions() needs to be called repeatedly).

References

peixoto-revealing-2021

Tiago P. Peixoto, “Revealing consensus and dissensus between network partitions”, Phys. Rev. X 11 021003 (2021) DOI: 10.1103/PhysRevX.11.021003 [sci-hub, @tor], arXiv: 2005.13977

copy(bs=None)[source]

Copies the state. The parameters override the state properties, and have the same meaning as in the constructor.

Adds partition b to the ensemble, after relabelling it if relabel=True, and returns its index in the population.

remove_partition(i)[source]

Removes partition with index i from the ensemble.

Computes the entropy difference (negative log probability) if partition b were inserted the ensemble, after relabelling it if relabel=True.

virtual_remove_partition(b, relabel=True)[source]

Computes the entropy difference (negative log probability) if partition b were removed from the ensemble, after relabelling it if relabel=True.

replace_partitions()[source]

Removes and re-adds every partition, after relabelling, and returns the entropy difference (negative log probability).

relabel_partition(b)[source]

Returns a relabelled copy of partition b, according to its alignment with the ensemble.

align_mode(mode)[source]

Relabel entire ensemble to align with another ensemble given by mode, which should be an instance of PartitionModeState.

get_partition(i)[source]

Returns partition with index i.

get_nested_partition(i)[source]

Returns nested partition with index i.

get_partitions()[source]

Returns all partitions.

get_nested_partitions()[source]

Returns all nested partitions.

relabel()[source]

Re-order group labels according to group sizes.

entropy()[source]

Return the model entropy (negative log-likelihood).

posterior_entropy(MLE=True)[source]

Return the entropy of the random label model, using maximum likelihood estimates for the marginal node probabilities if MLE=True, otherwise using posterior mean estimates.

posterior_cdev(MLE=True)[source]

Return the uncertainty of the mode in the range $$[0,1]$$, using maximum likelihood estimates for the marginal node probabilities if MLE=True, otherwise using posterior mean estimates.

posterior_lprob(b, MLE=True)[source]

Return the log-probability of partition b, using maximum likelihood estimates for the marginal node probabilities if MLE=True, otherwise using posterior mean estimates.

get_coupled_state()[source]

Return the instance of PartitionModeState representing the model at the upper hierarchical level.

get_marginal(g)[source]

Return a VertexPropertyMap for Graph g, with vector<int> values containing the marginal group membership counts for each node.

get_max(g)[source]

Return a VertexPropertyMap for Graph g, with int values containing the maximum marginal group membership for each node.

get_max_nested()[source]

Return a hierarchical partition as a list of numpy.ndarray objects, containing the maximum marginal group membership for each node in every level.

get_B()[source]

Return the total number of labels used.

get_M()[source]

Return the number of partitions

sample_partition(MLE=True)[source]

Sampled a partition from the inferred model, using maximum likelihood estimates for the marginal node probabilities if MLE=True, otherwise using posterior mean estimates.

sample_nested_partition(MLE=True, fix_empty=True)[source]

Sampled a nested partition from the inferred model, using maximum likelihood estimates for the marginal node probabilities if MLE=True, otherwise using posterior mean estimates.

class graph_tool.inference.ModeClusterState(bs, b=None, B=1, nested=False, relabel=True)[source]

The mixed random label model state for a set of labelled partitions, which attempts to align them inside clusters with a common group labelling.

Parameters
bslist of iterables

List of partitions to be aligned. If nested=True, these should be hierarchical partitions, composed each as a list of partitions.

biterable (optional, default: None)

Initial cluster membership for every partition. If None a random division into B groups will be used.

Bint (optional, default: 1)

Number of groups for initial division.

relabelbool (optional, default: True)

If True, an initial alignment of the partitions will be attempted during instantiation, otherwise they will be incorporated as they are.

nestedbool (optional, default: False)

If True, the partitions will be assumed to be hierarchical.

References

peixoto-revealing-2021

Tiago P. Peixoto, “Revealing consensus and dissensus between network partitions”, Phys. Rev. X 11 021003 (2021) DOI: 10.1103/PhysRevX.11.021003 [sci-hub, @tor], arXiv: 2005.13977

copy(bs=None, b=None)[source]

Copies the state. The parameters override the state properties, and have the same meaning as in the constructor.

get_mode(r)[source]

Return the mode in cluster r as an instance of PartitionModeState.

get_modes(sort=True)[source]

Return the list of nonempty modes, as instances of PartitionModeState. If sorted == True, the modes are retured in decreasing order with respect to their size.

get_wr()[source]

Return cluster sizes.

get_B()[source]

Returns the total number of clusters.

get_Be()[source]

Returns the effective number of clusters, defined as $$e^{H}$$, with $$H=-\sum_r\frac{n_r}{N}\ln \frac{n_r}{N}$$, where $$n_r$$ is the number of partitions in cluster $$r$$.

Computes the entropy difference (negative log probability) if partition b were inserted in cluster r, after relabelling it if relabel=True.

Add partition b in cluster r, after relabelling it if relabel=True.

classify_partition(b, relabel=True, new_group=True, sample=False)[source]

Returns the cluster r to which partition b would belong, after relabelling it if relabel=True, according to the most probable assignment, or randomly sampled according to the relative probabilities if sample==True. If new_group==True, a new previously unoccupied group is also considered for the classification.

relabel(epsilon=1e-06, maxiter=100)[source]

Attempt to align group labels between clusters via a greedy algorithm. The algorithm stops after maxiter iterations or when the entropy improvement lies below epsilon.

entropy()[source]

Return the model entropy (negative log-likelihood).

posterior_entropy(MLE=True)[source]

Return the entropy of the random label model, using maximum likelihood estimates for the marginal node probabilities if MLE=True, otherwise using posterior mean estimates.

posterior_lprob(r, b, MLE=True)[source]

Return the log-probability of partition b belonging to mode r, using maximum likelihood estimates for the marginal node probabilities if MLE=True, otherwise using posterior mean estimates.

replace_partitions()[source]

For every cluster, removes and re-adds every partition, after relabelling, and returns the entropy difference (negative log probability).

sample_partition(MLE=True)[source]

Sampled a cluster label and partition from the inferred model, using maximum likelihood estimates for the marginal node probabilities if MLE=True, otherwise using posterior mean estimates.

sample_nested_partition(MLE=True, fix_empty=True)[source]

Sampled a cluster label and nested partition from the inferred model, using maximum likelihood estimates for the marginal node probabilities if MLE=True, otherwise using posterior mean estimates.

mcmc_sweep(beta=1.0, c=0.5, d=0.01, niter=1, entropy_args={}, allow_vacate=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given move.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_vacatebool (optional, default: True)

Allow groups to be vacated.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

multiflip_mcmc_sweep(beta=1.0, c=0.5, psingle=None, psplit=1, pmerge=1, pmergesplit=1, pmovelabel=0, d=0.01, gibbs_sweeps=10, niter=1, entropy_args={}, accept_stats=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multiple simultaneous moves (i.e. merges and splits) to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

psplitfloat (optional, default: 1)

Relative probability of proposing a group split.

pmergesplitfloat (optional, default: 1)

Relative probability of proposing a marge-split move.

pmovelabelfloat (optional, default: 0)

Relative probability of proposing a group label move.

dfloat (optional, default: 1)

Probability of selecting a new (i.e. empty) group for a given single-node move.

gibbs_sweepsint (optional, default: 10)

Number of sweeps of Gibbs sampling to be performed (i.e. each node is attempted once per sweep) to refine a split proposal.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-merge-split-2020

Tiago P. Peixoto, “Merge-split Markov chain Monte Carlo for community detection”, Phys. Rev. E 102, 012305 (2020), DOI: 10.1103/PhysRevE.102.012305 [sci-hub, @tor], arXiv: 2003.07070

multilevel_mcmc_sweep(niter=1, beta=1.0, c=0.5, psingle=None, pmultilevel=1, d=0.01, r=0.9, random_bisect=True, merge_sweeps=10, mh_sweeps=10, init_r=0.99, init_beta=1.0, gibbs=False, B_min=1, B_max=18446744073709551615, b_min=None, b_max=None, M=None, cache_states=True, entropy_args={}, verbose=False, **kwargs)

Perform niter sweeps of a multilevel agglomerative acceptance-rejection MCMC to sample network partitions, that uses a bisection search on the number of groups, together with group merges and singe-node moves.

Parameters
niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

pmultilevelfloat (optional, default: 1)

Relative probability of proposing a multilevel move.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given single-node move.

rfloat (optional, default: 0.9)

Group shrink ratio. The number of groups is reduced by this fraction at each merge sweep.

random_bisectbool (optional, default: True)

If True, bisections are done at randomly chosen intervals. Otherwise a Fibonacci sequence is used.

merge_sweepsint (optional, default: 10)

Number of sweeps spent to find good merge proposals.

mh_sweepsint (optional, default: 10)

Number of single-node Metropolis-Hastings sweeps between merge splits.

init_rdouble (optional, default: 0.99)

Stopping criterion for the intialization phase, after each node is put in their own group, to set the initial upper bound of the bisection search. A number of single-node Metropolis-Hastings sweeps is done until the number of groups is shrunk by a factor that is larger than this parameter.

init_betafloat (optional, default: 1.)

Inverse temperature to be used for the very first sweep of the initialization phase.

gibbsbool (optional, default: False)

If True, the single node moves use (slower) Gibbs sampling, rather than Metropolis-Hastings.

B_minint (optional, default: 1)

Minimum number of groups to be considered in the search.

b_minVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_min.

B_maxint (optional, default: 1)

Maximum number of groups to be considered in the search.

b_maxVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_max.

Mint (optional, default: None)

Maximum number of groups to select for the multilevel move. If None is provided, then all groups are always elected.

cache_statesbool (optional, default: True)

If True, intermediary states will be cached during the bisection search.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\ln^2 N)$$ complexity, where $$E$$ is the number of edges and $$N$$ is the number of nodes (independently of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

class graph_tool.inference.ModularityState(g, b=None)[source]

Obtain the partition of a network according to Newman’s modularity.

Warning

Do not use this approach in the analysis of networks without understanding the consequences. This algorithm is included only for comparison purposes. In general, the inference-based approaches based on BlockState, NestedBlockState, and PPBlockState should be universally preferred.

Parameters
gGraph

Graph to be partitioned.

bPropertyMap (optional, default: None)

Initial partition. If not supplied, a partition into a single group will be used.

copy(g=None, b=None)[source]

Copies the state. The parameters override the state properties, and have the same meaning as in the constructor.

get_B()[source]

Returns the total number of blocks.

get_Be()[source]

Returns the effective number of blocks, defined as $$e^{H}$$, with $$H=-\sum_r\frac{n_r}{N}\ln \frac{n_r}{N}$$, where $$n_r$$ is the number of nodes in group r.

entropy(gamma=1.0, **kwargs)[source]

Return the unnormalized negative generalized modularity.

Notes

The unnormalized negative generalized modularity is defined as

$-\sum_{ij}\left(A_{ij}-\gamma \frac{k_ik_j}{2E}\right)$

Where $$A_{ij}$$ is the adjacency matrix, $$k_i$$ is the degree of node $$i$$, and $$E$$ is the total number of edges.

modularity(gamma=1)[source]

Return the generalized modularity.

Notes

The generalized modularity is defined as

$\frac{1}{2E}\sum_{ij}\left(A_{ij}-\gamma \frac{k_ik_j}{2E}\right)$

Where $$A_{ij}$$ is the adjacency matrix, $$k_i$$ is the degree of node $$i$$, and $$E$$ is the total number of edges.

draw(**kwargs)

Convenience wrapper to graph_draw() that draws the state of the graph as colors on the vertices and edges.

gibbs_sweep(beta=1.0, niter=1, entropy_args={}, allow_new_group=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a rejection-free Gibbs MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_new_groupbool (optional, default: True)

Allow the number of groups to increase and decrease.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\times B)$$ complexity, where $$B$$ is the number of groups, and $$E$$ is the number of edges.

mcmc_sweep(beta=1.0, c=0.5, d=0.01, niter=1, entropy_args={}, allow_vacate=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given move.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_vacatebool (optional, default: True)

Allow groups to be vacated.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

multiflip_mcmc_sweep(beta=1.0, c=0.5, psingle=None, psplit=1, pmerge=1, pmergesplit=1, pmovelabel=0, d=0.01, gibbs_sweeps=10, niter=1, entropy_args={}, accept_stats=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multiple simultaneous moves (i.e. merges and splits) to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

psplitfloat (optional, default: 1)

Relative probability of proposing a group split.

pmergesplitfloat (optional, default: 1)

Relative probability of proposing a marge-split move.

pmovelabelfloat (optional, default: 0)

Relative probability of proposing a group label move.

dfloat (optional, default: 1)

Probability of selecting a new (i.e. empty) group for a given single-node move.

gibbs_sweepsint (optional, default: 10)

Number of sweeps of Gibbs sampling to be performed (i.e. each node is attempted once per sweep) to refine a split proposal.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-merge-split-2020

Tiago P. Peixoto, “Merge-split Markov chain Monte Carlo for community detection”, Phys. Rev. E 102, 012305 (2020), DOI: 10.1103/PhysRevE.102.012305 [sci-hub, @tor], arXiv: 2003.07070

multilevel_mcmc_sweep(niter=1, beta=1.0, c=0.5, psingle=None, pmultilevel=1, d=0.01, r=0.9, random_bisect=True, merge_sweeps=10, mh_sweeps=10, init_r=0.99, init_beta=1.0, gibbs=False, B_min=1, B_max=18446744073709551615, b_min=None, b_max=None, M=None, cache_states=True, entropy_args={}, verbose=False, **kwargs)

Perform niter sweeps of a multilevel agglomerative acceptance-rejection MCMC to sample network partitions, that uses a bisection search on the number of groups, together with group merges and singe-node moves.

Parameters
niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

pmultilevelfloat (optional, default: 1)

Relative probability of proposing a multilevel move.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given single-node move.

rfloat (optional, default: 0.9)

Group shrink ratio. The number of groups is reduced by this fraction at each merge sweep.

random_bisectbool (optional, default: True)

If True, bisections are done at randomly chosen intervals. Otherwise a Fibonacci sequence is used.

merge_sweepsint (optional, default: 10)

Number of sweeps spent to find good merge proposals.

mh_sweepsint (optional, default: 10)

Number of single-node Metropolis-Hastings sweeps between merge splits.

init_rdouble (optional, default: 0.99)

Stopping criterion for the intialization phase, after each node is put in their own group, to set the initial upper bound of the bisection search. A number of single-node Metropolis-Hastings sweeps is done until the number of groups is shrunk by a factor that is larger than this parameter.

init_betafloat (optional, default: 1.)

Inverse temperature to be used for the very first sweep of the initialization phase.

gibbsbool (optional, default: False)

If True, the single node moves use (slower) Gibbs sampling, rather than Metropolis-Hastings.

B_minint (optional, default: 1)

Minimum number of groups to be considered in the search.

b_minVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_min.

B_maxint (optional, default: 1)

Maximum number of groups to be considered in the search.

b_maxVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_max.

Mint (optional, default: None)

Maximum number of groups to select for the multilevel move. If None is provided, then all groups are always elected.

cache_statesbool (optional, default: True)

If True, intermediary states will be cached during the bisection search.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\ln^2 N)$$ complexity, where $$E$$ is the number of edges and $$N$$ is the number of nodes (independently of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

class graph_tool.inference.NormCutState(g, b=None)[source]

Obtain the partition of a network according to the normalized cut.

Warning

Do not use this approach in the analysis of networks without understanding the consequences. This algorithm is included only for comparison purposes. In general, the inference-based approaches based on BlockState, NestedBlockState, and PPBlockState should be universally preferred.

Parameters
gGraph

Graph to be partitioned.

bPropertyMap (optional, default: None)

Initial partition. If not supplied, a partition into a single group will be used.

copy(g=None, b=None)[source]

Copies the state. The parameters override the state properties, and have the same meaning as in the constructor.

get_B()[source]

Returns the total number of blocks.

get_Be()[source]

Returns the effective number of blocks, defined as $$e^{H}$$, with $$H=-\sum_r\frac{n_r}{N}\ln \frac{n_r}{N}$$, where $$n_r$$ is the number of nodes in group r.

entropy(**kwargs)[source]

Return the normalized cut. See norm_cut().

norm_cut()[source]

Return the normalized cut.

Notes

The normalized cut is defined as

$B - \sum_r \frac{e_{rr}}{e_r}$

Where $$B$$ is the number of groups, $$e_{rr}$$ is twice the number of edges between nodes of group $$r$$, and $$e_r$$ is the sum of degrees of nodes in group $$r$$.

draw(**kwargs)

Convenience wrapper to graph_draw() that draws the state of the graph as colors on the vertices and edges.

gibbs_sweep(beta=1.0, niter=1, entropy_args={}, allow_new_group=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a rejection-free Gibbs MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_new_groupbool (optional, default: True)

Allow the number of groups to increase and decrease.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\times B)$$ complexity, where $$B$$ is the number of groups, and $$E$$ is the number of edges.

mcmc_sweep(beta=1.0, c=0.5, d=0.01, niter=1, entropy_args={}, allow_vacate=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given move.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

allow_vacatebool (optional, default: True)

Allow groups to be vacated.

sequentialbool (optional, default: True)

If sequential == True each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.

deterministicbool (optional, default: False)

If sequential == True and deterministic == True the vertices will be visited in deterministic order.

verticeslist of ints (optional, default: None)

If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

multiflip_mcmc_sweep(beta=1.0, c=0.5, psingle=None, psplit=1, pmerge=1, pmergesplit=1, pmovelabel=0, d=0.01, gibbs_sweeps=10, niter=1, entropy_args={}, accept_stats=None, verbose=False, **kwargs)

Perform niter sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multiple simultaneous moves (i.e. merges and splits) to sample network partitions.

Parameters
betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

psplitfloat (optional, default: 1)

Relative probability of proposing a group split.

pmergesplitfloat (optional, default: 1)

Relative probability of proposing a marge-split move.

pmovelabelfloat (optional, default: 0)

Relative probability of proposing a group label move.

dfloat (optional, default: 1)

Probability of selecting a new (i.e. empty) group for a given single-node move.

gibbs_sweepsint (optional, default: 10)

Number of sweeps of Gibbs sampling to be performed (i.e. each node is attempted once per sweep) to refine a split proposal.

niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E)$$ complexity, where $$E$$ is the number of edges (independent of the number of groups).

References

peixoto-merge-split-2020

Tiago P. Peixoto, “Merge-split Markov chain Monte Carlo for community detection”, Phys. Rev. E 102, 012305 (2020), DOI: 10.1103/PhysRevE.102.012305 [sci-hub, @tor], arXiv: 2003.07070

multilevel_mcmc_sweep(niter=1, beta=1.0, c=0.5, psingle=None, pmultilevel=1, d=0.01, r=0.9, random_bisect=True, merge_sweeps=10, mh_sweeps=10, init_r=0.99, init_beta=1.0, gibbs=False, B_min=1, B_max=18446744073709551615, b_min=None, b_max=None, M=None, cache_states=True, entropy_args={}, verbose=False, **kwargs)

Perform niter sweeps of a multilevel agglomerative acceptance-rejection MCMC to sample network partitions, that uses a bisection search on the number of groups, together with group merges and singe-node moves.

Parameters
niterint (optional, default: 1)

Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.

betafloat (optional, default: 1.)

Inverse temperature.

cfloat (optional, default: .5)

Sampling parameter c for move proposals: For $$c\to 0$$ the blocks are sampled according to the local neighborhood of a given node and their block connections; for $$c\to\infty$$ the blocks are sampled randomly. Note that only for $$c > 0$$ the MCMC is guaranteed to be ergodic.

psinglefloat (optional, default: None)

Relative probability of proposing a single node move. If None, it will be selected as the number of nodes in the graph.

pmultilevelfloat (optional, default: 1)

Relative probability of proposing a multilevel move.

dfloat (optional, default: .01)

Probability of selecting a new (i.e. empty) group for a given single-node move.

rfloat (optional, default: 0.9)

Group shrink ratio. The number of groups is reduced by this fraction at each merge sweep.

random_bisectbool (optional, default: True)

If True, bisections are done at randomly chosen intervals. Otherwise a Fibonacci sequence is used.

merge_sweepsint (optional, default: 10)

Number of sweeps spent to find good merge proposals.

mh_sweepsint (optional, default: 10)

Number of single-node Metropolis-Hastings sweeps between merge splits.

init_rdouble (optional, default: 0.99)

Stopping criterion for the intialization phase, after each node is put in their own group, to set the initial upper bound of the bisection search. A number of single-node Metropolis-Hastings sweeps is done until the number of groups is shrunk by a factor that is larger than this parameter.

init_betafloat (optional, default: 1.)

Inverse temperature to be used for the very first sweep of the initialization phase.

gibbsbool (optional, default: False)

If True, the single node moves use (slower) Gibbs sampling, rather than Metropolis-Hastings.

B_minint (optional, default: 1)

Minimum number of groups to be considered in the search.

b_minVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_min.

B_maxint (optional, default: 1)

Maximum number of groups to be considered in the search.

b_maxVertexPropertyMap (optional, default: None)

If provided, this will be used for the partition corresponding to B_max.

Mint (optional, default: None)

Maximum number of groups to select for the multilevel move. If None is provided, then all groups are always elected.

cache_statesbool (optional, default: True)

If True, intermediary states will be cached during the bisection search.

entropy_argsdict (optional, default: {})

Entropy arguments, with the same meaning and defaults as in graph_tool.inference.BlockState.entropy().

verbosebool (optional, default: False)

If verbose == True, detailed information will be displayed.

Returns
dSfloat

Entropy difference after the sweeps.

nattemptsint

Number of vertex moves attempted.

nmovesint

Number of vertices moved.

Notes

This algorithm has an $$O(E\ln^2 N)$$ complexity, where $$E$$ is the number of edges and $$N$$ is the number of nodes (independently of the number of groups).

References

peixoto-efficient-2014

Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378

class graph_tool.inference.LatentMultigraphBlockState(g, aE=nan, nested=True, state_args={}, bstate=None, self_loops=False, **kwargs)[source]

Inference state of an erased Poisson multigraph, using the stochastic block model as a prior.

Parameters
gGraph

Measured graph.

aEfloat (optional, default: NaN)

Expected total number of edges used in prior. If NaN, a flat prior will be used instead.

nestedboolean (optional, default: True)

If True, a NestedBlockState will be used, otherwise BlockState.

state_argsdict (optional, default: {})

Arguments to be passed to NestedBlockState or BlockState.

bstateNestedBlockState or BlockState (optional, default: None)

If passed, this will be used to initialize the block state directly.

self_loopsbool (optional, default: False)

If True, it is assumed that the uncertain graph can contain self-loops.

References

peixoto-latent-2020

Tiago P. Peixoto, “Latent Poisson models for networks with heterogeneous density”, DOI: 10.1103/PhysRevE.102.012309 [sci-hub, @tor], arXiv: 2002.07803

entropy(density=True, **kwargs)[source]

Return the entropy, i.e. negative log-likelihood.

copy(**kwargs)[source]

Return a copy of the state.

collect_marginal(g=None)

Collect marginal inferred network during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal graph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "eprob", containing the marginal probabilities for each edge.

Notes

The posterior marginal probability of an edge $$(i,j)$$ is defined as

$\pi_{ij} = \sum_{\boldsymbol A}A_{ij}P(\boldsymbol A|\boldsymbol D)$

where $$P(\boldsymbol A|\boldsymbol D)$$ is the posterior probability given the data.

collect_marginal_multigraph(g=None)

Collect marginal latent multigraph during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal multigraph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "w" and "wcount", containing the edge multiplicities and their respective counts.

Notes

The mean posterior marginal multiplicity distribution of a multi-edge $$(i,j)$$ is defined as

$\pi_{ij}(w) = \sum_{\boldsymbol G}\delta_{w,G_{ij}}P(\boldsymbol G|\boldsymbol D)$

where $$P(\boldsymbol G|\boldsymbol D)$$ is the posterior probability of a multigraph $$\boldsymbol G$$ given the data.

get_block_state()

Return the underlying block state, which can be either BlockState or NestedBlockState.

get_edge_prob(u, v, entropy_args={}, epsilon=1e-08)

Return conditional posterior log-probability of edge $$(u,v)$$.

get_edges_prob(elist, entropy_args={}, epsilon=1e-08)

Return conditional posterior log-probability of an edge list, with shape $$(E,2)$$.

get_graph()

Return the current inferred graph.

mcmc_sweep(r=0.5, multiflip=True, **kwargs)

Perform sweeps of a Metropolis-Hastings acceptance-rejection sampling MCMC to sample network partitions and latent edges. The parameter r controls the probability with which edge move will be attempted, instead of partition moves. The remaining keyword parameters will be passed to mcmc_sweep() or multiflip_mcmc_sweep(), if multiflip=True.

multiflip_mcmc_sweep(**kwargs)

Alias for mcmc_sweep() with multiflip=True.

set_state(g, w)
virtual_remove_edge(u, v, entropy_args={})
class graph_tool.inference.UncertainBlockState(g, q, q_default=0.0, aE=nan, nested=True, state_args={}, bstate=None, self_loops=False, **kwargs)[source]

Inference state of an uncertain graph, using the stochastic block model as a prior.

Parameters
gGraph

Measured graph.

qEdgePropertyMap

Edge probabilities in range $$[0,1]$$.

q_defaultfloat (optional, default: 0.)

Non-edge probability in range $$[0,1]$$.

aEfloat (optional, default: NaN)

Expected total number of edges used in prior. If NaN, a flat prior will be used instead.

nestedboolean (optional, default: True)

If True, a NestedBlockState will be used, otherwise BlockState.

state_argsdict (optional, default: {})

Arguments to be passed to NestedBlockState or BlockState.

bstateNestedBlockState or BlockState (optional, default: None)

If passed, this will be used to initialize the block state directly.

self_loopsbool (optional, default: False)

If True, it is assumed that the uncertain graph can contain self-loops.

References

peixoto-reconstructing-2018

Tiago P. Peixoto, “Reconstructing networks with unknown and heterogeneous errors”, Phys. Rev. X 8 041011 (2018). DOI: 10.1103/PhysRevX.8.041011 [sci-hub, @tor], arXiv: 1806.07956

copy(**kwargs)[source]

Return a copy of the state.

collect_marginal(g=None)

Collect marginal inferred network during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal graph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "eprob", containing the marginal probabilities for each edge.

Notes

The posterior marginal probability of an edge $$(i,j)$$ is defined as

$\pi_{ij} = \sum_{\boldsymbol A}A_{ij}P(\boldsymbol A|\boldsymbol D)$

where $$P(\boldsymbol A|\boldsymbol D)$$ is the posterior probability given the data.

collect_marginal_multigraph(g=None)

Collect marginal latent multigraph during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal multigraph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "w" and "wcount", containing the edge multiplicities and their respective counts.

Notes

The mean posterior marginal multiplicity distribution of a multi-edge $$(i,j)$$ is defined as

$\pi_{ij}(w) = \sum_{\boldsymbol G}\delta_{w,G_{ij}}P(\boldsymbol G|\boldsymbol D)$

where $$P(\boldsymbol G|\boldsymbol D)$$ is the posterior probability of a multigraph $$\boldsymbol G$$ given the data.

entropy(latent_edges=True, density=True, **kwargs)

Return the entropy, i.e. negative log-likelihood.

get_block_state()

Return the underlying block state, which can be either BlockState or NestedBlockState.

get_edge_prob(u, v, entropy_args={}, epsilon=1e-08)

Return conditional posterior log-probability of edge $$(u,v)$$.

get_edges_prob(elist, entropy_args={}, epsilon=1e-08)

Return conditional posterior log-probability of an edge list, with shape $$(E,2)$$.

get_graph()

Return the current inferred graph.

mcmc_sweep(r=0.5, multiflip=True, **kwargs)

Perform sweeps of a Metropolis-Hastings acceptance-rejection sampling MCMC to sample network partitions and latent edges. The parameter r controls the probability with which edge move will be attempted, instead of partition moves. The remaining keyword parameters will be passed to mcmc_sweep() or multiflip_mcmc_sweep(), if multiflip=True.

multiflip_mcmc_sweep(**kwargs)

Alias for mcmc_sweep() with multiflip=True.

set_state(g, w)
virtual_remove_edge(u, v, entropy_args={})
class graph_tool.inference.MeasuredBlockState(g, n, x, n_default=1, x_default=0, fn_params={'alpha': 1, 'beta': 1}, fp_params={'mu': 1, 'nu': 1}, aE=nan, nested=True, state_args={}, bstate=None, self_loops=False, **kwargs)[source]

Inference state of a measured graph, using the stochastic block model as a prior.

Parameters
gGraph

Measured graph.

nEdgePropertyMap

Edge property map of type int, containing the total number of measurements for each edge.

xEdgePropertyMap

Edge property map of type int, containing the number of positive measurements for each edge.

n_defaultint (optional, default: 1)

Total number of measurements for each non-edge.

x_defaultint (optional, default: 0)

Total number of positive measurements for each non-edge.

fn_paramsdict (optional, default: dict(alpha=1, beta=1))

Beta distribution hyperparameters for the probability of missing edges (false negatives).

fp_paramsdict (optional, default: dict(mu=1, nu=1))

Beta distribution hyperparameters for the probability of spurious edges (false positives).

aEfloat (optional, default: NaN)

Expected total number of edges used in prior. If NaN, a flat prior will be used instead.

nestedboolean (optional, default: True)

If True, a NestedBlockState will be used, otherwise BlockState.

state_argsdict (optional, default: {})

Arguments to be passed to NestedBlockState or BlockState.

bstateNestedBlockState or BlockState (optional, default: None)

If passed, this will be used to initialize the block state directly.

self_loopsbool (optional, default: False)

If True, it is assumed that the uncertain graph can contain self-loops.

References

peixoto-reconstructing-2018

Tiago P. Peixoto, “Reconstructing networks with unknown and heterogeneous errors”, Phys. Rev. X 8 041011 (2018). DOI: 10.1103/PhysRevX.8.041011 [sci-hub, @tor], arXiv: 1806.07956

copy(**kwargs)[source]

Return a copy of the state.

set_hparams(alpha, beta, mu, nu)[source]

Set edge and non-edge hyperparameters.

get_p_posterior()[source]

Get beta distribution parameters for the posterior probability of missing edges.

get_q_posterior()[source]

Get beta distribution parameters for the posterior probability of spurious edges.

collect_marginal(g=None)

Collect marginal inferred network during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal graph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "eprob", containing the marginal probabilities for each edge.

Notes

The posterior marginal probability of an edge $$(i,j)$$ is defined as

$\pi_{ij} = \sum_{\boldsymbol A}A_{ij}P(\boldsymbol A|\boldsymbol D)$

where $$P(\boldsymbol A|\boldsymbol D)$$ is the posterior probability given the data.

collect_marginal_multigraph(g=None)

Collect marginal latent multigraph during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal multigraph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "w" and "wcount", containing the edge multiplicities and their respective counts.

Notes

The mean posterior marginal multiplicity distribution of a multi-edge $$(i,j)$$ is defined as

$\pi_{ij}(w) = \sum_{\boldsymbol G}\delta_{w,G_{ij}}P(\boldsymbol G|\boldsymbol D)$

where $$P(\boldsymbol G|\boldsymbol D)$$ is the posterior probability of a multigraph $$\boldsymbol G$$ given the data.

entropy(latent_edges=True, density=True, **kwargs)

Return the entropy, i.e. negative log-likelihood.

get_block_state()

Return the underlying block state, which can be either BlockState or NestedBlockState.

get_edge_prob(u, v, entropy_args={}, epsilon=1e-08)

Return conditional posterior log-probability of edge $$(u,v)$$.

get_edges_prob(elist, entropy_args={}, epsilon=1e-08)

Return conditional posterior log-probability of an edge list, with shape $$(E,2)$$.

get_graph()

Return the current inferred graph.

mcmc_sweep(r=0.5, multiflip=True, **kwargs)

Perform sweeps of a Metropolis-Hastings acceptance-rejection sampling MCMC to sample network partitions and latent edges. The parameter r controls the probability with which edge move will be attempted, instead of partition moves. The remaining keyword parameters will be passed to mcmc_sweep() or multiflip_mcmc_sweep(), if multiflip=True.

multiflip_mcmc_sweep(**kwargs)

Alias for mcmc_sweep() with multiflip=True.

set_state(g, w)
virtual_remove_edge(u, v, entropy_args={})
class graph_tool.inference.UncertainBaseState(g, nested=True, state_args={}, bstate=None, self_loops=False, init_empty=False)[source]

Bases: object

Base state for uncertain network inference.

get_block_state()[source]

Return the underlying block state, which can be either BlockState or NestedBlockState.

entropy(latent_edges=True, density=True, **kwargs)[source]

Return the entropy, i.e. negative log-likelihood.

virtual_remove_edge(u, v, entropy_args={})[source]
set_state(g, w)[source]
mcmc_sweep(r=0.5, multiflip=True, **kwargs)[source]

Perform sweeps of a Metropolis-Hastings acceptance-rejection sampling MCMC to sample network partitions and latent edges. The parameter r controls the probability with which edge move will be attempted, instead of partition moves. The remaining keyword parameters will be passed to mcmc_sweep() or multiflip_mcmc_sweep(), if multiflip=True.

multiflip_mcmc_sweep(**kwargs)[source]

Alias for mcmc_sweep() with multiflip=True.

get_edge_prob(u, v, entropy_args={}, epsilon=1e-08)[source]

Return conditional posterior log-probability of edge $$(u,v)$$.

get_edges_prob(elist, entropy_args={}, epsilon=1e-08)[source]

Return conditional posterior log-probability of an edge list, with shape $$(E,2)$$.

get_graph()[source]

Return the current inferred graph.

collect_marginal(g=None)[source]

Collect marginal inferred network during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal graph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "eprob", containing the marginal probabilities for each edge.

Notes

The posterior marginal probability of an edge $$(i,j)$$ is defined as

$\pi_{ij} = \sum_{\boldsymbol A}A_{ij}P(\boldsymbol A|\boldsymbol D)$

where $$P(\boldsymbol A|\boldsymbol D)$$ is the posterior probability given the data.

collect_marginal_multigraph(g=None)[source]

Collect marginal latent multigraph during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal multigraph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "w" and "wcount", containing the edge multiplicities and their respective counts.

Notes

The mean posterior marginal multiplicity distribution of a multi-edge $$(i,j)$$ is defined as

$\pi_{ij}(w) = \sum_{\boldsymbol G}\delta_{w,G_{ij}}P(\boldsymbol G|\boldsymbol D)$

where $$P(\boldsymbol G|\boldsymbol D)$$ is the posterior probability of a multigraph $$\boldsymbol G$$ given the data.

class graph_tool.inference.MixedMeasuredBlockState(g, n, x, n_default=1, x_default=0, fn_params={'alpha': 1, 'beta': 10}, fp_params={'mu': 1, 'nu': 10}, aE=nan, nested=True, state_args={}, bstate=None, self_loops=False, **kwargs)[source]

Inference state of a measured graph with heterogeneous errors, using the stochastic block model as a prior.

Parameters
gGraph

Measured graph.

nEdgePropertyMap

Edge property map of type int, containing the total number of measurements for each edge.

xEdgePropertyMap

Edge property map of type int, containing the number of positive measurements for each edge.

n_defaultint (optional, default: 1)

Total number of measurements for each non-edge.

x_defaultint (optional, default: 1)

Total number of positive measurements for each non-edge.

fn_paramsdict (optional, default: dict(alpha=1, beta=10))

Beta distribution hyperparameters for the probability of missing edges (false negatives).

fp_paramsdict (optional, default: dict(mu=1, nu=10))

Beta distribution hyperparameters for the probability of spurious edges (false positives).

aEfloat (optional, default: NaN)

Expected total number of edges used in prior. If NaN, a flat prior will be used instead.

nestedboolean (optional, default: True)

If True, a NestedBlockState will be used, otherwise BlockState.

state_argsdict (optional, default: {})

Arguments to be passed to NestedBlockState or BlockState.

bstateNestedBlockState or BlockState (optional, default: None)

If passed, this will be used to initialize the block state directly.

self_loopsbool (optional, default: False)

If True, it is assumed that the uncertain graph can contain self-loops.

References

peixoto-reconstructing-2018

Tiago P. Peixoto, “Reconstructing networks with unknown and heterogeneous errors”, Phys. Rev. X 8 041011 (2018). DOI: 10.1103/PhysRevX.8.041011 [sci-hub, @tor], arXiv: 1806.07956

sync_q()[source]
transform(na, xa)[source]
set_hparams(alpha, beta, mu, nu)[source]

Set edge and non-edge hyperparameters.

copy(**kwargs)[source]

Return a copy of the state.

mcmc_sweep(r=0.5, h=0.1, hstep=1, multiflip=True, **kwargs)[source]

Perform sweeps of a Metropolis-Hastings acceptance-rejection sampling MCMC to sample network partitions and latent edges. The parameter r controls the probability with which edge move will be attempted, instead of partition moves. The parameter h controls the relative probability with which hyperparamters moves will be attempted, and hstep is the size of the step.

The remaining keyword parameters will be passed to mcmc_sweep() or multiflip_mcmc_sweep(), if multiflip=True.

collect_marginal(g=None)

Collect marginal inferred network during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal graph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "eprob", containing the marginal probabilities for each edge.

Notes

The posterior marginal probability of an edge $$(i,j)$$ is defined as

$\pi_{ij} = \sum_{\boldsymbol A}A_{ij}P(\boldsymbol A|\boldsymbol D)$

where $$P(\boldsymbol A|\boldsymbol D)$$ is the posterior probability given the data.

collect_marginal_multigraph(g=None)

Collect marginal latent multigraph during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal multigraph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "w" and "wcount", containing the edge multiplicities and their respective counts.

Notes

The mean posterior marginal multiplicity distribution of a multi-edge $$(i,j)$$ is defined as

$\pi_{ij}(w) = \sum_{\boldsymbol G}\delta_{w,G_{ij}}P(\boldsymbol G|\boldsymbol D)$

where $$P(\boldsymbol G|\boldsymbol D)$$ is the posterior probability of a multigraph $$\boldsymbol G$$ given the data.

entropy(latent_edges=True, density=True, **kwargs)

Return the entropy, i.e. negative log-likelihood.

get_block_state()

Return the underlying block state, which can be either BlockState or NestedBlockState.

get_edge_prob(u, v, entropy_args={}, epsilon=1e-08)

Return conditional posterior log-probability of edge $$(u,v)$$.

get_edges_prob(elist, entropy_args={}, epsilon=1e-08)

Return conditional posterior log-probability of an edge list, with shape $$(E,2)$$.

get_graph()

Return the current inferred graph.

multiflip_mcmc_sweep(**kwargs)

Alias for mcmc_sweep() with multiflip=True.

set_state(g, w)
virtual_remove_edge(u, v, entropy_args={})
class graph_tool.inference.DynamicsBlockStateBase(g, s, t, x=None, aE=nan, nested=True, state_args={}, bstate=None, self_loops=False, **kwargs)[source]

Base state for network reconstruction based on dynamical data, using the stochastic block model as a prior. This class is not meant to be instantiated directly, only indirectly via one of its subclasses.

set_params(params)[source]

Sets the model parameters via the dictionary params.

copy(**kwargs)[source]

Return a copy of the state.

get_x()[source]

Return latent edge covariates.

get_edge_prob(u, v, x, entropy_args={}, epsilon=1e-08)[source]

Return conditional posterior log-probability of edge $$(u,v)$$.

collect_marginal(g=None)[source]

Collect marginal inferred network during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal graph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "eprob", containing the marginal probabilities for each edge.

Notes

The posterior marginal probability of an edge $$(i,j)$$ is defined as

$\pi_{ij} = \sum_{\boldsymbol A}A_{ij}P(\boldsymbol A|\boldsymbol D)$

where $$P(\boldsymbol A|\boldsymbol D)$$ is the posterior probability given the data.

collect_marginal_multigraph(g=None)

Collect marginal latent multigraph during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal multigraph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "w" and "wcount", containing the edge multiplicities and their respective counts.

Notes

The mean posterior marginal multiplicity distribution of a multi-edge $$(i,j)$$ is defined as

$\pi_{ij}(w) = \sum_{\boldsymbol G}\delta_{w,G_{ij}}P(\boldsymbol G|\boldsymbol D)$

where $$P(\boldsymbol G|\boldsymbol D)$$ is the posterior probability of a multigraph $$\boldsymbol G$$ given the data.

entropy(latent_edges=True, density=True, **kwargs)

Return the entropy, i.e. negative log-likelihood.

get_block_state()

Return the underlying block state, which can be either BlockState or NestedBlockState.

get_edges_prob(elist, entropy_args={}, epsilon=1e-08)

Return conditional posterior log-probability of an edge list, with shape $$(E,2)$$.

get_graph()

Return the current inferred graph.

mcmc_sweep(r=0.5, multiflip=True, **kwargs)

Perform sweeps of a Metropolis-Hastings acceptance-rejection sampling MCMC to sample network partitions and latent edges. The parameter r controls the probability with which edge move will be attempted, instead of partition moves. The remaining keyword parameters will be passed to mcmc_sweep() or multiflip_mcmc_sweep(), if multiflip=True.

multiflip_mcmc_sweep(**kwargs)

Alias for mcmc_sweep() with multiflip=True.

set_state(g, w)
virtual_remove_edge(u, v, entropy_args={})
class graph_tool.inference.EpidemicsBlockState(g, s, beta, r, r_v=None, global_beta=None, active=None, t=[], exposed=False, aE=nan, nested=True, state_args={}, bstate=None, self_loops=False, **kwargs)[source]

Inference state for network reconstruction based on epidemic dynamics, using the stochastic block model as a prior.

Parameters
gGraph

Initial graph state.

slist of VertexPropertyMap

Collection of time-series with node states over time. Each entry in this list must be a VertexPropertyMap with type vector<int> containing the states of each node in each time step. A value of 1 means infected and 0 susceptible. Other values are allowed (e.g. for recovered), but their actual value is unimportant for reconstruction.

If the parameter t below is given, each property map value for a given node should contain only the states for the same points in time given by that parameter.

betafloat or EdgePropertyMap

Initial value of the global or local transmission probability for each edge.

rfloat

Spontaneous infection probability.

r_vVertexPropertyMap (optional, default: None)

If given, this will set the initial spontaneous infection probability for each node, and trigger the use of a model where this quantity is in principle different for each node.

global_betafloat (optional, default: None)

If provided, and beta is None this will trigger the use of a model where all transmission probabilities on edges are the same, and given (initially) by this value.

tlist of VertexPropertyMap (optional, default: [])

If nonempty, this allows for a compressed representation of the time-series parameter s, corresponding only to points in time where the state of each node changes. Each entry in this list must be a VertexPropertyMap with type vector<int> containing the points in time where the state of each node change. The corresponding state of the nodes at these times are given by parameter s.

activelist of VertexPropertyMap (optional, default: None)

If given, this specifies the points in time where each node is “active”, and prepared to change its state according to the state of its neighbors. Each entry in this list must be a VertexPropertyMap with type vector<int> containing the states of each node in each time step. A value of 1 means active and 0 inactive.

exposedboolean (optional, default: False)

If True, the data is supposed to come from a SEI, SEIR, etc. model, where a susceptible node (valued 0) first transits to an exposed state (valued -1) upon transmission, before transiting to the infective state (valued 1).

aEfloat (optional, default: NaN)

Expected total number of edges used in prior. If NaN, a flat prior will be used instead.

nestedboolean (optional, default: True)

If True, a NestedBlockState will be used, otherwise BlockState.

state_argsdict (optional, default: {})

Arguments to be passed to NestedBlockState or BlockState.

bstateNestedBlockState or BlockState (optional, default: None)

If passed, this will be used to initialize the block state directly.

self_loopsbool (optional, default: False)

If True, it is assumed that the inferred graph can contain self-loops.

References

peixoto-network-2019

Tiago P. Peixoto, “Network reconstruction and community detection from dynamics”, Phys. Rev. Lett. 123 128301 (2019), DOI: 10.1103/PhysRevLett.123.128301 [sci-hub, @tor], arXiv: 1903.10833

copy(**kwargs)[source]

Return a copy of the state.

set_params(params)[source]

Sets the model parameters via the dictionary params.

get_x()[source]

Return latent edge transmission probabilities.

mcmc_sweep(r=0.5, p=0.1, pstep=0.1, h=0.1, hstep=1, xstep=0.1, multiflip=True, **kwargs)[source]

Perform sweeps of a Metropolis-Hastings acceptance-rejection sampling MCMC to sample network partitions and latent edges. The parameter r controls the probability with which edge move will be attempted, instead of partition moves. The parameter h controls the relative probability with which moves for the parameters r_v will be attempted, and hstep is the size of the step. The parameter p controls the relative probability with which moves for the parameters global_beta and r will be attempted, and pstep is the size of the step. The paramter xstep determines the size of the attempted steps for the edge transmission probabilities.

The remaining keyword parameters will be passed to mcmc_sweep() or multiflip_mcmc_sweep(), if multiflip=True.

collect_marginal(g=None)

Collect marginal inferred network during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal graph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "eprob", containing the marginal probabilities for each edge.

Notes

The posterior marginal probability of an edge $$(i,j)$$ is defined as

$\pi_{ij} = \sum_{\boldsymbol A}A_{ij}P(\boldsymbol A|\boldsymbol D)$

where $$P(\boldsymbol A|\boldsymbol D)$$ is the posterior probability given the data.

collect_marginal_multigraph(g=None)

Collect marginal latent multigraph during MCMC runs.

Parameters
gGraph (optional, default: None)

Previous marginal multigraph.

Returns
gGraph

New marginal graph, with internal edge EdgePropertyMap "w" and "wcount", containing the edge multiplicities and their respective counts.

Notes

The mean posterior marginal multiplicity distribution of a multi-edge $$(i,j)$$ is defined as

$\pi_{ij}(w) = \sum_{\boldsymbol G}\delta_{w,G_{ij}}P(\boldsymbol G|\boldsymbol D)$

where $$P(\boldsymbol G|\boldsymbol D)$$ is the posterior probability of a multigraph $$\boldsymbol G$$ given the data.

entropy(latent_edges=True, density=True, **kwargs)

Return the entropy, i.e. negative log-likelihood.

get_block_state()

Return the underlying block state, which can be either BlockState or NestedBlockState.

get_edge_prob(u, v, x, entropy_args={}, epsilon=1e-08)

Return conditional posterior log-probability of edge $$(u,v)$$.

get_edges_prob(elist, entropy_args={}, epsilon=1e-08)

Return conditional posterior log-probability of an edge list, with shape $$(E,2)$$.

get_graph()

Return the current inferred graph.

multiflip_mcmc_sweep(**kwargs)

Alias for mcmc_sweep() with multiflip=True.

set_state(g, w)
virtual_remove_edge(u, v, entropy_args={})
class graph_tool.inference.IsingBaseBlockState(g, s, beta, x=None, h=None, t=None, aE=nan, nested=True, state_args={}, bstate=None, self_loops=False, has_zero=False, **kwargs)[source]

Base state for network reconstruction based on the Ising model, using the stochastic block model as a prior. This class is not supposed to be instantiated directly. Instead one of its specialized subclasses must be used, which have the same signature: IsingGlauberBlockState, PseudoIsingBlockState, CIsingGlauberBlockState, PseudoCIsingBlockState.

Parameters
gGraph

Initial graph state.

slist of VertexPropertyMap or VertexPropertyMap

Collection of time-series with node states over time, or a single time-series. Each time-series must be a VertexPropertyMap with type vector<int> containing the Ising states (-1 or +1) of each node in each time step.

If the parameter t below is given, each property map value for a given node should contain only the states for the same points in time given by that parameter.

betafloat

Initial value of the global inverse temperature.

xEdgePropertyMap (optional, default: None)

Initial value of the local coupling for each edge. If not given, a uniform value of 1 will be used.

hVertexPropertyMap (optional, default: None)

If given, this will set the initial local fields of each node. Otherwise a value of 0 will be used.

tlist of VertexPropertyMap (optional, default: [])

If nonempty, this allows for a compressed representation of the time-series parameter s, corresponding only to points in time where the state of each node changes. Each entry in this list must be a VertexPropertyMap with type vector<int> containing the points in time where the state of each node change. The corresponding state of the nodes at these times are given by parameter s.

aEfloat (optional, default: NaN)

Expected total number of edges used in prior. If NaN, a flat prior will be used instead.

nestedboolean (optional, default: True)

If True, a NestedBlockState will be used, otherwise BlockState.

state_argsdict (optional, default: {})

Arguments to be passed to