OverlapBlockState#
- class graph_tool.inference.OverlapBlockState(g, b=None, B=None, recs=[], rec_types=[], rec_params=[], clabel=None, pclabel=None, deg_corr=True, dense_bg=False, entropy_args={}, **kwargs)[source]#
Bases:
BlockState
The overlapping stochastic block model state of a given graph.
- Parameters:
- g
Graph
Graph to be modelled.
- b
VertexPropertyMap
ornumpy.ndarray
(optional, default:None
) Initial block labels on the vertices or half-edges. If not supplied, it will be randomly sampled. If the value passed is a vertex property map, it will be assumed to be a non-overlapping partition of the vertices. If it is an edge property map, it should contain a vector for each edge, with the block labels at each end point (sorted according to their vertex index, in the case of undirected graphs, otherwise from source to target). If the value is an
numpy.ndarray
, it will be assumed to correspond directly to a partition of the list of half-edges.- B
int
(optional, default:None
) Number of blocks (or vertex groups). If not supplied it will be obtained from the parameter
b
.- recslist of
EdgePropertyMap
instances (optional, default:[]
) List of real or discrete-valued edge covariates.
- rec_typeslist of edge covariate types (optional, default:
[]
) List of types of edge covariates. The possible types are:
"real-exponential"
,"real-normal"
,"discrete-geometric"
,"discrete-poisson"
or"discrete-binomial"
.- rec_paramslist of
dict
(optional, default:[]
) Model hyperparameters for edge covariates. This should a list of
dict
instances. SeeBlockState
for more details.- clabel
VertexPropertyMap
(optional, default:None
) Constraint labels on the vertices. If supplied, vertices with different label values will not be clustered in the same group.
- deg_corr
bool
(optional, default:True
) If
True
, the degree-corrected version of the blockmodel ensemble will be assumed, otherwise the traditional variant will be used.- dense_bg
bool
(optional, default:False
) If
True
a dense matrix is used for the block graph, otherwise a sparse matrix will be used.- entropy_args: ``dict`` (optional, default: ``{}``)
Override default arguments for
entropy()
method and releated operations.
- g
Methods
add_vertex
(v, r)Add vertex
v
to blockr
.collect_edge_marginals
([p, update])Collect the edge marginal histogram, which counts the number of times the endpoints of each node have been assigned to a given block pair.
collect_partition_histogram
([h, update, unlabel])Collect a histogram of partitions.
collect_vertex_marginals
([p, b, unlabel, update])Collect the vertex marginal histogram, which counts the number of times a node was assigned to a given block.
copy
([g, b, B, deg_corr, clabel, pclabel])Copies the block state.
draw
(**kwargs)Convenience wrapper to
graph_draw()
that draws the state of the graph as colors on the vertices and edges.entropy
([adjacency, dl, partition_dl, ...])Calculate the entropy associated with the current block partition.
exhaustive_sweep
([entropy_args, callback, ...])Perform an exhaustive loop over all possible network partitions.
get_B
()Returns the total number of blocks.
get_Be
()Returns the effective number of blocks, defined as \(e^{H}\), with \(H=-\sum_r\frac{n_r}{N}\ln \frac{n_r}{N}\), where \(n_r\) is the number of nodes in group r.
get_E
()Returns the total number of edges.
get_N
()Returns the total number of nodes.
get_bclabel
([clabel])Returns a
VertexPropertyMap
corresponding to constraint labels for the block graph.get_bg
()Returns the block graph.
get_block_state
([b, vweight])Returns a
BlockState
corresponding to the block graph (i.e. the blocks of the current state become the nodes).Returns the property map which contains the block labels for each vertex.
Returns a
VertexPropertyMap
corresponding to partition constraint labels for the block graph.Returns an edge property map which contains the block labels pairs for each edge.
get_edges_prob
(missing[, spurious, entropy_args])Compute the joint log-probability of the missing and spurious edges given by
missing
andspurious
(a list of(source, target)
tuples, orEdge()
instances), together with the observed edges.Return the current default values for the parameters of the function
entropy()
, together with other operations that depend on them.get_er
()Returns the vertex property map of the block graph which contains the number \(e_r\) of half-edges incident on block \(r\).
get_ers
()Returns the edge property map of the block graph which contains the \(e_{rs}\) matrix entries.
Returns a scalar-valued vertex property map with the majority block membership of each node.
Returns the block matrix (as a sparse
csr_matrix
), which contains the number of edges between each block pair.get_move_prob
(v, s[, c, d, reverse])Compute the log-probability of a move proposal for vertex
v
to blocks
according to sampling parametersc
andd
, as obtained withgraph_tool.inference.BlockState.sample_vertex_move()
.Returns the total number of nonempty blocks.
Returns a scalar-valued vertex property map with the block mixture represented as a single number.
get_nr
()Returns the vertex property map of the block graph which contains the block sizes \(n_r\).
Returns the mixed membership of each vertex.
Get model hyperparameters for edge covariates.
Alias to
get_blocks()
.gibbs_sweep
([beta, niter, entropy_args, ...])Perform
niter
sweeps of a rejection-free Gibbs MCMC to sample network partitions.mcmc_sweep
([bundled])Perform sweeps of a Metropolis-Hastings rejection sampling MCMC to sample network partitions.
move_vertex
(v, s)Move vertex
v
to blocks
.multicanonical_sweep
(m_state[, multiflip])Perform
niter
sweeps of a non-Markovian multicanonical sampling using the Wang-Landau algorithm.multiflip_mcmc_sweep
([beta, c, psingle, ...])Perform
niter
sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multiple simultaneous moves (i.e. merges and splits) to sample network partitions.multilevel_mcmc_sweep
([niter, beta, c, d, ...])Perform
niter
sweeps of a multilevel agglomerative acceptance-rejection pseudo-MCMC (i.e. detailed balance is not preserved) to sample network partitions, that uses a bisection search on the number of groups, together with group merges and singe-node moves.Remove vertex
v
from its current group.Reset the current default values for the parameters of the function
entropy()
, together with other operations that depend on them.sample_graph
([canonical, multigraph, ...])Sample a new graph from the fitted model.
sample_vertex_move
(v[, c, d])Sample block membership proposal of vertex
v
according to real-valued sampling parametersc
andd
: For \(c\to 0\) the blocks are sampled according to the local neighborhood and their connections; for \(c\to\infty\) the blocks are sampled randomly.set_rec_params
(params)Update model hyperparameters for edge covariates.
set_state
(b)Sets the internal partition of the state.
update_entropy_args
(**kwargs)Update the default values for the parameters of the function
entropy()
from the keyword arguments, in a stateful way, together with other operations that depend on them.virtual_vertex_move
(v, s, **kwargs)Computes the entropy difference if vertex
v
is moved to blocks
.- add_vertex(v, r)#
Add vertex
v
to blockr
.This optionally accepts a list of vertices and blocks to add.
Warning
This can leave the state in an inconsistent state if a vertex is added twice to the same group.
- collect_edge_marginals(p=None, update=1)#
Collect the edge marginal histogram, which counts the number of times the endpoints of each node have been assigned to a given block pair.
This should be called multiple times, e.g. after repeated runs of the
graph_tool.inference.BlockState.mcmc_sweep()
function.- Parameters:
- p
EdgePropertyMap
(optional, default:None
) Edge property map with edge marginals to be updated. If not provided, an empty histogram will be created.
- updatefloat (optional, default:
1
) Each call increases the current count by the amount given by this parameter.
- p
- Returns:
- p
EdgePropertyMap
Edge property map with updated edge marginals.
- p
Examples
>>> np.random.seed(42) >>> gt.seed_rng(42) >>> g = gt.collection.data["polbooks"] >>> state = gt.BlockState(g, B=4, deg_corr=True) >>> pe = None >>> state.mcmc_sweep(niter=1000) # remove part of the transient (...) >>> for i in range(1000): ... ret = state.mcmc_sweep(niter=10) ... pe = state.collect_edge_marginals(pe) >>> gt.bethe_entropy(g, pe)[0] 7.106015...
- collect_partition_histogram(h=None, update=1, unlabel=True)#
Collect a histogram of partitions.
This should be called multiple times, e.g. after repeated runs of the
graph_tool.inference.BlockState.mcmc_sweep()
function.- Parameters:
- h
PartitionHist
(optional, default:None
) Partition histogram. If not provided, an empty histogram will be created.
- updatefloat (optional, default:
1
) Each call increases the current count by the amount given by this parameter.
- unlabelbool (optional, default:
True
) If
True
, a canonical labelling of the groups will be used, so that each partition is uniquely represented.
- h
- Returns:
- h
PartitionHist
(optional, default:None
) Updated Partition histogram.
- h
Examples
>>> np.random.seed(42) >>> gt.seed_rng(42) >>> g = gt.collection.data["polbooks"] >>> state = gt.BlockState(g, B=4, deg_corr=True) >>> ph = None >>> state.mcmc_sweep(niter=1000) # remove part of the transient (...) >>> for i in range(1000): ... ret = state.mcmc_sweep(niter=10) ... ph = state.collect_partition_histogram(ph) >>> gt.microstate_entropy(ph) 132.299325...
- collect_vertex_marginals(p=None, b=None, unlabel=False, update=1)#
Collect the vertex marginal histogram, which counts the number of times a node was assigned to a given block.
This should be called multiple times, e.g. after repeated runs of the
graph_tool.inference.BlockState.mcmc_sweep()
function.- Parameters:
- p
VertexPropertyMap
(optional, default:None
) Vertex property map with vector-type values, storing the previous block membership counts. If not provided, an empty histogram will be created.
- b
VertexPropertyMap
(optional, default:None
) Vertex property map with group partition. If not provided, the state’s partition will be used.
- unlabelbool (optional, default:
False
) If
True
, a canonical labelling of the groups will be used, so that each partition is uniquely represented.- updateint (optional, default:
1
) Each call increases the current count by the amount given by this parameter.
- p
- Returns:
- p
VertexPropertyMap
Vertex property map with vector-type values, storing the accumulated block membership counts.
- p
Examples
>>> np.random.seed(42) >>> gt.seed_rng(42) >>> g = gt.collection.data["polbooks"] >>> state = gt.BlockState(g, B=4, deg_corr=True) >>> pv = None >>> state.mcmc_sweep(niter=1000) # remove part of the transient (...) >>> for i in range(1000): ... ret = state.mcmc_sweep(niter=10) ... pv = state.collect_vertex_marginals(pv) >>> gt.mf_entropy(g, pv) 22.533702... >>> gt.graph_draw(g, pos=g.vp["pos"], vertex_shape="pie", ... vertex_pie_fractions=pv, output="polbooks_blocks_soft_B4.svg") <...>
- copy(g=None, b=None, B=None, deg_corr=None, clabel=None, pclabel=None, **kwargs)[source]#
Copies the block state. The parameters override the state properties, and have the same meaning as in the constructor. If
overlap=False
an instance ofBlockState
is returned. This is by default a shallow copy.
- draw(**kwargs)[source]#
Convenience wrapper to
graph_draw()
that draws the state of the graph as colors on the vertices and edges.
- entropy(adjacency=True, dl=True, partition_dl=True, degree_dl=True, degree_dl_kind='distributed', edges_dl=True, dense=False, multigraph=True, deg_entropy=True, recs=True, recs_dl=True, beta_dl=1.0, Bfield=False, exact=True, **kwargs)#
Calculate the entropy associated with the current block partition.
Warning
The default arguments of this function are overriden by those obtained from
get_entropy_args()
. To update the defaults in a stateful way,update_entropy_args()
should be called.- Parameters:
- adjacency
bool
(optional, default:True
) If
True
, the adjacency term of the description length will be included.- dl
bool
(optional, default:True
) If
True
, the description length for the parameters will be included.- partition_dl
bool
(optional, default:True
) If
True
, anddl == True
the partition description length will be included.- degree_dl
bool
(optional, default:True
) If
True
, anddl == True
the degree sequence description length will be included (for degree-corrected models).- degree_dl_kind
str
(optional, default:"distributed"
) This specifies the prior used for the degree sequence. It must be one of:
"uniform"
,"distributed"
(default) or"entropy"
.- edges_dl
bool
(optional, default:True
) If
True
, anddl == True
the edge matrix description length will be included.- dense
bool
(optional, default:False
) If
True
, the “dense” variant of the entropy will be computed.- multigraph
bool
(optional, default:True
) If
True
, the multigraph entropy will be used.- deg_entropy
bool
(optional, default:True
) If
True
, the degree entropy term that is independent of the network partition will be included (for degree-corrected models).- recs
bool
(optional, default:True
) If
True
, the likelihood for real or discrete-valued edge covariates is computed.- recs_dl
bool
(optional, default:True
) If
True
, anddl == True
the edge covariate description length will be included.- beta_dl
double
(optional, default:1.
) Prior inverse temperature.
- Bfield
bool
(optional, default:False
) If True, the
Bfield
parameter passed to the construtor will be taken into account.- exact
bool
(optional, default:True
) If
True
, the exact expressions will be used. Otherwise, Stirling’s factorial approximation will be used for some terms.
- adjacency
Notes
The “entropy” of the state is minus the log-likelihood of the microcanonical SBM, that includes the generated graph \(\boldsymbol{A}\) and the model parameters \(\boldsymbol{\theta}\),
\[\begin{split}\mathcal{S} &= - \ln P(\boldsymbol{A},\boldsymbol{\theta}) \\ &= - \ln P(\boldsymbol{A}|\boldsymbol{\theta}) - \ln P(\boldsymbol{\theta}).\end{split}\]This value is also called the description length of the data, and it corresponds to the amount of information required to describe it (in nats).
For the traditional blockmodel (
deg_corr == False
), the model parameters are \(\boldsymbol{\theta} = \{\boldsymbol{e}, \boldsymbol{b}\}\), where \(\boldsymbol{e}\) is the matrix of edge counts between blocks, and \(\boldsymbol{b}\) is the overlapping partition of the nodes into blocks. For the degree-corrected blockmodel (deg_corr == True
), we have an additional set of parameters, namely the labelled degree sequence \(\boldsymbol{k}\).The model likelihood \(P(\boldsymbol{A}|\theta)\) is given analogously to the non-overlapping case, as described in
graph_tool.inference.BlockState.entropy()
.If
dl == True
, the description length \(\mathcal{L} = -\ln P(\boldsymbol{\theta})\) of the model will be returned as well. The edge-count prior \(P(\boldsymbol{e})\) is described in described inentropy()
. For the overlapping partition \(P(\boldsymbol{b})\), we have\[-\ln P(\boldsymbol{b}) = \ln\left(\!\!{D \choose N}\!\!\right) + \sum_d \ln {\left(\!\!{{B\choose d}\choose n_d}\!\!\right)} + \ln N! - \sum_{\vec{b}}\ln n_{\vec{b}}!,\]where \(d \equiv |\vec{b}|_1 = \sum_rb_r\) is the mixture size, \(n_d\) is the number of nodes in a mixture of size \(d\), \(D\) is the maximum value of \(d\), \(n_{\vec{b}}\) is the number of nodes in mixture \(\vec{b}\).
For the degree-corrected model we need to specify the prior \(P(\boldsymbol{k})\) for the labelled degree sequence as well:
\[-\ln P(\boldsymbol{k}) = \sum_r\ln\left(\!\!{m_r \choose e_r}\!\!\right) - \sum_{\vec{b}}\ln P(\boldsymbol{k}|{\vec{b}}),\]where \(m_r\) is the number of non-empty mixtures which contain type \(r\), and \(P(\boldsymbol{k}|{\vec{b}})\) is the likelihood of the labelled degree sequence inside mixture \(\vec{b}\). For this term we have three options:
degree_dl_kind == "uniform"
\[P(\boldsymbol{k}|\vec{b}) = \prod_r\left(\!\!{n_{\vec{b}}\choose e^r_{\vec{b}}}\!\!\right)^{-1}.\]degree_dl_kind == "distributed"
\[P(\boldsymbol{k}|\vec{b}) = \prod_{\vec{b}}\frac{\prod_{\vec{k}}\eta_{\vec{k}}^{\vec{b}}!}{n_{\vec{b}}!} \prod_r q(e_{\vec{b}}^r - n_{\vec{b}}, n_{\vec{b}})\]where \(n^{\vec{b}}_{\vec{k}}\) is the number of nodes in mixture \(\vec{b}\) with labelled degree \(\vec{k}\), and \(q(n,m)\) is the number of partitions of integer \(n\) into at most \(m\) parts.
degree_dl_kind == "entropy"
\[P(\boldsymbol{k}|\vec{b}) = \prod_{\vec{b}}\exp\left(-n_{\vec{b}}H(\boldsymbol{k}_{\vec{b}})\right)\]where \(H(\boldsymbol{k}_{\vec{b}}) = -\sum_{\vec{k}}p_{\vec{b}}(\vec{k})\ln p_{\vec{b}}(\vec{k})\) is the entropy of the labelled degree distribution inside mixture \(\vec{b}\).
Note that, differently from the other two choices, this represents only an approximation of the description length. It is meant to be used only for comparison purposes, and should be avoided in practice.
For the directed case, the above expressions are duplicated for the in- and out-degrees.
- exhaustive_sweep(entropy_args={}, callback=None, density=None, vertices=None, initial_partition=None, max_iter=None)#
Perform an exhaustive loop over all possible network partitions.
- Parameters:
- entropy_args
dict
(optional, default:{}
) Entropy arguments, with the same meaning and defaults as in
graph_tool.inference.BlockState.entropy()
.- callbackcallable object (optional, default:
None
) Function to be called for each partition, with three arguments
(S, S_min, b_min)
corresponding to the the current entropy value, the minimum entropy value so far, and the corresponding partition, respectively. If not provided, andhist is None
an iterator over the same values will be returned instead.- density
tuple
(optional, default:None
) If provided, it should contain a tuple with values
(S_min, S_max, n_bins)
, which will be used to obtain the density of states via a histogram of sizen_bins
. This parameter is ignored unlesscallback is None
.- verticesiterable of ints (optional, default:
None
) If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.
- initial_partitioniterable of ints (optional, default:
None
) If provided, this will provide the initial partition for the iteration.
- max_iter
int
(optional, default:None
) If provided, this will limit the total number of iterations.
- entropy_args
- Returns:
- statesiterator over (S, S_min, b_min)
If
callback
isNone
andhist
isNone
, the function will return an iterator over(S, S_min, b_min)
corresponding to the the current entropy value, the minimum entropy value so far, and the corresponding partition, respectively.- Ss, countspair of
numpy.ndarray
If
callback is None
andhist is not None
, the function will return the values of each bin (Ss
) and the state count of each bin (counts
).- b_min
VertexPropertyMap
If
callback is not None
orhist is not None
, the function will also return partition with smallest entropy.
Notes
This algorithm has an \(O(B^N)\) complexity, where \(B\) is the number of groups, and \(N\) is the number of vertices.
- get_Be()#
Returns the effective number of blocks, defined as \(e^{H}\), with \(H=-\sum_r\frac{n_r}{N}\ln \frac{n_r}{N}\), where \(n_r\) is the number of nodes in group r.
- get_bclabel(clabel=None)[source]#
Returns a
VertexPropertyMap
corresponding to constraint labels for the block graph.
- get_bg()#
Returns the block graph.
- get_block_state(b=None, vweight=False, **kwargs)#
Returns a
BlockState
corresponding to the block graph (i.e. the blocks of the current state become the nodes). The parameters have the same meaning as the in the constructor. Ifvweight == True
the nodes of the block state are weighted with the node counts.
- get_blocks()#
Returns the property map which contains the block labels for each vertex.
- get_bpclabel()#
Returns a
VertexPropertyMap
corresponding to partition constraint labels for the block graph.
- get_edge_blocks()[source]#
Returns an edge property map which contains the block labels pairs for each edge.
- get_edges_prob(missing, spurious=[], entropy_args={})#
Compute the joint log-probability of the missing and spurious edges given by
missing
andspurious
(a list of(source, target)
tuples, orEdge()
instances), together with the observed edges.More precisely, the log-likelihood returned is
\[\ln \frac{P(\boldsymbol G + \delta \boldsymbol G | \boldsymbol b)}{P(\boldsymbol G| \boldsymbol b)}\]where \(\boldsymbol G + \delta \boldsymbol G\) is the modified graph (with missing edges added and spurious edges deleted).
The values in
entropy_args
are passed toentropy()
to calculate the log-probability.
- get_entropy_args()#
Return the current default values for the parameters of the function
entropy()
, together with other operations that depend on them.
- get_er()#
Returns the vertex property map of the block graph which contains the number \(e_r\) of half-edges incident on block \(r\). If the graph is directed, a pair of property maps is returned, with the number of out-edges \(e^+_r\) and in-edges \(e^-_r\), respectively.
- get_ers()#
Returns the edge property map of the block graph which contains the \(e_{rs}\) matrix entries. For undirected graphs, the diagonal values (self-loops) contain \(e_{rr}/2\).
- get_majority_blocks()[source]#
Returns a scalar-valued vertex property map with the majority block membership of each node.
- get_matrix()#
Returns the block matrix (as a sparse
csr_matrix
), which contains the number of edges between each block pair.Warning
This corresponds to the adjacency matrix of the block graph, which by convention includes twice the amount of edges in the diagonal entries if the graph is undirected.
Examples
>>> g = gt.collection.data["polbooks"] >>> state = gt.BlockState(g, B=5, deg_corr=True) >>> state.mcmc_sweep(niter=1000) (...) >>> m = state.get_matrix() >>> figure() <...> >>> matshow(m.todense()) <...> >>> savefig("bloc_mat.svg")
- get_move_prob(v, s, c=1.0, d=0.1, reverse=False)#
Compute the log-probability of a move proposal for vertex
v
to blocks
according to sampling parametersc
andd
, as obtained withgraph_tool.inference.BlockState.sample_vertex_move()
. Ifreverse == True
, the reverse probability of moving the node back from blocks
to its current one is obtained.
- get_nonoverlap_blocks()[source]#
Returns a scalar-valued vertex property map with the block mixture represented as a single number.
- get_nr()#
Returns the vertex property map of the block graph which contains the block sizes \(n_r\).
- get_overlap_blocks()[source]#
Returns the mixed membership of each vertex.
- Returns:
- bv
VertexPropertyMap
A vector-valued vertex property map containing the block memberships of each node.
- bc_in
VertexPropertyMap
The labelled in-degrees of each node, i.e. how many in-edges belong to each group, in the same order as the
bv
property above.- bc_out
VertexPropertyMap
The labelled out-degrees of each node, i.e. how many out-edges belong to each group, in the same order as the
bv
property above.- bc_total
VertexPropertyMap
The labelled total degrees of each node, i.e. how many incident edges belong to each group, in the same order as the
bv
property above.
- bv
- get_rec_params()#
Get model hyperparameters for edge covariates.
- get_state()#
Alias to
get_blocks()
.
- gibbs_sweep(beta=1.0, niter=1, entropy_args={}, allow_new_group=True, sequential=True, deterministic=False, vertices=None, verbose=False, **kwargs)#
Perform
niter
sweeps of a rejection-free Gibbs MCMC to sample network partitions.- Parameters:
- beta
float
(optional, default:1.
) Inverse temperature.
- niter
int
(optional, default:1
) Number of sweeps to perform. During each sweep, a move attempt is made for each node.
- entropy_args
dict
(optional, default:{}
) Entropy arguments, with the same meaning and defaults as in
graph_tool.inference.BlockState.entropy()
.- allow_new_group
bool
(optional, default:True
) Allow the number of groups to increase and decrease.
- sequential
bool
(optional, default:True
) If
sequential == True
each vertex move attempt is made sequentially, where vertices are visited in random order. Otherwise the moves are attempted by sampling vertices randomly, so that the same vertex can be moved more than once, before other vertices had the chance to move.- deterministic
bool
(optional, default:False
) If
sequential == True
anddeterministic == True
the vertices will be visited in deterministic order.- vertices
list
of ints (optional, default:None
) If provided, this should be a list of vertices which will be moved. Otherwise, all vertices will.
- verbose
bool
(optional, default:False
) If
verbose == True
, detailed information will be displayed.
- beta
- Returns:
- dS
float
Entropy difference after the sweeps.
- nattempts
int
Number of vertex moves attempted.
- nmoves
int
Number of vertices moved.
- dS
Notes
This algorithm has an \(O(E\times B)\) complexity, where \(B\) is the number of groups, and \(E\) is the number of edges.
- mcmc_sweep(bundled=False, **kwargs)[source]#
Perform sweeps of a Metropolis-Hastings rejection sampling MCMC to sample network partitions. If
bundled == True
, the half-edges incident of the same node that belong to the same group are moved together. All remaining parameters are passed tograph_tool.inference.BlockState.mcmc_sweep()
.
- move_vertex(v, s)#
Move vertex
v
to blocks
.This optionally accepts a list of vertices and blocks to move simultaneously.
- multicanonical_sweep(m_state, multiflip=False, **kwargs)#
Perform
niter
sweeps of a non-Markovian multicanonical sampling using the Wang-Landau algorithm.- Parameters:
- m_state
MulticanonicalState
MulticanonicalState
instance containing the current state of the Wang-Landau run.- multiflip
bool
(optional, default:False
) If
True
,multiflip_mcmc_sweep()
will be used, otherwisemcmc_sweep()
.- **kwargsKeyword parameter list
The remaining parameters will be passed to
multiflip_mcmc_sweep()
ormcmc_sweep()
.
- m_state
- Returns:
- dS
float
Entropy difference after the sweeps.
- nattempts
int
Number of vertex moves attempted.
- nmoves
int
Number of vertices moved.
- dS
Notes
This algorithm has an \(O(E)\) complexity, where \(E\) is the number of edges (independent of the number of groups).
References
[wang-efficient-2001]Fugao Wang, D. P. Landau, “An efficient, multiple range random walk algorithm to calculate the density of states”, Phys. Rev. Lett. 86, 2050 (2001), DOI: 10.1103/PhysRevLett.86.2050 [sci-hub, @tor], arXiv: cond-mat/0011174
- multiflip_mcmc_sweep(beta=1.0, c=0.5, psingle=None, psplit=1, pmerge=1, pmergesplit=1, pmovelabel=0, d=0.01, gibbs_sweeps=10, niter=1, entropy_args={}, accept_stats=None, verbose=False, **kwargs)#
Perform
niter
sweeps of a Metropolis-Hastings acceptance-rejection MCMC with multiple simultaneous moves (i.e. merges and splits) to sample network partitions.- Parameters:
- beta
float
(optional, default:1.
) Inverse temperature.
- c
float
(optional, default:.5
) Sampling parameter
c
for move proposals: For \(c\to 0\) the blocks are sampled according to the local neighborhood of a given node and their block connections; for \(c\to\infty\) the blocks are sampled randomly. Note that only for \(c > 0\) the MCMC is guaranteed to be ergodic.- psingle
float
(optional, default:None
) Relative probability of proposing a single node move. If
None
, it will be selected as the number of nodes in the graph.- psplit
float
(optional, default:1
) Relative probability of proposing a group split.
- pmerge
float
(optional, default:1
) Relative probability of proposing a group merge.
- pmergesplit
float
(optional, default:1
) Relative probability of proposing a marge-split move.
- pmovelabel
float
(optional, default:0
) Relative probability of proposing a group label move.
- d
float
(optional, default:1
) Probability of selecting a new (i.e. empty) group for a given single-node move.
- gibbs_sweeps
int
(optional, default:10
) Number of sweeps of Gibbs sampling to be performed (i.e. each node is attempted once per sweep) to refine a split proposal.
- niter
int
(optional, default:1
) Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.
- entropy_args
dict
(optional, default:{}
) Entropy arguments, with the same meaning and defaults as in
graph_tool.inference.BlockState.entropy()
.- accept_stats
dict
(optional, default:None
) If provided, this dictionary will be updated with the proposal and acceptance counts for each kind of move.
- verbose
bool
(optional, default:False
) If
verbose == True
, detailed information will be displayed.
- beta
- Returns:
- dS
float
Entropy difference after the sweeps.
- nattempts
int
Number of vertex moves attempted.
- nmoves
int
Number of vertices moved.
- dS
Notes
This algorithm has an \(O(E)\) complexity, where \(E\) is the number of edges (independent of the number of groups).
References
[peixoto-merge-split-2020]Tiago P. Peixoto, “Merge-split Markov chain Monte Carlo for community detection”, Phys. Rev. E 102, 012305 (2020), DOI: 10.1103/PhysRevE.102.012305 [sci-hub, @tor], arXiv: 2003.07070
- multilevel_mcmc_sweep(niter=1, beta=inf, c=0.5, d=0.01, r=0.9, random_bisect=True, merge_sweeps=10, mh_sweeps=10, init_r=0.99, init_min_iter=5, init_beta=1.0, gibbs=False, B_min=1, B_max=18446744073709551615, b_min=None, b_max=None, M=None, cache_states=True, force_accept=False, parallel=False, entropy_args={}, verbose=False, **kwargs)#
Perform
niter
sweeps of a multilevel agglomerative acceptance-rejection pseudo-MCMC (i.e. detailed balance is not preserved) to sample network partitions, that uses a bisection search on the number of groups, together with group merges and singe-node moves.- Parameters:
- niter
int
(optional, default:1
) Number of sweeps to perform. During each sweep, a move attempt is made for each node, on average.
- beta
float
(optional, default:numpy.inf
) Inverse temperature.
- c
float
(optional, default:.5
) Sampling parameter
c
for move proposals: For \(c\to 0\) the blocks are sampled according to the local neighborhood of a given node and their block connections; for \(c\to\infty\) the blocks are sampled randomly. Note that only for \(c > 0\) the MCMC is guaranteed to be ergodic.- d
float
(optional, default:.01
) Probability of selecting a new (i.e. empty) group for a given single-node move.
- r
float
(optional, default:0.9
) Group shrink ratio. The number of groups is reduced by this fraction at each merge sweep.
- random_bisect
bool
(optional, default:True
) If
True
, bisections are done at randomly chosen intervals. Otherwise a Fibonacci sequence is used.- merge_sweeps
int
(optional, default:10
) Number of sweeps spent to find good merge proposals.
- mh_sweeps
int
(optional, default:10
) Number of single-node Metropolis-Hastings sweeps between merge splits.
- init_r
double
(optional, default:0.99
) Stopping criterion for the intialization phase, after each node is put in their own group, to set the initial upper bound of the bisection search. A number of single-node Metropolis-Hastings sweeps is done until the number of groups is shrunk by a factor that is larger than this parameter.
- init_min_iter
int
(optional, default:5
) Minimum number of iterations at the intialization phase.
- init_beta
float
(optional, default:1.
) Inverse temperature to be used for the very first sweep of the initialization phase.
- gibbs
bool
(optional, default:False
) If
True
, the single node moves use (slower) Gibbs sampling, rather than Metropolis-Hastings.- B_min
int
(optional, default:1
) Minimum number of groups to be considered in the search.
- b_min
VertexPropertyMap
(optional, default:None
) If provided, this will be used for the partition corresponding to
B_min
.- B_max
int
(optional, default:numpy.iinfo(numpy.uint64).max
) Maximum number of groups to be considered in the search.
- b_max
VertexPropertyMap
(optional, default:None
) If provided, this will be used for the partition corresponding to
B_max
.- M
int
(optional, default:None
) Maximum number of groups to select for the multilevel move. If
None
is provided, then all groups are always elected.- cache_states
bool
(optional, default:True
) If
True
, intermediary states will be cached during the bisection search.- force_accept
bool
(optional, default:False
) If
True
, new state will be accepted even if it does not improve the objective function.- parallel
bool
(optional, default:False
) If
True
, the algorithm will run in parallel (if enabled during compilation).- entropy_args
dict
(optional, default:{}
) Entropy arguments, with the same meaning and defaults as in
graph_tool.inference.BlockState.entropy()
.- verbose
bool
(optional, default:False
) If
verbose == True
, detailed information will be displayed.
- niter
- Returns:
- dS
float
Entropy difference after the sweeps.
- nattempts
int
Number of vertex moves attempted.
- nmoves
int
Number of vertices moved.
- dS
Notes
This algorithm has an \(O(E\ln^2 N)\) complexity, where \(E\) is the number of edges and \(N\) is the number of nodes (independently of the number of groups).
References
[peixoto-efficient-2014]Tiago P. Peixoto, “Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models”, Phys. Rev. E 89, 012804 (2014), DOI: 10.1103/PhysRevE.89.012804 [sci-hub, @tor], arXiv: 1310.4378
- remove_vertex(v)#
Remove vertex
v
from its current group.This optionally accepts a list of vertices to remove.
Warning
This will leave the state in an inconsistent state before the vertex is returned to some other group, or if the same vertex is removed twice.
- reset_entropy_args()#
Reset the current default values for the parameters of the function
entropy()
, together with other operations that depend on them.
- sample_graph(canonical=False, multigraph=True, self_loops=True, sample_params=False, max_ent=False, n_iter=1000)#
Sample a new graph from the fitted model.
- Parameters:
- canonical
bool
(optional, default:False
) If
canonical == True
, the graph will be sampled from the maximum-likelihood estimate of the canonical stochastic block model. Otherwise, it will be sampled from the microcanonical model.- multigraph
bool
(optional, default:True
) If
True
, parallel edges will be allowed.- self-loops
bool
(optional, default:True
) If
True
, self-loops will be allowed.- sample_params
bool
(optional, default:True
) If
True
, andcanonical == True
andmax_ent == False
, the count parameters (edges between groups and node degrees) will be sampled from their posterior distribution conditioned on the actual state. Otherwise, their maximum-likelihood values will be used.- max_ent
bool
(optional, default:False
) If
True
, maximum-entropy model variants will be used.- n_iter
int
(optional, default:1000
) Number of iterations used (only relevant if
canonical == False
andmax_ent == True
).
- canonical
- Returns:
- g
Graph
Generated graph.
- g
Notes
This function is just a convenience wrapper to
generate_sbm()
. However, ifmax_ent==True
andcanonical == False
it wrapsrandom_rewire()
instead.Examples
>>> g = gt.collection.data["polbooks"] >>> state = gt.minimize_blockmodel_dl(g, multilevel_mcmc_args=dict(B_max=3)) >>> u = state.sample_graph(canonical=True, self_loops=False, multigraph=False) >>> ustate = gt.BlockState(u, b=state.b) >>> state.draw(pos=g.vp.pos, output="polbooks-sbm.svg") <...> >>> ustate.draw(pos=u.own_property(g.vp.pos), output="polbooks-sbm-sampled.svg") <...>
Left: Political books network. Right: Sample from the degree-corrected SBM fitted to the original network.
- sample_vertex_move(v, c=1.0, d=0.1)#
Sample block membership proposal of vertex
v
according to real-valued sampling parametersc
andd
: For \(c\to 0\) the blocks are sampled according to the local neighborhood and their connections; for \(c\to\infty\) the blocks are sampled randomly. With a probabilityd
, a new (empty) group is sampled.
- set_rec_params(params)#
Update model hyperparameters for edge covariates.
- set_state(b)#
Sets the internal partition of the state.
- update_entropy_args(**kwargs)#
Update the default values for the parameters of the function
entropy()
from the keyword arguments, in a stateful way, together with other operations that depend on them.Values updated in this manner are preserved by the copying or pickling of the state.
- virtual_vertex_move(v, s, **kwargs)#
Computes the entropy difference if vertex
v
is moved to blocks
. The remaining parameters are the same as ingraph_tool.inference.BlockState.entropy()
.