# Network reconstruction from dynamics and behavior#

In some cases, direct measurements of the edges of a network are either impossible, or can be done only at significant experimental cost. In such situations, we need to infer the network of interactions from the observed functional behavior [peixoto-network-2024]. In graph-tool this can be done for a variety of static and dynamical models, where the data $$\boldsymbol X$$ is a $$N\times M$$ matrix containing $$M$$ samples (or “features”) for each node.

For static models we have that the likelihood is given by

$P(\boldsymbol X | \boldsymbol W,\boldsymbol\theta) = \prod_m P(\boldsymbol X_{m}|\boldsymbol W, \boldsymbol\theta),$

where each sample is assumed to be i.i.d., whereas for dynamic models $$\boldsymbol X$$ is assumed to contain the states of a Markov chain with likelihood

$P(\boldsymbol X | \boldsymbol W,\boldsymbol\theta) = \prod_m P(\boldsymbol X_{m}|X_{m-1},\boldsymbol W, \boldsymbol\theta).$

In both cases above, $$\boldsymbol W$$ represents the weighted adjacency matrix and $$\boldsymbol \theta$$ are additional node parameters.

The inference procedure consists in the maximum a posteriori (MAP) estimation:

\begin{split}\begin{aligned} \widehat{\boldsymbol W}, \widehat{\boldsymbol\theta} &= \underset{\boldsymbol W, \boldsymbol\theta}{\operatorname{arg max}}\; P(\boldsymbol W,\boldsymbol\theta | \boldsymbol X) \\ &= \underset{\boldsymbol W, \boldsymbol\theta}{\operatorname{arg max}}\; \log P(\boldsymbol X | \boldsymbol W,\boldsymbol\theta) + \log P(\boldsymbol W,\boldsymbol\theta), \end{aligned}\end{split}

where the prior $$P(\boldsymbol W,\boldsymbol\theta)$$ implements regularization.

In graph-tool, a variety of models are implemented, as given by the following table:

Name

State

Data domain

Kind

Multivariate normal

PseudoNormalBlockState

$$\mathbb{R}$$

Static

Dynamic multivariate normal

NormalGlauberBlockState

$$\mathbb{R}$$

Dynamic

Generalized Lotka-Volterra

LVBlockState

$$\mathbb{R}^+$$

Dynamic

Compartmental epidemics

EpidemicsBlockState

$$\{0, 1\}$$

Dynamic

Ising model

PseudoIsingBlockState

$$\{-1, +1\}$$

Static

Kinetic Ising model

IsingGlauberBlockState

$$\{-1, +1\}$$

Dynamic

Continuous Ising model

PseudoCIsingBlockState

$$[-1,1]$$

Static

Continuous kinetic Ising model

CIsingGlauberBlockState

$$[-1,1]$$

Dynamic

## Reconstruction with synthetic data#

Here we demonstrate how reconstruction is performed with a simple example, based on the simulation of the Kinetic Ising model on the dolphins network (we can use the IsingGlauberState class for this purpose):

Note

g = gt.collection.data["dolphins"]
E = g.num_edges()
N = g.num_vertices()
w = g.new_ep("double", vals=np.random.normal(N/(2*E), .05, E))  # True edge weights
istate = gt.IsingGlauberState(g, w=w)

M = 1000
X = []
for m in range(M):
istate.iterate_sync()
X.append(istate.get_state().a.copy())
X = np.array(X).T


At this point the array X contains our simulated data. We will now reconstruct the original network from it using a nonparametric approach, which consists of initializing an IsingGlauberBlockState object, and iterating over mcmc_sweep() until sufficient convergence is reached:

Note

When doing reconstruction, the default for mcmc_sweep() is beta=numpy.inf, meaning that this function performs a greedy optimization, not actual samples from a reversible Markov chain.

state = gt.IsingGlauberBlockState(X)

for i in range(10):
delta, *_ = state.mcmc_sweep(niter=10)
print(delta)


This outputs the following deltas, indicating that convergence was achieved.

-4374.171741316452
-3430.442095268032
-904.7176098519118
-190.28064153057187
-26.28956787113603
-1.9477590574467705
-0.4964012530509585
-0.0012287004910120913
-1.295718874416707e-07
-8.93862761586206e-12


We can compare the reconstructed network using the similarity() function, as follows

u = state.get_graph()      # reconstructed network
w_r = state.get_x()        # reconstructed weights
t_r = state.get_theta()    # reconstructed thetas

print(gt.similarity(g, u, w, w_r))


which gives:

0.907831...


We can visualize the reconstruction with:

cnorm = matplotlib.colors.Normalize(vmin=-abs(w.fa).max(), vmax=abs(w.fa).max())
# Original graph
gt.graph_draw(g, g.vp.pos, edge_pen_width=gt.prop_to_size(w.t(abs), 2, 8, power=1), edge_color=w,
ecmap=matplotlib.cm.coolwarm_r, ecnorm=cnorm, output="dolphins-orig.svg")

# Inferred graph
gt.graph_draw(u, g.vp.pos, edge_pen_width=gt.prop_to_size(w_r.t(abs), 2, 8, power=1), edge_color=w_r,
ecmap=matplotlib.cm.coolwarm_r, ecnorm=cnorm, output="dolphins-inf.svg")

# Inferred graph with SBM partition
edge_pen_width=gt.prop_to_size(w_r.t(abs), 2, 8, power=1),
edge_color=w_r, ecmap=matplotlib.cm.coolwarm_r, ecnorm=cnorm,
output="dolphins-inf-sbm.svg")


which outputs the networks below:

 True network Reconstructed Reconstructed with SBM

Note

The algorithm implemented is the one described in [peixoto-scalable-2024] combined with the MDL regularization based on weight quantization from [peixoto-network-2024], which includes also a SBM as part of the prior for the latent edges.

The algorithm runs in time $$O(N\log^2 N)$$ in the typical case, where $$N$$ is the number of nodes, and if enabled during compilation will run in parallel using OpenMP, and therefore can be employed to reconstruct large networks.

### $$L_1$$ regularization#

The default reconstruction methods employs a nonparametric regularization scheme [peixoto-network-2024], but optionally $$L_1$$ can be used instead. This is done by initializing the reconstruction state as

lstate = gt.IsingGlauberBlockState(X, disable_xdist=True, disable_tdist=True)


This method requires a sparsity parameter $$\lambda$$ to be set, e.g. the following sets $$\lambda=200$$,

lstate.update_entropy_args(xl1=200)


and then we can proceed in the same way,

for i in range(10):
delta, *_ = lstate.mcmc_sweep(niter=10)


which produces the network:

u = lstate.get_graph()
w_r = lstate.get_x()
gt.graph_draw(u, g.vp.pos, edge_pen_width=gt.prop_to_size(w_r.t(abs), 2, 8, power=1), edge_color=w_r,
ecmap=matplotlib.cm.coolwarm_r, ecnorm=cnorm, output="dolphins-inf-l1.svg")


Caution

The $$L_1$$ regularization method is faster than MDL, but the most appropriate value of $$\lambda$$ is tricky to find, usually requiring cross-validation, which in the end makes it slower than MDL. It also introduces a shrinkage bias and tends to overfit when compared to MDL, which in general should be preferred. See [peixoto-network-2024] for details.

## Reconstruction with empirical data#

As an example of how to perform reconstructions on empirical data, we will consider the log-returns of the stock prices of the S&P 500 companies containing the 500 largest US companies.

We will begin by fetching the data and computing the log-returns:

import requests
from collections import defaultdict
import csv

r = requests.get('https://raw.githubusercontent.com/plotly/datasets/master/all_stocks_5yr.csv')
lines = r.iter_lines(decode_unicode=True)
prices = defaultdict(list)
prices[vals[-1]].append(float(vals[4]))     # use closing price
stocks, s = zip(*[(stock, vals) for stock, vals in prices.items() if len(vals) == 1259])

# compute log-returns
s = [[log(p[i+1]) - log(p[i]) for i in range(len(p)-1)] for p in s]

s = array(s)


We will use a multivariate normal model via PseudoNormalBlockState:

Tip

For some models/data it may be better to first do a sweep over the edges before any other latent variable. This happens to be true for this case!

state = gt.PseudoNormalBlockState(s)

# Do an initial sweep over the edges before any other latent variable.
ret = state.edge_mcmc_sweep(niter=10)

delta = abs(ret[0])
while delta > 20:
ret = state.mcmc_sweep(niter=10)
delta = abs(ret[0])


We can now visualize the reconstructed network:

bstate = state.get_block_state().levels[0]
pos = gt.sfdp_layout(state.get_graph(), groups=bstate.get_blocks(), gamma=.02)
edge_pen_width=gt.prop_to_size(state.get_x().t(abs), .1, 3, power=1),
output="SP500.svg")


## References#

[peixoto-network-2019]

Tiago P. Peixoto, “Network reconstruction and community detection from dynamics”, Phys. Rev. Lett. 123 128301 (2019), DOI: 10.1103/PhysRevLett.123.128301 [sci-hub, @tor], arXiv: 1903.10833

Tiago P. Peixoto, “Scalable network reconstruction in subquadratic time”, arXiv: 2401.01404

[peixoto-network-2024] (1,2,3,4)

Tiago P. Peixoto, “Network reconstruction via the minimum description length principle”, arXiv: 2405.01015