Network reconstruction from dynamics and behavior

Network reconstruction from dynamics and behavior#

In some cases, direct measurements of the edges of a network are either impossible, or can be done only at significant experimental cost. In such situations, we need to infer the network of interactions from the observed functional behavior [peixoto-network-2024]. In graph-tool this can be done for a variety of static and dynamical models, where the data X is a N×M matrix containing M samples (or “features”) for each node.

For static models we have that the likelihood is given by

P(X|W,θ)=mP(Xm|W,θ),

where each sample is assumed to be i.i.d., whereas for dynamic models X is assumed to contain the states of a Markov chain with likelihood

P(X|W,θ)=mP(Xm|Xm1,W,θ).

In both cases above, W represents the weighted adjacency matrix and θ are additional node parameters.

The inference procedure consists in the maximum a posteriori (MAP) estimation:

W^,θ^=argmaxW,θP(W,θ|X)=argmaxW,θlogP(X|W,θ)+logP(W,θ),

where the prior P(W,θ) implements regularization.

In graph-tool, a variety of models are implemented, as given by the following table:

Name

State

Data domain

Kind

Multivariate normal

PseudoNormalBlockState

R

Static

Dynamic multivariate normal

NormalGlauberBlockState

R

Dynamic

Generalized Lotka-Volterra

LVBlockState

R+

Dynamic

Compartmental epidemics

EpidemicsBlockState

{0,1}

Dynamic

Ising model

PseudoIsingBlockState

{1,+1}

Static

Kinetic Ising model

IsingGlauberBlockState

{1,+1}

Dynamic

Continuous Ising model

PseudoCIsingBlockState

[1,1]

Static

Continuous kinetic Ising model

CIsingGlauberBlockState

[1,1]

Dynamic

Reconstruction with synthetic data#

Here we demonstrate how reconstruction is performed with a simple example, based on the simulation of the Kinetic Ising model on the dolphins network (we can use the IsingGlauberState class for this purpose):

Note

../../_images/dolphin.svg

This is the dolphins network we will attempt to reconstruct from a the kinetic Ising model.#

g = gt.collection.data["dolphins"]
E = g.num_edges()
N = g.num_vertices()
w = g.new_ep("double", vals=np.random.normal(N/(2*E), .05, E))  # True edge weights
istate = gt.IsingGlauberState(g, w=w)

M = 1000
X = []
for m in range(M):
    istate.iterate_sync()
    X.append(istate.get_state().a.copy())
X = np.array(X).T

At this point the array X contains our simulated data. We will now reconstruct the original network from it using a nonparametric approach, which consists of initializing an IsingGlauberBlockState object, and iterating over mcmc_sweep() until sufficient convergence is reached:

Note

When doing reconstruction, the default for mcmc_sweep() is the inverse “temperature” beta=1, meaning that this function performs sweeps of a MCMC. Here we are interested in a greedy optimization instead, so we need to pass beta=numpy.inf.

state = gt.IsingGlauberBlockState(X, directed=False, self_loops=False)

for i in range(10):
    delta, *_ = state.mcmc_sweep(beta=np.inf, niter=10)
    print(delta)

This outputs the following deltas, indicating that convergence was achieved.

-4686.196037213296
-3286.049921543921
-818.8563094596482
-115.3160422147071
-19.18297219710969
-2.427082044189543
-0.002041309916862133
-0.0029370158383912326
-0.0002670763882761662
-2.5721647034515627e-12

We can compare the reconstructed network using the similarity() function, as follows

u = state.get_graph()      # reconstructed network
w_r = state.get_x()        # reconstructed weights
t_r = state.get_theta()    # reconstructed thetas

print(gt.similarity(g, u, w, w_r))

which gives:

0.908651...

We can visualize the reconstruction with:

cnorm = matplotlib.colors.Normalize(vmin=-abs(w.fa).max(), vmax=abs(w.fa).max())
# Original graph
gt.graph_draw(g, g.vp.pos, edge_pen_width=gt.prop_to_size(w.t(abs), 2, 8, power=1), edge_color=w,
              ecmap=matplotlib.cm.coolwarm_r, ecnorm=cnorm, output="dolphins-orig.svg")

# Inferred graph
gt.graph_draw(u, g.vp.pos, edge_pen_width=gt.prop_to_size(w_r.t(abs), 2, 8, power=1), edge_color=w_r,
              ecmap=matplotlib.cm.coolwarm_r, ecnorm=cnorm, output="dolphins-inf.svg")

# Inferred graph with SBM partition
state.bstate.draw(pos=g.vp.pos, edge_gradient=[],
                  edge_pen_width=gt.prop_to_size(w_r.t(abs), 2, 8, power=1),
                  edge_color=w_r, ecmap=matplotlib.cm.coolwarm_r, ecnorm=cnorm,
                  output="dolphins-inf-sbm.svg")

which outputs the networks below:

True network

Reconstructed

Reconstructed with SBM

../../_images/dolphins-orig.svg ../../_images/dolphins-inf.svg ../../_images/dolphins-inf-sbm.svg

Note

The algorithm implemented is the one described in [peixoto-scalable-2024] combined with the MDL regularization based on weight quantization from [peixoto-network-2024], which includes also a SBM as part of the prior for the latent edges.

The algorithm runs in time O(Nlog2N) in the typical case, where N is the number of nodes, and if enabled during compilation will run in parallel using OpenMP, and therefore can be employed to reconstruct large networks.

L1 regularization#

The default reconstruction methods employs a nonparametric regularization scheme [peixoto-network-2024], but optionally L1 can be used instead. This is done by initializing the reconstruction state as

lstate = gt.IsingGlauberBlockState(X, disable_xdist=True, disable_tdist=True, directed=False, self_loops=False)

This method requires a sparsity parameter λ to be set, e.g. the following sets λ=200,

lstate.update_entropy_args(xl1=200)

and then we can proceed in the same way,

for i in range(10):
   delta, *_ = lstate.mcmc_sweep(beta=np.inf, niter=10)

which produces the network:

u = lstate.get_graph()
w_r = lstate.get_x()
gt.graph_draw(u, g.vp.pos, edge_pen_width=gt.prop_to_size(w_r.t(abs), 2, 8, power=1), edge_color=w_r,
             ecmap=matplotlib.cm.coolwarm_r, ecnorm=cnorm, output="dolphins-inf-l1.svg")
../../_images/dolphins-inf-l1.svg

Network reconstructed using L1 regularization with λ=200.#

Caution

The L1 regularization method is faster than MDL, but the most appropriate value of λ is tricky to find, usually requiring cross-validation, which in the end makes it slower than MDL. It also introduces a shrinkage bias and tends to overfit when compared to MDL, which in general should be preferred. See [peixoto-network-2024] for details.

Reconstruction with empirical data#

As an example of how to perform reconstructions on empirical data, we will consider the log-returns of the stock prices of the S&P 500 companies containing the 500 largest US companies.

We will begin by fetching the data and computing the log-returns:

import requests
from collections import defaultdict
import csv

r = requests.get('https://downloads.skewed.de/data/all_stocks_5yr.csv')
lines = r.iter_lines(decode_unicode=True)
next(lines)                                     # remove header
prices = defaultdict(list)
for vals in csv.reader(lines, delimiter=','):
    prices[vals[-1]].append(float(vals[4]))     # use closing price
stocks, s = zip(*[(stock, vals) for stock, vals in prices.items() if len(vals) == 1259])

# compute log-returns
s = [[log(p[i+1]) - log(p[i]) for i in range(len(p)-1)] for p in s]

s = array(s)

We will use a multivariate normal model via PseudoNormalBlockState:

state = gt.PseudoNormalBlockState(s)

delta = np.inf
while delta > 20:
    ret = state.mcmc_sweep(beta=np.inf, niter=1)
    delta = abs(ret[0])

We can now visualize the reconstructed network:

bstate = state.get_block_state().levels[0]
pos = gt.sfdp_layout(state.get_graph(), groups=bstate.get_blocks(), gamma=.02)
bstate.draw(pos=pos, vertex_color="white", edge_gradient=[],
            edge_pen_width=gt.prop_to_size(state.get_x().t(abs), .1, 3, power=1),
            output="SP500.svg")
../../_images/SP500.svg

Reconstructed network of interactions between the S&P500 companies according to their daily log-returns and a multivariate normal model. The edge thickness corresponds to the inferred negative inverse covariance (a.k.a. precision matrix entries).#

References#

[peixoto-network-2019]

Tiago P. Peixoto, “Network reconstruction and community detection from dynamics”, Phys. Rev. Lett. 123 128301 (2019), DOI: 10.1103/PhysRevLett.123.128301 [sci-hub, @tor], arXiv: 1903.10833

[peixoto-scalable-2024]

Tiago P. Peixoto, “Scalable network reconstruction in subquadratic time”, arXiv: 2401.01404

[peixoto-network-2024] (1,2,3,4)

Tiago P. Peixoto, “Network reconstruction via the minimum description length principle”, arXiv: 2405.01015