The group's work lies at the interface between statistical mechanics and quantitative biology. At its centre is the relationship between fluctuations and noise in biological systems, the corresponding statistical ensembles, and biological function. This connection emerges at very different levels and timescales, from the stochastic modeling of gene expression to the population dynamics of regulatory DNA.

Stochastic dynamics of gene expression

Gene transcription takes place out of equilibrium: the expression rate of a gene changes constantly with the presence or absence of specific proteins called transcription factors. Random fluctuations of molecular concentrations can have large effects on cells, such as transitions between metastable states. In collaboration with experimentalists from the Veening lab (Groningen), we have pinpointed the fluctuation that triggers the transition to the induced state of the paradigmatic lactose-uptake pathway in E. coli and developed a statistical theory of this transition [1]. Earlier work addressed the out-of-equilibrium properties of gene expression [2], the inference of regulatory interactions between genes from gene expression time series [3], and fluctuations in the dynamics of non-coding RNA [4].
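As a minimal illustration of stochastic gene expression, the following sketch simulates the simplest birth-death model (constant production, first-order degradation) with the Gillespie algorithm. It is not the lac-pathway model of [1]; the function names and the parameters `k_prod`, `gamma` and `t_end` are assumptions chosen for this example only.

```python
import numpy as np

def gillespie_birth_death(k_prod, gamma, t_end, n0=0, seed=0):
    """Simulate production (rate k_prod) and degradation (rate gamma * n).

    Returns the event times and the molecule count after each event.
    Requires k_prod > 0 so that the total rate never vanishes.
    """
    rng = np.random.default_rng(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod            # propensity of a production event
        a_deg = gamma * n          # propensity of a degradation event
        a_total = a_prod + a_deg
        dt = rng.exponential(1.0 / a_total)  # waiting time to the next event
        if t + dt > t_end:
            break
        t += dt
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return np.array(times), np.array(counts)

def time_average(times, counts, t_end):
    """Average molecule count, weighting each state by the time spent in it."""
    durations = np.diff(np.append(times, t_end))
    return float(np.sum(counts * durations) / t_end)

# Hypothetical parameters: the stationary mean of this model is k_prod / gamma.
times, counts = gillespie_birth_death(k_prod=10.0, gamma=1.0, t_end=200.0)
mean_count = time_average(times, counts, t_end=200.0)
```

In this model the copy number fluctuates around `k_prod / gamma` with Poisson statistics. Metastable states of the kind discussed above require nonlinear feedback, which this sketch deliberately leaves out.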

[1]   What makes the lac-pathway switch: identifying the fluctuations that trigger phenotype switching in gene regulatory systems,
  P. Bhogale, R. Sorg, J.-W. Veening, J. Berg, Nucleic Acids Research (2014), doi: 10.1093/nar/gku839

[2]   Out-of-equilibrium dynamics of gene expression and the Jarzynski equality,
J. Berg, Phys. Rev. Lett. 100, 188101 (2008)

[3]   Dynamics of gene expression and the regulatory inference problem,
J. Berg, Europhys. Lett. 82, 28010 (2008)

[4]   Quantitative analysis of competition in post-transcriptional regulation reveals a novel signature in target expression variation,
F. Klironomos and J. Berg, Biophysical Journal, 104 (4), 951-958 (2013)

The rate at which the lac-pathway switches to the induced state (log scale), plotted against the inverse of the inducer concentration. Our model predicts a simple relationship that is borne out very well by experiment (orange triangles), see [1].

Statistical mechanics, inverse problems, and inference

We study pure and applied problems at the interface between statistical mechanics and machine learning. An example is the inverse Ising problem: how to reconstruct the couplings and fields of an Ising model given observables such as spin-spin correlations. We contributed a novel method based on the Bethe approximation, which is exact on tree-like lattices [2], and extended mean-field approaches to both ferromagnetic and glassy low-temperature phases of Ising models [1]. Applications lie in the reconstruction of expression levels of the constituents of complex tissues from mixtures [3] (in collaboration with Roman Müller and Thomas Benzing at the Nephrology Lab, Cologne) and, more recently, cancer genomics (in collaboration with Roman Thomas at the Department of Translational Medicine, Cologne).
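To make the inverse Ising problem concrete, the sketch below uses the textbook naive mean-field inversion, in which the couplings are read off from the inverse of the connected correlation matrix. This is a standard baseline, not the Bethe or low-temperature methods of the references here. The system size, the coupling range and all function names are assumptions made for this example; the correlations are computed exactly by enumerating all states, which is only feasible for a handful of spins.

```python
import itertools
import numpy as np

def exact_moments(J, h):
    """Magnetisations and connected correlations by enumerating all 2^N states.

    Convention: P(s) is proportional to exp(sum_{i<j} J_ij s_i s_j + sum_i h_i s_i),
    with J symmetric and zero on the diagonal.
    """
    n = len(h)
    states = np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)
    log_w = 0.5 * np.einsum("ai,ij,aj->a", states, J, states) + states @ h
    w = np.exp(log_w - log_w.max())
    p = w / w.sum()
    m = p @ states
    C = (states * p[:, None]).T @ states - np.outer(m, m)
    return m, C

def naive_mean_field_inverse(m, C):
    """Naive mean-field reconstruction: J_ij = -(C^{-1})_ij for i != j."""
    J_est = -np.linalg.inv(C)
    np.fill_diagonal(J_est, 0.0)
    # Fields follow from the mean-field equation m_i = tanh(h_i + sum_j J_ij m_j).
    h_est = np.arctanh(m) - J_est @ m
    return J_est, h_est

# Hypothetical test system: 5 spins with weak random couplings and fields.
rng = np.random.default_rng(0)
n_spins = 5
upper = np.triu(rng.uniform(-0.1, 0.1, size=(n_spins, n_spins)), k=1)
J_true = upper + upper.T
h_true = rng.uniform(-0.1, 0.1, size=n_spins)

m, C = exact_moments(J_true, h_true)
J_est, h_est = naive_mean_field_inverse(m, C)
max_coupling_error = float(np.max(np.abs(J_est - J_true)))
```

For weak couplings the naive mean-field estimate is accurate, since its leading error is of higher order in the couplings. It degrades at strong couplings and low temperatures, which is the regime that more refined approaches target.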

[1]   Mean-field theory for the inverse Ising problem at low temperatures,
  H.C. Nguyen and J. Berg, Phys. Rev. Lett. 109, 050602 (2012).

[2]   Bethe-Peierls approximation and the inverse Ising problem,
  H.C. Nguyen and J. Berg, J. Stat. Mech., P03004 (2012).

[3]   A statistical mechanics approach to the sample deconvolution problem,
  N. Riedel and J. Berg, Phys. Rev. E 87, 042715 (2013)

[4]   Significance analysis and statistical mechanics: an application to clustering,
  M. Łuksza, M. Lässig and J. Berg, Phys. Rev. Lett. 105, 220601 (2010).

Reconstruction error for a model with a glassy low-temperature phase (SK model), plotted against inverse temperature for different numbers of thermodynamic states. See [1] for details.

Statistical mechanics of bio-molecular networks

We have developed an equilibrium statistical mechanics of networks with local connectivity correlations [1], which links a network Hamiltonian with the corresponding statistical observables. This statistical approach forms the theoretical basis for identifying functional units in biological networks. Examples are repeated patterns in networks, called network motifs, which can be identified using Bayesian inference on models of correlations within networks [2]. The evolutionary dynamics of networks of related species can be traced by graph alignment, which is based on a stochastic evolution model for biological interactions and for DNA sequences [3,4]. We have used this approach to analyze coexpression networks of human and mouse, and the protein interaction networks of different herpesviruses. A software package called GraphAlignment is available for download.
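As a simple illustration of what a network motif is, the sketch below counts feed-forward loops (edges i→j, j→k and i→k) in a small directed network by direct enumeration over the adjacency matrix. This is plain subgraph counting, not the Bayesian motif search of [2] and not the GraphAlignment package; the function name and the toy network are assumptions made for this example.

```python
import numpy as np

def count_feed_forward_loops(adjacency):
    """Count ordered triples (i, j, k) with edges i->j, j->k and i->k.

    Self-loops are ignored. Occurrences are counted as subgraphs, so
    additional edges among the three nodes do not exclude a triple.
    """
    A = np.array(adjacency, dtype=int)
    np.fill_diagonal(A, 0)
    # (A @ A)[i, k] counts two-step paths i->j->k; multiplying by A[i, k]
    # keeps only those pairs that also have the direct edge i->k.
    return int(np.sum((A @ A) * A))

def adjacency_from_edges(n_nodes, edges):
    """Build a directed adjacency matrix from a list of (source, target) pairs."""
    A = np.zeros((n_nodes, n_nodes), dtype=int)
    for source, target in edges:
        A[source, target] = 1
    return A

# Toy network (hypothetical): one feed-forward loop 0->1->2 with shortcut 0->2.
toy = adjacency_from_edges(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
n_ffl = count_feed_forward_loops(toy)

# A directed 3-cycle contains no feed-forward loop.
cycle = adjacency_from_edges(3, [(0, 1), (1, 2), (2, 0)])
n_ffl_cycle = count_feed_forward_loops(cycle)
```

Whether a pattern counts as a motif depends on comparing such counts with a suitable null ensemble of networks, which is where the statistical approach described above comes in.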

[1]   Correlated random networks,
J. Berg and M. Lässig, Phys. Rev. Lett. 89 (22), 228701 (2002).

[2]   Local graph alignment and motif search in biological networks,
J. Berg and M. Lässig, PNAS 101(41), 14689-14694 (2004).

[3]   Cross-species analysis of biological networks by Bayesian alignment,
J. Berg and M. Lässig, PNAS 103 (29), 10967-10972 (2006).

[4]   From protein interactions to functional annotation: Graph alignment in Herpes,
M. Kolář, M. Lässig, and J. Berg, BMC Systems Biology, 2:90 (2008).

Alignment of the protein interaction networks of two different herpesviruses [4]. Animation by Jörn Meier.

Evolution of regulatory interactions

Transcription factors function by binding to specific binding sites on DNA, thereby affecting the expression rate of nearby genes. These binding sites encode regulatory interactions between genes at the level of the DNA sequence. Over evolutionary timescales, binding sites can grow weaker or stronger, or disappear completely from the sequence. Higher organisms, which largely share the same set of genes, owe much of their diversity to evolutionary changes of binding sites.

A stochastic model for binding site evolution, including point mutations, genetic drift and natural selection, shows how quickly a new binding site can be generated as a response to selective pressure on a population [1]. The resulting equilibrium distribution of binding strengths agrees well with the empirical binding-site statistics found in bacterial genomes.
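The balance between mutation, drift and selection can be sketched numerically. In the weak-mutation regime of standard population genetics, the stationary distribution takes a Boltzmann-like form, P(r) ∝ P0(r) exp(2N F(r)), where P0 is the neutral distribution and the exact prefactor of F depends on the population model. The code below evaluates this for a site characterised only by its number of mismatches r to a consensus sequence. The sigmoidal fitness function and all parameter values (`r_threshold`, `softness`, `two_N_s`) are assumptions for illustration and are not taken from [1].

```python
from math import comb
import numpy as np

def neutral_distribution(length):
    """Probability of r mismatches to the consensus in a random sequence.

    Each position matches with probability 1/4 and mismatches with 3/4.
    """
    return np.array([comb(length, r) * 3.0**r for r in range(length + 1)]) / 4.0**length

def fitness(r, r_threshold, softness):
    """Hypothetical fitness landscape: high for few mismatches, low for many."""
    return 1.0 / (1.0 + np.exp((r - r_threshold) / softness))

def equilibrium_distribution(length, two_N_s, r_threshold=3.0, softness=0.5):
    """Stationary distribution P(r) ~ P0(r) * exp(two_N_s * F(r)).

    two_N_s is the scaled selection strength (population size times the
    fitness difference between a functional and a non-functional site).
    """
    r = np.arange(length + 1)
    log_w = np.log(neutral_distribution(length)) + two_N_s * fitness(r, r_threshold, softness)
    w = np.exp(log_w - log_w.max())  # subtract the maximum for numerical stability
    return w / w.sum()

# Hypothetical example: a site of length 10 under neutral evolution and under selection.
length = 10
r_values = np.arange(length + 1)
p_neutral = equilibrium_distribution(length, two_N_s=0.0)
p_selected = equilibrium_distribution(length, two_N_s=20.0)
mean_neutral = float(p_neutral @ r_values)
mean_selected = float(p_selected @ r_values)
```

Without selection the distribution is dominated by the large number of weakly binding sequences. Sufficiently strong selection shifts the weight towards sites with few mismatches, that is, towards strong binding.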

At a genome-wide level, many different regulatory networks can produce a given set of expression patterns. A simple model of gene regulation allows us to investigate the ability of regulatory networks to reproduce given expression levels [2]. We find an exponentially large space of regulatory networks compatible with a given set of expression levels, giving rise to an extensive entropy of networks.

[1]   Adaptive evolution of transcription factor binding sites,
J. Berg, S. Willmann, and M. Lässig, BMC Evolutionary Biology 4(1):42 (2004).

[2]   Adaptive gene regulatory networks,
F. Stauffer and J. Berg, EPL 88, 48004 (2009).

A population of binding sites, driven by mutations and selection for binding a transcription factor.