Formation of regulatory modules by local sequence duplication
A. Nourmohammad and M. Lässig, PLoS Comput. Biol. 7, e1002167 (12 pages), (2011)
Turnover of regulatory sequence and function is an important part of molecular evolution. But what are the modes of sequence evolution leading to rapid formation and loss of regulatory sites? Here, we show that a large fraction of neighboring transcription factor binding sites in the fly genome have formed from a common sequence origin by local duplications. This mode of evolution is found to produce regulatory information: duplications can seed new sites in the neighborhood of existing sites. Duplicate seeds evolve subsequently by point mutations, often towards binding a different factor than their ancestral neighbor sites. These results are based on a statistical analysis of 346 cis-regulatory modules in the Drosophila melanogaster genome, and a comparison set of intergenic regulatory sequences in Saccharomyces cerevisiae. In fly regulatory modules, pairs of binding sites show significantly enhanced sequence similarity up to distances of about 50 bp. We analyze these data in terms of an evolutionary model with two distinct modes of site formation: (i) evolution from independent sequence origin and (ii) divergent evolution following duplication of a common ancestor sequence. Our results suggest that pervasive formation of binding sites by local sequence duplications distinguishes the complex regulatory architecture of higher eukaryotes from the simpler architecture of unicellular organisms.
Emergent Neutrality in Adaptive Asexual Evolution
Stephan Schiffels, Gergely Szöllösi, Ville Mustonen, and Michael Lässig, Genetics, in press (2011)
In non-recombining genomes, genetic linkage can be an important evolutionary force. Linkage generates interference interactions, by which simultaneously occurring mutations affect each other’s chance of fixation. Here, we develop a comprehensive model of adaptive evolution in linked genomes. By an approximate analytical solution, we predict fixation rates of beneficial and deleterious mutations, as well as the statistics of beneficial and deleterious alleles at fixed genomic sites. We find that interference interactions generate a regime of effective neutrality: all genomic sites with selection coefficients smaller in magnitude than a characteristic threshold have nearly random fixed alleles, and both beneficial and deleterious mutations at these sites have nearly neutral fixation rates. We show that this dynamics limits not only the speed of adaptation, but also a population’s degree of adaptation in its current environment. Our model integrates interference between beneficial mutations, genetic hitchhiking of weakly selected mutations, and background selection by strongly deleterious mutations into a unified framework of interference interactions. We apply the model to different adaptive scenarios: stationary adaptation in a time-dependent environment, and approach to equilibrium in a fixed environment (as in long-term evolution experiments). In both cases, the analytical predictions are in good agreement with numerical simulations. Our results suggest that interference can severely compromise biological functions in an adapting population, which sets viability limits on adaptive evolution under linkage.
Nonlinear fitness landscape of a molecular pathway
Lilia Perfeito, Stéphane Ghozzi, Johannes Berg, Karin Schnetz, Michael Lässig, PLoS Genetics 7, e1002160 (10 pages), (2011)
Genes are regulated because their expression involves a fitness cost to the organism. The production of proteins by transcription and translation is a well-known cost factor, but the enzymatic activity of the proteins produced can also reduce fitness, depending on the internal state and the environment of the cell. Here, we map the fitness costs of a key metabolic network, the lactose utilization pathway in Escherichia coli. We measure the growth of several regulatory lac operon mutants in different environments inducing expression of the genes. We find a strikingly nonlinear fitness landscape, which depends on the production rate and on the activity rate of the lac proteins. A simple fitness model of the lac pathway, based on elementary biophysical processes, predicts the growth rate of all observed strains. The nonlinearity of fitness is explained by a feedback loop: production and activity of the lac proteins reduce growth, but growth also affects the density of these molecules. This nonlinearity has important consequences for molecular function and evolution. It generates a cliff in the fitness landscape, beyond which populations cannot maintain growth. In viable populations, there is an expression barrier of the lac genes, which cannot be exceeded in any stationary growth process. Furthermore, the nonlinearity determines how the fitness of operon mutants depends on the inducer environment. We argue that fitness nonlinearities, expression barriers, and gene-environment interactions are generic features of fitness landscapes for metabolic pathways, and we discuss their implications for the evolution of regulation.
Fitness flux and ubiquity of adaptive evolution
V. Mustonen and M. Lässig, Proc. Natl. Acad. Sci. 107, 4248-53, (2010)
Natural selection favors fitter variants in a population, but actual evolutionary processes may decrease fitness by mutations and genetic drift. How is the stochastic evolution of molecular biological systems shaped by natural selection? Here, we derive a theorem on the fitness flux in a population, defined as the selective effect of its genotype frequency changes. The fitness-flux theorem generalizes Fisher's fundamental theorem of natural selection to evolutionary processes including mutations, genetic drift, and time-dependent selection. It shows that a generic state of populations is adaptive evolution: there is a positive fitness flux resulting from a surplus of beneficial over deleterious changes. In particular, stationary nonequilibrium evolution processes are predicted to be adaptive. Under specific nonstationary conditions, notably during a decrease in population size, the average fitness flux can become negative. We show that these predictions are in accordance with experiments in bacteria and bacteriophages and with genomic data in Drosophila. Our analysis establishes fitness flux as a universal measure of adaptation in molecular evolution.
Significance analysis and statistical mechanics: an application to clustering
M. Luksza, M. Lässig, and J. Berg, Phys. Rev. Lett. 105, 220601 (4 pages), (2010)
This paper addresses the statistical significance of structures in random data: Given a set of vectors and a measure of mutual similarity, how likely does a subset of these vectors form a cluster with enhanced similarity among its elements? The computation of this cluster p-value for randomly distributed vectors is mapped onto a well-defined problem of statistical mechanics. We solve this problem analytically, establishing a connection between the physics of quenched disorder and multiple testing statistics in clustering and related problems. In an application to gene expression data, we find a remarkable link between the statistical significance of a cluster and the functional relationships between its genes.
From fitness landscapes to seascapes: non-equilibrium dynamics of selection and adaptation
V. Mustonen and M. Lässig, Trends Genet 25, 111-9, (2009)
Evolution is a quest for innovation. Organisms adapt to changing natural selection by evolving new phenotypes. Can we read this dynamics in their genomes? Not every mutation under positive selection responds to a change in selection: beneficial changes also occur at evolutionary equilibrium, repairing previous deleterious changes and restoring existing functions. Adaptation, by contrast, is viewed here as a non-equilibrium phenomenon: the genomic response to time-dependent selection. Our approach extends the static concept of fitness landscapes to dynamic fitness seascapes. It shows that adaptation requires a surplus of beneficial substitutions over deleterious ones. Here, we focus on the evolution of yeast and Drosophila genomes, providing examples where adaptive evolution can and cannot be inferred, despite the presence of positive selection.
Energy-dependent fitness: a quantitative model for the evolution of yeast transcription factor binding sites
V. Mustonen, J. Kinney, CG. Callan Jr, and M. Lässig, Proc. Natl. Acad. Sci. 105, 12376-81, (2008)
We present a genomewide cross-species analysis of regulation for broad-acting transcription factors in yeast. Our model for binding site evolution is founded on biophysics: the binding energy between transcription factor and site is a quantitative phenotype of regulatory function, and selection is given by a fitness landscape that depends on this phenotype. The model quantifies conservation, as well as loss and gain, of functional binding sites in a coherent way. Its predictions are supported by direct cross-species comparison between four yeast species. We find ubiquitous compensatory mutations within functional sites, such that the energy phenotype and the function of a site evolve in a significantly more constrained way than does its sequence. We also find evidence for substantial evolution of regulatory function involving point mutations as well as sequence insertions and deletions within binding sites. Genes lose their regulatory link to a given transcription factor at a rate similar to the neutral point mutation rate, from which we infer a moderate average fitness advantage of functional over nonfunctional sites. In a wider context, this study provides an example of inference of selection acting on a quantitative molecular trait.
From protein interactions to functional annotation: graph alignment in Herpes
M. Kolar, M. Lässig, and J. Berg, BMC Syst Biol. 2, 90, (2008)
Background: Sequence alignment is a prolific basis of functional annotation, but remains a challenging problem in the 'twilight zone' of high sequence divergence or short gene length. Here we demonstrate how information on gene interactions can help to resolve ambiguous sequence alignments. We compare two distant Herpes viruses by constructing a graph alignment, which is based jointly on the similarity of their protein interaction networks and on sequence similarity. This hybrid method provides functional associations between proteins of the two organisms that cannot be obtained from sequence or interaction data alone.
Results: We find proteins where interaction similarity and sequence similarity are individually weak, but together provide significant evidence of orthology. There are also proteins with high interaction similarity but without any detectable sequence similarity, providing evidence of functional association beyond sequence homology. The functional predictions derived from our alignment are consistent with genomic position and gene expression data.
Conclusion: Our approach shows that evolutionary conservation is a powerful filter to make protein interaction data informative about functional similarities between the interacting proteins, and it establishes graph alignment as a powerful tool for the comparative analysis of data from highly diverged species.
Molecular evolution under fitness fluctuations
V. Mustonen and M. Lässig, Phys Rev Lett. 100, 108101, (2008)
Molecular evolution is a stochastic process governed by fitness, mutations, and reproductive fluctuations in a population. Here, we study evolution where fitness itself is stochastic, with random switches in the direction of selection at individual genomic loci. As the correlation time of these fluctuations becomes larger than the diffusion time of mutations within the population, fitness changes from an annealed to a quenched random variable. We show that the rate of evolution has its maximum in the crossover regime, where both time scales are comparable. Adaptive evolution emerges in the quenched fitness regime (evidence for such fitness fluctuations has recently been found in genomic data). The joint statistical theory of reproductive and fitness fluctuations establishes a conceptual connection between evolutionary genetics and statistical physics of disordered systems.
Adaptations to fluctuating selection in Drosophila
V. Mustonen and M. Lässig, Proc. Natl. Acad. Sci. 104, 2277-82, (2007)
Time-dependent selection causes the adaptive evolution of new phenotypes, and this dynamics can be traced in genomic data. We have analyzed polymorphisms and substitutions in Drosophila, using a more sensitive inference method for adaptations than the standard population-genetic tests. We find evidence that selection itself is strongly time-dependent, with changes occurring at nearly the rate of neutral evolution. At the same time, higher than previously estimated levels of selection make adaptive responses faster than the pace of selection changes by a factor of 10-100, ensuring that adaptations are an efficient mode of evolution under time-dependent selection. The rate of selection changes is faster in noncoding DNA, i.e., the inference of functional elements can rely less on sequence conservation than it can for proteins. Our results suggest that selection acts not only as a constraint but as a major driving force of genomic change.
From biophysics to evolutionary genetics: statistical aspects of gene regulation
M. Lässig, BMC Bioinformatics 8 Suppl 6, S7, (2007)
This is an introductory review on how genes interact to produce biological functions. Transcriptional interactions involve the binding of proteins to regulatory DNA. Specific binding sites can be identified by genomic analysis, and these undergo a stochastic evolution process governed by selection, mutations, and genetic drift. We focus on the links between the biophysical function and the evolution of regulatory elements. In particular, we infer fitness landscapes of binding sites from genomic data, leading to a quantitative evolutionary picture of regulation.
Cross-species analysis of biological networks by Bayesian alignment
J. Berg and M. Lässig, Proc. Natl. Acad. Sci. 103, 10967, (2006)
Complex interactions between genes or proteins contribute a substantial part to phenotypic evolution. Here we develop an evolutionarily grounded method for the cross-species analysis of interaction networks by alignment, which maps bona fide functional relationships between genes in different organisms. Network alignment is based on a scoring function measuring mutual similarities between networks, taking into account their interaction patterns as well as sequence similarities between their nodes. High-scoring alignments and optimal alignment parameters are inferred by a systematic Bayesian analysis. We apply this method to analyze the evolution of coexpression networks between humans and mice. We find evidence for significant conservation of gene expression clusters and give network-based predictions of gene function. We discuss examples where cross-species functional relationships between genes do not concur with sequence similarity.
Freezing of random RNA
M. Lässig and K. Wiese, Phys. Rev. Lett. 96, 228101, (2006)
We study secondary structures of random RNA molecules by means of a renormalized field theory based on an expansion in the sequence disorder. We show that there is a continuous phase transition from a molten phase at higher temperatures to a low-temperature glass phase. The primary freezing occurs above the critical temperature, with local islands of stable folds forming within the molten phase. The size of these islands defines the correlation length of the transition. Our results include critical exponents at the transition and in the glass phase.
A minimal stochastic model for influenza evolution
F. Tria, M. Lässig, L. Peliti, S. Franz, J. Stat. Mech. P07008 (2005)
We introduce and discuss a minimal individual based model for influenza dynamics. The model takes into account the effects of specific immunization against viral strains, but also infectivity randomness and the presence of a short lived strain-transcending immunity recently suggested in the literature. We show by simulations that the resulting model exhibits substitution of viral strains along the years, but that their divergence remains bounded. We also show that dropping any of these features results in a drastically different behaviour, leading either to the extinction of the disease, to the proliferation of the viral strains or to their divergence.
Biodiversity and productivity in model ecosystems I: Coexistence conditions for competing species
U. Bastolla, M. Lässig, S. Manrubia, and A. Valleriani, J. Theor. Biol. 235, 521, (2005)
This is the first of two papers where we discuss the limits imposed by competition to the biodiversity of species communities. In this first paper, we study the coexistence of competing species at the fixed point of population dynamic equations. For many simple models, this imposes a limit on the width of the productivity distribution, which is more severe the more diverse the ecosystem is (1994, Theor. Popul. Biol. 45, 227-276). Here we review and generalize this analysis, beyond the "mean-field"-like approximation of the competition matrix used in previous works, and extend it to structured food webs. In all cases analysed, we obtain qualitatively similar relations between biodiversity and competition: the narrower the productivity distribution is, the more species can stably coexist. We discuss how this result, considered together with environmental fluctuations, limits the maximal biodiversity that a trophic level can host.
Biodiversity and productivity in model ecosystems II: Species assembly and food web structure
U. Bastolla, M. Lässig, S. Manrubia, and A. Valleriani, J. Theor. Biol. 235, 531, (2005)
This is the second of two papers dedicated to the relationship between population models of competition and biodiversity. Here we consider species assembly models where the population dynamics is kept far from fixed points through the continuous introduction of new species, and generalize to such models the coexistence condition derived for systems at the fixed point. The ecological overlap between species with shared prey, which we define here, provides a quantitative measure of the effective interspecies competition and of the trophic network topology. We obtain distributions of the overlap from simulations of a new model based both on immigration and speciation, and show that they are in good agreement with those measured for three large natural food webs. As discussed in the first paper, rapid environmental fluctuations, interacting with the condition for coexistence of competing species, limit the maximal biodiversity that a trophic level can host. This horizontal limitation to biodiversity is here combined with either dissipation of energy or growth of fluctuations, which in our model limit the length of food webs in the vertical direction. These ingredients yield an effective model of food webs that produces a biodiversity profile with a maximum at an intermediate trophic level, in agreement with field studies.
Evolutionary population genetics of promoters: predicting binding sites and functional phylogenies
V. Mustonen and M. Lässig, Proc. Natl. Acad. Sci. 102, 15936, (2005)
We study the evolution of transcription factor-binding sites in prokaryotes, using an empirically grounded model with point mutations and genetic drift. Selection acts on the site sequence via its binding affinity to the corresponding transcription factor. Calibrating the model with populations of functional binding sites, we verify this form of selection and show that typical sites are under substantial selection pressure for functionality: for cAMP response protein sites in Escherichia coli, the product of fitness difference and effective population size takes values 2NΔF of order 10. We apply this model to cross-species comparisons of binding sites in bacteria and obtain a prediction method for binding sites that uses evolutionary information in a quantitative way. At the same time, this method predicts the functional histories of orthologous sites in a phylogeny, evaluating the likelihood for conservation or loss or gain of function during evolution. We have performed, as an example, a cross-species analysis of E. coli, Salmonella typhimurium, and Yersinia pseudotuberculosis. Detailed lists of predicted sites and their functional phylogenies are available.
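To give a feeling for what a scaled selection strength 2NΔF of order 10 means, the following minimal sketch evaluates the standard Kimura-type ratio of substitution rate to neutral rate, σ/(1 - exp(-σ)) with σ = 2NΔF. The function name and the exact scaling convention (factor 2N rather than N or 4N) are assumptions made for illustration; they are not taken from the abstract.

```python
import math


def fixation_rate_ratio(sigma):
    """Substitution rate relative to the neutral rate.

    sigma is the scaled fitness difference (assumed here to be 2N*dF).
    Uses the Kimura-type diffusion result sigma / (1 - exp(-sigma)).
    """
    if abs(sigma) < 1e-12:
        # Neutral limit: the ratio tends to 1 as sigma -> 0.
        return 1.0
    return sigma / (1.0 - math.exp(-sigma))


if __name__ == "__main__":
    # Gain of a favored state, neutral change, and loss of a favored state.
    for sigma in (10.0, 0.0, -10.0):
        print(f"sigma = {sigma:+5.1f}  rate / neutral rate = {fixation_rate_ratio(sigma):.6g}")
```

Under this assumed convention, a mutation with σ = 10 fixes about ten times faster than a neutral one, while the reverse change (σ = -10) is suppressed by more than three orders of magnitude.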
Solvable sequence evolution models and genomic correlations
P.W. Messer, P.F. Arndt, and M. Lässig, Phys. Rev. Lett. 94, 138103, (2005)
We study a minimal model for genome evolution whose elementary processes are single site mutation, duplication and deletion of sequence regions, and insertion of random segments. These processes are found to generate long-range correlations in the composition of letters as long as the sequence length is growing; i.e., the combined rates of duplications and insertions are higher than the deletion rate. For constant sequence length, on the other hand, all initial correlations decay exponentially. These results are obtained analytically and by simulations. They are compared with the long-range correlations observed in genomic DNA, and the implications for genome evolution are discussed.
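The four elementary processes named above can be sketched as a simple stochastic simulation. This is only an illustration under stated assumptions: the function `evolve`, the rate dictionary, the uniform choice of segment lengths up to `max_len`, and the tandem placement of duplicated segments are all hypothetical choices, not the model definition of the paper, and the sketch does not measure correlations.

```python
import random

ALPHABET = "ACGT"
EVENTS = ("mutation", "duplication", "deletion", "insertion")


def evolve(seq, steps, rates, max_len=4, rng=None):
    """Apply `steps` random events to `seq` and return the new sequence.

    rates maps each event name in EVENTS to a non-negative relative rate.
    """
    rng = rng or random.Random(0)
    seq = list(seq)
    weights = [rates[name] for name in EVENTS]
    for _ in range(steps):
        if not seq:
            break  # nothing left to act on
        event = rng.choices(EVENTS, weights=weights)[0]
        pos = rng.randrange(len(seq))
        length = rng.randint(1, max_len)
        if event == "mutation":
            # Redraw one letter (it may come out unchanged).
            seq[pos] = rng.choice(ALPHABET)
        elif event == "duplication":
            # Copy a segment and place the copy directly after it.
            segment = seq[pos:pos + length]
            end = pos + len(segment)
            seq[end:end] = segment
        elif event == "deletion":
            del seq[pos:pos + length]
        else:
            # Insert a segment of random letters.
            seq[pos:pos] = [rng.choice(ALPHABET) for _ in range(length)]
    return "".join(seq)


if __name__ == "__main__":
    growing = {"mutation": 1.0, "duplication": 1.0, "deletion": 0.5, "insertion": 1.0}
    result = evolve("ACGT" * 5, 500, growing, rng=random.Random(42))
    print(len(result), result[:60])
```

With duplication and insertion rates that together exceed the deletion rate, the sequence length grows on average, which is the regime in which the abstract reports long-range correlations.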
Toward an accurate statistics of gapped alignments
M. Kschischo, M. Lässig, and Y.-K. Yu, Bull. Math. Biol. 67, 169, (2005)
Sequence alignment has been an invaluable tool for finding homologous sequences. The significance of the homology found is often quantified statistically by p-values. Theory for computing p-values exists for gapless alignments [Karlin and Altschul 1990, Karlin and Dembo 1992], but a full generalization to alignments with gaps is not yet complete. We present a unified statistical analysis of two common sequence comparison algorithms: maximum-score (Smith-Waterman) alignments and their generalized probabilistic counterparts, including maximum-likelihood alignments and hidden Markov models. The most important statistical characteristic of these algorithms is the distribution function of the maximum score S_max or, respectively, the maximum free energy F_max, for mutually uncorrelated random sequences. This distribution is known empirically to be of the Gumbel form with an exponential tail P(S_max > x) ≈ exp(-λx) for maximum-score alignment and P(F_max > x) ≈ exp(-λx) for some classes of probabilistic alignment. We derive an exact expression for λ for particular probabilistic alignments. This result is then used to obtain accurate λ values for generic probabilistic and maximum-score alignments. Although the result demonstrated uses a simple match-mismatch scoring system, it is expected to be a good starting point for more general scoring functions.
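For orientation, the decay parameter λ of the gapless case mentioned above can be computed directly: in Karlin-Altschul theory it is the positive solution of Σ p_a p_b exp(λ s_ab) = 1. The sketch below solves this equation by bisection for a match-mismatch score with uniform letter frequencies. It covers only the gapless theory, not the gapped result of the paper, and the function name and default parameters are assumptions for illustration.

```python
import math


def gapless_lambda(match=1.0, mismatch=-1.0, alphabet_size=4):
    """Positive root of p_match*exp(l*match) + p_mismatch*exp(l*mismatch) = 1.

    Assumes uniform, independent letter frequencies and requires a negative
    expected score per aligned pair (the local-alignment regime).
    """
    p_match = 1.0 / alphabet_size
    p_mismatch = 1.0 - p_match
    if p_match * match + p_mismatch * mismatch >= 0:
        raise ValueError("expected score must be negative")

    def f(lam):
        return p_match * math.exp(lam * match) + p_mismatch * math.exp(lam * mismatch) - 1.0

    # f is convex with f(0) = 0 and f'(0) < 0, so it is negative just above
    # zero and crosses zero exactly once for positive lambda.
    lo, hi = 1e-9, 1.0
    while f(hi) <= 0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(gapless_lambda())  # analytic value for these defaults is ln 3
```

For the default scores (+1 for a match, -1 for a mismatch, four equiprobable letters) the equation reduces to a quadratic in exp(λ) with the nontrivial solution exp(λ) = 3.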
Universality of long-range correlations in expansion-randomization systems
P.W. Messer, M. Lässig, and P.F. Arndt, J. Stat. Mech., P10004, (2005)
We study the stochastic dynamics of sequences evolving by single site mutations, segmental duplications, deletions, and random insertions. These processes are relevant for the evolution of genomic DNA. They define a universality class of non-equilibrium 1D expansion–randomization systems with generic stationary long-range correlations in a regime of growing sequence length. We obtain explicitly the two-point correlation function of the sequence composition and the distribution function of the composition bias in sequences of finite length. The characteristic exponent χ of these quantities is determined by the ratio of two effective rates, which are explicitly calculated for several specific sequence evolution dynamics of the universality class. Depending on the value of χ, we find two different scaling regimes, which are distinguished by the detectability of the initial composition bias. All analytic results are accurately verified by numerical simulations. We also discuss the non-stationary build-up and decay of correlations, as well as more complex evolutionary scenarios, where the rates of the processes vary in time. Our findings provide a possible example for the emergence of universality in molecular biology.
Adaptive evolution of transcription factor binding sites
J. Berg, S. Willmann, and M. Lässig, BMC Evol. Biol. 4, 42, (2004)
Background
The regulation of a gene depends on the binding of transcription factors to specific sites located in the regulatory region of the gene. The generation of these binding sites and of cooperativity between them are essential building blocks in the evolution of complex regulatory networks. We study a theoretical model for the sequence evolution of binding sites by point mutations. The approach is based on biophysical models for the binding of transcription factors to DNA. Hence we derive empirically grounded fitness landscapes, which enter a population genetics model including mutations, genetic drift, and selection.
Results
We show that the selection for factor binding generically leads to specific correlations between nucleotide frequencies at different positions of a binding site. We demonstrate the possibility of rapid adaptive evolution generating a new binding site for a given transcription factor by point mutations. The evolutionary time required is estimated in terms of the neutral (background) mutation rate, the selection coefficient, and the effective population size.
Conclusions
The efficiency of binding site formation is seen to depend on two joint conditions: the binding site motif must be short enough and the promoter region must be long enough. These constraints on promoter architecture are indeed seen in eukaryotic systems. Furthermore, we analyse the adaptive evolution of genetic switches and of signal integration through binding cooperativity between different sites. Experimental tests of this picture involving the statistics of polymorphisms and phylogenies of sites are discussed.
Local graph alignment and motif search in biological networks
J. Berg and M. Lässig, Proc. Natl. Acad. Sci. 101, 14689, (2004)
Interaction networks are of central importance in postgenomic molecular biology, with increasing amounts of data becoming available by high-throughput methods. Examples are gene regulatory networks or protein interaction maps. The main challenge in the analysis of these data is to read off biological functions from the topology of the network. Topological motifs, i.e., patterns occurring repeatedly at different positions in the network, have recently been identified as basic modules of molecular information processing. In this article, we discuss motifs derived from families of mutually similar but not necessarily identical patterns. We establish a statistical model for the occurrence of such motifs, from which we derive a scoring function for their statistical significance. Based on this scoring function, we develop a search algorithm for topological motifs called graph alignment, a procedure with some analogies to sequence alignment. The algorithm is applied to the gene regulation network of Escherichia coli.
Of statistics and genomes
D. Tautz and M. Lässig, Trends in Genetics 20, 344, (2004)
Higher organisms have more genes and larger genomes than simple organisms. This statement sounds almost too trivial to ask the question: why? But there are at least two different answers. Either there is an inherent necessity to increase genome size when more complexity is required or genome size increases because of other reasons that then enable complexity to "latch on". Recently, an article by Lynch and Conery, which used arguments of evolutionary population dynamics, proposed that low population size leads to larger genomes. This then provides the opportunity to generate more complex organisms.
Structure and evolution of protein networks: a statistical model for link dynamics and gene duplications
J. Berg, M. Lässig, and A. Wagner, BMC Evol. Biol. 4, 51, (2004)
Background
The structure of molecular networks derives from dynamical processes on evolutionary time scales. For protein interaction networks, global statistical features of their structure can now be inferred consistently from several large-throughput datasets. Understanding the underlying evolutionary dynamics is crucial for discerning random parts of the network from biologically important properties shaped by natural selection.
Results
We present a detailed statistical analysis of the protein interactions in Saccharomyces cerevisiae based on several large-throughput datasets. Protein pairs resulting from gene duplications are used as tracers into the evolutionary past of the network. From this analysis, we infer rate estimates for two key evolutionary processes shaping the network: (i) gene duplications and (ii) gain and loss of interactions through mutations in existing proteins, which are referred to as link dynamics. Importantly, the link dynamics is asymmetric, i.e., the evolutionary steps are mutations in just one of the binding partners. The link turnover is shown to be much faster than gene duplications. Both processes are assembled into an empirically grounded, quantitative model for the evolution of protein interaction networks.
Conclusions
According to this model, the link dynamics is the dominant evolutionary force shaping the statistical structure of the network, while the slower gene duplication dynamics mainly affects its size. Specifically, the model predicts (i) a broad distribution of the connectivities (i.e., the number of binding partners of a protein) and (ii) correlations between the connectivities of interacting proteins, a specific consequence of the asymmetry of the link dynamics. Both features have been observed in the protein interaction network of S. cerevisiae.
Evolutionary games and quasispecies
F. Tria, M. Lässig, L. Peliti, Europhys. Lett. 62, 446 (2003)
We discuss a population of sequences subject to mutations and frequency-dependent selection, where the fitness of a sequence depends on the composition of the entire population. This type of dynamics is crucial to understand, for example, the coupled evolution of different strands in a viral population. Mathematically, it takes the form of a reaction-diffusion problem that is nonlinear in the population state. In our model system, the fitness is determined by a simple mathematical game, the hawk-dove game. The stationary population distribution is found to be a quasispecies with properties different from those which hold in fixed fitness landscapes.
Stochastic evolution of transcription factor binding sites
J. Berg and M. Lässig, Biophysics (Moscow) 48, Suppl. 1 (2003)
A key step in the process of genetic transcription is the binding of one or several transcription factors to specific sites in the regulatory region of a gene. These binding sites may differ strongly across even closely related species, and the generation of new binding sites is an essential part of the evolution of regulatory networks. In this paper we consider the sequence evolution of binding sites, using empirically grounded fitness landscapes. We demonstrate how a new binding site for a given transcription factor may be generated de novo, and estimate the time required for this process in terms of the neutral mutation rate, the selection coefficient, and the effective population size. We also consider how several sites binding to the same type of factor can coexist in the regulatory region of a gene.
Correlated Random Networks
J. Berg and M. Lässig, Phys. Rev. Lett. 89, 228701, (2002)
We develop a statistical theory of networks. A network is a set of vertices and links given by its adjacency matrix c, and the relevant statistical ensembles are defined in terms of a partition function Z = Σ_c exp[-βH(c)]. The simplest cases are uncorrelated random networks such as the well-known Erdős-Rényi graphs. Here we study more general interactions H(c) which lead to correlations, for example, between the connectivities of adjacent vertices. In particular, such correlations occur in optimized networks described by partition functions in the limit β → ∞. They are argued to be a crucial signature of evolutionary design in biological networks.
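The partition function above can be evaluated by brute force for very small networks, which makes the distinction between uncorrelated and correlated ensembles concrete. In the sketch below, `link_cost` (an energy proportional to the number of links) reproduces an Erdős-Rényi-type ensemble, while `degree_product_energy` is a purely hypothetical example of an interaction that couples the connectivities of adjacent vertices. Neither Hamiltonian, nor any function name or parameter value, is taken from the paper.

```python
import itertools
import math


def partition_function(n, beta, hamiltonian):
    """Sum exp(-beta * H) over all undirected simple graphs on n vertices."""
    pairs = list(itertools.combinations(range(n), 2))
    z = 0.0
    for bits in itertools.product((0, 1), repeat=len(pairs)):
        links = [pair for pair, bit in zip(pairs, bits) if bit]
        z += math.exp(-beta * hamiltonian(n, links))
    return z


def link_cost(n, links, mu=1.0):
    """Uncorrelated case: each link costs mu, independently of all others."""
    return mu * len(links)


def degree_product_energy(n, links, mu=1.0, j=0.5):
    """Hypothetical correlated case: reward links between well-connected vertices."""
    degree = [0] * n
    for a, b in links:
        degree[a] += 1
        degree[b] += 1
    return mu * len(links) - j * sum(degree[a] * degree[b] for a, b in links)


if __name__ == "__main__":
    n, beta = 4, 1.0
    z_free = partition_function(n, beta, link_cost)
    # With independent links the sum factorizes over the n(n-1)/2 vertex pairs.
    z_exact = (1.0 + math.exp(-beta)) ** (n * (n - 1) // 2)
    print(z_free, z_exact)
    print(partition_function(n, beta, degree_product_energy))
```

The enumeration grows as 2^(n(n-1)/2), so it is only practical for a handful of vertices; it serves as a check of the definition, not as a method for realistic network sizes.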
Delocalization transitions of semiflexible manifolds
R. Bundschuh and M. Lässig, Phys. Rev. E65, 61502, (2002)
Semiflexible manifolds such as fluid membranes or semiflexible polymers undergo delocalization transitions if they are subject to attractive interactions. We study manifolds with short-ranged interactions by field-theoretic methods based on the operator product expansion of local interaction fields. We apply this approach to manifolds in a random potential. Randomness is always relevant for fluid membranes, while for semiflexible polymers there is a first-order transition to the strong coupling regime at a finite temperature.
Dynamics and topology of species networks
U. Bastolla, M. Lässig, S. Manrubia, and A. Valleriani, in Biological Evolution and Statistical Physics, ed. M. Lässig and A. Valleriani, Springer Verlag (2002) (refereed)
We study communities formed by a large number of species, which are an example of dynamical networks in biology. Interactions between species, such as prey-predator relationships and mutual competition, define the links of these networks. They also govern the dynamics of their population sizes. This dynamics acts as a selection mechanism, which can lead to the extinction of species. Adaptive changes of the interactions or the generation of new species involve random mutations as well as selection. We show how this dynamics determines key topological characteristics of species networks. The results are in agreement with observations.
Spatio-Temporal Modes of Speciation
M. Rost and M. Lässig, in Biological Evolution and Statistical Physics, ed. M. Lässig and A. Valleriani, Springer Verlag (2002), (refereed)
The split of a population into two reproductively isolated subpopulations is studied within a model including spatial heterogeneity. We find three dynamical pathways of speciation resulting from a coupling of space, competition and mating behaviour: (i) sympatric at small habitat heterogeneity, (ii) sympatric with subsequent spatial differentiation at intermediate heterogeneity, and (iii) allopatric under strong heterogeneity.
Diversity patterns from ecological models at dynamical equilibrium
U. Bastolla, M. Lässig, S. Manrubia, and A. Valleriani, J. Theor. Biol. 212, 11, (2001)
We study a dynamic model of ecosystems where an immigration flux assembles the species community and maintains its biodiversity. This framework is particularly relevant for insular ecosystems. Population dynamics is represented either as an individual-based model or as a set of deterministic equations for population abundances. Local extinctions and immigrations balance at a statistically stationary state where biodiversity fluctuates around a constant mean value. We find a number of scaling laws characterizing this stationary state. In particular, the number of species increases as a power law of the immigration rate. With additional assumptions on the immigration flux, we obtain species-area relationships in agreement with observations for archipelagos. We also find power-law distributions for species abundances and lifetimes.
Shape of ecological networks
M. Lässig, U. Bastolla, S. Manrubia, and A. Valleriani, Phys. Rev. Lett. 86, 4418, (2001)
We study the statistics of ecosystems with a variable number of coevolving species. The species interact in two ways: by prey-predator relationships and by direct competition with similar kinds. The interaction coefficients change slowly through successful adaptations and speciations. They are treated as quenched random variables. These interactions determine long-term topological features of the species network, which are found to agree with those of biological systems.
Dynamical anomalies and intermittency in Burgers turbulence
M. Lässig, Phys. Rev. Lett. 84, 2618, (2000)
We analyze the field theory of fully developed Burgers turbulence. Its key elements are shock fields, which characterize the singularity statistics of the velocity field. The shock fields enter an operator product expansion describing intermittency. The latter is found to be constrained by dynamical anomalies expressing finite dissipation in the inviscid limit. The link between dynamical anomalies and intermittency is argued to be important in a wider context of turbulence.
Finite-temperature sequence alignment
M. Kschischo and M. Lässig, Pacific Symposium on Biocomputing 5, (2000) (refereed)
We develop a statistical theory of probabilistic sequence alignments derived from a 'thermodynamic' partition function at finite temperature. Such alignments are a generalization of those obtained from information-theoretic approaches. Finite-temperature statistics can be used to characterize the significance of an alignment and the reliability of its single element pairs.
Scaling laws and similarity detection in sequence alignment with gaps
D. Drasdo, T. Hwa, and M. Lässig, J. Comput. Biol. 7, 115, (2000)
We study the problem of similarity detection by sequence alignment with gaps, using a recently established theoretical framework based on the morphology of alignment paths. Alignments of sequences without mutual correlations are found to have scale-invariant statistics. This is the basis for a scaling theory of alignments of correlated sequences. Using a simple Markov model of evolution, we generate sequences with well-defined mutual correlations and quantify the fidelity of an alignment in an unambiguous way. The scaling theory predicts the dependence of the fidelity on the alignment parameters and on the statistical evolution parameters characterizing the sequence correlations. Specific criteria for the optimal choice of alignment parameters emerge from this theory. The results are verified by extensive numerical simulations.
Semiflexible polymers with attractive interactions
R. Bundschuh, M. Lässig, and R. Lipowsky, Eur. Phys. J. B 3, 295, (2000)
The delocalization and unbinding transitions of two semi-flexible polymers which experience attractive interactions are studied by a variety of theoretical methods. In two-dimensional systems, one has to distinguish four different universality classes for the interaction potentials. In particular, the delocalization transitions from a potential well and the unbinding transitions from such a well in the presence of a hard wall exhibit distinct critical behavior governed by different critical exponents. In three-dimensional systems, we predict first-order transitions with a jump in the energy density but with critical or self-similar fluctuations leading to distribution functions with power law tails. The predicted critical behavior is confirmed numerically by transfer matrix calculations in two dimensions and by Monte Carlo simulations in three dimensions. This behavior should be accessible to experiments on biopolymers such as actin filaments or microtubuli.
Optimizing Smith-Waterman alignments
R. Olsen, T. Hwa, and M. Lässig, Pacific Symposium on Biocomputing 4, 302 (1999) (refereed)
Mutual correlation between segments of DNA or protein sequences can be detected by Smith-Waterman local alignments. We present a statistical analysis of alignment of such sequences, based on a recent scaling theory. A new fidelity measure is introduced and shown to capture the significance of the local alignment, i.e., the extent to which the correlated subsequences are correctly identified. It is demonstrated how the fidelity may be optimized in the space of penalty parameters using only the alignment score data of a single sequence pair.
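For readers unfamiliar with the underlying algorithm, the following is a minimal sketch of Smith-Waterman local alignment scoring with a linear gap penalty. The scoring parameters (match, mismatch, gap) are illustrative defaults, not values from the paper, and the fidelity measure discussed above is not implemented here.

```python
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best local alignment score of strings a and b (linear gap cost)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + s,    # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("GGACGTCC", "TTACGTAA"))
```

The zero in the maximum is what makes the alignment local: any prefix with negative score is discarded. Varying `mismatch` and `gap` is the kind of penalty-parameter scan the abstract refers to.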
A statistical theory of sequence alignment with gaps
D. Drasdo, T. Hwa, and M. Lässig, Proceedings of the sixth international conference on intelligent systems for molecular biology (ISMB 98), AAAI Press, Menlo Park (1998) (refereed)
A statistical theory of local alignment algorithms with gaps is presented. Both the linear and logarithmic phases, as well as the phase transition separating the two phases, are described in a quantitative way. Markov sequences without mutual correlations are shown to have scale-invariant alignment statistics. Deviations from scale invariance indicate the presence of mutual correlations detectable by alignment algorithms. Conditions are obtained for the optimal detection of a class of mutual sequence correlations.
On growth, disorder, and field theory (review article)
M. Lässig, J. Phys. C 10, 9905 (1998)
This article reviews recent developments in statistical field theory far from equilibrium. It focuses on the Kardar-Parisi-Zhang equation of stochastic surface growth and its mathematical relatives, namely the stochastic Burgers equation in fluid mechanics and directed polymers in a medium with quenched disorder. At strong stochastic driving - or at strong disorder, respectively - these systems develop non-perturbative scale invariance. Presumably exact values of the scaling exponents follow from a self-consistent asymptotic theory. This theory is based on the concept of an operator product expansion formed by the local scaling fields. The key difference from standard Lagrangian field theory is the appearance of a dangerous irrelevant coupling constant generating dynamical anomalies in the continuum limit.
Optimal detection of sequence similarity by local alignment
T. Hwa and M. Lässig, Proceedings of the second annual conference on computational molecular biology (RECOMB 98), ACM Press, New York (1998) (refereed)
The statistical properties of local alignment algorithms with gaps are analyzed theoretically for uncorrelated and correlated random sequences. In the vicinity of the log-linear phase transition, the statistics of alignment with gaps is shown to be characteristically different from that of gapless alignment. The optimal scores obtained for uncorrelated sequences obey certain robust scaling laws. Deviation from these scaling laws signals sequence homology, and can be used to guide the empirical selection of scoring parameters for the optimal detection of sequence similarities. This can be accomplished in a computationally efficient way by using a novel approach focusing on the score profiles. Furthermore, by assuming a few gross features characterizing the statistics of underlying sequence-sequence correlations, quantitative criteria are obtained for the choice of optimal scoring parameters: Optimal similarity detection is most likely to occur in a region close to the log side of the log-linear phase transition.
Quantized scaling of growing surfaces
M. Lässig, Phys. Rev. Lett. 80, 2366 (1998)
The Kardar-Parisi-Zhang universality class of stochastic surface growth is studied by exact field-theoretic methods. From previous numerical results, a few qualitative assumptions are inferred. In particular, height correlations should satisfy an operator product expansion and, unlike the correlations in a turbulent fluid, exhibit no multiscaling. These properties impose a quantization condition on the roughness exponent χ and the dynamic exponent z. Hence the exact values χ = 2/5, z = 8/5 for two-dimensional and χ = 2/7, z = 12/7 for three-dimensional surfaces are derived.
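As a quick consistency check, the quoted exponent pairs can be tested against the standard Kardar-Parisi-Zhang scaling relation χ + z = 2. This relation is a well-known consequence of the Galilean invariance of the equation; it is brought in here as background and is not stated in the abstract.

```latex
\begin{align*}
\chi + z &= 2 && \text{(KPZ scaling relation)} \\
\tfrac{2}{5} + \tfrac{8}{5} &= \tfrac{10}{5} = 2 && \text{(two-dimensional surfaces)} \\
\tfrac{2}{7} + \tfrac{12}{7} &= \tfrac{14}{7} = 2 && \text{(three-dimensional surfaces)}
\end{align*}
```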
Reply on a Comment to Upper critical dimension of the Kardar-Parisi-Zhang equation
H. Kinzelbach and M. Lässig, Phys. Rev. Lett. 80, 889 (1998)
DNA sequence alignment and critical phenomena
D. Drasdo, T. Hwa, and M. Lässig, in Statistical Mechanics in Physics and Biology, ed. D. Wirtz et al., Boston (1997) (refereed)
Alignment algorithms are commonly used to detect and quantify similarities between DNA sequences. We study these algorithms in the framework of a recent theory viewing similarity detection as a geometrical critical phenomenon of directed random walks. We show that the roughness of these random walks governs the fidelity of an alignment, i.e., its ability to capture the correlations between the sequences compared. Criteria for the optimization of alignment algorithms emerge from this theory.
Upper critical dimension of the Kardar-Parisi-Zhang equation
M. Lässig and H. Kinzelbach, Phys. Rev. Lett. 78, 903, (1997)
The strong-coupling regime of Kardar-Parisi-Zhang surface growth driven by short-ranged noise is shown to have an upper critical dimension d_> less than or equal to four [where the dynamic exponent z takes the value z(d_>) = 2]. To derive this, we use the mapping onto directed polymers with quenched disorder. Two such polymers coupled by a small contact attraction of strength u are shown to form a bound state at all temperatures 1/β ≤ 1/β_c, the roughening temperature of a single polymer. Comparing singularities of the (de-)localization transition at u = 0 below 1/β_c and at 1/β_c then yields d_> ≤ 4.
Comment on "Simplest possible self-organized critical system"
R. Bundschuh and M. Lässig, Phys. Rev. Lett. 77, 4273, (1996)
Directed polymers in high dimensions
R. Bundschuh and M. Lässig, Phys. Rev. E 54, 304, (1996)
We study directed polymers subject to a quenched random potential in d transversal dimensions. This system is closely related to the Kardar-Parisi-Zhang equation of nonlinear stochastic growth. By a careful analysis of the perturbation theory we show that physical quantities develop singular behavior for d→4. For example, the universal finite-size amplitude of the free energy at the roughening transition is proportional to √(4−d). This shows that the dimension d=4 plays a special role for the Kardar-Parisi-Zhang problem.
Similarity detection and localization
T. Hwa and M. Lässig, Phys. Rev. Lett. 76, 2591, (1996)
The detection of similarities between long DNA and protein sequences is studied using concepts of statistical physics. It is shown that mutual similarities can be detected by sequence alignment methods only if their amount exceeds a threshold value. The onset of detection is a critical phase transition viewed as a localization-delocalization transition. The fidelity of the alignment is the order parameter of that transition; it leads to criteria to select optimal alignment parameters.
Vicinal surfaces and the Calogero-Sutherland model
M. Lässig, Phys. Rev. Lett. 77, 526, (1996)
A miscut (vicinal) crystal surface can be regarded as an array of meandering but noncrossing steps. Interactions between the steps are shown to induce a faceting transition of the rough surface between a homogeneous Tomonaga-Luttinger liquid state and a low-temperature regime of local step clusters in coexistence with ideal facets. This morphological transition is governed by a hitherto neglected critical line of the well-known Calogero-Sutherland model. Its exact solution yields expressions for measurable quantities that compare favorably with recent experiments on Si surfaces.
Depinning in a random medium
H. Kinzelbach and M. Lässig, J. Phys. A 28, 6535, (1995)
We develop a renormalized continuum field theory for a directed polymer interacting with a random medium and a single extended defect. The renormalization group is based on the operator algebra of the pinning potential; it has novel features due to the breakdown of hyperscaling in a random system. There is a second-order transition between a localized and a delocalized phase of the polymer; we obtain analytic results on its critical pinning strength and scaling exponents. Our results are directly related to spatially inhomogeneous Kardar-Parisi-Zhang surface growth.
Interacting flux lines in a random medium
H. Kinzelbach and M. Lässig, Phys. Rev. Lett. 75, 2208, (1995)
We study the continuum field theory for an ensemble of directed lines r_i(t) in 1+d′ dimensions that live in a medium with quenched point disorder and interact via short-range pair forces gΨ(r_i − r_j). In the strong-disorder (or low-temperature) regime, attractive forces generate a bound state with localization length ξ⊥ ∼ |g|^(−ν⊥); repulsive forces lead to mutual avoidance with a pair distribution function P(r_i − r_j) ∼ |r_i − r_j|^θ reminiscent of fermions. In the experimentally important dimension d′=2, we obtain ν⊥≈0.8 and θ≈0.4.
On the renormalization of the Kardar-Parisi-Zhang equation
M. Lässig, Nucl. Phys. B448, 559, (1995)
The Kardar-Parisi-Zhang (KPZ) equation of nonlinear stochastic growth in d dimensions is studied using the mapping onto a system of directed polymers in a quenched random medium. The polymer problem is renormalized exactly in a minimally subtracted perturbation expansion about d=2. For the KPZ roughening transition in dimensions d>2, this renormalization group yields the dynamic exponent z⋆=2 and the roughness exponent χ⋆=0, which are exact to all orders in ε≡(2−d)/2. The expansion becomes singular in d=4, which is hence identified with the upper critical dimension of the KPZ equation. The implications of this perturbation theory for the strong-coupling phase are discussed. In particular, it is shown that the correlation functions and the coupling constant defined in minimal subtraction develop an essential singularity at the strong-coupling fixed point.
Strongly inhomogeneous surface growth and directed polymers
H. Kallabis and M. Lässig, Phys. Rev. Lett. 75, 1578, (1995)
We study nonlinear surface growth driven by spatially localized noise, a model that can be mapped onto directed polymers with random contact interactions. These systems are asymptotically free and show nonperturbative strong-coupling behavior on large scales in one dimension; hence they are possibly the simplest examples with these properties. The strong-coupling regime represents new universality classes of directed growth and of polymer delocalization transitions, which we analyze in detail.
Bundles of interacting strings in two dimensions
C. Hiergeist, M. Lässig, and R. Lipowsky, Europhys. Lett. 28, 103, (1994)
Bundles of strings which interact via short-ranged pair potentials are studied in two dimensions. The corresponding transfer matrix problem is solved analytically for arbitrary string number N by Bethe ansatz methods. Bundles consisting of N identical strings exhibit a unique unbinding transition. If the string bundle interacts with a hard wall, the bundle may unbind from the wall via a unique transition or a sequence of N successive transitions. In all cases, the critical exponents are independent of N and the density profile of the strings exhibits a scaling form that approaches a mean-field profile in the limit of large N.
New criticality of 1D fermions
M. Lässig, Phys. Rev. Lett. 73, 561, (1994)
One-dimensional massive quantum particles [or (1 + 1)-dimensional random walks] with short-ranged multiparticle interactions are studied by exact renormalization group methods. With repulsive pair forces, such particles are known to scale as free fermions. With finite m-body forces (m=3,4,…), a critical instability is found, indicating the transition to a fermionic bound state. These unbinding transitions represent new universality classes of interacting fermions relevant to polymer and membrane systems. Implications for massless fermions, e.g., in the Hubbard model, are also noted.
Critical roughening of interfaces: A new class of renormalizable field theories
M. Lässig and R. Lipowsky, Phys. Rev. Lett. 70, 1131, (1993)
A renormalizable field theory is developed for (multi)critical roughening of interacting interfaces in systems of dimension d<3. There is an infinite hierarchy of universality classes that mirrors the series of multicritical points in Ising systems. The relevant operator algebra of these theories is built up by local scaling fields that are singular distributions of the basic field variable. Critical indices, e.g., the exponent α of the specific heat, are obtained analytically in an ε expansion. The extension of our results to d=3 is discussed.
Multiple crossover phenomena and scale hopping in two dimensions
M. Lässig, Nucl. Phys. B 380, 601, (1992)
We study the renormalization group for nearly marginal perturbations of a minimal conformal field theory M_p with p ≫ 1. To leading order in perturbation theory, we find a unique one-parameter family of “hopping trajectories” that is characterized by a staircase-like renormalization group flow of the C-function and the anomalous dimensions and that is related to a factorizable scattering theory recently solved by Al. B. Zamolodchikov. We argue that this system is described by interactions of the form . As a function of the relevant parameter t, it undergoes a phase transition with new critical exponents simultaneously governed by all fixed points M_p, M_{p−1}, …, M_3. Integrable lattice models represent different phases of the same integrable system that are distinguished by the sign of the irrelevant parameter .
Exact universal amplitude ratios in two-dimensional systems near criticality
M. Lässig, Phys. Rev. Lett. 67, 3737, (1991)
Universal amplitude relations associated with hyperscaling are obtained exactly for several integrable perturbations of two-dimensional (multi)critical points described by minimal models. The results are confirmed numerically and it is discussed how they can be verified by experiment.
Finite-size effects in theories with factorizable S-matrices
M. Lässig and M.J. Martins, Nucl. Phys. B354, 666, (1991)
We study the energy spectrum of (1 + 1)-dimensional perturbed conformal field theories defined on the cylinder. The finite-size dependence of the two-particle levels allows a direct numerical measurement of the elastic S-matrix which we compare with the conjectured minimal S-matrix for several perturbations of minimal models. We discuss the simplifications that integrability imposes on the spectrum above threshold. In particular, the ultraviolet limit of the elastic phase shift of two lightest particles is related in a simple way to scaling dimensions of the conformal field theory.
Hilbert space and structure constants of descendant fields in two-dimensional conformal theories
M. Lässig and G. Mussardo, Comput. Phys. Comm. 66, 71, (1991)
We have developed an algorithm to compute the Hilbert-space basis and the operator algebra of descendant fields for (1+1)-dimensional conformal field theories. Implemented as a Mathematica computer program, this algorithm is used to obtain nonperturbatively the spectrum of the transfer matrix theories, seen as deformations of a massless conformal theory.
New hierarchies of multicriticality in two-dimensional field theory
M. Lässig, Phys. Lett. B 278, 439, (1991)
The minimal conformal model M_{p,q}, perturbed by the relevant scaling field φ_{1,3}, is argued to undergo a crossover to the model M_{p−(q−p), q−(q−p)}, at least for large values of p/(q − p). Hence its critical manifold is nested into all manifolds of lower criticality.
The scaling region of the tricritical Ising model in two dimensions
M. Lässig, G. Mussardo and J.L. Cardy, Nucl. Phys. B348, 591 (1991)
We study the scaling region spanned by all four relevant perturbations of the tricritical Ising model in two dimensions. We analyze the spectrum of the (1 + 1)-dimensional off-critical Hamiltonian on a truncated Hilbert space, a method recently proposed by Yurov and Al. Zamolodchikov. In the phase coexistence regions the massive excitations are kink states. On the temperature-driven two-phase coexistence line, they form bound states, which we analyze for periodic as well as for twisted boundary conditions. We find a new asymmetric two-phase region driven by the subleading magnetic field. There are some indications of massless states along the crossover line to the Ising model. The effects of off-critical integrability on the spectra are also observed and discussed.
Geometry of the renormalization group, with an application in two dimensions
M. Lässig, Nucl. Phys. B334, 652, (1990)
The renormalization group is viewed as a theory of the geometry of action space. A general covariant relation between coupling constant and field renormalization is derived. As an application, the crossover between the two-dimensional minimal models M_m and M_{m−1} is calculated to two-loop order in a minimal subtraction scheme.