## Abstract

Biological gene networks appear to be dynamically robust to mutation, stochasticity, and changes in the environment and also appear to be sparsely connected. Studies with computational models, however, have suggested that denser gene networks evolve to be more dynamically robust than sparser networks. We resolve this discrepancy by showing that misassumptions about how to measure robustness in artificial networks have inadvertently discounted the costs of network complexity. We show that when the costs of complexity are taken into account, that robustness implies a parsimonious network structure that is sparsely connected and not unnecessarily complex; and that selection will favor sparse networks when network topology is free to evolve. Because a robust system of heredity is necessary for the adaptive evolution of complex phenotypes, the maintenance of frugal network complexity is likely a crucial design constraint that underlies biological organization.

## Introduction

The synthesis of genetics, development, and evolution remains a significant challenge for theoretical and empirical research. In response to this challenge, researchers (Wagner, 1996; Salazar‐Ciudad *et al*, 2000; Siegal and Bergman, 2002; Bergman and Siegal, 2003; Solé *et al*, 2003; Masel, 2004; Azevedo *et al*, 2006; Huerta‐Sanchez and Durret, 2006; Oikonomou and Cluzel, 2006; Siegal *et al*, 2006; Ciliberti *et al*, 2007a, 2007b; MacCarthy and Bergman, 2007a, 2007b; Martin and Wagner, 2008) have begun to fuse systems biology, network theory, and evolutionary theory in a new research program of evolutionary systems biology. Applying a genetic algorithm (Holland, 1992) to evolve populations of individuals that are dynamically modeled as transcriptional regulatory networks, researchers have investigated issues such as the evolution of robustness and evolvability (Wagner, 1996; Salazar‐Ciudad *et al*, 2000; Siegal and Bergman, 2002; Bergman and Siegal, 2003; Solé *et al*, 2003; Azevedo *et al*, 2006; Huerta‐Sanchez and Durret, 2006; Ciliberti *et al*, 2007a, 2007b; MacCarthy and Bergman, 2007a; Martin and Wagner, 2008), the mechanisms of genetic assimilation (Masel, 2004), the maintenance of sex (Azevedo *et al*, 2006; MacCarthy and Bergman, 2007a), the role of network topology (Oikonomou and Cluzel, 2006; Siegal *et al*, 2006), and gene duplication and subfunctionalization (MacCarthy and Bergman, 2007b).

Early computational studies evolving artificial gene regulatory networks have previously determined (Wagner, 1996; Siegal and Bergman, 2002) that more densely connected networks evolve to be more robust to perturbation than sparser networks under stabilizing selection. Under stabilizing selection, robustness is selected for implicitly by the fitness function because stabilizing selection will select against deviations from the phenotypic optimal. Consequently, this will favor genes and genotypes that tend to express the optimal phenotype given factors such as environmental noise and mutation. In most natural populations, stabilizing selection operates most of the time. Because natural selection should favor more robust networks, studies are now customarily reported using more densely connected networks (Wagner, 1996; Siegal and Bergman, 2002; Bergman and Siegal, 2003; Masel, 2004; Azevedo *et al*, 2006; Ciliberti *et al*, 2007a, 2007b; MacCarthy and Bergman, 2007a, 2007b; Martin and Wagner, 2008). However, real gene networks including *Escherichia coli*, yeast, *Arabidopsis*, *Drosophila*, and sea urchin appear to be robust, but are all sparsely connected (see Table I); furthermore, despite the differences in phylogeny, phenotypic complexity, life history, and number of genes all surprisingly show a mean of *K*=1.5–2 transcriptional regulators per gene.

This discrepancy between our theoretical understanding and empirical data needs to be resolved because we may be overlooking something important. Moreover, because network connectivity appears to play an important role in the network dynamics of these models, and because theoretical results are typically reported with dense networks (*K*>2), these results may be inapplicable, uninformative, or misleading. Indeed, results from these studies often show that the positive results trend less favorably with decreased connectivity (Wagner, 1996; Siegal and Bergman, 2002; Azevedo *et al*, 2006; MacCarthy and Bergman, 2007a; Martin and Wagner, 2008), and where data were reported for sparse networks (*K*<2.5) the theoretical conclusions were shown not to hold (Siegal and Bergman, 2002; Azevedo *et al*, 2006). This suggests that dense network connectivity may be a critical assumption for many of these studies.

Previous studies investigating the evolution of robustness (Wagner, 1996; Siegal and Bergman, 2002; Bergman and Siegal, 2003; Azevedo *et al*, 2006) measured genetic robustness by the expected effects from a single perturbation applied to a random network interaction. Because perturbations can cause the phenotype to deviate, this method effectively estimates the expected cost of a perturbation per interaction (CPPI). However, by measuring genetic robustness in this way, the addition of a frivolous interaction would seemingly increase robustness whenever its average cost of perturbation (CP) was less than the CPPI. Using this procedure, spurious complexity is awarded credit for increasing genetic robustness simply because it reduces CPPI. Rather, if anything, spurious complexity should tend to decrease robustness because it introduces an additional target for perturbations and potentially a new channel for perturbations to propagate when acting on other interactions.

To measure network robustness more appropriately, it is necessary to critically evaluate how complexity affects the network (see Supplementary information for details and derivation). Briefly, in order for an interaction to increase robustness when added to a network, the interaction must actually decrease the CP incurred by other interactions. Simply diluting the CPPI by spreading a larger total cost over a greater number of interactions does not make a network more robust. Furthermore, when an interaction is added to a network, the costs of the additional interaction must be accounted for as the network is now exposed to a new degree of freedom for perturbations can act. Thus, the benefits of an interaction must exceed its costs to be favored by selection. Failing to account for the costs of complexity, sparser networks will be penalized for efficiency.

This suggests that genetic robustness must be measured in a way that takes into account the costs of complexity. One way to account for the costs of complexity is to sum the mean CP for each interaction, to arrive at the gross CP (GCP) for a network. In this way, although the addition of spurious interactions may decrease the CPPI, robustness would still decrease as measured by an increase in GCP.

## Results and discussion

To see how the CPPI and the GCP (see Materials and methods) are affected by connectivity density, we first specify rates by which network interactions are created (α) or destroyed (ϕ), and/or modified (δ) by mutation, and then apply this to a transcriptional gene regulatory model (see Materials and methods; Supplementary information) based on previous studies (Wagner, 1996; Siegal and Bergman, 2002; Bergman and Siegal, 2003; Masel, 2004; Azevedo *et al*, 2006; Huerta‐Sanchez and Durret, 2006; Siegal *et al*, 2006; Ciliberti *et al*, 2007a, 2007b; MacCarthy and Bergman, 2007a, 2007b; Martin and Wagner, 2008). For a network of *N* nodes and *n* directed interactions (see Supplementary Figure 1), the destruction and creation of network interactions influence a network's connectivity density (*c*=*n*/*N*^{2}; where *K*=*cN*). In the absence of selection, connectivity density (*c*) changes by: *c*′=μ(α(1−*c*)−ϕ*c*)/*N*^{2} (see Supplementary information); where μ is the network's mutation rate. Solving *c*(*t*+1)=*c*(*t*), gives the expected equilibrium density:

By setting the rates at which network interactions are added (α) or destroyed (ϕ) by mutation, we can observe how the CPPI and GCP change with connectivity density when networks are initialized far from their equilibrium density.

Both sparse (*ĉ*=0.1; α=0.008, ϕ=0.07) and dense (*ĉ*=0.9; α=0.07, ϕ=0.008) treatment networks of 100 replicate populations were initialized far from equilibrium (*c*_{0}=0.5) and were evolved under stabilizing selection for 30 000 generations (see Materials and methods). Expressing the same distribution of functions at *c*(0)=0.5, dense treatment networks will be driven to *c*(*t*) → 0.9, whereas the sparse treatment will be driven to *c*(*t*) → 0.1 (see Supplementary Figure 2). According to our analysis, we expect the CPPI to steadily increase as *c*(*t*) → 0.1, and decrease as *c*(*t*) → 0.9, but that sparser networks will be more robust at any time (*t*) as measured by GCP. Fitness was scored according to an individual's phenotypic similarity to the population's founder individual; the population's ancestor whose genotype was randomly generated for each treatment and which seeded the rest of the founder population. All results were reported by sampling a single individual expressing the optimal phenotype from each replicate population. Using parameters commonly used in previous studies, for all experiments *N*=10, μ=0.1, and σ=1 is the strength of selection, unless stated otherwise.

As expected, CPPI increases (Figure 1), whereas the GCP systematically decreases (Figure 2) with decreasing connectivity density (*c*(*t*) → 0.1). Conversely, as networks evolve denser connectivity (*c*(*t*) → 0.9) CPPI systematically decreases (Figure 1), but nevertheless, the GCP is always higher (Figure 2) compared to the sparse treatment at the same time (*t*). With a smaller GCP, the sparse treatment networks simply make more efficient use of network interactions to fulfill the same function. This indicates that sparse networks are actually more robust if the costs of complexity are accounted for. If true, then evolution should seek to optimize the costs and benefits of complexity with a parsimonious network structure, a network topology that is sparsely connected and not unnecessarily complex, by seeking an optimal topological ensemble of interactions that best meets the network's functional requirements under its normal range of operating conditions.

Because selection was not permitted to evolve the network ‘topology’ (connectivity) in previous studies—only the interaction strengths—it was never demonstrated that selection would actually favor denser networks. To test whether sparse networks will be favored by selection, we initialized replicate populations at equilibrium connectivity density by specifying appropriate values for α and ϕ, and then evolved populations under stabilizing selection. If sparser networks evolve greater robustness under stabilizing selection, then we expect networks to evolve connectivity density below their equilibrium value, as determined by: *c*(*t*)<*c*(0).

Three treatments of 100 replicate populations were classified as high (*ĉ*=0.6; α=0.105, ϕ=0.07), intermediate (*ĉ*=0.5; α=0.07, ϕ=0.07), or low (*ĉ*=0.4; α=0.07, ϕ=0.105) equilibrium density, and networks were initialized to their equilibrium values (*c*_{0}=*ĉ*). As predicted, the evolutionary response (Figure 3) shows that selection favors networks below their equilibrium density (*c*(*t*)<*ĉ*). Similar results were also observed for asexual reproduction, and larger networks when sparse and dense networks evolved under competition (see Supplementary information), and we observed the same qualitative result for multicellular implementations and with the inclusion of developmental noise (not shown). Our analysis (see Supplementary information) indicates that this is a general result. In contrast to early studies (Wagner, 1996; Siegal and Bergman, 2002; Bergman and Siegal, 2003; Azevedo *et al*, 2006), our results and analyses show that selection favors sparse networks, and that misassumptions about how to measure robustness in previous studies had inadvertently discounted the costs of complexity.

We have determined that selection favors the formation of sparse and minimally complex networks and that robustness implies a parsimonious network. These results bring theory in line with empirical data, which show that biological gene networks are sparsely connected (Table I), having about *K*∼2 transcriptional regulators per gene. These considerations challenge the theoretical conclusions of previous studies, which were reported using dense networks (*K*>2), particularly when positive results often trend less favorably with decreasing *K*, and have been shown to break down for *K*⩽2.5 (see Supplementary information). Using the fixed topology model, Azevedo *et al* (2006) reported that sex ensures its own maintenance by selecting for a negative epistatic effect (deleterious mutations will act multiplicatively). When compounded deleterious mutations are concentrated in one genotype, they will be purged according to the deterministic mutation hypothesis (Kondrashov, 1988). Despite these observations, selection for negative epistasis was not observed for sparse networks (*K*⩽2.5), whereas biological gene networks have an average of *K*=1.5–2 transcriptional regulators per gene (Table I). Other studies have concluded that robustness could evolve under viability selection, rather than stabilizing selection (Siegal and Bergman, 2002). These results, however, were based on the analysis of dense networks (*K*⩾4.0), whereas less connected networks (*K*=0.1444) evolved minimal robustness (see Supplementary information, Supplementary Figure 6). A subsequent report (Bergman and Siegal, 2003) also suggested that arbitrary genes may buffer genetic variation and act as evolutionary capacitors, thereby providing a general mechanism to cryptically store hidden genetic variation. However, these results were reported with *c*⩾0.75 and *N*=10; values that are indicative of a densely connected network; nevertheless, our analyses (see Supplementary information) suggest that genes which buffer genetic variation should be highly connected, although the overall network should still be sparsely connected, and that a hierarchical or scale‐free distribution may help optimize the costs of interactions.

Our analyses also result in several theoretical and practical implications, which may provide an alternative perspective for biological systems. The competing constraints on robustness and complexity suggest that an optimal network, for a given function, might be preordained to only a few (sparse) canonical topologies (for example, see Supplementary information); and would suggest a new interpretation for homology and developmental constraints. That is, if biological networks are optimal, then robust functional networks may reside on high adaptive peaks, divided from each other by large topological distances that are impassible by neutral evolution—except perhaps by gene duplication.

Furthermore, the observation that few optimal topologies are able to satisfy a given function would support the theory (Noman and Hitoshi, 2005) that the topologies of real biological networks can be reverse‐engineered through evolutionary simulations. Previous studies support this possibility. Attempts by von Dassow *et al* (2000) to model the *Drosophila* segment polarity network from empirical data failed until two additional, previously unreported, interactions were included in their analysis. Following the addition of these factors, the network was determined to be highly robust, with the many parameters demonstrating high degrees of variation. In some instances, up to a ‘1000‐fold’ variation was observed. Indeed, when a random set of values were assigned to the 48 parameters that described the network interactions, the authors reported that ∼1 in 200 of the randomly chosen parameter sets produced the desired pattern; this lead to the conclusion that the network's topology, rather than the kinetic and biochemical details, was primarily responsible for the properties of robustness. Because the *Drosophila* segmentation network is both robust and sparse it may be possible, given sufficient computational resources, to reverse engineer this network structure with evolutionary simulations.

As selection should favor parsimonious networks, this may impose a fundamental design constraint that also drives the evolution of epigenetic processes, systems, and strategies to effectively maintain frugal network complexity despite increases in genomic complexity. Consistent with this hypothesis, it was recently proposed that RNA evolved as a regulatory layer in higher organisms (Mattick, 2004) to overcome apparent scaling limitations on complexity in (protein‐based) gene networks. DNA methylation also appears to be a mechanism that offers a way to limit potential connectivity of a gene to a subset of actual interaction partners. In another study (Zilberman *et al*, 2007), experiments showed that highly methylated genes become upregulated when methylation is lost, whereas unmethylated regions often contain highly expressed genes that are regulated by specific transcription factors. Similarly, chromatin‐remodeling proteins (Polo and Almouzni, 2006), such as the Polycomb group (PcG), are an important silencing mechanism that limits transcriptional regulators from binding to promoter sequence as mutations to members of PcG lead to homeotic transformations (Lewis, 1978). These results suggest that we need to be careful about assuming additional interaction partners without empirical evidence that shows a causal relationship. More generally, however, these examples suggest that the maintenance of frugal network complexity is a critical design constraint and may be an important principle for understanding biological organization and evolutionary design decisions.

## Materials and methods

### Network model

An individual is modeled as an interaction network of *N* transcriptional regulators encoded by an *N* × *N* matrix **W** (see Supplementary Figure 1), where *w*_{ij}∈**W** describes the regulatory influence of gene *j* on gene *i*. Interactions (*w*_{ij}≠0)∈**W** are assigned random variables [*X*∼*N*(0,1/3)] from a discrete (rounded to:1E−3 significant digits) truncated (*X*∈[−1, +1]) discontinuous (*X*∉0) normal distribution. An individual's state vector **s**(*t*)=(*s*_{1}(*t*), …, *s*_{N}(*t*)) describes gene states at time *t* as expressed, unregulated, or repressed: *s*_{i}(*t*)={+1, 0, −1}. Given the initial state vector {** s_{0}**:

*s*

_{01}=+1,

**s**

_{0}_{(i≠1)}=0} and setting

**s**(0)=

**s**, gene states change by:

_{0}

Iterating (2), an individual's phenotype is its steady‐state vector {**ŝ**: **ŝ**=**s**(*t*+1)=**s**(*t*), *t*<100}. If by **s**(100)≠**s**(99), then the genotype is unviable.

### Founder population

A population is seeded with a founder genotype whose phenotype is designated as optimal: **ŝ**^{OPT}. To produce a founder, random genotypes are generated until one expresses a phenotype: {**ŝ**^{OPT}: **ŝ**^{OPT}=**ŝ**, 0∉**ŝ**}. A random genotype is generated by randomly filling a zeroed‐matrix (**W**) with *c*_{0}*N*^{2} non‐zero values. The founder population is then seeded with *M*−1 mutant clones of the founder (each mutated by randomly selecting, then modifying, three non‐zero values).

### Fitness and selection

A Gaussian fitness‐function scores fitness with:

Here *D*(**ŝ**, **ŝ**^{OPT}) is the hamming distance between the individual and optimal phenotype. An individual's fitness gives a survival probability.

### Reproduction and mutation

By randomly mating sampled pairs, survivors populate the next generation with *M* offspring. Offspring inherit each row of its **W** equiprobably from each parent. Offspring experience mutation at the rate μ: iterating each *w*_{ij}∈**W**. A network interaction is added, deleted, or modified with conditional probability: P(α∣*w*_{ij}=0)=μα/*N*^{2}; P(ϕ∣*w*_{ij}≠0)=μϕ/*N*^{2}; P(δ∣*w*_{ij}≠0)=μδ/*N*^{2}, where ϕ=1−δ.

### Measuring the average CPPI

For a matrix **W**, the CPPI is measured as the expected deviation (scaled to unity) of the phenotype when a single non‐zero element in **W** is randomly sampled and assigned a new non‐zero value. The expected deviation is estimated by randomly sampling the effects for 5000 independent perturbations. Similarly, the CPPI can be calculated as the summation of the expected cost of each interaction (CP) averaged over the number of interactions.

### Measuring GCP

For a matrix **W**, the GCP is measured as the sum of the mean effects of perturbation on each interaction (*w*_{ij}≠0)∈**W**, or similarly GCP can be measured as the CPPI multiplied by the number of interactions in the network.

## Acknowledgements

I thank Vincent Lynch and Richard Harrington for their helpful discussions; Jeremy Draghi, Alex Urban, and David Vasseur for reviewing the paper; Rimas Vaisnys for his early input into the model description; the second reviewer for his comments on the structure; and my mentor Gunter Wagner, for his sage advice and paternal rigor. This research benefited from the Yale University Life Sciences Computing Center, and NIH grant: RR19895, which funded the instrumentation.

## Supplementary Information

Supplementary Information [msb200852-sup-0001.pdf]

## References

This is an open‐access article distributed under the terms of the Creative Commons Attribution License, which permits distribution, and reproduction in any medium, provided the original author and source are credited. This license does not permit commercial exploitation without specific permission.

- Copyright © 2008 EMBO and Nature Publishing Group