## Abstract

We present a novel method for deriving network models from molecular profiles of perturbed cellular systems. The network models aim to predict quantitative outcomes of combinatorial perturbations, such as drug pair treatments or multiple genetic alterations. Mathematically, we represent the system by a set of nodes, representing molecular concentrations or cellular processes, a perturbation vector and an interaction matrix. After perturbation, the system evolves in time according to differential equations with built‐in nonlinearity, similar to Hopfield networks, capable of representing epistasis and saturation effects. For a particular set of experiments, we derive the interaction matrix by minimizing a composite error function, aiming at accuracy of prediction and simplicity of network structure. To evaluate the predictive potential of the method, we performed 21 drug pair treatment experiments in a human breast cancer cell line (MCF7) with observation of phospho‐proteins and cell cycle markers. The best derived network model rediscovered known interactions and contained interesting predictions. Possible applications include the discovery of regulatory interactions, the design of targeted combination therapies and the engineering of molecular biological networks.

## Introduction

Our ability to measure increasingly complete and accurate molecular profiles of living cells motivates new quantitative approaches to cell biology. For example, a key aim of systems biology is to relate changes in molecular behavior to phenotypic consequences. To achieve this aim, computational models of cellular processes are extremely useful, if not essential. Computational models can be used for the analysis of experimental data, for the prediction of outcomes of unseen experiments and for planning interventions designed to modify system behavior. We have developed a particular approach to constructing, optimizing and applying computational models of cellular processes, which we call Combinatorial Perturbation‐based Interaction Analysis (CoPIA). The key ingredients of the approach are combinatorial intervention, molecular observation at multiple points, model construction in terms of nonlinear differential equations, optimization of model parameters with simplicity constraints and experimental validation.

### The power of combinatorial perturbation

In molecular biology, a targeted perturbation typically inhibits or activates function of biomolecules, e.g. as a result of drug action, small RNA interference, genetic or epigenetic change (Figure 1). In a single experiment, targeted perturbations can be applied either singly or in combination. Combined perturbation by several agents can be much more informative than that by a single agent, as its effects typically reveal downstream epistasis within the system, such as non‐additive synergistic or antagonistic interactions. In addition, a large number of independently informative experiments can be performed if in each experiment a different small set of, e.g. two or three, perturbants is chosen from a larger repertoire. Thus, combinatorial perturbations are potentially powerful investigational tools for extracting information about pathways of molecular interactions in cells (such as *A inactivates B*, or *X and Y are in the same pathway*) (Avery and Wasserman, 1992; Kaufman *et al*, 2005; Kelley and Ideker, 2005; Segre *et al*, 2005; Yeh *et al*, 2006; Lehár *et al*, 2007). Combinatorial perturbations can also be powerful application tools when rationally designed to achieve desired effects. For example, combination of targeted drugs is considered a promising strategy to improve treatment efficacy, reduce off‐target effects and/or prevent evolution drug resistance (Borisy *et al*, 2003; Keith *et al*, 2005; Komarova and Wodarz, 2005; Chou, 2006).

With recent advances in molecular technologies—e.g., targeted perturbation by small molecules, full‐genome libraries of small RNAs, highly specific antibody assays, massive parallelization and imaging techniques—there is intense interest in the investigational power of multiple perturbation experiments in a variety of biological systems. The inherent complexity of such experiments raises significant challenges in data analysis and an acute need for improving modeling approaches capable of capturing effects such as time‐dependent responses, feedback effects and nonlinear couplings.

### Deriving system models from combinatorial perturbation experiments

Computer simulation of pre‐defined pathways can be used to predict epistasis effects and explore how pathway organization shapes the perturbation response (Omholt *et al*, 2000; Segre *et al*, 2005; Lehár *et al*, 2007). In many situations however, observational data are provided but the pathway is unknown or only partially known. To solve this problem, our computational modeling approach enables users to construct a complete differential equation model for a system from combinatorial perturbation experiments. In the context of this paper, the system of interest is defined by a particular type of cell, its environment, a time interval of observation and a phenotypic change, such as cell death or growth. The system is further characterized by its points of intervention, such as drug targets, and the points of observation, such as the phosphorylation state of proteins involved in signaling processes (Figure 1). To represent such a system mathematically, we choose network models in which nodes represent molecular concentrations or levels of activity and edges reflect the influence of one node on the time derivative of another. The time evolution of the system is modeled by linear differential equations, modified by a nonlinear transfer function to reflect properties of the system that are not explicitly modeled (Figure 1). We present efficient optimization algorithms to find models that achieve maximum agreement between observation and prediction. Our algorithm is based on a combination of a gradient descent method (to set dynamical parameters) and a Monte Carlo process (to explore alternative network connectivities). We make a software implementation of CoPIA available as platform‐independent software (http://cbio.mskcc.org/copia).

### Testing the predictive power of derived system models

We perform combinatorial perturbation experiments in an MCF7 breast cancer cell line to test the modeling framework in the steady‐state limit. In this test, we demonstrate how observation of the effects of drug pair perturbations can be exploited to deduce a network model of signaling and phenotype control (reverse engineering of pathways). We use observed molecular state and growth phenotype responses to build predictive models and use these to explain the perturbation–phenotype relationship in terms of coupling between proteins in the EGFR/MAPK and PI3K/AKT pathways. Without using known pathway biology, the resulting model reproduces known regulatory couplings and negative feedback regulation downstream of EGFR and PI3K/AKT/mTOR, and makes predictions about possible roles of PKC‐δ and eIF4E in the control of MAPK signaling and G1 arrest in MCF cells.

We conclude that CoPIA may be of interest as a broadly applicable tool to construct models, discover regulatory interactions and predict cellular responses. For instance, researchers can measure a set of protein phosphorylation responses to drug combinations and use the method to automatically construct network models that predict the response to novel drug combinations. Application of this methodology to time‐dependent experimental observations would extend this predictive capability to the regimen of time‐dependent, rationally designed combinatorial therapy.

## Results

### Modeling the effects of combinatorial perturbations

#### Multiple input–multiple output models.

State space representation is commonly used in mathematical modeling of input–output behavior in natural systems. In this representation, the time behavior of the system state is described by a first‐order differential equation

where the vector *y*(*t*) represents state variables (the activities of the system's components), the vector *u*(*t*) represents perturbations (external influences on the components) and *f* is a linear or nonlinear transfer function (de Jong, 2002). For example, *y*(*t*) can be the abundances of specific mRNAs or proteins, whereas *u*(*t*) can be the concentrations of different chemical compounds to which the cells are exposed (Figure 1). In essence, state space models relate a system's input to its output. State space models with multiple inputs–outputs (that is, *y* and *u* have more than one coordinate) are called multiple input–multiple output (MIMO) models.

#### Linear MIMO models.

When *f* is a linear function of *y* and *u*, the above model is called a linear MIMO model. The mathematical properties of linear MIMO models are well known (Ljung, 1986) and such models have been applied to many biological problems, for example, the construction of transcriptional network models (Tegner *et al*, 2003; Xiong *et al*, 2004; di Bernardo *et al*, 2005). Nevertheless, linear models have a limitation in that they can only model uncoupled perturbation effects (linear dose–response relationships), whereas nonlinear effects (coupled perturbation effects) are ignored (Figure 1; ‘Model representation’). As a result, linear MIMO models are unable to capture important phenomena that are known to occur in cellular systems, such as saturation effects, switch‐like effects and nonlinear interaction phenomena such as genetic epistasis and pharmacological synergism.

#### Nonlinear MIMO models.

To overcome this limitation, we construct nonlinear MIMO models capable of representing coupled perturbation effects. Previously, other authors have observed that complex gene knockout effects, including epistasis effects, can be predicted in metabolic flux networks where bounds on the reaction rates are introduced (Fell and Small, 1986; Edwards and Palsson, 2000; Segre *et al*, 2005; Deutscher *et al*, 2006). Similarly, metabolic systems with Michaelis–Menten kinetics or transcriptional networks with bounds on transcription rates will exhibit epistasis behavior (Omholt *et al*, 2000; Lehár *et al* 2007). In the particular case of the MIMO model, we expect more biologically realistic behavior if one replaces the linear transfer function *f* with a nonlinear transfer function ϕ that imposes bounds on the rates of change of the system. Accordingly, we propose the class of models

In this class of models, the matrix *w*_{ij} represents the interactions between the molecules and processes represented by the state variables of the system. (Intuitively, the matrix elements *w*_{ij} can be thought of as a map of the system, in which *w*_{ij}>0 means ‘node *j* activates node *i*’, whereas *w*_{ij}<0 corresponds to inhibition.) Furthermore, α_{i}>0 represents the tendency of the system to return to the initial state (*y*_{i}=0); β_{i}>0 are constants and ϕ_{i} is a transfer function capable of capturing both switch‐like behavior and bounded reaction rates. Examples of such functions include sigmoid functions, piece‐wise linear approximations of sigmoids or biochemically motivated approximations such as the Hill or Michaelis–Menten equations (Materials and methods).

#### Application of nonlinear MIMO models to combinatorial perturbation experiments.

We developed computer algorithms to infer nonlinear models of the above type from experimental data, as specified by the best‐performing values of the coupling parameters *w*_{ij} and other parameters. As detailed in Materials and methods, the current implementation of our approach consists of the following steps. First, the system of interest is subjected to a set of independent single or multiple target perturbation experiments; and, for each perturbation vector (time‐independent instance of *u*), a readout vector (steady‐state instance of *y*) is recorded. Second, we infer a nonlinear model that best reproduces the experimental data (Materials and methods). Specifically, we rely on parameter estimation techniques for feedback systems to find a model that minimizes a quadratic error term between observed and predicted readouts, subject to simplicity constraints on the number of interactions in the system. Third, the fitted model can be used to predict the system's response to unseen perturbations (for example, combinations of drugs), and to gain new insight into the system's architecture.

### Testing modeling power for combinatorial perturbations in breast cancer cells

#### Dual drug perturbation experiments in MCF7 breast cancer cells.

To directly test the power of the approach, we performed an independent experimental study in MCF7 human breast carcinoma cells. As perturbants of the system, we chose compounds targeting EGFR (ZD1839), mTOR (rapamycin), MEK (PD0325901), PKC‐δ (rottlerin), PI3 kinase (LY294002) and IGF1R (A12 anti‐IGF1R inhibitory antibody). As relevant readouts of molecular and phenotypic responses, we chose phospho‐protein levels of seven regulators of survival, proliferation and protein synthesis (p‐AKT‐S473, p‐ERK‐T202/Y204, p‐MEK‐S217/S221, p‐eIF4E‐S209, p‐c‐RAF‐S289/S296/S301, p‐P70S6K‐S371 and pS6‐S235/S236) as well as flow cytometric observation of two phenotypic processes (cell cycle arrest and apoptosis) (Figure 2). Inhibitors were administered singly and in pairs, followed by EGF stimulation. When recording responses of protein phosphorylation, we used the average response at 5 and 30 min as the surrogate for steady‐state values. To build models, we represented the state of each of the above perturbation targets (signaling proteins), as well as each of the readouts, by one state variable *y*_{i}. We then used the proposed optimization procedure (Materials and methods) to estimate the coupling parameters *w*_{ij} and other parameters, resulting in predictive models of response in terms of these system variables.

#### Quantitative prediction of system response.

We first assessed the predictive power of the derived models using leave‐one‐out cross‐validation, in which one pair perturbation is left out of the analysis and then its effect predicted from information gained from all other perturbations. The resulting predictions were reasonably accurate for the nine different readouts. The best prediction was obtained for p‐S6 phospho‐protein levels (cross‐validation error CV=0.02, Pearson correlation *r*=0.96) and the weakest for the G1 arrest phenotype (CV=0.07, *r*=0.45) (Figure 2 and Supplementary Table 1). We directly compared the performance of our modeling approach to one using a corresponding set of linear differential equations with the same optimization procedure. By comparison, predictions using the nonlinear approach agreed better with experimental observations for eight of the nine readouts. Using the nonlinear modeling approach, the prediction error was lower by up to 50% with correspondingly better correlation between predictions and experimental observations (Supplementary Table 1). Thus, we conclude that our method is capable of deriving reasonably accurate network models for the input–output behavior of MCF7 cells with respect to the readouts used.

### Detection of key regulatory mechanisms without prior knowledge

From a set of perturbation experiments, how can one deduce the logical network structure of activating and inhibiting interactions between the key molecular components, similar to the familiar pathway diagrams in publications summarizing a set of molecular biological experiments? Here, we use the derived network models with the smallest global error (*E*_{total}=*E*_{SSQ}+λ*E*_{STRUCT}, Materials and methods) to infer causal connectivity diagrams. The inference is based on the assumption that interactions in sufficiently simple models that fit experimental observations, called ‘good’ models, represent an underlying causal relationship between system components modeled by the system variables *y*_{i}. Such a relationship can be either an indirect regulatory effect or a direct physical interaction that would be observable *in vitro* with purified components. Using our Monte Carlo algorithm, we generated a population of 450 good models from the MCF7 dual drug perturbation experiments. From these, we assessed the statistical significance of the individual interactions both in terms of a posterior probability (which is obtained directly from the Monte Carlo process, see Materials and methods) and a 90% confidence interval constructed by boot‐strapping simulations (Table I). We now discuss the connectivity of the best model, i.e. the one with the smallest error (schema in Figure 3, explicit equations in Materials and methods) relative to the known biology of regulatory pathways in the MCF7 breast cancer cell line.

#### Interpretation of derived network structure.

In comparing the inferred connectivity with mechanisms known to occur in MCF7 cells (Table I), two caveats are important. (1) The logical nodes in our models are defined precisely as the perturbed and observed molecular species, i.e. the targets of drug perturbation and the targets of specific observed antibody reactions, and may not be exactly identical to a single molecular species. For example, ‘EGFR’ refers to the direct target(s) of activation by EGF and of inhibition by the drug ZD1839, and these two are assumed to be identical. (2) The models make no reference to unperturbed or unobserved nodes, e.g. whereas p‐AKT is in the network model, the unphosphorylated AKT is not. With these caveats in mind, one can use the models both for confirmation and prediction of interactions. Of the 23 interactions in the best model, 14 had a posterior probability in the range of 20–99% (Table I). Of these, several statistically robust interactions clearly confirm canonical pathway structures. (i) The MAPK cascade downstream of the EGF receptor is detected as a chain of interactions between EGFR, MEK and ERK (Figure 3 and Table I). (ii) The negative feedback regulation of MAPK signaling is captured as negative interaction from ERK to EGFR, and as a moderately significant self‐inhibition of MEK (see Discussion). (iii) PI3K‐dependent signaling and the tendency for MCF7 cells to be dependent on AKT activation for survival are detected as interactions between PI3K, AKT and the apoptosis phenotype. (iv) The model inference that apoptosis is controlled by p‐AKT, but not p‐ERK, is in agreement with previous results in MCF7 cells (Simstein *et al*, 2003; DeFeo‐Jones *et al*, 2005). (v) mTOR downstream signaling is detected as interactions between mTOR, p70S6K and ribosomal S6 protein (Mingo‐Sion *et al*, 2005). The derivation of these expected interactions from a small set of perturbation experiments, without prior pathway knowledge, underscores the non‐trivial value of the model building approach and provides some confidence in the concrete predictions of logical regulatory interactions for MCF7 cells (Table I), which are discussed below.

## Discussion

In summary, our evaluation in breast cancer cells supports two main conclusions. First, our approach to model construction can be used to build reasonably accurate quantitative predictors of pathway responses to combinatorial drug perturbation in MCF7 cells. Second, the quality of the deduced interaction network suggests that well‐parameterized nonlinear MIMO models are interpretable in terms of a network of (direct and/or indirect) regulatory interactions. The inference of network structure is surprisingly effective: the logical network diagram in Figure 3 was derived *de novo* based on only 21 experiments, using non‐temporal data and only nine experimental readouts and accurately reflects important known regulatory interactions. This bodes well for future applications in which the amount of readout data can easily be an order of magnitude greater. In addition to yielding details of intermolecular coupling, the method is sufficiently general to allow predictive modeling of causal relationships between biomolecular events and cellular phenotypic consequences, such as growth or cell cycle arrest. The method lends itself to multi‐level modeling in the sense that molecular, mesoscopic and macroscopic events can be modeled in a single framework once appropriate state variables *y*_{i} are defined.

### Software and technical aspects of implementation

We aim to put these tools into the hands of both computational and experimental biologists for widespread use and are providing a software distribution of CoPIA in the supplement. When applying the method in practice, three crucial technical details are important. A user has to choose (i) which system properties to represent by dynamical variables; (ii) a specific form for the transfer function ϕ; and (iii) protocol and parameter values for the Monte Carlo simulation, or for a similar exploration of solution space. The key parameters include λ, which enforces network sparsity to avoid overfitting, and *T*, the temperature parameter, which fine‐tunes the extent of non‐optimal exploration of network space. In Materials and methods, we provide guidelines for these choices.

### Complementarity to response surface models and epistasis clustering

In a recent interesting work, Lehár *et al* (2007) used drug pairs to perturb signaling pathways in cancer cells, and provided an interpretation framework based on traditional pharmacological models for two‐drug response surfaces. Drug targets in the PI3K and MAPK pathways were characterized by correlating ‘synergy profiles,’ demonstrating a link between network connectivity and drug pair response. Such synergy profiles, in turn, can be thought of as a generalization of the epistasis matrix used by Segre *et al* (2005) as a basis for functional clustering of genes. The approach proposed here is different in the sense that it performs a global optimization that aims to find a fully parameterized model for the entire system. Such models, in turn, can be used for additional purposes such as making predictions of system responses, or making connectivity information explicit as pathway diagrams. Preliminary data suggest that CoPIA models can be used to interpret or predict response surface data, as a function of drug concentrations, as an alternative to the approach of Lehár *et al*, e.g. to reduce experimental cost (S Nelander, unpublished data). Finally, the differential equation CoPIA models can be easily represented in standard systems biology formats, such as BioModels (Le Novère *et al*, 2006) and be used with a number of tools for model visualization, numerical simulation or analytical characterization.

### Relationship to neural models and Hopfield networks

The nonlinear representation proposed here, or related neural models, has been used in biological contexts such as transcriptional network modeling (Marnellos and Mjolsness, 1998; D'haeseleer *et al*, 2000; Omholt *et al*, 2000; Vohradsky, 2001; Li *et al*, 2004; Bonneau *et al*, 2006; Hart *et al*, 2006), in synthetic biology (Kim *et al*, 2005, 2006) and for problems such as approximation of inorganic chemical reactions (Shenvi *et al*, 2004), but not for general cellular processes and/or drug perturbations. In addition, CoPIA models are similar, but not identical, to Hopfield networks, a formalism introduced to study computation in physical systems (Hopfield, 1982). To further motivate this class of models in representing biological systems, we propose an extended effort to theoretically and empirically analyze how well biochemical reactions can be approximated by neural functions, e.g. reactions involved in DNA switches (Kim *et al*, 2005).

### Confirmed and predicted regulatory interactions in MCF7 cells

In our analysis, we detected self‐inhibitory feedback loops downstream of the EGF receptor. This is compatible with the observation that receptor activation of MAPK signaling frequently leads to rapid feedback inhibition, for instance by induced expression of inhibitory proteins (such as Sprouty (Kim and Bar‐Sagi, 2004) or MAPK phosphatases), or inhibition of RAF by direct phosphorylation (Dougherty *et al*, 2005). In our experiments, we are not able to identify the full complexity of the feedback loops, as we did not perturb nodes such as ERK or RAF‐1 or other proteins and used a short EGF stimulation time. Additional predictions, such as (i) eIF4E acting as a downstream effector of ERK, as well as (ii) PKC‐δ counteracting the G1 arrest phenotype, are supported by results in other cell types (Waskiewicz *et al*, 1997). Furthermore, the model predicts a mutually inhibitory interplay between eIF4E activation by phosphorylation and G1 arrest, consistent with the established role of eIF4E as a potent oncogene and a master activator of a ‘regulon’ of cell cycle activator genes (Culjkovic *et al*, 2006). However, the predicted increase in p‐RAF by PKC‐δ is paradoxical: the observed phosphorylation sites on c‐Raf (S289/S296/S301) are regarded as inhibitory, which seems inconsistent with the facts that PKC‐δ can activate MAPK signaling in a RAF‐dependent way (Jackson and Foster, 2004). Our prediction might suggest an unknown direct effect mechanism, or an indirect effect that is not captured in the present analysis. Finally, three less interpretable and therefore interesting or potentially problematic features of the network in Figure 3 are (i) the self‐activation of ERK; (ii) the activating arrow between apoptosis and G1 arrest and, (iii) the fact that RAF is not placed between EGFR and MEK, as in the usual representation of this pathway. Overall, a number of predictions can be used to design experiments to validate or refute the model predictions.

### Future challenges

There are a number of future challenges and opportunities to apply the method to important problems and to increase its power. A key challenge is to use the method to extend known pathways, by combining exploratory perturbation experiments with the richness of biological knowledge in pathway databases. This can be achieved by adding *a priori* known nodes *y*_{j} into the formalism and introducing a bias in the network search that favors solutions compatible with prior knowledge. To deal with off‐target effects of perturbations and incompletely known drug–target specificity, we propose a variant algorithm in which drug–target couplings are parameters that are determined by optimization. Such a variant can be used in target identification for interesting drugs, e.g. compounds that have a desirable effect but for which the target is not yet known. To maximize the information value of experiments, we propose to develop algorithms for the design of experiments, e.g. based on the change of outcomes with respect to particular parameters (King *et al*, 2004; Vatcheva *et al*, 2006). We see tremendous opportunities in new types of experiments. To generate more comprehensive and more informative perturbations of a larger set of cellular components, one can use combinatorial RNA interference (Friedman and Perrimon, 2006; Sahin *et al*, 2007). To generate readout richer by one or two orders of magnitude, one can use mass spectrometry of protein and phospho‐protein levels (Mann *et al*, 2002). The CoPIA method can be generalized to go beyond the steady‐state approximation and explicitly model the time behavior of system components by minimizing the error function for a set of time series experiments.

### From models to therapies

The proposed combinatorial perturbation approach to cell biology, CoPIA, presents a well‐specified experimental–computational procedure to construct predictive models for perturbation responses in malignant cells. We suggest use of such models to optimize therapeutic protocols, especially by designing interventions using a combination of targeted compounds administered in an optimal time sequence. Our method constitutes a concrete step toward the active development of network‐oriented pharmacology.

## Materials and methods

### Computational methods

#### Phenotype prediction.

The nonlinear MIMO model for combinatorial perturbation in cellular systems is introduced in the Results section (equation (2)). When this system is propagated through time, it will generally converge to a stable, fixed point (Pineda, 1987). We interpret this fixed point as the phenotypic response to the perturbation *u*. To calculate the fixed point given in a model, we used standard numerical integration methods (*ode15s* (Mathworks Inc.) and *DLSODE* (Hindmarsh, 1993)). As the class of models studied here can in principle have more than one solution to the steady‐state equation (Smits *et al*, 2006), we used the convention—for practical purposes—to start each predictive simulation from the unperturbed, wild‐type steady state *y*=0.

#### Overview of model fitting algorithm.

The procedure used to find parameter values (for the α_{i}'s, β_{i}'s and the *w*_{ij}'s) from experimental data is outlined below. As an overall approach, we minimize a global error function that combines the requirements of data fit and simplicity. The error function is defined as

where *E*_{SSQ} is the residual sum of squares error, which measures the difference between the model's predicted values and the corresponding observational values for the subset of variables that are observed. The term *E*_{STRUCT} is a penalty term that measures the complexity of the network and λ is a tuning parameter that needs to be chosen; for λ=0 no emphasis is put on the model structure and increasingly sparse (uncomplicated) models are obtained for increasing values of λ. We used the *l*^{0}‐norm of the regulatory matrix *w* to define *E*_{STRUCT} as

where 0^{0}=0. The *l*^{0}‐norm is a common approach to enforce sparse solutions in many machine‐learning applications (Weston *et al*, 2002). In principle, other norms can be used, such as the *l*^{1} norm (Yeung *et al*, 2002).

To minimize *E*_{total}, we made combined use of a Monte Carlo stochastic search algorithm (to search for the network structure) and an efficient gradient descent algorithm described by Pineda (1987) (to set the parameters). In an outer loop of the algorithm, the Monte Carlo process gradually updates the model structure (the set of non‐zeros in *w*). In an inner loop, we apply Pineda's algorithm to fit parameters (α_{i}'s, β_{i}'s and non‐zero *w*_{ij}'s). The output of the algorithm is a set of complete ODE models, for example

In the following two sections, we describe the gradient descent algorithm and the Monte Carlo stochastic search algorithm more thoroughly.

#### Inner loop: minimization of *E*_{SSQ} using a gradient descent algorithm.

Assume a MIMO system with *N* dynamical variables *y*_{1},*y*_{2}, …, *y*_{N}, of which a subset Ω of the variables can be observed experimentally. A perturbation experiment is described by the pair (*u*, *Y*), where *u*=(*u*_{1}, …, *u*_{N}) is the perturbation treatment and *Y*={*Y*_{i}∣*i*∈Ω} is the experimental observation. As a mathematical model for the relationship between the perturbation *u* and the experimentally observed response *Y*, we use the dynamical system described in the Results section (equation (2)). Let denote the steady state of this dynamical system under the perturbation *u*. We then define the sum of squares error for a single experiment as *E*_{SSQ} = ∑_{i∈Ω}(*Y*_{i}−_{i})^{2}.

We consider a fixed network structure, where some *w*_{ij}'s are fixed to zero. To describe the structure, we define a matrix *U* such that *w*_{ij} can adopt a non‐zero value if *U*_{ij}=1 and *w*_{ij} is zero if *U*_{ij}=0.

Given *N*, (*u*, *Y*) and *U*, we want to find parameters *α*_{i}'s, β_{i}'s and the non‐zero *w*_{ij}'s that minimize the error *E*_{SSQ}. For the special case where λ=0, α=1, β=1, Pineda (1987) described a gradient descent procedure, based on solving a set of differential equations in which the weights *w*_{ij} are updated following the gradient descent rule

Here, η is a (small) number that sets the convergence speed, and τ is a ‘pseudo‐time’ that increases as the fitting procedure progresses. We use the update equations derived in D'haeseleer *et al* (2000) to extend to an arbitrary α and β. The computation formula to minimize *E*_{SSQ} thus becomes:

In these equations, *z* is an error propagation variable introduced for computational purposes (Pineda, 1987). To fit the model for a single (*u*, *Y*) pair, we integrated these equations (*DLSODE* or *ode15s*) with initial value 0 for *w* and 1 for α and β. The parameters were not subjected to constraints such as lower and upper bounds. Solutions for different stimulus–response pairs were combined using online learning with momentum described in Duda *et al* (2000).

#### Outer loop: minimization of *E*_{TOTAL} with an l‐zero penalty using stochastic search.

We used a Markov Chain Monte Carlo approach (Ewens and Grant, 2005) to minimize *E*_{TOTAL}, and hence find the optimal model defined by the network structure *U* and parameter values for α, β and non‐zero *w*'s.

In the algorithm, a set of models are maintained and a particular model survives to the next iteration with probability proportional to _{e}^{−E}total^{/T} (the Boltzmann factor, where *T* denotes the temperature of the search). Hence, low‐error models are more likely to be propagated to next iteration. The temperature is typically high in the beginning of the search and low in the end.

The algorithm is outlined as follows:

Initialize with

*U*_{current}=*U*_{start}. Here, subindexes of*U*(*U*_{current},*U*_{start}, U_{1}, U_{2}, …) refer to different realizations of the U matrix (as opposed to*U*matrix elements. As*U*_{start}, we use a N × N matrix of zeros.Generate a set

*S*={*Ũ*_{1}, …,*Ũ*_{k}} of structures that are variations of*U*_{current}. For simplicity, we consider every structure that differs from Ucurrent by one edge.Estimate the parameters for each structure

*Ũ*_{1}, …,*Ũ*_{k}using the variant of Pineda's algorithm presented above. Record the corresponding sum‐of‐square errors*E*_{1}, …,*E*_{k.}Calculate the total error for each topology as

*E*_{j}′=*E*_{j}+λ∑*U*_{j}.Use a decision rule

*R*to select one of the alternate topologies,*U*_{selected.}Update the current topology,

*Ũ*_{current}←*Ũ*_{selected}, potentially update*T*, and repeat from step 2.

As decision rule *R*, we randomly select topology *U*_{j} with probability

Under certain assumptions (the number of neighbors *k* is the same for every topology *U*, neighbor is a mutual relationship, and all possible topologies can be reached in a finite number of steps), the above Markov chain will have a stationary probability distribution in which the probability for a certain topology is proportional to its Boltzmann factor Ewens and Grant (2005). For a sufficiently low temperature *T*, the algorithm will converge to a probability optimum/error minimum.

#### Bootstrapping confidence intervals.

For a given model structure *U*, we used re‐sampling of residuals to generate boot‐strapped confidence intervals for the model parameters. First, the model was fitted using structure *U* and the original data, and residuals were calculated as the best model fit minus the original data. A total of 200 ‘new’ data sets was then constructed by adding randomly drawn residuals to each measurement (using residuals for the corresponding experimental readout, i.e. p‐MEK residuals were added to p‐MEK values and so on). For each such re‐sampled data set, a model was fitted using the structure *U*. Subsequently, confidence intervals for each coupling parameter *w*_{ij} were calculated as percentiles 5–95% across the 200 data sets.

#### Data preprocessing and parameter choices.

The relationship between the model variable *y*_{i}, a corresponding experimental observation *Y*_{i} and an experimental reference point *Y*_{ref} or *Y*_{max} is defined by a mapping function. In our evaluation in breast cancer cells, we used the log relative change defined as

The transfer functions ϕ_{i} should be chosen such that the interval spanned by the experimental data corresponds to the target domain of the function. We found it useful to standardize data to the interval [−1, +1] and then to choose the sigmoid function accordingly. As the reference (‘wild‐type’) value *Y*_{ref}, we used the untreated controls. As only one concentration level was used for every drug (chosen to be around the ED_{90}), we represented perturbation as *u*_{i}=1 if the drug was added, and *u*_{i}=0 otherwise. We used ϕ_{i}=tanh(*y*_{i}) as the sigmoid (suitable as it maps to the interval [−1, +1], another function with this target domain, ϕ_{i}=2/π tan^{−1}(*cy*_{i}/2), gave very similar results).

### Experimental methods

#### Cell culture and reagents.

MCF7 cells were obtained from American Type Culture Collection; maintained in 1:1 mixture of DME:F12 media supplemented with 100 U/ml penicillin, 100 g/ml streptomycin, 4 mM glutamine and 10% heat‐inactivated fetal bovine serum and incubated at 37°C in 5% CO_{2}. The final concentrations for inhibitors used for perturbation experiments were 1 μM ZD1839 (AstraZeneca), 10 μM LY294002 (Calbiochem), 50 nM PD0325901 (Pfizer), 2 μM rottlerin (EMD), 10 nM rapamycin and 1.5 μg/ml antibody A12 (ImClone Systems).

#### Immunoblotting.

MCF7 cells were grown in 100 mm dishes, and starved for 20 h in PBS. They were then treated with indicated concentrations of inhibitors (details see Cell culture and reagents) or vehicle (DMSO) for 1 h, followed by adding EGF into the media (final EGF concentration was 100 ng/ml). After EGF stimulation for 5 or 30 min in the presence of drugs or DMSO, western blots were performed by harvesting MCF7 cellular lysates in 1% Triton lysis buffer (50 mM HEPES, pH 7.4, 1% Triton X‐100, 150 mM NaCl, 1.5 mM MgCl_{2}, 1 mM EGTA, 1 mM EDTA, 100 mM NaF, 10 mM sodium pyrophosphate, 1 mM vanadate, 1 × protease cocktail II (Calbiochem) and 10% glycerol), separating 40 μg of each lysate by SDS–PAGE, transferring to PVDF membrane and immunoblotting using specific primary and secondary antibodies and chemoluminescence visualization on Kodak or HyBlotCL films. Antibodies for phospho‐Akt‐S473, phospho‐ERK‐T202/Y204, phospho‐MEK‐S217/S221, phospho‐eIF4E‐S209, phospho‐c‐RAF‐S289/S296/S301, phospho‐p70S6K‐S371 and phospho‐pS6‐S235/S236 were from Cell Signaling. Films were scanned by an microTEK scanner at 600 d.p.i. in gray scale. Bands were selected and quantified by FUJIFILM Multi Gauge V3.0 software. Each membrane was normalized to internal controls (with or without 100 ng/ml EGF). The membranes were stripped and reprobed with anti‐beta actin (Sigma no. A5441) to confirm equal protein loading.

#### Flow cytometry analysis of cell cycle and apoptosis.

MCF7 cells were seeded in six‐well plates (200 000 cells per well) and grown for 20 h in 10% FBS/DME:F12. Cells were then starved for 20 h in PBS, and then treated with indicated concentrations of inhibitors (details see Cell culture and reagents) or DMSO for 1 h, followed by adding EGF into the media (final EGF concentration was 100 ng/ml). After EGF stimulation for 24, 48 or 72 h in the presence of drugs or DMSO, cells were harvested by trypsinization, including both suspended and adherent fractions, and washed in cold PBS. Cell nuclei were prepared by the method described by Nusse *et al* and cell cycle distribution was determined by flow cytometric analysis of DNA content (FACS) using red fluorescence of 488 nm excited ethidium bromide‐stained nuclei. The percentage of cells in the G1 phase (cell cycle arrest) and sub‐G1 fraction (apoptosis) was recorded.

## Acknowledgements

We thank Doron Betel, Nikolaus Schultz, Debora Marks and Erik Kristiansson for comments on the paper and Solmaz Shahalizadeh‐Korkran for contributions to algorithm evaluation. This research project was made possible by an EMBO long‐term postdoctoral fellowship and a stipend from the PE Lindahl foundation (SN); support from the Göteborg University quantitative biology platform and the Swedish Strategic Research Foundation through Göteborg Mathematical Modeling Center (PG) and, by a donation from Matt's Promise Foundation (CS). *Author contributions*: SN, PG and CS developed the computational methodology with additional contributions from BN. SN and PG wrote the CoPIA software. SN, WW, QS and CP planned and interpreted experiments. WW performed experiments. SN and CS wrote the paper with key contributions from BN, PG and WW.

## Supplementary Information

Supplementary Table I [msb200853-sup-0001.xls]

Supplementary Table II [msb200853-sup-0002.xls]

Supplementary Information [msb200853-sup-0003.zip]

## References

- Copyright © 2008 EMBO and Nature Publishing Group