The epidermal growth factor receptor (EGFR) signaling pathway is one of the most important pathways that regulate growth, survival, proliferation, and differentiation in mammalian cells. Reflecting this importance, it is one of the best‐investigated signaling systems, both experimentally and computationally, and several computational models have been developed for dynamic analysis. A map of molecular interactions of the EGFR signaling system is a valuable resource for research in this area. In this paper, we present a comprehensive pathway map of EGFR signaling and other related pathways. The map reveals that the overall architecture of the pathway is a bow‐tie (or hourglass) structure with several feedback loops. The map is created using CellDesigner software that enables us to graphically represent interactions using a well‐defined and consistent graphical notation, and to store it in Systems Biology Markup Language (SBML).
The epidermal growth factor receptor (EGFR) signaling pathway is one of the most important pathways that regulate growth, survival, proliferation, and differentiation in mammalian cells. It has been investigated in quite some depth, both experimentally and computationally (Wiley et al, 2003), and several computational models have been created to analyze its dynamics (Kholodenko et al, 1999; Schoeberl et al, 2002; Shvartsman et al, 2002). Further research is now needed to improve the model by incorporating various intracellular dynamics and expanding the scope where only a limited part of the signaling system has been modeled (Kholodenko, 2003). Recently, a consortium has been formed to specifically focus on the receptor tyrosine kinase signaling system, and the need for a shared model has been discussed. Despite its static nature, a comprehensive map of molecular interactions would serve as a useful reference, and greatly help research on EGFR signaling.
General characteristics of the EGFR signaling map
We manually constructed a comprehensive pathway map for EGFR‐mediated signaling (Figure 1) based on published scientific papers. The map includes EGFR endocytosis followed by its degradation or recycling, small guanosine triphosphatase (GTPase)‐mediated signal transduction such as mitogen‐activated protein kinase (MAPK) cascade, phosphatidylinositol polyphosphate (PIP) signaling, cell cycle, and G protein‐coupled receptor (GPCR)‐mediated EGFR transactivation via intracellular Ca2+ signaling. The map was created using CellDesigner (http://celldesigner.org/), a software package that enables users to describe molecular interactions using a well‐defined and consistent graphical notation (Funahashi et al, 2003; Kitano, 2003). The data of molecular interactions are stored in Systems Biology Markup Language (SBML; http://sbml.org/) (Hucka et al, 2003). Since SBML is a standard machine‐readable model representation format, all the information can be used for a range of computational analysis, including computer simulation.
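Because the interactions are stored in SBML, the species and reactions can in principle be read back by any XML-aware program. The following is a minimal sketch using only the Python standard library. The embedded document is a toy example: its identifiers, compartments, and namespace string are assumptions for illustration and are not taken from the EGFR map file.

```python
import xml.etree.ElementTree as ET

# Toy SBML-style document. Identifiers, compartments, and the namespace
# string are illustrative assumptions, not content of the EGFR map file.
TOY_SBML = """
<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="toy_egfr">
    <listOfSpecies>
      <species id="EGF" compartment="extracellular"/>
      <species id="EGFR" compartment="membrane"/>
      <species id="EGF_EGFR" compartment="membrane"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="binding">
        <listOfReactants>
          <speciesReference species="EGF"/>
          <speciesReference species="EGFR"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="EGF_EGFR"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def local_name(tag):
    """Drop the '{namespace}' part that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def referenced_species(reaction, list_name):
    """Collect species ids referenced in one child list of a reaction."""
    return [
        ref.get("species")
        for child in reaction
        if local_name(child.tag) == list_name
        for ref in child
        if local_name(ref.tag) == "speciesReference"
    ]


def summarize(sbml_text):
    """Return (species ids, {reaction id: (reactants, products)})."""
    root = ET.fromstring(sbml_text.strip())
    species = [el.get("id") for el in root.iter()
               if local_name(el.tag) == "species"]
    reactions = {}
    for el in root.iter():
        if local_name(el.tag) == "reaction":
            reactions[el.get("id")] = (
                referenced_species(el, "listOfReactants"),
                referenced_species(el, "listOfProducts"),
            )
    return species, reactions


species, reactions = summarize(TOY_SBML)
print(len(species), "species;", len(reactions), "reaction(s)")
```

The sketch matches element names without their namespace, so it does not depend on which SBML level a given file declares. A dedicated SBML library would be the more robust choice for real analyses.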
The map is based on the molecular interactions documented in 242 papers accessible from PubMed (see the list of references for EGFR Pathway Map). It comprises 211 reactions and 322 species. A ‘species’ is a term defined by SBML as ‘an entity that takes part in reactions’ and it is used to distinguish the different states that are caused by enzymatic modification, association, dissociation, and translocation.
The species shown on the EGFR map can be categorized as follows: 202 proteins, three ions, 21 simple molecules, 73 oligomers, seven genes, and seven RNAs. In the number of species, eight degraded products and one unknown molecule are also included. Among 202 protein species, we identified 122 molecules, among which are 10 ligands, 10 receptors, 61 enzymes (including 32 kinases), three ion channels, 10 transcription factors, six G protein subunits, and 22 adaptor proteins.
The reactions can be categorized as follows: 131 state transitions, 34 transportations, 32 associations, 11 dissociations, two truncations, and one unknown transition. Among these reactions, there are 247 interactions; these represent 206 catalyses, nine unknown catalyses, 16 inhibitions, 12 transcriptional activations, and four transcriptional inhibitions. There are clusters of reactions that are involved in specific functions, such as endocytosis, degradation, recycling of EGFR, small GTPase signaling, MAPK cascade, PIP signaling, cell cycle, Ca2+ signaling, and GPCR‐mediated EGFR transactivation. Reactions within each cluster are visually collocated to improve readability of the map.
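The category counts given above can be checked against the stated totals with simple arithmetic. The sketch below only restates the numbers from the text; the dictionary names are ours.

```python
# Counts copied from the text; variable names are illustrative.
species_by_type = {
    "protein": 202, "ion": 3, "simple molecule": 21, "oligomer": 73,
    "gene": 7, "RNA": 7, "degraded product": 8, "unknown molecule": 1,
}
molecules_by_role = {
    "ligand": 10, "receptor": 10, "enzyme": 61, "ion channel": 3,
    "transcription factor": 10, "G protein subunit": 6, "adaptor protein": 22,
}
reactions_by_type = {
    "state transition": 131, "transportation": 34, "association": 32,
    "dissociation": 11, "truncation": 2, "unknown transition": 1,
}
interactions_by_type = {
    "catalysis": 206, "unknown catalysis": 9, "inhibition": 16,
    "transcriptional activation": 12, "transcriptional inhibition": 4,
}

totals = {
    "species": sum(species_by_type.values()),
    "molecules": sum(molecules_by_role.values()),
    "reactions": sum(reactions_by_type.values()),
    "interactions": sum(interactions_by_type.values()),
}
print(totals)
```

The sums agree with the totals reported in the text: 322 species, 122 molecules, 211 reactions, and 247 interactions.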
The architecture of ErbB and GPCR signaling networks
Although the EGFR map cannot serve as the basis for a dynamical simulation until the relevant kinetic parameters have been identified, it can already help us understand the architectural features of the signaling network. The map displayed in Figure 1 reveals a notable feature: a variety of ligands bind to corresponding subtypes of erythroblastic leukemia viral (v‐erb‐b) oncogene homolog (ErbB) receptors that activate molecules in an extensive network of receptor complexes, and the signals then converge into a handful of molecules, such as nonreceptor tyrosine kinase (non‐RTK), small GTPase, and PIPs, which activate a variety of cascades leading to diverse responses including transcriptional regulation. This architecture, also called a bow‐tie (or hourglass) structure, is a characteristic feature of robust evolvable systems (Kitano, 2004). Typically, it has diverse molecules for input and output that are connected to the conserved core, with highly redundant and extensively crosstalking pathways and feedback control loops in various places in the pathway.
Figure 2 illustrates the overall bow‐tie structure of molecular interactions included in the EGFR map ver. 2.0. Arrows in this figure represent the flow of signal transduction. The ErbB receptor‐mediated signaling network resembles a bow‐tie structure with feedback control loops and inhibitory feed‐forward paths. Positive and negative feedback controls are represented by red filled arrows and blue bar‐headed lines, respectively. Inhibitory feed‐forward paths are shown by purple bar‐headed lines.
As input signals, 15 members of the endogenous EGF ligand family have been identified, that is, amphiregulin, betacellulin, biregulin, EGF, epiregulin, HB‐EGF, heregulin α/β, neuregulin (NRG) 1α/1β/2α/2β/3/4, and transforming growth factor alpha (TGFα) (Jones et al, 1999; Olayioye et al, 2000; Yarden and Sliwkowski, 2001). While the ligands overlap with respect to binding to ErbB receptors, they have their own specificities and affinities for the respective receptors. The redundant and overlapping nature of ligand receptor binding enhances robustness in sensing the molecules in the environment, as dysfunction in one of the receptors may be compensated for by other receptors that have an affinity for the overlapping ligand molecule.
The binding of ligands induces homo‐ and heterodimerization of four ErbB family receptors: EGFR (ErbB1), ErbB2, ErbB3, and ErbB4 (Yarden and Schlessinger, 1987; Yarden and Sliwkowski, 2001). Although 10 combinations of ErbB receptor dimers are mathematically possible, only a subset of these is biologically meaningful. Specifically, ErbB2 has no high‐affinity ligand and is only activated by heterodimerization with another ErbB receptor (Holbro et al, 2003), and the ErbB3 homodimer is inactive (Chen et al, 1996; Olayioye et al, 2000; Yarden and Sliwkowski, 2001). ErbB heterodimers form a highly redundant group of receptor complexes and thereby add to the complexity of EGFR signaling. Dimerization stimulates ErbB cytoplasmic kinase activity leading to auto‐ and trans‐phosphorylation on tyrosine residues (Qian et al, 1994; Heldin, 1995), which serve as docking sites for five adaptor proteins and five enzymes, as shown in Figure 2. Signals from ErbBs converge on molecules that form the bow‐tie core, which is thought to represent a versatile and conserved group of molecules and interactions. Molecules such as non‐RTK (proline‐rich tyrosine kinase (Pyk) 2, v‐src sarcoma viral oncogene homolog (c‐Src)), small GTPase (rat sarcoma viral oncogene homolog (Ras), Rac/cell division cycle 42 (Cdc42)), and PIPs (phosphatidylinositol‐4‐phosphate (PI4‐P), phosphatidylinositol‐4,5‐bisphosphate (PI4,5‐P2), phosphatidylinositol‐3,4,5‐trisphosphate (PI3,4,5‐P3)) are candidate components of the conserved core. Each molecule in the bow‐tie core plays a central role in downstream signaling cascades to produce various physiological events such as cell cycle progression and migration via actin reorganization.
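The count of 10 mathematically possible dimers follows from choosing two receptors out of four with repetition allowed (four homodimers plus six heterodimers). The sketch below enumerates them and then removes only the two homodimers that the text explicitly describes as not signaling on their own; it does not attempt to decide which of the remaining pairs are biologically meaningful.

```python
from itertools import combinations_with_replacement

RECEPTORS = ["ErbB1", "ErbB2", "ErbB3", "ErbB4"]

# Unordered pairs with repetition: 4 homodimers + 6 heterodimers.
all_dimers = list(combinations_with_replacement(RECEPTORS, 2))

# Only the exclusions named in the text: ErbB2 is activated solely through
# heterodimerization, and the ErbB3 homodimer is inactive.
EXCLUDED = {("ErbB2", "ErbB2"), ("ErbB3", "ErbB3")}
remaining = [pair for pair in all_dimers if pair not in EXCLUDED]

print(len(all_dimers), "possible dimers;", len(remaining), "after exclusions")
```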
Furthermore, there is crosstalk between the ErbB and G protein‐coupled receptor (GPCR) signaling cascades. Phospholipase C (PLC) γ stimulated by ErbB dimer produces inositol 1,4,5‐trisphosphate (IP3) from PI4,5‐P2, which binds to IP3 receptor and causes Ca2+ efflux, while GPCR signaling regulates cytosolic Ca2+ concentration via two enzymes, PLCβ and adenylyl cyclase. Release of Ca2+ affects the activity of Pyk2, which is placed in the possible bow‐tie core segment.
Several system controls define the overall behavior of the signaling network. There are two positive feedback loops in the ErbB bow‐tie structure. Firstly, Pyk2/c‐Src activates ADAMs, which shed pro‐HB‐EGF (Dikic et al, 1996; Li et al, 1996; Poghosyan et al, 2002), so that the amount of HB‐EGF will be increased and enhance the signaling. This Pyk2/c‐Src‐mediated feedback loop is further enhanced by the Ca2+‐mediated crosstalk from the GPCR signaling cascade (shown by a green line in Figure 2) (Prenzel et al, 1999; Carpenter, 2000; Shi et al, 2000; Schafer et al, 2004). Secondly, active PLCβ/γ produces diacylglycerol (DAG) from PI4,5‐P2, which results in the cascading activation of protein kinase C (PKC) (Mellor and Parker, 1998), phospholipase D (PLD) (Exton, 2002), and phosphatidylinositol‐5‐kinase (PI5K) (Moritz et al, 1992). PI5K phosphorylates PI4‐P resulting in an increase of PI4,5‐P2.
There are six negative feedback loops. In two of these, protein tyrosine phosphatases (SHP‐1 and SHP‐2) inhibit EGFR at the input wing of the bow tie. In three others, a son of sevenless (SOS) homolog (Rozakis‐Adcock et al, 1995; Douville and Downward, 1997) is inhibited (by extracellular signal‐regulated kinase (ERK) 1, ERK2, or ribosomal protein S6 kinase (RSK 2)), starting from the output wing to SOS, which localizes near the core of the bow tie. In the sixth, ErbB is degraded (via the activity of Casitas B‐lineage lymphoma proto‐oncogene (c‐Cbl), which is recruited by growth factor receptor‐bound protein (Grb) 2) (Levkowitz et al, 1999; Yokouchi et al, 1999; Ravid et al, 2004); here, feedback starts from the very end of the output wing, moving toward the initial input wing of the bow tie. In addition, a number of local inhibitory controls exist that use phosphatases to control kinase activities.
There are cases where both activation and inhibition are directed to the same protein. For example, EGFR provides both positive signaling to Ras activation, and negative regulation through recruitment of Ras GTPase‐activating protein (RasGAP) (Agazie and Hayman, 2003). RAS‐associated protein RAB5a (Rab5a) is influenced by both activation and inhibition signals from Ras interaction 1 (Rin1) (Tall et al, 2001) and related to the N‐terminus of tre (RN‐tre) (Lanzetti et al, 2000), respectively. EGFR essentially regulates both paths as it binds EGF receptor pathway substrate (Eps) 8 that activates RN‐tre, and binds Grb2, which in turn stimulates Ras via SOS leading to Rin1 activation (Han et al, 1997). It is interesting to note that in both cases, the length of the path for inhibition is shorter than that of activation. It will be important to understand how such positive and negative controls are regulated.
In total, there are two positive feedback loops, six negative feedback controls, and inhibitory feed‐forward paths in the ErbB bow‐tie structure. In addition, there are a few positive and negative feedback loops in the GPCR cascade that affect ErbB pathway dynamics. As a whole, the ErbB signaling network forms an overall bow‐tie structure with highly redundant and overlapping input pathways and feedback controls. We consider that such a bow‐tie structure with feedback control is a typical architecture for signal transduction pathways that can be observed even in TLR and GPCR pathways. Understanding the dynamics of such an architecture is critically important for an in‐depth knowledge of signaling systems in general. This includes understanding how such pathways have evolved, and how diverse input stimuli are encoded, converge, and differentially activate various reactions, including the transcription of downstream genes.
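Whether a feedback loop is positive or negative can be read off a signed interaction graph: the sign of a loop is the product of the signs of its edges. The sketch below applies this to two loops described in the text, the PLC/PI5K loop and the ERK inhibition of SOS. The edge list is a simplification we assume for illustration: the MAPK cascade between Ras and ERK is collapsed into a single edge, and the dependence of PLC on its substrate PI4,5‐P2 is treated as an activating edge.

```python
from math import prod

# +1 marks an activating or producing edge, -1 an inhibiting edge.
# Simplified, assumed edge list; not an export of the pathway map.
EDGES = {
    ("PLC", "DAG"): +1,          # PLC produces DAG from PI4,5-P2
    ("DAG", "PKC"): +1,
    ("PKC", "PLD"): +1,
    ("PLD", "PI5K"): +1,
    ("PI5K", "PI4,5-P2"): +1,    # PI5K phosphorylates PI4-P
    ("PI4,5-P2", "PLC"): +1,     # substrate supply, assumed activating
    ("SOS", "Ras"): +1,
    ("Ras", "ERK"): +1,          # MAPK cascade collapsed into one edge
    ("ERK", "SOS"): -1,          # ERK inhibits SOS
}


def loop_sign(nodes):
    """Sign of the closed loop nodes[0] -> nodes[1] -> ... -> nodes[0]."""
    steps = zip(nodes, nodes[1:] + nodes[:1])
    return prod(EDGES[step] for step in steps)


plc_loop = ["PLC", "DAG", "PKC", "PLD", "PI5K", "PI4,5-P2"]
sos_loop = ["SOS", "Ras", "ERK"]

for name, loop in [("PLC/PI5K", plc_loop), ("ERK/SOS", sos_loop)]:
    kind = "positive" if loop_sign(loop) > 0 else "negative"
    print(name, "loop is", kind)
```

A loop with an odd number of inhibitory edges is negative, which is why the single ERK to SOS inhibition is enough to make that loop a negative feedback.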
Graphical notations of the EGFR Pathway Map
The main symbols used to represent molecules and interactions in this map are displayed in Figure 3. Kitano proposed a graphical notation system for biological networks designed to express sufficient information in a clearly visible and unambiguous way (Kitano, 2003). Several graphical notations for molecular interactions have been proposed previously (Kohn, 1999; Pirson et al, 2000; Cook et al, 2001; Kohn, 2001; Maimon and Browning, 2001), although none has been widely used. The Kohn Map is perhaps the most widely known of these. However, lack of software to support the notation has hampered its use. We have developed CellDesigner, a freely downloadable software tool. It has already been adopted by various research groups and databases such as the PANTHER pathway database (Mi et al, 2005). The current EGFR map is essentially a state transition diagram, in which one state of the system is represented in one node, and an arc from one node to another node represents a transition of the state of the system. This class of diagrams is often used in engineering and software development, and the schema avoids using symbols that directly point to molecules to indicate activation or inhibition. The arrow of state transition (a straight line with a filled arrowhead) represents the state changes that occur as a result of molecular interactions, instead of ‘activation’ in a traditional notation familiar to molecular biologists. The diagram directly indicates a transition from an inactive to an active state for activation, and a transition from an active state to an inactive state for inhibition. When these transitions are promoted or inhibited by other mediating molecules, such as active kinases, these reactions are represented by a catalysis arrow (circle‐headed line) and inhibition arrow (bar‐headed line), respectively. 
It is essential that such syntax and semantics are made clear and defined consistently, particularly for a large‐scale map, so that the information presented is conveyed unambiguously.
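The state transition view described above can be sketched as a small data structure in which nodes are states and arcs are transitions, with catalysts or inhibitors attached to the arc rather than pointing at a molecule. The class and function names below are hypothetical and do not reflect how CellDesigner stores its diagrams; the Ras example uses SOS and RasGAP as mentioned in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    molecule: str
    label: str  # e.g. "inactive" or "active"


@dataclass(frozen=True)
class Transition:
    source: State
    target: State
    catalysts: tuple = ()   # would be drawn as circle-headed lines
    inhibitors: tuple = ()  # would be drawn as bar-headed lines


def classify(t):
    """Name the conventional meaning of a transition between two states."""
    if t.source.molecule != t.target.molecule:
        return "other"
    change = (t.source.label, t.target.label)
    if change == ("inactive", "active"):
        return "activation"
    if change == ("active", "inactive"):
        return "inhibition"
    return "other"


ras_off = State("Ras", "inactive")
ras_on = State("Ras", "active")

# 'Activation' of Ras: a transition to the active state, catalysed by SOS.
activation = Transition(ras_off, ras_on, catalysts=("SOS",))
# 'Inhibition' of Ras: the reverse transition, catalysed by RasGAP.
inactivation = Transition(ras_on, ras_off, catalysts=("RasGAP",))

print(classify(activation), classify(inactivation))
```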
Notation on modification and localization of protein
Figure 4 illustrates how the modification status of a protein is presented. Essentially, each state of a protein (i.e. phosphorylation, acetylation, etc.) can be represented such that it reflects its modification and oligomerization.
In this map, we employed a naming convention in which the localization of protein is indicated by a prefix to the protein name, such as ‘cyt.XX’ and ‘pl.m.XX’ for protein XX in the cytosol and protein XX at the plasma membrane, respectively. In addition, in order to provide a better overview and to understand pathways at a glance, we assigned unique names with an ‘address’ to a protein to express differences of combination states of protein species. For instance, Figure 5 provides the reader with a small part of the pathways illustrated in the map, namely interactions between EGFR and the three adaptor proteins, Src homology 2 domain containing transforming protein (Shc), Grb2, and GRB2‐associated binding protein 1 (Gab1). Figure 5A shows the detailed scheme of combination states between EGFR and adaptors, while Figure 5B expresses combination states by assigning an ‘address’ to the name of a protein. The method of referring to proteins with an ‘address’ becomes clear using Grb2 as an example. Grb2 is recruited to the activated EGFR via the phosphotyrosine residues Tyr1068 or Tyr1086, and this event is denoted as ‘Grb2@EGFR.Y1068/1086P’. The reaction of association between active EGFR and Grb2 is represented using an open‐headed ‘transport’ arrow and a circle‐headed ‘catalysis’ arrow as a local rule adopted in this map ver. 2.0. This convention allows for a more efficient presentation of signaling events and requires much less space, as illustrated in Figure 5B. It should be stressed that this convention is in accordance with the information provided by a full representation.
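The naming convention lends itself to mechanical parsing. The sketch below splits a name into an optional localization prefix, the protein, and an optional 'address'. Only the two prefixes quoted in the text are recognized, and the function name and returned fields are our own; whether prefixes and addresses are combined in a single name on the map is not assumed.

```python
# Only the two prefixes quoted in the text; the map may define others.
LOCALIZATION_PREFIXES = {
    "cyt.": "cytosol",
    "pl.m.": "plasma membrane",
}


def parse_species_name(name):
    """Split a map name into protein, localization, and address parts."""
    location = None
    for prefix, place in LOCALIZATION_PREFIXES.items():
        if name.startswith(prefix):
            location = place
            name = name[len(prefix):]
            break
    # Everything after '@' is the binding partner and site ("address").
    protein, separator, address = name.partition("@")
    return {
        "protein": protein,
        "location": location,
        "address": address if separator else None,
    }


for example in ["cyt.XX", "pl.m.XX", "Grb2@EGFR.Y1068/1086P"]:
    print(example, "->", parse_species_name(example))
```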
Omissions in notation
For ease of readability and in order to save space, we chose to omit some notation from this version of the EGFR Pathway Map (ver. 2.0). While simulation studies require precise representation of pathways, such representation has to deal with the complicated issue of multiple states of complexes. Figure 6 shows a simple example. The 85 kDa regulatory subunit of phosphatidylinositol 3‐kinase (PI3K (p85)) binds to active ErbB3 receptor via its phosphorylated tyrosine residues: Tyr1035, Tyr1178, Tyr1203/05, Tyr1241, Tyr1257, and Tyr1270 (Olayioye et al, 2000). To distinguish the complexes according to differences of phosphotyrosine residues, Figure 6A should be redrawn as Figure 6B.
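The reason such a full representation grows quickly can be illustrated by counting. The sketch below treats each of the six residues listed above as one independent docking site (with Tyr1203/05 counted as a single entry), which is an assumption made only for this illustration, and counts the distinct non-empty phosphorylation patterns that a fully explicit diagram would have to distinguish.

```python
from itertools import combinations

# The six entries listed in the text; treating each as one independent
# site is an assumption for illustration.
SITES = ["Tyr1035", "Tyr1178", "Tyr1203/05", "Tyr1241", "Tyr1257", "Tyr1270"]

# Complexes distinguished by a single docking site.
single_site = [(site,) for site in SITES]

# Every non-empty combination of phosphorylated sites.
patterns = [
    combo
    for size in range(1, len(SITES) + 1)
    for combo in combinations(SITES, size)
]

print(len(single_site), "single-site complexes;",
      len(patterns), "non-empty phosphorylation patterns")
```

Even this small example yields 2^6 - 1 = 63 patterns for one receptor and one binding partner, which is why the compact form is used on the map.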
Another type of omission concerns the case in which many pathways are represented by fewer pathways. For example, it has been reported that Grb2 and Shc bind to activated EGFR via their phosphotyrosine residues and function as adaptors of downstream signaling. They are recruited to endosomes during stimulation by EGF where they form complexes with endocytosed EGFR and activate Ras signaling (see the list of references for EGFR Pathway Map). Although some other proteins such as PI3K (p85/p110) are reported to be translocated to endosomes with growth factor receptors (Christoforidis et al, 1999), it is not clear whether all other EGF‐induced interactions occur similarly in endosomes as well as at the cell surface. To conserve space in the current version of the EGFR Pathway Map, we made Grb2 represent interactions with endosomal EGFR.
In addition to sphingosine‐1‐phosphate (S1P), lysophosphatidic acid (LPA), and prostaglandin E2 (PGE2), other ligands such as endothelin‐1 (Vacca et al, 2000) and angiotensin II (Hama et al, 2004) have been reported to be involved in GPCR‐mediated EGFR signal transactivation.
A number of ambiguous cases of protein‐protein interactions came up during the construction of the map. For example, EGF stimulation induces activation of protein kinase B (PKB/Akt) via PIPs, which have multiple functions including antiapoptotic properties. However, the mechanistic details of its activation are controversial. It has been reported that PKB/Akt is phosphorylated at two sites for its full activation: Thr308 in the activation T‐loop of the kinase domain and Ser473 in the C‐terminal hydrophobic motif. While phosphoinositide‐dependent kinase 1 (PDK1) has been unambiguously identified as the Thr308 kinase, the Ser473 kinase, named PDK2, remains elusive. Although it has recently been reported that the conventional isoforms of PKC could phosphorylate at Ser473 by distinct stimulation (Kawakami et al, 2004), PKC inhibitors including PKCbeta inhibitor LY 379196 caused PKB/Akt phosphorylation at Ser473 (Wen et al, 2003). Moreover, Toker and Newton (2000) reported that the PDK2 site, namely Ser473, was regulated by autophosphorylation. Because it is not clear whether Ser473 undergoes autophosphorylation, phosphorylation by PDK2, or both, the pathway is represented by unknown catalysis arrows (circle‐headed dashed line).
The EGFR Pathway Map was created using CellDesigner ver. 2.1.1. Compliance of CellDesigner with SBML enables researchers to store models and to use them for analyses by other SBML‐compliant applications.
CellDesigner is also a Systems Biology Workbench (SBW)‐enabled application. With SBW installed, CellDesigner can integrate with all SBW‐enabled modules, including simulation and other analysis packages.
The most recent version of CellDesigner (ver. 2.2) enables users to store data on each molecule and reaction in the species and reaction <notes>, respectively, and to link directly to databases such as PubMed simply by clicking. CellDesigner can thus serve as a portal software platform as well as an information organizer for systems biology research.
Updating of the EGFR Pathway Map
This version of the map (ver. 2.0) is intended to be comprehensive but is not necessarily exhaustive. We will periodically update and expand the map on our website using experimental data derived from further studies and through interactions with researchers specialized in certain modules of the EGFR signaling network. To facilitate such interaction and updating of the map, we are currently designing community‐support web‐based tools that will allow a community‐based collaborative development process. Addition and correction of the original map can be made through comments and feedback from experts in specific molecules and interactions, while kinetic constants and other experimentally obtained data can be incorporated into the map.
In systems biology research, both molecular details and a system‐wide network structure must be taken into account. Thus, data resources and tools that enable flexible and up‐to‐date access to various levels of information are essential. The EGFR signaling map presented in this article is one attempt to seed such an effort.
This research is, in part, supported by the Exploratory Research for Advanced Technology (ERATO) and the Solution‐Oriented Research for Science and Technology (SORST) programs (Japan Science and Technology Organization), the NEDO Grant (New Energy and Industrial Technology Development Organization) of the Japanese Ministry of Economy, Trade and Industry (METI), the Special Coordination Funds for Promoting Science and Technology and the Center of Excellence Program for Keio University (Ministry of Education, Culture, Sports, Science, and Technology), The Genome Network Project (the Japanese Ministry of Education, Culture, Sports, Science and Technology), and the Air Force Office of Scientific Research (AFOSR).
- Copyright © 2005 EMBO and Nature Publishing Group