Although the genomes of many microbial pathogens have been studied to help identify effective drug targets and novel drugs, such efforts have not yet reached full fruition. In this study, we report a systems biological approach that efficiently utilizes genomic information for drug targeting and discovery, and apply this approach to the opportunistic pathogen Vibrio vulnificus CMCP6. First, we partially re‐sequenced and fully re‐annotated the V. vulnificus CMCP6 genome, and accordingly reconstructed its genome‐scale metabolic network, VvuMBEL943. The validated network model was employed to systematically predict drug targets using the concept of metabolite essentiality, along with additional filtering criteria. Target genes encoding enzymes that interact with the five essential metabolites finally selected were experimentally validated. These five essential metabolites are critical to the survival of the cell, and hence were used to guide the cost‐effective selection of chemical analogs, which were then screened for antimicrobial activity in a whole‐cell assay. This approach is expected to help fill the existing gap between genomics and drug discovery.
Discovering new antimicrobial targets and consequently new antimicrobials is important as drug resistance of pathogenic microorganisms is becoming an increasingly serious problem in human healthcare management (Fischbach and Walsh, 2009). There clearly exists a gap between genomic studies and drug discovery as the accumulation of knowledge on pathogens at genome level has not successfully transformed into the development of effective drugs (Mills, 2006; Payne et al, 2007). In this study, we dissected the genome of a microbial pathogen in detail, and subsequently developed a systems biological strategy of employing genome‐scale metabolic modeling and simulation together with metabolite essentiality analysis for effective drug targeting and discovery. This strategy was used for identifying new drug targets in an opportunistic pathogen Vibrio vulnificus CMCP6 as a model.
V. vulnificus is a Gram‐negative halophilic bacterium that is found in estuarine waters, brackish ponds, or coastal areas, and its Biotype 1 is an opportunistic human pathogen that can attack immune‐compromised patients, and causes primary septicemia, necrotized wound infections, and gastroenteritis. We previously found that many metabolic genes were specifically induced in vivo, suggesting that specific metabolic pathways are essential for in vivo survival and virulence of this pathogen (Kim et al, 2003; Lee et al, 2007). These results motivated us to carry out systems biological analysis of the genome and the metabolic network for new drug target discovery.
V. vulnificus CMCP6 has two chromosomes. We first re‐sequenced genomic regions assembled in low quality and low depth, and subsequently re‐annotated the whole genome of V. vulnificus. Horizontal gene transfer was suspected to be responsible for the diversification of each chromosome of V. vulnificus, and the presence of metabolic genes was more biased to chromosome 1 than chromosome 2. Further studies on V. vulnificus genome revealed that chromosome 2 is more prone to diversification for better adaptation to the environment than its chromosome 1, while chromosome 1 tends to expand their genetic repertoire while maintaining the core genes at a constant level.
Next, a genome‐scale metabolic network VvuMBEL943 was reconstructed based on literature, databases and experiments for systematic studies on the metabolism of this pathogen and prediction of drug targets. The VvuMBEL943 model is composed of 943 reactions and 765 metabolites, and covers 673 genes. The model was validated by comparing its simulated cell growth phenotype obtained by constraints‐based flux analysis with the V. vulnificus‐specific experimental data previously reported in the literature. In this study, constraints‐based flux analysis is an optimization‐based simulation method that calculates intracellular fluxes under the specific genetic and environmental condition (Kim et al, 2008). As a result, 17 growth phenotypes were correctly predicted out of 18 cases, which demonstrate the validity of VvuMBEL943.
The main objective of constructing VvuMBEL943 in this study is to predict potential drug targets by system‐wide analysis of the metabolic network for the effective treatment of V. vulnificus. To achieve this goal, a set of drug target candidates was predicted by taking a metabolite‐centric approach. Metabolite essentiality analysis is a concept recently introduced for the study of cellular robustness to complement conventional reaction or gene‐centric approach (Kim et al, 2007b). Metabolite essentiality analysis observes changes in flux distribution by removing each metabolite from the in silico metabolic network. Hence, metabolite essentiality predicts essential metabolites whose absence causes cell death. By selecting essential metabolites, it is possible to directly screen only their structural analogs, which substantially reduces the number of chemical compounds to screen from the chemical compound library. As a result of implementing this approach, 193 metabolites were initially identified to be essential to the cell. These essential metabolites were then further filtered based on the predetermined criteria, mainly organism specificity and multiple connectivity associated with each metabolite, in order to reduce the number of initial target candidates towards identifying the most effective ones.
Five essential metabolites finally selected are 2‐amino‐4‐hydroxy‐6‐hydroxymethyl‐7,8‐dihydropteridine (AHHMP), d‐glutamate (DGLU), 2,3‐dihydrodipicolinate (DHDP), 1‐deoxy‐d‐xylulose 5‐phosphate (DX5P), and 4‐aminobenzoate (PABA). Enzymes that consume these essential metabolites were experimentally verified to be essential, which indeed demonstrates the essentiality of these five metabolites. On the basis of the structural information of these five essential metabolites, whole‐cell screening assay was performed using their analogs for possible antibacterial discovery. We screened 352 chemical analogs of the essential metabolites selected from the chemical compound library, and found a hit compound 24837, which shows the minimal inhibitory concentration (MIC) of 2 μg/ml and minimal bactericidal concentration (MBC) of 4 μg/ml, showing good antibacterial activity without further structural modification. Although this study demonstrates a proof‐of‐concept, the approaches and their rationale taken here should serve as a general strategy for discovering novel antibiotics and drugs based on systems‐level analysis of metabolic networks.
Chromosome 1 of Vibrio vulnificus tends to contain larger portion of essential or housekeeping genes on the basis of the genomic analysis and gene knockout experiments performed in this study, while its chromosome 2 seems to have originated and evolved from a plasmid.
The genome‐scale metabolic network model of V. vulnificus was reconstructed based on databases and literature, and was used to identify 193 essential metabolites.
Five essential metabolites finally selected after the filtering process are 2‐amino‐4‐hydroxy‐6‐hydroxymethyl‐7,8‐dihydropteridine (AHHMP), d‐glutamate (DGLU), 2,3‐dihydrodipicolinate (DHDP), 1‐deoxy‐d‐xylulose 5‐phosphate (DX5P), and 4‐aminobenzoate (PABA), which were predicted to be essential in V. vulnificus, absent in human, and are consumed by multiple reactions.
Chemical analogs of the five essential metabolites were screened and a hit compound showing the minimal inhibitory concentration (MIC) of 2 μg/ml and the minimal bactericidal concentration (MBC) of 4 μg/ml against V. vulnificus was identified.
The genomes of a diverse spectrum of organisms have been elucidated, not only for better understanding of their biological characteristics at the systems level, but also for various biotechnological applications, such as identification of new drug targets. Discovering new antimicrobial targets and consequently new antimicrobials is important, as drug resistance of pathogenic microorganisms is an increasingly serious problem in human healthcare management (Fischbach and Walsh, 2009). There clearly exists a gap between genomic studies and drug discovery as the accumulation of knowledge on pathogens at the genome level has not successfully transformed into the development of effective drugs (Mills, 2006; Payne et al, 2007). In previous efforts, genes predicted to be conserved across microbial pathogens from comparative genomic analyses were considered as drug targets, and subjected to laborious gene disruption experiments for drug target validation (Payne et al, 2007). The emergence of genome‐scale metabolic networks and simulation methods has much accelerated this process, but they have mostly not been linked to successful drug discovery (Thiele et al, 2005; Jamshidi and Palsson, 2007; Oberhardt et al, 2008; Lee et al, 2009; Mazumdar et al, 2009; Kim et al, 2010a). Recent relevant work attempted to overcome this problem by employing virtual screening for chemicals against drug target enzymes predicted using a genome‐scale metabolic network model (Shen et al, 2010). In this study, we dissected the genome of a target microbial pathogen in detail, and subsequently developed a systems biological strategy that employs genome‐scale metabolic modeling and simulation for effective drug targeting and discovery. This strategy was used for identifying new drug targets in the opportunistic pathogen Vibrio vulnificus CMCP6 as an example.
V. vulnificus is a Gram‐negative halophilic bacterium that is found in estuarine waters, brackish ponds, or coastal areas. V. vulnificus belongs to γ‐proteobacterium in the family Vibrionaceae, and it has been classified into three biotypes on the basis of their biochemical and biological characteristics (Gulig et al, 2005). Biotype 1, the dominant human pathogen, and biotype 3, recently identified in people handling fish in Israel, are considered to be the human pathogenic groups (Jones and Oliver, 2009). Biotype 1 V. vulnificus is an opportunistic human pathogen that can attack immune‐compromised patients, and causes primary septicemia, necrotized wound infections, and gastroenteritis. Consequently, this bacterium is the major causative agent of death from ingestion of raw or undercooked seafood, especially in people with liver damage through, for instance, underlying hepatic disease and heavy alcohol drinking habits. Primary septicemia progresses rapidly through a fulminant course, which results in a high mortality rate of over 50%, despite aggressive antimicrobial and supportive shock therapies (Gulig et al, 2005).
Although V. vulnificus isolates are generally reported to be susceptible to commercial antimicrobial agents, including doxycycline, third generation cephalosporins, and fluoroquinolones (Park et al, 1991; Hlady and Klontz, 1996; Bross et al, 2007), a substantial proportion of V. vulnificus isolates from the environment shows resistance to currently used antimicrobial agents (Radu et al, 1998; Baker‐Austin et al, 2009). Thus, new effective antibiotics to treat antibiotic‐resistant V. vulnificus need to be developed. Consequently, we previously searched V. vulnificus genes preferentially expressed in patients for potential drug targets, and found that many metabolic genes were specifically induced in vivo, suggesting that specific metabolic pathways are essential for in vivo survival and virulence of this pathogen (Kim et al, 2003; Lee et al, 2007). These results motivated us to proceed to a systems biological analysis of the genome and drug targeting of this pathogen.
In this study, the V. vulnificus CMCP6 genome was partially re‐sequenced and fully re‐annotated for genomic studies and reconstruction of its genome‐scale metabolic network VvuMBEL943. Drug targets were systematically predicted from the simulation and analysis of the metabolic network VvuMBEL943, and were experimentally validated. Information from predicted drug targets was then used to identify novel antibiotic compounds from chemical library (Figure 1). It is expected that this strategy would allow identification of novel antimicrobial targets, and consequently effective antimicrobials for other pathogens of importance, in a cost‐effective manner.
General features of V. vulnificus CMCP6 genome
After the initial submission of the V. vulnificus CMCP6 genome data to GenBank in 2002 (AE016795 and AE016796) , we re‐sequenced the genomic regions assembled in low quality and low depth, and subsequently re‐annotated the whole genome. The updated genome information was deposited to GenBank under the accession numbers AE016795.3 (chromosome 1) and AE016796.3 (chromosome 2). V. vulnificus CMCP6 has two circular chromosomes of 3 281 866 bp (chromosome 1) and 1 844 830 bp (chromosome 2) with no plasmids. The sizes of the chromosomes were reduced by 78 bp (chromosome 1) and 23 bp (chromosome 2) after the sequence update. The average G+C contents of chromosome 1 and 2 are 46.45 and 47.12%, respectively. Comparison of generic genomic features of the six selected Vibrio species is summarized in Figure 2 and Supplementary Table I. Among the six Vibrio species, the overall genome organization of V. vulnificus CMCP6, such as chromosomal size or gene content, appears to be most similar to that of Vibrio parahaemolyticus, a gastroenteritis‐causing pathogen (Figure 2 and Supplementary Figure 1). Also, close genomic comparison between the two V. vulnificus chromosomes suggests that horizontal gene transfer is responsible for the diversification of each genome, leading to an increase in fitness of this organism under varying environmental conditions (Quirke et al, 2006). Between the two chromosomes, the location of metabolic genes was more biased to chromosome 1; 75.4% of metabolic genes are located on chromosome 1, which takes up 65.3% of the total nucleotide sequence (Supplementary Table I). However, genes associated with propanoate, butanoate, and nitrogen metabolism are relatively more likely to be present on chromosome 2.
Comparative genomic analysis
Proteins encoded in the two chromosomes of V. vulnificus CMCP6 were compared with those of five other completely sequenced Vibrio genomes in order to estimate their overall similarity with respect to gene content. It was found that 75.6% of the V. vulnificus CMCP6 proteins matched with 40.1 and 31.4% of the V. parahaemolyticus and V. harveyi proteins, respectively, as identified by the reciprocal BLAST best hits (E‐value <10−5). Phylogenetic distribution of the unidirectional best hits using V. vulnificus CMCP6 proteins as queries revealed that 40.9% of the proteins from V. vulnificus CMCP6 chromosome 1 are most similar to those found on V. parahaemolyticus chromosome 1, but the proportion decreased to 31.1% for chromosome 2. This observation suggests that the V. vulnificus CMCP6 chromosome 2 is more prone to diversification for better adaptation to the environment than its chromosome 1. Other lines of evidence suggest that the chromosome 2 of Vibrio species is more diverse in size, gene content, and gene order (Heidelberg et al, 2000; Makino et al, 2003). In addition, the genome sequences of Vibrio species revealed that the smaller chromosome 2 does not have an authentic chromosomal replication origin, but rather has a plasmid‐like replication origin (Egan and Waldor, 2003). It was thus hypothesized that chromosome 2 was originally acquired as a megaplasmid, which subsequently acquired some essential genes (Heidelberg et al, 2000), supporting the diversification of chromosome 2.
Next, to identify the conserved gene sets among the Vibrio genomes, proteins from CMCP6 and five other Vibrio species were clustered using BLASTCLUST. The thresholds were set at 60% identity and 70% length coverage, which are stricter than the conventionally used thresholds (35% identity and 80% length coverage) for identifying functions of proteins being compared. As a result, 1587 clusters (1625.7±5.9 proteins) ranging from 27.3 (V. harveyi) to 42.7% (V. fischeri) of total proteins were obtained. Although the distribution of conserved genes was rather uniform among the compared genomes, genes unique to each genome that were not clustered varied significantly (1348.3±224.3 proteins). This implies that larger Vibrio genomes tend to expand their genetic repertoire while maintaining the core genes at a constant level without notable increase in paralogs. The percentage of unclustered genes unique to each genome is also proportional to the total number of genes (R2=0.9033). V. harveyi, possessing the largest genome among the sequenced Vibrios (6 058 377 bp including a plasmid), is remarkable in that it has as many as 38 unique clusters (>3 members) totaling 653 genes, which can be operationally classified into paralogs. Interestingly, most of them are of unknown function, and some encode reverse transcriptases with group II intron origin, and probable transposases. In contrast, other Vibrio genomes have only four unique clusters at most. With these findings, genome evolution in Vibrios can largely be explained by species‐specific gene acquisition probably due to the horizontal gene transfer, and/or paralog expansion. This pattern of genome evolution in Vibrios may provide fitness advantages in specific habitats, and potentially could contribute to drug resistance.
We compiled a list of the 17 largest gene clusters, consisting of 10 or more proteins commonly found in the 6 Vibrio genomes (Supplementary Table II). These clusters, deemed essential for metabolism and motility, occur in each genome in rather uniform numbers, in contrast to genome‐specific clusters. Highly conserved genes found in bacterial genomes, which do not have homologs in humans, are often considered to be potential drug targets (Kim et al, 2010a). However, in the present study, the metabolic genes of V. vulnificus in these clusters were not identified as essential ones, suggesting that plurality of such genes in a given genus does not, at least in Vibrionaceae, necessarily mean that those genes can serve as drug targets. Metabolic pathways governed by these clustered genes might have been fortified by gene acquisition mechanisms throughout evolution.
Genome‐scale metabolic network of V. vulnificus, VvuMBEL943
With the newly annotated genome of V. vulnificus CMCP6, a genome‐scale metabolic network was reconstructed for systematic studies on its metabolism and prediction of drug targets (Figure 1). The in silico genome‐scale metabolic network, VvuMBEL943, is composed of 943 reactions and 765 metabolites, and covers 673 genes (Table I and Supplementary Tables III and IV). For this process, relevant metabolic information from the Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa et al, 2006), TransportDB (Ren and Paulsen, 2005), and MetaCyc (Caspi et al, 2008) was manually curated along with the genome annotation data and literature to upgrade the initial metabolic network towards a more realistic one (Figure 1). The VvuMBEL943 possesses 792 reactions (84.0% out of whole reactions) assigned with specific open reading frames (ORFs). Some reactions were added even if they do not have assigned ORFs in order to enable the in silico cell VvuMBEL943 to biosynthesize all the constituents of biomass. The ability of V. vulnificus to grow in the defined minimal (M9) medium (Table II) suggests that these reactions need to be included for generating precursors for biomass formation. Biomass composition was determined based on the literature, genome annotation data, and experiments performed in this study (Supplementary Table V) as it is crucial for the metabolic network to correctly represent the biosynthesis of major cell constituents, such as amino acids and fatty acids, and ultimately the cell growth rate.
VvuMBEL943 was then validated by comparing the simulated cell growth phenotype obtained by constraints‐based flux analysis with the V. vulnificus‐specific experimental data previously reported in the literature (Table II). Constraints‐based flux analysis calculates intracellular fluxes under predefined stoichiometric relationship of metabolites present in the organism along with condition‐specific constraints. The objective function used was the maximization of cell growth rate (Kim et al, 2008) (see Materials and methods) because this objective well describes the metabolic status of V. vulnificus, which proliferates inside iron‐overloaded mice with an estimated in vivo doubling time of 72 min (Kim et al, 2003; Lee et al, 2007). In this study, cell survival under the given genetic and environmental (growth media) conditions is of particular interest. Initially, 16 out of 18 cases were predicted to be consistent with the experimental data, which suggested that further improvement of the model was needed. For instance, our earlier version of the metabolic network did not show growth in M9 medium with maltose despite possessing a maltose utilization system, which differed from experimental data showing cell growth on maltose (Lim et al, 2005) (Case 15). It was discovered that VvuMBEL943 did not possess an enzyme that metabolizes glucose, generated from maltose in the cell (Boos and Shuman, 1998). Thus, a reaction catalyzed by glucokinase (EC 220.127.116.11) was added to VvuMBEL943, despite its absence in the genome annotation, in order to properly reflect cell growth on maltose. Finally, for Case 2, the growth of the gmhD mutant in AB medium (Kim et al, 2007a) was inconsistent with the simulation result. GmhD (EC 18.104.22.168) is a critical enzyme involved in the biosynthesis of lipopolysaccharide, which should operate in the growing cell. Deletion of the gmhD gene greatly impaired the growth of V. vulnificus by resulting in the biosynthesis of a truncated inner‐core oligosaccharide (Kim et al, 2007a). As GmhD was predicted to be essential using VvuMBEL943, the significantly retarded growth phenotype observed in the reported experiments (Kim et al, 2007a) suggests that the simulation result is not too inaccurate. As a result of the validation process, the simulation accuracy of VvuMBEL943 was improved to support the results of 17 out of 18 experimental cases (Case 2 was counted as an inconsistent result here). Thus, the validated VvuMBEL943 represents the metabolic characteristics of V. vulnificus well, and can be employed for further simulations. In addition to this, topological features of VvuMBEL943, namely average path length and network diameter, were characterized. They appeared to be more similar to those of average bacteria than eukaryotes and archaea, and more similar to those of V. cholerae than Escherichia coli (Ma and Zeng, 2003), which manifests that VvuMBEL943 properly conveys the distinct features of V. vulnificus (Supplementary Table VI).
Systems drug targeting
The main objective of constructing VvuMBEL943 in this study is to predict potential drug targets by system‐wide analysis of the metabolic network (thus, systems drug targeting) for the effective treatment of V. vulnificus. As biological systems are optimized through evolution giving them robustness, targeting for antibacterial drugs can, in essence, be achieved by disrupting the robustness of the pathogen, and thereby effectively suppressing its growth (Kitano, 2007). In this study, robustness is a crucial concept in predicting drug targets, which can be generally defined as the cellular ability to maintain its biological processes against internal and external perturbations, including disruption of genes and inactivation of enzymes by chemical agents (Kitano, 2007; Kim et al, 2007b). In terms of metabolic simulations implemented in this study, zero cell growth rate after perturbation indicates the complete disruption of robustness in V. vulnificus.
To achieve this goal, a set of drug target candidates was predicted by taking a metabolite‐centric approach. Metabolite essentiality analysis is a concept recently introduced for the study of cellular robustness to complement the conventional reaction (gene)‐centric approach (Kim et al, 2007b, 2010b). The conventional reaction (gene)‐centric approach has limitations in understanding cellular robustness because functionally important reactions are usually backed up by alternative pathways; thus, many important reactions are not necessarily lethal to the survival of the cell after knocking out significant portion of genes. In contrast, metabolite essentiality analysis observes changes in flux distribution by removing each metabolite from the in silico metabolic network. As a result, major and rerouting reactions relevant to the metabolite to be removed are all deleted from the cellular network. If this removal causes zero cellular growth rate, then the robustness of the cell is deemed completely disrupted.
Besides its role of elucidating cellular robustness, metabolite essentiality analysis can greatly facilitate drug screening and design process. Metabolite essentiality analysis predicts essential metabolites whose absence causes cell death. By identifying the essential metabolites, it is possible to focus first on their structural analogs, instead of large random chemical libraries, which substantially reduce the number of chemical compounds to screen. Rationale for this approach is well supported by Dobson et al (2009), who reported that metabolite‐likeness is a good criterion in designing a drug. Also, it would facilitate rational design of drugs on the basis of the structures of essential metabolites.
In this study, metabolite essentiality analysis was performed for the in silico cell VvuMBEL943 growing in an arbitrary complex medium, in which cells are allowed to simultaneously uptake all the nutrients when respective transporters exist. This is to account for the effects of various carbon and nitrogen sources available inside the host (e.g., human) for V. vulnificus, and to predict solid drug targets over a wide range of conditions because the essentiality of metabolic reactions can be conditionally different. As a result, 193 metabolites were identified to be essential to the cell (Figure 3 and Supplementary Table VII). It is obvious that larger number of essential metabolites will be predicted if fewer input transport reactions are made available during the simulation, and the results will become more condition‐dependent.
For the characterization of this initial set of 193 essential metabolites, they were grouped into their relevant submetabolism, and charted on the graph as functions of the portion of essential metabolites, average number of reactions that surround the essential metabolites, and average essentiality of reactions constituting the submetabolisms (Figure 4 and Supplementary Table VII and VIII; see Materials and methods). In this analysis, cofactor and vitamin metabolism contained the largest number of essential metabolites, followed by amino acid and lipid metabolisms, which are indicated with numbers in the parentheses for each submetabolism (Figure 4). Meanwhile, lipid metabolism topped average essentiality of metabolic reactions, while nucleotide metabolism had the most abundant reactions surrounding essential metabolites. As large sets of essential metabolites were found in the metabolisms of cofactors, vitamins, lipids, and amino acids even when these compounds (or precursors) were provided in the arbitrary complex medium (simulating human body condition), there is a high probability of identifying drug targets in these metabolisms. Furthermore, this analysis shows that the portions of essential metabolites (third numbers for each submetabolism in Figure 4) and the average essentiality of reactions in each submetabolism were overall well correlated. The submetabolism containing a larger portion of essential metabolites, represented by a bigger bubble size in Figure 4, tends to have a higher average essentiality of reactions. Finally, nucleotide metabolism has the largest average number of reactions surrounding essential metabolites, probably because it includes the nucleotide salvage pathway, which is highly interconnected, and possesses several important energy carriers, such as ATP. Such high value in this category leads to the possibility that one of those surrounding reactions might also be present in the host organism, including human. In fact, as discussed later, no final essential metabolites were selected from this nucleotide metabolism partly because of this reason. Taken together, analysis of essential metabolites provides several important features of each submetabolism in this prokaryotic organism, in particular with regard to the drug target selection. In addition, synthetic lethality caused by essential metabolites was predicted, which showed six sets of synthetic lethal reactions (Supplementary Table IX).
Essential metabolites of V. vulnificus predicted with VvuMBEL943 were compared with two Gram‐negative bacteria, E. coli and Helicobacter pylori whose genome‐scale metabolic networks were previously reported (Reed et al, 2003; Thiele et al, 2005). The percentage of essential metabolites in V. vulnificus was 25.2%, which is comparable to that of E. coli (24.8%), but is much lower than that of H. pylori (40.9%). The smaller genome size (1.7 Mbp) of H. pylori seems to be the reason for the higher percentage of essential metabolites. In addition, 98 essential metabolites predicted in V. vulnificus were also predicted as essential in both E. coli and H. pylori. These 98 commonly predicted essential metabolites are those highly associated with biomass composition (Supplementary Table X). The numbers of essential metabolites exclusively identified in E. coli and H. pylori were 21 and 17, respectively.
As not all enzymes associated with essential metabolites satisfy the requirements to be good drug targets, this initial set of essential metabolites were filtered on the basis of predetermined criteria, mainly organism specificity and maximal impact on the biological robustness, in order to reduce the number of initial target candidates while identifying the most effective ones (Figure 3) (Kim et al, 2010a). The first step is to remove so‐called currency metabolites, such as ATP and NADH, which universally exist across organisms; 30 essential metabolites were removed from the candidate list. Next, metabolites consumed by multiple reactions (⩾2) were only considered, removing 81 metabolites. Selection of essential metabolites consumed by multiple reactions is based on the rationale that a drug mimicking the structure of the essential metabolite will have a higher chance of binding to multiple enzymes involved in multiple reactions. This means that it might be possible to reduce the chance of resistance development by single enzyme mutation in the pathogenic microorganism (Martinez and Baquero, 2000; Silver, 2007).
The last two steps are the comparison against human metabolism to avoid possible side effects of drugs attacking metabolites or homologous enzymes in the human body. Essential metabolites were removed if they are present in the comprehensive human metabolic network (Duarte et al, 2007), causing an additional 66 metabolites to be removed. Also, BLASTP was performed on the enzymes consuming the remaining essential metabolites against human proteins to remove homologous ones (E‐value smaller than 0.01). As a result, five essential metabolites were predicted, including 2‐amino‐4‐hydroxy‐6‐hydroxymethyl‐7,8‐dihydropteridine (AHHMP), d‐glutamate (DGLU), 2,3‐dihydrodipicolinate (DHDP), 1‐deoxy‐d‐xylulose 5‐phosphate (DX5P), and 4‐aminobenzoate (PABA) (Table III and Figure 5). All these essential metabolites possess two outgoing reactions, which have mostly been demonstrated to be good drug targets through experimental evidence. Interestingly, DX5P is a critical intermediate for isoprenoid biosynthesis and peptidoglycan assembly in bacteria, and thus its surrounding enzymatic reactions have been suggested to be promising drug targets, although underexplored yet (Walsh, 2003).
Experimental validation of drug targets
The five finally selected essential metabolites, which were predicted through the metabolite essentiality analysis and subsequent filtering process using the criteria of organism specificity and maximal impact on the biological robustness, have a total of nine consuming reactions catalyzed by seven enzymes, each encoded by a single gene (Figure 5 and Table III). These seven genes were selected as drug targets, and subjected to further experimental validation. It is interesting to note that all of these genes are located on chromosome 1, which is known to contain mostly housekeeping genes as discussed above.
To validate the essentiality and lethality of the final drug targets, each gene was knocked out by the insertion of a suicide vector in the chromosome or by in‐frame deletion by using the allelic exchange method (Materials and methods). The essentiality of the gene was judged by whether or not a specific mutant for each gene is created (Figure 5). If any target gene is essential for the survival of V. vulnificus, introduction of an insertion or deletion mutation to the gene should be lethal. For the five target genes, VV10567, VV11175, VV11644, VV11691, and VV11866, it was not possible to create insertion or deletion mutants by any means after massive experimental trials. For glutamate racemase (VV11175), it was originally predicted to be nonessential in silico because VvuMBEL943 obtains DGLU from the arbitrary complex medium, which, at first glance, appears to contradict our experimental results. Yet, gene knockout experiments were performed in a rich (heart infusion) medium, which is composed of eukaryotic ingredients lacking DGLU. According to this, essentiality of VV11175 was predicted again with the transporter for DGLU uptake blocked in VvuMBEL943, and VV11175 was then predicted to be essential, which becomes consistent with the experimental data. Re‐simulation of VvuMBEL943 with blocked consumption of DGLU also predicted the same 193 essential metabolites. For two other target genes, VV10580 and VV11568, deletion mutants could not be obtained, but insertion mutants could be generated; these insertion mutants showed serious growth defect and formed tiny colonies on heart infusion agar plates. Taken together, seven essential drug targets identified by the metabolite‐centric approach together with filtering criteria could be experimentally validated to be lethal in V. vulnificus upon their removal.
Next, gene (reaction) essentiality simulations were also performed for the seven drug target genes identified by metabolite essentiality and subsequent analyses. All genes except for VV11644 were predicted to be essential by in silico gene essentiality simulations. On the other hand, all seven genes were found to be essential by experiments. There seems to be a nutritional component in the arbitrary complex medium that is absent in the heart infusion medium, which made VV11644 nonessential in silico. Finally, although the confidence level of the results of the above gene knockout experiments is high as massively repeated experiments have been performed, they are so‐called negative experiments (e.g., no mutant generation upon gene knockout). Thus, positive killing experiments using drug‐like molecules targeting the candidates were also performed.
Screening of drugs based on essential metabolites
On the basis of the structural information on the five essential metabolites predicted above, whole‐cell screening was performed using their analogs for possible antibacterial discovery. In this study, screening only structural analogs of the essential metabolites, rather than a whole chemical compound library, reduces the typical number of chemical compounds to be screened through high‐throughput assays. Furthermore, recent analyses indicating that effective drugs are structurally highly similar to endogenous metabolites helped to motivate this approach, in which compounds to screen were selected based on their structural similarity to our predicted essential metabolites (Dobson et al, 2009).
For the identification of potential antibiotics that can efficiently kill V. vulnificus, we screened 352 chemical compounds, which are structurally similar to one of the five essential metabolites, obtained from Korea Chemical Bank at Korea Research Institute of Chemical Technology (Daejeon, Korea). Among several hit compounds, compound 24837, an analog of PABA, was the most potent in terms of cell inhibition capability (Figure 6A). An in vitro enzyme assay confirmed that compound 24837 targets dihydropteroate synthase (Supplementary Figure 2).
In order to more precisely characterize compound 24837, a time‐kill assay was performed. As a result, the minimal inhibitory concentration (MIC) and the minimal bactericidal concentration (MBC) of compound 24837 were determined to be 2 and 4 μg/ml, respectively (Figure 6B). No viable cells were observed in 3 h after the treatment with 4 μg/ml of compound 24837, while cells (1.8 × 107 CFU/ml) were completely killed within 1 h when treated with 8 μg/ml of compound 24837. The steepness of killing curves obviously shows that compound 24837, selected from our systematic approach, has good bactericidal activity against the pathogen V. vulnificus. Meanwhile, sulfamethoxazole, an existing chemotherapeutic agent for V. vulnificus, which also interferes with folate biosynthesis, failed to inhibit the growth even at 8 μg/ml (Figure 6B). In the cases of the existing antimicrobials, the MBC ranges of gentamicin and enoxacin for susceptible E. coli strains are 1.2–1.9 and 0.6–0.8 μg/ml, respectively (Liu et al, 2004), while the MBC range of vancomycin against methicillin‐resistant Staphylococcus aureus is 0.5–2.0 μg/ml (Vidaillac et al, 2009). Thus compound 24837, without further improvement, is found to possess sufficient antibacterial activity against V. vulnificus. Although the small‐molecule drug identified here needs to be further optimized by considering other aspects of good antibiotics, the proof‐of‐concept approaches taken in this study can serve as a general strategy for discovering novel antibiotics and drugs based on systems‐level analysis of metabolic networks.
In this study, we reported a systems biological approach for characterizing an organism and transforming genomic information into efficient drug discovery. The V. vulnificus genome was partially re‐sequenced and fully re‐annotated as accurate initial genomic information is crucial for a better understanding of this organism, as well as subsequent metabolic modeling and prediction of drug targets. Then, the genome‐scale metabolic network VvuMBEL943 was reconstructed and used to predict drug targets through a metabolite‐centric approach. Finally, structural information on five essential metabolites was used to select structural analogs for whole‐cell antibiotic screening, resulting in several hit compounds, one of them being compound 24837, which showed a MIC of 2 μg/ml.
The use of the systems biological approaches, especially reconstruction and simulation of genome‐scale metabolic networks, appears to be very effective during the early phase of drug discovery because it systematically predicts condition‐independent drug targets at agreeable accuracy and in a cost‐effective manner, as demonstrated in this study. Also, metabolite essentiality analysis was proven to be useful for the initial selection of structural chemical analogs and subsequent whole‐cell screening. Furthermore, it should be possible to rationally design a drug based on the structure of essential metabolites, which could simultaneously disrupt their associated outgoing reactions (Dobson et al, 2009; Kim et al, 2010b).
It should be noted that genes excluded from our final drug target list do not necessarily mean that they are completely disqualified as drug targets. In fact, the criteria for selecting final drug targets in this study might have been rather strict, so that well‐known target genes were unexpectedly filtered out. For instance, the pyrH gene encoding UMP kinase, which was previously validated as an essential in vivo survival factor (Lee et al, 2007), was not considered for the final target gene because the reaction catalyzed by this gene product interacts with currency metabolites, including ATP, ADP, UMP, and UDP despite their essentiality in the organism. The corresponding reaction (R342, PyrH) was predicted to be essential in silico (Supplementary Table VIII) and humans do not have this enzyme, suggesting it to be an ideal target as well. Hence, enzymes associated with discarded essential metabolites are also worth considering, yet different drug discovery strategies need to be undertaken, for instance rational design of the drug that binds to the target enzyme instead of analog‐based screening so as not to disturb essential metabolites present in humans. Moreover, new criteria that also consider genes preferentially expressed in vivo upon infection (Kim et al, 2003) need to be taken into account.
Although metabolic genes of V. vulnificus are distributed both in chromosome 1 and 2, it is intriguing to find that 23.8% of metabolic genes in chromosome 1 (123 out of 517 genes) were predicted to be essential during the cell growth in an arbitrary complex medium, while only 11.5% of those in chromosome 2 (18 out of 156 genes) were essential (Supplementary Table VIII). Also, the final drug target genes listed in Table III are exclusively encoded by chromosome 1. This supports the hypothesis that smaller chromosomes, namely chromosome 2, of Vibrio species might have evolved from a plasmid origin because they possess fewer essential metabolic genes compared with chromosome 1, and lack authentic chromosomal replication origins (Heidelberg et al, 2000). The primordial plasmid seems to have evolved to the second chromosome by expanding its genomic repertoire through gene acquisition by horizontal gene transfer. For the same reason, chromosome 1 seems to be the primordial genome because it contains a higher portion of essential metabolic genes or housekeeping genes necessary for the cellular survival. This type of analysis will be useful for the studies on essential metabolic genes in microorganisms having multiple chromosomes.
Two existing antibacterials, sulfamethoxazole and trimethoprim, which are known to be bacteriostatic, deserve further discussion in association with compound 24837. Sulfamethoxazole structurally competes with PABA, while trimethoprim specifically inhibits dihydrofolate reductase. Their synergistic combination occasionally shows bactericidal activity by inhibiting sequential enzymes of folate biosynthetic pathway (Allan and Moellering, 1985; Zander et al, 2010). However, compound 24837, a structural analog of PABA, screened in this study showed bactericidal activity against V. vulnificus CMCP6 without combination with other inhibitors. This unexpected finding goes against the long‐lasting idea that PABA analog interfering with folate biosynthetic pathway is bacteriostatic, and suggests that final targets predicted from our systems‐oriented drug‐targeting studies could serve as bactericidal targets.
Finally, further studies need to be done on the proof‐of‐concept hit compound 24837 and the strategy presented in this study. Although an in vitro enzyme assay showed that compound 24837 inhibits dihydropteroate synthase, it might also bind to other enzyme targets, in which case several system‐wide approaches, for instance omics techniques (Freiberg et al, 2004), can be undertaken. Also, it is necessary to examine whether the strategy employed in this study can efficiently identify bactericidal compounds for other targets as well, and whether the identified hit compounds are feasible for future optimization. After all, all these efforts would contribute to filling the gap between genomic studies and drug discovery.
Materials and methods
Bacterial strains and growth conditions
V. vulnificus CMCP6 is a clinical isolate from Chonnam National University Hospital, and was used throughout this study. Bacterial cells were grown in 2.5% NaCl M9 medium with glucose or glycerol as a sole carbon source for experimental validation of the constructed VvuMBEL943. They were also grown in heart infusion or Mueller Hinton medium for specific studies as detailed below.
Genome sequence update, re‐annotation, and analysis
After our initial submission of V. vulnificus CMCP6 genome data to GenBank in 2002 (AE016795 and AE016796), mainly based on COG annotation, the genome sequence and annotation were substantially improved. First, re‐sequencing of low‐quality and/or low‐depth regions was performed by Sanger sequencing of PCR products directly amplified from the genomic DNA. Editing of annotation accompanying the sequence correction, as well as modification of translation initiation sites inferred from the BLAST analysis results against known protein sequence database were conducted manually. Hypothetical genes overlapping with other known genes or ones with extrinsic evidence were chosen and reassigned with putative gene names. The final set of protein sequences was searched against TIGRFAMs (Haft et al, 2003) and COG (Tatusov et al, 2003). The entire genome sequence was also subjected to automatic annotation by the RAST server (Aziz et al, 2008). When assigning functional information to each gene, the highest priorities were given to the TIGRFAMs search results. Finally, the remaining genes were given names after the automatic RAST annotation. As the RAST server with its own gene prediction method was used, information was assigned only when the gene calls fully overlap with original ones. The species of the family Vibrionaceae chosen for the comparative analysis were V. parahaemolyticus RIMD 2210633 (Makino et al, 2003), V. cholerae N16961 (Heidelberg et al, 2000), V. fischeri ES114 (Ruby et al, 2005), V. harveyi ATCC BAA‐1116, and V. splendidus LGP32 (Le Roux et al, 2009). Protein sequences were clustered using the BLASTCLUST program of NCBI.
Reconstruction of V. vulnificus genome‐scale metabolic network
The genome‐scale metabolic network was reconstructed according to the procedure previously described (Kim et al, 2007c; Thiele and Palsson, 2010). Reactions and metabolites incorporated in VvuMBEL943 are shown in Supplementary Tables III and IV. Briefly, the initial version of the network was first reconstructed based on the metabolic pathways of V. vulnificus present in the KEGG (Kanehisa et al, 2006). Even though the metabolic information of V. vulnificus present in the KEGG might show some discrepancies with our genome annotation data, those maps were first adopted to generate comprehensive overall picture of the metabolic network. Then, the metabolic network was refined based on the information present in MetaCyc database (Caspi et al, 2008) and the genome annotation data to confirm reaction reversibility, gene‐to‐protein relationships, and predicted functions for each gene incorporated in the network. In the same way, transport reactions were incorporated based on the best available information from literature and TransportDB (Ren and Paulsen, 2005). Also, biomass composition was determined based on literature information except for the compositions of amino acids and fatty acids, which were experimentally determined in this study (Supplementary Table V). Finally, the network was validated by comparing simulation results with experimental results from literature, and accordingly further refined so as to use it for in silico drug‐targeting analysis. The metabolic network was reconstructed using MetaFluxNet (Lee et al, 2003), and was converted into GAMS (GAMS Development Corp., Washington DC, USA) for subsequent drug‐targeting simulations. The SBML version of VvuMBEL943 was deposited to the BioModels Database (accession number: MODEL1011300000).
Constraints‐based flux analysis for model validation and simulation
Constraints‐based flux analysis is an optimization‐based simulation technique based on the mass balance of intracellular metabolites (Kim et al, 2008; Park et al, 2009). A set of mass balances around metabolites yields a stoichiometric model Sij·vj=0 with an assumption of pseudosteady state, in which Sij is a stoichiometric coefficient of a metabolite i in the jth reaction, and vj is flux of the jth reaction given in mmol/gDCW/h. The metabolic network constructed in this study is an underdetermined system, having infinite sets of solutions. Thus, linear programming (LP) was performed by maximizing the cell growth rate, represented by the overall macromolecule composition of the cell (Supplementary Table V), with constraints of mass conservation, reaction thermodynamics and metabolic capacity for each reaction in order to calculate intracellular fluxes as follows:
where bi is the net transport flux of metabolite i, and is zero if the metabolite is an intermediate. αj and βj are the lower and upper bounds of the flux of the jth reaction, respectively. Hence, αj is set to zero for irreversible reactions, while unconstrained for reversible reactions. Uptake rates of nutrients were assumed to be <2 mmol/gDCW/h, based on the growth characteristics of V. fischeri (Droniuk et al, 1987). For simulation of VvuMBEL943 under the undefined complex media, including heart infusion and Luria‐Bertani (LB) broth, uptake rates for the nutrients, mostly amino acids and nucleotides, examined to be present in those complex media were constrained to be <2 mmol/gDCW/h if their associated transporters are part of the network model. Except for the validation simulations, all the simulations were performed under the arbitrary complex medium.
For metabolite essentiality, each metabolite was removed one by one. This was simulated by deleting all the outgoing (consuming) or incoming (producing) reactions around the metabolite to be removed. Details can be found from our previous paper (Kim et al, 2007b). If removal of a certain metabolite leads to zero cell growth, this metabolite is deemed essential (Supplementary Table VII).
Analysis of essential metabolites
For the analysis of essential metabolites presented in Figure 4, 193 essential metabolites were grouped into their relevant submetabolism, which refers to the metabolism that contains a biosynthetic pathway for a corresponding metabolite. For instance, NAD was included in metabolism of cofactors and vitamins as this submetabolism contains biosynthetic pathways for NAD. In this analysis, all nucleotides, including ATP, were grouped into nucleotide metabolism, and oxidized and reduced thioredoxin (OTHIO and RTHIO), CO2, NH3, and Pi were not considered, as they do not belong to a particular metabolism. The portion of essential metabolites in each submetabolism was calculated by dividing the number of essential metabolites by the total number of metabolites participating in the submetabolism. Average essentiality is the average value of essentiality (=1−gΔ/g) of reactions in the submetabolism where gΔ and g each indicates the predicted cell growth rate of the single reaction mutant and the wild‐type using constraints‐based flux analysis under the arbitrary complex medium (Edwards and Palsson, 2000; Guimera et al, 2007). To calculate the growth rate of the single reaction mutant, each reaction was deleted one at a time by setting the reaction flux to zero while maximizing the biomass formation equation (Supplementary Table VIII).
Construction of predicted drug target gene mutants of V. vulnificus by homologous recombination
To verify essentiality of predicted drug target genes, specific mutants of the genome project strain V. vulnificus CMCP6 were constructed both by the insertional inactivation with a suicide vector and the in‐frame deletion by using allelic exchange (Miller and Mekalanos, 1988). For the insertional inactivation of the drug target genes, truncated drug target gene fragments were amplified by PCR and ligated into the suicide vector pDM4 (Milton et al, 1996). The primer pairs used for the amplification of the truncated DNA fragments of each gene are listed in Supplementary Table XI. The PCR primers had overhangs recognized by one of restriction enzymes cutting the multicloning site of pDM4. The ligated DNA was transformed into E. coli SY327 λ pir. Plasmids amplified in E. coli SY327 λ pir were isolated and transformed into E. coli SM10 λ pir (Miller and Mekalanos, 1988), and subsequently transferred to V. vulnificus CMCP6 by conjugation. Chloramphenicol‐resistant transconjugants having the mobilized plasmid integrated into their chromosome by homologous recombination were selected on thiosulfate‐citrate‐bile‐sucrose (TCBS) agar plates containing 2 μg of chloramphenicol/ml as described previously (Milton et al, 1996). Nonpolar in‐frame deletion mutants of the putative drug target genes were constructed by a modified overlap extension PCR method (Horton et al, 1989). The method consists of two steps. First, we amplified two PCR fragments corresponding to upstream and downstream regions of each target gene by normal PCR reaction. Reverse primer for upstream fragment and forward primer for downstream fragment included DNA sequences (bolded) at the 5′ end that are complementary to each other that would allow joining of the upstream and downstream fragments (Supplementary Table XI). The resulting two fragments were purified by a PCR purification kit (QIAGEN), and then used as templates for next overlap extension PCR reaction. The second PCR was performed using the forward primer of upstream fragment and the reverse primer of downstream fragment having appropriate restriction overhang for cloning, which will generate a fused DNA fragment consisting of up and downstream DNA fragments generated in the first PCR step. The resulting in‐frame deleted amplicons of each of the genes were cloned into pCR®2.1‐TOPO® vector (Invitrogen). The DNA insert having in‐frame deletion of each gene was re‐cloned to pDM4, and the resulting ligate was transformed into E. coli SM10 λ pir. The plasmid in E. coli SM10 λ pir was transferred to V. vulnificus CMCP6 by conjugation, and the transconjugants were selected on TCBS agar plate containing chloramphenicol as described above. The transconjugants were plated onto 2.5% NaCl heart infusion agar plate containing 10% sucrose to select clones that experienced the second homologous recombination event, forcing excision of the vector sequence and leaving only a mutated or wild‐type allele of the genes. Each insertional or in‐frame deletion mutation was confirmed by PCR of the chromosomal DNA from respective mutants.
Whole‐cell screening of structural analogs of essential metabolites
Structural analogs of the five essential metabolites were searched against the in‐house chemical compound library of Korea Chemical Bank at Korea Research Institute of Chemical Technology (Daejeon, Korea) using the Pipeline Pilot 6.0 with MDL Public Keys and Tanimoto coefficient of 0.5 as a cutoff (Accelrys, San Diego, CA, USA). As a result, 352 chemical analogs were identified, and subsequently obtained from Korea Chemical Bank. Whole‐cell screening for the MIC was then performed in accordance with the CLSI guideline (CLSI, 2006). Bacterial culture grown to OD600 of 0.10–0.15 in Mueller Hinton broth were 10‐fold diluted, and 5 μl was inoculated into each well whose working volume was 100 μl in the 100‐well plates. Bacterial cells were then cultured for 20 h and monitored using BioScreen C (Oy Growth Curves Ab Ltd, Finland). Chemical compound or analog was considered active if the maximal concentration of cells treated with the corresponding chemical compound showed more than 80% reduction compared with that of the untreated control sample.
V. vulnificus CMCP6 was cultured in 2.5% NaCl heart infusion broth at 37°C overnight with shaking at 220 r.p.m. Next day, the culture was diluted by 100‐fold with 2.5% NaCl Mueller Hinton broth containing compound 24837 or sulfamethoxazole. Viable cell counting was carried out by the pour plating method at 0, 1, 3, and 6 h after incubation at 37°C without agitation.
We are grateful to Korea Chemical Bank for the preparation of chemical compounds and Dr Yu‐Sin Jang for his valuable comments on the enzyme assay. This work was supported by the Korean Systems Biology Research Project (20100002164) and World Class University program (R322009000101420) of the Ministry of Education, Science and Technology through the National Research Foundation of Korea. JHR was supported by the Regional Technology Innovation Program (No. RTI05‐01‐01) of the Ministry of Knowledge Economy.
Author Contributions: SYL and JHR designed the research, coordinated the project, and analyzed the data. HUK performed metabolic modeling, whole‐cell screening and enzyme assay. HUK and TYK predicted drug targets. HEC and JJK conducted partial genome re‐sequencing, and HJ re‐annotated the updated genome. SYK constructed mutants, and performed various cell growth experiments. KYY developed the chemical analog. HUK, SYK, HJ, JHR, and SYL wrote the paper.
Conflict of Interest
The authors declare that they have no conflict of interest.
Supplementary methods, Supplementary Figures S1–2, Supplementary Tables SI, SII, SV–XI
Supplementary Table III
List of reactions
Supplementary Table IV
List of metabolites
SBML file for Vibrio genome‐scale metabolic model
This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
- Copyright © 2011 EMBO and Macmillan Publishers Limited