Global systems biology, personalized medicine and molecular epidemiology

Jeremy K Nicholson

Author Affiliations

  • Jeremy K Nicholson, Department of Biomolecular Medicine, Faculty of Medicine, Imperial College London, South Kensington, London, UK

Systems biology in individuals and populations

One of the great challenges for 21st century medicine is to deliver effective therapies that are tailored to the exact biology or biological state of an individual to enable so‐called ‘personalized healthcare solutions’. Ideally, this would involve a system of patient evaluation that would tell clinicians the correct drug, dose or intervention for any individual before the start of therapy. A practical approach to this evaluation is the concept of patient stratification in which individuals are biologically subclassified (classically according to some genetic features) and biofeatures modelled in relation to outcome. In principle, such stratification for personalized therapy can be applied to drug safety and efficacy modelling and to more general healthcare paradigms involving optimized nutrition and lifestyle management.

Of course, truly personalized treatments, even if they can be developed and applied widely, will lamentably always be a luxury of the world's richest citizens and nations. So in some respects, personalized healthcare might appear to be at the opposite end of the medical spectrum to the subject of epidemiology, in which disease risk factors and disease incidence are studied in populations rich and poor alike. Systems biology provides us with a common language for both describing and modelling the integrated action of regulatory networks at many levels of biological organization from the subcellular through cell, tissue and organ right up to the whole organism. The relatively new science of molecular epidemiology concerns the measurement of the fundamental biochemical factors that underlie population disease demography and understanding ‘the health of nations’, and this subject naturally lends itself to systems biology approaches. Hence, systems biology is certain to have a major future role both in the development of personalized medicine and in molecular epidemiological studies.

Populations are, of course, made up of individuals and, in principle, there are important unifying features that can be considered from a systems perspective in which biological parameter variability in individuals and their statistical description in large populations can be used to interrogate the outcomes of therapeutic interventions and global patterns of disease distribution. Personalized healthcare and molecular epidemiology are thus effectively two sides of the same ‘systems biology coin’; the essential differences are with respect to the type of medical end points or outcomes that are to be modelled (Figure 1). Metabonomics (see Box 1 for definitions of terms) offers a practical approach to measuring the metabolic end points that link directly to whole system activity and metabolic profiles are determined by both host genetic and environmental factors (Nicholson et al, 2002).

Figure 1.

Relationships between systems biology, personalized healthcare and molecular epidemiology (dotted lines indicate indirect connections or influences).


Most personalized approaches have so far been based on measuring genotype variations relating to polymorphisms in drug‐metabolizing enzymes such as cytochrome P450 isoenzymes and N‐acetyl transferases (Meyer and Zanger, 1997; Eichelbaum and Burk, 2001; Srivastava, 2003). As there are many examples of adverse drug reactions being linked to specific enzymatic deficiencies or mutations (Meyer and Zanger, 1997; Eichelbaum and Burk, 2001; Srivastava, 2003), it seems perfectly reasonable to pursue genetically based personalized medicine strategies. However, pharmacogenomic results have thus far proved to be surprisingly disappointing, partly because of the unexpected complexity of the human genome and the difficulties in accurately and unequivocally describing human genotypes and phenotypes (Nebert and Menon, 2001; Nebert et al, 2003; Nebert and Vessell, 2004). Moreover, when considering the wider aspects of human health, it is clear that most major diseases are subject to strong environmental influences, and the majority of people in the world die from what are, in the broadest sense, environmental causes. At the personal level, external influences also affect drug metabolism and toxicity, and individual outcomes of a drug intervention are the result of conditional probabilistic interactions between complexes of drug‐metabolizing enzyme genes, a range of metabolic regulatory genes and environmental factors such as diet (Nicholson et al, 2004).

Even the basic concept of a ‘specified’ human population is actually confusing and has often involved ill‐defined notions of ethnicity, which are associated with historical, culturally biased thinking rather than the genuine and usually small genetic differences between human population groups. The overall lack of genetic variation between populations is remarkable in itself and this is a consequence of humans having moved out of Africa only ca. 100 000 years ago. Thus, according to microsatellite studies, only about 5–10% of the total human genetic variance actually occurs between populations or ethnic groups (Cavalli‐Sforza and Feldman, 2003). Phenotypically, of course, population subgroups around the world vary widely, as do human disease distributions that are related to diet and environmental factors. There are also well‐known differences in drug metabolism (and hence toxicity potential) associated with variations in human genotype and phenotype at both individual and population levels (Meyer and Zanger, 1997; Eichelbaum and Burk, 2001; Nebert and Menon, 2001; Nebert et al, 2003; Srivastava, 2003; Nebert and Vessell, 2004). Obviously, there are many connections between the health of general populations and that of the individuals that make them up, and so it is useful to consider this from a molecular systems biology viewpoint (Figure 1). However, measurement of parameters that relate system level activities to drug interventional outcomes is in practice highly limited in applications involving large‐scale human populations (Box 2). Population stratification (in the epidemiological rather than personal sense) according to age, gender, diet, ‘ethnicity’ and socioeconomic factors is complicated by the fuzziness of some of the classes, and this complicates the modelling of these features in relation to systems biology (omics) metrics.
Thus, bridging the subjects of personal healthcare and population epidemiology via systems biology will require a pragmatic and practical approach, which leads us to the concept of ‘top‐down’ systems biology and the derivation of metabolic parameters of ‘global’ system function.


‘Top‐down’ systems biology and metabonomics

We have been advocating the use of metabolic measurement at the system level utilizing metrics obtainable from biological fluids such as urine and plasma for many years (Nicholson and Wilson, 1989). A particular advantage of biological fluid monitoring or screening is that it is minimally invasive or non‐invasive and can be applied on a large scale for human population phenotyping (Nicholson et al, 1999, 2002). The science of metabonomics deals with understanding metabolic changes of a complete system caused by interventions (Nebert et al, 2003; Dumas et al, 2006a, 2006b) and in particular we have noted that metabolic end points are the result of gene–environment interactions in their broadest sense, including extended genome and parasitic interactions (Wang et al, 2004; Dumas et al, 2006a, 2006b; Martin et al, 2006). We have previously outlined our ideas about conditional probabilistic (Bayesian) interactions between genes and environment with respect to adverse drug reactions in individuals and have suggested a hypothetical (Pachinko) model to help study and visualize these interactions (Nicholson and Wilson, 2003). In the Pachinko model, a popular Japanese pinball machine game is used as a metaphor to underscore the idea that metabolic fate results from a sequence of conditional probabilistic interactions between metabolites and components of the cellular biochemical network. In particular, drug molecules can be thought of as a tumbling shaped charge represented as a ball in the machine. Each ball (drug molecule) hits pins (representing drug‐metabolizing enzymes, the exact position of which would analogously vary with SNP variations), which transform the molecule sequentially and so alter its course through the machine (cell/body). Eventually, the drug is metabolized to a state that readily leaves the body and so exits the machine at hypothetical ‘excretion points’.
The behaviour of each individual ball is difficult to predict, but the probabilistic path of the whole population can be modelled. Thus, the environmental interaction components, for example, gut microbial metabolites, chemicals or dietary compounds, can also be visualized as other balls or shaped objects tumbling through the Pachinko machine. These agents may then block or interfere with or even enhance the drug metabolism pathways. This equates to altering the probabilities of metabolic flow through the system, and the resulting changes in the pathway utilization may be modelled using Bayesian methods. These gene–environment interactions can result in many outcomes—some of which may generate metabolites that cause cellular damage or idiosyncratic (unpredictable) toxicity. Related to this is our concept of the ‘conditional metabolic phenotype’ or CMP (Nicholson et al, 2005) in which both genetic factors and exogenous factors, such as diet, exposure to foreign chemicals and so on, interact to determine the possible outcomes of a drug or dietary intervention (Nicholson et al, 2004, 2005; Dumas et al, 2006a, 2006b). The most important feature of the CMP concept is that it represents a starting point of an individual in a multivariate metabolic space that is the result of the combination of many physical, chemical, genetic and environmental influences. We have hypothesized that it is the starting position irrespective of the relative contributions of the individual ‘vector’ components that determines the outcome of an intervention (Nicholson et al, 2004) and this is exactly the basis of the personalized healthcare paradigm.
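The Pachinko metaphor can be made concrete with a toy simulation. What follows is a minimal sketch under stated assumptions, not the model proposed by Nicholson and Wilson (2003): the pin probabilities, the additive ‘environmental shift’ and all function names are hypothetical and chosen only for illustration. Each pin deflects a ball onto a notional ‘reactive metabolite’ side with some probability, and an environmental factor is represented as a change in those probabilities. The fate of any single ball is random, but the distribution of exits for the whole population of balls is stable and can be compared between conditions.

```python
import random

def run_ball(pins, rng):
    """Drop one ball (drug molecule) through a sequence of pins (enzymes).

    Each entry of `pins` is the probability that the ball is deflected
    onto the 'reactive' side at that step. The number of such deflections
    labels the exit slot the ball leaves through.
    """
    return sum(1 for p in pins if rng.random() < p)

def exit_distribution(pins, n_balls=20000, seed=0):
    """Estimate the fraction of balls leaving through each exit slot."""
    rng = random.Random(seed)
    counts = [0] * (len(pins) + 1)
    for _ in range(n_balls):
        counts[run_ball(pins, rng)] += 1
    return [c / n_balls for c in counts]

def with_environment(pins, shift):
    """Hypothetical environmental factor: shift every pin probability
    by `shift`, clipped to the interval [0, 1]."""
    return [min(1.0, max(0.0, p + shift)) for p in pins]

def mean_slot(dist):
    """Average exit slot, i.e. mean number of reactive deflections."""
    return sum(i * f for i, f in enumerate(dist))

# Illustrative probabilities only; these are not measured values.
baseline_pins = [0.10, 0.20, 0.15]
baseline = exit_distribution(baseline_pins)
perturbed = exit_distribution(with_environment(baseline_pins, 0.15))

print("baseline exits: ", [round(f, 3) for f in baseline])
print("perturbed exits:", [round(f, 3) for f in perturbed])
```

In this sketch the environmental factor moves the whole exit distribution towards the slots with more reactive deflections, which is the sense in which such factors alter the probabilities of metabolic flow through the system.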

So how do we start to apply these ideas to real systems? ‘Bottom‐up’ modelling approaches, if viewed in the cold light of day, can never really work in the world of gigantic human phenotypic variability. Indeed, even in vitro to in vivo extrapolations of drug metabolism and toxicity within one species are notoriously unreliable, and ‘bottom‐up’ systems biology modelling poses a vastly more complex challenge because most of the quantitative features needed to make reasonable cellular models are simply not measurable in ‘intact’ humans. So approaches that appear to work very well in the systems biology of yeast or Escherichia coli cultures are not readily translatable into the modelling of either individual human or population biology.

Pharmaco‐metabonomics and prediction of drug intervention outcomes

In the alternative ‘top‐down’ approach where metrics of the systemic homeostatic activity are obtained, we have now shown a ‘proof‐of‐concept’ of a new ‘pharmaco‐metabonomic’ approach to understanding and predicting interventional outcome of drugs (such as toxicity and xenobiotic metabolism in animal model systems) based on mathematical models of pre‐dose metabolic profiles (Clayton et al, 2006). In these studies, we investigated the effects of three structurally diverse hepatotoxins in rats (galactosamine, allyl alcohol and paracetamol), which act via different mechanisms, and found that pre‐dose urinary profiles carried information about the degree of post‐dosing toxicity and, in the case of paracetamol, information about variation of drug metabolism itself. Pharmaco‐metabonomics is thus the prediction of the outcome of an intervention in individuals based on the pre‐dose metabolic state of that individual (Nicholson and Wilson, 2003; Clayton et al, 2006). In a preliminary study on galactosamine toxicity, we found that the responder/non‐responder pattern of liver damage at 24 h post‐dosing was reflected in the pre‐dose metabolic profile of the urine. This was achieved using a simple principal components analysis (which is an unsupervised method that is blind to class in its construction). In a more complex study, a supervised approach (a projection to latent structures method) was used, working with animals given a threshold toxic dose of paracetamol that produced a wide range of liver toxicity between individuals (Figure 2). Here, we found once again that there was a significant association between pre‐dose metabolic profile and post‐dose outcome with respect to liver damage severity and indeed to drug metabolism (specifically, the paracetamol to paracetamol glucuronide excretion ratio was strongly correlated with pre‐dose urinary metabolite profiles).
These studies imply that there may be future possibilities of applying this approach non‐invasively to screening humans in populations. However, practically this is still far off, and we need to extend our knowledge on the relationships between endogenous metabolic status and drug metabolism outcomes for a much wider range of drugs. Of course, there are also significant ethical issues involved with such screening procedures in man. Furthermore, we should not forget that models obtained by the integration of various ‘omics’ approaches (pharmacogenomics, pharmacoproteomics and pharmaco‐metabonomics) may have improved predictive power, which might indeed be required to get personalized healthcare to work in the real world. Indeed, we have recently shown that proteomics and metabonomics can be statistically integrated to produce new trans‐omic combination biomarkers to classify experimental disease states such as xenograft models of prostatic cancer (Rantalainen et al, 2006). However, with current technology, the scale‐up of multi‐omics strategies to man would be impractical and prohibitively expensive.
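The unsupervised step mentioned above, in which a principal components analysis is built without reference to class and the classes are examined only afterwards, can be illustrated with a short sketch. This is not the analysis of Clayton et al (2006). The data are synthetic, the ‘responder’ shift in two variables is an assumption made purely so that there is structure to find, and the leading component is obtained by simple power iteration rather than by any particular chemometrics package.

```python
import random

def centre(rows):
    """Subtract the column means so each variable has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def first_component(rows, n_iter=200, seed=0):
    """Leading principal component by power iteration on X^T X.

    Class labels are never passed in: the method is unsupervised.
    Returns the unit-length loading vector and the sample scores.
    """
    rng = random.Random(seed)
    Xc = centre(rows)
    p = len(Xc[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    for _ in range(n_iter):
        scores = [sum(a * b for a, b in zip(row, v)) for row in Xc]
        v = [sum(row[j] * s for row, s in zip(Xc, scores)) for j in range(p)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    scores = [sum(a * b for a, b in zip(row, v)) for row in Xc]
    return v, scores

# Synthetic "pre-dose profiles": 6 variables, 20 animals per class.
rng = random.Random(42)

def profile(shift):
    """One synthetic profile; `shift` is an assumed class effect."""
    row = [rng.gauss(0.0, 1.0) for _ in range(6)]
    row[0] += shift
    row[1] += shift
    return row

non_responders = [profile(0.0) for _ in range(20)]
responders = [profile(3.0) for _ in range(20)]
loadings, scores = first_component(non_responders + responders)

# Labels are used only now, to inspect the scores after the fact.
mean_non = sum(scores[:20]) / 20
mean_resp = sum(scores[20:]) / 20
separation = abs(mean_resp - mean_non)
print("separation of class means on PC1:", round(separation, 2))
```

Because the sign of a principal component is arbitrary, the sketch reports the absolute difference between the class means on the first component.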

Figure 2.

Pharmaco‐metabonomic modelling procedure: spectroscopic data on pre‐dose metabolic fingerprints (X matrix) from biofluids such as urine and plasma are statistically linked to outcome (quantitative toxicity (Y1) and drug metabolism (Y2) matrices) of a drug intervention via multivariate statistics such as partial least squares methods. Typically, 20–50% of all data is used in the training set construction. The predictive power of the models is then tested using a test set or a cross‐validation set to assess model robustness. It is also possible, as an additional test to avoid overfitting of data, to deliberately permute the training set matrices to induce a false model that should have a very low predictive capability.
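The procedure summarized in Figure 2 can be sketched in a few lines. The example below is an illustration under several assumptions and is not the published modelling. It uses synthetic data in place of spectra, a single outcome vector in place of the Y1 and Y2 matrices, a one‐component partial least squares regression (PLS1) written out by hand, a 40% training fraction and a Q²‐style statistic computed on the held‐out test set. The permutation step refits the model after shuffling the training outcomes, which should give a ‘false’ model with little predictive power.

```python
import random

def column_means(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit_pls1(X, y):
    """Fit a one-component PLS regression (PLS1) on mean-centred data."""
    x_mean = column_means(X)
    y_mean = sum(y) / len(y)
    Xc = [[v - m for v, m in zip(row, x_mean)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance between X and y.
    w = [sum(row[j] * yi for row, yi in zip(Xc, yc)) for j in range(len(x_mean))]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores t = Xw and the inner regression coefficient of y on t.
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    b = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, w, b

def predict(model, X):
    x_mean, y_mean, w, b = model
    return [y_mean + b * sum((v - m) * wj for v, m, wj in zip(row, x_mean, w))
            for row in X]

def q2(y_true, y_pred):
    """1 - residual sum of squares / total sum of squares on held-out data."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for pre-dose fingerprints (X) and an outcome (y).
rng = random.Random(7)
n, p = 200, 8
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0.0, 0.5) for row in X]

n_train = 80  # 40% of the data, an assumed split within the 20-50% range
X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

q2_real = q2(y_test, predict(fit_pls1(X_train, y_train), X_test))

# Permutation check: shuffle the training outcomes and refit.
perm_q2 = []
for _ in range(20):
    y_perm = y_train[:]
    rng.shuffle(y_perm)
    perm_q2.append(q2(y_test, predict(fit_pls1(X_train, y_perm), X_test)))
q2_perm_mean = sum(perm_q2) / len(perm_q2)

print("test-set Q2, real model:      ", round(q2_real, 2))
print("test-set Q2, permuted (mean): ", round(q2_perm_mean, 2))
```

A real application would normally use several latent components, chosen by cross‐validation, and many more permutations. The single component here keeps the arithmetic visible.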

The most likely near‐term implementation of pharmaco‐metabonomics would be in the pharmaceutical industry itself at the clinical trial or development stage when drugs are first going into man. Here pre‐dose metabolic models could be built and then related to quantitative metabolic fates of compounds and any observed adverse reactions. This would then lead to knowledge about the possible contraindications of a particular drug used in certain phenotypic classes of individuals, which is effectively a type of patient stratification. In any case, both early and clinical safety studies would benefit from the improved metabolic descriptions of test subjects (animal or man) and their responses to novel therapeutic agents, good or bad. It must also be said that the pharmaco‐metabonomic concept is not limited just to drug interventions. Effects of dietary modulation, pre‐biotic and probiotic treatments and other lifestyle changes could also ultimately be evaluated in this way. This is important because ‘personalized healthcare’ means different things for different people and, in general populations, it is lifestyle management not drug therapy that is most effective for disease prevention, which of course is better than having to find a cure.

Populations and molecular epidemiology: getting systems biology into man

Getting systems biology out of the laboratory into the more general human population both for screening purposes and in order to understand our own changing health patterns is a formidable challenge. Despite relentless advances in medical technology, many major indications of population morbidity and mortality such as heart disease, diabetes, obesity and cancer (all problems in which genetic and environmental factors are closely entwined) are rising all over the world. Interestingly, many of these diseases may be related to changes in the activities or composition of the gut microbiota (microbiome), which has probably been profoundly affected by our lifestyle changes (especially antibiotic use) over the last 50 or so years. In fact, the microbiome is the exact point where host genetics meets environment and can be considered to be our most integrated and influential ‘environmental’ factor (Nicholson et al, 2005). Given that humans have slowly evolved with this ‘extended genome’ of the microbiota, perturbation of this close association is potentially dangerous and, controversially, may be a root cause of many of our rapidly spreading ‘modern’ diseases (Nicholson et al, 2005). Indeed recent studies by us and others have shown that gut microbiotal variations affect the development of diet‐induced insulin resistance and type II diabetes mellitus (Dumas et al, 2006a, 2006b) and even the development of type I diabetes in experimental animals (Brugman et al, 2006), which until recently was thought of as being related to purely mammalian (human) genome problems. Thus, wherever we turn we see hypercomplexity in disease development and this must be taken into account in systems biology disease modelling if we are ever going to get effective treatments that actually work in man. 
In examining human populations for molecular epidemiological purposes, it will probably be important to measure metagenomic features of the gut microbiome, which strongly influences exact mammalian metabolic phenotypes of mice and men (Holmes and Nicholson, 2005; Gavaghan‐McKee et al, 2006) and so, using the pharmaco‐metabonomic argument, must also influence disease development and possibly optimized therapeutic interventions in individuals and populations. So as systems biology moves forward with the strong driver of personalized medicine, we will also be able to apply these strategies for looking at the changing demography of human disease around the world. The creation of personalized health science for the ‘rich nations’ should, it is hoped, also benefit the people of developing nations, perhaps especially those countries that are trying to Westernize their economies and lifestyles, and in so doing are now acquiring Western disease patterns at an alarming rate.