While our genomes are far from deterministic blueprints from which an entire life can be mapped out in advance, they do vary significantly from person to person - and the study of that variation has yielded many important insights into both the causes of disease and why some of us are more susceptible to certain diseases than others. At the extreme, possessing certain variants will almost certainly result in a specific disease.

The discovery of disease-causing variation in our genomes is less straightforward than you might imagine. Each person’s genome contains many thousands of variants, most (possibly all) of which are completely harmless. Ironically, to understand the significance of the variation in any single genome, that genome must be analysed in comparison to many thousands of others. Context is key.

This is why the paper published in Nature this week by a global consortium of researchers – led by the MacArthur group at the Broad Institute – is so important in the quest to provide diagnoses, prognoses and even treatments to those with genetic disease. It brings together the largest ever openly and freely accessible collection of human exomes (the part of the genome that encodes proteins) and in doing so enables clinical geneticists around the world to compare their patients’ genomes to this collection and to understand whether the variants they contain are likely to be responsible for causing their disease. As the paper shows so elegantly, the key to doing this successfully is to understand how common or rare any individual variant may be. So if a patient with a very rare inherited cardiomyopathy has a genomic variant that ‘damages’ a gene important in heart function, their clinical geneticist can look it up in the ExAC database to find out if the variant is as rare as the disease. If the variant is found in lots of other unrelated people’s genomes, people who do not (as far as we can tell) have the rare inherited cardiomyopathy, it is unlikely to be causing this condition in the patient in question. A companion paper from the Watkins group in Oxford uses the ExAC data set to demonstrate this point and, importantly, to show why the lack of a large enough comparison population of genomes can lead to potentially damaging misdiagnosis.
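
To make the "is the variant as rare as the disease?" reasoning concrete, here is a minimal sketch in Python. It is a simplified rule of thumb, not the method used in either paper: it assumes a dominant condition in which each affected person carries one copy of a causal variant, and the function names, the prevalence figure and the allele counts are all hypothetical values chosen for illustration.

```python
def allele_frequency(allele_count: int, allele_number: int) -> float:
    """Fraction of sampled chromosomes in the reference set that carry the variant."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number


def max_plausible_frequency(prevalence: float, penetrance: float = 1.0) -> float:
    """Rough upper bound on the frequency of a single causal variant.

    Assumption (not from the papers): a dominant disease, where each affected
    person carries the variant on one of their two chromosomes. Lower
    penetrance lets a causal variant be more common than the disease.
    """
    if not 0 < penetrance <= 1:
        raise ValueError("penetrance must be in (0, 1]")
    return prevalence / (2 * penetrance)


def too_common_to_be_causal(allele_count: int, allele_number: int,
                            prevalence: float, penetrance: float = 1.0) -> bool:
    """True if the variant is more common in the reference set than the disease allows."""
    observed = allele_frequency(allele_count, allele_number)
    return observed > max_plausible_frequency(prevalence, penetrance)


# Hypothetical example: a disease affecting 1 in 10,000 people, and a
# reference set of 60,000 people (120,000 chromosomes).
prevalence = 1 / 10_000
common_variant = too_common_to_be_causal(30, 120_000, prevalence)  # seen 30 times
rare_variant = too_common_to_be_causal(1, 120_000, prevalence)     # seen once
print(common_variant, rare_variant)
```

The sketch also hints at why the size of the comparison set matters: with only a few hundred reference genomes, a variant seen zero or one times could still be far more common than the disease, and the check above could not tell the difference.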

Global diversity – why the whole genome matters

What both the ExAC paper and another paper published this week also show is that, to get the right diagnosis, it is not enough to know whether other people with or without your disease share the same potentially disease-causing genomic variant as you. You also need to know whether you share a similar genomic ancestry with them. Why does this matter? Because individual variants in individual genes often do not ‘act’ in complete isolation from one another. Genes encode proteins that act as part of complex, interdependent biological pathways.

Context matters. Understanding your disease requires finding not just other people with your disease, but other people with genomes like yours. This is the essence of personalised medicine. Indeed, the paper in NEJM demonstrates this clearly for Americans of African vs European ancestry, where the lack of ‘African’ representation in the genome databases used as context when diagnosing inherited cardiomyopathy has led to African Americans being disproportionately likely to receive an incorrect genetic diagnosis for this disease. Crucially, this does not mean clinicians can use observed or reported ‘race’ as a proxy for genomic ancestry. Subjective assessment of skin colour (or other physical, geographical or cultural attributes) often correlates very poorly with objectively measured genomic ancestry and relatedness. The only way to place us accurately in the right context is to analyse our genome.

Lessons for global and national health policy

These excellent papers offer at least three important lessons for both global and national health policy makers:

  1. Share your data, openly and quickly! The ExAC data set aggregates data from more than 20 groups of researchers, and almost 100,000 people from around the world. The willingness of the participants in their research to share their data, and the willingness of the researchers to do the same (not something to be taken for granted), will save countless lives. Policy makers must understand that this is possible not only because the data is aggregated into a large centralised resource (making it easier to use) but also because it is OPEN. Any clinician or researcher, anywhere in the world, can access this data freely. The continued insistence of many national and international genome projects, including the UK’s own 100,000 Genomes Project, on restricting access to their data to assuage concerns about patient confidentiality and to enable commercial gain is a great loss for patients worldwide who could and should benefit from this data now - as they already do from ExAC. Thankfully, within some health systems recognition is growing of the necessity to share openly the wealth of clinically acquired genetic and genomic data they possess in order to deliver safe and effective genomic medicine. However, too many private and, to their shame, public health systems and laboratories are dragging their feet on this vital issue, and patients are continuing to lose out as a result.
  2. Size does matter – It is currently quite fashionable (in some health policy circles) to sneer at talk of the power of ‘big data’ and to decry the value of combining large, independently collected, variable datasets. The ExAC dataset shows that, for genomic medicine at least, size really does matter. Time and time again the paper describes important information that was previously unobtainable, or validates hypotheses that were previously untestable, simply because ExAC is bigger than the previously available datasets of this kind.
  3. Diversity matters – This is true across healthcare, but is clearly illustrated by the struggles of genomic medicine to account for diversity. As noted by the ExAC paper and many others, the systematic underrepresentation of certain genomically distinct populations in databases of genomic variation means that individuals from those populations are poorly served by the genomic medicine services of today. Globally, there must be a focus on filling the gaps in our knowledge of genome diversity from around the world. Nationally, health systems must ensure that they have access to the knowledge of genomic diversity, and of its impact on health, that is necessary to deliver high quality care to their entire population, not just the ethnic majorities within them.