26 August 2014
With the advent of ever faster and cheaper DNA sequencing, the greatest challenge to the delivery of genomic medicine is no longer to provide accessible and affordable genome sequencing (although this is not yet a completely solved problem), but is instead to understand the meaning and consequence of the variation that DNA sequencing detects in the genome. It is now straightforward to catalogue every position in a person’s genome at which they differ from another comparator genome, such as the human reference genome curated by NCBI. What is far from straightforward, however, is understanding what those variations ‘mean’ for our health now and in the future.
Some genetic variations are very well understood, and their presence in the genome can be used to accurately predict the likelihood of a person developing a particular genetic disease. For example, where a person has the deltaF508 deletion mutation in both copies of their CFTR gene, they are highly likely to develop cystic fibrosis because this small deletion effectively prevents the CFTR protein from functioning properly. There remain, however, many millions of genetic variants whose effects, either on the proteins they encode or, in the case of non-coding DNA, on the genes that they regulate, cannot be predicted. Consequently, we cannot meet the goal of genomic medicine: predicting their effects on the function of our cells, organs and ultimately our health. Notably, such mutations are responsible for many of the rare genetic diseases that remain undiagnosed in the population, and if their effects could be accurately interpreted, it is patients with these diseases who stand to benefit most in the early days of genomic medicine, through initiatives such as the 100,000 Genomes Project.
While small steps towards improving the understanding of the effect of mutations on gene function are made through the painstaking experimental evaluation of individual mutations in individual genes, and by correlating their presence with known cases of rare disease, there remains a pressing need to find higher throughput ways to measure and catalogue the effects of the complete range of genetic variation possible within the human genome. It is particularly important to determine the effects of the many rare mutations which, while they may be detected in only one or a few individuals, are more likely to be the causes of the rare genetic diseases that in aggregate affect 3 million people in the UK.
A paper published in Nature last week describes an exciting new approach to overcoming this problem. The authors have used the CRISPR/Cas9 genome editing technique to perform what they term ‘saturation mutagenesis’ of regions of two genes: the BRCA1 gene, mutations in which can increase the risk of breast and ovarian cancer, and the DBR1 gene. Their strategy was to simultaneously edit the genomes of hundreds of thousands of cells in culture (all in a single experiment), so that each cell contained a completely normal genome, except that its BRCA1 or DBR1 gene was uniquely mutated at a single position. For example, they generated cells that contained each of the 4096 different possible combinations of six DNA bases at a particular location in the DBR1 gene. Deep sequencing was then used to measure whether the mutations had ‘damaged’ the function of the gene, by measuring reductions in the level at which each mutation was present in the genomes extracted from the cells.
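The figure of 4096 follows from simple counting: each of the six positions can hold any one of the four DNA bases (A, C, G or T), giving 4 × 4 × 4 × 4 × 4 × 4 = 4096 possible sequences. As a minimal illustrative sketch in Python (this is not the authors' code, and the function name `all_kmers` is a hypothetical one chosen for this example), the full set can be enumerated as follows:

```python
from itertools import product

# The four DNA bases; each position in the sequence can hold any one of them.
BASES = "ACGT"


def all_kmers(k):
    """Return every possible DNA sequence of length k as a list of strings.

    Hypothetical helper for illustration only: it simply enumerates the
    combinations and says nothing about how a donor library is built.
    """
    return ["".join(p) for p in product(BASES, repeat=k)]


hexamers = all_kmers(6)

# Four choices at each of six positions gives 4**6 sequences.
print(len(hexamers))  # 4096
```

Enumerating the list this way makes it clear why the number of variants grows so quickly with the length of the region: every additional base multiplies the count by four.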
The brilliance of this technique is that it enables, using existing molecular biology tools, the parallel investigation of the functional consequences of an unprecedentedly large number of genomic variations. This is an exciting advance for genomic medicine in that it opens up the possibility of undertaking saturation mutagenesis in a number of genes known to be associated with disease, but in which much of the variation remains ‘of unknown significance’. The functional insights this would provide will, it is hoped, allow definitive interpretation of patients’ mutations and the delivery of accurate molecular diagnoses.
The authors do, however, sound several notes of caution. They warn that the efficiency of genome editing remains low, meaning that most cells in these experiments have unedited genomes, and so detecting the effects of mutations in the edited cell population is challenging. It is also important to note that while genome editing targeted to a single location is possible, this advance will need to be matched by the availability of functional assays to determine whether the mutation introduced truly affects the function of the gene. In some cases this will be a straightforward matter of sequencing and quantification, for example where mutations affect the transcription of a gene. In others, where the mutation has more subtle effects on the structure or function of the protein itself, these assays may need to be bespoke for each gene, introducing significant cost and complexity back into the process.
Finally, it is worth recalling that humans are the product of all of the effects of the variations in our genome, combined with the effects of our environment. Whilst the step-wise isolation of the effects of any one of these variants may be immensely powerful, it will remain important to place that variant back in the wider genomic and ‘life history’ context of each individual in order to predict its effect as accurately as possible.