Exomic sequencing to identify rare disease variants

18 August 2009

With the rapid development and falling costs of next-generation sequencing technologies, it has been suggested that whole-genome sequencing could be used to identify rare variants that contribute to disease. However, this is not an easy task: the enormous amount of sequence variation present in individual genomes makes it difficult to validate which mutations are truly disease-causing (see previous news). Concentrating on exomic sequencing (i.e. sequencing only the protein-coding regions, or exons) may yield better results, because it allows assessment of whether a particular sequence change affects protein structure and function. In addition, sequencing the ~1% of the genome used for protein coding will be cheaper and faster than whole-genome sequencing. A recent paper published in Nature has demonstrated the feasibility of this approach for identifying rare variants associated with monogenic disease.

In their study, Ng et al. have carried out targeted sequencing of all of the protein-coding regions of eight HapMap individuals, as well as four unrelated individuals with a rare autosomal dominant disorder – Freeman-Sheldon syndrome (FSS) – to demonstrate an approach for the discovery of rare highly penetrant variants [Ng et al. (2009) Nature doi:10.1038/nature08250]. They enriched the coding sequences from the genomes by targeted capture using microarrays; the captured exomes were then sequenced using high-throughput sequencing. The quality of the exomic data was assessed in a number of ways in order to validate the sensitivity and specificity of the technique in identifying variants.

The candidate gene related to FSS was identified through a series of steps designed to eliminate non-causal background variants. First, the number of genes with one or more non-synonymous coding SNPs (i.e. those with potentially the highest impact on phenotype), splice site disruptions or coding indels in one or several FSS exomes was investigated. Filters were then applied to remove common variants present in the dbSNP catalogue (a public database of SNPs) or in the eight HapMap exomes. This narrowed the possible disease-causing candidates to a single gene, MYH3, which had previously been identified using a candidate gene approach. A disruption of this gene was observed in all four individuals with FSS but not in dbSNP or the HapMap exomes.
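
The filtering steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `candidate_genes`, the variant record layout (`id`, `gene`, `effect`), the effect labels and all of the toy data are hypothetical and were invented for this example.

```python
from collections import defaultdict

# Hypothetical labels for variant classes treated as potentially damaging
DAMAGING = {"nonsynonymous", "splice_site", "coding_indel"}


def candidate_genes(case_exomes, known_variants, control_exomes, min_cases=None):
    """Return genes carrying a novel, potentially damaging variant
    in at least `min_cases` affected individuals (default: all of them)."""
    if min_cases is None:
        min_cases = len(case_exomes)

    # Background = public catalogue plus everything seen in control exomes
    background = set(known_variants)
    for exome in control_exomes:
        background.update(variant["id"] for variant in exome)

    # For each gene, record which affected individuals carry a qualifying variant
    carriers = defaultdict(set)
    for index, exome in enumerate(case_exomes):
        for variant in exome:
            if variant["effect"] in DAMAGING and variant["id"] not in background:
                carriers[variant["gene"]].add(index)

    return sorted(gene for gene, who in carriers.items() if len(who) >= min_cases)


# Toy data (invented): two affected individuals, one control exome
cases = [
    [
        {"id": "v1", "gene": "MYH3", "effect": "nonsynonymous"},
        {"id": "v2", "gene": "GENE_A", "effect": "nonsynonymous"},
        {"id": "v6", "gene": "GENE_C", "effect": "coding_indel"},
    ],
    [
        {"id": "v3", "gene": "MYH3", "effect": "nonsynonymous"},
        {"id": "v4", "gene": "GENE_B", "effect": "synonymous"},
    ],
]
catalogue = {"v2"}  # stands in for a public SNP catalogue
controls = [[{"id": "v5", "gene": "GENE_B", "effect": "nonsynonymous"}]]

shared = candidate_genes(cases, catalogue, controls)
any_case = candidate_genes(cases, catalogue, controls, min_cases=1)
```

In this sketch, individuals are matched at the level of the gene rather than the exact variant, so unrelated individuals carrying different mutations in the same gene still count towards it. Lowering `min_cases` relaxes the requirement that every affected individual carries a qualifying variant, which corresponds to looking at genes hit in "one or several" exomes.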

The authors suggest that “direct sequencing of exomes of small numbers of unrelated individuals with a shared monogenic disorder can serve as a genome-wide scan for the causative gene”. They further suggest that this strategy may be easier to apply to recessive diseases, as there are far fewer genes with homozygous or compound heterozygous variants. The strategy may also be applied to complex common diseases, but this will require larger sample sizes and a better approach to assessing the impact of a mutation, in order to cope with the greater extent of genetic heterogeneity. The authors point out that although this approach is useful for discovering causal variants, one limitation is that it does not identify structural or non-coding variants, which may be found by whole-genome sequencing.
