Google Genomics collaborates to expand research resource

17 March 2015

Google Genomics is set to benefit from a new collaboration with Tute Genomics that will provide a database of annotated genetic variants.

Annotation is the essential next stage in genome analysis following genome sequencing. It is an automated process driven by special computational algorithms that begin to make sense of the raw sequence data, identifying all the coding regions within the genome sequence and correlating them with the known products and functions of each gene, including any known involvement of specific genetic variants in disease.

Google’s cloud-based genomic data facility, established in 2014, provides an application programming interface (API) for building data analysis software tools; its stated goal is to ‘empower fast and actionable analyses of massive genomic datasets’ by providing storage, analysis and sharing facilities. Tute Genomics provides a proprietary clinical genome interpretation platform to support diagnostic analysis.

Now 8.5 billion genomic annotations provided by Tute will be made publicly available via Google Genomics, allowing researchers (for a modest fee) to use these data to help process their own. Google Genomics already provides access to genomic data from the public 1000 Genomes Project, an autism research initiative and commercial sequencing supremo Illumina.

Tute Genomics’ CEO Dr Reid Robison said: "The time is coming when genome sequencing will be part of routine clinical care, and open access to genetic variant databases is a necessary step in order to accelerate progress towards precision medicine". This holds because genome sequencing and annotation must be followed by clinical interpretation to decide which variants may be pathogenic (disease-causing). To some degree this process can be automated using databases of already characterised pathogenic variants, and the more extensive and reliable the database, the more robust this process will be. However, many variants are of unknown clinical significance and require careful expert evaluation to weigh their probable or possible roles in a given patient; this part of the process cannot be automated.
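The split between automated lookup and expert review can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Variant` record, the `KNOWN_VARIANTS` table, the `triage` function and the variants themselves are invented, and none of it reflects the Tute Genomics database or the Google Genomics API.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """A genetic variant identified by position and allele change."""
    chrom: str
    pos: int
    ref: str
    alt: str


# Hypothetical toy database: made-up variants and labels, for illustration only.
KNOWN_VARIANTS = {
    Variant("1", 1000, "A", "G"): "pathogenic",
    Variant("2", 2000, "C", "T"): "benign",
    Variant("3", 3000, "G", "A"): "uncertain significance",
}


def triage(variants, database):
    """Split variants into those labelled by lookup and those needing expert review."""
    classified = {}
    needs_review = []
    for variant in variants:
        label = database.get(variant)
        if label in ("pathogenic", "benign"):
            # Already characterised: the database label can be applied automatically.
            classified[variant] = label
        else:
            # Absent from the database or of uncertain significance: refer to an expert.
            needs_review.append(variant)
    return classified, needs_review


# Hypothetical patient variants: one known, one uncertain, one not in the database.
patient = [
    Variant("1", 1000, "A", "G"),
    Variant("3", 3000, "G", "A"),
    Variant("4", 4000, "T", "C"),
]
classified, needs_review = triage(patient, KNOWN_VARIANTS)
```

In this sketch only the first variant is labelled automatically; the other two land in `needs_review`. A larger and more reliable database would shrink that second list, but it would not remove the need for expert judgement on what remains.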

To learn more about the elements of genomic data analysis and interpretation, see our free briefing Defining the role of a bioinformatician.