
Using whole genome sequencing to help combat COVID-19


The UK government has launched a new alliance to sequence the genomes of SARS-CoV-2, the virus responsible for the current COVID-19 pandemic. Backed by a £20 million investment, the COVID-19 Genomics UK Consortium (COG-UK) comprises the NHS, public health agencies, the Wellcome Sanger Institute and several academic institutions.

Whole genome sequencing (WGS) provides the highest-resolution information available about an organism’s genome, and has the potential to transform infectious disease management. By analysing differences in the genetic code of viruses from different patients, the consortium aims to map the spread of the virus in real time, tracking new mutations to identify whether different strains are emerging. Better understanding of the genetic makeup of the virus could ultimately save lives by informing strategies for public health and clinical care, as well as facilitating the design of therapies and vaccines to combat the virus.

How will the whole genome sequencing be carried out?

Samples from patients with confirmed COVID-19 in the UK will be sent to sequencing centres across the country, including in Belfast, Birmingham, Cambridge, Cardiff, Edinburgh, Exeter, Glasgow, Liverpool, London, Norwich, Nottingham, Oxford and Sheffield; the Wellcome Sanger Institute will coordinate data analysis.

Several methods for next generation sequencing (NGS) are available to deliver WGS and, for a project of this size, using standardised methods and protocols would be ideal. However, given the urgency of the work, the decision has been made that each sequencing centre will use the methods and protocols it is most familiar with. While this could present challenges when comparing different datasets, the decision is understandable given the scale of the public health crisis, and each consortium member must share methods, protocols and validation studies with the group.

One method that will be used is long read sequencing, which can sequence much longer sections of DNA than conventional methods. Long read sequencing is useful for rapid sequencing in outbreak situations and has already been used in Ebola and Zika outbreaks. Building on this work, a similar approach using Oxford Nanopore technology has already been deployed to sequence the virus causing COVID-19.

How is WGS useful?

By sequencing the entire viral genome, researchers can pinpoint the genetic changes that occur in the virus as it spreads through the population. This approach is useful to:

1) Understand the transmission of the virus

Understanding changes in the genetic sequence of the viral genomes collected from different patients allows researchers to build a viral ‘family tree’ and contribute to efforts to monitor disease spread within and between populations over time. This can help with identification of infection ‘hot spots’, or of super spreaders – individuals who transmit the infection to a larger than expected number of people. This information is valuable for planning targeted public health interventions to reduce disease spread.
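
To make the idea of a viral ‘family tree’ more concrete, the sketch below counts the positions at which pairs of aligned genome sequences differ and uses those counts to find each sample's closest relative. It is a minimal illustration only: the sample names and ten-base sequences are invented (real SARS-CoV-2 genomes are roughly 30,000 bases long), the helper functions are hypothetical, and real analyses would use dedicated alignment and phylogenetics software rather than this simple difference count.

```python
from itertools import combinations

# Toy, invented sequences that are assumed to be already aligned.
samples = {
    "patient_A": "ACGTACGTAC",
    "patient_B": "ACGTACGTAT",
    "patient_C": "ACGAACGTAT",
    "patient_D": "TCGTACGNAC",  # 'N' marks a base that could not be read
}


def snp_distance(seq1, seq2):
    """Count positions where two aligned sequences differ, skipping unread bases (N)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq1, seq2)
        if a != b and a != "N" and b != "N"
    )


# Pairwise distance table: fewer differences suggests a closer relationship.
distances = {
    (x, y): snp_distance(samples[x], samples[y])
    for x, y in combinations(sorted(samples), 2)
}


def closest_relative(name):
    """Return the other sample with the fewest differences from `name`."""
    others = [n for n in samples if n != name]
    return min(others, key=lambda n: snp_distance(samples[name], samples[n]))


for pair, d in distances.items():
    print(pair, d)
print("closest to patient_C:", closest_relative("patient_C"))
```

A small number of differences between two samples is consistent with a recent shared source, but on its own it does not show who infected whom.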

2) Design treatments and vaccines

Understanding the viral genome sequence will assist researchers in designing therapies and vaccines that target specific features of the virus. It will also allow better understanding of how therapy and vaccine effectiveness might change as the virus evolves.

3) Monitor viral evolution

Continually tracking the virus will alert researchers to genetic changes that might give rise to less virulent or more virulent strains. Early warning of a more virulent virus or emergence of treatment resistance will be vital to support measures to minimise disease spread and for designing new treatments and vaccines.
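
As a rough illustration of what tracking genetic changes over time can involve, the sketch below compares each sampled sequence with a reference and reports how common each substitution is in each week. Everything here is an assumption made for the example: the reference, the sequences, the week numbers and the function names are invented, and the sequences are assumed to be already aligned. A rising frequency flags a change worth investigating; it says nothing by itself about virulence.

```python
from collections import Counter, defaultdict

reference = "ACGTACGTAC"  # toy reference sequence (invented)

# (collection week, aligned sequence) pairs, all invented for illustration.
observations = [
    (1, "ACGTACGTAC"),
    (1, "ACGTACGTAT"),
    (2, "ACGTACGTAT"),
    (2, "ACGAACGTAT"),
    (2, "ACGTACGTAT"),
]


def substitutions(reference, sequence):
    """List substitutions as 'C10T': reference base, 1-based position, new base."""
    return [
        f"{r}{i}{s}"
        for i, (r, s) in enumerate(zip(reference, sequence), start=1)
        if r != s and s != "N"  # skip bases that could not be read
    ]


def weekly_frequencies(reference, observations):
    """Return, per week, the fraction of samples carrying each substitution."""
    totals = Counter()
    counts = defaultdict(Counter)
    for week, sequence in observations:
        totals[week] += 1
        for sub in substitutions(reference, sequence):
            counts[week][sub] += 1
    return {
        week: {sub: n / totals[week] for sub, n in counts[week].items()}
        for week in totals
    }


freqs = weekly_frequencies(reference, observations)
for week in sorted(freqs):
    print(week, freqs[week])
```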

4) Prepare for the future

The infrastructure and protocols created through this project, as well as the data produced, will help in responding efficiently and effectively to future pathogen outbreaks. The work of this consortium has been made possible by building on previous research undertaken in the UK and making use of existing projects on microbial genome sequencing. These projects include the ARTIC project, which aims to put genome sequencing at the heart of outbreak response, and the CLIMB project, which provides bioinformatics capabilities for the UK microbiology community.

Challenges and outstanding questions

The consortium has the potential to make huge advances in our understanding of the virus, but it won’t be easy. Whilst WGS is a powerful tool for understanding virus transmission, genome sequences alone are often not enough to track viral spread without other information such as contact tracing, which follows up a person’s movements and, in particular, the people they may have been in contact with while infectious.

For example, in February the genome of the virus causing COVID-19 found in a German patient who was infected in Italy appeared similar to that detected in a patient in Munich a month earlier. This implied that the virus at the source of the Italian outbreak may have originated in Munich, though it was considered equally likely that both viruses had been imported from China, arriving separately in each country. In this case, information suggesting that the Italian outbreak originated in Germany spread before being refuted, causing confusion. This situation highlights that more information is necessary to confirm how the virus is spreading.

The consortium will face several technical challenges in analysing the viral genome. For example, some sample types such as nasal swabs don’t always provide good quality viral RNA (the virus has an RNA genome, so the DNA that is sequenced is often made using viral RNA as a template), and quantities of genetic material can vary greatly between samples. It is also currently uncertain how many viral genomes it will be possible to sequence with the funding available.

Genomics: one weapon in the fight

The COG-UK consortium is a prime example of how innovative genome sequencing technologies can be harnessed through collaborative efforts to help tackle a major public health crisis. While it is exciting to see the genomic infrastructure in the UK facilitate such work, genome sequencing of the virus causing COVID-19 is only one tool of many that can help with management of the pandemic. For example, other technologies for rapid testing to confirm suspected cases, as well as to determine if previously infected individuals have immunity, are arguably more crucial for successful shorter-term management of the COVID-19 outbreak and for preventing high mortality rates.

Nevertheless, the COG-UK consortium has a potentially vital role to play in mitigating the longer-term impacts of the current outbreak and, further, in improving our ability to understand and manage future disease outbreaks.

