11 August 2016
Robust computational software for data analysis and adequate computing infrastructure are essential to the delivery of pathogen genomic services, as is the availability of people with the appropriate skills to develop, operate and maintain these resources.
A key consideration in supporting the implementation of genomics-informed infectious disease services is how to organise the procurement, development and delivery of these vital service components in the most efficient and effective way.
Last month saw the launch of CLIMB - Cloud Infrastructure for Microbial Bioinformatics - a multicentre collaboration to develop and deploy a world-leading cyber-infrastructure for microbial bioinformatics, providing cloud-based compute, storage, and analysis tools for microbiologists in the UK.
As explained at the CLIMB launch presentation, the project has been impelled by at least two key developments over the years:
Together, these factors have resulted in high volumes of genomic data being generated across a broader array of settings: sequencing is no longer the preserve of specialist research facilities. The challenge now is to ensure that the growing capacity to sequence genomes is matched by the capacity to analyse the resulting sequence data. Yet it is grossly inefficient for each laboratory, centre, or organisation with its own sequencer to also invest in its own dedicated computing infrastructure and data management team. This is partly due to the paucity of bioinformaticians, but also because each sequencing location would carry an overhead in setting up and maintaining computational hardware locally, an overhead that could instead be shared through the use of a common centralised computational facility. A centralised resource also has other major advantages that localised solutions cannot easily match.
In our recent guest blog, Prof Mark Pallen and colleagues - the team behind CLIMB - described the advantages of cloud computing and the concept of ‘virtualisation’, which allows users to work in a simulated computer environment. As well as avoiding heavy capital investment to set up computational resources locally, the framework allows users to share data, analysis pipelines and software programs, reduces the burden of installing complex programs, helps facilitate training in microbial bioinformatics, and makes bioinformatics analyses easier to reproduce. In short, a centralised computational resource like CLIMB supports collaboration, encourages efficiency, and facilitates reproducibility.
One of the central themes of our landmark report on bringing Pathogen Genomics into Practice was the need for organisations to work together to share data, analysis tools, bioinformatics pipelines, knowledge and experience, as doing so would help accelerate implementation and maximise the efficiency and effectiveness of future pathogen genomic services. Towards this goal, we identified the need for the centralised or networked provision of computational infrastructure (which could be based on cloud or virtual computing) for clinical and public health laboratories wishing to undertake pathogen genome analysis. However, there is still no clear indication from health service leaders that such national infrastructure is under way, or indeed that it will even be committed to, for genomic-based infectious disease management services. In the meantime, CLIMB, an academically funded (MRC) endeavour, is up and running. CLIMB has been developed as a national facility and is free to academic microbiologists within the UK. But the project is also keen to engage with clinical and public health professionals working in infectious disease management, since the resource can facilitate the exchange of knowledge, skills, and tools between the academic and clinical / public health communities. So for those microbiology services that are eager to progress the development of their pathogen genomics services, now might be the time to engage with CLIMB.