Developments in gene prediction software

30 June 2005

The international ENCyclopedia Of DNA Elements (ENCODE) project, established in 2003 with the aim of identifying all the functional elements in the human genome sequence, held its Gene Prediction Workshop at the Wellcome Trust Sanger Institute in Cambridge last week. The workshop was the culmination of EGASP, a competition to develop improved gene prediction software for genome sequence analysis. Automated gene prediction methods are important in genomic research because the standard experimental procedures for identifying and annotating gene sequences are expensive and laborious; the accuracy of automated predictions, however, is generally limited. Eighteen teams were challenged to use their programs to predict gene positions in selected regions of the genome, while the ENCODE team used conventional analytical methods to determine the actual genes in the same regions. Although no single program was superior to the rest, the combined predictions of the various software packages identified 70% of the genes found by the ENCODE researchers in those regions [Abbott A (2005) Nature 435, 134]. The programs also predicted hundreds of genes not identified by the conventional analysis, some of which will be investigated further and may prove to be novel genes.

