The overall goal of the project (see previous news) is to identify and understand the functional regions of the human genome – a task that grows in magnitude as more is learned about the complexity of regulation and control of the genome. Gene-coding regions represent a very small proportion of the whole human genome sequence; it is not yet known what proportion regulatory regions comprise, although more than two million potential elements have already been identified by the project.
Released in PLoS Biology, the new guide to the resources and data arising from the ENCODE project details completion of the mapping of two major groups of genomic elements: all gene sequences (both coding genes that specify proteins, and non-coding genes that specify functional RNA molecules), and all regions known to control gene expression.
ENCODE researchers are working to combine their findings with those from other major research initiatives such as the 1000 Genomes project (see previous news) and the NIH Roadmap Epigenomics program (see previous news). Their hope is that combining data on functional genomic elements with information about genetic and epigenetic variation and different human phenotypes will improve our understanding of the relationships between genetics, health and disease. The overview provides examples of how these data can be used to interpret human genome data, such as associations between genetic variants and diseases, and to guide further research.