HIV-1 genome structure: a new level of genetic code?

7 August 2009

HIV-1 (human immunodeficiency virus type 1) has a small single-stranded RNA genome. The genome comprises all the elements necessary for the virus to evade human immune responses, infect target cells and subvert the normal cellular machinery to replicate and release new viruses. Single-stranded RNA can fold to form two- and three-dimensional structures with critical functions in the regulation of viral replication and gene expression. Similarly, the importance of RNA folding elements in the control of human gene expression via mRNA (the intermediate between the DNA genes and protein products) is now widely acknowledged.

Whilst the simple RNA sequence of the HIV-1 genome has long been known, a new paper in Nature reports the structural characterization of a complete HIV-1 genome [Watts JM et al. (2009) Nature 460, 711-716]. The authors used a novel method, called SHAPE (selective 2'-hydroxyl acylation analysed by primer extension), that maintains RNA structure; see [Al-Hashimi HM et al. (2009) Nature 460, 696-698] for further explanation. Although this technique gives a lower resolution than other types of structural analysis, and does not resolve some forms of RNA structure, it nevertheless provides a structural overview of the complete genome that reveals both previously characterized and novel features.

Stable, conserved RNA structures were found to sequester or ‘insulate’ unstructured regions, particularly those that show high sequence variability between different viral strains; these hypervariable regions (such as within the gene for the surface protein Env) are essential for viral evasion of human immune responses. This organisation, with stable RNA helices flanking hypervariable regions, probably prevents variations within the regions from interacting with and potentially disrupting other RNA structures that play essential regulatory roles.

The authors also observed a pattern of RNA structures in positions corresponding to junctions between different domains of HIV-1 proteins that are initially produced as large polypeptides subsequently cleaved into separate, smaller proteins. They propose this to be consistent with a model whereby the RNA encodes protein structure at two levels: simple RNA sequences that encode proteins by dictating amino acid sequence, and highly structured RNA elements that determine the final three-dimensional structure of the proteins. They suggest that the highly structured RNA elements cause ribosomes (which produce proteins by assembling a chain of amino acids as dictated by the RNA coding sequence) to slow or pause, allowing time for the growing amino acid chain to fold. This could allow different protein domains to fold into the correct three-dimensional arrangements.

Comment: This study provides further evidence that the structure and function of RNA are both complex and important; the concept of an additional level of genetic code operating via RNA structure represents an area of compelling interest. In the case of HIV-1, improved understanding of the viral genome structure and function may indicate new areas for potential therapeutic intervention against the virus. More broadly, there is undoubtedly a need for improved understanding of how RNA functions in human gene expression. However, caveats remain. One important observation is that a static map such as that presented by Watts et al., besides omitting some structural features, cannot accurately represent the probable in vivo situation, in which regions may adopt variable structural conformations.
