Glossary of genetics terminology




Alleles Variant forms of the same gene.


Alternative splicing  Different ways of removing introns from the RNA transcribed from genes, prior to translation into protein. This provides a mechanism for producing multiple different proteins from the same gene.


Amino acid  A molecule containing both amino (-NH2) and carboxyl (-COOH) groups. The 20 different basic amino acids coded for by DNA are the building blocks of proteins. The order of amino acids in a protein is determined by the genetic sequence.


Aneuploidy Having an abnormal number of chromosomes.


Apoptosis  Normal programmed cell death.


Array  (see Microarray)


Association  A tendency for two characteristics to occur together at non-random frequencies. Note that this is a statistical correlation and does not by itself imply causation.


Association analysis A method of genetic analysis that compares the frequency of alleles between affected and unaffected individuals; a given allele is considered to be associated with the disease if that allele occurs at a significantly higher frequency among affected individuals.


Autosomes The chromosomes that are not concerned with sex determination. Humans have 22 pairs of autosomes, plus two sex chromosomes.


Base-pair  Pair of complementary nucleotides that make up the DNA sequence.


Carrier Usually refers to an individual who is heterozygous for a recessive, disease-causing allele. A carrier of such an allele usually shows no symptoms of the disease but can pass the mutant allele on to his or her children. If both parents are carriers, there is a one in four chance that their child will be homozygous for the mutant allele and will be affected by the disease.
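
The one-in-four figure can be checked by listing the four equally likely allele combinations a child can inherit from two carrier parents. The following is a minimal sketch; the allele labels `A` (normal) and `a` (mutant) and the function name are illustrative assumptions, not part of the glossary.

```python
from fractions import Fraction
from itertools import product

def offspring_genotype_probabilities(parent1, parent2):
    """Enumerate the equally likely allele pairs a child can inherit."""
    counts = {}
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that ("a", "A") and ("A", "a") count as the same genotype.
        genotype = "".join(sorted((allele1, allele2)))
        counts[genotype] = counts.get(genotype, 0) + 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

# Hypothetical labels: "A" is the normal allele, "a" the recessive mutant allele.
carrier = ("A", "a")
probabilities = offspring_genotype_probabilities(carrier, carrier)
for genotype, probability in sorted(probabilities.items()):
    print(genotype, probability)
```

Here `aa` is the affected homozygote, `Aa` an unaffected carrier and `AA` an unaffected non-carrier.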


Clinical utility  An assessment of the risks and benefits resulting from using a particular test and the likelihood that the test will lead to an improved overall outcome.


Clinical validity  The accuracy with which a test identifies or predicts a patient’s clinical status. For genetic testing, the relevance of a particular gene to a disease can be assessed by genome-disease association studies and the accuracy of a test is evaluated in terms of its specificity, sensitivity, positive predictive value (PPV) and negative predictive value (NPV).


Codon  Three sequential nucleotides in DNA, often referred to as the triplet code, which specifies (or codes for) a single amino acid. There are 64 different codons possible from the 4 nucleotides in DNA, of which 61 specify the incorporation of an amino acid into a protein while the remaining three are stop codons that signal the end of the protein.
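
The codon counts above can be reproduced by enumerating every triplet of the four DNA nucleotides. This is a small sketch; the stop codons listed are those of the standard genetic code written in the DNA alphabet, and the variable names are illustrative.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
# Stop codons of the standard genetic code, written as DNA triplets.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Every ordered triplet of nucleotides: 4 * 4 * 4 combinations.
all_codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]
coding_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons), "codons in total")
print(len(coding_codons), "codons specify an amino acid")
```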


Complex disease  Any disease phenotype that does not exhibit classical inheritance patterns attributable to a single gene. Generally multiple genetic and environmental factors are involved.


Consanguineous  Related by birth; descended from the same parent or ancestor.


cDNA DNA prepared by "reverse transcription" of RNA; that is, the RNA is used as a template for the synthesis of a complementary DNA molecule.


Chromosomes The structures within cells that carry the genetic information in the form of DNA. Each chromosome is composed of a single, long molecule of DNA, complexed with specific proteins. Humans have 22 pairs of autosomes and 2 sex chromosomes. One member of each pair of chromosomes is inherited from the father, and the other from the mother.


Clone A set of genetically identical organisms. (The word 'cloning' is also used to mean the production of a set of identical molecules of DNA; this is more accurately referred to as molecular cloning).


Copy Number Variation (CNV) Differences in the number of copies of a particular gene or segment of DNA, due to gains or losses of around one thousand to several million base-pairs, that have been found by comparison between two or more genomes. Copy number variations which include coding regions, and thus alter the number of copies of a gene present inside a cell, are also sometimes referred to as changes in gene dosage.


Differentiate  A developmental process by which individual cells in the body become specialized and take on different functions.


Diploid Refers to the chromosome complement of normal body cells, which have two copies of each chromosome (one inherited from the mother and one from the father).


DNA repair There are many ways in which DNA can be damaged. For example, chemicals can cause changes in the DNA sequence, or mistakes can be made during the copying of DNA that happens just before a cell divides to produce two new cells. Every cell has a set of proteins whose role is to repair damaged DNA. If the genes that produce these repair proteins are themselves damaged by mutation, the cell may be unable to repair damaged DNA effectively.


Dominant inheritance Inheritance of a mutation from one parent only (or arising anew during egg or sperm formation) can be sufficient for the person to be affected.


Epigenetic A factor or mechanism that changes the expression of a gene or genes without changing their DNA sequence. In more general terms, an epigenetic factor is something that changes the phenotype without changing the genotype.


Epigenome  Description of all the epigenetic modifications across the whole genome. Unlike the genome (DNA sequence), different cells within an organism have different epigenomes that may change with time in response to environmental cues.


Exons  Regions of a gene that code for protein.


FISH  Fluorescent in situ hybridisation, a technique which can be used to detect and localise the presence or absence of specific DNA sequences on chromosomes.


Functional genomics  Large-scale analysis of gene function.


Gene  A part of the DNA molecule of a chromosome which encodes (directs the synthesis of) a protein.


Gene expression  The process by which the information in the DNA sequence of a gene is transcribed into messenger RNA and translated into protein.


Genetic heterogeneity  Refers to diseases, conditions or other characteristics that appear similar but whose genetic basis is different in different populations or individuals.


Genetic map  See linkage map


Genetic screening  Carrying out a genetic test on a whole unselected population, or on all the members of a subset of the population (for example, people from a particular ethnic group, or pregnant women, or newborn infants).


Genetic test  A test to detect the presence or absence of, or change in, a particular gene or chromosome.


Genome  Complete genetic sequence for an organism.


Genome-wide association study (GWA or GWAS)  A study in which specific genetic markers (usually SNPs) across the entire genome of multiple people are scanned in order to find genetic variations associated with disease. By comparing DNA samples from a group of patients who share a particular disease to those who do not, genome-wide association studies aim to pinpoint the genetic differences that correlate with and perhaps play a causative role in that disease.


Genome-wide significance  Statistical term indicating that a genetic association is probably real, rather than being a false positive. Due to the large number of SNPs tested simultaneously in GWAS, the usual statistical significance level of 0.05 is far too lenient and results in thousands of false positive results just due to chance. For this reason, a more stringent "genome-wide" significance threshold level of 5 x 10^-8 is generally used to claim statistical significance. This is based on the testing of 1 million SNPs and uses the simple Bonferroni correction for multiple testing (i.e. 0.05 divided by the number of tests). Although this may be overly conservative, it provides a similar significance threshold to that suggested by several more complicated methods.
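
The arithmetic behind the threshold is short enough to write out. The sketch below applies the Bonferroni correction to the figures given in the definition; the function name is illustrative, and the false-positive estimate assumes, for simplicity, that none of the tested SNPs is truly associated.

```python
def bonferroni_threshold(alpha, number_of_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / number_of_tests

ALPHA = 0.05
NUMBER_OF_SNPS = 1_000_000

genome_wide_threshold = bonferroni_threshold(ALPHA, NUMBER_OF_SNPS)

# Expected number of chance "hits" at the uncorrected level, assuming
# (for illustration only) that no SNP is truly associated.
expected_false_positives = ALPHA * NUMBER_OF_SNPS

print(f"Genome-wide threshold: {genome_wide_threshold:.0e}")
print(f"Expected false positives at 0.05: {expected_false_positives:.0f}")
```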


Genotype The specific genetic constitution of an individual.


Germ-line  The sex cells (sperm and egg), which transmit genetic information from one generation to the next.


Germ-line mutation  A mutation in a germ-line cell or a cell that is destined to become a germ-line cell. A mutation in a germ-line cell can be passed on to the next generation.


Haploid Refers to the chromosome complement of sex cells (i.e. sperm or egg cells), which have only one copy of each chromosome, and therefore have half the number of chromosomes found in other cells of the body.


Haplotype A specific set of alleles that are physically located close to each other on the same chromosome and are commonly inherited together.


Heritability The degree to which a characteristic is determined by genetics.


Heterozygote An individual who carries two different alleles of a particular gene. An individual who carries two different mutant alleles in the same gene is said to be a compound heterozygote.


Histone  Set of proteins that form a scaffold around which chromosomal DNA is wound, in order to compact it inside the cell. Chemical modification of histones is involved in epigenetic regulation of gene activity, by altering the accessibility of the DNA.


Homologous chromosomes Every person has two copies of each chromosome (with the exception of the sex chromosomes): one member of each pair is inherited from the father and the other from the mother. The members of a pair of chromosomes are known as homologous chromosomes or homologues.


Homozygote An individual who has two identical copies of a particular gene.


Hybridisation The process by which two nucleic acid strands (DNA or RNA) with complementary sequences associate and bind together along their length.


Indel  Genetic mutation that results in INsertion or DELetion of nucleotides.


Introns  Segment of a gene, which does not code for protein and interrupts the protein coding regions of a gene. Introns are transcribed into RNA but are spliced out prior to translation of the mRNA into protein.


IVF  In vitro fertilisation, an assisted reproduction technique in which fertilisation is achieved outside of the body.


Karyotype A description of the number and structure of the chromosomes of an individual.


Linkage Two genes or markers that are so close together on a chromosome that they are rarely separated by recombination are said to be linked. Linkage analysis is a statistical method for detecting linkage between a disease and markers of known location by following their inheritance in families.


Linkage disequilibrium (LD)  Non-random association of alleles at genetic loci such that combinations of alleles occur more or less frequently in a population than would be expected from the random formation of haplotypes. It can vary from zero (LD = 0, no linkage disequilibrium, the alleles are not associated and are inherited independently) to one (LD = 1, complete linkage disequilibrium, alleles are associated and always inherited together).
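
The definition does not name a particular measure. As an illustration, the sketch below computes the raw disequilibrium coefficient D and the squared correlation r^2, one commonly used measure that runs from 0 to 1. The haplotype frequencies, allele labels and function name are hypothetical.

```python
def linkage_disequilibrium(freq_AB, freq_Ab, freq_aB, freq_ab):
    """Return (D, r_squared) for two biallelic loci from haplotype frequencies.

    Assumes the four frequencies sum to 1 and that both loci are polymorphic
    (otherwise r_squared would require division by zero).
    """
    p_A = freq_AB + freq_Ab  # frequency of allele A at the first locus
    p_B = freq_AB + freq_aB  # frequency of allele B at the second locus
    # D: observed AB haplotype frequency minus that expected under independence.
    d = freq_AB - p_A * p_B
    r_squared = d ** 2 / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, r_squared

# Alleles combine at random: no linkage disequilibrium.
d_none, r2_none = linkage_disequilibrium(0.25, 0.25, 0.25, 0.25)

# Only AB and ab haplotypes exist: complete linkage disequilibrium.
d_full, r2_full = linkage_disequilibrium(0.5, 0.0, 0.0, 0.5)
```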


Linkage map A map of the relative positions of genes (or other DNA sequences) on a chromosome, determined on the basis of how frequently the genes are inherited together. The closer two genes are, the less likely it is that recombination will separate them during the formation of sex cells (sperm or egg).


Locus The location of a gene or a marker on a chromosome.


Mapping See linkage map and physical map


Marker A gene or other segment of DNA whose position on a chromosome is known and whose inheritance can be monitored.


Meiosis The specialised cell division that takes place when sex cells (sperm or egg cells) are produced. The members of each homologous pair of chromosomes separate from each other so that each sex cell receives only one member of each pair.


Mendelian  A Mendelian trait is one that is controlled by a single locus (so that a mutation in a single gene can cause a disease) and is thus transmitted in a simple inheritance pattern according to Mendel's laws. Examples include sickle-cell anaemia, Tay-Sachs disease and cystic fibrosis. These disorders contrast with multifactorial diseases, which are affected by several loci and the environment.


Messenger RNA A molecule of RNA whose sequence is complementary to the coding sequence of a gene. Messenger RNA is translated into protein.


mRNA see Messenger RNA


Meta-analysis  Statistical technique in which the results from two or more studies are systematically combined into a single quantitative result, in order to give more weight to the conclusions.


Metabolome  The collection of all metabolites resulting from specific cellular processes in a biological organism, which are the end products of its gene expression.


Methylation (DNA)  Chemical modification of DNA by the addition of a methyl (-CH3) group to the cytosine (C) nucleotide, which can be inherited without changing the DNA sequence. Used as an epigenetic mechanism for gene silencing.


Microarray (DNA)  Miniaturized array of a large number (usually thousands) of unique DNA sequences, commonly representing specific genes, attached to a solid substrate. Often used to simultaneously monitor the expression level of each gene by hybridisation of RNA to the bound DNA. Also known as a DNA chip or a gene chip.


Microsatellite  Region of non-coding DNA consisting of specific short sequences (usually 1-5 base-pairs) repeated sequentially many times. The number of repeats in each microsatellite varies widely among different individuals. Also known as short tandem repeats (STRs).


Mitosis  Normal process of replication in which a cell divides to form two daughter cells with identical sets of chromosomes.


Monogenic  Characteristic that is controlled by a single gene.


Multipotent  Class of stem cells that can differentiate into more than one tissue type of closely related cells, but not all. Descendants of pluripotent stem cells.


Mutation  Any heritable change in the DNA sequence; usually refers to a rare and harmful change in the DNA sequence that is present in less than 1% of the population.


Negative predictive value (NPV)  Probability of an individual not having a particular disorder following a negative test, i.e. the proportion of true negative results relative to the total number of negative tests. As the prevalence of a disease decreases, the NPV increases as each individual is less likely to have the disorder.


Non-coding  Region of the DNA sequence which does not contain any genes, i.e. does not code for the production of a protein. Around 99% of the human genetic code is thought to be non-coding.


Nucleotide The molecular unit from which DNA and RNA are made. In DNA, a nucleotide consists of a 'base' [adenine (A), guanine (G), cytosine (C) or thymine (T)] linked to the sugar deoxyribose and a phosphate group. Many nucleotide units joined together via their sugar-phosphate groups make up a DNA molecule. In RNA, the sugar is ribose and the base thymine is replaced by uracil (U). The sequence of a DNA or RNA molecule is usually described as the sequence of its bases, e.g. AAAAGTTCGTCTAGGTC. A trinucleotide is a set of three nucleotides, e.g. CGG.
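
The base substitutions described above can be illustrated with the example sequence from this entry. The sketch below writes the sequence in the RNA alphabet (thymine replaced by uracil) and derives the base-paired complementary DNA strand; the function names are illustrative.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def to_rna_alphabet(dna_sequence):
    """Write the same base sequence using uracil (U) in place of thymine (T)."""
    return dna_sequence.replace("T", "U")

def complement(dna_sequence):
    """Base-by-base complementary DNA strand (A pairs with T, G with C)."""
    return "".join(DNA_COMPLEMENT[base] for base in dna_sequence)

dna = "AAAAGTTCGTCTAGGTC"
rna = to_rna_alphabet(dna)
paired_strand = complement(dna)
print(dna)
print(rna)
print(paired_strand)
```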


Nutrigenomics  The study of molecular relationships between nutrition and the response of genes, with the aim of understanding how genetics influences the effect of diet on health. Nutrigenomics focuses on the effect of nutrients on the genome, proteome, and metabolome of an organism.


Odds ratio  A measure of relative risk or effect size that is usually estimated from case-control studies.
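
As an illustration of how an odds ratio is estimated from a case-control study, the sketch below divides the odds of carrying a risk allele among cases by the odds among controls. The counts are invented for the example and the function name is illustrative.

```python
def odds_ratio(carrier_cases, noncarrier_cases, carrier_controls, noncarrier_controls):
    """Odds of carrying the allele among cases divided by the odds among controls."""
    odds_in_cases = carrier_cases / noncarrier_cases
    odds_in_controls = carrier_controls / noncarrier_controls
    return odds_in_cases / odds_in_controls

# Hypothetical study: 100 cases and 100 controls.
example_odds_ratio = odds_ratio(
    carrier_cases=40,
    noncarrier_cases=60,
    carrier_controls=20,
    noncarrier_controls=80,
)
print(round(example_odds_ratio, 2))
```

An odds ratio above 1 indicates that the allele is more common among cases than controls; a value of 1 indicates no association.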


Oncogene Gene that normally plays a role in cell growth and division. When mutated, oncogenes can cause abnormal cell proliferation.


PCR see Polymerase chain reaction


Penetrance The likelihood that a person carrying a mutation will develop the characteristics caused by that mutation.


Pharmacogenetics  The study of the influence of individual genes or alleles on the metabolism or function of drugs. Generally focused on understanding how an individual’s response to medication may be affected by their genetic sequence (genotype).


Pharmacogenomics  The genome-wide study of the effects of drugs on gene expression patterns. Generally focused on finding new targets for drug discovery.


Phenocopy An environmentally induced phenotype that mimics one caused by genetic factors.


Phenotype The observable traits of an organism. The phenotype results from the combination of genetic and environmental factors.


Physical map A map showing the relative positions of genes or other DNA sequences on the chromosomes, determined for example by how frequently the genes remain on the same segment of DNA when the chromosome is fragmented.


Pluripotent  Class of stem cells that can differentiate into any specific cell type of the body. Descendants of totipotent stem cells.


Polygenic  Characteristic that is controlled by multiple genes.


Polymerase chain reaction (PCR) A method for exponentially increasing the number of copies of a specific DNA sequence. The use of PCR enables the genetic analysis of biological samples containing only tiny amounts of DNA.


Polymorphism Variation in a region of DNA sequence among different individuals; the variation should be present in at least 1-2% of the population to be considered a polymorphism.


Point mutation  Genetic mutation that causes the substitution of a single nucleotide with another (see SNP). When it occurs inside a gene, the change may be classed as:

  • Silent - codes for the same amino acid

  • Missense - codes for a different amino acid

  • Nonsense - codes for STOP which may truncate the protein
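
The three classes can be told apart by translating the codon before and after the substitution. The following is a minimal sketch using the standard genetic code in the DNA alphabet; the function name is illustrative, and the sketch assumes that the reference codon encodes an amino acid (a change that removes an existing stop codon is outside the three classes above).

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_point_mutation(reference_codon, mutant_codon):
    """Classify a single-codon substitution as silent, missense or nonsense."""
    before = CODON_TABLE[reference_codon]
    after = CODON_TABLE[mutant_codon]
    if before == "*":
        raise ValueError("reference codon is a stop codon; not covered by this sketch")
    if after == before:
        return "silent"    # same amino acid
    if after == "*":
        return "nonsense"  # premature stop codon
    return "missense"      # different amino acid

print(classify_point_mutation("GAA", "GAG"))  # glutamate to glutamate
print(classify_point_mutation("GAG", "GTG"))  # glutamate to valine
print(classify_point_mutation("TAC", "TAA"))  # tyrosine to stop
```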

Positive predictive value (PPV)  The probability of an individual having a particular disorder following a positive test, i.e. the proportion of true positive results relative to the total number of positive tests. As the prevalence of a disease increases, the PPV also increases as each individual is more likely to have the disorder.
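
The dependence of predictive values on prevalence can be made concrete with Bayes' theorem. The sketch below computes PPV and NPV from a test's sensitivity and specificity at two prevalence levels; the 95% figures and the prevalences are illustrative assumptions.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population with the given prevalence."""
    true_positives = sensitivity * prevalence
    false_negatives = (1 - sensitivity) * prevalence
    true_negatives = specificity * (1 - prevalence)
    false_positives = (1 - specificity) * (1 - prevalence)
    ppv = true_positives / (true_positives + false_positives)
    npv = true_negatives / (true_negatives + false_negatives)
    return ppv, npv

# Hypothetical test with 95% sensitivity and 95% specificity.
ppv_rare, npv_rare = predictive_values(0.95, 0.95, prevalence=0.01)
ppv_common, npv_common = predictive_values(0.95, 0.95, prevalence=0.10)

print(f"Prevalence 1%:  PPV = {ppv_rare:.3f}, NPV = {npv_rare:.4f}")
print(f"Prevalence 10%: PPV = {ppv_common:.3f}, NPV = {npv_common:.4f}")
```

With the same test, a positive result is far more informative when the disorder is common, while a negative result is slightly more reassuring when it is rare.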


Preimplantation genetic diagnosis Use of genetic testing on one or two cells taken from a live early-stage embryo created by in vitro fertilisation. The procedure is usually carried out in order to determine whether the embryo is affected by a serious genetic disease. An unaffected embryo is implanted in the uterus and allowed to develop to term.


Proband The first person in a family or pedigree to be brought to medical attention.


Promoter A DNA sequence that regulates the expression of a gene.


Pronuclei The bodies containing genetic material derived from the sperm and egg cells that are present in the zygote following fertilisation, but before the two fuse to form a single nucleus.


Protein  Functional gene product composed of a long chain of amino acids, the sequence of which is determined by the genetic code.


Proteome  The total set of different proteins expressed from the genome within a particular cell, tissue or organism at a given time. Proteomics is the large-scale study of the structure and function of proteins.


Quantitative-fluorescent polymerase chain reaction (QF-PCR) A technique based on PCR to exponentially increase the number of copies of a specific DNA sequence in a quantitative manner. Since the number of molecules theoretically doubles in each PCR cycle, the number of amplification cycles combined with the final amount of PCR product should allow calculation of the initial quantity of genetic material. Two common methods of quantification are the use of fluorescent dyes that intercalate with double-strand DNA, and modified DNA probes that fluoresce when hybridized with a complementary DNA. The level of fluorescence can be followed in real time (i.e. with each amplification cycle).
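
The back-calculation described above can be sketched in a few lines. The example assumes ideal amplification (exact doubling per cycle) unless a lower efficiency is supplied; the function names and the `efficiency` parameter are illustrative assumptions.

```python
def copies_after_cycles(initial_copies, cycles, efficiency=1.0):
    """Copies present after a number of PCR cycles.

    efficiency=1.0 means perfect doubling in every cycle (the theoretical case).
    """
    return initial_copies * (1 + efficiency) ** cycles

def estimate_initial_copies(final_copies, cycles, efficiency=1.0):
    """Work backwards from the final amount of product to the starting quantity."""
    return final_copies / (1 + efficiency) ** cycles

# Hypothetical sample: 10 starting copies amplified for 30 cycles.
final_copies = copies_after_cycles(10, cycles=30)
recovered = estimate_initial_copies(final_copies, cycles=30)
print(f"{final_copies:.3g} copies after 30 cycles, from {recovered:.0f} starting copies")
```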


Recessive inheritance A mutation has to be inherited from both parents in order for a person to be affected. Such parents are usually unaffected carriers because they only have a single copy of the mutant gene.


Recombination Every person has two copies of each chromosome (with the exception of the sex chromosomes): one member of each pair is inherited from the father and the other from the mother. The members of a pair of chromosomes (known as homologous chromosomes or homologues) contain corresponding sets of genes, but each pair of genes (alleles) may not be identical. During the formation of sex cells (egg and sperm), homologous chromosomes exchange segments, shuffling the combinations of alleles on each one. This is called recombination. Subsequently, the reshuffled homologues separate from each other and only one member of each pair ends up in a given sperm or egg cell. This process ensures that sexual reproduction passes on a mixture of alleles from all the parental chromosomes.


Regenerative medicine  Medical interventions in which living, functional tissues are created in order to repair or replace lost tissue or organ function. This term is commonly used to refer to stem cell therapies, but can include a variety of approaches including gene therapy and tissue engineering and transplantation.


Resequencing Resequencing refers to sequencing a small region of an individual's genome, such as a candidate gene, in order to detect differences from the standard reference human genome that might be associated with a particular disease. Resequencing techniques can be broadly divided into those which test for known mutations (genotyping) and those which scan for any mutation in a specific target region (variation analysis); genetic differences may include single nucleotide polymorphisms (SNPs), insertions or deletions.


Reverse transcription PCR (RT-PCR)  A technique based on PCR to exponentially increase the number of copies of a specific RNA sequence. The RNA is first reverse transcribed into complementary DNA, followed by amplification of this DNA using PCR.


RNA  Nucleic acid produced by DNA transcription. It is usually single stranded and has a specific sequence of nucleotides which is complementary to the DNA sequence.


RNA interference (RNAi)  Silencing of a gene by complementary RNA. Also known as RNA silencing.


Segregation At meiosis, the two corresponding alleles of a gene, located on a pair of homologous chromosomes, separate (or segregate) with these chromosomes so that each sex cell (sperm or egg) receives only one of the alleles.


Sensitivity  Measure of the accuracy of a test. Sensitivity is the proportion of those with a condition who have a positive test result, i.e. the true positive rate. For example, a sensitivity of 95% means that, out of 100 people with a condition, 95 will be correctly diagnosed, and 5 cases will be missed.


Sex chromosomes The chromosomes that determine the sex of an individual. Females have two X chromosomes while males have an X and a Y chromosome.


Short tandem repeat  See microsatellite


Silencing  Switching off of a gene by any mechanism other than a change in the genetic sequence.


Single nucleotide polymorphism (SNP) A DNA sequence variation that involves a change in a single nucleotide.


Somatic cells All cells of the body apart from the germ-line (sex) cells and their precursors.


Somatic mutation A mutation that occurs in any cell of the body other than a germ-line cell, and thus is not heritable.


Specificity  Measure of the accuracy of a test. Specificity is the proportion of those without a condition who have a negative test result, i.e. the true negative rate. For example, a specificity of 95% means that, out of 100 healthy people, 95 will be correctly diagnosed, and 5 will test positive for a condition they do not have.
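
Sensitivity and specificity are both simple proportions of test outcomes. The sketch below reproduces the 95% examples given in the Sensitivity and Specificity entries from counts of test results; the function names are illustrative.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of people with the condition who test positive."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Proportion of people without the condition who test negative."""
    return true_negatives / (true_negatives + false_positives)

# 100 people with the condition: 95 detected, 5 missed.
example_sensitivity = sensitivity(true_positives=95, false_negatives=5)

# 100 healthy people: 95 correctly negative, 5 wrongly positive.
example_specificity = specificity(true_negatives=95, false_positives=5)

print(f"Sensitivity: {example_sensitivity:.0%}")
print(f"Specificity: {example_specificity:.0%}")
```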


Stem cells  Primal cells found in multicellular organisms that retain the ability to renew themselves through mitosis and can differentiate into a diverse range of specialised cells.


Synthetic biology  The design and construction of new biological functions and systems not found in nature.


Totipotent  Class of stem cells that have the capacity to form an entire organism and can differentiate into every cell and tissue in the body, including the placenta. Earliest embryonic stem cell produced from the fusion of an egg and sperm cell.


Transcription The process in which a molecule of RNA is synthesised, using as a template the DNA sequence of a gene. The RNA is processed into messenger RNA, which is then translated to produce a protein.


Transcription factor  Protein that binds to a specific DNA sequence and is involved in the transcription of DNA into RNA.


Transcriptome  The total set of different RNA molecules transcribed from the genome in a particular cell, tissue or organism at a given time.


Translation The process in which the sequence of a messenger RNA molecule is used to direct the order of assembly of amino acids to make a protein.


Translocation  Type of chromosomal abnormality in which a section of DNA is rearranged and transferred from one chromosome to a different one.


Trisomy Presence of an additional copy of a chromosome, so that there are three copies instead of the usual two. The most common trisomy is trisomy of chromosome 21, which causes Down syndrome.


Truncating mutation A mutation that introduces a premature translational "stop" signal into a gene, causing a shortened (truncated) protein to be made. Such proteins are often unstable and are degraded by the cell.


Tumour suppressor gene Gene that plays a role in controlling cell survival and division. If both copies of a tumour suppressor gene are deleted or inactivated, these controlling functions are lost and unrestrained proliferation may result.


X chromosome One of the two sex chromosomes of humans. Females have two X chromosomes, while males have one X and one Y chromosome.


X-linked inheritance Males have only one allele of (almost) every gene on the X chromosome, so a recessive mutation in one of those genes may cause disease. Inheritance of the disease is said to be X-linked. Examples include haemophilia and X-linked colour blindness.


Y chromosome One of the two sex chromosomes of humans. Females have two X chromosomes while males have one X and one Y chromosome. The Y chromosome carries a male-sex-determining gene that initiates development as a male.







Last Updated: 10 September 2010