Identification and genomic data

1 January 2018

This report explores privacy and anonymity in the context of genetic and genomic data: the protection of privacy and anonymity is regarded by some as of absolute importance, but increasingly, this objective is subject to multiple competing interests. Robust protection of these principles relies on techniques to anonymise data, but making data truly anonymous can be difficult, and genomic data presents particular challenges.

EU data protection law categorises data as different types based on the sensitivity of the data and the possibility of being able to identify a person through that data. Genomic data are exceptionally useful as a medical and scientific resource, but have many characteristics that make them difficult to anonymise without compromising their utility. Although the likelihood of someone being identified through their anonymised genomic data is low, the possibility has the potential to undermine public trust and confidence in the mechanisms of data security.

Adding to the complexity of the discussion is the idea that the genome is always intrinsically identifying, which arguably it is not. Examining why it is not reveals a common, but false, understanding of what it means to be identifiable. Furthermore, when considering why genomic data in particular is strongly identifying, it is arguable that significant flaws in the EU data protection regime become apparent.

Identification and genomic data explains the complex debate surrounding what is called anonymisation and the legal and regulatory issues that govern the use of genomic data.


Genomic data does not sit comfortably within the current legal and regulatory framework as a consequence of its nature and an overall lack of regulatory coherence. In this report we propose a number of responses to this challenge:

  • Take greater account of societal and technical change
  • Do not rely solely on anonymisation
  • Ensure transparency
  • Understand the difference between 'identification' and 'individualisation'
  • Move away from language that appears to be absolutist
  • Pursue further research
  • Consider regulatory change
  • Establish and empower a Council of Data Ethics

Identification and genomic data will be of particular use to healthcare professionals, data processors and policy-makers who are interested in the use of genomic data.

As with all PHG Foundation reports, Identification and genomic data is free to download.