
Jyväskylän yliopiston Koppa




Anonymisation and pseudonymisation

Author: timapupu. Last modified: Thursday 14 March 2024, 12:48

This material is out of date. Please go to the new and updated Research Data Management guide for students.

 

Collecting personal data is prohibited in principle, but there are several exceptions to this rule, such as when the data are indispensable for research. Even then, personal data must be pseudonymised or anonymised whenever possible. Pseudonymisation and anonymisation protect the participants’ identity.

This video explains pseudonymisation and anonymisation through examples.

Pseudonymisation 

  • The participants’ names, cities and other personal details are replaced with codes. The code key is stored in a secure place (e.g. a locked desk drawer), separately from the data. Anyone with access to the code key can still use it to identify the participants.
  • In practice, the code key is a list containing the participants’ names and the corresponding pseudonyms or number sequences.
  • Pseudonymisation does not provide the same level of protection as anonymisation, because the participants can still be identified indirectly.
  • There are different ways to pseudonymise data.
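
The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a recommended tool: the function name `pseudonymise`, the field names and the sample participants are all hypothetical, and the sketch assumes that names and cities are the only identifying details in the data.

```python
import secrets

def pseudonymise(records, identifying_fields=("name", "city")):
    """Replace identifying fields with a random code.

    Returns the coded records and the code key. The code key must be
    stored separately from the data, because it re-identifies people.
    """
    code_key = {}        # pseudonym -> original identifying details
    coded_records = []
    for record in records:
        # Draw a random code and retry in the unlikely case of a clash.
        code = "P-" + secrets.token_hex(4)
        while code in code_key:
            code = "P-" + secrets.token_hex(4)
        code_key[code] = {field: record[field] for field in identifying_fields}
        coded = {k: v for k, v in record.items() if k not in identifying_fields}
        coded["participant_code"] = code
        coded_records.append(coded)
    return coded_records, code_key

# Hypothetical sample data, made up for this illustration.
participants = [
    {"name": "Participant One", "city": "Jyväskylä", "answer": 4},
    {"name": "Participant Two", "city": "Helsinki", "answer": 2},
]
coded, key = pseudonymise(participants)
```

Here `coded` is what would be analysed and shared within the project, while `key` plays the role of the list in the locked drawer. As long as the key exists, the data remain pseudonymised rather than anonymised.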

Anonymisation 

  • The data are modified so that the participants are no longer identifiable.  
  • Anonymisation is one of the options mentioned in the GDPR for opening data to later users. However, note that genuine anonymisation is challenging, because it means that the individual becomes irrevocably unidentifiable. In addition, technological advances may produce new ways to combine data sets, which can make re-identification possible.
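
One common way of modifying data is to drop direct identifiers and make the remaining details coarser. The sketch below shows the idea under stated assumptions: the helper names `age_band` and `generalise`, the field names and the ten-year band width are hypothetical choices for illustration. Coarsening of this kind reduces identifiability, but it does not by itself guarantee genuine anonymisation.

```python
def age_band(age, width=10):
    """Map an exact age to a band, e.g. 34 becomes '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

def generalise(record, drop_fields=("name",), width=10):
    """Drop direct identifiers and replace the exact age with an age band."""
    out = {k: v for k, v in record.items()
           if k not in drop_fields and k != "age"}
    out["age_band"] = age_band(record["age"], width)
    return out

# Hypothetical record, made up for this illustration.
original = {"name": "Participant One", "age": 34, "answer": 4}
modified = generalise(original)
```

Whether the result counts as anonymous still depends on what other details remain in the data and on what other sources they could be combined with.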

If data have been genuinely anonymised, they are no longer personal data. Pseudonymised data, by contrast, remain personal data.

See the guidelines on identifiable and anonymous data by the Finnish Social Science Data Archive.  

How do you think data minimisation, pseudonymisation or anonymisation might affect the quality and coherence of the data?

Example:  
I am creating a survey that includes questions about age and residential area, among other details. However, the details cannot be linked to the respondent. Is a privacy notice needed? Are these anonymous or personal data?
Remember to make sure that the survey software is secure. Participants will be informed in any case, so do it with the privacy notice. What if your data leak and a participant becomes identifiable when the data are combined with data from other sources?
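
A simple way to explore this risk is to count how many respondents share each combination of age and residential area. The sketch below is only a rough check under stated assumptions: the function name `small_groups`, the field names, the threshold of three and the sample responses are all hypothetical, and a real assessment would also consider the other details in the survey.

```python
from collections import Counter

def small_groups(records, quasi_identifiers=("age", "area"), threshold=3):
    """Return the combinations shared by fewer than `threshold` respondents."""
    counts = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < threshold}

# Hypothetical survey responses, made up for this illustration.
responses = [
    {"age": 23, "area": "North"},
    {"age": 23, "area": "North"},
    {"age": 23, "area": "North"},
    {"age": 61, "area": "South"},
]
risky = small_groups(responses)
```

In this made-up example the only 61-year-old from the South stands alone, so that respondent could be recognisable to anyone who knows who lives there, even though no name was collected.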