Documentation and metadata

by Tiia Marjaana Puputti. Last modified Tuesday 29 June 2021, 15:31

Questions in the data management plan:

1. Write down the central metadata of your research, as applicable.

This question relates to all the FAIR principles: Findable, Accessible, Interoperable and Reusable.

Documentation helps make better science.

Scientific knowledge production requires that the research process be documented so precisely that it would be possible to repeat the study in the same way afterwards. Right from the beginning of your research process, ensure that you can describe it precisely enough in your thesis. 

The documentation requirement also applies to data – you must be able to describe the generation, structure and processing of the data to others so that they understand it. For documenting the data, you can use, for example, a formal research diary or an Excel sheet, where you specify 

  • what parts constitute the data 
  • essential information on the various parts of the data (e.g. on interviews: the interviewee, the interview date, the theme…)

Documentation provides an inventory of the data. In other words, documentation is bookkeeping on the data. In practice, you keep a record of what you did, when, how, and with whom. This will help you find, for example, a specific interview on a certain topic from among all your interviews.
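As a rough illustration of such bookkeeping, the sketch below keeps an inventory as a CSV table and looks up interviews by theme. The column names, file names and entries are invented for this example; a spreadsheet with the same columns would serve equally well.

```python
import csv
import io

# Hypothetical inventory columns, based on the interview example in the text.
FIELDS = ["file", "interviewee", "date", "theme"]


def write_inventory(rows, handle):
    """Write one inventory row per data item as CSV."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def find_by_theme(handle, theme):
    """Return the inventory rows whose theme matches, ignoring case."""
    reader = csv.DictReader(handle)
    return [row for row in reader if row["theme"].lower() == theme.lower()]


# Invented example entries; pseudonyms stand in for real interviewees.
inventory = [
    {"file": "interview_01.wav", "interviewee": "P01", "date": "2021-03-02", "theme": "remote work"},
    {"file": "interview_02.wav", "interviewee": "P02", "date": "2021-03-09", "theme": "commuting"},
    {"file": "interview_03.wav", "interviewee": "P03", "date": "2021-03-16", "theme": "remote work"},
]

buffer = io.StringIO()  # stands in for a file such as inventory.csv
write_inventory(inventory, buffer)
buffer.seek(0)
matches = find_by_theme(buffer, "Remote work")
print([row["file"] for row in matches])
```

Using pseudonyms rather than names in the inventory is a design choice for this sketch; what you may record about interviewees depends on the consent you have obtained.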

Working this way makes your research more far-sighted, systematic and structured, which means better science. You make your own progress easier when you know what you are doing and how to find the information you need in your data. You can also easily return to your data later or, with the participants' consent, let other researchers use the data in a form they can understand.

Different disciplines may have their own practices related to documentation. Here is an example of a laboratory notebook for students of chemistry.

Metadata are part of documentation

Documentation includes recording and updating metadata. In other words, you need to attach descriptive and technical information (metadata) to your research data. Metadata are data that describe other data, in this case your research data. They include, for example, the name given to the data and when and how the data were collected. You can think of metadata as a guide on how to read the data, for example a “readme” file (see, e.g., Guide to writing “readme” style metadata).
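A “readme” file can be as simple as a few labelled lines of plain text. The sketch below assembles one from the kinds of metadata mentioned above (name, collection date, collection method). The field labels, the function name `build_readme` and all values are assumptions made for this example, not a required template.

```python
from datetime import date


def build_readme(title, collected_on, method, files):
    """Return plain-text readme content describing a dataset."""
    lines = [
        f"Title: {title}",
        f"Collected: {collected_on.isoformat()}",
        f"Collection method: {method}",
        "Files:",
    ]
    # One indented line per file or folder, with a short description.
    lines += [f"  {name}: {description}" for name, description in files.items()]
    return "\n".join(lines) + "\n"


# Invented example values.
readme = build_readme(
    title="Thesis interviews on remote work",
    collected_on=date(2021, 3, 2),
    method="semi-structured interviews, audio recorded",
    files={"inventory.csv": "one row per interview", "audio/": "original recordings"},
)
print(readme)
```

The resulting text can be saved as a readme file next to the data it describes.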

By using metadata, you ensure that you or later users can find all they need and interpret the data in the same way, irrespective of the moment and context of use. If you took a month-long break from thesis writing and had only the collected data but no explanatory data, would you still remember what you were doing? Or if you let a research group use your data without explaining the variables and abbreviations you have used, would the group even understand the data?
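One way to guard against unexplained variables and abbreviations is to keep a small codebook and check it against the column names in your data. The following is a minimal sketch; the variable names and descriptions are invented, and the helper `undocumented` is hypothetical.

```python
def undocumented(columns, codebook):
    """Return the column names that have no explanation in the codebook."""
    return [name for name in columns if not codebook.get(name, "").strip()]


# Invented column names, as they might appear in a survey data file.
columns = ["resp_id", "age_grp", "wfh_days", "sat_1"]

# Invented codebook: one plain-language explanation per variable.
codebook = {
    "resp_id": "Respondent pseudonym",
    "age_grp": "Age group: 1 = under 30, 2 = 30 to 49, 3 = 50 or over",
    "wfh_days": "Days worked from home per week (0 to 5)",
}

missing = undocumented(columns, codebook)
print(missing)  # → ['sat_1']
```

Here `sat_1` is flagged because nobody but its author could tell what it measures.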

What if you use archived data? For archived data, the archivist has already written the descriptions, and you will find them in the archive's data catalogues. However, any tabulations and similar operations you perform on archived data are part of your own description. For your own data, you can save the research data and their metadata file in the same location.

Note that even though you will probably not be able to define all the relevant metadata precisely at the beginning of your research process, you should plan them as precisely as you can.

Things you have mentioned elsewhere need not be repeated in the metadata. If something has already been explained in, for example, the research plan, it need not be explained again in the metadata section of the data management plan.

Remember to keep refining your plan throughout the research process, including after this course!

The following page provides a list of metadata.