Getting started with research data

Tip!

Before you begin collecting or producing data, it is advisable to set up a plan for how the data will be managed. A Data Management Plan (DMP) is often required if you want to receive a grant. A DMP that is updated and followed up regularly will also provide support during project implementation.

What data should I manage?

Data can be qualitative (e.g., text interviews, images and videos, audio recordings) or quantitative (e.g., tabular data, structured databases). Are there any ethical aspects (including personal data) or intellectual property rights issues with your data that you need to address?

As part of your Research Data Management (RDM) you should manage any data and code, as well as documentation about them, that are created or used as part of a research project. This might include:

  • Quantitative and qualitative data
  • Raw or processed data
  • Notes
  • Laboratory or research notebooks
  • Codebooks
  • Code or software used to run data analyses
  • Data workflows or pipelines
  • Metadata (documentation describing the data)

As a rule of thumb

You should know the location of all data produced by or used in your research project. The data should be annotated sufficiently for others to understand and reproduce your work, and possibly to re-use your data in future studies.

What should I document about my research data?

What information do others need in order to use your data and to understand or replicate your work? The answer to that question, together with your type of data, determines whether you need to document some or all of the items below.

Research Project Documentation

  • Rationale and context for data collection
  • Data collection methods
  • Structure and organisation of data files
  • Data sources used
  • Strategies for data validation and quality assurance
  • Analytical steps and pipelines (if any) used to process data
  • Information on data confidentiality, access and use conditions (sensitive data, such as personal data)
  • Archiving

Dataset documentation

  • Variable names and descriptions (for quantitative data)
  • Explanation of codes and classification schemes used
  • Algorithms used to transform data (including code)
  • File format and any software used (including version)
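
As an illustration of dataset documentation, the following minimal Python sketch writes a small codebook as CSV. The variable names, descriptions and codes are invented for the example, and the four-column layout is an assumption rather than a prescribed standard.

```python
import csv
import io

# Hypothetical codebook entries: one row per variable in the dataset.
CODEBOOK = [
    {"variable": "resp_id", "description": "Respondent identifier",
     "type": "integer", "codes": ""},
    {"variable": "age_grp", "description": "Age group of respondent",
     "type": "categorical", "codes": "1=18-29; 2=30-49; 3=50+"},
    {"variable": "consent", "description": "Consent to data sharing",
     "type": "categorical", "codes": "0=no; 1=yes"},
]

FIELDNAMES = ["variable", "description", "type", "codes"]


def write_codebook(entries, handle):
    """Write codebook entries as CSV to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(entries)


def codebook_as_text(entries):
    """Return the codebook as a CSV string (useful for a quick preview)."""
    buffer = io.StringIO()
    write_codebook(entries, buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(codebook_as_text(CODEBOOK))
```

Storing the codebook as a plain CSV file next to the data keeps variable descriptions and code explanations readable without special software.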

Read more about the importance of research data management for both institutions and researchers:

Perrier L, Blondal E, Ayala AP, Dearborn D, Kenny T, et al. (2017) Research data management in academic institutions: A scoping review. PLOS ONE 12(5): e0178261. https://doi.org/10.1371/journal.pone.0178261

Classify information

Depending on the type of data, for instance personal data or other sensitive data, you might need to rethink how you plan to manage and store your data.

Halmstad University has documents dedicated to the handling (Riktlinjer för hantering av information) and classification (Rutinbeskrivning för informationsklassning) of information. There is also a page on the staff web, Guidelines for processing information, where you can read more about the handling of personal data and find a guide to help you identify the correct storage solution for your needs.

Note!

We are currently setting up a dedicated storage solution for research data that meets the requirements for storing all types of research data. This solution is not yet part of the guide to finding a correct storage solution. Once it is up and running, it will be included in the guide, together with the information you need to start using it.

If you are not used to classifying your data, or have other questions about data security, please contact the Data Protection Unit at Halmstad University (dataskydd@hh.se).

Organising data

It is easier to find and keep track of data files, even after a long time, if the file names are sensible and the folder structure is well organised. Delete data and files that are not needed and will not be archived. Separate work in progress and drafts from completed work. Make sure you back up your original data.

File structuring

Think carefully about how best to structure files in folders, so that files and versions are easy to locate and organise. An orderly structure is even more important when you collaborate with others.

If your workplace already has an established way of structuring folders, use the same method.

Consider the best hierarchy for files, deciding whether a deep or shallow hierarchy is preferable. But use a hierarchical structure!

Example folder structure

In the example to the right, data and documentation files are held in separate folders. Data files are further organised according to data type and then according to research activity. Documentation files are also organised according to type of documentation file and research activity.
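
A hierarchy of this kind can also be created with a short script. The Python sketch below is only an illustration: the folder names (interviews, survey, pilot, main_study and so on) are hypothetical and should be replaced with names that suit your own project.

```python
from pathlib import Path
import tempfile

# Hypothetical layout: the top level separates data from documentation,
# the second level is the type, the third level is the research activity.
LAYOUT = {
    "data": {
        "interviews": ["pilot", "main_study"],
        "survey": ["pilot", "main_study"],
    },
    "documentation": {
        "consent_forms": ["pilot", "main_study"],
        "methodology": ["pilot", "main_study"],
    },
}


def create_structure(root, layout):
    """Create the folder hierarchy under root and return the created paths."""
    created = []
    for top_level, types in layout.items():
        for type_name, activities in types.items():
            for activity in activities:
                folder = Path(root) / top_level / type_name / activity
                # exist_ok=True makes the script safe to run more than once.
                folder.mkdir(parents=True, exist_ok=True)
                created.append(folder)
    return created


if __name__ == "__main__":
    # A temporary directory is used here so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_structure(tmp, LAYOUT):
            print(folder.relative_to(tmp).as_posix())
```

Keeping the layout in one dictionary means every project member can create an identical structure, which helps when several people work with the same files.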

File naming

A file name should be seen as the principal identifier for a file. Good file naming conventions can therefore give clues to the content, status and version of a file. A good name can uniquely identify a file and help in classifying and sorting files. File names that reflect the file content also make files easier to search for and discover. In collaborative research, it is vital to keep track of changes and edits to files via the file name. File names should be independent of the location of the file on a computer.

Software is available that can help with naming files. Bulk renaming of files can be done with the Bulk Rename Utility for Windows, or with software such as Ant Renamer, Rename-IT or Renamer (macOS).

Best practice is to:

  • create sensible, meaningful but brief names
  • use file names to classify types of files
  • avoid using spaces, dots and special characters (& or ? or !)
  • use hyphens (-) or underscores (_) to separate elements in a file name
  • avoid very long file names
  • include versioning within file names where appropriate, e.g. _v1, _v2
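
The points above can be checked automatically. The following Python sketch is one possible check, not an official tool: the 50-character limit is an assumption (the guidance only says to avoid very long names), and the check treats anything other than unaccented letters, digits, hyphens and underscores as a special character.

```python
import re

MAX_LENGTH = 50  # assumed limit; adjust to your own convention

# Anything outside unaccented letters, digits, underscore, dot, hyphen
# and space counts as a special character (spaces and dots are reported
# separately below). Treating accented letters as special is an assumption.
_SPECIAL = re.compile(r"[^A-Za-z0-9_.\- ]")


def check_file_name(name):
    """Return a list of problems found in a file name (empty if none)."""
    problems = []
    stem, dot, _extension = name.rpartition(".")
    if not dot:
        stem = name  # the name has no extension
    if " " in name:
        problems.append("contains spaces")
    if "." in stem:
        problems.append("contains dots besides the extension separator")
    if _SPECIAL.search(name):
        problems.append("contains special characters")
    if len(name) > MAX_LENGTH:
        problems.append("is longer than %d characters" % MAX_LENGTH)
    return problems


if __name__ == "__main__":
    for candidate in ["interview_p01_v2.txt", "final report!.docx"]:
        print(candidate, check_file_name(candidate))
```

A check like this can be run over a whole folder before files are shared or archived.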

Even though computers add basic information and properties to a file, such as file type, date and time of creation and modification, this is not reliable data management. This type of metadata should instead be added to the file name.
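
One way to put such information in the file name is to build names from fixed elements. The Python sketch below is an assumption about how this could be done: the order of the elements (date, description, version) and the example values are hypothetical, not a rule from this guide.

```python
from datetime import date


def build_file_name(description, created, version, extension):
    """Compose a file name from a date, a short description and a version."""
    # ISO 8601 dates (YYYY-MM-DD) sort chronologically when sorted as text.
    return "%s_%s_v%d.%s" % (created.isoformat(), description, version, extension)


if __name__ == "__main__":
    print(build_file_name("interview-notes", date(2024, 3, 15), 2, "txt"))
    # prints: 2024-03-15_interview-notes_v2.txt
```

Because the date and version are part of the name itself, they stay with the file when it is copied, moved or shared.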

This article largely builds upon information from the UK Data Service.
