4. Collecting data

You can use existing data for your study (data re-use) or generate new data. In the molecular life sciences, it is common practice to collect data in flat files, whereas in clinical research, new data are usually stored in data capturing systems. In both cases, it is vital to protect privacy and security while ensuring the quality and integrity of the data.

If you will be handling a large dataset, it is important to think ahead about:

  • storage capacity;
  • access policies (e.g., whether web-based or multi-user access is required);
  • protection against unauthorised access (see the section on security);
  • backups;
  • data lineage (how the data are collected, including from which sources, and how they are extracted and transformed);
  • when the raw data will become available;
  • describing the data (meta- and reference data, ontologies);
  • the location(s) for data processing;
  • costs (for instance for storage and computing capacity).

The best time to consider and describe these issues is at the start of your project.
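
As an illustration of thinking ahead about storage capacity, backups and costs, the rough calculation below may help. It is only a sketch: the function name `estimate_storage`, its parameters and all figures in the example are hypothetical and should be replaced with the numbers and prices that apply at your own institute.

```python
def estimate_storage(raw_tb, derived_factor, backup_copies, price_per_tb_year, years):
    """Rough estimate of total storage volume (TB) and storage cost for a project.

    All parameters are illustrative assumptions:
      raw_tb            -- expected volume of raw data, in terabytes
      derived_factor    -- processed data as a fraction of the raw volume
      backup_copies     -- number of backup copies kept next to the primary copy
      price_per_tb_year -- storage price per terabyte per year (any currency)
      years             -- number of years the data must be kept
    """
    # Primary copy: raw data plus the processed data derived from it.
    primary_tb = raw_tb * (1 + derived_factor)
    # Every backup copy adds another full copy of the primary volume.
    total_tb = primary_tb * (1 + backup_copies)
    # Cost over the whole retention period.
    total_cost = total_tb * price_per_tb_year * years
    return total_tb, total_cost


# Hypothetical example: 10 TB of raw data, processed data half that size,
# two backup copies, a price of 100 per TB per year, kept for 5 years.
total_tb, total_cost = estimate_storage(
    raw_tb=10, derived_factor=0.5, backup_copies=2, price_per_tb_year=100, years=5
)
print(f"Total volume: {total_tb:.1f} TB, total cost: {total_cost:.0f}")
```

Computing capacity, data transfer and staff time are not included in this sketch and would need to be estimated separately.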

This fourth section of HANDS provides guidelines about re-using existing data, selecting file formats, implementing a suitable data capturing system and implementing data protection measures.