Data Usage Areas
Each organization or research community may have different business goals for re-using data. These re-use areas call for different emphasis when improving the findability, accessibility and interoperability of data sets. The following table gives an overview of these data re-use areas and, for each area, the actions that can be taken to improve data management processes and the potential benefits.
Data Usage Area | Aim and Actions | Expected Benefit |
---|---|---|
Data Interpretability | Improve reuse of data by another person.<br>- Provide metadata about how data is organized and structured.<br>- Document the content of data files and their relations.<br>- Identify and validate data types and formats. | - Reduces the time each person spends examining and understanding existing research data. |
Data Integration | Improve data consolidation and harmonization.<br>- Annotate data with common vocabularies and ontologies.<br>- Use common master data and reference data (if available).<br>- Use common data profiles, models and schemas for semantic modelling (if available).<br>- Improve interoperability by mapping terminologies (e.g. via identifier linksets). | - Reduces time for data cleaning and integration.<br>- Increases the likelihood of linking datasets with other sets automatically. |
Data Repurposing | Improve reuse of data in another context, such as with different research hypotheses.<br>- Document the research hypothesis and the data inclusion and exclusion criteria.<br>- Document reference materials, such as cell lines and microorganisms.<br>- Document the different steps of the research lifecycle and their data outputs.<br>- Provide raw or primary data, not only derived and analyzed data sets.<br>- Provide a variety of research outcomes, such as negative results. | - Reduces the resources spent on generating data for research hypotheses.<br>- Improves repurposing of data as part of a new study. |
Data Reproducibility | Improve repeatability, replicability and reproducibility of research outcomes.<br>- Provide documentation and guidelines for describing research protocols.<br>- Provide the provenance of experiments, such as measuring tools, locations, conditions, hypothesis, time periods and study design (power analysis, sample sizes).<br>- Identify key resources such as antibodies, model organisms and software.<br>- Share materials, software and other tools used for data analysis. | - Ensures transparency and gives confidence in understanding the study.<br>- Increases the likelihood that the same or a different research team attains the results, using the same or a different experimental setup. |
Potential future data usage areas:
- Regulatory Reporting: Easily fulfill mandatory reporting requirements.
- Data Analytics (machine-actionable data): Give services easy access to the data for running algorithms and computational models.
The maturity of each data usage area is measured via a set of indicators:
Data Usage Area | Related Set of Indicators |
---|---|
Data Interpretability | F+S03, F+S04, F+A01, F+A02, F+A04, F+A05 |
Data Integration | F+A02, F+A03, F+A04, F+A05, F+A06 |
Data Repurposing | F+S01, F+S02, F+S04, F+S05, F+S06, F+S07, F+S08a, F+S08b, F+S08c, F+S08d |
Data Reproducibility | F+S02, F+S05, F+S08a, F+S08b, F+S08c |
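
The mapping above can be held as a simple lookup structure. The following is a minimal Python sketch, not a prescribed scoring method: the indicator sets are copied from the table, while the names `USAGE_AREA_INDICATORS`, `indicator_coverage` and `areas_for_indicator`, and the choice to report maturity as the fraction of satisfied indicators, are assumptions made for illustration only.

```python
# Indicator sets per data usage area, copied from the table above.
USAGE_AREA_INDICATORS = {
    "Data Interpretability": ["F+S03", "F+S04", "F+A01", "F+A02", "F+A04", "F+A05"],
    "Data Integration": ["F+A02", "F+A03", "F+A04", "F+A05", "F+A06"],
    "Data Repurposing": [
        "F+S01", "F+S02", "F+S04", "F+S05", "F+S06", "F+S07",
        "F+S08a", "F+S08b", "F+S08c", "F+S08d",
    ],
    "Data Reproducibility": ["F+S02", "F+S05", "F+S08a", "F+S08b", "F+S08c"],
}


def indicator_coverage(satisfied):
    """Return, per usage area, the fraction of its indicators found in `satisfied`.

    Hypothetical scoring: a plain coverage ratio, assumed here for illustration.
    """
    satisfied = set(satisfied)
    return {
        area: sum(ind in satisfied for ind in indicators) / len(indicators)
        for area, indicators in USAGE_AREA_INDICATORS.items()
    }


def areas_for_indicator(indicator):
    """List the usage areas whose indicator set contains `indicator`."""
    return [
        area
        for area, indicators in USAGE_AREA_INDICATORS.items()
        if indicator in indicators
    ]


if __name__ == "__main__":
    # Example: a dataset assumed to satisfy four indicators.
    for area, score in indicator_coverage({"F+S02", "F+S05", "F+A02", "F+A04"}).items():
        print(f"{area}: {score:.0%}")
    print(areas_for_indicator("F+S02"))
```

A lookup like `areas_for_indicator` shows which usage areas benefit when a single indicator is addressed. For example, F+S02 appears in both the Data Repurposing and the Data Reproducibility sets.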