Common data element
A common data element is a standardized key term or concept, established so that it may be reused in a range of projects/studies, to enhance data quality and reusability over time.
Community-defined data reporting guideline
A community-defined data reporting guideline is a well-defined document, of varying degree of formalism, that provides instructions and guidance on data archiving and sharing practices across a community, defined as a like-minded group of users. Most community-defined guidelines can be found on FAIRsharing.org at https://fairsharing.org/search?fairsharingRegistry=Standard
Contextual metadata is a type of metadata that describes the context in which the dataset was produced. It is necessary to preserve the integrity of the data and gives anyone wishing to use the data the means to understand and interpret it. In the biomedical domain, this information includes the concepts that make up the study design, such as the assessments or assays that generated the data, as well as the entities for which data is collected, such as subjects, samples, cell lines, or molecular entities, and the relationships between them.
A controlled terminology is an organised list of terms for the purpose of curation and normalisation of data record annotation.
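As a minimal sketch of how a controlled terminology supports normalisation of annotations, the Python example below maps free-text values onto preferred terms. The terms, the synonyms, and the `normalise` function are hypothetical and chosen only for illustration; a real terminology would normally come from a curated source.

```python
# Hypothetical controlled terminology: preferred term -> accepted synonyms.
CONTROLLED_TERMS = {
    "female": {"female", "f", "woman"},
    "male": {"male", "m", "man"},
}


def normalise(value, terminology):
    """Return the preferred term for a free-text value, or None if unknown."""
    cleaned = value.strip().lower()
    for term, synonyms in terminology.items():
        if cleaned in synonyms:
            return term
    return None


raw_annotations = [" F ", "Male", "n/a"]
normalised = [normalise(value, CONTROLLED_TERMS) for value in raw_annotations]
print(normalised)
```

Values that cannot be mapped are returned as `None` rather than guessed, so a curator can review them.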
A data analyst is a professional who retrieves, organizes, and analyzes information from various sources to help an organization achieve business, scientific, or research goals.
Data custodians are responsible for the safe custody, transport, and storage of the data and for the implementation of business rules. The role is usually filled by database administrators (DBAs), data modelers, and ETL developers.
A data dictionary is a document systematically listing and defining each of the variables recorded or used in a dataset.
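A data dictionary can also be kept in a machine-readable form. The sketch below, in Python, assumes a hypothetical dictionary with three variables and checks whether every column of a dataset is defined in it. The variable names, the entry layout, and the `undefined_variables` helper are illustrative assumptions, not a prescribed structure.

```python
# Hypothetical data dictionary: one entry per variable recorded in a dataset.
DATA_DICTIONARY = [
    {"name": "subject_id", "type": "string",
     "description": "Unique identifier of the study subject"},
    {"name": "age", "type": "integer", "unit": "years",
     "description": "Age at enrolment"},
    {"name": "sex", "type": "string",
     "description": "Sex recorded at enrolment"},
]


def undefined_variables(columns, dictionary):
    """Return the dataset columns that have no entry in the data dictionary."""
    defined = {entry["name"] for entry in dictionary}
    return [column for column in columns if column not in defined]


dataset_columns = ["subject_id", "age", "bmi"]
missing = undefined_variables(dataset_columns, DATA_DICTIONARY)
print(missing)
```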
Data discovery involves the collection and evaluation of data from various sources and is often used to understand trends and patterns in the data. It requires a progression of steps that organizations can use as a framework to understand their data.
Data element level
The data element level is the smallest unit of data and is described by a definite meaning and semantics.
Data exchange model
A data exchange model is a type of formal data representation devised and optimized to enable the transfer of information between systems.
Data hosting environment
Data hosting is the act of storing the data on a stable and accessible web platform. While there is no standard arrangement for providing this service, data hosting does represent a significant commitment that requires dedicated, long-term capacity that maintains a persistent and highly reliable web-connected platform.
Data maturity is a measurement of the extent to which an organization is utilizing its data. To achieve a high level of data maturity, data must be deeply ingrained in the organization and fully incorporated into all decision making and practices. Data maturity is often measured in stages.
A data model is a formal representation of a domain which identifies key entities and relations between them.
Data owners are either individuals or teams who make decisions such as who has the right to access and edit data and how it’s used. Owners may not work with their data every day, but are responsible for the data within their perimeter in terms of its collection, protection and quality.
Data representation refers to the form in which data is stored, processed, and transmitted.
Data searching is the process of finding relevant information from the data in a systematic manner.
A data variable is an information entity which has a type, which can assume a range of quantitative or qualitative values, and which may be associated with a measurement unit. Data variables should be defined in a data dictionary and may be used to generate Common Data Elements (CDEs) or may be represented with CDEs.
A dataset is a collection of data items, information elements forming a signal, which may be analysed in a standalone fashion or in conjunction with other datasets. A dataset may be represented in standard formats. In the context of the FAIR-DSM model, a dataset is a data structure that is purposed for FAIR sharing. A dataset in FAIR-DSM-L-1 is a flexible container for any collection of data that is expected to be hosted and served for FAIR data sharing. From FAIR-DSM-L-2 to FAIR-DSM-L-5, a dataset is expected to refer to a container for structured data, composed of a set of fields, where each field is composed of a set of values (a value set). In the majority of cases a dataset will be a tabular data structure (delimited text file) or a hierarchical data structure (JSON or XML).
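To illustrate the two common shapes of a structured dataset, the Python sketch below writes the same records once as a delimited text file and once as JSON, using only the standard library. The field names, the records, and the helper functions are hypothetical.

```python
import csv
import io
import json

# Hypothetical dataset: a set of fields, each holding a set of values.
records = [
    {"subject_id": "S1", "age": "34"},
    {"subject_id": "S2", "age": "51"},
]


def to_delimited(rows, delimiter="\t"):
    """Serialise rows as delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]),
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def from_delimited(text, delimiter="\t"):
    """Parse delimited text back into a list of row dictionaries."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text),
                                                delimiter=delimiter)]


tabular = to_delimited(records)                             # tabular structure
hierarchical = json.dumps({"records": records}, indent=2)   # hierarchical structure
print(tabular)
print(hierarchical)
```

Both serialisations carry the same fields and values; delimited text reads every value back as a string, which is why the ages are kept as strings here.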
A dataset field is an attribute of the dataset, which may be defined in a dataset model.
A data descriptor is a data structure that contains any information that describes a dataset, i.e. it is a record representing the dataset’s metadata.
Dataset field values
Dataset field values correspond to a set of information entities which are allowed for a given dataset field.
A dataset model is a formal representation devised for the archiving and exchange of domain-specific data.
A domain entity is an information element (a type, a concept) core to the description of a given knowledge area.
The domain model is a formal representation of a knowledge domain with concepts, roles, datatypes, individuals, and rules, typically grounded in a description logic.
FAIR sharing is a use case in which a data object (dataset) exhibits a state of FAIRness purposed for sharing and re-use.
Field level metadata
Field-level metadata describes in detail the data stored in a field, for example the type of data, its usage, or its source.
General purpose metadata schema
General metadata schemas have a reduced learning curve and reduced expertise requirements, but their description may be inadequate in a specific domain. They include what are considered the essential elements to describe any data. Such schemas therefore lack domain-specificity and granularity.
Generic data model
Generic data models are generalizations of conventional data models that define standardised general relation types, together with the data that may be related or connected by such a relation type.
An IT professional is a person who works in the field of information technology and has proven, extensive knowledge in the area of computing.
An identifiable dataset is a dataset containing any information (personal or indirect) that can link it to its research study.
Legal Expert means an Advisor, Associate Advisor, Consultant or any other officer having qualifications and experience in Law and duly authorized to deal with the legal aspects of the findings, decisions, recommendations and other legal matters of the Office.
Linked data representation
Linked Data is a set of design principles for sharing machine-readable interlinked data on the Web. When combined with Open Data (data that can be freely used and distributed), it is called linked data representation.
A format in which the computer/machine can interpret the underlying data in a way similar to a human and deduce conclusions about the dataset in response to a human’s questions.
A machine readable format is an information representation specifying how to serialise (write) information/data in ways software agents can handle.
Master data entity
The conceptual entity about which the observed or measured data is reported in the dataset(s). E.g., chemical compound, biospecimen, human subject, animal subject, cell-line.
A metadata record is an instance of a metadata schema denoting/describing an entity, digital or otherwise.
A metadata registry is a central location in an organization where metadata definitions are stored and maintained in a controlled method.
A metadata schema is a formal representation / specification which defines a set of descriptors needed to denote an entity.
Minimum information reporting guidelines
Minimum information reporting guidelines consist of sets of guidelines and formats for reporting data derived by specific high-throughput methods. Their purpose is to ensure the data generated by these methods can be easily verified, analysed and interpreted by the wider scientific community.
An ontology term is an entity defined in a semantic data model or a pre-defined ontology. It can be a Class/type or a Property/relation.
A reference field is a field that represents a relationship between a data entity and one or more other data entities, which may belong to the same or different entity type.
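A reference field behaves much like a foreign key. The Python sketch below assumes two hypothetical entity types, subjects and samples, where each sample carries a `subject_id` reference field. The `dangling_references` helper, an illustrative name, reports samples whose reference does not point at a known subject.

```python
# Hypothetical entities: samples reference subjects through "subject_id".
subjects = {
    "S1": {"species": "Homo sapiens"},
    "S2": {"species": "Homo sapiens"},
}

samples = [
    {"sample_id": "SAM1", "subject_id": "S1"},
    {"sample_id": "SAM2", "subject_id": "S9"},  # no such subject
]


def dangling_references(rows, reference_field, targets):
    """Return the rows whose reference field points at no known target entity."""
    return [row for row in rows if row[reference_field] not in targets]


broken = dangling_references(samples, "subject_id", subjects)
print(broken)
```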
A reporting guideline is a document, usually available as unstructured text, outlining annotation requirements in a specific domain for the purpose of data archiving, exchange and reporting.
A resolvable identifier is one that enables a system to locate the identified resource, or some information about it, such as metadata or a service related to it, elsewhere in the network.
Semantic data model
A semantic model for the dataset describes the meaning of entities and relations in the dataset accurately, unambiguously, and in a computer-actionable way.
Sensitive data is confidential information that must be kept safe and out of reach from all outsiders unless they have permission to access it. It includes all data, whether original or copied, which contains personal data and confidential data among other things.
A structured dataset is an information object which is represented in a principled way, thus adding constraints and structure to the information representation.
Study designs are the set of methods and procedures used to collect and analyze data in a study.
Ontology term metadata is any descriptor used to describe an entity in an ontology (or a concept in a SKOS vocabulary).
Tidy data principles
Tidy data provides a standardized way to link the structure of a dataset (its physical layout) with its semantics (its meaning). In tidy data:
- Each variable forms a column.
- Each observation forms a row.
- Each type of observational unit forms a table.
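The three rules above can be sketched with a small reshaping example in Python. It assumes a hypothetical "wide" table in which the column headers `visit_1` and `visit_2` are really values of a visit variable, and turns it into a tidy table with one observation per row. The table contents and the `to_tidy` helper are illustrative assumptions.

```python
# Hypothetical wide table: column headers encode the visit, cells hold weights.
wide_rows = [
    {"subject_id": "S1", "visit_1": 70.5, "visit_2": 71.0},
    {"subject_id": "S2", "visit_1": 82.0, "visit_2": 80.5},
]


def to_tidy(rows, id_field, variable_name, value_name):
    """Reshape wide rows so each (id, variable) pair becomes its own row."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue  # the identifier is repeated on every tidy row
            tidy.append({
                id_field: row[id_field],
                variable_name: column,
                value_name: value,
            })
    return tidy


tidy_rows = to_tidy(wide_rows, "subject_id", "visit", "weight_kg")
for tidy_row in tidy_rows:
    print(tidy_row)
```

In the result, `subject_id`, `visit`, and `weight_kg` are each a column, and each weighing of a subject at a visit is a row.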
Value level metadata
Value-level metadata describes data attributes such as type, length, format, controlled terminology, origin, derivation method, or comments associated with a subsetted value.