Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Nov 2, 2020
Date Accepted: Oct 14, 2021
Understanding the Nature of Metadata – A Systematic Review
ABSTRACT
Background:
Metadata are created to describe the corresponding data in a detailed and unambiguous way and are used for various applications in different research areas, for example, data identification and classification. A clear definition of metadata is therefore crucial for further use. Unfortunately, extensive experience with the processing and management of metadata has shown that the term "metadata" and its use are not always unambiguous.
Objective:
This study aimed to understand the definition of metadata and the challenges resulting from metadata reuse.
Methods:
A systematic literature search was performed following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines for reporting systematic reviews. Five research questions were identified to streamline the review process, addressing metadata characteristics, metadata standards, use cases, and problems encountered. The review was preceded by a harmonization process to achieve a shared understanding of the terms used.
Results:
The harmonization process resulted in a clear set of definitions for metadata processing, focusing on data integration. The subsequent literature review was conducted by ten reviewers with different backgrounds, using the harmonized definitions. After several filtering steps to identify the most relevant articles, 81 peer-reviewed papers from the last decade were included. All five research questions were answered, yielding a broad overview of standards, use cases, problems, and corresponding solutions for the application of metadata in different research areas.
Conclusions:
Metadata can be a powerful tool for identifying, describing, and processing information, but its meaningful creation is costly and challenging. This review process uncovered many standards, use cases, problems, and solutions for dealing with metadata. The presented harmonized definitions and the new schema have the potential to improve the classification and generation of metadata by creating a shared understanding of metadata and its context.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.