Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Nov 14, 2018
Open Peer Review Period: Nov 16, 2018 - Jan 11, 2019
Date Accepted: Mar 4, 2019
PACO - Physical Activity Concept Ontology
ABSTRACT
Background:
Physical activity data provide important information on disease onset, progression, and treatment outcomes. Although analyzing physical activity data in conjunction with other clinical and microbiological data will lead to new insights crucial for improving human health, such integrative analysis has been hampered, in part, by large variations in the way the data are collected and presented.
Objective:
The goal of this study was to develop a Physical Activity Concept Ontology (PACO) to support structuring and standardizing heterogeneous descriptions of physical activities.
Methods:
We prepared a corpus of 1140 unique questions collected from various physical activity questionnaires and scales, as well as existing standardized terminologies and ontologies. We extracted concepts relevant to physical activity from the corpus using MUTT (Multipurpose Text processing Tool). The target concepts were formalized into an ontology using Protégé (version 4). Evaluation of PACO was performed along two aspects: structural consistency and structural cohesiveness. Evaluations were conducted using the Ontology Debugger plugin of Protégé and OntOlogy Pitfall Scanner (OOPS!). A use case application of PACO was demonstrated by structuring and standardizing 36 exercise habit statements and then automatically classifying them to a defined class of either sufficiently active or insufficiently active using FaCT++, an ontology reasoner available in Protégé.
Results:
PACO was constructed using the 268 unique concepts extracted from the questionnaires and assessment scales. PACO contains 225 classes including 9 defined classes, 8 object properties, 1 data property, and 23 instances (excluding 36 exercise statements). The maximum depth of classes is 4 and the maximum number of siblings is 38. The evaluations with the ontology auditing tools confirmed that PACO is structurally consistent and cohesive. In a small sample of 36 exercise habit statements, we showed that text segments could be mapped to relevant PACO concepts (e.g., exercise type class, intensity, and total minutes exercised per week) and that the FaCT++ reasoner could infer from these concepts whether each statement described a sufficiently active or insufficiently active exercise habit.
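The classification step described above can be illustrated with a minimal Python sketch. It is not the authors' method: PACO performs this inference with OWL defined classes and the FaCT++ reasoner, whereas the sketch uses plain conditional logic. The abstract does not state the cutoffs used, so the weekly targets below (150 minutes of moderate or 75 minutes of vigorous activity, in line with common public health guidance) are assumptions, as are the names `ExerciseStatement` and `classify`.

```python
from dataclasses import dataclass

# Assumed weekly targets in minutes; the abstract does not state PACO's cutoffs.
ASSUMED_TARGET_MINUTES = {"moderate": 150, "vigorous": 75}


@dataclass
class ExerciseStatement:
    """A structured exercise habit statement (hypothetical representation)."""
    exercise_type: str      # e.g. "walking"
    intensity: str          # "moderate" or "vigorous"
    minutes_per_week: int   # total minutes exercised per week


def classify(statement: ExerciseStatement) -> str:
    """Return the activity category for one structured statement."""
    if statement.intensity not in ASSUMED_TARGET_MINUTES:
        raise ValueError(f"unknown intensity: {statement.intensity!r}")
    if statement.minutes_per_week < 0:
        raise ValueError("minutes_per_week must be non-negative")
    target = ASSUMED_TARGET_MINUTES[statement.intensity]
    if statement.minutes_per_week >= target:
        return "sufficiently active"
    return "insufficiently active"


if __name__ == "__main__":
    examples = [
        ExerciseStatement("walking", "moderate", 160),
        ExerciseStatement("running", "vigorous", 60),
    ]
    for s in examples:
        print(s.exercise_type, "->", classify(s))
```

In an ontology, the same rule would be expressed as a defined class with a data property restriction on weekly minutes, so that a reasoner assigns individuals to the class automatically rather than through explicit branching.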
Conclusions:
As a first step toward standardizing and structuring heterogeneous descriptions of physical activities for integrative data analyses, PACO was built with the concepts collected from physical activity questionnaires and scales. PACO was evaluated to be structurally consistent and cohesive, and was also shown to be potentially useful in standardizing heterogeneous physical activity descriptions and classifying them into clinically meaningful categories that reflect adequacy of exercise.
Clinical Trial:
NA
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.