Accepted for/Published in: JMIR Formative Research
Date Submitted: Nov 24, 2022
Open Peer Review Period: Nov 24, 2022 - Jan 19, 2023
Date Accepted: Feb 8, 2023
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Dynamic region of interest selection in remote photoplethysmography: proof of principle
ABSTRACT
Background:
Remote photoplethysmography (rPPG) can record vital signs (VS) by detecting subtle changes in the light reflected from the skin. Lifelight® (Xim Ltd) is novel software being developed as a medical device for the contactless measurement of VS using rPPG via the integral cameras on smart devices. Research to date has focused on extracting the pulsatile VS signal from the raw signal, which can be influenced by factors such as ambient light, skin thickness, facial movements and skin tone.
Objective:
This preliminary proof-of-concept study outlines a dynamic approach to rPPG signal processing in which green channel signals from the most relevant areas of the face (the mid-face, comprising the cheeks, nose and top of the lip) are optimized for each subject using tiling and aggregation (T&A) algorithms.
Methods:
High-resolution 60-second videos were recorded during the VISION-MD study (Clinicaltrials.gov identifier NCT04763746). The mid-face was divided into 62 tiles of 20 × 20 pixels, and the best 30 tiles, selected on the basis of the signal-to-noise ratio in the frequency domain (SNR-F), were aggregated using five different algorithms. Signals from the mid-face before and after T&A were categorized by a trained observer blinded to the data processing as 0 (high quality, suitable for algorithm training), 1 (suitable for algorithm testing) or 2 (inadequate quality). In a secondary analysis, observer categories were compared for signals predicted to improve category following T&A based on SNR-F score. Observer ratings and SNR-F scores were also compared before and after T&A for Fitzpatrick skin tones 5 and 6, in which rPPG is hampered by light absorption by melanin.
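The tiling and aggregation step can be illustrated with a minimal sketch. The abstract does not define SNR-F or the five aggregation algorithms, so the code below rests on stated assumptions: SNR-F is taken as the ratio (in dB) of spectral power near the dominant frequency within a plausible heart-rate band to the remaining in-band power, and aggregation is a plain mean of the selected tiles. The function names, the band limits (0.7 to 3.0 Hz) and the peak half-width are hypothetical and are not taken from the study.

```python
import numpy as np


def tile_signals(green, tile=20):
    """Split a green-channel ROI video into square tiles.

    green: array of shape (frames, height, width).
    Returns an array of shape (n_tiles, frames) holding the mean green
    value of each tile in each frame. Pixels that do not fill a whole
    tile at the right or bottom edge are dropped.
    """
    frames, height, width = green.shape
    rows, cols = height // tile, width // tile
    cropped = green[:, :rows * tile, :cols * tile]
    blocks = cropped.reshape(frames, rows, tile, cols, tile)
    return blocks.mean(axis=(2, 4)).reshape(frames, rows * cols).T


def snr_f(signal, fps, band=(0.7, 3.0), half_width=0.1):
    """Assumed SNR-F: in-band power near the spectral peak versus the
    rest of the in-band power, in dB. Not the study's definition."""
    centered = signal - signal.mean()
    power = np.abs(np.fft.rfft(centered)) ** 2
    freqs = np.fft.rfftfreq(centered.size, d=1.0 / fps)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    peak = freqs[in_band][np.argmax(power[in_band])]
    near_peak = in_band & (np.abs(freqs - peak) <= half_width)
    signal_power = power[near_peak].sum()
    noise_power = power[in_band & ~near_peak].sum()
    # Small constant avoids division by zero for a perfectly clean tile.
    return 10.0 * np.log10(signal_power / (noise_power + 1e-12))


def tile_and_aggregate(green, fps, n_best=30, tile=20):
    """Score every tile, keep the n_best by assumed SNR-F, and average them.

    The mean is a stand-in for the unspecified aggregation algorithms.
    Returns (aggregated_trace, indices_of_selected_tiles, all_scores).
    """
    traces = tile_signals(green, tile)
    scores = np.array([snr_f(trace, fps) for trace in traces])
    best = np.argsort(scores)[::-1][:n_best]
    return traces[best].mean(axis=0), best, scores
```

Under these assumptions, tiles that carry a clear pulsatile component score higher than tiles dominated by noise, so only the former contribute to the aggregated trace. The definition of SNR-F and the choice of aggregation rule would need to be replaced with the study's own before drawing any conclusions.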
Results:
The analysis used 4310 videos recorded from 1315 participants. Signals in categories 2 and 1 had lower mean SNR-F scores than those in category 0. T&A improved the mean SNR-F score using all algorithms. Nine to 21% of signals improved by at least one category, with up to 10% improving into category 0, and 15–39% remained in the same category. Importantly, 9–21% improved from category 2 (not usable) into category 1. Improvements were seen with all the algorithms tested. No more than 2% of signals were assigned to a lower-quality category following T&A. In the secondary analysis, 62% of 52 signals were re-categorized by the observer as predicted from SNR-F score. T&A improved SNR-F scores in darker skin tones; 41% of 369 signals improved from category 2 to 1 and 12% from category 1 to 0.
Conclusions:
The T&A approach to dynamic ROI selection improved signal quality, including in dark skin tones. The method was verified by comparison with a trained observer rating. T&A can reasonably be expected to overcome factors that compromise whole-face rPPG. The performance of this method in estimating VS is currently being assessed.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.