Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Multi-scale Bowel Sound Event Spotting in Highly Imbalanced Wearable Monitoring Data: Algorithm Development and Validation
ABSTRACT
Background:
Abdominal auscultation, i.e., listening to bowel sounds (BS), can be used to analyse digestion. Automated retrieval of BS would help to assess gastrointestinal disorders non-invasively.
Objective:
To develop a multi-scale spotting model to detect BS in continuous audio data from a wearable monitoring system.
Methods:
We designed a spotting model based on the Efficient-U-Net (EffUNet) architecture that analyses 10-second audio segments at a time and spots BS with a temporal resolution of 25 ms. Evaluation data were collected across different digestive phases from 18 healthy participants and 9 patients with Inflammatory Bowel Disease (IBD). Audio data were recorded in a daytime setting with a T-shirt with embedded digital microphones. The dataset was annotated by independent raters with substantial agreement (Cohen’s κ between 0.70 and 0.75), resulting in 136 h of labelled data. In total, 11482 BS were analysed, with BS duration ranging between 18 ms and 6.3 s. The share of BS in the dataset (BS ratio) was 0.89%. We analysed performance depending on noise level, BS duration, and BS event rate, and report spotting timing errors.
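As a rough illustration of the framing described above, the sketch below maps annotated BS events onto a grid of 25 ms frames within one 10-second segment and computes the resulting BS ratio. Only the segment length and temporal resolution come from the abstract. The function names, the millisecond event format, and the rule that a frame counts as BS when any event overlaps it are assumptions made for this example, not the authors' implementation.

```python
SEGMENT_MS = 10_000  # segment length stated in the abstract (10 s)
FRAME_MS = 25        # temporal resolution stated in the abstract
FRAMES_PER_SEGMENT = SEGMENT_MS // FRAME_MS  # 400 frames per segment


def events_to_frames(events_ms, n_frames=FRAMES_PER_SEGMENT, frame_ms=FRAME_MS):
    """Return a 0/1 label per frame for one segment.

    events_ms: list of (onset_ms, offset_ms) pairs relative to segment start.
    Assumed rule: a frame is labelled BS if any event overlaps it.
    """
    labels = [0] * n_frames
    for onset, offset in events_ms:
        first = max(0, onset // frame_ms)              # frame containing the onset
        last = min(n_frames, -(-offset // frame_ms))   # ceiling division, exclusive
        for i in range(first, last):
            labels[i] = 1
    return labels


def bs_ratio(labels):
    """Share of frames labelled as BS."""
    return sum(labels) / len(labels)


# Hypothetical segment with two short events
labels = events_to_frames([(1000, 1050), (5010, 5028)])
ratio = bs_ratio(labels)
```

Integer milliseconds are used here to avoid floating-point rounding at frame boundaries. With events as sparse as those in the reported dataset, most segments would consist almost entirely of non-BS frames.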
Results:
Leave-One-Participant-Out cross-validation of BS event spotting yielded a median F1 score of 0.73 for both healthy volunteers and patients. EffUNet detected BS in different noise conditions with 0.73 recall and 0.72 precision. In particular, for SNR > 4 dB, more than 83% of BS were recognised, with precision ≥ 0.77. EffUNet recall dropped below 0.60 for BS duration ≥ 1.5 s. At BS ratio > 5%, our model precision was > 0.83. For both healthy participants and patients, insertion and deletion errors were the largest timing errors, totalling 15.54 min of insertion errors and 13.08 min of deletion errors over the total audio dataset. On our dataset, EffUNet outperforms existing BS spotting models that provide similar temporal resolution.
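To make the notion of insertion and deletion timing errors more concrete, the following sketch accumulates their duration from frame-level labels. It assumes one common convention: a predicted event that overlaps no reference BS frame counts as an insertion, and a reference event that overlaps no predicted frame counts as a deletion. The abstract does not define the exact error categories, so this convention, the function names, and the toy label sequences are all hypothetical.

```python
FRAME_MS = 25  # temporal resolution stated in the abstract


def runs(labels):
    """Return (start, end) index pairs of consecutive 1-frames, end exclusive."""
    out, start = [], None
    for i, value in enumerate(labels):
        if value and start is None:
            start = i
        elif not value and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(labels)))
    return out


def unmatched_ms(a, b, frame_ms=FRAME_MS):
    """Total duration of events in `a` that share no frame with events in `b`."""
    total = 0
    for start, end in runs(a):
        if not any(b[start:end]):
            total += (end - start) * frame_ms
    return total


def timing_errors(reference, predicted):
    """Insertion and deletion durations in ms under the assumed convention."""
    return {
        "insertion_ms": unmatched_ms(predicted, reference),
        "deletion_ms": unmatched_ms(reference, predicted),
    }


# Toy example: one matched event, one inserted event, one deleted event
reference = [0, 1, 1, 0, 0, 0, 1, 1, 1, 0]
predicted = [0, 1, 0, 0, 1, 1, 0, 0, 0, 0]
errors = timing_errors(reference, predicted)
```

In this toy example the predicted event at frames 4 to 5 has no reference counterpart, and the reference event at frames 6 to 8 is missed entirely. Partial overlaps, such as the first event, are not counted in either category here.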
Conclusions:
The EffUNet spotter is robust against background noise and can retrieve BS of varying duration. EffUNet outperforms previous BS detection approaches in unmodified audio data containing highly sparse BS events.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.