JMIR Preprints #94063: A Machine Learning Approach to Voice-Based Parkinson Disease Screening Using Multiview Spectrogram and Speech Recognition Features: Diagnostic Study

A Machine Learning Approach to Voice-Based Parkinson Disease Screening Using Multiview Spectrogram and Speech Recognition Features: Diagnostic Study

Arifa Zahir;
Jaehong Yu;
Jin-Sun Jun;
Kiwon Park;
Ryul Kim;
Hyundoo Jeong

ABSTRACT

Background:

Parkinson’s disease (PD) frequently manifests early vocal impairment, motivating the development of non-invasive and scalable digital screening tools.

Objective:

This study proposes MSR-PDNet, a multiview spectrogram-based deep learning framework integrating recognition-aware context for PD detection from voice recordings.

Methods:

Voice recordings from 203 participants (121 with PD, 82 healthy controls) were collected prospectively. Three spectrogram representations (Mel, short-time Fourier transform [STFT], and constant-Q transform [CQT]) were extracted and processed via parallel convolutional neural network branches. A recognition ratio (RR) feature vector derived from automatic speech recognition transcript agreement was optionally fused with the spectrogram embeddings. Models were evaluated using strict subject-wise 5-fold cross-validation.
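The subject-wise evaluation described above can be illustrated with a minimal sketch. It assumes each recording carries a subject identifier and that whole subjects, not individual recordings, are assigned to folds, so no participant contributes to both training and test data. The function name `subject_wise_folds`, the round-robin assignment, and the toy data are hypothetical; the abstract does not state how folds were constructed or whether they were stratified by diagnosis.

```python
import random
from collections import defaultdict


def subject_wise_folds(subject_ids, n_folds=5, seed=0):
    """Return (train_idx, test_idx) pairs in which no subject spans both sets.

    subject_ids: one subject identifier per recording, in recording order.
    Hypothetical helper for illustration; not the authors' code.
    """
    # Group recording indices by the subject they belong to.
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    # Shuffle subjects reproducibly, then deal them round-robin into folds.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    fold_subjects = [subjects[i::n_folds] for i in range(n_folds)]

    folds = []
    for held_out in fold_subjects:
        held = set(held_out)
        test_idx = sorted(i for s in held_out for i in by_subject[s])
        train_idx = sorted(
            i for s in subjects if s not in held for i in by_subject[s]
        )
        folds.append((train_idx, test_idx))
    return folds


# Toy data (assumed): 10 subjects with 3 recordings each.
recording_subjects = [f"S{n:02d}" for n in range(10) for _ in range(3)]
folds = subject_wise_folds(recording_subjects)

for train_idx, test_idx in folds:
    train_subjects = {recording_subjects[i] for i in train_idx}
    test_subjects = {recording_subjects[i] for i in test_idx}
    # The leakage check: the two subject sets must be disjoint.
    assert train_subjects.isdisjoint(test_subjects)
```

Splitting at the recording level instead would let the same speaker appear on both sides of a split, which can inflate accuracy because a model may learn speaker identity rather than disease-related features.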

Results:

MSR-PDNet achieved 86.9% mean test accuracy using three-view spectrogram fusion, improving to 97.4% when incorporating RR. RR integration reduced the false negative rate by approximately 84.5%, substantially improving sensitivity in screening-oriented settings.
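As a worked illustration of how a reduction in false negative rate relates to sensitivity, the sketch below computes the false negative rate FN / (FN + TP) for two models and the relative reduction between them. It assumes the reported reduction is a relative one (the abstract does not say so explicitly), and the confusion counts are hypothetical placeholders chosen for easy arithmetic, not the study's data.

```python
def false_negative_rate(fn, tp):
    """FNR = FN / (FN + TP); sensitivity is 1 - FNR."""
    if fn + tp == 0:
        raise ValueError("no positive cases")
    return fn / (fn + tp)


def relative_reduction(before, after):
    """Fraction by which a rate dropped, relative to its starting value."""
    if before <= 0:
        raise ValueError("baseline rate must be positive")
    return (before - after) / before


# Hypothetical counts for 100 PD cases (NOT taken from the study).
fnr_spectrogram_only = false_negative_rate(fn=20, tp=80)  # 20 of 100 missed
fnr_with_rr = false_negative_rate(fn=5, tp=95)            # 5 of 100 missed

reduction = relative_reduction(fnr_spectrogram_only, fnr_with_rr)
sensitivity_gain = (1 - fnr_with_rr) - (1 - fnr_spectrogram_only)

print(f"relative FNR reduction: {reduction:.1%}")
print(f"absolute sensitivity gain: {sensitivity_gain:.1%}")
```

In this toy case a 75% relative drop in false negatives corresponds to a gain of 15 percentage points in sensitivity, which shows why the two figures should not be read interchangeably.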

Conclusions:

Combining multiview spectrogram learning with recognition-aware context significantly enhances voice-based PD classification under leakage-free evaluation. The proposed framework supports deployment-oriented, non-invasive PD screening systems.

Citation

Please cite as:

Zahir A, Yu J, Jun JS, Park K, Kim R, Jeong H

A Machine Learning Approach to Voice-Based Parkinson Disease Screening Using Multiview Spectrogram and Speech Recognition Features: Diagnostic Study

JMIR Med Inform 2026;14:e94063

DOI: 10.2196/94063

PMID: 42274996

PMCID: 13255941

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Feb 24, 2026

Date Accepted: May 10, 2026
