Accepted for/Published in: JMIR Research Protocols

Date Submitted: Jun 3, 2025
Date Accepted: Oct 27, 2025

The final, peer-reviewed published version of this preprint can be found here:

Stadler M, Richters C, Fischer MR, Hutmacher F

Evaluating AI-Generated Podcasts Versus Traditional Reading for Learning From Medical Articles: Protocol for a Mixed-Design Study Among Resident Physicians

JMIR Res Protoc 2025;14:e78505

DOI: 10.2196/78505

PMID: 41385789

PMCID: 12743238

Evaluating AI-Generated Podcasts Versus Traditional Reading for Learning From Medical Articles: Protocol for a Mixed-Design Study Among Resident Physicians

  • Matthias Stadler
  • Constanze Richters
  • Martin R. Fischer
  • Fabian Hutmacher

ABSTRACT

Background:

Podcasts have emerged as a popular medium in medical education over the past decade. Audio learning allows flexibility and may help residents engage with content in new ways. Meanwhile, reading scientific literature is a core skill for residents, yet many residents struggle to comprehend complex research articles. Advances in artificial intelligence (AI) now enable automatic generation of podcast-style summaries of documents. It remains unclear whether listening to such AI-generated podcast summaries can match the educational value of reading the full text of medical papers, and whether this might depend on the complexity of the article.

Objective:

This study aims to compare comprehension of medical research papers among medical doctors in their residency when learning via an AI-generated audio podcast versus traditional reading. We will examine whether article complexity (narrative vs. technical) moderates any difference. We hypothesize an interaction: for a highly complex article, residents who read the full text should achieve better understanding than those who listen to a summary, whereas for an easier article the difference between modalities should be smaller.
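The hypothesized interaction can be read as a difference of differences: the reading advantage for the technical article minus the reading advantage for the narrative article. The following is a minimal sketch of that contrast, not the authors' analysis code. The function name `interaction_contrast` and the cell means are hypothetical and chosen only for illustration; they are not study data.

```python
def interaction_contrast(cell_means):
    """Reading advantage for the technical article minus that for the narrative one.

    cell_means maps (modality, article) to a mean comprehension score.
    A positive value matches the direction of the stated hypothesis.
    """
    adv_technical = cell_means[("read", "technical")] - cell_means[("listen", "technical")]
    adv_narrative = cell_means[("read", "narrative")] - cell_means[("listen", "narrative")]
    return adv_technical - adv_narrative


# Made-up proportions of correct answers, for illustration only (not study data).
example_means = {
    ("read", "technical"): 0.75,
    ("listen", "technical"): 0.5,
    ("read", "narrative"): 0.75,
    ("listen", "narrative"): 0.75,
}

contrast = interaction_contrast(example_means)
```

With these illustrative values, reading helps only for the technical article, so the contrast is positive. A contrast near zero would indicate that article complexity does not moderate the modality effect.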

Methods:

We designed a 2×2 mixed factorial study with N = 60 resident physicians preparing for board certification in internal medicine or cardiology. All participants will engage with two peer-reviewed cardiology articles differing in complexity: a narrative case report on eosinophilic myocarditis and a technical research article on vena contracta area quantification using three-dimensional echocardiography. Each participant will read one article and listen to an AI-generated podcast summary of the other, with the order and assignment counterbalanced to control for order effects. The podcasts are created using Google NotebookLM’s experimental “Audio Overview” feature, which generates dialogue-style summaries. Participants will complete a multiple-choice knowledge test for each article. Primary outcomes are comprehension scores for each modality. Secondary outcomes include intrinsic motivation, perceived learning gains, and cognitive load for each condition. Data will be analyzed using a mixed ANOVA to test the main effects of modality and article complexity and their interaction.
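The counterbalancing described above can be sketched as a balanced allocation over four cells: which article is read (the other is heard as a podcast) crossed with which task comes first. This is a minimal sketch under stated assumptions, not the authors' randomization procedure. The abstract does not specify how allocation is performed, so the function name `allocate`, the cell labels, the equal cell sizes, and the seed are all hypothetical.

```python
import random
from collections import Counter
from itertools import product

ARTICLES = ("narrative", "technical")    # case report vs. echocardiography article
ORDERS = ("read_first", "listen_first")  # which task the participant starts with


def allocate(n_participants=60, seed=2025):
    """Assign participants to the four counterbalancing cells in equal numbers.

    The seed and the equal-cell rule are assumptions made for this sketch.
    """
    cells = list(product(ARTICLES, ORDERS))  # (article that is read, task order)
    if n_participants % len(cells) != 0:
        raise ValueError("n_participants must be a multiple of 4 for equal cells")
    schedule = cells * (n_participants // len(cells))
    random.Random(seed).shuffle(schedule)  # random sequence, fixed cell sizes

    plan = []
    for pid, (read_article, order) in enumerate(schedule, start=1):
        # The article that is not read is the one heard as a podcast.
        listen_article = ARTICLES[1 - ARTICLES.index(read_article)]
        plan.append({
            "participant": pid,
            "read": read_article,
            "listen": listen_article,
            "order": order,
        })
    return plan


plan = allocate()
cell_counts = Counter((p["read"], p["order"]) for p in plan)
```

With N = 60, each of the four cells receives 15 participants, so modality, article, and task order are balanced against one another.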

Results:

Recruitment is expected to begin by the end of 2025, with data collection completed by early 2026. We will report the trial results according to CONSORT guidelines, and any deviations from this protocol will be documented and justified. No results are available at the time of publication of this protocol.

Conclusions:

This randomized trial will offer evidence on the effectiveness of AI-generated podcast summaries as a learning tool for medical literature. If listening to an AI-generated podcast yields comprehension comparable to or better than reading the full article, it could validate an innovative, time-saving approach for busy medical trainees. Conversely, if significant deficits are observed in the podcast condition (especially for complex content), the findings will highlight the limits of AI summaries and the continued importance of traditional reading for thorough understanding. This study will inform educators and students about whether AI-generated audio can serve as a reliable adjunct or alternative for engaging with research papers. Ultimately, the results may guide how AI and multimedia are integrated into medical education to enhance learning outcomes.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.