Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Jul 7, 2025
Open Peer Review Period: Jul 9, 2025 - Sep 3, 2025
Date Accepted: Nov 4, 2025
Large Language Model-enabled Editing of Patient Audio Interviews from This is My Story (TIMS) Conversations: A Comparative Study
ABSTRACT
Background:
This Is My Story (TIMS) was started by Chaplain Elizabeth Tracey to promote a humanistic approach to medicine. Each patient in the TIMS program is the subject of a guided conversation in which a chaplain interviews either the patient or their loved one about the patient. The interviewer asks four questions to elicit clinically actionable information that has been shown to improve communication between the narrator and the medical providers and to increase empathy on the part of the medical team. The original recorded conversation is edited into a condensed audio file approximately 1.5 minutes in length and placed in the electronic health record, where it is easily accessible to all clinicians caring for the patient.
Objective:
TIMS is active at the Johns Hopkins Hospital and has shown value in supporting clinician empathy and communication. As the program expands, the time and resources needed to manually edit audio conversations into a condensed format are a barrier to adoption. To address this, we propose an automated solution that uses a large language model (LLM) to create meaningful and concise audio summaries.
Methods:
We analyzed 24 TIMS audio interviews and created three edited versions of each: (1) expert-edited, (2) AI-edited using a fully automated LLM pipeline, and (3) novice-edited by two medical students trained by the expert. All versions were evaluated using a within-subjects design by a second expert who was blinded to both the editor and the order in which the audio files were presented. This expert rated all interviews, scoring audio quality and content quality on 5-point Likert scales. We quantified transcript similarity to the expert-edited reference using lexical and semantic similarity metrics and qualitatively assessed important information omitted relative to the expert-edited interview.
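The abstract does not name the specific similarity metrics used. As an illustration only, a lexical overlap score between an edited transcript and the expert-edited reference could be sketched as a unigram precision, recall, and F1 (similar in spirit to ROUGE-1). The function names, the tokenization rule, and the example sentences below are assumptions made for this sketch and are not taken from the study.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenization)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def unigram_overlap(candidate, reference):
    """Return (precision, recall, f1) for unigram overlap with clipped counts.

    candidate: transcript of the AI- or novice-edited audio
    reference: transcript of the expert-edited audio
    """
    cand_counts = Counter(tokenize(candidate))
    ref_counts = Counter(tokenize(reference))
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example sentences, not data from the study.
reference_text = "She loves gardening and her dog"
candidate_text = "She loves gardening"
precision, recall, f1 = unigram_overlap(candidate_text, reference_text)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

A purely lexical score like this rewards shared wording only; a semantic metric would typically compare embeddings of the two transcripts instead, which is why the two kinds of metric are usually reported together.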
Results:
Audio quality (flow, pacing, clarity) and content quality (coherence, relevance, nuance) were each rated on 5-point Likert scales. Expert-edited interviews received the highest mean ratings for both audio quality (4.84) and content quality (4.83). Novice-edited interviews scored moderately (3.84 audio, 3.63 content), while AI-edited interviews scored slightly lower (3.49 audio, 3.20 content). Novice and AI edits were rated significantly lower than expert edits (p < .001) but did not differ significantly from each other. AI-edited and novice-edited interview transcripts had comparable overlap with the expert reference transcript, while qualitative review found frequent omissions of patient identity, actionable insights, and overall context in both the AI-edited and novice-edited interviews. AI editing was fully automated and significantly reduced editing time compared with both human editors.
Conclusions:
An AI-based editing pipeline can generate TIMS audio summaries with content and audio quality comparable to those produced by novice human editors with one hour of training. AI editing significantly reduces editing time and removes the need to train human editors, offering a way to scale TIMS to larger organizations or to settings where expert editors are not readily available.
Clinical Trial: Not applicable.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.