JMIR Preprints #95644: Sentence-Level Provenance for AI Medical Record Summarization: Formative Usability Evaluation of a Click-to-Inspect Interface

Sentence-Level Provenance for AI Medical Record Summarization: Formative Usability Evaluation of a Click-to-Inspect Interface

Andrew Parambath;
Giordana Pulpo;
Vince Hartman

ABSTRACT

Background:

Large language models (LLMs) can generate fluent summaries of longitudinal medical records, but in high-stakes clinical settings, verification burden remains a barrier to trust. Existing provenance mechanisms, such as document-level citations and section references, often require manual search within long, fragmented notes, limiting their usefulness in clinicians' time-constrained workflows.

Objective:

To design and evaluate a sentence-level provenance interface (“click-to-inspect”) that enables rapid verification of AI-generated longitudinal medical record summaries at the level of individual statements.

Methods:

We developed and tested a web-based interface in which every sentence in an AI-generated longitudinal patient summary is clickable and linked to a semantically matched source sentence in the originating clinical note. Clicking a sentence opens the source note in a side-by-side view, scrolls to the matched passage, and highlights it in context. Formative usability testing was conducted with 46 clinician interactions using synthetic longitudinal patient charts. Participants included medical students, residents, and attending physicians across multiple specialties including internal medicine, dermatology, radiology, plastic surgery, anesthesiology, interventional radiology, obstetrics-gynecology, and family medicine. Usability was assessed using the System Usability Scale (SUS) and Net Promoter Score (NPS), alongside qualitative feedback.
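The abstract does not state how summary sentences are matched to source sentences. As a minimal sketch only, the following Python shows one way a click on a summary sentence could be resolved to a character span in the source note, which a side-by-side viewer could then scroll to and highlight. Token-overlap (Jaccard) similarity is used here purely as a simple stand-in for semantic matching; the names `SourceSpan`, `split_sentences`, and `best_source_span`, the naive sentence splitter, and the example note are all hypothetical and not taken from the authors' system.

```python
import re
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class SourceSpan:
    start: int    # character offset where the matched sentence begins in the note
    end: int      # character offset where it ends (exclusive)
    text: str     # the matched source sentence
    score: float  # similarity between the summary sentence and this source sentence


def split_sentences(note: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, text) per sentence using a naive punctuation rule.

    Assumption: a real system would need a clinical-text-aware splitter;
    this rule breaks on decimals and abbreviations.
    """
    for m in re.finditer(r"[^.!?]+[.!?]?", note):
        raw = m.group()
        text = raw.strip()
        if text:
            start = m.start() + (len(raw) - len(raw.lstrip()))
            yield start, start + len(text), text


def _tokens(s: str) -> set:
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity, a stand-in for a semantic similarity model."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_source_span(summary_sentence: str, note: str) -> Optional[SourceSpan]:
    """Return the highest-scoring source sentence, or None for an empty note."""
    best: Optional[SourceSpan] = None
    for start, end, text in split_sentences(note):
        score = jaccard(summary_sentence, text)
        if best is None or score > best.score:
            best = SourceSpan(start, end, text, score)
    return best


if __name__ == "__main__":
    # Synthetic example note (hypothetical, not from the study materials).
    note = (
        "Patient seen for follow-up of hypertension. "
        "Lisinopril was increased to 20 mg daily. "
        "Return to clinic in three months."
    )
    span = best_source_span("Lisinopril dose increased to 20 mg daily.", note)
    print(span)
```

The returned offsets satisfy `note[span.start:span.end] == span.text`, so a viewer could use them directly as the highlight range.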

Results:

Clinicians reported high usability (mean SUS score 86.25, SD 7.77; 95% CI 83.96–88.54) and a positive overall experience (NPS 35; 22/46 promoters, 18/46 passives, 6/46 detractors). Participants described rapid access to supporting evidence as critical for trust calibration during first-pass chart review. Qualitative feedback identified friction in traditional citation-based interfaces and supported sentence-level inspectability as a low-friction verification mechanism.
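For readers who want to check the NPS arithmetic, the short Python sketch below applies the standard definition (percentage of promoters minus percentage of detractors) to the counts reported above. The function name `net_promoter_score` is an illustrative choice, not something from the manuscript.

```python
def net_promoter_score(promoters: int, passives: int, detractors: int) -> float:
    """NPS = percentage of promoters minus percentage of detractors."""
    total = promoters + passives + detractors
    if total == 0:
        raise ValueError("at least one response is required")
    return 100.0 * (promoters - detractors) / total


if __name__ == "__main__":
    # Counts as reported in the Results: 22 promoters, 18 passives, 6 detractors.
    nps = net_promoter_score(promoters=22, passives=18, detractors=6)
    print(round(nps))
```

With the reported counts, the unrounded value is 100 × (22 − 6) / 46, which rounds to the NPS of 35 given in the Results.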

Conclusions:

Sentence-level provenance transforms AI-generated summaries from static narratives into interactive verification tools. An approach that enables rapid, selective inspection of individual claims during longitudinal chart review may reduce verification burden and support calibrated reliance in high-risk clinical contexts.

Clinical Trial: NA

Citation

Please cite as:

Parambath A, Pulpo G, Hartman V

Sentence-Level Provenance for AI Medical Record Summarization: Formative Usability Evaluation of a Click-to-Inspect Interface

JMIR Preprints. 18/03/2026:95644

DOI: 10.2196/preprints.95644

URL: https://preprints.jmir.org/preprint/95644

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

JMIR Publications

JMIR Preprints

Currently submitted to: JMIR Human Factors

Date Submitted: Mar 18, 2026

Open Peer Review Period: Apr 7, 2026 - Jun 2, 2026

(currently open for review)
