JMIR Preprints #55898: Assessing the application of Natural Language Processing Models (NLPMs) in generating dermatologic patient education materials according to reading level

Assessing the application of Natural Language Processing Models (NLPMs) in generating dermatologic patient education materials according to reading level

Raphaella Lambert;
Zi-Yi Choo;
Kelsey Gradwohl;
Liesl Schroedl;
Arlene Ruiz De Luzuriaga

ABSTRACT

Background:

Limited health literacy presents a barrier to receiving outpatient dermatologic care, yet dermatologic patient education materials (PEMs) are often written above the national average 7th- to 8th-grade reading level. Chat Generative Pre-Trained Transformer (ChatGPT), DermGPT and DocsGPT are natural language processing models (NLPMs) that respond to user prompts. Our project assesses their use in generating dermatologic PEMs at specified reading levels.

Objective:

To assess the ability of the NLPMs ChatGPT, DocsGPT and DermGPT to generate PEMs for common and rare dermatologic conditions at unspecified and specified reading levels, and to evaluate preservation of meaning across these NLPM-generated PEMs, as judged by dermatology resident trainees.

Methods:

We evaluated the Flesch-Kincaid reading level (FKRL) of current American Academy of Dermatology (AAD) PEMs for four common (atopic dermatitis, acne vulgaris, psoriasis, herpes zoster) and four rare (epidermolysis bullosa, bullous pemphigoid, lamellar ichthyosis, lichen planus) dermatologic conditions. We prompted ChatGPT, DermGPT and DocsGPT with “Create a patient education handout about [condition] at a [FKRL]” to iteratively generate 10 PEMs per condition at unspecified, 5th-grade and 7th-grade FKRLs, which were evaluated with Microsoft Word readability statistics. Preservation of meaning across NLPMs was assessed by two dermatology resident trainees.
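For readers unfamiliar with the metric, the sketch below shows how a Flesch-Kincaid grade level can be computed from text using the standard published formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. It is an illustration only, not the authors' method: the study used Microsoft Word readability statistics, and the function names, the vowel-group syllable heuristic and the sentence-splitting rule here are assumptions, so scores will not exactly match Word's output.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption, not Word's algorithm):
    count runs of vowels and drop a likely silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level from the standard formula."""
    # Treat runs of '.', '!' or '?' as sentence boundaries (simplification).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    # Hypothetical handout sentences, written only for this example.
    simple = "Acne is a skin problem. It can cause red bumps."
    dense = "Epidermolysis bullosa necessitates multidisciplinary dermatologic management."
    print(round(flesch_kincaid_grade(simple), 2))
    print(round(flesch_kincaid_grade(dense), 2))
```

Shorter sentences and shorter words both lower the score, which is why prompting for a 5th-grade level tends to yield brief sentences with plain vocabulary.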

Results:

Current AAD PEMs had an average FKRL of 9.35 and 9.50 for common and rare diseases, respectively. For common diseases, ChatGPT-produced PEMs had average FKRLs of 11.21 (unspecified prompt), 5.02 (5th-grade prompt) and 6.56 (7th-grade prompt); DocsGPT-produced PEMs had average FKRLs of 10.18 (unspecified prompt), 5.01 (5th-grade prompt) and 5.98 (7th-grade prompt); and DermGPT-produced PEMs had average FKRLs of 11.14 (unspecified prompt), 7.43 (5th-grade prompt) and 7.28 (7th-grade prompt). For rare diseases, ChatGPT-generated PEMs had average FKRLs of 11.45 (unspecified prompt), 5.13 (5th-grade prompt) and 6.75 (7th-grade prompt); DocsGPT-produced PEMs had average FKRLs of 10.41 (unspecified prompt), 5.30 (5th-grade prompt) and 6.43 (7th-grade prompt); and DermGPT-generated PEMs had average FKRLs of 11.93 (unspecified prompt), 7.14 (5th-grade prompt) and 7.58 (7th-grade prompt). Compared to DermGPT, both DocsGPT (P=1.75E-06, P=7.26E-05) and ChatGPT (P=2.60E-09, P=.000172) were better able to generate PEMs at a 5th-grade reading level for common and rare conditions, respectively. Preservation of meaning analysis revealed that for common conditions, DermGPT ranked highest for overall ease of reading, patient understandability and accuracy (14.75/15), followed by DocsGPT (14.25/15) and ChatGPT (13.5/15). For rare conditions, handouts generated by ChatGPT ranked highest (13.5/15), followed by DermGPT (13/15) and DocsGPT (13/15).

Conclusions:

Our analysis suggests that NLPMs may reliably generate PEMs that meet 7th-grade FKRLs for select common and rare dermatologic conditions, and that these PEMs are easy to read, understandable for patients and mostly accurate. More specifically, DocsGPT and ChatGPT appear to outperform DermGPT at the 5th-grade FKRL, though both DermGPT and DocsGPT perform better at the 7th-grade FKRL, with few differences observed between common and rare conditions. As such, NLPMs may play a role in enhancing health literacy and disseminating accessible, understandable PEMs in dermatology.

Citation

Please cite as:

Lambert R, Choo ZY, Gradwohl K, Schroedl L, Ruiz De Luzuriaga A

Assessing the Application of Large Language Models in Generating Dermatologic Patient Education Materials According to Reading Level: Qualitative Study

JMIR Dermatol 2024;7:e55898

DOI: 10.2196/55898

PMID: 38754096

PMCID: 11140271

Copyright

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.

Accepted for/Published in: JMIR Dermatology

Date Submitted: Dec 29, 2023

Date Accepted: Mar 6, 2024
