Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: May 6, 2025
Open Peer Review Period: May 6, 2025 - May 21, 2025
Date Accepted: Aug 11, 2025
Date Submitted to PubMed: Aug 12, 2025
The final, peer-reviewed published version of this preprint can be found here:

Rullo R, Maatouk A, Huang T, Chen J, Qiu W, O'Connor G, Womack J, Sadak T, Rodriguez C, de Jesus Espinosa T, Carneiro P, Marshall A, Ying R, Ramos SR

Interdisciplinary Development and Fine-Tuning of CARDIO, a Large Language Model for Cardiovascular Health Education in HIV Care: Tutorial

J Med Internet Res 2025;27:e77053

DOI: 10.2196/77053

PMID: 40794856

PMCID: 12475882

Interdisciplinary Development and Fine-Tuning of CARDIO, a LLM for Cardiovascular Health Education in HIV Care: A Tutorial

  • Ryan Rullo
  • Ali Maatouk
  • Tinglin Huang
  • Jialin Chen
  • Weikang Qiu
  • Giselle O'Connor
  • Julie Womack
  • Tatiana Sadak
  • Christine Rodriguez
  • Tania de Jesus Espinosa
  • Pedro Carneiro
  • Ami Marshall
  • Rex Ying
  • S. Raquel Ramos

ABSTRACT

Background:

The integration of artificial intelligence into healthcare presents a significant opportunity to revolutionize patient care. In the United States, an estimated 129 million people have at least one chronic illness, with 42% having two or more. Although chronic illness is largely preventable, its prevalence is expected to rise, imposing significant economic burdens and financial toxicity on healthcare.

Objective:

We leverage an interdisciplinary team encompassing nursing, public health, and computer science to optimize health through prevention education for cardiovascular and metabolic comorbidities in persons living with HIV. In this paper, we describe the iterative development of an intersectionality-informed large language model designed to support cardiometabolic health in this population.

Methods:

First, we curated data by scraping publicly available, authoritative, evidence-based sources to capture a comprehensive dataset, supplemented by publicly available HIV forum content. Second, we benchmarked candidate large language models and generated a fine‐tuning dataset using GPT-4 through multi-turn question–answer conversations, employing standardized metrics to assess baseline model performance. Third, we iteratively refined the selected model via Low-Rank Adaptation and reinforcement learning, integrating quantitative metrics with qualitative expert evaluations.
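The Low-Rank Adaptation step named above can be illustrated with a minimal sketch. It assumes the standard LoRA formulation, in which a frozen weight matrix W is augmented by a trainable low-rank product scaled by alpha / r. The layer sizes, rank, scaling factor, and function names below are illustrative assumptions and are not taken from the CARDIO model.

```python
import random


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(x, W, A, B, alpha):
    """Return W x + (alpha / r) * B (A x), where r is the rank (rows of A)."""
    r = len(A)
    base = matvec(W, x)               # frozen pretrained path
    update = matvec(B, matvec(A, x))  # trainable low-rank path
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def trainable_params(d_in, d_out, r):
    """Adapter parameter count: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


random.seed(0)
d_in, d_out, r, alpha = 8, 8, 2, 4.0  # hypothetical sizes, not from the paper
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]  # zero init: adapter starts as a no-op
x = [random.gauss(0, 1) for _ in range(d_in)]

y_base = matvec(W, x)
y_lora = lora_forward(x, W, A, B, alpha)
```

Because B starts at zero, the adapted layer initially reproduces the frozen layer exactly, and only A and B would be updated during fine-tuning. In practice a library implementation would be used rather than hand-written matrix code.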

Results:

Pre-existing LLMs demonstrated poor n-gram agreement, divergence from model answers (Accuracy 4.16, Readability 4.63, Professionalism 4.58), and poor readability (Kincaid 8.54, Jargon 4.44). After prompt adjustments and fine-tuning, preliminary results demonstrate the potential of a customized LLaMA-based LLM to provide personalized, culturally salient patient education.
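For readers unfamiliar with readability scoring, the sketch below assumes that the "Kincaid" figure refers to the Flesch-Kincaid grade level, computed as 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The syllable counter is a rough vowel-group heuristic and the function names are hypothetical. The abstract does not state which tool produced the reported scores, so values from this sketch should not be expected to match them.

```python
import re


def count_syllables(word):
    """Rough heuristic: count vowel groups, dropping a trailing silent 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def flesch_kincaid_grade(text):
    """Estimate the US school grade level needed to read the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Illustrative inputs written for this sketch, not drawn from the study data.
plain = "Eat well. Walk each day."
dense = ("Cardiometabolic comorbidities necessitate individualized "
         "pharmacological interventions.")
plain_grade = flesch_kincaid_grade(plain)
dense_grade = flesch_kincaid_grade(dense)
```

A lower grade indicates text that is easier to read, so the short, plain sentences score well below the jargon-heavy one.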

Conclusions:

In this study, we described the steps in developing an LLM using an interdisciplinary team. Through data collection, data scraping, model benchmarking, and model fine-tuning, our LLM’s performance improved substantially (Accuracy 5.0, Readability 4.98, Professionalism 4.98, Kincaid 7.17, Jargon 2.92). These results suggest strong promise for the model once it is finalized. In building an LLM for cardiovascular health promotion and patient education, this research contributes to innovative strategies for managing comorbid conditions in persons with HIV.

Clinical Trial:

N/A


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.