Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Feb 1, 2025
Date Accepted: Apr 30, 2025

The final, peer-reviewed published version of this preprint can be found here:

Adhikary PK, Motiyani I, Oke G, Joshi M, Pathak K, Singh SM, Chakraborty T

Menstrual Health Education Using a Specialized Large Language Model in India: Development and Evaluation Study of MenstLLaMA

J Med Internet Res 2025;27:e71977

DOI: 10.2196/71977

PMID: 40669074

PMCID: 12286563

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

MenstLLaMA: A Specialized Large Language Model for Menstrual Health Education in India

  • Prottay Kumar Adhikary
  • Isha Motiyani
  • Gayatri Oke
  • Maithili Joshi
  • Kanupriya Pathak
  • Salam Michael Singh
  • Tanmoy Chakraborty

ABSTRACT

Background:

The quality and accessibility of menstrual health education in developing nations, including India, remain inadequate due to challenges such as poverty, social stigma, and gender inequality. While community-driven initiatives aim to raise awareness, artificial intelligence (AI) offers a scalable solution for disseminating accurate information. However, existing general-purpose large language models (LLMs) are ill-suited for this task, suffering from low accuracy, cultural insensitivity, and overly complex responses. To address these limitations, we developed MenstLLaMA, a specialized LLM tailored to the Indian context and designed to deliver menstrual health education empathetically, supportively, and accessibly.

Objective:

To develop and evaluate MenstLLaMA, a specialized LLM tailored to deliver accurate, culturally sensitive menstrual health education, and to assess its effectiveness compared to existing general-purpose models.

Methods:

We curated a novel, domain-specific dataset and benchmarked state-of-the-art LLMs to develop MenstLLaMA, an empathic companion model. The evaluation employed an open-label benchmark design with a four-stage framework: (1) overlap with ground truth, (2) clinical relevance, (3) response diversity, and (4) user satisfaction. A panel of clinical experts (N=18) conducted expert evaluations, while participants (N=1,200) interacted with chatbots, including MenstLLaMA, in 15–20-minute randomized sessions for user satisfaction assessment.

Results:

MenstLLaMA was compared against state-of-the-art general-purpose LLMs such as GPT-4o, Claude-3, and Mistral using automated and human-based metrics. MenstLLaMA achieved the highest BLEU score (0.059) and BERTScore (0.911), outperforming competitors without requiring few-shot learning. Clinical experts consistently rated its responses as superior to the gold-standard answers. User case studies revealed high ratings for Understandability (4.7/5) and Relevance (4.3/5), with a moderate rating for Context Sensitivity (3.9/5).

Conclusions:

MenstLLaMA demonstrates exceptional accuracy, empathy, and user satisfaction in menstrual health education, bridging critical gaps left by general-purpose LLMs. Its potential for integration into broader health education platforms positions it as a transformative tool for menstrual well-being. Future research may explore its long-term impact on public perception and menstrual hygiene practices.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.