Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Advancing Public Health Medical Education in Low-Resource Settings: New Insights from a Comparative Performance Study of Authoritative Textbook-Augmented Large Language Model in Xizang Autonomous Region
ABSTRACT
Background:
Public health medical education has become increasingly important in the low-resource, high-altitude Xizang Autonomous Region, driven by rising demand for residence and tourism and by the associated economic and social benefits. Traditional authoritative textbooks fail to meet the growing demand for accessibility and interactivity in the digital era, while highly accessible general-purpose large language models (LLMs) suffer from hallucinations when applied to specialized medical domains. In addition, developing specialized LLMs for low-resource domains and regions is prohibitively expensive and difficult.
Objective:
To explore a novel approach to high-altitude public health medical education that integrates modern LLMs with authoritative textbooks, combining a comprehensive multidimensional benchmark evaluation with retrieval-augmented generation (RAG) technology.
Methods:
Assessments: Four publicly available LLMs (GPT-5.2 developed by OpenAI, Gemini 3.0 Pro developed by Google, DeepSeek R1 developed by DeepSeek, and Tencent HY 2.0 developed by Tencent) were evaluated to select the optimal model, using a clinically weighted multidimensional first-response score (comprehensiveness, accuracy, clarity, and relevance) and a composite consistency metric (semantic similarity and algorithmic similarity). Model performance was evaluated on a set of 80 questions designed specifically for high-altitude public health by authoritative medical specialists.
Retrieval-augmented generation (RAG): Four widely used authoritative textbooks on high-altitude public health medicine (High Altitude Medicine and Physiology; High Altitude Medicine: A Case-Based Approach; High Altitude Medicine; and High Altitude Medical Protection) were deployed as the external knowledge base for the model selected in the evaluation.
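As an illustration of how a clinically weighted multidimensional score of the kind described in the Methods could be computed and used to rank candidate models, the following is a minimal sketch only. The abstract does not report the dimension weights, the rating scale handling, or the aggregation rule, so the weights in `WEIGHTS`, the function names `weighted_score` and `rank_models`, and the use of a per-model mean are all assumptions made for illustration, not the authors' method.

```python
from statistics import mean

# Hypothetical weights for illustration: the abstract names the four
# dimensions but does not report the clinical weights actually used.
WEIGHTS = {
    "comprehensiveness": 0.25,
    "accuracy": 0.40,
    "clarity": 0.15,
    "relevance": 0.20,
}


def weighted_score(ratings, weights=WEIGHTS):
    """Return the weighted average of one response's dimension ratings."""
    if set(ratings) != set(weights):
        raise ValueError("ratings must cover exactly the weighted dimensions")
    total_weight = sum(weights.values())
    return sum(weights[d] * ratings[d] for d in weights) / total_weight


def rank_models(per_model_ratings, weights=WEIGHTS):
    """Rank models by mean weighted score over all rated responses.

    per_model_ratings maps a model name to a list of rating dicts,
    one dict per question. Averaging across questions is an assumption.
    """
    means = {
        model: mean(weighted_score(r, weights) for r in responses)
        for model, responses in per_model_ratings.items()
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)
```

Under these assumptions, `rank_models` would be called with one list of per-question ratings for each candidate model, and the first entry of the returned list would be the model selected as the base for retrieval augmentation.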
Results:
DeepSeek R1 was selected as the optimal base model because it achieved the highest weighted score (5.61/10.00), followed by GPT-5.2 (5.51/10.00), Gemini 3.0 Pro (5.39/10.00), and Tencent HY 2.0 (4.71/10.00). The deployed retrieval-augmented model HHME-Xplus-RAG, which integrates the authoritative textbooks with DeepSeek R1, achieved significantly higher multidimensional scores than baseline DeepSeek R1 (7.94 [7.75, 8.00] vs. 7.63 [7.38, 7.88]; P < 0.001).
Conclusions:
The proposed authoritative textbook-augmented LLM, the HHSE-Xplus-RAG model, demonstrated superior multidimensional performance in medical education on high-altitude public health compared with general-purpose LLMs, providing a practical paradigm for integrating traditional authoritative textbooks with modern LLMs to create reliable and specialized educational tools.
Clinical Trial: Not applicable
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.