
Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Mar 28, 2025
Date Accepted: May 24, 2025

The final, peer-reviewed published version of this preprint can be found here:

Evaluating and Improving Syndrome Differentiation Thinking Ability in Large Language Models: Method Development Study

Chen C, Wang X, Guan M, Yue W, Wu Y, Zhou Y, Wang X

Evaluating and Improving Syndrome Differentiation Thinking Ability in Large Language Models: Method Development Study

JMIR Med Inform 2025;13:e75103

DOI: 10.2196/75103

PMID: 40540614

PMCID: 12204376

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Evaluating and Improving Syndrome Differentiation Thinking Ability in Large Language Models: Method Development Study

  • Chunliang Chen; 
  • Xinyu Wang; 
  • Ming Guan; 
  • Wenjing Yue; 
  • Yuanbin Wu; 
  • Ya Zhou; 
  • Xiaoling Wang

ABSTRACT

Background:

Large language models (LLMs) provide new opportunities to advance the intelligent development of traditional Chinese medicine (TCM). Syndrome differentiation thinking is an essential part of TCM, and equipping LLMs with this capability represents a crucial step toward more effective clinical applications of TCM. However, given the complexity of TCM syndrome differentiation thinking, acquiring this ability is a considerable challenge for these models.

Objective:

This study aims to evaluate the syndrome differentiation thinking ability of LLMs and to design a method that effectively enhances their performance in this area.

Methods:

We decomposed the process of TCM syndrome differentiation thinking into three core tasks: pathogenesis inference, syndrome inference, and diagnostic suggestion. To evaluate the performance of LLMs on these tasks, we constructed a high-quality evaluation dataset, providing a reliable foundation for the quantitative assessment of their capabilities. Furthermore, we developed a methodology for generating instruction data based on the idea of an "open-book exam": we customized three data templates, dynamically retrieved task-relevant professional knowledge, and inserted it into predefined positions within the templates. This approach effectively generates high-quality instruction data aligned with the unique characteristics of TCM syndrome differentiation thinking. Leveraging this instruction data, we fine-tuned the base model to enhance its syndrome differentiation thinking ability.
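The "open-book exam" idea described above can be sketched in a few lines: a per-task template carries predefined slots, and retrieved knowledge is inserted into those slots before the case and question. The template wording, knowledge entries, and retrieval function below are all illustrative assumptions, not the paper's actual implementation (which the abstract does not specify).

```python
# Minimal sketch of "open-book exam" instruction-data generation:
# fill a task template's predefined slot with dynamically retrieved knowledge.
# All templates, knowledge entries, and the retriever are hypothetical.

TEMPLATES = {
    "pathogenesis_inference": (
        "Reference knowledge:\n{knowledge}\n\n"
        "Medical case:\n{case}\n\n"
        "Question: infer the pathogenesis of this case."
    ),
}

KNOWLEDGE_BASE = [
    "Qi deficiency often presents with fatigue and a weak pulse.",
    "Liver qi stagnation may manifest as irritability and chest tightness.",
]

def retrieve(case: str, top_k: int = 1) -> list[str]:
    """Toy retriever: rank knowledge entries by word overlap with the case.
    A real system would use dense or keyword retrieval over a TCM corpus."""
    case_words = set(case.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda k: len(case_words & set(k.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_instruction(task: str, case: str) -> str:
    """Insert retrieved knowledge into the template's predefined position."""
    knowledge = "\n".join(retrieve(case))
    return TEMPLATES[task].format(knowledge=knowledge, case=case)

print(build_instruction(
    "pathogenesis_inference",
    "Patient reports fatigue and a weak pulse.",
))
```

The same pattern would repeat for the syndrome-inference and diagnostic-suggestion tasks, each with its own template and question wording.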

Results:

We collected 200 medical cases for the evaluation dataset and standardized them into three types of task questions. We tested general-purpose and TCM-specific LLMs and compared their performance with our proposed solution. The results demonstrate that our method significantly enhances LLMs' syndrome differentiation thinking ability. Our model achieved 85.7% and 81.2% accuracy in Tasks 1 and 2, respectively, surpassing the best-performing TCM and general LLMs by 26.3% and 15.8%. In Task 3, our model scored 84.3, indicating that its recommendations closely match the advice given by experts.

Conclusions:

Existing general LLMs and TCM LLMs still have significant limitations in the core tasks of syndrome differentiation thinking. Our research shows that fine-tuning LLMs with carefully designed professional instruction templates and high-quality generated instruction data can significantly improve their performance on these tasks. The optimized LLMs produce reasoning results highly similar to the opinions of domain experts, indicating that they can simulate syndrome differentiation thinking to a certain extent. This has important theoretical and practical significance for the in-depth interpretation of the complexity of the clinical diagnosis and treatment process in TCM.




© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC-BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft other than for review purposes.