Currently submitted to: JMIR Medical Informatics

Date Submitted: Feb 4, 2026
Open Peer Review Period: Feb 11, 2026 - Apr 8, 2026
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Development and Validation of an Interpretable Machine Learning Model for Predicting Lateral Neck Lymph Node Metastasis in Papillary Thyroid Carcinoma Based on Ultrasound Data: A Retrospective Study

  • Ruijie Sun; 
  • Yuhui Ma; 
  • Yushan Jiang; 
  • Xiaoguang Li

ABSTRACT

Background:

Lateral neck lymph node metastasis (LLNM) is a major determinant of recurrence risk and surgical strategy in papillary thyroid carcinoma (PTC). However, accurate preoperative identification of LLNM remains challenging, as conventional imaging assessment is limited by operator dependency and variable diagnostic performance. Although several predictive models have been proposed, many suffer from limited generalizability or poor interpretability, hindering their integration into clinical decision-making.

Objective:

Accurate preoperative prediction of LLNM in PTC remains challenging, and existing models are limited by poor interpretability or restricted applicability. This study aimed to develop and validate an interpretable machine learning (ML) model based on routine clinical and ultrasound data to predict LLNM risk in patients with PTC.

Methods:

This retrospective cohort study enrolled 816 patients with PTC (June 2022-May 2024), randomly split into training (n=571) and internal validation (n=245) sets at a 7:3 ratio, with an independent external validation cohort of 178 patients (June 2024-May 2025). Clinical, laboratory, and routine ultrasound data were collected. Feature selection followed a three-step approach: (1) univariate and multivariate logistic regression (LR) analysis, (2) the Boruta-SHAP algorithm for importance ranking, and (3) clinical expert validation to ensure clinical relevance. Nine ML models were developed, with hyperparameter tuning via grid search and 10-fold cross-validation. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and F1-score. The SHapley Additive exPlanations (SHAP) method was used for model interpretation.
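The data partitioning described above can be illustrated with a minimal sketch. It assumes a plain random shuffle with a fixed seed and round-robin fold assignment; the abstract does not state the software, the seed, or whether the split was stratified by LLNM status, so the helper names `split_7_3` and `k_fold` and the integer patient IDs are hypothetical.

```python
import random


def split_7_3(ids, seed=0):
    """Shuffle IDs and split them 7:3 into training and validation lists.

    The seed is an assumption for reproducibility; the source does not give one.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * 0.7)  # floor of 70%; 816 gives 571
    return shuffled[:n_train], shuffled[n_train:]


def k_fold(ids, k=10):
    """Yield (fit_ids, held_out_ids) pairs for k-fold cross-validation."""
    # Round-robin assignment keeps fold sizes within one of each other.
    folds = [ids[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        fit = [pid for j, fold in enumerate(folds) if j != i for pid in fold]
        yield fit, held_out


patients = list(range(816))  # placeholder IDs, not real patient data
train, valid = split_7_3(patients)
fold_sizes = [len(held_out) for _, held_out in k_fold(train)]

print(len(train), len(valid))  # sizes of the two internal sets
print(fold_sizes)              # held-out size of each of the 10 folds
```

In a grid search, each candidate hyperparameter setting would be fitted on the `fit` portion and scored on the `held_out` portion of every fold, with the internal validation set of 245 patients left untouched until the final evaluation.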

Results:

Eight independent risk factors were identified: gender, multifocality, age, tumor diameter, tumor location, capsular invasion, central lymph node metastasis, and uneven lateral cervical lymph node hilum echo. The Gradient Boosting Machine (GBM) model demonstrated optimal performance with an AUC of 0.905 (95% CI: 0.868-0.942), sensitivity of 0.831, specificity of 0.840, and F1-score of 0.764 in internal validation. External validation confirmed robust generalizability (AUC: 0.887, 95% CI: 0.840-0.934). SHAP analysis revealed that tumor size, gender, lateral cervical lymph node echo, central lymph node metastasis, and capsular invasion were the top five contributors to high LLNM risk, and provided individualized risk interpretation.
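Because the F1-score is the harmonic mean of precision and sensitivity (recall), the reported internal-validation figures imply a precision that the abstract does not state. The sketch below is a back-of-envelope consistency check, assuming the sensitivity and F1-score were computed at the same decision threshold on the same set; the derived value is not a reported result, and the function names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    return f1 * recall / (2 * recall - f1)


sensitivity = 0.831  # reported for internal validation
f1 = 0.764           # reported for internal validation

precision = implied_precision(f1, sensitivity)
print(round(precision, 3))  # approximately 0.707 under the stated assumption
```

Under that assumption, roughly seven in ten patients flagged as high risk in the internal validation set would have had LLNM, subject to rounding in the reported metrics.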

Conclusions:

This interpretable GBM model, based on routinely accessible clinical and ultrasound data, enables accurate preoperative LLNM risk stratification, supporting personalized decisions on the extent of lymph node dissection and potentially reducing unnecessary prophylactic surgery while ensuring adequate treatment for high-risk patients.


 Citation

Please cite as:

Sun R, Ma Y, Jiang Y, Li X

Development and Validation of an Interpretable Machine Learning Model for Predicting Lateral Neck Lymph Node Metastasis in Papillary Thyroid Carcinoma Based on Ultrasound Data: A Retrospective Study

JMIR Preprints. 04/02/2026:92844

DOI: 10.2196/preprints.92844

URL: https://preprints.jmir.org/preprint/92844

© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.