Authors’ Reply: Critical Limitations in Comparing ChatGPT and DeepSeek for Orthopedic Assessment
Chirathit Anusitviwat; Sitthiphong Suwannaphisit; Jongdee Bvonpanttarananon; Boonsin Tangtrakulwanich
ABSTRACT
We respond to comments on our study comparing ChatGPT and DeepSeek for answering orthopedic multiple-choice questions. We clarify that the reported Cohen κ values reflect inter-rater reliability within each model rather than agreement between the two models. All questions were administered in English, and the findings therefore reflect performance in an English-language context. We acknowledge limitations related to reproducibility due to the use of web-based interfaces and address concerns about data contamination. We also correct a typographical error in the reported accuracy for the pelvic and spine injury category.
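As a reminder of what a Cohen κ value measures, the following is a minimal sketch, not the authors' analysis code. It computes κ for any two sequences of categorical ratings over the same items. The function name `cohen_kappa` and the answer lists are hypothetical and chosen only for illustration. The sketch assumes that a "within-model" κ compares two sets of ratings of the same model's output, and that a "between-model" κ would instead compare one model's answers with the other's. The abstract does not specify how the ratings were produced.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: sum over categories of the product of marginal proportions.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    # Degenerate case: both raters used a single identical label throughout.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical data: two sets of ratings for the same six multiple-choice items.
first_pass = ["A", "B", "C", "A", "D", "B"]
second_pass = ["A", "B", "C", "A", "D", "C"]

kappa_within = cohen_kappa(first_pass, second_pass)
print(f"kappa = {kappa_within:.3f}")
```

The same function applied to one model's answers against the other's would yield a between-model agreement statistic, which is a different quantity from the within-model reliability described above.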
Citation
Please cite as:
Anusitviwat C, Suwannaphisit S, Bvonpanttarananon J, Tangtrakulwanich B
Authors’ Reply: Critical Limitations in Comparing ChatGPT and DeepSeek for Orthopedic Assessment