Accepted for/Published in: JMIR Medical Informatics

Date Submitted: Aug 2, 2024
Date Accepted: Oct 19, 2024

The final, peer-reviewed published version of this preprint can be found here:

Exploring the Potential of Claude 3 Opus in Renal Pathological Diagnosis: Performance Evaluation

Li X, Liu K, Lang Y, Chai Z, Liu F

JMIR Med Inform 2024;12:e65033

DOI: 10.2196/65033

PMID: 39547661

PMCID: 11607560

Exploring the Potential of Claude 3 Opus in Renal Pathological Diagnosis: A Performance Evaluation

  • Xingyuan Li
  • Ke Liu
  • Yanling Lang
  • Zhonglin Chai
  • Fang Liu

ABSTRACT

Background:

Artificial intelligence (AI) has shown great promise in assisting medical diagnosis, but its application in renal pathology remains limited.

Objective:

This study aimed to evaluate the performance of an advanced AI language model, Claude 3 Opus, in generating diagnostic descriptions for renal pathological images.

Methods:

A dataset of 100 renal pathological images across 27 disease types was curated. Claude 3 Opus generated diagnostic descriptions for each image, which were scored by two pathologists on clinical relevance, accuracy, fluency, completeness, and overall value.

Results:

Claude 3 Opus achieved high scores in language fluency (mean=3.86) but lower scores in clinical relevance (1.75), accuracy (1.55), completeness (2.01), and overall value (1.75). Performance varied across disease types. Inter-rater agreement was substantial for relevance (κ=0.627) and overall value (κ=0.589), and moderate for accuracy (κ=0.485) and completeness (κ=0.458).
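The inter-rater agreement values above are kappa statistics. The abstract does not state which variant was used, so the following is only a minimal sketch assuming unweighted Cohen's kappa for two raters. The function name `cohen_kappa` and the score lists are hypothetical illustrations, not the study's code or data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items.

    Assumption: scores are treated as unordered categories. A weighted
    kappa (which credits near-misses on an ordinal scale) would differ.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items")

    n = len(rater_a)

    # Observed agreement: fraction of items given identical scores.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal proportions,
    # summed over the score categories.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical scores from two pathologists for ten images (not study data).
scores_a = [1, 2, 2, 3, 1, 4, 2, 1, 3, 2]
scores_b = [1, 2, 3, 3, 1, 4, 2, 2, 3, 2]

kappa = cohen_kappa(scores_a, scores_b)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why two raters who match on most images can still produce a value well below 1.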

Conclusions:

Claude 3 Opus shows potential in generating fluent renal pathology descriptions but needs improvement in accuracy and clinical value. Its performance varies across disease types. Further optimization and validation are needed before clinical application.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.