Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Aug 9, 2024
Date Accepted: May 6, 2025

The final, peer-reviewed published version of this preprint can be found here:

Use of Large Language Models to Classify Epidemiological Characteristics in Synthetic and Real-World Social Media Posts About Conjunctivitis Outbreaks: Infodemiology Study

Deiner MS, Deiner RY, Fathy C, Deiner NA, Hristidis V, McLeod SD, Bukowski TJ, Doan T, Seitzman GD, Lietman TM, Porco TC

J Med Internet Res 2025;27:e65226

DOI: 10.2196/65226

PMID: 40601927

PMCID: 12268217

Use of large language models to classify epidemiological characteristics in synthetic and real-world social media posts about conjunctivitis outbreaks: Infodemiology Study

  • Michael S. Deiner; 
  • Russell Y Deiner; 
  • Cherie Fathy; 
  • Natalie A Deiner; 
  • Vagelis Hristidis; 
  • Stephen D. McLeod; 
  • Thomas J. Bukowski; 
  • Thuy Doan; 
  • Gerami D. Seitzman; 
  • Thomas M. Lietman; 
  • Travis C. Porco

ABSTRACT

Background:

Online search and social media data can help identify epidemics, potentially earlier than clinical methods, and may even reveal otherwise unreported outbreaks. Monitoring for eye-related epidemics can facilitate early public health intervention to reduce transmission and ocular comorbidities. However, monitoring social media post content for conjunctivitis outbreaks is costly and laborious. Large language models (LLMs) could overcome these barriers by assessing the likelihood that posts describe real-world outbreaks. Public health responses to likely outbreaks would benefit further from additional epidemiological characteristics, such as outbreak type, size, and severity.

Objective:

We assessed whether and how well LLMs can classify epidemiological features from social media posts beyond conjunctivitis outbreak probability, including outbreak type, size, severity, etiology, and community setting. We employed a validation framework that compared each LLM's classifications with those of other LLMs and of human experts.

Methods:

We wrote code to generate synthetic social media posts about conjunctivitis outbreaks, embedded with specific pre-classified epidemiological features to simulate various infectious eye outbreak and control scenarios. We used these posts to develop effective LLM prompts and to test the ability of multiple LLMs to assess them. For the top-performing LLMs, we next gauged their practical utility in real-world epidemiological surveillance by comparing their assessments of Twitter/X, forum, and YouTube posts about conjunctivitis. Finally, human graders also classified posts, and we compared their classifications with those of a leading LLM for validation. Comparisons used either correlation or sensitivity and specificity statistics.
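As an illustration only, the synthetic-post step could be sketched as follows. The abstract does not give the study's feature levels, templates, or code, so every name, level, and template below is a hypothetical placeholder; the point is simply that each generated post carries its known labels, which can later be compared with an LLM's classification.

```python
import itertools

# Hypothetical feature levels; the study's actual levels are not stated in the abstract.
OUTBREAK_TYPES = ["infectious", "allergic", "environmental"]
SIZES = {"small": 3, "large": 40}          # label -> number of people mentioned
SETTINGS = ["school", "workplace"]

# Hypothetical post templates, one per outbreak type.
TEMPLATES = {
    "infectious": "{n} people at my {setting} have pink eye and it keeps spreading.",
    "allergic": "{n} people at my {setting} have itchy red eyes since the pollen count went up.",
    "environmental": "{n} people at my {setting} got red, burning eyes after the smoke rolled in.",
}


def generate_posts():
    """Build one synthetic post per feature combination, keeping the known labels."""
    posts = []
    for otype, (size_label, n), setting in itertools.product(
        OUTBREAK_TYPES, SIZES.items(), SETTINGS
    ):
        text = TEMPLATES[otype].format(n=n, setting=setting)
        posts.append(
            {"text": text, "type": otype, "size": size_label, "setting": setting}
        )
    return posts


posts = generate_posts()
for post in posts[:2]:
    print(post["type"], post["size"], post["setting"], "|", post["text"])
```

Because the labels are fixed before the text is written, any disagreement between an LLM's output and the stored labels can be attributed to the model or the prompt rather than to ambiguous ground truth.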

Results:

We assessed how effectively seven LLMs classified epidemiological data from 1,152 synthetic posts, 370 Twitter/X posts, 290 forum posts, and 956 YouTube comments. Despite some discrepancies, LLMs demonstrated a reliable capacity for nuanced epidemiological analysis across data sources, both in comparison with human graders and with other LLMs. Notably, GPT-4 and Mixtral 8x22b exhibited high performance in predicting conjunctivitis outbreak characteristics such as probability (0.73 correlation, GPT-4), size (0.82 correlation, Mixtral 8x22b), and outbreak type (infectious, allergic, or environmental), although there were notable exceptions. When synthetic and real-world post content was assessed for etiological causes, validation by infectious eye disease specialists showed that GPT-4 had high specificity (0.83-1.00) but variable sensitivity (0.32-0.71). Inter-rater reliability analyses showed that LLM-expert agreement exceeded expert-expert agreement for severity assessment (ICC = 0.69 vs 0.38), while agreement varied by condition type (κ = 0.37-0.94).
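For readers unfamiliar with the statistics reported above, the following minimal sketch shows how sensitivity, specificity, and Cohen's κ are computed for one binary label (for example, whether a post describes a given etiology). The example ratings are invented for illustration and are not study data; the function names are hypothetical, and the code assumes both classes occur in the reference ratings.

```python
def confusion_counts(reference, predicted):
    """Count true/false positives and negatives for two binary label lists."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)
    tn = sum(1 for r, p in pairs if not r and not p)
    fp = sum(1 for r, p in pairs if not r and p)
    fn = sum(1 for r, p in pairs if r and not p)
    return tp, tn, fp, fn


def sensitivity_specificity(reference, predicted):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, tn, fp, fn = confusion_counts(reference, predicted)
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    tp, tn, fp, fn = confusion_counts(rater_a, rater_b)
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    # Chance agreement from each rater's marginal rate of "yes" and "no".
    p_yes = ((tp + fn) / n) * ((tp + fp) / n)
    p_no = ((tn + fp) / n) * ((tn + fn) / n)
    expected = p_yes + p_no
    return (observed - expected) / (1 - expected)


# Invented ratings for eight posts (1 = label present, 0 = absent).
expert = [1, 1, 1, 0, 0, 0, 0, 0]
llm = [1, 1, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(expert, llm)
kappa = cohen_kappa(expert, llm)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} kappa={kappa:.2f}")
```

In this toy example the LLM misses one of three expert-positive posts and flags one of five expert-negative posts. High specificity paired with lower sensitivity, the pattern reported for GPT-4, means few false alarms but some missed cases.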

Conclusions:

This investigation into the potential of LLMs for public health infoveillance suggests that they can effectively classify key epidemiological characteristics from social media content about conjunctivitis outbreaks. Future studies should further explore LLMs' potential to support public health monitoring through automated assessment and classification of potential infectious eye outbreaks and other outbreaks. Their optimal role may be to act as a first line of documentation, alerting public health organizations to follow up on small, early outbreaks that LLMs have detected and classified, with a focus on the most severe.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.