Accepted for/Published in: JMIR AI

Date Submitted: Apr 18, 2025
Date Accepted: Aug 27, 2025

The final, peer-reviewed published version of this preprint can be found here:

Simmich J, Ross MH, Russell TG

Assessing the Capability of Large Language Models for Navigation of the Australian Health Care System: Comparative Study

JMIR AI 2025;4:e76203

DOI: 10.2196/76203

PMID: 41060005

PMCID: 12508777

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Assessing the Capability of Large Language Models for Navigation of the Australian Healthcare System: A Comparative Study

  • Joshua Simmich
  • Megan Heather Ross
  • Trevor Glen Russell

ABSTRACT

Background:

Australians in rural and regional areas face significant challenges in navigating the healthcare system, including limited services, fewer treatment options, and difficulty understanding their entitlements. Generative search tools, powered by large language models (LLMs), show promise in improving health information retrieval by generating direct answers. However, concerns remain regarding their accuracy and reliability compared with traditional search engines in a healthcare context.

Objective:

This study aimed to compare the effectiveness of a generative AI search tool (Microsoft Copilot) with that of a conventional search engine (Google Web Search) for navigating healthcare information.

Methods:

A total of 97 adults in Queensland participated in an online survey, answering scenario-based healthcare navigation questions using either Microsoft Copilot or Google Web Search. Accuracy was assessed using binary correct/incorrect ratings, graded correctness (incorrect, partially correct, correct), and numerical scores (0–2 for service identification, 0–6 for criteria). Participants also completed a Technology Rating Questionnaire (TRQ) to evaluate their experience with their assigned tool.

Results:

Participants assigned to Microsoft Copilot outperformed the Google Web Search group on two healthcare navigation tasks (identifying aged care application services and listing mobility allowance eligibility criteria), with no clear evidence of a difference on the remaining six tasks. On the TRQ, participants rated Google Web Search higher in willingness to adopt and perceived impact on quality of life, and lower in effort needed to learn. Both tools received similar ratings in perceived value, confidence, help required to use, and concerns about privacy.

Conclusions:

Generative AI tools can achieve comparable accuracy to traditional search engines for healthcare navigation tasks, though this did not translate into an improved user experience. Further evaluation is needed as AI technology improves and users become more familiar with its use.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.