Accepted for/Published in: JMIR AI
Date Submitted: Dec 7, 2023
Open Peer Review Period: Dec 7, 2023 - Feb 2, 2024
Date Accepted: Dec 31, 2024
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
Artificial Intelligence in the Pharmaceutical Industry: Gauging Use Potential for Creation of Scientific Response Documentation in Medical Information
ABSTRACT
Background:
Pharmaceutical manufacturers address healthcare professionals' (HCPs) information needs through Scientific Response Documents (SRDs), offering evidence-based answers to medication and disease state questions. Medical Information departments, staffed by medical experts, develop SRDs that provide concise summaries, relevant background information, search strategies, clinical data, and balanced references. With an escalating demand for SRDs and the increasing complexity of therapies, Medical Information departments are exploring advanced technologies and artificial intelligence (AI) tools like Large Language Models (LLMs) to streamline content development. While AI in general and LLMs in particular show promise in generating draft responses, a synergistic approach combining an LLM with traditional machine learning classifiers in a series of human-supervised and curated steps could help to address limitations like hallucination and ensure accuracy, context, traceability, and accountability in SRD creation.
Objective:
This study aims to quantify the pain points of SRD development and to develop a framework for exploring the feasibility and added value of integrating AI capabilities, including LLMs and machine learning, into the process of SRD creation.
Methods:
A survey conducted by phactMI, a non-profit consortium of Medical Information leaders in the pharmaceutical industry, assessed aspects of SRD creation across 33 member companies. The survey collected data on time and tediousness, with respondents ranking various tasks related to SRD development. The results identified steps in SRD creation where AI could offer maximum benefit. Another working group, consisting of Medical Information professionals and data scientists, utilized AI to aid SRD authoring, focusing on data extraction and abstraction. They employed logistic regression on semantic embedding features to train classification models and transformer-based summarization pipelines to generate concise summaries, maintaining a collaborative approach throughout the study.
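As a rough illustration of the classification step described above, the following Python sketch trains a logistic regression on sentence vectors. It is not the working group's pipeline: the embedding here is a toy L2-normalised bag of words standing in for a real semantic embedding model, and the two categories ("results" vs "background"), the example sentences, and all function names are hypothetical.

```python
import math
import re

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def build_vocab(sentences):
    vocab = {}
    for sentence in sentences:
        for tok in tokenize(sentence):
            vocab.setdefault(tok, len(vocab))
    return vocab

def embed(sentence, vocab):
    """Toy stand-in for a semantic embedding: L2-normalised bag of words.
    ASSUMPTION: a real pipeline would call a sentence-embedding model here."""
    vec = [0.0] * len(vocab)
    for tok in tokenize(sentence):
        if tok in vocab:  # out-of-vocabulary tokens are ignored
            vec[vocab[tok]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def log_loss(w, b, X, y):
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict_proba(w, b, xi), eps), 1 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(X)

def train_logreg(X, y, lr=1.0, epochs=200):
    """Full-batch gradient descent on the mean logistic loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Hypothetical training sentences: label 1 = "results", 0 = "background".
train = [
    ("Response rate improved significantly versus placebo", 1),
    ("Median survival increased by four months", 1),
    ("Adverse events occurred less often with treatment", 1),
    ("This disease affects many adults worldwide", 0),
    ("Guidelines recommend early screening for diagnosis", 0),
    ("Prior studies describe limited therapeutic options", 0),
]
vocab = build_vocab(s for s, _ in train)
X = [embed(s, vocab) for s, _ in train]
y = [label for _, label in train]

initial_loss = log_loss([0.0] * len(vocab), 0.0, X, y)
w, b = train_logreg(X, y)
final_loss = log_loss(w, b, X, y)

# Score two unseen sentences built from words seen in training.
p_results = predict_proba(w, b, embed("Survival improved with treatment", vocab))
p_background = predict_proba(w, b, embed("Guidelines describe this disease", vocab))
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}")
print(f"P(results): {p_results:.3f} vs {p_background:.3f}")
```

In a setup like this, swapping `embed` for a real embedding model would leave the classifier code unchanged, which is one reason a linear model on top of fixed embeddings is a common low-cost baseline.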
Results:
The survey had a 52% response rate and revealed that Medical Information departments create an average of 614 new documents and update 1352 documents annually. The respondents ranked paraphrasing content from scientific articles as the most tedious and time-consuming task. In the second phase of the project, all trained sentence classification models demonstrated clear (though in some cases modest) ability to distinguish their target categories when applied to test SRD references that were outside their training sets. The preferred balance between Bilingual Evaluation Understudy (BLEU) score and semantic similarity on the paraphrased texts varied among reviewers, with each individual preferring a different tradeoff between these metrics.
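To make the two paraphrase metrics concrete, here is a minimal Python sketch, not the study's evaluation code. It computes a simplified sentence-level BLEU (clipped unigram and bigram precision with a brevity penalty, single reference, no smoothing) and a cosine similarity. The cosine is taken over word counts, which is only a crude lexical stand-in for the embedding-based semantic similarity the study presumably used; the example sentences and function names are hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=2):
    """Simplified BLEU: geometric mean of clipped n-gram precisions
    (n = 1..max_n) times a brevity penalty. Single reference, no smoothing."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c, r = ngrams(cand, n), ngrams(ref, n)
        overlap = sum(min(count, r[gram]) for gram, count in c.items())
        total = sum(c.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_prec += math.log(overlap / total) / max_n
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_prec)

def cosine_similarity(a, b):
    """ASSUMPTION: word-count cosine as a stand-in for embedding similarity."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * \
           math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

source = "the drug reduced mean systolic blood pressure"
paraphrase = "mean systolic blood pressure fell with the drug"

for label, text in [("verbatim copy", source), ("paraphrase", paraphrase)]:
    print(f"{label}: BLEU={bleu(text, source):.3f} "
          f"cosine={cosine_similarity(text, source):.3f}")
# verbatim copy: BLEU=1.000 cosine=1.000
# paraphrase: BLEU=0.655 cosine=0.802
```

A verbatim copy scores 1.0 on both metrics, whereas the reworded sentence loses more on BLEU than on cosine similarity. One reading of the reviewer disagreement is that a lower BLEU signals more rewording while a higher similarity signals retained meaning, and reviewers weighed the two differently.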
Conclusions:
This study establishes a framework for integrating AI, including LLM and machine learning, into SRD creation, supported by a pharmaceutical company survey emphasizing the benefits of AI in paraphrasing content. While machine learning models show potential for section identification and content utility assessment, further optimization and research are essential before full-scale industry implementation. The working group's insights guide AI-driven content analysis, addressing limitations and advancing efficient, precise, and responsive pharmaceutical SRD creation.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.