Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Aug 26, 2021
Date Accepted: May 30, 2022

The final, peer-reviewed published version of this preprint can be found here:

Examining Analytic Practices in Latent Dirichlet Allocation Within Psychological Science: Scoping Review

Hagg L, Merkouris SS, O’Dea GA, Francis LM, Greenwood CJ, Fuller-Tyszkiewicz M, Westrupp EM, Macdonald JA, Youssef GJ

J Med Internet Res 2022;24(11):e33166

DOI: 10.2196/33166

PMID: 36346659

PMCID: 9682457

Examining analytical practices in Latent Dirichlet Allocation within Psychological Science: A Scoping Review

  • Lauryn Hagg
  • Stephanie S Merkouris
  • Gypsy A O’Dea
  • Lauren M Francis
  • Christopher J Greenwood
  • Matthew Fuller-Tyszkiewicz
  • Elizabeth M Westrupp
  • Jacqui A Macdonald
  • George J Youssef

ABSTRACT

Background:

Latent Dirichlet Allocation (LDA) is a tool for rapidly synthesising meaning from ‘big data’, but outputs can be sensitive to decisions made during the analytic pipeline. This review will focus on the complex analytical practices specific to LDA, which existing practical guides for conducting LDA have not addressed.

Objective:

This scoping review will use key analytical steps (data selection, data pre-processing, and data analysis) as a framework to understand the methodological approaches being used in psychology research utilising LDA.

Methods:

Four psychology and health databases were searched. Studies were included if they used LDA to analyse written words and focussed on a psychological construct or issue. The data charting process was constructed and applied based on common data selection, pre-processing, and data analysis steps.

Results:

Forty-seven studies were included. These explored a range of research areas, and most sourced their data from social media platforms. While some studies reported the pre-processing and data analysis steps taken, most did not provide sufficient detail for reproducibility. The review also revealed debate about the necessity of certain pre-processing and data analysis steps.

Conclusions:

Findings highlight the growing use of LDA in psychological science. However, there is a need to improve analytical reporting standards and to identify comprehensive, evidence-based best-practice recommendations. To work towards this, we have developed an LDA Preferred Reporting Checklist, which will allow for consistent documentation of LDA analytic decisions and reproducible research outcomes.
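Illustrative sketch (not part of the authors' manuscript): the abstract notes that LDA outputs can be sensitive to decisions made during pre-processing. The minimal Python example below, using only the standard library, shows how two common pre-processing choices (lowercasing and stop-word removal) change the vocabulary that a topic model would receive. The toy corpus, the stop-word list, and the function names are all hypothetical and chosen purely for illustration; the example does not fit an LDA model and does not reproduce any method from the reviewed studies.

```python
import re
from collections import Counter

# Hypothetical toy corpus and stop-word list, for illustration only.
DOCS = [
    "I feel anxious and I cannot sleep at night",
    "Sleep problems and anxiety are making work hard",
    "Feeling hopeful after talking with my friends",
]
STOP_WORDS = {"i", "and", "at", "are", "my", "with", "the", "a"}


def preprocess(text, lowercase=True, remove_stop_words=True):
    """Tokenise one document under the given pre-processing decisions."""
    tokens = re.findall(r"[A-Za-z']+", text)
    if lowercase:
        tokens = [t.lower() for t in tokens]
    if remove_stop_words:
        # Compare in lowercase so the choice works with or without lowercasing.
        tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
    return tokens


def document_term_counts(docs, **options):
    """Build one bag-of-words Counter per document (the usual LDA input)."""
    return [Counter(preprocess(doc, **options)) for doc in docs]


def vocabulary(counts):
    """Return the sorted set of distinct terms across all documents."""
    return sorted(set().union(*counts))


# The decisions are recorded explicitly so they can be reported alongside results.
decisions_raw = {"lowercase": False, "remove_stop_words": False}
decisions_cleaned = {"lowercase": True, "remove_stop_words": True}

raw = document_term_counts(DOCS, **decisions_raw)
cleaned = document_term_counts(DOCS, **decisions_cleaned)

print("vocabulary size without pre-processing:", len(vocabulary(raw)))
print("vocabulary size with pre-processing:   ", len(vocabulary(cleaned)))
```

Keeping the decisions in an explicit dictionary is one simple way to make them reportable; the same corpus yields a different vocabulary, and therefore potentially different topics, under each setting.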


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.