Accepted for/Published in: JMIR Formative Research

Date Submitted: Jun 6, 2023
Date Accepted: Aug 21, 2023

The final, peer-reviewed published version of this preprint can be found here:

Automatic Extraction of Research Themes in Epidemiological Criminology From PubMed Abstracts From 1946 to 2020: Text Mining Study

Karystianis G, Simpson P, Lukmanjaya W, Ginnivan N, Nenadic G, Buchan I, Butler T

JMIR Form Res 2023;7:e49721

DOI: 10.2196/49721

PMID: 37738080

PMCID: 10559193

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Automatic extraction of research themes in epidemiological criminology from PubMed abstracts from 1946 to 2020: text mining study

  • George Karystianis
  • Paul Simpson
  • Wilson Lukmanjaya
  • Natasha Ginnivan
  • Goran Nenadic
  • Iain Buchan
  • Tony Butler

ABSTRACT

Background:

Epidemiological criminology studies the intersection between the public health and justice systems, focusing on prevalent health issues that affect offending and incarcerated populations. Given the growth of this field in recent years, it is important to understand and assess gaps between research outputs and the priorities identified by prisoner health stakeholders.

Objective:

To examine published research outputs in epidemiological criminology and assess gaps between these outputs and current research priorities identified by prison stakeholders.

Methods:

This was a text mining study. A rule-based method was applied to 23,904 PubMed epidemiological criminology abstracts to extract the study determinants and outcomes (i.e., “themes”). These themes were mapped against research priorities identified by Australian prison stakeholders to assess how research outputs differed from those priorities. The income level of the first author's affiliation country was also identified to compare the ranking of research priorities across country income groups.
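The abstract does not list the rules themselves. As a minimal sketch of what rule-based theme extraction can look like, the example below matches a small pattern lexicon against abstract text and counts how many abstracts mention each theme. The theme names, the regular expressions, and the function names `extract_themes` and `theme_frequencies` are illustrative assumptions, not the rules or code used in the study.

```python
import re
from collections import Counter

# Hypothetical theme lexicon. These patterns are illustrative assumptions,
# not the rules applied in the study.
THEME_PATTERNS = {
    "substance use": re.compile(
        r"\b(substance (use|abuse|misuse)|drug use)\b", re.IGNORECASE
    ),
    "HIV": re.compile(r"\b(HIV|human immunodeficiency virus)\b", re.IGNORECASE),
    "mental health": re.compile(
        r"\b(mental (health|illness|disorders?)|depression)\b", re.IGNORECASE
    ),
}


def extract_themes(abstract):
    """Return the set of themes whose pattern matches the abstract text."""
    return {
        theme
        for theme, pattern in THEME_PATTERNS.items()
        if pattern.search(abstract)
    }


def theme_frequencies(abstracts):
    """Count, for each theme, the number of abstracts that mention it."""
    counts = Counter()
    for text in abstracts:
        counts.update(extract_themes(text))
    return counts
```

Each abstract contributes at most once to a theme's count, so dividing a count by the number of abstracts gives the share of abstracts that mention the theme, which is the kind of percentage a theme frequency analysis reports.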

Results:

On an evaluation set of 100 abstracts, theme identification achieved an F1-score of 90.0%, indicating reliable performance. More than 50% of articles had at least one extracted theme; the most common was substance use (12.9%), followed by human immunodeficiency virus (12.6%). Infectious diseases (24.9%) was the most common research priority category, followed by mental health (24.0%) and alcohol and other drug use (20.5%). A comparison between the extracted themes and the stakeholder priorities showed alignment for mental health, infectious diseases, and alcohol and other drug use. Although behaviour- and juvenile-related themes were common, they did not feature as prison priorities. Most research was derived from high-income countries (85.3%), while countries with the lowest income status focused half of their research on infectious diseases (51.6%).
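For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The abstract reports only the F1-score, not the underlying precision, recall, or counts, so the sketch below uses the standard definition, and the function name `f1_score` is an assumption introduced for illustration.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Compute the F1-score from raw counts of extracted themes.

    Precision is the share of extracted themes that are correct;
    recall is the share of gold-standard themes that were extracted.
    """
    if true_positives == 0:
        # No correct extractions: F1 is defined as zero here to avoid
        # dividing by zero.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)
```

With purely hypothetical counts of 90 true positives, 10 false positives, and 10 false negatives, precision and recall are both 0.9, so the F1-score is also 0.9. These counts are not taken from the study.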

Conclusions:

The frequency of investigated themes may reflect historical developments concerning disease prevalence, treatment advances, and social understandings of illness and incarcerated populations. Differences between income status groups are likely to be explained by local health priorities and immediate health risks. Notable gaps between stakeholder research priorities and research outputs concerned themes focused more on social factors and systems. These gaps may reflect publication bias or self-publication-selection, highlighting the need for further research on prison health services and social determinants of health.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.