Accepted for/Published in: JMIR Formative Research

Date Submitted: Mar 12, 2025
Date Accepted: Jun 30, 2025

The final, peer-reviewed published version of this preprint can be found here:

Martikainen M, Smolander K, Sanmark J, Sanmark E

Evaluation of Generative Artificial Intelligence Implementation Impacts in Social and Health Care Language Translation: Mixed Methods Case Study

JMIR Form Res 2025;9:e73658

DOI: 10.2196/73658

PMID: 40961386

PMCID: 12443352

Evaluation of Generative Artificial Intelligence Implementation Impacts in Social and Health Care Language Translation: Mixed Methods Case Study

  • Miia Martikainen
  • Kari Smolander
  • Johan Sanmark
  • Enni Sanmark

ABSTRACT

Background:

Generative artificial intelligence (GAI) is expected to enhance the productivity of the public social and health care sector while maintaining, at minimum, current standards of quality and user experience. However, empirical evidence on the impacts of GAI in practical, real-life settings remains limited.

Objective:

This study investigates the impacts of the GPT-4 language model on productivity, machine translation quality, and user experience in the in-house language translation services team of a large wellbeing services county in Finland.

Methods:

The study employs a mixed methods approach. Quantitative data on 908 translation segments were collected under real-life conditions using the computer-assisted language translation software Trados to assess productivity differences between machine and human translation. Similarly, a separate data set of 1387 segment pairs was collected and analyzed to estimate machine translation quality. Additionally, user experience was investigated through qualitative data from translator interviews. The data were collected between March and June 2024 with the team of 4 translators.

Results:

The findings indicate that, on average, post-editing machine translations is 14% faster than translating texts from scratch (P=.028), and 37% faster when the number of segments is equalized across translators. Nonetheless, insights from translator interviews underscore that further productivity gains could be achieved by optimizing language model performance and redesigning operational processes and workflows. Additionally, the translators perceive that the quality of machine translation affects both productivity and user experience.
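
The gap between the pooled figure and the equalized figure can arise when translators contribute unequal numbers of segments to each condition. The following is a minimal sketch of that effect, not the authors' analysis: the records are invented for illustration, "faster" is assumed to mean a relative reduction in seconds per word, and "equalized" is assumed to mean that each translator is given equal weight. The function names `pooled_reduction` and `balanced_reduction` are hypothetical.

```python
from statistics import mean

# Hypothetical records (translator, mode, seconds per word).
# These values are invented for illustration; they are NOT study data.
SEGMENTS = (
    [("A", "human", 4.0)] * 2 + [("A", "mt", 3.0)] * 6
    + [("B", "human", 8.0)] * 6 + [("B", "mt", 4.0)] * 2
)


def pooled_reduction(rows):
    """Relative time reduction with every segment weighted equally."""
    human = mean(s for _, mode, s in rows if mode == "human")
    mt = mean(s for _, mode, s in rows if mode == "mt")
    return 1 - mt / human


def balanced_reduction(rows):
    """Relative time reduction with every translator weighted equally.

    Assumption: per-translator means are averaged, so a translator who
    contributed many segments does not dominate either condition.
    """
    translators = sorted({t for t, _, _ in rows})
    human = mean(
        mean(s for t, mode, s in rows if t == name and mode == "human")
        for name in translators
    )
    mt = mean(
        mean(s for t, mode, s in rows if t == name and mode == "mt")
        for name in translators
    )
    return 1 - mt / human


if __name__ == "__main__":
    print(f"pooled:   {pooled_reduction(SEGMENTS):.1%}")
    print(f"balanced: {balanced_reduction(SEGMENTS):.1%}")
```

In this toy data the two estimates differ only because the slower translator supplied most of the human-translated segments, which shows why a pooled average and a translator-balanced average need not agree.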

Conclusions:

The results highlight that the full productivity potential of GAI technology is constrained by the organization’s capabilities in managing and utilizing artificial intelligence effectively.


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.