Accepted for/Published in: JMIR Medical Informatics

Date Submitted: May 19, 2019
Date Accepted: Aug 30, 2019

The final, peer-reviewed published version of this preprint can be found here:

Min JY, Song SH, Kim HJ, Min KB

Mining Hidden Knowledge About Illegal Compensation for Occupational Injury: Topic Model Approach

JMIR Med Inform 2019;7(3):e14763

DOI: 10.2196/14763

PMID: 31573948

PMCID: 6787526

Mining hidden knowledge about illegal compensation for occupational injury: A topic model approach

  • Jin-Young Min
  • Sung-Hee Song
  • Hye-Jin Kim
  • Kyoung-Bok Min

ABSTRACT

Background:

Although injured employees are legally covered by workers' compensation insurance in South Korea, some employers make agreements that prevent injured employees from claiming their compensation. This leads to underreporting in occupational injury statistics. Illegal compensation (called “gong-sang” in Korean) is a major means of underreporting or covering up occupational injuries. However, “gong-sang” is not counted in official occupational injury statistics, so “gong-sang”-related issues cannot be identified from them.

Objective:

This study analyzed social media data using topic modeling to explore hidden knowledge about illegal compensation (“gong-sang”) for occupational injury in South Korea.

Methods:

We collected a total of 2,210 documents from social media by filtering on the keyword “gong-sang”. The study period was January 1, 2006 to December 31, 2017. After natural language processing of the Korean text using KoNLPy, a Python package for Korean morphological analysis, we performed topic modeling with latent Dirichlet allocation (LDA) using the Python library Gensim. A 10-topic model was selected and run with 3000 Gibbs sampling iterations to fit the model.
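As a rough illustration of the modeling step described above, the sketch below filters a toy corpus on the keyword and fits LDA with a small collapsed Gibbs sampler written in plain Python. It is not the authors' code and does not use KoNLPy or Gensim: the English placeholder tokens, the function names `fit_lda_gibbs` and `top_words`, the hyperparameters `alpha` and `beta`, the 2-topic setting, and the iteration count are all assumptions chosen to keep the example small and self-contained.

```python
import random

def fit_lda_gibbs(docs, num_topics, iterations=300, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling (toy sketch)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]                # tokens in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # times word w is assigned to topic k
    topic_total = [0] * num_topics                              # tokens assigned to topic k overall

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n most frequently assigned words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Hypothetical pre-tokenized posts; a real pipeline would tokenize Korean text first.
raw_docs = [
    ["gong-sang", "insurance", "claim", "accident", "insurance", "claim"],
    ["gong-sang", "claim", "insurance", "benefit", "accident"],
    ["gong-sang", "insurance", "benefit", "claim"],
    ["gong-sang", "money", "settlement", "payment", "money"],
    ["gong-sang", "settlement", "payment", "money", "wage"],
    ["gong-sang", "payment", "wage", "settlement"],
    ["weather", "traffic", "holiday"],
]

# Keep only documents that mention the keyword, mirroring the collection step.
docs = [doc for doc in raw_docs if "gong-sang" in doc]

vocab, doc_topic, topic_word = fit_lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"Topic {k}: {', '.join(words)}")
```

With a real corpus, the number of topics and iterations would be set to match the study design (10 topics and 3000 iterations in the abstract), and the per-topic keyword lists would then be read and labeled by hand.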

Results:

The LDA model classified “gong-sang”-related documents into four categories across a total of 10 topics. Topic 1 was the greatest concern (60.5%). Workers who had suffered industrial accidents appeared to be worried about the choice between illegal compensation and legal insurance claims, and keywords on this choice were included. In Topic 2, keywords were associated with claims for industrial accident insurance benefits. Topics 3 and 4, the second-highest concern (19.2%), contained keywords implying the monetary compensation of “gong-sang”. The remaining topics (Topics 5-10) included keywords on jobs vulnerable to “gong-sang” (i.e., workers in the construction and defense industries, delivery riders, and foreign workers) and on injured body parts (i.e., hands, face, teeth, lower limbs, and back).

Conclusions:

We explored hidden knowledge to identify the salient issues surrounding “gong-sang” using an LDA model. These topics may provide valuable information to help South Korea’s occupational health and safety administration operate more efficiently and to protect vulnerable workers from illegal “gong-sang” compensation practices.


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.