Accepted for/Published in: Journal of Medical Internet Research
Date Submitted: Apr 2, 2020
Date Accepted: May 13, 2021
Using text mining techniques to identify healthcare providers with patient safety problems: an exploratory study
ABSTRACT
Background:
Regulatory bodies such as healthcare inspectorates can identify risks at healthcare providers by analyzing patient complaints. Text mining techniques (automatic text analysis based on machine learning) might help by identifying specific patterns and signals of risks to quality and safety.
Objective:
The aim of this study was to explore whether text mining techniques might be used to identify healthcare providers at risk.
Methods:
We performed an exploratory study on a complaints database of the Dutch Health and Youth Care Inspectorate containing more than 22,000 written complaints. We studied a range of supervised machine learning techniques to automatically determine the severity of incoming complaints. We investigated several features based on the complaints’ content, including sentiment analysis, to decide which were helpful for severity prediction. Finally, we took the list of healthcare providers and their organization-specific complaints to determine the average severity of complaints per organization. We performed a keyword analysis to give the Inspectorate insight into the patterns and severity per organization.
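As a rough illustration of the Methods described above, the following minimal sketch trains a bag-of-words classifier on labeled complaints and then averages the predicted severity per organization. It is not the authors' implementation: the multinomial Naive Bayes model, the three-level severity scale, the names `NaiveBayesSeverity` and `average_severity_per_org`, and all example complaints are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on non-letters. A real pipeline would need
    # language-aware preprocessing for Dutch complaint texts.
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()


class NaiveBayesSeverity:
    """Multinomial Naive Bayes over bag-of-words counts (illustrative only)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            # Laplace (add-one) smoothing over the training vocabulary.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def average_severity_per_org(complaints, model):
    """complaints: iterable of (organization, text) pairs.

    Returns a dict mapping each organization to its mean predicted severity.
    """
    scores = defaultdict(list)
    for org, text in complaints:
        scores[org].append(model.predict(text))
    return {org: sum(v) / len(v) for org, v in scores.items()}


# Invented toy data; severity 1 = low, 3 = high (assumed scale).
train_texts = [
    "long waiting time at the reception",
    "rude staff and long waiting time",
    "wrong medication given patient hospitalized",
    "patient died after wrong medication dose",
]
train_labels = [1, 1, 3, 3]
model = NaiveBayesSeverity().fit(train_texts, train_labels)

new_complaints = [
    ("Provider A", "waiting time was long"),
    ("Provider A", "staff was rude"),
    ("Provider B", "wrong medication dose given"),
]
averages = average_severity_per_org(new_complaints, model)
```

Sorting `averages` by value would give one possible ranking of organizations by mean predicted complaint severity.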
Results:
Data preparation and preprocessing were time-consuming one-off costs, mainly because we had to create a safe and efficient digital research environment. A straightforward text classification approach using a bag-of-words feature representation worked best for severity prediction. Sentiment analysis was not helpful for severity prediction. Finally, for the healthcare providers with the most complaints, we produced lists of n-grams to inform the Inspectorate about the word combinations specific to these organizations.
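The n-gram lists mentioned above could be produced along the following lines. This is a hedged sketch rather than the study's actual procedure: the function name `top_ngrams_for_busiest_providers`, its parameters, the whitespace tokenization, and the example complaints are all hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n):
    # All contiguous word sequences of length n, joined into one string each.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams_for_busiest_providers(complaints, n=2, providers=1, k=3):
    """complaints: iterable of (provider, text) pairs.

    Selects the `providers` providers with the most complaints and returns,
    for each, the `k` most frequent n-grams with their counts.
    """
    by_provider = defaultdict(list)
    for provider, text in complaints:
        by_provider[provider].append(text.lower().split())

    busiest = sorted(by_provider, key=lambda p: len(by_provider[p]), reverse=True)
    result = {}
    for provider in busiest[:providers]:
        counts = Counter(
            gram for tokens in by_provider[provider] for gram in ngrams(tokens, n)
        )
        result[provider] = counts.most_common(k)
    return result


# Invented example complaints.
complaints = [
    ("Provider A", "wrong medication dose"),
    ("Provider A", "wrong medication given twice"),
    ("Provider A", "no reply about wrong medication"),
    ("Provider B", "long waiting time"),
]
report = top_ngrams_for_busiest_providers(complaints, n=2, providers=1, k=1)
```

In this toy input the bigram "wrong medication" recurs in every complaint about Provider A, which is the kind of organization-specific word combination such a list is meant to surface.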
Conclusions:
Text mining techniques can support inspectorates with fully automatic analysis of complaints. They can give insight into patterns, detect possible blind spots, or support prioritizing follow-up supervision activities by sorting complaints by severity per organization or per sector. An appropriate data science and ICT infrastructure is indispensable for applied text mining.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.