Accepted for/Published in: JMIR Medical Informatics
Date Submitted: Nov 11, 2025
Date Accepted: Jun 10, 2026
IdeaDistiller - AI-support for Idea Synthesis in Concept Mapping: Algorithm Development and Validation
ABSTRACT
Background:
Concept mapping (CM) is a widely used mixed methods research approach for structuring and visualising complex ideas across various fields, including the health sciences. A critical bottleneck in the CM process is the idea synthesis phase, which remains labor-intensive and subjective, and is consequently challenging to scale to large datasets.
Objective:
In this study, we propose IdeaDistiller, a semi-automated solution based on topic modeling to optimise the idea synthesis step while maintaining methodological rigor through a human-in-the-loop approach.
Methods:
Using six healthcare-related datasets in English and Swedish, we systematically evaluated different embedding models, dimensionality reduction techniques, and clustering algorithms to identify robust and reproducible parameter settings for the proposed approach. IdeaDistiller clusters participant-generated ideas based on semantic similarity to identify similar ideas with different wording, suggests representative, unique ideas per cluster, and provides coherence scores and sorted outputs to aid manual validation.
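To make the described workflow concrete, the following is a minimal sketch of the cluster, represent, score, and sort steps. It is not the IdeaDistiller implementation: the bag-of-words `embed` function stands in for a sentence-embedding model, the greedy threshold clustering stands in for the dimensionality reduction and clustering algorithms evaluated in the study, and all names (`embed`, `cluster_ideas`, `summarise`, `threshold`) as well as the example ideas are hypothetical. Coherence is assumed here to be the mean pairwise cosine similarity within a cluster, and clusters are sorted least coherent first so that a reviewer sees the most doubtful groupings first; the published tool may define and order these differently.

```python
import math
import re
from collections import Counter
from itertools import combinations


def embed(text):
    """Stand-in embedding (assumption): word counts instead of a sentence-embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_ideas(ideas, threshold=0.5):
    """Greedy clustering (assumption): an idea joins the first cluster whose
    seed idea is at least `threshold` similar, otherwise it starts a new cluster."""
    vectors = [embed(idea) for idea in ideas]
    clusters = []  # each cluster is a list of indices into `ideas`
    for idx, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append([idx])
    return vectors, clusters


def summarise(ideas, vectors, clusters):
    """Pick a representative idea and a coherence score per cluster,
    then sort clusters so the least coherent come first for manual review."""
    report = []
    for members in clusters:
        pairs = list(combinations(members, 2))
        # Mean pairwise similarity; a single-idea cluster is scored 1.0 by convention.
        coherence = (
            sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)
            if pairs else 1.0
        )
        # Representative: the member most similar, in total, to its cluster.
        rep = max(members, key=lambda m: sum(cosine(vectors[m], vectors[o]) for o in members))
        report.append({
            "representative": ideas[rep],
            "members": [ideas[m] for m in members],
            "coherence": coherence,
        })
    return sorted(report, key=lambda r: r["coherence"])


# Hypothetical participant statements, for illustration only.
example_ideas = [
    "Shorter waiting times at the clinic",
    "Reduce waiting times at the clinic",
    "More time with the doctor",
    "Better parking near the hospital",
]
vectors, clusters = cluster_ideas(example_ideas)
for row in summarise(example_ideas, vectors, clusters):
    print(round(row["coherence"], 2), row["representative"], row["members"])
```

In this sketch the human-in-the-loop step would consist of reading the sorted report, accepting or rewording each suggested representative, and splitting or merging clusters whose coherence looks too low.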
Results:
Our findings suggest that IdeaDistiller may substantially reduce the manual effort involved in idea synthesis while preserving quality and transparency. However, human expertise remains indispensable for validating and refining cluster outputs.
Conclusions:
Integrating semi-automated methods into the CM workflow offers significant potential for improving the efficiency, scalability, and rigour of the CM process. Building on this work will enable the exploration of larger multilingual datasets and integration into future concept mapping studies. The code for IdeaDistiller is publicly available.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.