Accepted for/Published in: JMIR Cancer

Date Submitted: Jun 10, 2025
Open Peer Review Period: Jun 9, 2025 - Aug 4, 2025
Date Accepted: Oct 31, 2025

The final, peer-reviewed published version of this preprint can be found here:

De La Hoz-M J, Montes-Escobar K, Salas-Macias CA, Fors M, Ballaz SJ

Using Latent Dirichlet Allocation Topic Modeling to Uncover Latent Research Topics and Trends in Renal Cell Carcinoma: Bibliometric Review

JMIR Cancer 2026;12:e78797

DOI: 10.2196/78797

PMID: 41544251

PMCID: 12810951

A Bibliometric Review: Using Latent Dirichlet Allocation (LDA) Topic Modelling to Uncover Latent Research Topics and Trends in Renal Cell Carcinoma

  • Javier De La Hoz-M
  • Karime Montes-Escobar
  • Carlos Alfredo Salas-Macias
  • Martha Fors
  • Santiago J. Ballaz

ABSTRACT

Background:

Renal cell carcinoma (RCC) is a common, often lethal kidney cancer that originates in the renal cortex. Its incidence is rising, and major risk factors include smoking, obesity, and hypertension, though its etiology remains uncertain. While surgery is effective for localized RCC, treatments for metastatic RCC have advanced significantly in recent years owing to better diagnostic, prognostic, and predictive tools. Despite this progress, challenges remain, including long-term drug resistance and the complexity of RCC as a diverse group of diseases rather than a single entity.

Objective:

The objective of this bibliometric review was to comprehensively analyze the topics and trends in the RCC research landscape, offering a foundation for future investigations.

Methods:

We used the R package Bibliometrix to conduct a bibliographic search in Scopus and PubMed covering publications from 1975 to 2023 and to statistically assess the distribution of RCC-related publications by year, journal, and country. Topic modelling of RCC research was conducted using Latent Dirichlet Allocation (LDA), a Bayesian network-based probabilistic algorithm that identifies unobserved thematic clusters in a collection of text documents. Trends in the retrieved topics were then characterized using regression slopes over time, across countries, and across journals. These trends were visualized as a heatmap, which was then used for hierarchical clustering to group similar topics based on the strength of their correlations.
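The trend step described above can be illustrated with a minimal Python sketch. It assumes each document already has a publication year and a vector of LDA topic proportions, averages those proportions per year, and fits an ordinary least-squares slope per topic. The function names and the toy data are hypothetical, and the abstract does not state the exact regression specification the authors used, so this is one plausible reading rather than their implementation.

```python
from collections import defaultdict


def yearly_topic_means(docs):
    """Average topic proportions per year.

    docs: iterable of (year, [proportion of topic 0, topic 1, ...]).
    Returns {year: [mean proportion per topic]}.
    """
    sums = {}
    counts = defaultdict(int)
    for year, props in docs:
        if year not in sums:
            sums[year] = [0.0] * len(props)
        for k, p in enumerate(props):
            sums[year][k] += p
        counts[year] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def topic_trends(docs):
    """One slope per topic: change in mean topic proportion per year."""
    means = yearly_topic_means(docs)
    years = sorted(means)
    n_topics = len(means[years[0]])
    return [ols_slope(years, [means[y][k] for y in years])
            for k in range(n_topics)]


# Hypothetical toy data: (publication year, topic proportions per document).
toy_docs = [
    (2020, [0.8, 0.2]),
    (2020, [0.6, 0.4]),
    (2021, [0.5, 0.5]),
    (2022, [0.3, 0.7]),
    (2022, [0.3, 0.7]),
]
slopes = topic_trends(toy_docs)
print(slopes)  # negative slope: declining topic; positive slope: rising topic
```

In this toy corpus the first topic loses share each year and the second gains it, so the slopes have opposite signs. Collecting such slopes by topic, country, or journal gives the kind of matrix that can be drawn as a heatmap and passed to hierarchical clustering.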

Results:

The LDA model with the best coherence score (semantic similarity of terms) contained thirty topics, which fell into eight crucial domains of RCC research: treatment and therapies; biomolecular and genetic characteristics; disease characteristics and progression; diagnosis and evaluation; metastasis and dissemination; epidemiology and risk factors; related conditions; and pathological features. The clustergrams derived from the heatmaps mirrored the major RCC research subjects identified by the LDA algorithm.
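To make the idea of a coherence score concrete, the sketch below computes the UMass coherence of a topic's top words from document co-occurrence counts. The abstract does not say which coherence measure was used, so the choice of UMass is an assumption made for illustration, and the function name and toy corpus are hypothetical.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic (assumed measure, for illustration).

    top_words: the topic's top words, most probable first.
    documents: list of tokenized documents (lists of words).
    Every word in top_words must occur in at least one document.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / doc_freq(top_words[l]))
    return score


# Hypothetical toy corpus of tokenized abstracts.
toy_corpus = [
    ["renal", "tumor", "nephrectomy"],
    ["renal", "tumor", "imaging"],
    ["immunotherapy", "survival", "renal"],
    ["imaging", "survival"],
]
coherent = umass_coherence(["renal", "tumor"], toy_corpus)
mixed = umass_coherence(["tumor", "survival"], toy_corpus)
print(coherent, mixed)  # words that co-occur often score higher
```

A model-selection loop along these lines would fit LDA for several candidate topic counts, average the coherence over the topics of each model, and keep the count with the highest average.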

Conclusions:

Over fifty years, the focus of RCC research has shifted from diagnosis and assessment to a more thorough understanding of disease characteristics and progression. Because many patients are diagnosed through abdominal imaging studies, an emerging topic in RCC is diagnostic imaging and radiological evolution. With advances in omics technologies, the role of microRNA signatures in the progression, diagnosis, therapeutic targeting, and prognosis of RCC has garnered considerable attention. The discovery of the genetic background has enhanced our understanding of RCC growth. Drug resistance, local ablation of RCC, and postoperative surveillance for RCC recurrence following nephrectomy are key avenues for future research. The next generation of targeted drugs and immunotherapies will make it possible to successfully treat metastatic RCC following nephrectomy. Neglected topics include the association between ferroptosis and RCC, the long-term assessment of novel treatments, and the application of artificial intelligence to RCC. Our bibliometric review provides pertinent data for clinical decision-making and the planning of future biomedical RCC research.



© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.