Accepted for/Published in: Journal of Medical Internet Research

Date Submitted: Dec 31, 2018
Date Accepted: Feb 25, 2019

The final, peer-reviewed published version of this preprint can be found here:

Genomic Common Data Model for Seamless Interoperation of Biomedical Data in Clinical Practice: Retrospective Study

Shin SJ, You SC, Roh J, Kim JH, Haam S, Reich CG, Blacketer C, Son DS, Oh S, Park YR, Park RW

J Med Internet Res 2019;21(3):e13249

DOI: 10.2196/13249

PMID: 30912749

PMCID: 6454347

Genomic Common Data Model for Seamless Interoperation of Biomedical Data in Clinical Practice: Retrospective Study

  • Seo Jeong Shin
  • Seng Chan You
  • Jin Roh
  • Jang-Hee Kim
  • Seokjin Haam
  • Christian G. Reich
  • Clair Blacketer
  • Dae-Soon Son
  • Seungbin Oh
  • Yu Rang Park
  • Rae Woong Park

ABSTRACT

Background:

Clinical sequencing data should be shared to achieve the scale and diversity required to provide strong evidence for improving patient care. A distributed research network (DRN) allows researchers to share this evidence across centers rather than the patient-level data, thereby avoiding privacy issues. The Observational Medical Outcomes Partnership (OMOP) common data model (CDM), currently used in DRNs, has low coverage of sequencing data and does not reflect the latest trends in precision medicine.

Objective:

The aim of this study was to develop and evaluate the feasibility of a genomic CDM (G-CDM), as an extension of the OMOP-CDM, for application of genomic data in clinical practice.

Methods:

Existing genomic data models and sequencing reports were reviewed to extend the OMOP-CDM to cover genomic data. Human Genome Organisation (HUGO) Gene Nomenclature Committee (HGNC) and Human Genome Variation Society (HGVS) nomenclature was adopted to standardize the terminology in the model. Sequencing data of 114 patients with lung cancer from the Ajou University School of Medicine (AUSOM) database of Ajou University Hospital and of 1060 patients with lung cancer from The Cancer Genome Atlas (TCGA) were obtained and transformed to a format appropriate for the G-CDM. The two datasets were compared with respect to gene name, variant type, and actionable mutations.

Results:

The G-CDM was extended into four tables linked to tables of the OMOP-CDM. Upon comparison with TCGA data, a clinically actionable mutation, ‘p.Leu858Arg’, in the EGFR gene was 6.64 times more frequent in the AUSOM database, while the ‘p.Gly12Xaa’ mutation in the KRAS gene was 2.02 times more frequent in the TCGA dataset. A data-exploring tool, GeneProfiler, was further developed to conduct descriptive analyses automatically using the G-CDM; it reports the proportions of genes, variant types, and actionable mutations. GeneProfiler also allows users to query a specific gene name and HGVS nomenclature to calculate the proportion of patients with a given mutation.
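The kind of comparison reported above (the proportion of patients carrying a given mutation, and how many times more frequent it is in one dataset than in another) can be illustrated with a minimal Python sketch. This is not GeneProfiler's implementation or the G-CDM schema: the function names, the tuple layout (patient ID, HGNC gene symbol, HGVS protein change), and the toy records are all hypothetical, and the numbers do not come from the study.

```python
# Illustrative sketch only: hypothetical helpers and toy data,
# not GeneProfiler code and not the study's results.

def mutation_proportion(variants, n_patients, gene, hgvs_p):
    """Proportion of patients carrying a given mutation.

    variants   -- iterable of (patient_id, gene_symbol, hgvs_p) tuples
    n_patients -- total number of sequenced patients in the dataset,
                  passed explicitly so patients without variants count
    """
    if n_patients <= 0:
        raise ValueError("n_patients must be positive")
    # Count each patient once, even if the variant is recorded repeatedly.
    carriers = {pid for pid, g, p in variants if g == gene and p == hgvs_p}
    return len(carriers) / n_patients


def fold_difference(proportion_a, proportion_b):
    """How many times more frequent the mutation is in dataset A than in B."""
    if proportion_b == 0:
        raise ValueError("mutation absent from dataset B; ratio undefined")
    return proportion_a / proportion_b


# Toy records for two hypothetical sites.
site_a = [
    (1, "EGFR", "p.Leu858Arg"),
    (1, "EGFR", "p.Leu858Arg"),  # duplicate record for the same patient
    (2, "EGFR", "p.Leu858Arg"),
    (3, "KRAS", "p.Gly12Cys"),
]
site_b = [
    (10, "EGFR", "p.Leu858Arg"),
    (11, "KRAS", "p.Gly12Cys"),
    (12, "KRAS", "p.Gly12Val"),
]

prop_a = mutation_proportion(site_a, 4, "EGFR", "p.Leu858Arg")
prop_b = mutation_proportion(site_b, 8, "EGFR", "p.Leu858Arg")
ratio = fold_difference(prop_a, prop_b)
print(prop_a, prop_b, ratio)
```

In a distributed research network, each site could in principle run such a calculation locally and share only the resulting proportions, so that patient-level records stay within the institution.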

Conclusions:

We developed the G-CDM for effective integration of genomic data with standardized clinical data allowing for data sharing across institutes. The feasibility of the G-CDM was validated by assessing the differences in data characteristics between two different genomic databases through the proposed data-exploring tool, GeneProfiler. The G-CDM may facilitate analyses of interoperating clinical and genomic datasets across multiple institutions, minimizing privacy issues and thereby enabling researchers to better understand the characteristics of patients and promote personalized medicine in clinical practice.

