Accepted for/Published in: JMIR Bioinformatics and Biotechnology
Date Submitted: Mar 4, 2022
Date Accepted: Aug 22, 2022
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
NGS Diagnosis for Patients with Inherited Diseases: AI-Assisted Variant Prioritization
ABSTRACT
Background:
In recent years, rapid advances in next-generation sequencing (NGS) technology have made it possible to sequence an entire human genome in a short period of time. As a result, NGS is being widely introduced into clinical diagnostic practice, especially for the diagnosis of hereditary disorders. Processing a patient's DNA sequence data requires multiple tools and complex bioinformatics pipelines, and produces exome data in the form of single-nucleotide variants (SNVs).
Objective:
To help physicians interpret the genetic variation information generated by NGS in a short period of time.
Methods:
We constructed a machine learning model to predict disease-causing variants in exome data. We collected sequencing data from whole-exome sequencing and gene panels as the training set, and integrated variant annotations from multiple genetic databases for model training. The model ranks SNVs and outputs the most likely disease-causing candidates. For model testing, we collected whole-exome sequencing data from 108 patients with rare genetic disorders at National Taiwan University Hospital. We fed the sequencing data, together with phenotypic information automatically extracted from each patient's electronic medical records by a keyword extraction tool, into the model.
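The ranking step described above can be illustrated with a minimal sketch. It assumes only that the model assigns each SNV a score where a higher value means "more likely disease-causing"; the `Variant` class, its fields, the identifiers, and the scores are hypothetical placeholders, not the authors' implementation or data.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    variant_id: str  # hypothetical identifier for one SNV
    score: float     # hypothetical model output; higher = more likely causative


def top_candidates(variants, k=10):
    """Return the k highest-scoring variants, best first."""
    return sorted(variants, key=lambda v: v.score, reverse=True)[:k]


# Made-up scores for three variants of one patient.
variants = [
    Variant("var_a", 0.12),
    Variant("var_b", 0.91),
    Variant("var_c", 0.47),
]

for v in top_candidates(variants, k=2):
    print(v.variant_id, v.score)
```

With `k=10`, a function of this shape would return the kind of top-10 candidate list that a clinician reviews.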
Results:
In 92.6% of cases, the causative variant was ranked within the top 10 of an average of 741 candidate variants per patient remaining after filtering.
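The top-10 figure is a top-k hit rate: the fraction of patients whose causative variant appears at rank k or better. The sketch below shows one way to compute it; the function name and the example ranks are illustrative assumptions. The final lines check that 100 hits out of 108 cases rounds to 92.6%; that count is inferred from the reported percentage and cohort size and is not stated in the text.

```python
def top_k_hit_rate(causal_ranks, k=10):
    """Fraction of cases whose causative variant is ranked within the top k.

    causal_ranks holds one entry per patient: the 1-based rank of the
    causative variant, or None if it was not in the candidate list.
    """
    hits = sum(1 for r in causal_ranks if r is not None and r <= k)
    return hits / len(causal_ranks)


# Made-up ranks for five hypothetical patients: three fall within the top 10.
example_ranks = [1, 3, 12, None, 10]
print(top_k_hit_rate(example_ranks, k=10))

# Consistency check on the reported figure (100 is an inferred count).
inferred_hits, cohort_size = 100, 108
print(round(100 * inferred_hits / cohort_size, 1))
```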
Conclusions:
The model's ranking performance is comparable to manual interpretation, and it has been put to use to assist the clinical diagnosis of genetic diseases.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.