Motivation: Next-generation sequencing is rapidly improving diagnostic rates in rare Mendelian diseases, but even with whole genome or whole exome sequencing, the majority of cases remain unsolved. Increasingly, RNA sequencing is being used to solve many cases that evade diagnosis through sequencing alone. Specifically, the detection of aberrant splicing in many rare disease patients suggests that identifying RNA splicing outliers is particularly useful for determining causal Mendelian disease genes. However, there is as yet a paucity of statistical methodologies to detect splicing outliers. Results: We developed LeafCutterMD, a new statistical framework that significantly improves the previously published LeafCutter in the context of detecting outlier splicing events. Through simulations and analysis of real patient data, we demonstrate that LeafCutterMD has better power than the state-of-the-art methodology while controlling false-positive rates. When applied to a cohort of disease-affected probands from the Mayo Clinic Center for Individualized Medicine, LeafCutterMD recovered all aberrantly spliced genes that had previously been identified by manual curation efforts. Availability and implementation: The source code for this method is available under the opensource Apache 2.0 license in the latest release of the LeafCutter software package available online at http://davidaknowles.github.io/ leafcutter.
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics