Deep Phenotyping on Electronic Health Records Facilitates Genetic Diagnosis by Clinical Exomes

Jung Hoon Son, Gangcai Xie, Chi Yuan, Lyudmila Ena, Ziran Li, Andrew Goldstein, Lulin Huang, Liwei Wang, Feichen Shen, Hongfang D Liu, Karla Mehl, Emily E. Groopman, Maddalena Marasa, Krzysztof Kiryluk, Ali G. Gharavi, Wendy K. Chung, George Hripcsak, Carol Friedman, Chunhua Weng, Kai Wang

Research output: Contribution to journal, Article

18 Citations (Scopus)

Abstract

Integration of detailed phenotype information with genetic data is well established to facilitate accurate diagnosis of hereditary disorders. As a rich source of phenotype information, electronic health records (EHRs) promise to empower diagnostic variant interpretation. However, how to accurately and efficiently extract phenotypes from heterogeneous EHR narratives remains a challenge. Here, we present EHR-Phenolyzer, a high-throughput EHR framework for extracting and analyzing phenotypes. EHR-Phenolyzer extracts and normalizes Human Phenotype Ontology (HPO) concepts from EHR narratives and then prioritizes genes with causal variants on the basis of the HPO-coded phenotype manifestations. We assessed EHR-Phenolyzer on 28 pediatric individuals with confirmed diagnoses of monogenic diseases and found that the genes with causal variants were ranked among the top 100 genes selected by EHR-Phenolyzer for 16/28 individuals (p < 2.2 × 10⁻¹⁶), supporting the value of phenotype-driven gene prioritization in diagnostic sequence interpretation. To assess the generalizability, we replicated this finding on an independent EHR dataset of ten individuals with a positive diagnosis from a different institution. We then assessed the broader utility by examining two additional EHR datasets, including 31 individuals who were suspected of having a Mendelian disease and underwent different types of genetic testing and 20 individuals with positive diagnoses of specific Mendelian etiologies of chronic kidney disease from exome sequencing. Finally, through several retrospective case studies, we demonstrated how combined analyses of genotype data and deep phenotype data from EHRs can expedite genetic diagnoses. In summary, EHR-Phenolyzer leverages EHR narratives to automate phenotype-driven analysis of clinical exomes or genomes, facilitating the broader implementation of genomic medicine.
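To give a feel for why 16 of 28 causal genes landing in the top 100 is so unlikely by chance, here is a minimal sketch of an exact binomial tail calculation. The abstract does not say which statistical test the authors used, so this is only an illustration, not their method. The candidate pool of 20,000 genes and the uniform-random-ranking null model are assumptions introduced here, as are all names in the code.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p: float) -> float:
    """Exact P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumption (not from the paper): about 20,000 candidate genes per individual,
# and a null model in which the causal gene's rank is uniformly random.
ASSUMED_GENE_COUNT = 20_000
TOP_K = 100  # rank cutoff reported in the abstract
null_hit_probability = TOP_K / ASSUMED_GENE_COUNT

# Figures from the abstract: 16 of 28 individuals had the causal gene in the top 100.
p_value = binomial_upper_tail(16, 28, null_hit_probability)
print(f"P(at least 16 of 28 in top {TOP_K} under random ranking) = {p_value:.3e}")
```

Under these assumptions the tail probability falls far below the reported bound of 2.2 × 10⁻¹⁶. A smaller assumed gene pool would raise the null probability per individual, so the pool size is worth varying when checking the claim.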

Original language: English (US)
Pages (from-to): 58-73
Number of pages: 16
Journal: American Journal of Human Genetics
Volume: 103
Issue number: 1
DOI: https://doi.org/10.1016/j.ajhg.2018.05.010
State: Published - Jul 5 2018

Fingerprint

Exome
Electronic Health Records
Phenotype
Genes
Genetic Testing
Chronic Renal Insufficiency

Keywords

  • biomedical informatics
  • diagnosis
  • electronic health records
  • exome
  • genome
  • knowledge engineering
  • natural language processing
  • next-generation sequencing
  • phenotyping
  • precision medicine

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)

Cite this

Deep Phenotyping on Electronic Health Records Facilitates Genetic Diagnosis by Clinical Exomes. / Son, Jung Hoon; Xie, Gangcai; Yuan, Chi; Ena, Lyudmila; Li, Ziran; Goldstein, Andrew; Huang, Lulin; Wang, Liwei; Shen, Feichen; Liu, Hongfang D; Mehl, Karla; Groopman, Emily E.; Marasa, Maddalena; Kiryluk, Krzysztof; Gharavi, Ali G.; Chung, Wendy K.; Hripcsak, George; Friedman, Carol; Weng, Chunhua; Wang, Kai.

In: American Journal of Human Genetics, Vol. 103, No. 1, 05.07.2018, p. 58-73.

Son, JH, Xie, G, Yuan, C, Ena, L, Li, Z, Goldstein, A, Huang, L, Wang, L, Shen, F, Liu, HD, Mehl, K, Groopman, EE, Marasa, M, Kiryluk, K, Gharavi, AG, Chung, WK, Hripcsak, G, Friedman, C, Weng, C & Wang, K 2018, 'Deep Phenotyping on Electronic Health Records Facilitates Genetic Diagnosis by Clinical Exomes', American Journal of Human Genetics, vol. 103, no. 1, pp. 58-73. https://doi.org/10.1016/j.ajhg.2018.05.010
@article{6fbef79689814e1d8e60150caa8ecb45,
title = "Deep Phenotyping on Electronic Health Records Facilitates Genetic Diagnosis by Clinical Exomes",
abstract = "Integration of detailed phenotype information with genetic data is well established to facilitate accurate diagnosis of hereditary disorders. As a rich source of phenotype information, electronic health records (EHRs) promise to empower diagnostic variant interpretation. However, how to accurately and efficiently extract phenotypes from heterogeneous EHR narratives remains a challenge. Here, we present EHR-Phenolyzer, a high-throughput EHR framework for extracting and analyzing phenotypes. EHR-Phenolyzer extracts and normalizes Human Phenotype Ontology (HPO) concepts from EHR narratives and then prioritizes genes with causal variants on the basis of the HPO-coded phenotype manifestations. We assessed EHR-Phenolyzer on 28 pediatric individuals with confirmed diagnoses of monogenic diseases and found that the genes with causal variants were ranked among the top 100 genes selected by EHR-Phenolyzer for 16/28 individuals (p < 2.2 × 10⁻¹⁶), supporting the value of phenotype-driven gene prioritization in diagnostic sequence interpretation. To assess the generalizability, we replicated this finding on an independent EHR dataset of ten individuals with a positive diagnosis from a different institution. We then assessed the broader utility by examining two additional EHR datasets, including 31 individuals who were suspected of having a Mendelian disease and underwent different types of genetic testing and 20 individuals with positive diagnoses of specific Mendelian etiologies of chronic kidney disease from exome sequencing. Finally, through several retrospective case studies, we demonstrated how combined analyses of genotype data and deep phenotype data from EHRs can expedite genetic diagnoses. In summary, EHR-Phenolyzer leverages EHR narratives to automate phenotype-driven analysis of clinical exomes or genomes, facilitating the broader implementation of genomic medicine.",
keywords = "biomedical informatics, diagnosis, electronic health records, exome, genome, knowledge engineering, natural language processing, next-generation sequencing, phenotyping, precision medicine",
author = "Son, {Jung Hoon} and Gangcai Xie and Chi Yuan and Lyudmila Ena and Ziran Li and Andrew Goldstein and Lulin Huang and Liwei Wang and Feichen Shen and Liu, {Hongfang D} and Karla Mehl and Groopman, {Emily E.} and Maddalena Marasa and Krzysztof Kiryluk and Gharavi, {Ali G.} and Chung, {Wendy K.} and George Hripcsak and Carol Friedman and Chunhua Weng and Kai Wang",
year = "2018",
month = "7",
day = "5",
doi = "10.1016/j.ajhg.2018.05.010",
language = "English (US)",
volume = "103",
pages = "58--73",
journal = "American Journal of Human Genetics",
issn = "0002-9297",
publisher = "Cell Press",
number = "1",

}

TY - JOUR

T1 - Deep Phenotyping on Electronic Health Records Facilitates Genetic Diagnosis by Clinical Exomes

AU - Son, Jung Hoon

AU - Xie, Gangcai

AU - Yuan, Chi

AU - Ena, Lyudmila

AU - Li, Ziran

AU - Goldstein, Andrew

AU - Huang, Lulin

AU - Wang, Liwei

AU - Shen, Feichen

AU - Liu, Hongfang D

AU - Mehl, Karla

AU - Groopman, Emily E.

AU - Marasa, Maddalena

AU - Kiryluk, Krzysztof

AU - Gharavi, Ali G.

AU - Chung, Wendy K.

AU - Hripcsak, George

AU - Friedman, Carol

AU - Weng, Chunhua

AU - Wang, Kai

PY - 2018/7/5

Y1 - 2018/7/5

AB - Integration of detailed phenotype information with genetic data is well established to facilitate accurate diagnosis of hereditary disorders. As a rich source of phenotype information, electronic health records (EHRs) promise to empower diagnostic variant interpretation. However, how to accurately and efficiently extract phenotypes from heterogeneous EHR narratives remains a challenge. Here, we present EHR-Phenolyzer, a high-throughput EHR framework for extracting and analyzing phenotypes. EHR-Phenolyzer extracts and normalizes Human Phenotype Ontology (HPO) concepts from EHR narratives and then prioritizes genes with causal variants on the basis of the HPO-coded phenotype manifestations. We assessed EHR-Phenolyzer on 28 pediatric individuals with confirmed diagnoses of monogenic diseases and found that the genes with causal variants were ranked among the top 100 genes selected by EHR-Phenolyzer for 16/28 individuals (p < 2.2 × 10⁻¹⁶), supporting the value of phenotype-driven gene prioritization in diagnostic sequence interpretation. To assess the generalizability, we replicated this finding on an independent EHR dataset of ten individuals with a positive diagnosis from a different institution. We then assessed the broader utility by examining two additional EHR datasets, including 31 individuals who were suspected of having a Mendelian disease and underwent different types of genetic testing and 20 individuals with positive diagnoses of specific Mendelian etiologies of chronic kidney disease from exome sequencing. Finally, through several retrospective case studies, we demonstrated how combined analyses of genotype data and deep phenotype data from EHRs can expedite genetic diagnoses. In summary, EHR-Phenolyzer leverages EHR narratives to automate phenotype-driven analysis of clinical exomes or genomes, facilitating the broader implementation of genomic medicine.

KW - biomedical informatics

KW - diagnosis

KW - electronic health records

KW - exome

KW - genome

KW - knowledge engineering

KW - natural language processing

KW - next-generation sequencing

KW - phenotyping

KW - precision medicine

UR - http://www.scopus.com/inward/record.url?scp=85048729565&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048729565&partnerID=8YFLogxK

U2 - 10.1016/j.ajhg.2018.05.010

DO - 10.1016/j.ajhg.2018.05.010

M3 - Article

C2 - 29961570

AN - SCOPUS:85048729565

VL - 103

SP - 58

EP - 73

JO - American Journal of Human Genetics

JF - American Journal of Human Genetics

SN - 0002-9297

IS - 1

ER -