Applying semantic web technologies for phenome-wide scan using an electronic health record linked Biobank

Jyotishman Pathak, Richard C. Kiefer, Suzette J Bielinski, Christopher G. Chute

Research output: Contribution to journal (Article)

19 Citations (Scopus)

Abstract

Background: The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. However, historically GWAS have been limited by inadequate sample size due to associated costs for genotyping and phenotyping of study subjects. This has prompted several academic medical centers to form "biobanks" where biospecimens linked to personal health information, typically in electronic health records (EHRs), are collected and stored on a large number of subjects. This provides tremendous opportunities to discover novel genotype-phenotype associations and foster hypotheses generation.

Results: In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical and genotype data stored at the Mayo Clinic Biobank to mine the phenotype data for genetic associations. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR diagnoses and procedure data, and enable federated querying via standardized Web protocols to identify subjects genotyped for Type 2 Diabetes and Hypothyroidism to discover gene-disease associations. Our study highlights the potential of Web-scale data federation techniques to execute complex queries.

Conclusions: This study demonstrates how Semantic Web technologies can be applied in conjunction with clinical data stored in EHRs to accurately identify subjects with specific diseases and phenotypes, and identify genotype-phenotype associations.

Original language: English (US)
Article number: 10
Journal: Journal of Biomedical Semantics
Volume: 3
Issue number: 1
DOI: 10.1186/2041-1480-3-10
State: Published - Dec 17 2012

Fingerprint

Electronic Health Records
Semantic Web
Semantics
Genome-Wide Association Study
Health
Genetic Association Studies
Technology
Personal Health Records
Genes
Phenotype
Hypothyroidism
Sample Size
Type 2 Diabetes Mellitus
Genotype
Costs and Cost Analysis
Medical problems
Network protocols
Costs

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computer Networks and Communications
  • Health Informatics

Cite this

Applying semantic web technologies for phenome-wide scan using an electronic health record linked Biobank. / Pathak, Jyotishman; Kiefer, Richard C.; Bielinski, Suzette J; Chute, Christopher G.

In: Journal of Biomedical Semantics, Vol. 3, No. 1, 10, 17.12.2012.

@article{90aae4ef73594fc0a2175428d129ea56,
title = "Applying semantic web technologies for phenome-wide scan using an electronic health record linked Biobank",
abstract = "Background: The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. However, historically GWAS have been limited by inadequate sample size due to associated costs for genotyping and phenotyping of study subjects. This has prompted several academic medical centers to form {"}biobanks{"} where biospecimens linked to personal health information, typically in electronic health records (EHRs), are collected and stored on a large number of subjects. This provides tremendous opportunities to discover novel genotype-phenotype associations and foster hypotheses generation. Results: In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical and genotype data stored at the Mayo Clinic Biobank to mine the phenotype data for genetic associations. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR diagnoses and procedure data, and enable federated querying via standardized Web protocols to identify subjects genotyped for Type 2 Diabetes and Hypothyroidism to discover gene-disease associations. Our study highlights the potential of Web-scale data federation techniques to execute complex queries. Conclusions: This study demonstrates how Semantic Web technologies can be applied in conjunction with clinical data stored in EHRs to accurately identify subjects with specific diseases and phenotypes, and identify genotype-phenotype associations.",
author = "Pathak, Jyotishman and Kiefer, {Richard C.} and Bielinski, {Suzette J} and Chute, {Christopher G.}",
year = "2012",
month = "12",
day = "17",
doi = "10.1186/2041-1480-3-10",
language = "English (US)",
volume = "3",
journal = "Journal of Biomedical Semantics",
issn = "2041-1480",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - Applying semantic web technologies for phenome-wide scan using an electronic health record linked Biobank

AU - Pathak, Jyotishman

AU - Kiefer, Richard C.

AU - Bielinski, Suzette J

AU - Chute, Christopher G.

PY - 2012/12/17

Y1 - 2012/12/17

AB - Background: The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. However, historically GWAS have been limited by inadequate sample size due to associated costs for genotyping and phenotyping of study subjects. This has prompted several academic medical centers to form "biobanks" where biospecimens linked to personal health information, typically in electronic health records (EHRs), are collected and stored on a large number of subjects. This provides tremendous opportunities to discover novel genotype-phenotype associations and foster hypotheses generation. Results: In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical and genotype data stored at the Mayo Clinic Biobank to mine the phenotype data for genetic associations. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR diagnoses and procedure data, and enable federated querying via standardized Web protocols to identify subjects genotyped for Type 2 Diabetes and Hypothyroidism to discover gene-disease associations. Our study highlights the potential of Web-scale data federation techniques to execute complex queries. Conclusions: This study demonstrates how Semantic Web technologies can be applied in conjunction with clinical data stored in EHRs to accurately identify subjects with specific diseases and phenotypes, and identify genotype-phenotype associations.

UR - http://www.scopus.com/inward/record.url?scp=84889676637&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84889676637&partnerID=8YFLogxK

U2 - 10.1186/2041-1480-3-10

DO - 10.1186/2041-1480-3-10

M3 - Article

VL - 3

JO - Journal of Biomedical Semantics

JF - Journal of Biomedical Semantics

SN - 2041-1480

IS - 1

M1 - 10

ER -