Mining the human phenome using semantic web technologies: a case study for Type 2 Diabetes.

Jyotishman Pathak, Richard C. Kiefer, Suzette J Bielinski, Christopher G. Chute

Research output: Contribution to journal (Article)

9 Citations (Scopus)

Abstract

The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. However, historically GWAS have been limited by inadequate sample size due to associated costs for genotyping and phenotyping of study subjects. This has prompted several academic medical centers to form "biobanks" where biospecimens linked to personal health information, typically in electronic health records (EHRs), are collected and stored on large number of subjects. This provides tremendous opportunities to discover novel genotype-phenotype associations and foster hypothesis generation. In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical and genotype data stored at the Mayo Clinic Biobank to mine the phenotype data for genetic associations. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR diagnoses and procedure data, and enable federated querying via standardized Web protocols to identify subjects genotyped with Type 2 Diabetes for discovering gene-disease associations. Our study highlights the potential of Web-scale data federation techniques to execute complex queries.

Original language: English (US)
Pages (from-to): 699-708
Number of pages: 10
Journal: AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
Volume: 2012
State: Published - 2012

Fingerprint

Electronic Health Records
Genome-Wide Association Study
Semantics
Type 2 Diabetes Mellitus
Personal Health Records
Technology
Genetic Association Studies
Sample Size
Genotype
Phenotype
Costs and Cost Analysis
Health
Genes

ASJC Scopus subject areas

  • Medicine (all)

Cite this

Mining the human phenome using semantic web technologies: a case study for Type 2 Diabetes. / Pathak, Jyotishman; Kiefer, Richard C.; Bielinski, Suzette J; Chute, Christopher G.

In: AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium, Vol. 2012, 2012, p. 699-708.

@article{15c3da9670a14ecab4162ef4c79128eb,
title = "Mining the human phenome using semantic web technologies: a case study for {Type 2 Diabetes}",
abstract = "The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. However, historically GWAS have been limited by inadequate sample size due to associated costs for genotyping and phenotyping of study subjects. This has prompted several academic medical centers to form {"}biobanks{"} where biospecimens linked to personal health information, typically in electronic health records (EHRs), are collected and stored on large number of subjects. This provides tremendous opportunities to discover novel genotype-phenotype associations and foster hypothesis generation. In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical and genotype data stored at the Mayo Clinic Biobank to mine the phenotype data for genetic associations. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR diagnoses and procedure data, and enable federated querying via standardized Web protocols to identify subjects genotyped with Type 2 Diabetes for discovering gene-disease associations. Our study highlights the potential of Web-scale data federation techniques to execute complex queries.",
author = "Pathak, Jyotishman and Kiefer, {Richard C.} and Bielinski, {Suzette J} and Chute, {Christopher G.}",
year = "2012",
language = "English (US)",
volume = "2012",
pages = "699--708",
journal = "AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium",
issn = "1559-4076",
publisher = "American Medical Informatics Association",
}

TY - JOUR
T1 - Mining the human phenome using semantic web technologies
T2 - a case study for Type 2 Diabetes.
AU - Pathak, Jyotishman
AU - Kiefer, Richard C.
AU - Bielinski, Suzette J
AU - Chute, Christopher G.
PY - 2012
Y1 - 2012

AB - The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. However, historically GWAS have been limited by inadequate sample size due to associated costs for genotyping and phenotyping of study subjects. This has prompted several academic medical centers to form "biobanks" where biospecimens linked to personal health information, typically in electronic health records (EHRs), are collected and stored on large number of subjects. This provides tremendous opportunities to discover novel genotype-phenotype associations and foster hypothesis generation. In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical and genotype data stored at the Mayo Clinic Biobank to mine the phenotype data for genetic associations. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR diagnoses and procedure data, and enable federated querying via standardized Web protocols to identify subjects genotyped with Type 2 Diabetes for discovering gene-disease associations. Our study highlights the potential of Web-scale data federation techniques to execute complex queries.

UR - http://www.scopus.com/inward/record.url?scp=84880798801&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84880798801&partnerID=8YFLogxK
M3 - Article
C2 - 23304343
AN - SCOPUS:84880798801
VL - 2012
SP - 699
EP - 708
JO - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
JF - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
SN - 1559-4076
ER -