Building an i2b2-based integrated data repository for cancer research: A case study of ovarian cancer registry

Na Hong, Zheng Li, Richard C. Kiefer, Melissa S. Robertson, Ellen L. Goode, Chen Wang, Guoqian Jiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this study, we describe our preliminary efforts in building an i2b2-based integrated data repository that supports centralized data management for ovarian cancer clinical research, and discuss important lessons learnt that could inform the evaluation and enhancement of future generic cancer-specific data repositories. We collected multiple types of heterogeneous clinical data, including demographic, outcome, chemo-treatment and lab-test information for ovarian cancer. To better integrate the different data types, we normalized the data by reusing standard codes and creating mappings between local codes and standard vocabularies. We also developed extract, transform and load (ETL) scripts to load the data into an i2b2 instance. Through further analytic practices, we evaluated how well the system meets major expectations arising from common clinical research needs, including cohort query and identification, clinical data-based hypothesis testing, and exploratory data mining. We also identified and discussed outstanding issues that we will address through additional enhancement of the existing i2b2 system.

Original language: English (US)
Title of host publication: Data Management and Analytics for Medicine and Healthcare - 2nd International Workshop, DMAH 2016 Held at VLDB 2016, Revised Selected Papers
Editors: Lixia Yao, Fusheng Wang, Gang Luo
Publisher: Springer Verlag
Pages: 121-135
Number of pages: 15
ISBN (Print): 9783319577401
State: Published - 2017
Event: 2nd International Workshop on Data Management and Analytics for Medicine and Healthcare, DMAH 2016 held in conjunction with 42nd International Conference on Very Large Data Bases, VLDB 2016 - New Delhi, India
Duration: Sep 5 2016 to Sep 9 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10186 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 2nd International Workshop on Data Management and Analytics for Medicine and Healthcare, DMAH 2016 held in conjunction with 42nd International Conference on Very Large Data Bases, VLDB 2016
Country/Territory: India
City: New Delhi
Period: 9/5/16 to 9/9/16

Keywords

  • Cancer registry
  • Extract, transform and load (ETL)
  • Informatics for integrating biology and the bedside (i2b2)
  • Integrated data repository
  • Ovarian cancer research

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
