Domain-specific common data elements (CDEs) are emerging as an effective approach to standards-based clinical research data storage and retrieval. A limiting factor, however, is the lack of robust automated quality assurance (QA) tools for the CDEs in clinical study domains. The objectives of the present study were to prototype and evaluate a QA tool for cancer study CDEs using a post-coordination approach. We first integrated the NCI caDSR CDEs and The Cancer Genome Atlas (TCGA) data dictionaries into a single Resource Description Framework (RDF) data store. We then designed a compositional expression pattern based on the Data Element Concept model structure informed by ISO/IEC 11179, and developed a transformation tool that converts the pattern-based compositional expressions into Web Ontology Language (OWL) syntax. Invoking reasoning and explanation services, we tested the system using the CDEs extracted from two TCGA clinical cancer study domains. The system automatically identified duplicate CDEs and detected CDE modeling errors. In conclusion, compositional expressions not only enable reuse of existing ontology codes to define new domain concepts, but also provide an automated mechanism for QA of terminological annotations for CDEs.
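To make the duplicate-detection idea concrete, the following is a minimal sketch and not the authors' tool. It assumes each CDE's Data Element Concept is post-coordinated from an object class and a property (the two parts of a Data Element Concept in ISO/IEC 11179), each annotated with a set of concept codes. The study itself relies on OWL reasoning; this sketch substitutes plain set equality, which catches only exact duplicates and no subsumption or modeling errors. The names `Expression`, `make_expression`, and `find_duplicates`, and all identifiers and codes, are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Expression(NamedTuple):
    """Hypothetical post-coordinated definition of a Data Element Concept."""
    object_class: frozenset  # concept codes annotating the object class
    prop: frozenset          # concept codes annotating the property


def make_expression(object_codes, property_codes):
    # Frozensets ignore code order and repeats, so two expressions built
    # from the same codes compare equal however they were written.
    return Expression(frozenset(object_codes), frozenset(property_codes))


def find_duplicates(cdes):
    """Return sorted groups of CDE ids that share an identical expression."""
    groups = defaultdict(list)
    for cde_id, expr in cdes.items():
        groups[expr].append(cde_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Made-up CDE ids and concept codes, for illustration only.
cdes = {
    "CDE-001": make_expression(["C_PATIENT"], ["C_AGE", "C_DIAGNOSIS"]),
    "CDE-002": make_expression(["C_PATIENT"], ["C_DIAGNOSIS", "C_AGE"]),
    "CDE-003": make_expression(["C_PATIENT"], ["C_GENDER"]),
}
duplicates = find_duplicates(cdes)
```

Here `CDE-001` and `CDE-002` fall into the same group because their code sets match regardless of order, while `CDE-003` stands alone. A description logic reasoner over the OWL form would go further, for example by flagging expressions that are logically equivalent without being syntactically identical.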
|Original language||English (US)|
|Number of pages||10|
|Journal||AMIA ... Annual Symposium proceedings. AMIA Symposium|
|State||Published - Jan 1 2015|