Formalizing ICD coding rules using Formal Concept Analysis

Guoqian D Jiang, Jyotishman Pathak, Christopher G. Chute

Research output: Contribution to journal (Article)

21 Citations (Scopus)

Abstract

Background: With the 11th revision of the International Classification of Diseases (ICD) being officially launched by the World Health Organization (WHO), the significance of a formal representation for ICD coding rules has emerged as a pragmatic concern.

Objectives: To explore the role of Formal Concept Analysis (FCA) in examining ICD10 coding rules and to develop FCA-based auditing approaches for the formalization process.

Methods: We propose a model for formalizing ICD coding rules underlying the ICD Index using FCA. The coding rules are generated from FCA models and represented in the Semantic Web Rule Language (SWRL). Two auditing approaches were developed, focusing on the non-disjoint nodes and anonymous nodes that appear in the FCA model. The candidate domains (i.e. any three-character code with its sub-codes) of all 22 chapters of the ICD10 2006 version were analyzed using the two auditing approaches. Case studies and a preliminary evaluation were performed for validation.

Results: A total of 2044 formal contexts from the candidate domains of 22 ICD chapters were generated and audited. We identified 692 ICD codes having non-disjoint nodes across all chapters; chapters 19 and 21 contained the highest proportion of candidate domains with non-disjoint nodes (61.9% and 45.6%). We also identified 6996 anonymous nodes from 1382 candidate domains. Chapters 7, 11, 13, and 17 have the highest proportion of candidate domains having anonymous nodes (97.5%, 95.4%, 93.6% and 93.0%), while chapters 15 and 17 have the highest proportion of anonymous nodes among all chapters (45.5% and 44.0%). Case studies and a limited evaluation demonstrate that non-disjoint nodes and anonymous nodes arising from FCA are effective mechanisms for auditing ICD10.

Conclusion: FCA-based models demonstrate a practical solution for formalizing ICD coding rules. FCA techniques could not only audit ICD domain knowledge completeness for a specific domain, but also provide a high-level auditing profile for all ICD chapters.
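As a rough illustration of the FCA machinery the abstract refers to, the sketch below builds a tiny formal context, enumerates its formal concepts, and flags two kinds of nodes. Everything in it is an assumption made for illustration: the codes (`X00.0` and so on) and index terms are invented, and the abstract does not define "anonymous" or "non-disjoint" nodes, so the code adopts one plausible reading (a node with no object of its own, and a node that is the object concept of more than one code, respectively). It is not the authors' implementation.

```python
from itertools import combinations

# Hypothetical formal context: made-up codes (objects) mapped to made-up
# index terms (attributes). None of these values come from the paper or ICD-10.
CONTEXT = {
    "X00.0": frozenset({"infection", "acute"}),
    "X00.1": frozenset({"infection", "chronic"}),
    "X00.2": frozenset({"infection", "acute"}),
    "X00.9": frozenset({"unspecified"}),
}
ALL_ATTRIBUTES = frozenset().union(*CONTEXT.values())


def intent(objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = ALL_ATTRIBUTES
    for obj in objects:
        shared = shared & CONTEXT[obj]
    return shared


def extent(attributes):
    """Objects that carry every attribute in `attributes`."""
    return frozenset(o for o, attrs in CONTEXT.items() if attributes <= attrs)


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every object subset.

    Exponential in the number of objects, so only suitable for toy contexts.
    """
    concepts = set()
    objects = list(CONTEXT)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            shared = intent(subset)
            concepts.add((extent(shared), shared))
    return concepts


def own_objects(concept):
    """Objects whose attribute set equals the concept's intent exactly."""
    _, concept_intent = concept
    return frozenset(o for o, attrs in CONTEXT.items() if attrs == concept_intent)


concepts = formal_concepts()
# Assumed reading: a node with no own object is "anonymous".
anonymous = [c for c in concepts if not own_objects(c)]
# Assumed reading: a node shared by several own objects is "non-disjoint".
non_disjoint = [c for c in concepts if len(own_objects(c)) > 1]

for ext, itt in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[1]))):
    print(sorted(ext), sorted(itt))
```

Under these assumed definitions, the node with intent `{"infection"}` has no code of its own, which is the kind of gap an audit might surface, while `X00.0` and `X00.2` land on the same node because their attribute sets are identical. The top and bottom of the lattice are also flagged as anonymous here; a real audit would likely treat them separately.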

Original language: English (US)
Pages (from-to): 504-517
Number of pages: 14
Journal: Journal of Biomedical Informatics
Volume: 42
Issue number: 3
DOIs: 10.1016/j.jbi.2009.02.005
State: Published - Jun 2009

Fingerprint

Formal concept analysis
International Classification of Diseases
Semantic Web
Semantics
Language
Health

Keywords

  • Auditing approach
  • Clinical terminologies
  • Formal Concept Analysis (FCA)
  • International Classification of Disease (ICD)
  • Semantic Web Rule Language (SWRL)

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics

Cite this

Formalizing ICD coding rules using Formal Concept Analysis. / Jiang, Guoqian D; Pathak, Jyotishman; Chute, Christopher G.

In: Journal of Biomedical Informatics, Vol. 42, No. 3, 06.2009, p. 504-517.

Jiang, Guoqian D ; Pathak, Jyotishman ; Chute, Christopher G. / Formalizing ICD coding rules using Formal Concept Analysis. In: Journal of Biomedical Informatics. 2009 ; Vol. 42, No. 3. pp. 504-517.
@article{8017d1b6e773426aa8d486c8112deb7b,
title = "Formalizing ICD coding rules using Formal Concept Analysis",
abstract = "Background: With the 11th revision of the International Classification of Disease (ICD) being officially launched by the World Health Organization (WHO), the significance of a formal representation for ICD coding rules has emerged as a pragmatic concern. Objectives: To explore the role of Formal Concept Analysis (FCA) on examining ICD10 coding rules and to develop FCA-based auditing approaches for the formalization process. Methods: We propose a model for formalizing ICD coding rules underlying the ICD Index using FCA. The coding rules are generated from FCA models and represented in the Semantic Web Rule Language (SWRL). Two auditing approaches were developed focusing upon non-disjoint nodes and anonymous nodes manifest in the FCA model. The candidate domains (i.e. any three character code with their sub-codes) of all 22 chapters of the ICD10 2006 version were analyzed using the two auditing approaches. Case studies and a preliminary evaluation were performed for validation. Results: A total of 2044 formal contexts from the candidate domains of 22 ICD chapters were generated and audited. We identified 692 ICD codes having non-disjoint nodes in all chapters; chapters 19 and 21 contained the highest proportion of candidate domains with non-disjoint nodes (61.9{\%} and 45.6{\%}). We also identified 6996 anonymous nodes from 1382 candidate domains. Chapters 7, 11, 13, and 17, have the highest proportion of candidate domains having anonymous nodes (97.5{\%}, 95.4{\%}, 93.6{\%} and 93.0{\%}) while chapters 15 and 17 have the highest proportion of anonymous nodes among all chapters (45.5{\%} and 44.0{\%}). Case studies and a limited evaluation demonstrate that non-disjoint nodes and anonymous nodes arising from FCA are effective mechanisms for auditing ICD10. Conclusion: FCA-based models demonstrate a practical solution for formalizing ICD coding rules. 
FCA techniques could not only audit ICD domain knowledge completeness for a specific domain, but also provide a high level auditing profile for all ICD chapters.",
keywords = "Auditing approach, Clinical terminologies, Formal Concept Analysis (FCA), International Classification of Disease (ICD), Semantic Web Rule Language (SWRL)",
author = "Jiang, {Guoqian D} and Jyotishman Pathak and Chute, {Christopher G.}",
year = "2009",
month = "6",
doi = "10.1016/j.jbi.2009.02.005",
language = "English (US)",
volume = "42",
pages = "504--517",
journal = "Journal of Biomedical Informatics",
issn = "1532-0464",
publisher = "Academic Press Inc.",
number = "3",

}

TY - JOUR

T1 - Formalizing ICD coding rules using Formal Concept Analysis

AU - Jiang, Guoqian D

AU - Pathak, Jyotishman

AU - Chute, Christopher G.

PY - 2009/6

Y1 - 2009/6

AB - Background: With the 11th revision of the International Classification of Disease (ICD) being officially launched by the World Health Organization (WHO), the significance of a formal representation for ICD coding rules has emerged as a pragmatic concern. Objectives: To explore the role of Formal Concept Analysis (FCA) on examining ICD10 coding rules and to develop FCA-based auditing approaches for the formalization process. Methods: We propose a model for formalizing ICD coding rules underlying the ICD Index using FCA. The coding rules are generated from FCA models and represented in the Semantic Web Rule Language (SWRL). Two auditing approaches were developed focusing upon non-disjoint nodes and anonymous nodes manifest in the FCA model. The candidate domains (i.e. any three character code with their sub-codes) of all 22 chapters of the ICD10 2006 version were analyzed using the two auditing approaches. Case studies and a preliminary evaluation were performed for validation. Results: A total of 2044 formal contexts from the candidate domains of 22 ICD chapters were generated and audited. We identified 692 ICD codes having non-disjoint nodes in all chapters; chapters 19 and 21 contained the highest proportion of candidate domains with non-disjoint nodes (61.9% and 45.6%). We also identified 6996 anonymous nodes from 1382 candidate domains. Chapters 7, 11, 13, and 17, have the highest proportion of candidate domains having anonymous nodes (97.5%, 95.4%, 93.6% and 93.0%) while chapters 15 and 17 have the highest proportion of anonymous nodes among all chapters (45.5% and 44.0%). Case studies and a limited evaluation demonstrate that non-disjoint nodes and anonymous nodes arising from FCA are effective mechanisms for auditing ICD10. Conclusion: FCA-based models demonstrate a practical solution for formalizing ICD coding rules. FCA techniques could not only audit ICD domain knowledge completeness for a specific domain, but also provide a high level auditing profile for all ICD chapters.

KW - Auditing approach

KW - Clinical terminologies

KW - Formal Concept Analysis (FCA)

KW - International Classification of Disease (ICD)

KW - Semantic Web Rule Language (SWRL)

UR - http://www.scopus.com/inward/record.url?scp=65649129251&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=65649129251&partnerID=8YFLogxK

U2 - 10.1016/j.jbi.2009.02.005

DO - 10.1016/j.jbi.2009.02.005

M3 - Article

VL - 42

SP - 504

EP - 517

JO - Journal of Biomedical Informatics

JF - Journal of Biomedical Informatics

SN - 1532-0464

IS - 3

ER -