Electronic Health Record Phenotypes for Precision Medicine: Perspectives and Caveats From Treatment of Breast Cancer at a Single Institution

Matthew K. Breitenstein, Hongfang D Liu, Kara N. Maxwell, Jyotishman Pathak, Rui Zhang

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Precision medicine is at the forefront of biomedical research. Cancer registries provide rich perspectives and electronic health records (EHRs) are commonly utilized to gather additional clinical data elements needed for translational research. However, manual annotation is resource-intense and not readily scalable. Informatics-based phenotyping presents an ideal solution, but perspectives obtained can be impacted by both data source and algorithm selection. We derived breast cancer (BC) receptor status phenotypes from structured and unstructured EHR data using rule-based algorithms, including natural language processing (NLP). Overall, the use of NLP increased BC receptor status coverage by 39.2% from 69.1% with structured medication information alone. Using all available EHR data, estrogen receptor-positive BC cases were ascertained with high precision (P = 0.976) and recall (R = 0.987) compared with gold standard chart-reviewed patients. However, status negation (R = 0.591) decreased 40.2% when relying on structured medications alone. Using multiple EHR data types (and thorough understanding of the perspectives offered) are necessary to derive robust EHR-based precision medicine phenotypes.

Original language: English (US)
Pages (from-to): 85-92
Number of pages: 8
Journal: Clinical and Translational Science
Volume: 11
Issue number: 1
DOI: 10.1111/cts.12514
State: Published - Jan 1 2018

Fingerprint

Precision Medicine
Electronic Health Records
Medicine
Health
Breast Neoplasms
Phenotype
Natural Language Processing
Therapeutics
Informatics
Translational Medical Research
Information Storage and Retrieval
Processing
Estrogen Receptors
Registries
Biomedical Research
Neoplasms

ASJC Scopus subject areas

  • Neuroscience (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Pharmacology, Toxicology and Pharmaceutics (all)

Cite this

Electronic Health Record Phenotypes for Precision Medicine: Perspectives and Caveats From Treatment of Breast Cancer at a Single Institution. / Breitenstein, Matthew K.; Liu, Hongfang D; Maxwell, Kara N.; Pathak, Jyotishman; Zhang, Rui.

In: Clinical and Translational Science, Vol. 11, No. 1, 01.01.2018, p. 85-92.

@article{8163c920b5aa4505b9a8cbb68a7b4ffc,
title = "Electronic Health Record Phenotypes for Precision Medicine: Perspectives and Caveats From Treatment of Breast Cancer at a Single Institution",
abstract = "Precision medicine is at the forefront of biomedical research. Cancer registries provide rich perspectives and electronic health records (EHRs) are commonly utilized to gather additional clinical data elements needed for translational research. However, manual annotation is resource-intense and not readily scalable. Informatics-based phenotyping presents an ideal solution, but perspectives obtained can be impacted by both data source and algorithm selection. We derived breast cancer (BC) receptor status phenotypes from structured and unstructured EHR data using rule-based algorithms, including natural language processing (NLP). Overall, the use of NLP increased BC receptor status coverage by 39.2{\%} from 69.1{\%} with structured medication information alone. Using all available EHR data, estrogen receptor-positive BC cases were ascertained with high precision (P = 0.976) and recall (R = 0.987) compared with gold standard chart-reviewed patients. However, status negation (R = 0.591) decreased 40.2{\%} when relying on structured medications alone. Using multiple EHR data types (and thorough understanding of the perspectives offered) are necessary to derive robust EHR-based precision medicine phenotypes.",
author = "Breitenstein, {Matthew K.} and Liu, {Hongfang D} and Maxwell, {Kara N.} and Pathak, Jyotishman and Zhang, Rui",
year = "2018",
month = "1",
day = "1",
doi = "10.1111/cts.12514",
language = "English (US)",
volume = "11",
pages = "85--92",
journal = "Clinical and Translational Science",
issn = "1752-8054",
publisher = "Wiley-Blackwell",
number = "1",
}

TY - JOUR

T1 - Electronic Health Record Phenotypes for Precision Medicine

T2 - Perspectives and Caveats From Treatment of Breast Cancer at a Single Institution

AU - Breitenstein, Matthew K.

AU - Liu, Hongfang D

AU - Maxwell, Kara N.

AU - Pathak, Jyotishman

AU - Zhang, Rui

PY - 2018/1/1

Y1 - 2018/1/1

AB - Precision medicine is at the forefront of biomedical research. Cancer registries provide rich perspectives and electronic health records (EHRs) are commonly utilized to gather additional clinical data elements needed for translational research. However, manual annotation is resource-intense and not readily scalable. Informatics-based phenotyping presents an ideal solution, but perspectives obtained can be impacted by both data source and algorithm selection. We derived breast cancer (BC) receptor status phenotypes from structured and unstructured EHR data using rule-based algorithms, including natural language processing (NLP). Overall, the use of NLP increased BC receptor status coverage by 39.2% from 69.1% with structured medication information alone. Using all available EHR data, estrogen receptor-positive BC cases were ascertained with high precision (P = 0.976) and recall (R = 0.987) compared with gold standard chart-reviewed patients. However, status negation (R = 0.591) decreased 40.2% when relying on structured medications alone. Using multiple EHR data types (and thorough understanding of the perspectives offered) are necessary to derive robust EHR-based precision medicine phenotypes.

UR - http://www.scopus.com/inward/record.url?scp=85040248565&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85040248565&partnerID=8YFLogxK

U2 - 10.1111/cts.12514

DO - 10.1111/cts.12514

M3 - Article

C2 - 29084368

AN - SCOPUS:85040248565

VL - 11

SP - 85

EP - 92

JO - Clinical and Translational Science

JF - Clinical and Translational Science

SN - 1752-8054

IS - 1

ER -