Enhancing Clinical Information Retrieval through Context-Aware Queries and Indices

Andrew Wen; Yanshan Wang; Vinod C. Kaggal; Sijia Liu; Hongfang Liu; Jungwei Fan

doi:10.1109/BigData47090.2019.9006241

Digital Health Sciences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The big data revolution has created strong demand for searching large-scale electronic health records (EHRs) to support clinical practice, research, and administration. Despite the volume of data involved, fast and accurate identification of clinical narratives pertinent to the case a provider is seeing is crucial for decision-making at the point of care. In the general domain, this capability is accomplished through a combination of the inverted index data structure, horizontal scaling, and information retrieval (IR) scoring algorithms. These technologies are also used in the clinical domain but have met with limited success, particularly as clinical cases become more complex. One barrier to clinical performance is that contextual information, such as negation, temporality, and the subject of clinical mentions, affects clinical relevance but is not considered in general IR methodologies. In this study, we implemented a solution that identifies these semantic contexts and incorporates them into IR indexing and scoring with Elasticsearch. Experiments compared the solution against baseline approaches with respect to 1) the quality (relevance) of the returned results and 2) execution time and storage requirements. The results showed a 5.1-23.1% improvement in retrieval quality, along with 35% faster query execution. Cost-wise, the solution required 1.5-2 times more storage space and about a threefold increase in indexing time. The higher relevance demonstrated the merit of incorporating contextual information into clinical IR, and the near-constant increase in time and space suggested promising scalability.
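The abstract does not give implementation details, so the following is only a toy, in-memory sketch of the general idea of folding negation, temporality, and subject into an inverted index. It is not the authors' Elasticsearch implementation; the `Mention` fields, the `term|polarity|time|subject` token format, and the `ContextAwareIndex` class are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A clinical concept mention with hypothetical context flags."""
    term: str
    negated: bool = False      # e.g. "no evidence of pneumonia"
    historical: bool = False   # e.g. "history of pneumonia"
    subject: str = "patient"   # e.g. "family_member" for "mother had ..."


def context_token(m: Mention) -> str:
    """Encode a mention and its context as one index token (assumed format)."""
    polarity = "neg" if m.negated else "pos"
    time = "hist" if m.historical else "curr"
    return f"{m.term.lower()}|{polarity}|{time}|{m.subject}"


class ContextAwareIndex:
    """Toy inverted index keyed by context-qualified tokens."""

    def __init__(self):
        # token -> set of document ids containing that token
        self.postings = defaultdict(set)

    def add(self, doc_id, mentions):
        for m in mentions:
            self.postings[context_token(m)].add(doc_id)

    def search(self, term, negated=False, historical=False, subject="patient"):
        # Only documents whose mention matches the requested context are returned.
        query = Mention(term, negated, historical, subject)
        return sorted(self.postings.get(context_token(query), set()))


index = ContextAwareIndex()
index.add("note-1", [Mention("pneumonia")])
index.add("note-2", [Mention("pneumonia", negated=True)])
index.add("note-3", [Mention("pneumonia", subject="family_member")])

affirmed = index.search("pneumonia")
negated = index.search("pneumonia", negated=True)
```

In this sketch a plain keyword index would return all three notes for "pneumonia", whereas the context-qualified lookup returns only the note with an affirmed, current, patient-related mention. The extra tokens also hint at why such an index would need more space than a term-only one.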

Original language	English (US)
Title of host publication	Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
Editors	Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	2800-2807
Number of pages	8
ISBN (Electronic)	9781728108582
DOIs	https://doi.org/10.1109/BigData47090.2019.9006241
State	Published - Dec 2019
Event	2019 IEEE International Conference on Big Data, Big Data 2019 - Los Angeles, United States Duration: Dec 9 2019 → Dec 12 2019

Publication series

Name	Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019

Conference

Conference	2019 IEEE International Conference on Big Data, Big Data 2019
Country/Territory	United States
City	Los Angeles
Period	12/9/19 → 12/12/19

Keywords

Clinical Information Retrieval
EHR
Elasticsearch
Electronic Health Records
Information Retrieval

ASJC Scopus subject areas

Artificial Intelligence
Computer Networks and Communications
Information Systems
Information Systems and Management

Access to Document

10.1109/BigData47090.2019.9006241

Cite this

Wen, A., Wang, Y., Kaggal, V. C., Liu, S., Liu, H., & Fan, J. (2019). Enhancing Clinical Information Retrieval through Context-Aware Queries and Indices. In C. Baru, J. Huan, L. Khan, X. T. Hu, R. Ak, Y. Tian, R. Barga, C. Zaniolo, K. Lee, & Y. F. Ye (Eds.), Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019 (pp. 2800-2807). Article 9006241. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData47090.2019.9006241
