Detecting Serendipitous Drug Usage in Social Media with Deep Neural Network Models

Boshu Ru, Dingcheng Li, Lixia Yao

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Serendipitous drug usage refers to unexpected relief of comorbid diseases or symptoms when patients take a drug for another common or known indication. In the history of drug discovery, serendipity has contributed significantly to new and successful indications for many drugs. Our previous research has identified patient-reported serendipitous drug usage in social media. If such information could be computationally identified in social media, it could be helpful for generating and validating drug-repositioning hypotheses. In this study, we framed detection of serendipitous drug usage in social media as a binary classification problem and investigated deep neural network models as a solution. We constructed word-embedding features from drug-review posts in the patient forum of WebMD, using the word2vec algorithm. We adopted the convolutional neural network (CNN), long short-term memory network (LSTM), and convolutional long short-term memory network (CLSTM) and redesigned them by adding contextual information that we extracted from drug-review posts, information filtering tools, medical ontology, and medical knowledge. We trained, tuned, and evaluated our deep neural network models on a gold standard dataset containing 15,714 sentences, of which 447 contained serendipitous drug usages. Additionally, we compared our deep neural networks to support vector machine, random forest, and AdaBoost.M1 algorithms. The results showed that adding context information helped to reduce the false-positive rate of deep neural network models. Given an extremely imbalanced dataset with few instances of serendipitous drug usage, deep neural network models did not outperform other machine learning models that used n-gram and context features. However, deep neural network models could more effectively utilize word embedding in feature construction. This advantage makes deep neural networks worthy of further investigation and improvement.
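To make the imbalance mentioned in the abstract concrete, the following minimal sketch uses only the two counts reported there (15,714 sentences, 447 positive). It shows why plain accuracy is uninformative for this task: a classifier that never flags a sentence is right about 97% of the time yet has zero recall. The function names and the "balanced" class-weight heuristic are assumptions added for illustration; the abstract does not say how, or whether, the authors reweighted classes.

```python
# Counts taken from the abstract; everything else here is illustrative.
TOTAL_SENTENCES = 15_714
POSITIVE_SENTENCES = 447  # sentences containing serendipitous drug usage


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Compute accuracy, recall, precision, and false-positive rate."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


def balanced_class_weights(n_total, n_positive):
    """Hypothetical helper: weight each class by n_total / (2 * class_count).

    This is a common heuristic for imbalanced data, not a method
    described in the abstract.
    """
    n_negative = n_total - n_positive
    return {0: n_total / (2 * n_negative), 1: n_total / (2 * n_positive)}


# Labels with the same class proportions as the gold standard dataset.
y_true = [1] * POSITIVE_SENTENCES + [0] * (TOTAL_SENTENCES - POSITIVE_SENTENCES)
# Trivial baseline that never predicts serendipitous usage.
always_negative = [0] * TOTAL_SENTENCES

positive_rate = POSITIVE_SENTENCES / TOTAL_SENTENCES
baseline = metrics(y_true, always_negative)
weights = balanced_class_weights(TOTAL_SENTENCES, POSITIVE_SENTENCES)

print(f"positive rate: {positive_rate:.4f}")
print(f"always-negative accuracy: {baseline['accuracy']:.4f}")
print(f"always-negative recall: {baseline['recall']:.4f}")
print(f"class weights: negative={weights[0]:.3f}, positive={weights[1]:.3f}")
```

Because positives make up under 3% of the sentences, metrics that focus on the positive class (recall, precision, false-positive rate) say far more about a model here than accuracy does, which is consistent with the abstract reporting results in terms of the false-positive rate.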

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018
Editors: Harald Schmidt, David Griol, Haiying Wang, Jan Baumbach, Huiru Zheng, Zoraida Callejas, Xiaohua Hu, Julie Dickerson, Le Zhang
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1083-1090
Number of pages: 8
ISBN (Electronic): 9781538654880
DOI: https://doi.org/10.1109/BIBM.2018.8621252
State: Published - Jan 21 2019
Event: 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018 - Madrid, Spain
Duration: Dec 3 2018 to Dec 6 2018

Publication series

Name: Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018

Conference

Conference: 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018
Country: Spain
City: Madrid
Period: 12/3/18 to 12/6/18

Fingerprint

Social Media
Neural Networks (Computer)
Pharmaceutical Preparations
Long-Term Memory
Short-Term Memory
Information filtering
Adaptive boosting
Drug Repositioning
Deep neural networks
Support vector machines
Ontology
Learning systems
Drug Discovery
Neural networks
History

Keywords

  • Data mining
  • drug discovery
  • drug repurposing
  • health informatics
  • social media

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics

Cite this

Ru, B., Li, D., & Yao, L. (2019). Detecting Serendipitous Drug Usage in Social Media with Deep Neural Network Models. In H. Schmidt, D. Griol, H. Wang, J. Baumbach, H. Zheng, Z. Callejas, X. Hu, J. Dickerson, ... L. Zhang (Eds.), Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018 (pp. 1083-1090). [8621252] (Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BIBM.2018.8621252
