Assessing the Need of Discourse-Level Analysis in Identifying Evidence of Drug-Disease Relations in Scientific Literature

Majid Rastegar-Mojarad, Ravikumar Komandur Elayavilli, Dingcheng Li, Hongfang D Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Relation extraction typically involves extracting relations between two or more entities that occur within a single sentence or across multiple sentences. In this study, we investigated the significance of extracting information from multiple sentences, specifically in the context of drug-disease relation discovery. We used multiple resources, such as Semantic Medline, a literature-based resource, and Medline search (for filtering spurious results), and inferred 8,772 potential drug-disease pairs. Our analysis revealed that 6,450 (73.5%) of the 8,772 potential drug-disease relations did not occur in a single sentence. Moreover, only 537 of the drug-disease pairs matched the curated gold standard in the Comparative Toxicogenomics Database (CTD), a trusted resource for drug-disease relations. Among the 537, nearly 75% (407) of the drug-disease pairs occur in multiple sentences. Our analysis revealed that the drug-disease pairs inferred from Semantic Medline or retrieved from CTD could be extracted from multiple sentences in the literature. This highlights the need for discourse-level analysis in extracting relations from biomedical literature.
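
The distinction the abstract draws between sentence-level and discourse-level evidence can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the regex sentence splitter, the case-insensitive substring matching, and the names `split_sentences`, `cooccurrence_level`, and `share` are all assumptions made for illustration. A real system would rely on proper sentence segmentation and entity normalization.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def cooccurrence_level(text, drug, disease):
    """Classify where a drug-disease pair co-occurs in a piece of text.

    Returns 'sentence' if both terms appear in at least one sentence,
    'discourse' if both appear but never in the same sentence,
    and 'none' if either term is missing.
    Matching is a plain case-insensitive substring test (assumption).
    """
    drug, disease = drug.lower(), disease.lower()
    sentences = [s.lower() for s in split_sentences(text)]
    if any(drug in s and disease in s for s in sentences):
        return "sentence"
    has_drug = any(drug in s for s in sentences)
    has_disease = any(disease in s for s in sentences)
    return "discourse" if has_drug and has_disease else "none"


def share(part, total):
    """Percentage of `part` in `total`, rounded to one decimal place."""
    return round(100.0 * part / total, 1)


if __name__ == "__main__":
    # Hypothetical example text, not taken from the study's data.
    text = "Metformin was given to 40 patients. Outcomes in type 2 diabetes improved."
    print(cooccurrence_level(text, "metformin", "type 2 diabetes"))
    # Proportions recomputed from the counts reported in the abstract.
    print(share(6450, 8772), share(407, 537))
```

Applied to the counts in the abstract, `share(6450, 8772)` gives 73.5 and `share(407, 537)` gives 75.8, in line with the reported 73.5% and "nearly 75%".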

Original language: English (US)
Title of host publication: Studies in Health Technology and Informatics
Publisher: IOS Press
Pages: 539-543
Number of pages: 5
Volume: 216
ISBN (Print): 9781614995630
DOIs: https://doi.org/10.3233/978-1-61499-564-7-539
State: Published - 2015
Event: 15th World Congress on Health and Biomedical Informatics, MEDINFO 2015 - Sao Paulo, Brazil
Duration: Aug 19 2015 to Aug 23 2015

Publication series

Name: Studies in Health Technology and Informatics
Volume: 216
ISSN (Print): 0926-9630
ISSN (Electronic): 1879-8365

Other

Other: 15th World Congress on Health and Biomedical Informatics, MEDINFO 2015
Country: Brazil
City: Sao Paulo
Period: 8/19/15 to 8/23/15

Fingerprint

Literature
Pharmaceutical Preparations
Toxicogenetics
Semantics
Databases

Keywords

  • Discourse-level analysis
  • Literature-based discovery
  • Relation extraction
  • Semantic Medline

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
  • Health Information Management

Cite this

Rastegar-Mojarad, M., Komandur Elayavilli, R., Li, D., & Liu, H. D. (2015). Assessing the Need of Discourse-Level Analysis in Identifying Evidence of Drug-Disease Relations in Scientific Literature. In Studies in Health Technology and Informatics (Vol. 216, pp. 539-543). (Studies in Health Technology and Informatics; Vol. 216). IOS Press. https://doi.org/10.3233/978-1-61499-564-7-539

@inproceedings{40fedfc08b2644918508872f701b9d3b,
title = "Assessing the Need of Discourse-Level Analysis in Identifying Evidence of Drug-Disease Relations in Scientific Literature",
abstract = "Relation extraction typically involves the extraction of relations between two or more entities occurring within a single or multiple sentences. In this study, we investigated the significance of extracting information from multiple sentences specifically in the context of drug-disease relation discovery. We used multiple resources such as Semantic Medline, a literature based resource, and Medline search (for filtering spurious results) and inferred 8,772 potential drug-disease pairs. Our analysis revealed that 6,450 (73.5{\%}) of the 8,772 potential drug-disease relations did not occur in a single sentence. Moreover, only 537 of the drug-disease pairs matched the curated gold standard in Comparative Toxicogenomics Database (CTD), a trusted resource for drug-disease relations. Among the 537, nearly 75{\%} (407) of the drug-disease pairs occur in multiple sentences. Our analysis revealed that the drug-disease pairs inferred from Semantic Medline or retrieved from CTD could be extracted from multiple sentences in the literature. This highlights the significance of the need of discourse-level analysis in extracting the relations from biomedical literature.",
keywords = "Discourse-level analysis, Literature-based discovery, Relation extraction, Semantic Medline",
author = "Majid Rastegar-Mojarad and {Komandur Elayavilli}, Ravikumar and Dingcheng Li and Liu, {Hongfang D}",
year = "2015",
doi = "10.3233/978-1-61499-564-7-539",
language = "English (US)",
isbn = "9781614995630",
volume = "216",
series = "Studies in Health Technology and Informatics",
publisher = "IOS Press",
pages = "539--543",
booktitle = "Studies in Health Technology and Informatics",

}
