OpBerg: Discovering Causal Sentences Using Optimal Alignments

Justin Wood; Nicholas Matiasz; Alcino Silva; William Hsu; Alexej Abyzov; Wei Wang

doi:10.1007/978-3-031-12670-3_2

Quantitative Health Sciences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The biological literature is rich with sentences that describe causal relations. Methods that automatically extract such sentences can help biologists to synthesize the literature and even discover latent relations that had not been articulated explicitly. Current methods for extracting causal sentences are based on either machine learning or a predefined database of causal terms. Machine learning approaches require a large set of labeled training data and can be susceptible to noise. Methods based on predefined databases are limited by the quality of their curation and are unable to capture new concepts or mistakes in the input. We address these challenges by adapting and improving a method designed for a seemingly unrelated problem: finding alignments between genomic sequences. This paper presents a novel method for extracting causal relations from text by aligning the part-of-speech representations of an input set with those of known causal sentences. Our experiments show that when applied to the task of finding causal sentences in biological literature, our method improves on the accuracy of other methods in a computationally efficient manner.
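
The idea of aligning part-of-speech sequences can be illustrated with a minimal sketch. It is not the OpBerg algorithm itself, which this record does not specify; it substitutes a plain Smith-Waterman local alignment, a standard technique from genomic sequence comparison. The function names, the scoring values (match 2, mismatch -1, gap -1), and the hand-assigned Penn Treebank style tags are all assumptions made for illustration.

```python
from typing import Iterable, Sequence


def local_alignment_score(
    a: Sequence[str],
    b: Sequence[str],
    match: int = 2,
    mismatch: int = -1,
    gap: int = -1,
) -> int:
    """Best Smith-Waterman local alignment score between two tag sequences.

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for tag_a in a:
        curr = [0]
        for j, tag_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if tag_a == tag_b else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def best_pattern_match(
    candidate: Sequence[str],
    patterns: Iterable[Sequence[str]],
    match: int = 2,
) -> float:
    """Highest score against any non-empty pattern, scaled so that 1.0
    means the whole pattern aligned without gaps or mismatches."""
    return max(
        local_alignment_score(candidate, p, match=match) / (match * len(p))
        for p in patterns
    )


# Hand-assigned tags (hypothetical), standing in for a real POS tagger.
KNOWN_CAUSAL = [["NN", "VBZ", "NN"]]                   # "Smoking causes cancer"
CANDIDATE = ["JJ", "NN", "VBZ", "NN"]                  # "Chronic stress causes inflammation"
UNRELATED = ["DT", "NNS", "VBD", "IN", "DT", "NN"]     # "The cells were in the dish"

if __name__ == "__main__":
    print(best_pattern_match(CANDIDATE, KNOWN_CAUSAL))
    print(best_pattern_match(UNRELATED, KNOWN_CAUSAL))
```

In this sketch a candidate sentence would be flagged when its best normalized score exceeds a chosen threshold. Because no labeled training set is involved, only a small collection of known causal patterns, the setup is in the spirit of the zero-shot keyword listed for the paper.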

Original language	English (US)
Title of host publication	Big Data Analytics and Knowledge Discovery - 24th International Conference, DaWaK 2022, Proceedings
Editors	Robert Wrembel, Johann Gamper, Gabriele Kotsis, Ismail Khalil, A Min Tjoa
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	17-30
Number of pages	14
ISBN (Print)	9783031126697
DOIs	https://doi.org/10.1007/978-3-031-12670-3_2
State	Published - 2022
Event	24th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2022 - Vienna, Austria
Duration	Aug 22 2022 → Aug 24 2022

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	13428 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	24th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2022
Country/Territory	Austria
City	Vienna
Period	8/22/22 → 8/24/22

Keywords

Causality extraction
Sequence alignments
Zero-shot learning

ASJC Scopus subject areas

Theoretical Computer Science
General Computer Science

Access to Document

10.1007/978-3-031-12670-3_2

Cite this

Wood, J., Matiasz, N., Silva, A., Hsu, W., Abyzov, A., & Wang, W. (2022). OpBerg: Discovering Causal Sentences Using Optimal Alignments. In R. Wrembel, J. Gamper, G. Kotsis, I. Khalil, & A. M. Tjoa (Eds.), Big Data Analytics and Knowledge Discovery - 24th International Conference, DaWaK 2022, Proceedings (pp. 17-30). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13428 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-12670-3_2


@inproceedings{2510ced95fc940cfb561560becec6603,
title = "OpBerg: Discovering Causal Sentences Using Optimal Alignments",
abstract = "The biological literature is rich with sentences that describe causal relations. Methods that automatically extract such sentences can help biologists to synthesize the literature and even discover latent relations that had not been articulated explicitly. Current methods for extracting causal sentences are based on either machine learning or a predefined database of causal terms. Machine learning approaches require a large set of labeled training data and can be susceptible to noise. Methods based on predefined databases are limited by the quality of their curation and are unable to capture new concepts or mistakes in the input. We address these challenges by adapting and improving a method designed for a seemingly unrelated problem: finding alignments between genomic sequences. This paper presents a novel method for extracting causal relations from text by aligning the part-of-speech representations of an input set with that of known causal sentences. Our experiments show that when applied to the task of finding causal sentences in biological literature, our method improves on the accuracy of other methods in a computationally efficient manner.",
keywords = "Causality extraction, Sequence alignments, Zero-shot learning",
author = "Justin Wood and Nicholas Matiasz and Alcino Silva and William Hsu and Alexej Abyzov and Wei Wang",
note = "Publisher Copyright: {\textcopyright} 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.; 24th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2022 ; Conference date: 22-08-2022 Through 24-08-2022",
year = "2022",
doi = "10.1007/978-3-031-12670-3_2",
language = "English (US)",
isbn = "9783031126697",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "17--30",
editor = "Robert Wrembel and Johann Gamper and Gabriele Kotsis and Ismail Khalil and Tjoa, {A Min}",
booktitle = "Big Data Analytics and Knowledge Discovery - 24th International Conference, DaWaK 2022, Proceedings",
}

TY - GEN
T1 - OpBerg
T2 - 24th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2022
AU - Wood, Justin
AU - Matiasz, Nicholas
AU - Silva, Alcino
AU - Hsu, William
AU - Abyzov, Alexej
AU - Wang, Wei
PY - 2022
Y1 - 2022
N2 - The biological literature is rich with sentences that describe causal relations. Methods that automatically extract such sentences can help biologists to synthesize the literature and even discover latent relations that had not been articulated explicitly. Current methods for extracting causal sentences are based on either machine learning or a predefined database of causal terms. Machine learning approaches require a large set of labeled training data and can be susceptible to noise. Methods based on predefined databases are limited by the quality of their curation and are unable to capture new concepts or mistakes in the input. We address these challenges by adapting and improving a method designed for a seemingly unrelated problem: finding alignments between genomic sequences. This paper presents a novel method for extracting causal relations from text by aligning the part-of-speech representations of an input set with that of known causal sentences. Our experiments show that when applied to the task of finding causal sentences in biological literature, our method improves on the accuracy of other methods in a computationally efficient manner.
KW - Causality extraction
KW - Sequence alignments
KW - Zero-shot learning
UR - http://www.scopus.com/inward/record.url?scp=85135895102&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85135895102&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-12670-3_2
DO - 10.1007/978-3-031-12670-3_2
M3 - Conference contribution
AN - SCOPUS:85135895102
SN - 9783031126697
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 17
EP - 30
BT - Big Data Analytics and Knowledge Discovery - 24th International Conference, DaWaK 2022, Proceedings
A2 - Wrembel, Robert
A2 - Gamper, Johann
A2 - Kotsis, Gabriele
A2 - Khalil, Ismail
A2 - Tjoa, A Min
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 22 August 2022 through 24 August 2022
ER -
