REVCOM: A robust Bayesian method for evolutionary rate estimation

Andrew J. Bordner, Ruben Abagyan

Research output: Contribution to journal (Article)

11 Citations (Scopus)

Abstract

Motivation: Evolutionary conservation estimated from a multiple sequence alignment is a powerful indicator of the functional significance of a residue and helps to predict active sites, ligand binding sites, and protein interaction interfaces. Many algorithms that calculate conservation work well, provided an accurate and balanced alignment is used. However, such a strong dependence on the alignment makes the results highly variable. We attempted to improve the conservation prediction algorithm by making it more robust and less sensitive to (1) local alignment errors, (2) overrepresentation of sequences in some branches and (3) occasional presence of unrelated sequences.

Results: A novel method is presented for robust constrained Bayesian estimation of evolutionary rates that avoids overfitting independent rates and satisfies the above requirements. The method is evaluated and compared with an entropy-based conservation measure on a set of 1494 protein interfaces. We demonstrated that ∼62% of the analyzed protein interfaces are more conserved than the remaining surface at the 5% significance level. A consistent method to incorporate alignment reliability is proposed and demonstrated to reduce arbitrary variation of calculated rates upon inclusion of distantly related or unrelated sequences into the alignment.
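
The abstract compares the method against an entropy-based conservation measure. As a rough illustration of what such a baseline usually looks like, here is a minimal sketch of per-column Shannon entropy over a multiple sequence alignment. It is not REVCOM, and it may differ from the exact entropy variant used in the paper. The function names, the gap characters, and the absence of sequence weighting are all assumptions made for this example.

```python
import math
from collections import Counter


def column_entropy(column, gap_chars="-."):
    """Shannon entropy (in bits) of one alignment column.

    Gaps are ignored (an assumption of this sketch). Returns None for
    an all-gap column, where entropy is undefined.
    """
    residues = [c.upper() for c in column if c not in gap_chars]
    if not residues:
        return None
    total = len(residues)
    counts = Counter(residues)
    # Lower entropy means fewer distinct residues, i.e. higher conservation.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(alignment):
    """Entropy of every column of an alignment given as equal-length strings."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        column_entropy([seq[i] for seq in alignment])
        for i in range(length)
    ]


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from the paper.
    toy = ["ACD", "ACE", "AGF", "A-G"]
    print(alignment_entropies(toy))
```

Because every sequence counts equally here, a branch that is overrepresented in the alignment dominates the column frequencies. That is one of the sensitivities the abstract says the proposed method aims to reduce.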

Original language: English (US)
Pages (from-to): 2315-2321
Number of pages: 7
Journal: Bioinformatics
Volume: 21
Issue number: 10
DOI: 10.1093/bioinformatics/bti347
State: Published - May 15 2005
Externally published: Yes

Fingerprint

Bayes Theorem
Robust Methods
Bayesian Methods
Alignment
Sequence Alignment
Conservation
Protein
Entropy
Constrained Estimation
Catalytic Domain
Carrier Proteins
Proteins
Binding Sites
Multiple Sequence Alignment
Significance level
Overfitting
Ligands
Bayesian Estimation
Binding sites
Branch

ASJC Scopus subject areas

  • Clinical Biochemistry
  • Computational Theory and Mathematics
  • Computer Science Applications

Cite this

REVCOM: A robust Bayesian method for evolutionary rate estimation. / Bordner, Andrew J.; Abagyan, Ruben.

In: Bioinformatics, Vol. 21, No. 10, 15.05.2005, p. 2315-2321.

Bordner, Andrew J.; Abagyan, Ruben. / REVCOM: A robust Bayesian method for evolutionary rate estimation. In: Bioinformatics. 2005; Vol. 21, No. 10. pp. 2315-2321.
@article{303c4187740e4849aa7fe5c33a97b6b3,
title = "REVCOM: A robust Bayesian method for evolutionary rate estimation",
abstract = "Motivation: Evolutionary conservation estimated from a multiple sequence alignment is a powerful indicator of the functional significance of a residue and helps to predict active sites, ligand binding sites, and protein interaction interfaces. Many algorithms that calculate conservation work well, provided an accurate and balanced alignment is used. However, such a strong dependence on the alignment makes the results highly variable. We attempted to improve the conservation prediction algorithm by making it more robust and less sensitive to (1) local alignment errors, (2) overrepresentation of sequences in some branches and (3) occasional presence of unrelated sequences. Results: A novel method is presented for robust constrained Bayesian estimation of evolutionary rates that avoids overfitting independent rates and satisfies the above requirements. The method is evaluated and compared with an entropy-based conservation measure on a set of 1494 protein interfaces. We demonstrated that ∼62{\%} of the analyzed protein interfaces are more conserved than the remaining surface at the 5{\%} significance level. A consistent method to incorporate alignment reliability is proposed and demonstrated to reduce arbitrary variation of calculated rates upon inclusion of distantly related or unrelated sequences into the alignment.",
author = "Bordner, {Andrew J.} and Ruben Abagyan",
year = "2005",
month = "5",
day = "15",
doi = "10.1093/bioinformatics/bti347",
language = "English (US)",
volume = "21",
pages = "2315--2321",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "10",

}

TY - JOUR

T1 - REVCOM

T2 - A robust Bayesian method for evolutionary rate estimation

AU - Bordner, Andrew J.

AU - Abagyan, Ruben

PY - 2005/5/15

Y1 - 2005/5/15

N2 - Motivation: Evolutionary conservation estimated from a multiple sequence alignment is a powerful indicator of the functional significance of a residue and helps to predict active sites, ligand binding sites, and protein interaction interfaces. Many algorithms that calculate conservation work well, provided an accurate and balanced alignment is used. However, such a strong dependence on the alignment makes the results highly variable. We attempted to improve the conservation prediction algorithm by making it more robust and less sensitive to (1) local alignment errors, (2) overrepresentation of sequences in some branches and (3) occasional presence of unrelated sequences. Results: A novel method is presented for robust constrained Bayesian estimation of evolutionary rates that avoids overfitting independent rates and satisfies the above requirements. The method is evaluated and compared with an entropy-based conservation measure on a set of 1494 protein interfaces. We demonstrated that ∼62% of the analyzed protein interfaces are more conserved than the remaining surface at the 5% significance level. A consistent method to incorporate alignment reliability is proposed and demonstrated to reduce arbitrary variation of calculated rates upon inclusion of distantly related or unrelated sequences into the alignment.

UR - http://www.scopus.com/inward/record.url?scp=19544390737&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=19544390737&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/bti347

DO - 10.1093/bioinformatics/bti347

M3 - Article

C2 - 15749694

AN - SCOPUS:19544390737

VL - 21

SP - 2315

EP - 2321

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 10

ER -