Factors influencing the ability of knowledge-based potentials to identify native sequence-structure matches

Jean Pierre A. Kocher; Marianne J. Rooman; Shoshana J. Wodak

doi:10.1006/jmbi.1994.1109

Research output: Contribution to journal › Article › peer-review

196 Scopus citations

Abstract

Several types of potentials are derived from a dataset of known protein structures by computing statistical relations between amino acid sequence and different descriptions of the protein conformation. These potentials formulate in different ways backbone dihedral angle preferences, pairwise distance-dependent interactions between amino acid residues, and solvation effects based on accessible surface area calculations. Parameters affecting the characteristics and the performance of the potentials are critically assessed by monitoring recognition of the native fold in a strict screening test, where each sequence in the dataset is threaded through a repertoire of motifs, generated from all corresponding structures. Sequence gaps are not allowed, to avoid additional approximations. Results show that residue interaction potentials computed from distances between average side-chain centroids perform significantly better on this test than those computed considering inter-C(α) or inter-C(β) distances. Combining potentials that are based on different structural descriptions and different interactions is also beneficial. The performance of some of these potentials is in fact so good that they recognize the correct fold for all the tested proteins, including subunits known to be unstable in the absence of quaternary interactions. Most strikingly, potentials representing backbone dihedral angle preferences recognize as many as 68 protein chains out of a total of 74, even though they consider solely local interactions along the chain, which, being the same as those considered in secondary structure prediction methods, are well known to be incapable of determining the full three-dimensional fold. This leads us to question the ability of procedures that screen a limited repertoire of structures to act as a stringent test for the potentials. 
We concede, however, that they are useful and fast tests, capable of revealing gross shortcomings of the potentials, or possible biases towards native recognition due, for example, to effects of sequence memory.
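
As an illustration of the gapless screening test described in the abstract, the following is a minimal sketch, not the authors' implementation. It slides a sequence over every contiguous window of every structure in a repertoire and ranks the windows by energy. The names `gapless_threading` and `toy_energy`, the buried/exposed labels (`"b"`/`"e"`), and the two-structure repertoire are all assumptions made for the example; the toy score merely stands in for the dihedral, distance-dependent, and solvation potentials of the paper.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical residue classification, used only by the toy score below.
HYDROPHOBIC = set("AVLIMFWC")


def toy_energy(sequence: str, window: Sequence[str]) -> float:
    """Toy stand-in for a knowledge-based potential (an assumption, not the paper's).

    Each position is labelled "b" (buried) or "e" (exposed). A hydrophobic
    residue in a buried position, or a polar residue in an exposed one,
    contributes -1; any other pairing contributes +1.
    """
    total = 0.0
    for residue, environment in zip(sequence, window):
        favourable = (residue in HYDROPHOBIC) == (environment == "b")
        total += -1.0 if favourable else 1.0
    return total


def gapless_threading(
    sequence: str,
    structures: Dict[str, List[str]],
    energy: Callable[[str, Sequence[str]], float],
) -> List[Tuple[float, str, int]]:
    """Score the sequence on every contiguous window of every structure.

    No gaps are allowed: residue i of the sequence always sits on position
    start + i of the structure. Returns (energy, structure name, start)
    tuples sorted from lowest to highest energy.
    """
    n = len(sequence)
    hits = []
    for name, positions in structures.items():
        # Structures shorter than the sequence yield no windows.
        for start in range(len(positions) - n + 1):
            window = positions[start:start + n]
            hits.append((energy(sequence, window), name, start))
    hits.sort(key=lambda hit: hit[0])
    return hits


# Hypothetical repertoire: one "native" structure and one longer decoy.
structures = {"native": list("bebbe"), "decoy": list("eebbeeb")}
hits = gapless_threading("LKVIE", structures, toy_energy)
best_energy, best_name, best_start = hits[0]
print(best_name, best_start, best_energy)
```

In this sketch the native fold counts as recognized when the lowest-energy window belongs to the sequence's own structure. With a repertoire this small, even a crude score passes, which mirrors the abstract's caution about screening a limited set of structures.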

Original language	English (US)
Pages (from-to)	1598-1613
Number of pages	16
Journal	Journal of Molecular Biology
Volume	235
Issue number	5
DOIs	https://doi.org/10.1006/jmbi.1994.1109
State	Published - Feb 3 1994

Keywords

Potential functions
Protein data bases
Protein structure prediction

ASJC Scopus subject areas

Structural Biology
Molecular Biology

Cite this

@article{f99f6ae85a9848aa9609bbbbc87597c4,

title = "Factors influencing the ability of knowledge-based potentials to identify native sequence-structure matches",

abstract = "Several types of potentials are derived from a dataset of known protein structures by computing statistical relations between amino acid sequence and different descriptions of the protein conformation. These potentials formulate in different ways backbone dihedral angle preferences, pairwise distance-dependent interactions between amino acid residues, and solvation effects based on accessible surface area calculations. Parameters affecting the characteristics and the performance of the potentials are critically assessed by monitoring recognition of the native fold in a strict screening test, where each sequence in the dataset is threaded through a repertoire of motifs, generated from all corresponding structures. Sequence gaps are not allowed, to avoid additional approximations. Results show that residue interaction potentials computed from distances between average side-chain centroids perform significantly better on this test than those computed considering inter-C(α) or inter-C(β) distances. Combining potentials that are based on different structural descriptions and different interactions is also beneficial. The performance of some of these potentials is in fact so good that they recognize the correct fold for all the tested proteins, including subunits known to be unstable in the absence of quaternary interactions. Most strikingly, potentials representing backbone dihedral angle preferences recognize as many as 68 protein chains out of a total of 74, even though they consider solely local interactions along the chain, which, being the same as those considered in secondary structure prediction methods, are well known to be incapable of determining the full three-dimensional fold. This leads us to question the ability of procedures that screen a limited repertoire of structures to act as a stringent test for the potentials. 
We concede, however, that they are useful and fast tests, capable of revealing gross shortcomings of the potentials, or possible biases towards native recognition due, for example, to effects of sequence memory.",

keywords = "Potential functions, Protein data bases, Protein structure prediction",

author = "Kocher, {Jean Pierre A.} and Rooman, {Marianne J.} and Wodak, {Shoshana J.}",

year = "1994",

month = feb,

day = "3",

doi = "10.1006/jmbi.1994.1109",

language = "English (US)",

volume = "235",

pages = "1598--1613",

journal = "Journal of Molecular Biology",

issn = "0022-2836",

publisher = "Academic Press Inc.",

number = "5",

}

TY - JOUR

T1 - Factors influencing the ability of knowledge-based potentials to identify native sequence-structure matches

AU - Kocher, Jean Pierre A.

AU - Rooman, Marianne J.

AU - Wodak, Shoshana J.

PY - 1994/2/3

Y1 - 1994/2/3

AB - Several types of potentials are derived from a dataset of known protein structures by computing statistical relations between amino acid sequence and different descriptions of the protein conformation. These potentials formulate in different ways backbone dihedral angle preferences, pairwise distance-dependent interactions between amino acid residues, and solvation effects based on accessible surface area calculations. Parameters affecting the characteristics and the performance of the potentials are critically assessed by monitoring recognition of the native fold in a strict screening test, where each sequence in the dataset is threaded through a repertoire of motifs, generated from all corresponding structures. Sequence gaps are not allowed, to avoid additional approximations. Results show that residue interaction potentials computed from distances between average side-chain centroids perform significantly better on this test than those computed considering inter-C(α) or inter-C(β) distances. Combining potentials that are based on different structural descriptions and different interactions is also beneficial. The performance of some of these potentials is in fact so good that they recognize the correct fold for all the tested proteins, including subunits known to be unstable in the absence of quaternary interactions. Most strikingly, potentials representing backbone dihedral angle preferences recognize as many as 68 protein chains out of a total of 74, even though they consider solely local interactions along the chain, which, being the same as those considered in secondary structure prediction methods, are well known to be incapable of determining the full three-dimensional fold. This leads us to question the ability of procedures that screen a limited repertoire of structures to act as a stringent test for the potentials. 
We concede, however, that they are useful and fast tests, capable of revealing gross shortcomings of the potentials, or possible biases towards native recognition due, for example, to effects of sequence memory.

KW - Potential functions

KW - Protein data bases

KW - Protein structure prediction

UR - http://www.scopus.com/inward/record.url?scp=0028318094&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0028318094&partnerID=8YFLogxK

U2 - 10.1006/jmbi.1994.1109

DO - 10.1006/jmbi.1994.1109

M3 - Article

C2 - 8107094

AN - SCOPUS:0028318094

SN - 0022-2836

VL - 235

SP - 1598

EP - 1613

JO - Journal of Molecular Biology

JF - Journal of Molecular Biology

IS - 5

ER -
