journal article Open Access Jan 29, 2003

Fine‐grained protein fold assignment by support vector machines using generalized n‐peptide coding schemes and jury voting from multiple‐parameter sets

DOI: 10.1002/prot.10313
Abstract
In the coarse‐grained fold assignment of major protein classes, such as all‐α, all‐β, α + β, and α/β proteins, one can easily achieve high prediction accuracy from primary amino acid sequences. However, the fine‐grained assignment of folds, such as those defined in the Structural Classification of Proteins (SCOP) database, presents a challenge because of the larger number of folds available. A recent study yielded a reasonable prediction accuracy of 56.0% on an independent set of the 27 most populated folds. In this communication, we apply the support vector machine (SVM) method, using a combination of protein descriptors based on properties derived from the composition of n‐peptides together with jury voting, to fine‐grained fold prediction, and achieve an overall prediction accuracy of 69.6% on the same independent set, significantly higher than the previous results. On 10‐fold cross‐validation, we obtained a prediction accuracy of 65.3%. Our results show that SVM coupled with suitable global sequence‐coding schemes can significantly improve fine‐grained fold prediction. Our approach should be useful in structure prediction and modeling. Proteins 2003;50:531–536. © 2003 Wiley‐Liss, Inc.
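As a rough illustration of the two ingredients named in the abstract, the sketch below computes an n‐peptide composition vector for a sequence and combines several classifier outputs by majority (jury) vote. The function names, the plain frequency normalization, the handling of non‐standard residues, and the tie‐breaking rule are all assumptions made for illustration; the abstract does not specify the authors' exact descriptors, parameter sets, or voting procedure, and the SVM training step itself is omitted.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def npeptide_composition(sequence, n=2):
    """Return the frequency of every possible n-peptide as a fixed-length vector.

    Hypothetical helper: the vector has 20**n entries, ordered as produced by
    itertools.product over AMINO_ACIDS.
    """
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]
    index = {k: i for i, k in enumerate(keys)}
    vector = [0.0] * len(keys)
    # Slide a window of width n along the sequence.
    windows = [sequence[i:i + n] for i in range(len(sequence) - n + 1)]
    # Assumption: windows containing non-standard letters are skipped.
    valid = [w for w in windows if w in index]
    for w in valid:
        vector[index[w]] += 1.0
    if valid:
        vector = [v / len(valid) for v in vector]
    return vector


def jury_vote(predictions):
    """Return the label chosen by the most classifiers.

    Assumption: ties are broken in favor of the label that appears first.
    """
    if not predictions:
        raise ValueError("jury_vote needs at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label
```

In such a setup, each classifier in the jury would be trained on a different coding scheme or parameter set (for example n = 1 and n = 2 compositions), and `jury_vote` would be applied to their predicted fold labels for one protein.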
Details
Published: Jan 29, 2003
Vol/Issue: 50(4)
Pages: 531-536
Cite This Article
Chin‐Sheng Yu, Jung‐Ying Wang, Jinn‐Moon Yang, et al. (2003). Fine‐grained protein fold assignment by support vector machines using generalized n‐peptide coding schemes and jury voting from multiple‐parameter sets. Proteins: Structure, Function, and Bioinformatics, 50(4), 531-536. https://doi.org/10.1002/prot.10313