journal article Open Access Feb 13, 2007

Periodic distributions of hydrophobic amino acids allows the definition of fundamental building blocks to align distantly related proteins

DOI: 10.1002/prot.21319
Abstract
Several studies on large and small families of proteins have shown that hydrophobic amino acids are globally conserved even when they are subject to a high rate of substitution. Statistical analysis of amino acid evolution within blocks of hydrophobic amino acids detected in sequences suggests their use as a basic structural pattern to align pairs of proteins of less than 25% sequence identity, with no need to know their 3D structure. The authors present a new global alignment method and an automatic tool for Proteins with HYdrophobic Blocks ALignment (PHYBAL) based on the combinatorics of overlapping hydrophobic blocks. Two substitution matrices modeling different selective pressures inside and outside hydrophobic blocks are constructed, the Inside Hydrophobic Blocks Matrix and the Outside Hydrophobic Blocks Matrix, and a 4D space of gap values is explored. PHYBAL performance is evaluated against the Needleman and Wunsch algorithm run with the Blosum 30, Blosum 45, Blosum 62, Gonnet, HSDM, PAM250, Johnson and Remote Homo matrices. PHYBAL behavior is analyzed on eight randomly selected pairs of proteins of <30% sequence identity that cover a large spectrum of structural properties. It is also validated on two large datasets: the 127 pairs of the Domingues dataset with <30% sequence identity, and 181 pairs drawn from BAliBASE 2.0 and ranked by percentage of identity from 7 to 25%. Results confirm the importance of considering substitution matrices modeling hydrophobic contexts and a 4D space of gap values in aligning distantly related proteins. Two new notions of local and global stability are defined to assess the robustness of an alignment algorithm and the accuracy of PHYBAL. A new notion, the SAD‐coefficient, to assess the difficulty of structural alignment is also introduced. PHYBAL has been compared with Hydrophobic Cluster Analysis and HMMSUM methods. Proteins 2007. © 2007 Wiley‐Liss, Inc.
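The idea of scoring residues differently inside and outside hydrophobic blocks can be illustrated with a small global alignment sketch. This is not PHYBAL itself: the abstract does not define how blocks are detected, what the two matrices contain, or which four gap parameters are explored. The hydrophobic alphabet `VILFMYW`, the minimum run length, the scoring function `toy_score`, and the two linear gap values `gap_in` and `gap_out` below are all assumptions made only for illustration.

```python
# Illustrative sketch only; every constant here is an assumption, not PHYBAL's.
HYDROPHOBIC = set("VILFMYW")  # assumed hydrophobic alphabet


def hydrophobic_mask(seq, min_len=2):
    """Mark residues that lie in runs of >= min_len hydrophobic residues."""
    mask = [False] * len(seq)
    start = None
    for i, aa in enumerate(seq + "-"):  # sentinel closes a trailing run
        if aa in HYDROPHOBIC:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                for k in range(start, i):
                    mask[k] = True
            start = None
    return mask


def toy_score(a, b, inside):
    """Toy stand-in for the inside/outside substitution matrices."""
    if inside:
        if a == b:
            return 4
        if a in HYDROPHOBIC and b in HYDROPHOBIC:
            return 2
        return -3
    return 2 if a == b else -1


def align(s, t, gap_in=-6, gap_out=-2):
    """Needleman-Wunsch with context-dependent scores and linear gaps."""
    ms, mt = hydrophobic_mask(s), hydrophobic_mask(t)
    n, m = len(s), len(t)

    def gap_s(i):  # cost of aligning s[i-1] against a gap
        return gap_in if ms[i - 1] else gap_out

    def gap_t(j):  # cost of aligning t[j-1] against a gap
        return gap_in if mt[j - 1] else gap_out

    def sub(i, j):  # "inside" only when both residues are in a block
        return toy_score(s[i - 1], t[j - 1], ms[i - 1] and mt[j - 1])

    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap_s(i)
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap_t(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap_s(i),
                score[i][j - 1] + gap_t(j),
            )

    # Trace back one optimal path, preferring substitutions over gaps.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            a.append(s[i - 1]); b.append(t[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap_s(i):
            a.append(s[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(t[j - 1]); j -= 1
    return score[n][m], "".join(reversed(a)), "".join(reversed(b))
```

Calling `align("MKVILAG", "KVLLG")` returns a score and two gapped strings of equal length. Exploring a space of gap values, as the abstract describes, would amount to rerunning such an alignment over a grid of gap parameters and comparing the results against reference structural alignments; the sketch uses only two gap values where the paper explores four.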
Details
Published: Feb 13, 2007
Vol/Issue: 67(3)
Pages: 695-708
Cite This Article
J. Baussand, C. Deremble, A. Carbone (2007). Periodic distributions of hydrophobic amino acids allows the definition of fundamental building blocks to align distantly related proteins. Proteins: Structure, Function, and Bioinformatics, 67(3), 695-708. https://doi.org/10.1002/prot.21319