Next: 3D-1D methods Up: Prediction of protein structure Previous: Sequence composition approaches   Contents

Fold recognition

With the observation that the number of distinct structures was not growing as fast as the PDB as a whole, it was suggested a few years ago that only a finite and relatively small number of fold topologies are encoded by the millions of protein sequences in nature [Chothia, 1992; Blundell & Johnson, 1993; Orengo et al., 1994]. Most estimates of this limit are in the region of one thousand folds, and the time scale for reaching it is on the order of tens of years. Bowie and coworkers realised that structural information could be used in a way analogous to multiple sequence information in profile methods [Bowie et al., 1990; Bowie et al., 1991]. Aligning and scoring query sequences against the complete library of folds would bypass the difficult problem of structure prediction by ab initio methods (discussed below). Whilst the fold universe is not yet fully explored, estimates based on the deposition of structures in the PDB put the probability that a newly sequenced protein (with no detectable structural homologue) is similar to a known fold at 70% [Orengo et al., 1994]. Since protein structure and function are more conserved than protein sequence, identifying correspondences between novel sequences and known structures would greatly assist in the characterisation of these sequences. A large field, known as fold recognition, has developed to tackle this problem; its major developments are reviewed below.
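The idea of scoring a query sequence against every entry in a fold library and ranking the results can be illustrated with a minimal sketch. This is not the method of Bowie and coworkers or of any published program. It assumes each fold is represented by a hypothetical position-specific score table, and it uses ungapped sliding alignment as a simplification; practical methods would normally use gapped dynamic programming. All names (`Profile`, `score_ungapped`, `rank_folds`, `toy_library`) and all score values are invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a fold "profile" is a list of positions, each
# mapping a one-letter amino acid code to a compatibility score.
Profile = List[Dict[str, float]]


def score_ungapped(query: str, profile: Profile, default: float = -1.0) -> float:
    """Return the best ungapped score of `query` slid along `profile`.

    Residues with no entry at a profile position receive `default`.
    Query residues that overhang the profile contribute nothing.
    """
    n, m = len(query), len(profile)
    best = float("-inf")
    # offset is the profile position aligned with query[0]
    for offset in range(-(n - 1), m):
        total = 0.0
        for i, residue in enumerate(query):
            j = i + offset
            if 0 <= j < m:
                total += profile[j].get(residue, default)
        best = max(best, total)
    return best


def rank_folds(query: str, library: Dict[str, Profile]) -> List[Tuple[str, float]]:
    """Score `query` against every fold in `library`, best score first."""
    scored = [(name, score_ungapped(query, prof)) for name, prof in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy library with invented scores (assumption, not real data).
toy_library: Dict[str, Profile] = {
    "fold_A": [{"A": 2.0}, {"L": 2.0}, {"K": 2.0}],
    "fold_B": [{"G": 2.0}, {"P": 2.0}, {"G": 2.0}],
}

ranking = rank_folds("ALK", toy_library)
print(ranking)
```

With the toy data, the query `ALK` ranks `fold_A` first because every residue matches a favoured position. In a realistic setting the raw scores would also need to be normalised (for example against scores of shuffled sequences) before folds of different lengths could be compared fairly.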



Copyright Bob MacCallum - DISCLAIMER: this was written in 1997 and may contain out-of-date information.