advertisement: compare things at compare-stuff.com!
next up previous contents
Next: Comparative modelling Up: Fold recognition Previous: 3D-1D methods   Contents


Threading methods

The 3D-1D methods described above perform respectably well despite a major limitation in their methodology. By defining structural environments or residue classes much of the structural context information is lost, compromising the specificity of sequence-structure matches. In other words, one amphipathic helix or strand is much like any other, and their mis-alignment is inevitable unless more detailed structural relationships are considered. The first solution to this problem was provided by Jones et al.jones:threader, who applied the double dynamic programming algorithm of Taylor and Orengo[Taylor & Orengo, 1989, see the SSAP algorithm discussed above]. A low level alignment is used to score the pairwise residue interactions (structural environment) at each equivalence $S_{i,j}$ by minimising an empirical pairwise distance potential[Sippl, 1990]. In other words, the low level alignment attempts to place query residues $k$ at positions in the library structure $l$ such that the distances between $j$ and $l$ are similar to those observed frequently in a database for the amino acids of types $i$ and $k$. As with the SSAP algorithm, the final alignment and score is traced from a high level matrix derived from the low level alignment scores.

This approach is known as `threading' because it attempts to evaluate the sequence in 3D as it is threaded through a library structure. The term is also frequently applied to 3D-1D methods however. Through the flexible optimisation of relative distributions of amino acids, threading methods ought to be able to detect the more distant homologues and analogues than the more rigid 3D-1D methods. In reality, however, neither method has a vastly superior success rate.

The alignment problem in threading is a major one. The number of all possible sub-alignments at each equivalence is exponential with respect to sequence length[Lathrop, 1994]. The double dynamic programming approach of Jones et al.jones:threader does not exhaustively search this space, but is still computationally expensive. Time-consuming algorithms cannot be thoroughly tested or optimised and as a result their understanding is compromised. The `frozen approximation'[Flockner et al., 1995, for example] speeds up the alignment process by testing the suitability of pairwise distances between query residue $k$ and library residues $l$ rather than query residues mounted onto the structure. The residue identities of the library structure can be updated with those of the query sequence at successive iterations of the alignment[Godzik et al., 1992]. Other algorithms to search for the best alignment have also been employed: branch-and-bound[Lathrop & Smith, 1996], Monte Carlo[Madej et al., 1995] and exhaustive searches using heuristics[Russell et al., 1996].

As the fold recognition field matures, more emphasis is being placed on the statistics of threading scores[Bryant & Altschul, 1995], and on how exactly these methods work (D.T. Jones & Thornton, 1996). It seems that the successes and failures of threading come from the generation of models with feasible hydrophobic cores, rather than the intended inter-residue and secondary structure relationships. Indeed, pairwise interactions are poorly conserved across large evolutionary distances[Russell & Barton, 1994,Russell et al., 1997]. Pairwise distance and contact potentials encode, in a roundabout fashion, secondary structure propensities and hydrophobic burial. The poor (but gradually improving) alignments generated by fold recognition methods[Lemer et al., 1995] are therefore an inevitable consequence.


next up previous contents
Next: Comparative modelling Up: Fold recognition Previous: 3D-1D methods   Contents
Copyright Bob MacCallum - DISCLAIMER: this was written in 1997 and may contain out-of-date information.