advertisement: compare things at compare-stuff.com! |
The 3D-1D methods described above perform respectably well despite a major
limitation in their methodology. By defining structural environments or
residue classes much of the structural context information is lost,
compromising the specificity of sequence-structure matches. In other
words, one amphipathic helix or strand is much like any other, and their
mis-alignment is inevitable unless more detailed structural relationships
are considered. The first solution to this problem was provided by Jones
et al.jones:threader, who applied the double dynamic
programming algorithm of Taylor and Orengo[Taylor & Orengo, 1989, see the SSAP algorithm
discussed above]. A low level alignment is used to score the
pairwise residue interactions (structural environment) at each equivalence
by minimising an empirical pairwise distance
potential[Sippl, 1990]. In other words, the low level alignment
attempts to place query residues
at positions in the library structure
such that the distances between
and
are similar to those
observed frequently in a database for the amino acids of types
and
.
As with the SSAP algorithm, the final alignment and score is traced from a
high level matrix derived from the low level alignment scores.
This approach is known as `threading' because it attempts to evaluate the sequence in 3D as it is threaded through a library structure. The term is also frequently applied to 3D-1D methods however. Through the flexible optimisation of relative distributions of amino acids, threading methods ought to be able to detect the more distant homologues and analogues than the more rigid 3D-1D methods. In reality, however, neither method has a vastly superior success rate.
The alignment problem in threading is a major one. The number of all
possible sub-alignments at each equivalence is exponential with respect to
sequence length[Lathrop, 1994]. The double dynamic programming
approach of Jones et al.jones:threader does not
exhaustively search this space, but is still computationally expensive.
Time-consuming algorithms cannot be thoroughly tested or optimised and as a
result their understanding is compromised. The `frozen
approximation'[Flockner et al.,
1995, for example] speeds up the alignment
process by testing the suitability of pairwise distances between query
residue and library residues
rather than query residues mounted
onto the structure. The residue identities of the library structure can be
updated with those of the query sequence at successive iterations of the
alignment[Godzik et al.,
1992]. Other algorithms to search for the best
alignment have also been employed: branch-and-bound[Lathrop & Smith, 1996],
Monte Carlo[Madej et al.,
1995] and exhaustive searches
using heuristics[Russell et al.,
1996].
As the fold recognition field matures, more emphasis is being placed on the statistics of threading scores[Bryant & Altschul, 1995], and on how exactly these methods work (D.T. Jones & Thornton, 1996). It seems that the successes and failures of threading come from the generation of models with feasible hydrophobic cores, rather than the intended inter-residue and secondary structure relationships. Indeed, pairwise interactions are poorly conserved across large evolutionary distances[Russell & Barton, 1994,Russell et al., 1997]. Pairwise distance and contact potentials encode, in a roundabout fashion, secondary structure propensities and hydrophobic burial. The poor (but gradually improving) alignments generated by fold recognition methods[Lemer et al., 1995] are therefore an inevitable consequence.