The secondary structure prediction program DSC [King & Sternberg, 1996; see also
Section 1.4.4] was used to generate
probabilities for the helix and strand secondary structural states for each
residue in a domain sequence, using its associated multiple
sequence alignment as input. The property vector therefore has two
components. The DSC method is approximately 70% accurate,
as tested in blind trials at CASP2. We make the assumption that
the DSC algorithm does not significantly memorise the secondary structural
states of residues in its learning set of protein structures (many of which
will have homologues in our dataset). Our reasoning is that DSC has far fewer
parameters than the PHD method (DSC: 1,000; PHD: 25,000) relative to the
number of residues in the training set (23,000). Furthermore, when tested
without jack-knifing on its training set of 126 proteins, its accuracy
increases by only a few percent (R. King, personal communication).
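The parameter-count argument can be made concrete with a little arithmetic. The sketch below uses only the figures quoted above and takes the quoted training-set size for both methods; the function name is purely illustrative.

```python
def parameters_per_residue(n_parameters: int, n_training_residues: int) -> float:
    """Return the number of adjustable parameters per training residue."""
    if n_training_residues <= 0:
        raise ValueError("training set must contain at least one residue")
    return n_parameters / n_training_residues


# Figures quoted in the text: 23,000 training residues,
# 1,000 parameters for DSC and 25,000 for PHD.
TRAINING_RESIDUES = 23_000
dsc_ratio = parameters_per_residue(1_000, TRAINING_RESIDUES)
phd_ratio = parameters_per_residue(25_000, TRAINING_RESIDUES)

print(f"DSC: {dsc_ratio:.3f} parameters per residue")
print(f"PHD: {phd_ratio:.3f} parameters per residue")
```

On these figures DSC has roughly one parameter for every 23 training residues, whereas PHD has slightly more than one parameter per residue, which is why memorisation of the training set is less of a concern for DSC.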
The results for fold recognition using alignments of two-component vectors
of helix and strand probabilities are mixed
(Table 4.4). The number of correct top-ranking predictions is
unremarkable, but the mean adjusted rank is the best so far.
The alignment quality is, however, the worst so
far. It seems likely that the improved average
ranking is the result of non-specific recognition of domains with similar
predicted secondary structure content. The poor alignments are probably
due to the lack of phase information in the secondary structure
predictions, in contrast to the hydrophobicity information, which, as already
discussed, frequently alternates in magnitude.
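To make the idea of aligning per-residue property vectors concrete, the following is a minimal sketch of a Smith-Waterman-style local alignment over (helix, strand) probability pairs. The similarity function (a constant offset minus the Euclidean distance), the offset, and the linear gap penalty are all assumptions chosen for illustration; they are not the scoring scheme used in this work.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def similarity(u: Vector, v: Vector, offset: float = 0.5) -> float:
    """Hypothetical score: positive for similar vectors, negative otherwise."""
    return offset - math.dist(u, v)


def local_alignment_score(a: List[Vector], b: List[Vector], gap: float = 0.5) -> float:
    """Best local alignment score (Smith-Waterman recursion, linear gap penalty)."""
    prev = [0.0] * (len(b) + 1)
    best = 0.0
    for i in range(1, len(a) + 1):
        curr = [0.0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            curr[j] = max(
                0.0,                                           # start a new local alignment
                prev[j - 1] + similarity(a[i - 1], b[j - 1]),  # align residue i with residue j
                prev[j] - gap,                                 # gap in b
                curr[j - 1] - gap,                             # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Toy (helix, strand) probability profiles, invented for illustration only.
query = [(0.8, 0.1)] * 4
helix_like = [(0.9, 0.05)] * 4
strand_like = [(0.05, 0.9)] * 4

score_helix = local_alignment_score(query, helix_like)
score_strand = local_alignment_score(query, strand_like)
print(score_helix, score_strand)
```

In this toy case a uniformly helical query scores well against any uniformly helical target, whatever the register of the alignment. This is one way to picture how vectors of this kind could reward similar secondary structure content without pinning down the alignment itself.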
As with the alignment of sequence conservation information, we found that
the combination of hydrophobicity with predicted secondary structure
information was cooperative (the property vector now has three components).
Figure 4.2 shows the effect of the ratio between the two
components. The minima of the two performance measures are
found, surprisingly, at different ratios: 2:1 and 1:2
(hydrophobicity:secondary structure prediction), both giving better
performance than either measure alone. The numerical results for these
combinations and the intervening ratio of 1:1 are given in
Tables 4.4 and 4.9.
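As an illustration of how a hydrophobicity:secondary structure ratio might enter a three-component property vector, the sketch below scales the hydrophobicity component and the two secondary structure components by normalised weights. Applying the ratio as multiplicative weights on the components is an assumption made here for illustration, as are the function name and the toy input values; the text does not specify how the ratio is applied.

```python
from typing import List, Sequence, Tuple

Vector3 = Tuple[float, float, float]


def combine_properties(
    hydrophobicity: Sequence[float],
    helix_prob: Sequence[float],
    strand_prob: Sequence[float],
    ratio: Tuple[float, float] = (1.0, 1.0),
) -> List[Vector3]:
    """Build weighted (hydrophobicity, helix, strand) vectors, one per residue.

    `ratio` is (hydrophobicity weight, secondary structure weight),
    e.g. (2, 1) for a 2:1 ratio. Weights are normalised to sum to 1.
    """
    if not (len(hydrophobicity) == len(helix_prob) == len(strand_prob)):
        raise ValueError("all per-residue profiles must have the same length")
    w_hyd, w_ss = ratio
    if w_hyd < 0 or w_ss < 0 or w_hyd + w_ss == 0:
        raise ValueError("ratio weights must be non-negative and not both zero")
    total = w_hyd + w_ss
    w_hyd, w_ss = w_hyd / total, w_ss / total
    return [
        (w_hyd * h, w_ss * p_helix, w_ss * p_strand)
        for h, p_helix, p_strand in zip(hydrophobicity, helix_prob, strand_prob)
    ]


# Toy per-residue values, invented for illustration only.
hyd = [0.6, -0.2, 0.4]
helix = [0.9, 0.7, 0.1]
strand = [0.3, 0.1, 0.8]

for ratio in [(2, 1), (1, 1), (1, 2)]:
    print(ratio, combine_properties(hyd, helix, strand, ratio=ratio))
```

Normalising the weights keeps the overall scale of the vectors comparable across ratios, so that a change in ratio shifts emphasis between the two kinds of information without inflating every component.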
The low results obtained for ratios 1:1 and 1:2 may be due in part
to the improved recognition of Ig-like topologies (2.60.40). With a
ratio of 1:1, the number of correct top-ranking predictions is almost twice
that of the basic Smith-Waterman method.