Significance estimates

In order to estimate the reliability of fold recognition, one must first assume that null predictions can be confidently made. Accuracy measures (for example

) can only be used to estimate the probability of a correct recognition given that one can be made. For the large trial of 78 queries against 197 library domains, the probability of a correct top prediction is 45% (18% are detected by single-sequence methods). As already mentioned, the Z-score (the alignment score normalised by the mean and standard deviation of the 197 query-library alignment scores) can be taken as an estimate of the significance of a sequence-structure pairing. Figure 4.3 shows how a Z-score threshold, or filter, affects the accuracy (the fraction of top hits above the threshold that are correct) and the coverage (the fraction of queries with top hits above the threshold). This analysis shows that the Z-score does allow some discrimination between strong and weak predictions. The accuracy is around 90% for the 18% of queries that produce a top-ranking alignment with a Z-score greater than 1.6 (

). These are equivalent to, but not identical to, the easy targets that were correctly identified by the Smith-Waterman method. More interesting is the rise in accuracy from 45% to 57% between Z-score thresholds of 1.0 and 1.3. This is not a side-effect of the 14 confident predictions with

, since when these are excluded the accuracy rises from 34% to 41% (14 out of 34 correct).
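The Z-score filter described above can be illustrated with a minimal Python sketch. It assumes that higher alignment scores are better and that the population standard deviation is used; neither detail is stated in the text. The function names and the toy data are hypothetical and are not taken from the trial.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Normalise one query's alignment scores by their mean and standard deviation."""
    mu = mean(scores)
    sigma = pstdev(scores)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("all alignment scores are identical")
    return [(s - mu) / sigma for s in scores]


def accuracy_and_coverage(top_hits, threshold):
    """top_hits holds one (z_score, is_correct) pair per query's top-ranking alignment."""
    kept = [is_correct for z, is_correct in top_hits if z > threshold]
    # Coverage: fraction of queries whose top hit passes the threshold.
    coverage = len(kept) / len(top_hits)
    # Accuracy: fraction of the surviving top hits that are correct.
    accuracy = sum(kept) / len(kept) if kept else float("nan")
    return accuracy, coverage


# Made-up top hits for four hypothetical queries, not data from the trial.
toy_top_hits = [(2.0, True), (1.7, True), (1.2, False), (0.5, True)]
for threshold in (1.0, 1.3, 1.6):
    acc, cov = accuracy_and_coverage(toy_top_hits, threshold)
    print(f"Z > {threshold}: accuracy {acc:.2f}, coverage {cov:.2f}")
```

Raising the threshold can only remove top hits, so coverage never increases, whereas accuracy improves only if the removed hits are predominantly incorrect.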

Of the top ranking alignments with

, only 7 out of 22 (32%) are correct. Hand-checking the top-ranking alignments with high Z-scores that are incorrect according to CATH identified a further 7 pairs of domains (Table 4.10) whose SSAP alignments looked meaningful. In the latest version of CATH, many of these assignments have been corrected. Thus when

, 13 out of 22 (59%) are correct by these criteria; the coverage is 34% (a total of 65 predictions with top ranking Z-score

**Figure 4.3:** Summary statistics of fold recognition trials for the set of 78 query and 197 library folds, using hydrophobicity and secondary structure prediction information. Measurements of accuracy, coverage and null prediction quality are shown using various thresholds of Z-score. The fraction of correct top hits rises as the Z-score threshold becomes more severe, whilst coverage (number of top-ranking alignments above the threshold) naturally decreases. Queries with no recognisable fold (null predictions) should not give alignments with high Z-scores. The fraction of incorrect null predictions decreases with increasing Z-score threshold (see Section 4.4.4).

**Table 4.10:** Similar folds with different CATH topologies are recognised with high Z-scores
| Query domain | Query topology | Library domain | Library topology | Z-score |
|---|---|---|---|---|
| 1atnA2 | 3.30.420 | 1atr01 | 3.40.40 | 1.997 |
| 1rcb00 | 1.20.160 | 1rfbA0 | 1.10.430 | 1.594 |
| 1scuA1 | 3.40.330 | 1atr01 | 3.40.40 | 1.544 |
| 3ecaA2 | 3.40.320 | 1ntr00 | 3.40.330 | 1.518 |
| 1ovb00 | 3.40.190 | 2ohxA2 | 3.40.330 | 1.502 |
| 2hmqA0 | 1.20.120 | 1rcb00 | 1.20.160 | 1.440 |
| 3inkC0 | 1.20.160 | 1glqA2 | 1.10.270 | 1.424 |

Better methods of assessing the significance of alignments and alignment scores [Bryant & Altschul, 1995; Henikoff, 1996] could be employed. By performing $N$ alignments with shuffled query and/or library sequences, one can calculate P-scores (the probability that a shuffled alignment scores at least as well as the true alignment) no smaller than $N^{-1}$. Ranking by P-score [Bryant & Altschul, 1995] rather than by our Z-score may improve the number of correct top hits and simultaneously provide better confidence estimates.
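The shuffling procedure can be sketched as follows. This is a rough illustration rather than the method of the cited papers: `score_fn` is a hypothetical stand-in for the sequence-structure alignment score against one library domain, only the query is shuffled, and the toy scoring function is invented for the example.

```python
import random


def empirical_p_score(query, score_fn, n_shuffles, rng=None):
    """Estimate how often a shuffled query scores at least as well as the real one."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    true_score = score_fn(query)
    residues = list(query)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # shuffling preserves composition, destroys order
        if score_fn("".join(residues)) >= true_score:
            hits += 1
    # If no shuffle reaches the true score we only know P is below 1/N,
    # so 1/N is reported as the smallest resolvable value.
    return max(hits, 1) / n_shuffles


# Hypothetical scoring function: number of positions matching a fixed template.
template = "ACDEFGHIKL"


def toy_score(seq):
    return sum(a == b for a, b in zip(seq, template))


p = empirical_p_score(template, toy_score, n_shuffles=200)
print(f"empirical P-score: {p}")
```

The floor of `1 / n_shuffles` corresponds to the $N^{-1}$ limit: resolving smaller P-scores requires more shuffled alignments, which is the main cost of this approach.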