
Pairwise similarity

It is essential that the sets of folds used in class prediction experiments do not contain homologous pairs. If a query sequence has a significant match to a sequence in the dataset (found with standard sequence alignment tools), then its secondary structural class (and its architecture and topology) can clearly be assigned unambiguously from the homologous sequence of known structure. The CATH dataset used here contains only homologous superfamily representatives. None of these should share significant pairwise sequence similarity; however, a few pairs are homologous, and these are corrected in subsequent releases of CATH.
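As an illustration only, the check described above could be sketched as follows. This is not the procedure used to build the dataset: the text does not name an alignment tool or a significance threshold, so the E-value cutoff of 1e-3, the function names and the domain identifiers below are all assumptions. The sketch takes precomputed pairwise alignment scores, flags the significant pairs, and greedily keeps one member of each.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

Pair = FrozenSet[str]


def homologous_pairs(evalues: Dict[Pair, float], cutoff: float = 1e-3) -> Set[Pair]:
    """Return the pairs whose alignment E-value is at or below the cutoff.

    The cutoff is an assumed value, not one taken from the text.
    """
    return {pair for pair, evalue in evalues.items() if evalue <= cutoff}


def non_redundant(ids: Iterable[str], pairs: Set[Pair]) -> List[str]:
    """Greedily keep each sequence unless it pairs with one already kept."""
    kept: List[str] = []
    for seq_id in ids:
        if all(frozenset((seq_id, other)) not in pairs for other in kept):
            kept.append(seq_id)
    return kept


# Hypothetical identifiers and scores, for illustration only.
ids = ["domA", "domB", "domC", "domD"]
evalues = {
    frozenset(("domA", "domC")): 1e-20,  # significant match
    frozenset(("domB", "domD")): 0.5,    # not significant
}

pairs = homologous_pairs(evalues)
filtered = non_redundant(ids, pairs)
print(filtered)
```

The greedy pass depends on input order: the first member of each homologous pair is retained and the later one is dropped. A real filter might instead prefer the structure of higher quality.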



Copyright Bob MacCallum - DISCLAIMER: this was written in 1997 and may contain out-of-date information.