In the course of this work, a better understanding of remote homology detection has been gained by integrating a simple yet novel scoring method into the standard dynamic programming alignment algorithm, which allows information from multiple sequence alignments to be used effectively. In this final section we discuss possible further improvements to this method.
As discussed in Section 4.4.4, fold recognition trials using more queries and larger libraries containing domains of different sizes need to be performed in order to determine the accuracy and reliability of SIVA. Confidence estimates using shuffled sequences will also be investigated. An Internet-based service would make SIVA accessible to the experimental community. After the submission of this thesis, the latest CATH database will be used to test SIVA, and SIVA in turn will be used to test CATH. SIVA could also be extended to use multiple alignments from sequence databases (rather than the CATH structural database used here). Further optimisation of parameters and the combination of sequence information from various sources could lead to small improvements. Even an increase of a single percentage point could lead to the structural or functional annotation of hundreds or thousands more genome sequences.
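The shuffled-sequence confidence estimate mentioned above could take the form of a Z-score: the real alignment score is compared with the scores obtained when the query residues are randomly permuted, which preserves composition but destroys order. The following is a minimal sketch under stated assumptions, not SIVA's implementation. The function names, the number of shuffles and the toy match/mismatch/gap scoring are all hypothetical; in practice the `score_fn` argument would be replaced by the actual alignment scoring method.

```python
import random
import statistics
from typing import Callable


def toy_alignment_score(query: str, target: str,
                        match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Global (Needleman-Wunsch) alignment score with linear gap penalties.

    Hypothetical stand-in for the real scoring method; it is NOT SIVA's scoring.
    """
    # prev[j] holds the best score for aligning the processed query prefix
    # against the first j residues of the target.
    prev = [j * gap for j in range(len(target) + 1)]
    for i in range(1, len(query) + 1):
        curr = [i * gap]
        for j in range(1, len(target) + 1):
            pair = match if query[i - 1] == target[j - 1] else mismatch
            curr.append(max(prev[j - 1] + pair,   # align the two residues
                            prev[j] + gap,        # gap in the target
                            curr[j - 1] + gap))   # gap in the query
        prev = curr
    return prev[-1]


def shuffled_z_score(query: str, target: str,
                     score_fn: Callable[[str, str], float] = toy_alignment_score,
                     n_shuffles: int = 200, seed: int = 0) -> float:
    """Z-score of the real score against scores of shuffled queries."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    real = score_fn(query, target)
    residues = list(query)
    background = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # same composition, randomised order
        background.append(score_fn("".join(residues), target))
    mean = statistics.fmean(background)
    sd = statistics.pstdev(background)
    if sd == 0:
        # No spread in the background scores: the Z-score is undefined,
        # so this sketch reports 0.0 rather than dividing by zero.
        return 0.0
    return (real - mean) / sd


if __name__ == "__main__":
    # Arbitrary illustrative sequence, aligned against itself.
    seq = "MKTAYIAKQRQISFVKSHFSRQ"
    print(shuffled_z_score(seq, seq))
```

A high Z-score indicates that the observed score is unlikely to arise from composition alone. Shuffling only the query is one of several possible background models; shuffling the library sequence, or both, would be equally reasonable choices to evaluate.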
In the next chapter we investigate further the problems of extracting structurally relevant information from protein sequences in the context of fold recognition.