Many of the laboratories which have developed structure comparison
algorithms have also undertaken the difficult task of classifying protein
structures. The Protein Data Bank, or PDB [Bernstein et al., 1977], is the
most obvious source of material, since it is where the majority of
experimentally determined structures are deposited and it is freely
available to all. There are currently on the order of 5000
crystallographic protein structures and 900 NMR protein structures in the
PDB, along with a number of nucleic acid, carbohydrate and peptide
structures and theoretical models. Pre-processing
and pairwise comparison of the proteins in the PDB is a major task. As
discussed above, the splitting of multi-domain proteins into their
constituent domains is difficult to automate, yet it is an essential part
of any classification effort.
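
To give a rough sense of why pairwise comparison is a major task, the following sketch counts the unordered pairs among the approximate structure counts quoted above. It assumes, purely for illustration, that each crystallographic or NMR entry is compared once as a whole chain against every other entry; splitting entries into domains would change the number of units and therefore the count. The function name is hypothetical.

```python
from math import comb


def pairwise_comparisons(n_structures: int) -> int:
    """Number of unordered pairs among n_structures items, n(n-1)/2."""
    return comb(n_structures, 2)


# Approximate counts quoted in the text (illustrative, not exact PDB figures).
n_crystallographic = 5000
n_nmr = 900
n_total = n_crystallographic + n_nmr

print(n_total, "structures ->", pairwise_comparisons(n_total), "pairwise comparisons")
```

Under these assumptions, roughly 5900 structures give about 17.4 million pairs, which is why any all-against-all scheme depends on fast comparison methods or on reducing the set to representatives first.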