Many of the laboratories which have developed structure comparison
algorithms have also undertaken the difficult task of classifying protein
structures. The Protein Data Bank, or PDB [Bernstein et al., 1977], is the
most obvious source of material, since it is where the majority of
experimentally determined structures are deposited and it is freely
available to all. There are currently on the order of 5000
crystallographic protein structures and 900 NMR protein structures in the
PDB, in addition to a number of nucleic acid, carbohydrate and peptide
structures and theoretical models. Pre-processing
and pairwise comparison of the proteins in the PDB is a major task. As
discussed above, the splitting of multi-domain proteins into their
constituent domains is difficult to automate, yet it is an essential part
of any classification effort.
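
To give a rough sense of why the pairwise comparison mentioned above is a
major task, the short sketch below counts the unordered pairs in an
all-against-all comparison. It assumes the approximate figures quoted in the
text (5000 crystallographic and 900 NMR structures) and treats each PDB
entry as a single unit. The function name is purely illustrative, and the
real workload would differ once entries are split into chains or domains.

```python
from math import comb


def pairwise_comparisons(n_structures: int) -> int:
    """Return the number of unordered pairs among n_structures items."""
    return comb(n_structures, 2)  # n * (n - 1) / 2


# Approximate counts quoted in the text (assumed, not exact figures).
n_crystallographic = 5000
n_nmr = 900
n_total = n_crystallographic + n_nmr

print(n_total, pairwise_comparisons(n_total))  # prints "5900 17402050"
```

Even at this modest size the count exceeds 17 million pairs, and it grows
quadratically with the number of structures.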