Prediction of sub-cellular location




Extreme amino acid composition bias is observed in a number of proteins, for example the Gly-X-X repeats in collagen. Could amino acid preferences also exist for proteins with other general functions or cellular locations (for example nuclear, mitochondrial, cytoplasmic, membrane-associated or extracellular)? Nakashima and Nishikawa showed that intracellular and extracellular proteins can be discriminated from their composition of singlets and duplets of amino acids (single residues and residue pairs) with an accuracy of over 80%, and that this was not a secondary consequence of structural class (for example, in eukaryotes, many extracellular proteins are predominantly $\beta$-sheet and have stabilising disulphide bonds). Previous work had shown similar trends in the intra- and extracellular domains of transmembrane proteins [Nakashima & Nishikawa, 1992]. A more detailed predictive study was performed by Cedano et al., in which 76% of proteins were correctly predicted to fall into one of five classes: integral membrane, anchored membrane, extracellular, intracellular and nuclear. Both groups suggest physical justifications for the observed amino acid preferences. For example, the low content of hydrophobic and charged residues in extracellular proteins might allow faster transport across the endoplasmic reticulum membrane during synthesis.
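To make the idea of singlet and duplet composition concrete, the following minimal sketch computes both as frequency vectors and uses them in a toy classifier. Several things here are assumptions for illustration and are not taken from the studies cited above: duplets are counted as ordered pairs of adjacent residues, the classifier is a plain nearest-centroid rule with squared Euclidean distance, and the function names and example sequences are hypothetical.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
DUPLETS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def singlet_composition(seq):
    """Fraction of each of the 20 residues in seq (other letters are ignored)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    counts = Counter(residues)
    total = len(residues)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def duplet_composition(seq):
    """Fraction of each ordered pair of adjacent residues in seq.

    Assumption: a duplet is two neighbouring residues; pairs containing
    a non-standard letter are skipped.
    """
    s = seq.upper()
    pairs = [s[i:i + 2] for i in range(len(s) - 1)]
    pairs = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(pairs)
    total = len(pairs)
    return [counts[d] / total if total else 0.0 for d in DUPLETS]


def features(seq):
    """420-dimensional vector: 20 singlet plus 400 duplet frequencies."""
    return singlet_composition(seq) + duplet_composition(seq)


def centroid(seqs):
    """Mean feature vector of a list of sequences belonging to one class."""
    vectors = [features(s) for s in seqs]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def nearest_class(seq, centroids):
    """Label of the class centroid closest to seq (squared Euclidean distance).

    Assumption: this simple rule stands in for whatever discriminant
    a real study would use.
    """
    x = features(seq)

    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Toy usage with made-up sequences (not real proteins).
toy_centroids = {
    "gly_rich": centroid(["GPAGPPGAP", "GAPGPPGPA"]),
    "lys_rich": centroid(["KKEKLKKAK"]),
}
predicted = nearest_class("GPPGPAGPP", toy_centroids)
```

In practice the class centroids would be estimated from large sets of proteins of known location, and accuracy would be measured on proteins left out of that estimate.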

These predictions are partly academic, however, since many proteins are tagged with specific signal sequences which specify their subcellular destination, and transmembrane helices can be predicted with good accuracy [Rost et al., 1995, and many others]. Even so, not all targeting signals have been fully characterised, and the membrane topology of proteins often has to be deduced from sequence analysis. It is also important to remember that the large sets of structures and sequences needed to test structure prediction algorithms effectively often contain proteins which fold or function in very different environments.
