NucPred - Predicting Nuclear Localization of Proteins
NucPred (pronounced newk-pred) analyses a eukaryotic protein sequence and predicts whether the protein spends at least some time in the nucleus or no time there at all.
How it works
NucPred is an ensemble (or jury) of 100 sequence-based predictors. Each is given the sequence of interest and provides a "yes" or "no" answer to the question "does the protein spend some time in the nucleus?". If the fraction of predictors giving a "yes" answer (also known as the NucPred score) exceeds a previously agreed threshold, then the protein is predicted to have a nuclear role.
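The voting step can be sketched in a few lines of Python. This is only an illustration of the jury idea described above, not the NucPred code itself: the function names, the three toy predictors, and the default threshold of 0.5 are all assumptions made for the example.

```python
from typing import Callable, Sequence

# A predictor takes a protein sequence and answers yes (True) or no (False).
Predictor = Callable[[str], bool]


def jury_score(sequence: str, predictors: Sequence[Predictor]) -> float:
    """Return the fraction of predictors that answer "yes" for the sequence."""
    if not predictors:
        raise ValueError("at least one predictor is required")
    yes_votes = sum(1 for predict in predictors if predict(sequence))
    return yes_votes / len(predictors)


def predict_nuclear(sequence: str, predictors: Sequence[Predictor],
                    threshold: float = 0.5) -> bool:
    """Predict a nuclear role if the score exceeds the threshold.

    The threshold value is an assumption for this sketch.
    """
    return jury_score(sequence, predictors) > threshold


# Three hypothetical predictors standing in for the 100 evolved ones.
toy_jury = [
    lambda seq: "KKKK" in seq,        # contains a run of four lysines
    lambda seq: seq.count("R") >= 3,  # contains at least three arginines
    lambda seq: len(seq) > 50,        # longer than 50 residues
]
```

Calling `jury_score("MKKKKRRR", toy_jury)` gives the fraction of the three toy predictors that vote "yes"; raising the threshold trades sensitivity for specificity.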
The individual predictors are produced by an evolutionary machine learning approach called genetic programming, trained on a set of known nuclear and non-nuclear proteins. The predictors use regular expression pattern matching to make the yes/no decision, and the regular expressions are themselves evolved (using the open source PerlGP system). The Perl source code of the evolved predictors and a demo script to use them are provided free of charge to all.
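To make the regular-expression idea concrete, here is a minimal Python sketch of a single pattern-based predictor and of how it could be scored against labelled proteins. The real predictors are evolved Perl programs and may combine patterns in more elaborate ways; the pattern `[KR]{4,}`, the helper names, and the four invented sequences below are assumptions for illustration only.

```python
import re
from typing import Callable, List, Tuple


def make_regex_predictor(pattern: str) -> Callable[[str], bool]:
    """Build a yes/no predictor that answers "yes" when the pattern matches."""
    compiled = re.compile(pattern)

    def predict(sequence: str) -> bool:
        return compiled.search(sequence) is not None

    return predict


def accuracy(predictor: Callable[[str], bool],
             labelled: List[Tuple[str, bool]]) -> float:
    """Fraction of labelled sequences on which the predictor is correct."""
    correct = sum(1 for seq, is_nuclear in labelled
                  if predictor(seq) == is_nuclear)
    return correct / len(labelled)


# Hypothetical pattern: a run of four or more basic residues (K or R).
basic_patch = make_regex_predictor(r"[KR]{4,}")

# Invented toy sequences with invented labels (True = nuclear).
toy_training_set = [
    ("MSPKKKRKVEDA", True),
    ("MAGLLVVAALLG", False),
    ("MTRRKRSAEDLL", True),
    ("MKAVLDEEGSTT", False),
]
```

A measure such as `accuracy(basic_patch, toy_training_set)` is one possible way to rank candidate patterns during evolution.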
Genetic programming in a nutshell
Genetic programming is an artificial evolutionary algorithm where computer program code is generated automatically - usually in order to perform some predefined task. As with other evolutionary algorithms, a population (of computer programs) undergoes repeated cycles of selection (according to the fitness/suitability of the computer program), mutation and recombination. One particular feature of genetic programming is that the evolving individuals are optimised in terms of both their "shape" and parameters, while most other optimisation methods assume a fixed shape and optimise only the parameters. This freedom to explore the search space is particularly useful when evolving regular expressions as we have done here.
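The cycle of selection, mutation and recombination can be sketched as a toy Python loop that evolves small regular expressions. This is a simplified stand-in, not PerlGP: here an individual is a variable-length list of (token, repeat count) pairs, so the list length plays the role of "shape" and the repeat counts are the parameters. The token set, population size, mutation rates and toy data are all assumptions.

```python
import random
import re

TOKENS = ["K", "R", "[KR]", "[DE]", "P", "."]  # hypothetical building blocks
MAX_LEN = 6                                    # cap on individual length

# Invented toy sequences with invented labels (True = nuclear).
TOY_DATA = [("MSPKKKRKVEDA", True), ("MAGLLVVAALLG", False),
            ("MTRRKRSAEDLL", True), ("MKAVLDEEGSTT", False)]


def random_individual(rng, max_len=4):
    return [(rng.choice(TOKENS), rng.randint(1, 3))
            for _ in range(rng.randint(1, max_len))]


def to_regex(individual):
    # ("[KR]", 3) becomes "[KR]{3}"
    return "".join(f"{tok}{{{n}}}" for tok, n in individual)


def fitness(individual, labelled):
    pattern = re.compile(to_regex(individual))
    hits = sum(1 for seq, label in labelled
               if (pattern.search(seq) is not None) == label)
    return hits / len(labelled)


def mutate(individual, rng):
    child = list(individual)
    i = rng.randrange(len(child))
    choice = rng.random()
    if choice < 0.4:                              # parameter: new repeat count
        child[i] = (child[i][0], rng.randint(1, 3))
    elif choice < 0.7 and len(child) < MAX_LEN:   # shape: insert a token
        child.insert(i, (rng.choice(TOKENS), rng.randint(1, 3)))
    elif len(child) > 1:                          # shape: delete a token
        del child[i]
    else:                                         # single token: swap it
        child[i] = (rng.choice(TOKENS), child[i][1])
    return child


def recombine(a, b, rng):
    # Join the head of one parent to the tail of the other.
    child = a[:rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]
    return child[:MAX_LEN] or [rng.choice(a + b)]


def evolve(labelled, generations=30, pop_size=40, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half, refill with mutated offspring.
        population.sort(key=lambda ind: fitness(ind, labelled), reverse=True)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            children.append(mutate(recombine(mum, dad, rng), rng))
        population = survivors + children
    return max(population, key=lambda ind: fitness(ind, labelled))
```

Running `to_regex(evolve(TOY_DATA))` returns the best pattern found for the toy data. Because the fitter half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next.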
Related services
Find nuclear localisation signals with PredictNLS (our server gives a handy link to this service too). Predict subcellular location with TargetP or PSORT II. Use TMHMM to predict transmembrane helices (and potentially rule out a nuclear location for your protein).
Further information
If you find NucPred useful, please cite this paper:
NucPred - Predicting Nuclear Localization of Proteins. Brameier M, Krings A, MacCallum RM. Bioinformatics, 2007. PubMed ID: 17332022
Follow these links for the source code for NucPred and a preprint of the original NucPred methods paper. An analysis of orthologues from March 2004 may be of interest. For all enquiries, please contact Bob MacCallum.