Bob MacCallum has moved

Division of Cell & Molecular Biology
Imperial College London
South Kensington Campus
London SW7 2AZ

I still get mail on my SBC address and continue to support my web services. My Imperial web page is here.
Group members

An up-to-date list can be obtained from the SBC personnel database (also: same list plus ex-members).

Research at SBC

The challenge I set myself when moving to Sweden was to make significant advances in protein structure prediction using a fairly new kind of evolutionary machine learning algorithm called genetic programming (see below). The group as a whole has performed research in a number of other areas.

Research highlights

A major contribution to the community has been the release of my Perl genetic programming system PerlGP under the GPL licence, importantly allowing others to replicate and scrutinise my research.

As an example of an "off-the-shelf" application of PerlGP, we have evolved simple Perl expressions to predict the nuclear localisation of proteins from amino acid sequence. This project, codenamed NucPred, is being prepared for publication and through our web service will provide novel clues for experimental biologists.

We also have an interest in other biologically inspired computational techniques with emergent properties, such as Kohonen's self-organising map (SOM). We showed that the SOM has a novel use in finding optimal views for 3D protein structures (project OVOP, [pdf]). SOMs and genetic programming are currently being used in contact prediction and prediction of ordered/disordered regions in proteins.

Research lowlights

I have worked on secondary structure prediction using evolved regular expressions, with undramatic results [pdf]. Early on, we looked at the relationship between the "non-localness" of protein structures (measured by contact order) and secondary structure patterns and prediction accuracy, but unfortunately no firm conclusions could be drawn. Self-organised mate selection was introduced into the genetic programming algorithm, but it was very difficult to show if it helped in the search for better solutions.
Genetic programming

Like other evolutionary algorithms, genetic programming (GP) simulates the processes of fitness-based selection, reproduction and mutation seen in natural populations. In GP we evolve populations of computer programs, expressions or subroutines that should solve some particular task. Typically, the size and structure of these programs are not limited, so that complex features and substructures may evolve and be swapped around in processes similar to biological recombination.

Other biologically inspired machine learning techniques, such as artificial neural networks, have been very useful for finding patterns and trends in data; however, they typically have a fixed architecture and repertoire of operations, and so cannot be expected to solve all problems. GP-derived solutions can be, in theory at least, infinitely expressive: they can contain conditional statements, loops and memory, which should be sufficient to compute anything.

In practice, however, GP is not yet routinely providing solutions to the world's problems. One issue is, of course, the size of the search space that is often encountered with real-world problems. A deeper challenge is to overcome the crudeness of our algorithms and problem definitions (e.g. fitness measures), which becomes apparent when they are compared to real biological systems.
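The loop of fitness-based selection, recombination and mutation described above can be illustrated with a small, generic sketch. This is not PerlGP and does not reflect its design; it is a toy tree-based GP in Python that evolves arithmetic expressions to fit example data. The function set, the terminal set, tournament selection, elitism, the mutation rate and the depth cap (`MAX_DEPTH`, a practical guard against runaway growth) are all illustrative assumptions.

```python
import operator
import random

# Illustrative function and terminal sets (assumptions, not taken from PerlGP).
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0, 2.0]
MAX_DEPTH = 8  # assumed cap to keep trees (and numeric values) bounded

def random_tree(rng, depth=3):
    """Grow a random expression: a terminal, or a tuple (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(list(FUNCTIONS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    """Evaluate an expression tree for a given value of x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree

def depth(tree):
    if isinstance(tree, tuple):
        return 1 + max(depth(tree[1]), depth(tree[2]))
    return 0

def fitness(tree, cases):
    """Sum of squared errors over (x, y) cases; lower is better."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in cases)

def paths(tree, path=()):
    """Yield the path (sequence of child indices) to every node."""
    yield path
    if isinstance(tree, tuple):
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(rng, a, b):
    """Recombination: graft a random subtree of b onto a random node of a."""
    return replace(a, rng.choice(list(paths(a))), get(b, rng.choice(list(paths(b)))))

def mutate(rng, tree):
    """Mutation: replace a random node with a freshly grown subtree."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))

def tournament(rng, scored, k=3):
    """Selection: the fittest of k randomly sampled individuals."""
    return min(rng.sample(scored, k), key=lambda s: s[0])[1]

def evolve(cases, pop_size=200, generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        scored = sorted(((fitness(t, cases), t) for t in population),
                        key=lambda s: s[0])
        history.append(scored[0][0])
        next_pop = [scored[0][1]]  # elitism: keep the current best unchanged
        while len(next_pop) < pop_size:
            parent = tournament(rng, scored)
            child = crossover(rng, parent, tournament(rng, scored))
            if rng.random() < 0.2:
                child = mutate(rng, child)
            # Fall back to the parent if the child exceeds the depth cap.
            next_pop.append(child if depth(child) <= MAX_DEPTH else parent)
        population = next_pop
    return min(population, key=lambda t: fitness(t, cases)), history

if __name__ == "__main__":
    # Target: y = x*x + x, sampled at a few points.
    data = [(i / 2, (i / 2) ** 2 + i / 2) for i in range(-4, 5)]
    best, history = evolve(data)
    print(best, fitness(best, data))
```

Because the best individual is copied unchanged into each new generation, the best fitness recorded in `history` can never get worse. Whether the search actually finds an exact solution depends on the seed and parameters, which is one small illustration of the search-space issue mentioned above.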
Structure prediction servers
NucPred
A new nuclear localization prediction tool, and analysis of eukaryotic proteomes.

PerlGP
I am the author of PerlGP - an open source, fully featured, Perl-based genetic programming system.

OVOP - Optimal Views Of Proteins
Check out my ex-student Oscar Sverud's project OVOP to find the best automatic views of proteins. [get the colour preprint here]
A fun toy I made a few years ago. You can compare things based on web search or PubMed document totals. There were similar tools out there before compare-stuff (many are now dead) but this is a little bit more sophisticated. Here are some examples of things you can try.