Authors: Bob MacCallum, Andrea Krings, Markus Brameier and Amine Heddad, Stockholm Bioinformatics Center, Stockholm University, Sweden.
NucPred - multiple sequences
Fetching Q9UIE9 from www.uniprot.org...
Fetching Q9CT49 from www.uniprot.org...
Fetching Q9V3Y5 from www.uniprot.org...
Fetching Q8I4F3 from www.uniprot.org...
Got 4 sequences with 3178 residues
Calculating NucPred scores 0.96 0.95 1.00 0.99
Running ClustalW (please be patient)
NucPred coloured multiple alignment (warning: the alignment may be inaccurate in places because full-length sequences, which may differ in domain organisation, have been aligned)
sp|P49756|RBM25_HUMAN 0.96 MSFPPHLNRPPMGIPALPPGIPPPQFPGFPPPVPPGTPMIPVPMSIMAPAPTVLVPTVSM
sp|B2RY56|RBM25_MOUSE 0.95 MSFPPHLNRPPMGIPALPPGIPPPQFPGFPPPVPPGTPMIPVPMSIMAPAPTVLVPTVSM
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 MSYPPRAPMPPFMNTAIPP---------------------PHIMQNMAKPPRSFRNSATI
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 ------------------------------------------------------------
cons msfpphlnrppmgipalpp pvpmsimapaptvlvptvsm
sp|P49756|RBM25_HUMAN 0.96 VG-KHLGARKDHPGLKAKENDENCGPTTTVFVGNISEKASDMLIRQLLAKCGLVLSWKRV
sp|B2RY56|RBM25_MOUSE 0.95 VG-KHLGARKDHPGLKLKENDENCGPTTTVFVGNISEKASDMLIRQLLAKCGLVLSWKRV
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 SSQPTVYQRPPEPQPQFR------GPIITVFVGNISERVPEALLKRILNACGVVINWKRV
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 ------------------------------------------------------------
cons vg khlgarkdhpglk k gptttvfvgnisekasdmlirqllakcglvlswkrv
sp|P49756|RBM25_HUMAN 0.96 QGASGKLQAFGFCEYKEPESTLRALRLLHDLQIGEKKLLVKVDAKTKAQLDEWKAKKKAS
sp|B2RY56|RBM25_MOUSE 0.95 QGASGKLQAFGFCEYKEPESTLRALRLLHDLQIGEKKLLVKVDAKTKAQLDEWKAKKKAN
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 S-------TFGFCEFDGPIAAMRAVRLLSEMEIDGKKLVAKVDAKNKVLIEDYKEQECKN
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 ------------------------------------------------------------
cons q afgfceykepestlralrllhdlqigekkllvkvdaktkaqldewkakkkan
sp|P49756|RBM25_HUMAN 0.96 NGNARPETVTNDDEEALD------EETKRRDQMIKGAIEVLIREYSSELNAPSQESDSHP
sp|B2RY56|RBM25_MOUSE 0.95 -GNARPETVTNDDEEALD------EETKRRDQMIKGAIEVLIREYSSELNAPSQESDSHP
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 GDRSNPVDEKTEDEFAIAQMHEFLEEHKHEFEGFDSSSRADLYGSANRNKKTRREEDIKM
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 ------------------------------------------------------------
cons gnarpetvtnddeeald eetkrrdqmikgaievlireysselnapsqesdshp
sp|P49756|RBM25_HUMAN 0.96 RKKKKEKKEDIFRRFPVAPLIPYPLITKEDINAIEMEEDK-RDLISREISKFRDTHKKLE
sp|B2RY56|RBM25_MOUSE 0.95 RKKKKEKKEDIFRRFPVAPLIPYPLITKEDINAIEMEEDK-RDLISREISKFRDTHKKLE
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 KVLSENTLEEEKRNLISSEIGKFRMRAEEDEHRKELEKEKEKEKLAASKEKERKKQREME
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 ------------------------------------------------------------
cons rkkkkekkedifrrfpvaplipyplitkedinaiemeedk rdlisreiskfrdthkkle
sp|P49756|RBM25_HUMAN 0.96 -----EEKGKKEKERQEIE----------------KERRERERERERERERRERERERER
sp|B2RY56|RBM25_MOUSE 0.95 -----EEKGKKEKERQEIE----------------KERRERERERERERERRERERERER
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 RMSSTSKSGSSSTAAASSSSATATSSTPAADGADMSDKTDKESVAVVIKETVKESKESAS
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 ----------------------------------------------MYNQPRIGTRYDEH
cons eekgkkekerqeie kerrererererererrerererer
sp|P49756|RBM25_HUMAN 0.96 EREREKEKERERERERDRDRDRTKERDRDRDRERDRDRDRERSSDRNKDRSRSREKSRDR
sp|B2RY56|RBM25_MOUSE 0.95 EREREKEKERERERERDRDRDRTKERDRDR--ERDRDRDRERSSDRNKDRSRSREKSRDR
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 STGRKESSSAAIEITQKERRSDSKETRRRRSKSRSKDRERERERELRELRDKERERERDR
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 KRNR----N---------RRSRSRTRSRERTSRRSRSKESPRSKRSRRSRSSSNSSSESD
cons ereRekekererererdr R r kerdRdR eR rdr reRssdr kdRsrsreksrdr
sp|P49756|RBM25_HUMAN 0.96 ERERERERERERERERERERERERERERERERER-----EKDKKRDREEDEEDAYERRKL
sp|B2RY56|RBM25_MOUSE 0.95 ERERERERERERERERERERERERERERERERE-------KDKKRDREEDEEDAYERRKL
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 ERERERERNEMRERERNEMREREREREREREREEKLLKPVRDTWREKE-MEDELRDRKKA
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 SSSSRSSSYSSRSRSRTPQRSRTSKRRRSNSHAS--------------EDSDDAREKRAF
cons ererererereReReRereReRereReRerere kdkkrdreede da errkl
sp|P49756|RBM25_HUMAN 0.96 ERKLREKEAAYQERLKNWEIRERKKTREYEKEAEREEERRREMAKEAKRLKEFLEDYDDD
sp|B2RY56|RBM25_MOUSE 0.95 ERKLREKEAAYQERLKNWEIRERKKTREYEKEAEREEERRREMAKEAKRLKEFLEDYDDD
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 EKKAREKEIAYQTRLTDWEVREKRKAKENEKYRLKELLRQEERETDAKRLKEFVEDYDDE
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 KQMMLDKKQAYLARLKRWESRERQMSKRYEREERKEKDRKKTLQKEGKRLKLFLEDYDDE
cons erklreKeaAYqeRLknWEiRErkkt eyEkeae EeeRrremakeaKRLKeFlEDYDD
sp|P49756|RBM25_HUMAN 0.96 RDDPKYYRGSALQKRLRDREKEMEADERDRKREKEELEEIRQRLLAEGHPDPDAELQRME
sp|B2RY56|RBM25_MOUSE 0.95 RDDPKYYRGSALQKRLRDREKEMEADERDRKREKEELEEIRQRLLAEGHPDPDAELQRME
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 RDDSLYYRGRELQQRLAERVREADADSKDREKEAEELAELKSKFFSGEYENPSLEFEKAR
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 KDDPKYYTSSQFFQRKRDYEREREADQKDRMQEQQEIEELKRQIMEEAANDESINIEEEA
cons rDDpkYYrgsalq Rlrdre EmeADe DRkrEkeEleE qrllaeghpdp ael rme
sp|P49756|RBM25_HUMAN 0.96 QEAERRRQP----QIKQEPESEEEEEEKQEKEEK-----REEPMEEEEEPEQKPCLKPTL
sp|B2RY56|RBM25_MOUSE 0.95 QEAERRRQP----QIKQEPESEEEEEEKQEKEEK-----REEPVEEEEEPEQKPCLKPTL
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 LEIEKLYEPRILINVNQEPPAAATSSVHQRKQAASAPGEDDEGQKQRQKSQQLDSYDPDM
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 RKRHKLKEEE---AMRKMRADSGSPNPHQPLGQSAN---GEKSSSEEESDSEKTDVKKEI
cons qeae r p qikqepeseeeeee Qekeek reep eeeeepeqkpclkptl
sp|P49756|RBM25_HUMAN 0.96 R------PISS------APSVSSASGNATPNTPGDESPCGIIIPHENSPDQQQPEEHRPK
sp|B2RY56|RBM25_MOUSE 0.95 R------PISS------APSVSSASGNATPNTPGDESPCGIIIPHENSPDQQQPEEHRPK
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 AGTNDDDSISNDDRASMADTASNASGVYAKNNNNDQSLSNSLSRHNSESRDSLAQIHTPT
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 K-----DEIKE------EPIDVDISEHVDPNTGSSGTNGNFGWKAIGDDSSLNTKINRPI
cons r pIss apsvssaSgnatpNtpgdespc iiiphenspdqqqpe hrPk
sp|P49756|RBM25_HUMAN 0.96 IGLSLK---------LGASNSPGQP--------NSVKRKKLPVDSVFNKFEDEDSDDVPR
sp|B2RY56|RBM25_MOUSE 0.95 IGLSLK---------LGASNSPGQP--------NSVKRKKLPVDSVFNKFEDEDSDDVPR
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 QSAILNDQESGHDAILPSATPPMTMPLISLTLGNNLKKKKIEATGVFVNDDDNDENINPK
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 ANGNQN---------QPQIKKEASP-----------IPIAQRLSGVFGN--DDDEDDVHS
cons iglsl l asnspgqp nsvkrkklpvd VFn feDeD ddvpr
sp|P49756|RBM25_HUMAN 0.96 KRKLVPLDYGE-------------------------------------------------
sp|B2RY56|RBM25_MOUSE 0.95 KRKLVPLDYGE-------------------------------------------------
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 KRKLVPLDYDDNISNTTPSNHAASSGSGAANNSSSSNNNNSSSADRQSSAVSAATAAAAA
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 KKKLKPFEITR-------------------------------------------------
cons KrKLvPldyge
sp|P49756|RBM25_HUMAN 0.96 ---------------------DDKNATKGTVN----------------------------
sp|B2RY56|RBM25_MOUSE 0.95 ---------------------DDKNATKGTVN----------------------------
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 VSQKIAQAFGGGSSGSGSGSGSGSNASGGKNNGGGSSSSSNNKHNSNSKHGKNEAAASAG
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 -----------------------E-ERMQVMS----------------------------
cons ddknatkgtvn
sp|P49756|RBM25_HUMAN 0.96 -------------------TEEKRKHIKSLIEKIPTAKPELFAYPLDWSIVDSILMERRI
sp|B2RY56|RBM25_MOUSE 0.95 -------------------TEEKRKHIKSLIEKIPTAKPELFAYPLDWSIVDSILMERRI
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 SSADAAAANIKKDENGAKVYDEKRRHIKSIIDRIPTQKEELFNYKLDRNEIDSGLMERKI
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 -------------------AEEKRELTKQIIKTIPATKDELFVHRIEWDQLDGKWMNDRI
cons teEKRkhiKs IekIPtaKpELFaypldwsivDsilMerrI
sp|P49756|RBM25_HUMAN 0.96 RPWINKKIIEYIGEEEATLVDFVCSKVMAHSSPQSILDDVAMVLDEEAEVFIVKMWRLLI
sp|B2RY56|RBM25_MOUSE 0.95 RPWINKKIIEYIGEEEATLVDFVCSKVMAHSSPQSILDDVAMVLDEEAEVFIVKMWRLLI
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 RPWINKKIIEYIGEPEPTLVDFICSKVLAGSPPQSILDDVQMVLDEEAEVFVVKMWRLLI
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 RPWVAKKVTQFLGEEDKSFCDFICDQIEKQATPQEILKDVAVIIDEDAEQFVIKMWRLLI
cons RPWinKKiieyiGEeeatlvDF CskvmahssPQsILdDVamvlDEeAEvF vKMWRLLI
sp|P49756|RBM25_HUMAN 0.96 YETEAKKIGLVK-
sp|B2RY56|RBM25_MOUSE 0.95 YETEAKKIGLVK-
tr|Q9V3Y5|Q9V3Y5_DROME 1.00 YELDAKKSGLAGK
tr|Q8I4F3|Q8I4F3_CAEEL 0.99 YEGQARRLGIT--
cons YEteAkkiGlvk
The consensus ('cons') is calculated as follows: the most frequent character in each column is shown in lowercase (unless it is a gap character); uppercase letters represent a column containing just one amino acid and no gaps.
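The consensus rule stated above can be sketched in a few lines of Python. This is an illustrative reimplementation, not NucPred's own code: the function name `consensus` is hypothetical, and the page does not say how tied columns are handled. Leaving a tied column blank is an assumption, chosen because it appears consistent with the 'cons' rows shown on this page.

```python
from collections import Counter

GAP = "-"  # gap character used in the ClustalW alignment rows above


def consensus(rows):
    """Build a consensus string from equal-length aligned sequences.

    Stated rule: most frequent character in lowercase (blank if it is a gap);
    uppercase when the column holds a single amino acid and no gaps.
    Assumption (not stated on the page): tied columns are left blank.
    """
    out = []
    for column in zip(*rows):
        ranked = Counter(column).most_common()
        top, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        if top == GAP or tied:
            out.append(" ")          # gap-dominated or ambiguous column
        elif len(ranked) == 1:
            out.append(top.upper())  # fully conserved, no gaps
        else:
            out.append(top.lower())  # plurality residue
    return "".join(out)


# Final alignment block from this page (human, mouse, fly, worm).
last_block = [
    "YETEAKKIGLVK-",
    "YETEAKKIGLVK-",
    "YELDAKKSGLAGK",
    "YEGQARRLGIT--",
]
print(repr(consensus(last_block)))
```

Applied to the final block of the alignment, the sketch yields the same characters as the 'cons' row shown there, with a trailing blank for the last column, where gaps are the most frequent character.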
The tags HS, MM, DM... correspond to species names Homo sapiens, Mus musculus, Drosophila melanogaster...
Positively and negatively influencing subsequences are coloured on a scale running from negative (non-nuclear) to positive (nuclear).
What does the NucPred score mean?
You have to decide on a NucPred score threshold. Sequences which score greater than or equal to this threshold are predicted to spend some time in the nucleus. Higher thresholds yield fewer predicted nuclear proteins, but these predictions are more accurate (you can have higher confidence in them). The table below gives more details of the performance of NucPred estimated using the sequences it was trained on (by cross-validation). Another benchmark is available in the Bioinformatics 2007 paper.
Specificity: fraction of proteins predicted to be nuclear that actually are nuclear.
Sensitivity: fraction of true nuclear proteins that are predicted (coverage).
NucPred score threshold | Specificity | Sensitivity
0.10 | 0.45 | 0.88
0.20 | 0.52 | 0.83
0.30 | 0.57 | 0.77
0.40 | 0.63 | 0.69
0.50 | 0.70 | 0.62
0.60 | 0.71 | 0.53
0.70 | 0.81 | 0.44
0.80 | 0.84 | 0.32
0.90 | 0.88 | 0.21
1.00 | 1.00 | 0.02
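As a rough illustration of how the table above might be used to choose a threshold, the following Python sketch looks up the lowest tabulated threshold that reaches a desired specificity and then applies it to the four scores reported on this page. The names `PERFORMANCE`, `predicted_nuclear` and `lowest_threshold_for` are hypothetical helpers introduced here, not part of NucPred.

```python
# Cross-validation figures copied from the table above:
# threshold -> (specificity, sensitivity)
PERFORMANCE = {
    0.10: (0.45, 0.88),
    0.20: (0.52, 0.83),
    0.30: (0.57, 0.77),
    0.40: (0.63, 0.69),
    0.50: (0.70, 0.62),
    0.60: (0.71, 0.53),
    0.70: (0.81, 0.44),
    0.80: (0.84, 0.32),
    0.90: (0.88, 0.21),
    1.00: (1.00, 0.02),
}


def predicted_nuclear(score, threshold):
    """A sequence is predicted nuclear if its score is >= the threshold."""
    return score >= threshold


def lowest_threshold_for(min_specificity):
    """Return (threshold, specificity, sensitivity) for the lowest tabulated
    threshold whose specificity reaches min_specificity, or None if none does."""
    for threshold in sorted(PERFORMANCE):
        specificity, sensitivity = PERFORMANCE[threshold]
        if specificity >= min_specificity:
            return threshold, specificity, sensitivity
    return None


# NucPred scores reported for the four sequences on this page.
scores = {
    "RBM25_HUMAN": 0.96,
    "RBM25_MOUSE": 0.95,
    "Q9V3Y5_DROME": 1.00,
    "Q8I4F3_CAEEL": 0.99,
}

threshold, specificity, sensitivity = lowest_threshold_for(0.80)
calls = {name: predicted_nuclear(s, threshold) for name, s in scores.items()}
print(threshold, specificity, sensitivity, calls)
```

Asking for a specificity of at least 0.80 selects the 0.70 threshold (specificity 0.81, sensitivity 0.44), and all four sequences here score above it. Picking the lowest qualifying threshold keeps as much sensitivity as the table allows for the requested specificity.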
Sequences which score >= 0.8 with NucPred and which are predicted by PredictNLS to contain an NLS have been shown to be 93% correct with a coverage of 16%. (PredictNLS by itself is 87% correct with 26% coverage on the same data.)