Authors: Amine Heddad, Andrea Krings, Markus Brameier and Bob MacCallum, Stockholm Bioinformatics Center, Stockholm University, Sweden.

NucPred

Fetching P78527 from www.uniprot.org...

The NucPred score for your sequence is 0.95 (see score help below)

1 MAGSGAGVRCSLLRLQETLSAADRCGAALAGHQLIRGLGQECVLSSSPAV 50
51 LALQTSLVFSRDFGLLVFVRKSLNSIEFRECREEILKFLCIFLEKMGQKI 100
101 APYSVEIKNTCTSVYTKDRAAKCKIPALDLLIKLLQTFRSSRLMDEFKIG 150
151 ELFSKFYGELALKKKIPDTVLEKVYELLGLLGEVHPSEMINNAENLFRAF 200
201 LGELKTQMTSAVREPKLPVLAGCLKGLSSLLCNFTKSMEEDPQTSREIFN 250
251 FVLKAIRPQIDLKRYAVPSAGLRLFALHASQFSTCLLDNYVSLFEVLLKW 300
301 CAHTNVELKKAALSALESFLKQVSNMVAKNAEMHKNKLQYFMEQFYGIIR 350
351 NVDSNNKELSIAIRGYGLFAGPCKVINAKDVDFMYVELIQRCKQMFLTQT 400
401 DTGDDRVYQMPSFLQSVASVLLYLDTVPEVYTPVLEHLVVMQIDSFPQYS 450
451 PKMQLVCCRAIVKVFLALAAKGPVLRNCISTVVHQGLIRICSKPVVLPKG 500
501 PESESEDHRASGEVRTGKWKVPTYKDYVDLFRHLLSSDQMMDSILADEAF 550
551 FSVNSSSESLNHLLYDEFVKSVLKIVEKLDLTLEIQTVGEQENGDEAPGV 600
601 WMIPTSDPAANLHPAKPKDFSAFINLVEFCREILPEKQAEFFEPWVYSFS 650
651 YELILQSTRLPLISGFYKLLSITVRNAKKIKYFEGVSPKSLKHSPEDPEK 700
701 YSCFALFVKFGKEVAVKMKQYKDELLASCLTFLLSLPHNIIELDVRAYVP 750
751 ALQMAFKLGLSYTPLAEVGLNALEEWSIYIDRHVMQPYYKDILPCLDGYL 800
801 KTSALSDETKNNWEVSALSRAAQKGFNKVVLKHLKKTKNLSSNEAISLEE 850
851 IRIRVVQMLGSLGGQINKNLLTVTSSDEMMKSYVAWDREKRLSFAVPFRE 900
901 MKPVIFLDVFLPRVTELALTASDRQTKVAACELLHSMVMFMLGKATQMPE 950
951 GGQGAPPMYQLYKRTFPVLLRLACDVDQVTRQLYEPLVMQLIHWFTNNKK 1000
1001 FESQDTVALLEAILDGIVDPVDSTLRDFCGRCIREFLKWSIKQITPQQQE 1050
1051 KSPVNTKSLFKRLYSLALHPNAFKRLGASLAFNNIYREFREEESLVEQFV 1100
1101 FEALVIYMESLALAHADEKSLGTIQQCCDAIDHLCRIIEKKHVSLNKAKK 1150
1151 RRLPRGFPPSASLCLLDLVKWLLAHCGRPQTECRHKSIELFYKFVPLLPG 1200
1201 NRSPNLWLKDVLKEEGVSFLINTFEGGGCGQPSGILAQPTLLYLRGPFSL 1250
1251 QATLCWLDLLLAALECYNTFIGERTVGALQVLGTEAQSSLLKAVAFFLES 1300
1301 IAMHDIIAAEKCFGTGAAGNRTSPQEGERYNYSKCTVVVRIMEFTTTLLN 1350
1351 TSPEGWKLLKKDLCNTHLMRVLVQTLCEPASIGFNIGDVQVMAHLPDVCV 1400
1401 NLMKALKMSPYKDILETHLREKITAQSIEELCAVNLYGPDAQVDRSRLAA 1450
1451 VVSACKQLHRAGLLHNILPSQSTDLHHSVGTELLSLVYKGIAPGDERQCL 1500
1501 PSLDLSCKQLASGLLELAFAFGGLCERLVSLLLNPAVLSTASLGSSQGSV 1550
1551 IHFSHGEYFYSLFSETINTELLKNLDLAVLELMQSSVDNTKMVSAVLNGM 1600
1601 LDQSFRERANQKHQGLKLATTILQHWKKCDSWWAKDSPLETKMAVLALLA 1650
1651 KILQIDSSVSFNTSHGSFPEVFTTYISLLADTKLDLHLKGQAVTLLPFFT 1700
1701 SLTGGSLEELRRVLEQLIVAHFPMQSREFPPGTPRFNNYVDCMKKFLDAL 1750
1751 ELSQSPMLLELMTEVLCREQQHVMEELFQSSFRRIARRGSCVTQVGLLES 1800
1801 VYEMFRKDDPRLSFTRQSFVDRSLLTLLWHCSLDALREFFSTIVVDAIDV 1850
1851 LKSRFTKLNESTFDTQITKKMGYYKILDVMYSRLPKDDVHAKESKINQVF 1900
1901 HGSCITEGNELTKTLIKLCYDAFTENMAGENQLLERRRLYHCAAYNCAIS 1950
1951 VICCVFNELKFYQGFLFSEKPEKNLLIFENLIDLKRRYNFPVEVEVPMER 2000
2001 KKKYIEIRKEAREAANGDSDGPSYMSSLSYLADSTLSEEMSQFDFSTGVQ 2050
2051 SYSYSSQDPRPATGRFRRREQRDPTVHDDVLELEMDELNRHECMAPLTAL 2100
2101 VKHMHRSLGPPQGEEDSVPRDLPSWMKFLHGKLGNPIVPLNIRLFLAKLV 2150
2151 INTEEVFRPYAKHWLSPLLQLAASENNGGEGIHYMVVEIVATILSWTGLA 2200
2201 TPTGVPKDEVLANRLLNFLMKHVFHPKRAVFRHNLEIIKTLVECWKDCLS 2250
2251 IPYRLIFEKFSGKDPNSKDNSVGIQLLGIVMANDLPPYDPQCGIQSSEYF 2300
2301 QALVNNMSFVRYKEVYAAAAEVLGLILRYVMERKNILEESLCELVAKQLK 2350
2351 QHQNTMEDKFIVCLNKVTKSFPPLADRFMNAVFFLLPKFHGVLKTLCLEV 2400
2401 VLCRVEGMTELYFQLKSKDFVQVMRHRDDERQKVCLDIIYKMMPKLKPVE 2450
2451 LRELLNPVVEFVSHPSTTCREQMYNILMWIHDNYRDPESETDNDSQEIFK 2500
2501 LAKDVLIQGLIDENPGLQLIIRNFWSHETRLPSNTLDRLLALNSLYSPKI 2550
2551 EVHFLSLATNFLLEMTSMSPDYPNPMFEHPLSECEFQEYTIDSDWRFRST 2600
2601 VLTPMFVETQASQGTLQTRTQEGSLSARWPVAGQIRATQQQHDFTLTQTA 2650
2651 DGRSSFDWLTGSSTDPLVDHTSPSSDSLLFAHKRSERLQRAPLKSVGPDF 2700
2701 GKKRLGLPGDEVDNKVKGAAGRTDLLRLRRRFMRDQEKLSLMYARKGVAE 2750
2751 QKREKEIKSELKMKQDAQVVLYRSYRHGDLPDIQIKHSSLITPLQAVAQR 2800
2801 DPIIAKQLFSSLFSGILKEMDKFKTLSEKNNITQKLLQDFNRFLNTTFSF 2850
2851 FPPFVSCIQDISCQHAALLSLDPAAVSAGCLASLQQPVGIRLLEEALLRL 2900
2901 LPAELPAKRVRGKARLPPDVLRWVELAKLYRSIGEYDVLRGIFTSEIGTK 2950
2951 QITQSALLAEARSDYSEAAKQYDEALNKQDWVDGEPTEAEKDFWELASLD 3000
3001 CYNHLAEWKSLEYCSTASIDSENPPDLNKIWSEPFYQETYLPYMIRSKLK 3050
3051 LLLQGEADQSLLTFIDKAMHGELQKAILELHYSQELSLLYLLQDDVDRAK 3100
3101 YYIQNGIQSFMQNYSSIDVLLHQSRLTKLQSVQALTEIQEFISFISKQGN 3150
3151 LSSQVPLKRLLNTWTNRYPDAKMDPMNIWDDIITNRCFFLSKIEEKLTPL 3200
3201 PEDNSMNVDQDGDPSDRMEVQEQEEDISSLIRSCKFSMKMKMIDSARKQN 3250
3251 NFSLAMKLLKELHKESKTRDDWLVSWVQSYCRLSHCRSRSQGCSEQVLTV 3300
3301 LKTVSLLDENNVSSYLSKNILAFRDQNILLGTTYRIIANALSSEPACLAE 3350
3351 IEEDKARRILELSGSSSEDSEKVIAGLYQRAFQHLSEAVQAAEEEAQPPS 3400
3401 WSCGPAAGVIDAYMTLADFCDQQLRKEEENASVIDSAELQAYPALVVEKM 3450
3451 LKALKLNSNEARLKFPRLLQIIERYPEETLSLMTKEISSVPCWQFISWIS 3500
3501 HMVALLDKDQAVAVQHSVEEITDNYPQAIVYPFIISSESYSFKDTSTGHK 3550
3551 NKEFVARIKSKLDQGGVIQDFINALDQLSNPELLFKDWSNDVRAELAKTP 3600
3601 VNKKNIEKMYERMYAALGDPKAPGLGAFRRKFIQTFGKEFDKHFGKGGSK 3650
3651 LLRMKLSDFNDITNMLLLKMNKDSKPPGNLKECSPWMSDFKVEFLRNELE 3700
3701 IPGQYDGRGKPLPEYHVRIAGFDERVTVMASLRRPKRIIIRGHDEREHPF 3750
3751 LVKGGEDLRQDQRVEQLFQVMNGILAQDSACSQRALQLRTYSVVPMTSRL 3800
3801 GLIEWLENTVTLKDLLLNTMSQEEKAAYLSDPRAPPCEYKDWLTKMSGKH 3850
3851 DVGAYMLMYKGANRTETVTSFRKRESKVPADLLKRAFVRMSTSPEAFLAL 3900
3901 RSHFASSHALICISHWILGIGDRHLNNFMVAMETGGVIGIDFGHAFGSAT 3950
3951 QFLPVPELMPFRLTRQFINLMLPMKETGLMYSIMVHALRAFRSDPGLLTN 4000
4001 TMDVFVKEPSFDWKNFEQKMLKKGGSWIQEINVAEKNWYPRQKICYAKRK 4050
4051 LAGANPAVITCDELLLGHEKAPAFRDYVAVARGSKDHNIRAQEPESGLSE 4100
4101 ETQVKCLMDQATDPNILGRTWEGWEPWM 4128

Positively and negatively influencing subsequences are coloured on a scale running from negative (non-nuclear) to positive (nuclear).



If you find NucPred useful, please cite this paper:
NucPred - Predicting Nuclear Localization of Proteins. Brameier M, Krings A, MacCallum RM. Bioinformatics, 2007. PubMed id: 17332022
The authors also look forward to your comments and suggestions.

What does the NucPred score mean?

To interpret the score, you have to choose a NucPred score threshold. Sequences that score greater than or equal to this threshold are predicted to spend at least some time in the nucleus. Higher thresholds yield fewer predicted nuclear proteins, but those predictions are more accurate, so you can have higher confidence in them. The table below gives NucPred's performance at each threshold, estimated by cross-validation on the sequences it was trained on. Another benchmark is available in the Bioinformatics 2007 paper.

Specificity: fraction of proteins predicted to be nuclear that actually are nuclear.
Sensitivity: fraction of true nuclear proteins that are predicted (coverage).

NucPred score threshold   Specificity   Sensitivity
0.10                      0.45          0.88
0.20                      0.52          0.83
0.30                      0.57          0.77
0.40                      0.63          0.69
0.50                      0.70          0.62
0.60                      0.71          0.53
0.70                      0.81          0.44
0.80                      0.84          0.32
0.90                      0.88          0.21
1.00                      1.00          0.02
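
As an illustration of how the threshold and the table above fit together, here is a minimal Python sketch. It is not part of NucPred; the names PERFORMANCE, predicted_nuclear and strictest_threshold_passed are hypothetical and chosen only for this example. The numbers are copied from the table.

```python
# Cross-validation figures copied from the table above:
# threshold -> (specificity, sensitivity)
PERFORMANCE = {
    0.10: (0.45, 0.88),
    0.20: (0.52, 0.83),
    0.30: (0.57, 0.77),
    0.40: (0.63, 0.69),
    0.50: (0.70, 0.62),
    0.60: (0.71, 0.53),
    0.70: (0.81, 0.44),
    0.80: (0.84, 0.32),
    0.90: (0.88, 0.21),
    1.00: (1.00, 0.02),
}


def predicted_nuclear(score, threshold):
    """A sequence is predicted nuclear if its score is >= the chosen threshold."""
    return score >= threshold


def strictest_threshold_passed(score):
    """Return the highest tabulated threshold the score meets, or None if it meets none."""
    passed = [t for t in PERFORMANCE if score >= t]
    return max(passed) if passed else None


# Example with the score reported on this page (0.95).
score = 0.95
threshold = strictest_threshold_passed(score)
if threshold is not None:
    specificity, sensitivity = PERFORMANCE[threshold]
    print(f"score {score:.2f} passes threshold {threshold:.2f}: "
          f"specificity {specificity:.2f}, sensitivity {sensitivity:.2f}")
```

For the score of 0.95 reported on this page, the strictest tabulated threshold passed is 0.90, where the table lists a specificity of 0.88 and a sensitivity of 0.21.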

Sequences that score >= 0.8 with NucPred and are also predicted by PredictNLS to contain a nuclear localization signal (NLS) have been shown to be 93% correct, with a coverage of 16%. (PredictNLS by itself is 87% correct with 26% coverage on the same data.)
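
The combined rule can be written as a simple conjunction. The sketch below assumes you have already run PredictNLS separately and recorded its result as a boolean; the function name and its parameters are hypothetical and do not correspond to any NucPred or PredictNLS interface.

```python
def high_confidence_nuclear(nucpred_score, predictnls_has_nls, threshold=0.8):
    """Combined rule: NucPred score >= threshold AND PredictNLS reports an NLS.

    predictnls_has_nls is assumed to be a boolean obtained by running
    PredictNLS yourself; this function does not call either tool.
    """
    return nucpred_score >= threshold and bool(predictnls_has_nls)


# Example: a NucPred score of 0.95 with and without a PredictNLS hit.
print(high_confidence_nuclear(0.95, True))
print(high_confidence_nuclear(0.95, False))
```

Only sequences passing both conditions fall into the group reported as 93% correct with 16% coverage.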
