Authors: Amine Heddad, Andrea Krings, Markus Brameier and Bob MacCallum, Stockholm Bioinformatics Center, Stockholm University, Sweden.
NucPred
Fetching P34036 from www.uniprot.org...
The NucPred score for your sequence is 0.94 (see score help below)
1 MEDQQINVDSPTSGTNTPPVVTPTTSVISEQTLEETVKYLCKICPTLLDG 50
51 DQSVFQNNLSNQIPPENMNKLKKFISDSKIPVLLIQKTNPTINSSNQTST 100
101 TTSSSSSSDDNTLTSSQQQSKESFNFEIEVKFGGENKSTLAIVKRIPESI 150
151 VEYSSNKSIASQLQVLNLGDGSPMDTLHNYIHNSVAPFVRSYILSASKDD 200
201 ATTPSGGSADKSITSQNLDKEMKQSIGAVNQKIAELEISLYNCKQQVQIP 250
251 EVTLAINPEIKSISKRLKETTGRTIKPDDLGDKASSPEFLNLLQAGTTTW 300
301 AKNIQNVTKHNLIENLPSDVSTSQEINFWIELETSLQNIDQQLKSPEVEV 350
351 TLATLRQAKRFIASAPFETDTIGVRKAMDKVQSYKTLFKDFPITPLLTAT 400
401 DLDSISSSVAAIFSHLKKTKNPYYPIPRYLSFLEAIGRDMCNKVYQILRQ 450
451 KNLMNIDYNDFEHLNRSVRALFTLWDDQFGAFRDILRDLAKKRGNERIPL 500
501 IVNIDNRIQLVIIKIGKFRKQHEDLKNVVSNVLPGSQLGGGVVQTNTSNP 550
551 TSPQKQEINAIEEINQAYLEFKEIDVLQLSKEGEEIWDAVVKRYNSRTDR 600
601 VETYITVKLRDRLATAKNANEMFRVYQKFKDLLKRPKIRGATHEYESQLI 650
651 ERVKEDIRVLHDKFKMQYNNSEAYYMSQLRDLPPVSGAIIWARQIERQLD 700
701 TYMKRVANVLGDSWESDAEGQKLKSESDQFRHKLNTDHIFSKWADETEKR 750
751 SFDISGRILTIVKRGNKLALDINFDSHIIMLFKEVRNLQWLGFRVPLKIS 800
801 FISQGAKQVYPFAVSLKETLRTYAQTSGKVTPEFSTLVASYKRDVQANIT 850
851 EGFRLKWETIPKVDPYVRKLSTSINNFRDKVDDLIVKYSEIKKQLDGLKS 900
901 CPFKSESFNEIIANIQKVVDELNLANYSNLPQWVTQLDAQVESMLIERLI 950
951 DAINSWVQLIEGKEDQKDSQSTSGSSNKGGKLNRMNYSIRNKSDEENSSD 1000
1001 LTQPQQSQQQQQTISIKPKLEKTIHEIVIRNQILSLSPPLEVARVNWIDQ 1050
1051 LHSWLNICCDLPRIQSSRYDESAMVHRGGVDSKKQSTFRDMLPKLPQGSL 1100
1101 ESAYSAITNKLEQVQQYVSIWLQYQSLWDMDSSFVYSKLGDDLNKWQLLL 1150
1151 NQIKKSRSTFDNSSTEKQFGPLTIDYTQVQASVNNKYDYWHKDILGHFGS 1200
1201 KLAEKMNQFYETISSSRQELEKLSVETVSTEEAVHFIIQIQDMKKKLSSW 1250
1251 EADLRYYRTGQDLLQRQRFSFPNDWLDCERVEGEWSAFNEILNRKNATIS 1300
1301 EAIPQLQAKILQESKSINDRIKDFIDEWTANKPLQGSIKHSTALETLKIF 1350
1351 EGRLIRLREESDRLSKAKQALDLTDTTGSSSSDQDRLVPVEEEIQDLKAV 1400
1401 WVELSNTWQEIDSLKETAWSAIIPRKVRKSLEDTLQKLKNLPNRIRQYSA 1450
1451 FDHAQNLIKIYLKGNAIITDLHSEAIKDRHWKILKKRLNTNWIITELTLG 1500
1501 SIWDSDLARNENIYREVITAAQGEIALEEFLKGVREFWTTLELDLVNYQR 1550
1551 KCKLVRGWDDLFNKLAEHLNSISAMKMSPYYKVFEEEANHWDDRLNKVRS 1600
1601 LLDVWIDVQRRWVYLEGIFSGSGDINQLLPAESTRFKSINSEFIAILKKV 1650
1651 SGAPLILEVLAIERIQQTMERLSDLLGKVQKALGEYLERQRSAFARFYFV 1700
1701 GDEDLLEIIGNSKDIIKIQKHFRKMFAGLANLTLDDEKTTIIGMSSAEGE 1750
1751 TVTFKKPISIANGPKIHEWLTMVESEMKSTLATLLSESLQHFNQVDVNDH 1800
1801 SKYSEWVDNYPTQLVLLTSQIVWSTQVDQALGGGTLQQSKIQEQLQSIEQ 1850
1851 TTQMILNNLADSVLQDLSAQKRKKFEHLITELVHQRDVVRQLQKCKNLTG 1900
1901 NKDFDWLYHMRYYYDATQENVLHKLVIHMANATFYYGFEYLGIGERLVQT 1950
1951 PLTDRCYLTLTQALESRMGGNPFGPAGTGKTETVKALGSQLGRFVLVFCC 2000
2001 DEGFDLQAMSRIFVGLCQCGAWGCFDEFNRLEERILSAVSQQIQTIQVAL 2050
2051 KENSKEVELLGGKNISLHQDMGIFVTMNPGYAGRSNLPDNLKKLFRSMAM 2100
2101 IKPDREMIAQVMLYSQGFKTAEVLAGKIVPLFKLCQEQLSAQSHYDFGLR 2150
2151 ALKSVLVSAGGIKRKCQPPQLPPITDAESKTKADQIYCQYEIGVLLNSIN 2200
2201 DTMIPKLVADDIPLIQSLLLDVFPGSQLQPIQMDQLRKKIQEIAKQRHLV 2250
2251 TKQEWVEKILQLHQILNINHGVMMVGPSGGGKTTSWEVYLEAIEQVDNIK 2300
2301 SEAHVMDPKAITKDQLFGSLDLTTREWTDGLFTATLRRIIDNVRGESTKR 2350
2351 HWIIFDGDVDPEWVENLNSLLDDNKLLTLPNGERLALPNNVRVMFEVQDL 2400
2401 KYATLATISRCGMVWFSEEILTTQMIFQNYLDTLSNEPFDPQEKEQQKRN 2450
2451 ENAQLQQQQQTTITSPILTSPPTTSSSSRSTTSTTSMIPAGLKVQKECAA 2500
2501 IISQYFEPGGLVHKVLEDAGQRPHIMDFTRLRVLNSFFSLMNRSIVNVIE 2550
2551 YNQLHSDFPMSPENQSNYITNRLLYSLMWGLGGSMGLVERENFSKFIQTI 2600
2601 AITPVPANTIPLLDYSVSIDDANWSLWKNKVPSVEVETHKVASPDVVIPT 2650
2651 VDTTRHVDVLHAWLSEHRPLILCGPPGSGKTMTLTSTLRAFPDFEVVSLN 2700
2701 FSSATTPELLLKTFDHHCEYKRTPSGETVLRPTQLGKWLVVFCDEINLPS 2750
2751 TDKYGTQRVITFIRQMVEKGGFWRTSDHTWIKLDKIQFVGACNPPTDAGR 2800
2801 VQLTHRFLRHAPILLVDFPSTSSLTQIYGTFNRALMKLLPNLRSFADNLT 2850
2851 DAMVEFYSESQKRFTPDIQAHYIYSPRELSRWDRALLEAIQTMDGCTLEG 2900
2901 LVRLWAHEALRLFQDRLVETEEKEWTDKKIDEVALKHFPSVNLDALKRPI 2950
2951 LYSNWLTKDYQPVNRSDLREYVKARLKVFYEEELDVPLVLFNEVLDHILR 3000
3001 IDRVFRQPQGHALLIGVSGGGKSVLSRFVAWMNGLSIYTIKVNNNYKSSD 3050
3051 FDDDLRMLLKRAGCKEEKICFIFDESNVLESSFLERMNTLLAGGEVPGLF 3100
3101 EGEEFTALMHACKETAQRNGLILDSEEELYKYFTSQVRRNLHVVFTMNPA 3150
3151 SPDFHNRSATSPALFNRCVLDWFGEWSPEALFQVGSEFTRNLDLENPQYI 3200
3201 APPVFIQEAEIMGNNLMAIPPSHRDAVVSSLVYIHQTIGEANIRLLKRQG 3250
3251 RQNYVTPRHYLDFINQVVLLINEKRDQLEEEQLHLNIGLKKLRDTEAQVK 3300
3301 DLQVSLAQKNRELDVKNEQANQKLKQMVQDQQAAEIKQKDARELQVQLDV 3350
3351 RNKEIAVQKVKAYADLEKAEPAIIEAQEAVSTIKKKHLDEIKSLPKPPTP 3400
3401 VKLAMEAVCLMLGGKKLEWADIRKKIMEPNFITSIINYDTKKMMTPKIRE 3450
3451 AITKGYLEDPGFDYETVNRASKACGPLVKWATAQTYYSEILDRIKPLREE 3500
3501 VEQLENAANELKLKQDEIVATITALEKSIATYKEEYATLIRETEQIKTES 3550
3551 SKVKNKVDRSIALLDNLNSERGRWEQQSENFNTQMSTVVGDVVLASAFLA 3600
3601 YIGFFDQNFRTDLMRKWMIRLDSVGIKFKSDLSVPSFLSKPEERLNWHAN 3650
3651 SLPSDELCIENAIMLKRFNRYPLVIDPSGQAMEFLMNQYADKKITKTSFL 3700
3701 DSSFMKNLESALRFGCPLLVQDVENIDPVLNPVLNKEIRKKGGRILIRLG 3750
3751 DQDVDFSPSFMIFLFTRDPTAHFTPDLCSRVTFVNFTVTPSSLQSQCLHE 3800
3801 ALKTERPDTHKKRSDLLKIQGEFQVKLRILEKSLLNALSQASGNILDDDS 3850
3851 VISTLETLKKETTEIALKVEETETVMQEISEVSALYNPMALSCSRVYFAM 3900
3901 EELSQFHLYQFSLRAFLDIFYNLLNNNPNLVDKKDPNERLVYLSKDIFSM 3950
3951 TFNRVTRTLLNDDKLTFALQLTIISVKGTSNEIEESEWDFLLKGGDNLTS 4000
4001 IKETIPQLDSLLSTTQQKWLICLRQQVPSFSKLVDHIQQNSSDWKQFFGK 4050
4051 DQVGEPIIPESWIVAQAQLSNQQSTIVSNFRKILLMKAFHSDRVLQYSHS 4100
4101 FVCSVFGEDFLNTQELDMANIVEKEVKSSSPLLLCSVPGYDASSKVDDLA 4150
4151 LQLHKQYKSFAIGSPEGFELAEKSIYAAAKSGTWVLLKNIHLAPQWLVQL 4200
4201 EKKLHSLSPHPSFRLFMTSEIHPALPANLLRMSNVFSYENPPGVKANLLH 4250
4251 TFIGIPATRMDKQPAERSRIYFLLAWFHAIIQERLRYIPLGWTKFFEFND 4300
4301 ADLRGALDSIDYWVDLYSKGRSNIDPDKIPWIAVRTILGSTIYGGRIDNE 4350
4351 FDMRLLYSFLEQLFTPSAFNPDFPLVPSIGLSVPEGTTRAHFMKWIEALP 4400
4401 EISTPIWLGLPENAESLLLSNKARKMINDLQKMQSSEEDGEDDQVSGSSK 4450
4451 KESSSSSSEDKGKAKLRATITEWTKLLPKPLKQLKRTTQNIKDPLFRCFE 4500
4501 REISTGGKLVKKITNDLANLLELISGNIKSTNYLRSLTTSISKGIVPKEW 4550
4551 KWYSVPETISLSVWISDFSKRMQQLSEISESSDYSSIQVWLGGLLNPEAY 4600
4601 ITATRQSASQLNGWSLENLRLHASSLGKISSEGGASFNVKGMALEGAVWN 4650
4651 NDQLTPTDILSTPISIATLTWKDKDDPIFNNSSSKLSVPVYLNETRSELL 4700
4701 FSIDLPYDQSTSKQNWYQRSVSISSWKSDI 4730
Positively and negatively influencing subsequences are coloured on a scale running from negative (non-nuclear) to positive (nuclear).
What does the NucPred score mean?
You have to choose a NucPred score threshold. Sequences that score greater than or equal to this threshold are predicted to spend some time in the nucleus. Higher thresholds yield fewer predicted nuclear proteins, but those predictions are more accurate, so you can have higher confidence in them. The table below gives the performance of NucPred at each threshold, estimated by cross-validation on the sequences it was trained on. Another benchmark is available in the Bioinformatics 2007 paper.
NucPred score threshold | Specificity | Sensitivity |
see above | fraction of proteins predicted to be nuclear that actually are nuclear | fraction of true nuclear proteins that are predicted (coverage) |
0.10 | 0.45 | 0.88 |
0.20 | 0.52 | 0.83 |
0.30 | 0.57 | 0.77 |
0.40 | 0.63 | 0.69 |
0.50 | 0.70 | 0.62 |
0.60 | 0.71 | 0.53 |
0.70 | 0.81 | 0.44 |
0.80 | 0.84 | 0.32 |
0.90 | 0.88 | 0.21 |
1.00 | 1.00 | 0.02 |
Sequences which score >= 0.8 with NucPred and which are predicted by PredictNLS to contain an NLS have been shown to be 93% correct with a coverage of 16%. (PredictNLS by itself is 87% correct with 26% coverage on the same data.)
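The threshold rule and the performance table above can be applied to a score programmatically. The following is a minimal sketch, not part of NucPred itself: the function names `is_predicted_nuclear` and `performance_at` are hypothetical, and the figures are simply copied from the table, so they carry the same cross-validation caveat.

```python
from bisect import bisect_right

# (threshold, specificity, sensitivity) rows copied from the table above.
PERFORMANCE = [
    (0.10, 0.45, 0.88),
    (0.20, 0.52, 0.83),
    (0.30, 0.57, 0.77),
    (0.40, 0.63, 0.69),
    (0.50, 0.70, 0.62),
    (0.60, 0.71, 0.53),
    (0.70, 0.81, 0.44),
    (0.80, 0.84, 0.32),
    (0.90, 0.88, 0.21),
    (1.00, 1.00, 0.02),
]


def is_predicted_nuclear(score, threshold):
    """Apply the rule from the text: score >= threshold means predicted nuclear."""
    return score >= threshold


def performance_at(score):
    """Return the table row for the highest listed threshold that `score` meets.

    Returns None when the score is below the lowest listed threshold (0.10).
    """
    thresholds = [row[0] for row in PERFORMANCE]
    idx = bisect_right(thresholds, score)  # number of thresholds <= score
    return PERFORMANCE[idx - 1] if idx else None


if __name__ == "__main__":
    score = 0.94  # the score reported for P34036 on this page
    threshold, specificity, sensitivity = performance_at(score)
    print(f"score {score} meets threshold {threshold:.2f}: "
          f"specificity {specificity}, sensitivity {sensitivity}")
```

For the score of 0.94 reported on this page, the highest threshold met is 0.90, where the table lists a specificity of 0.88 and a sensitivity of 0.21.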