Xenopus tropicalis cytochrome P450s

The 2006 Bioinformatics class searched the ESTdb for matches to human P450s in the species Xenopus tropicalis. There are over 1 million ESTs for this frog species in ESTdb. The yellow sequences were turned in for assignment #1. Additional sequence was added later to complete the sequences by walking upstream or downstream on more X. tropicalis ESTs. The refseq_rna database was also searched. (In progress, Jan. 23, 2006.)

Note: there are 22.9 million reads for X. tropicalis in the Trace Archive. Incomplete sequences can be completed by chromosome walking with the MegaBlast server.

This file now has 55 sequences: 52 from X. tropicalis and 3 from X. laevis. 50 of the 52 are full-length CYP sequences. I have just added a cluster of 25 genes and some pseudogenes on scaffold 55. That will push the gene count up near 75.
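To keep track of the sequence count as records are added, the file can be split on its ">" header lines. The following is a minimal sketch, not the method used to produce the counts above. The function name split_records is hypothetical, and it assumes every record starts with a ">" at the beginning of a line. Partial exon fragments also carry their own ">" header, so the number of headers is not necessarily the number of genes.

```python
def split_records(text):
    """Group lines into records; a record starts at a line beginning with '>'.

    Assumption: annotation lines and sequence lines that follow a header
    are kept together in 'body' without trying to tell them apart.
    """
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            records.append({"header": line[1:].strip(), "body": []})
        elif records and line.strip():
            records[-1]["body"].append(line.strip())
    return records


example = """\
>CYP2D45
NM_001015719.1 CX969358.1 54% to 2D6 E.Mahrous
MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPP
>CYP2U1
CX851239.1 CX439683.1 CX959423.1 DR836116.1
"""

records = split_records(example)
print(len(records), [r["header"] for r in records])
```

Headers that were accidentally run together with sequence text would need to be separated by hand before a count like this is reliable.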
>BX728777
CX904306.1 CYP1A1 N. Abdletawab
552208048 411550065 409289324 388847477
62629_prot from UCSC browser scaf 287 (+) 1408174-1414975
62% to 1A1, 57% to 1A2, 90% to 1A6, 91% to 1A7
MMDNSTTTEVLVASIVFAIVFLVIRSQRVKLPPGTKKLPGP
MPYPVIGNLLSLSKNPHLSLTKMSETYGDVFQIQIGTKPMLVLSGLETLRQALIRQSDEF
AGRPDLFTFRLVGDGQSMTFSSDSGEV
WRARRRLAQNALKTFATSPSPTSSNSCLVEENIITEAEYLIRKFKELIDDKGEFDPYRYV
VVSVANVICGMCFGKRYNHDDEELLNVVNLTDEFGAAAASGNPADFIPILQYFPNSSMKA
FKEINQKFLAFMQKFTKEHYKTFDKNHIRDITDSLIQHSQEKRVDENSDIQLSNEKIVNI
VNDLFGAGFDTITTALSWSLMYLVAHPNIQQRIQDELDQVIGRERRPRLSDRAQLPYTE
AFILEMFRHSSFMPFTIPH (1)
CTTKDTMLNGYFIPKGICVLINQWQVNHDP(2)
NLWQDPFKFCPERFLNNDGTMVNKTEMEKVMIFGL
GKRRCVGEAIGRMEVFLFLTTMLQQMQFFKQDGEKLDMSPQYGLTMKHKR
CHLTAKLRFALLTN*
>DN053435 DN024870 51% to CYP1A8P ortholog DN024871 mate pair to DN024870 DN025714.1
MESAVKKTLMDMMPMLLKASISFLTVLLVMSILWKKRNSLPGPWAVPI
VGNFFQLGDQIHITLTDMRNRYGDVFQIKLGLMPIVVVSGLETVKRVLLKEGENFADRPN
FYSFSLFSNGSSMTFSEKYGESWKIHKKIMKNALRNLSNESTNSSNCSCRLEEYVCAEAS
DLVQELTDLSAEKVAFDPSQSIVITVANVVCALSFGKRYDHHDKEFLTLIDFNNDLRKA
AGGGLLADFIPILRFIPSSSVKALKKFVQSFHSFIAKCVKDHFATFEENNIRDITDA
LIQLCKERKSEDKNQLLSDDQIISTVNDIFGAGFDTITSALLWAIFYLLRYPEFQDKIHK
EIEEKIGCNRAPRFNDRKDLHYTEAFINEVLRHSSFVPFGLPHCTTMDTKLNGYFLPKGT
CVFTNLYQVNHDNTVWKDADMFMPERFLDQNGQIIKSLTEKVLVFGMGVRKCLGEDVARN
EMFVIMTIMMQRLKLVKSTKHELDPIPVYGLTLKPKPYYLVAKVRT*
>CX846813.1 C.Blackwell 1B1 as query 55% to 1B1 ortholog
CL126458.1 from GSS, Trace archive 483147144 391272900 233714403 422555774 (from Trace search with Human DNA for last part) 483233841
MNWKIWEDLGQSSVPKLLLSFLCALTVAHILKWIHEWIIPRWIRS
SQPPGPFPWPLFGNALQMGSYPHLAFIDLAKRYGNIFQIKLGSQKIVVLNGDLVIRHALL
HKGEDFAGRPKFTSYQFVSGGRSLAFGCYTEKWKAHRKLAHSTVRAFSTGNPQTKRCLAE
NVLKEARDLIALFSELGQGGKYFYPGRHTVVSVANVMSAVCFGRRYQHGDLEFQSLLSNN
DKFTRSVGAGSLVDVMPWLQRFPNPVRSVFRSFQQ (1)
VNYEFYDFVYKKFLLHRNTANQAV
TRDMMDAFIHILITKEGKVRADDADGGEEKGKNGQYFFHSLEAEHVPS
TVTDIFGASQDTLSTALQWVIFFLVR (2)
YPEIQTKLQDEMDRVIGKDRLPCIEDQPKLPYLMAFLYEF
MRFSSFVPITIPHATTKNTTIMGYQIPKDTVVFVNQWSVNHDPQKWSNPGEFNPSRFLDD
NGLINKDLVSNIMIFSVGKRRCIGEELSKIQLFMFSSILLHQCIFTALPADNLNPKGDYG
LSIKPKPFRISMTLRHGSMDLLNNSVLSGMAE*
>CYP2D45
NM_001015719.1 CX969358.1 54% to 2D6 E.Mahrous
MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPP
SPPSWPFVGNLLQMDFRDLHNSFKQLSKQYGDVMSLRVFWKPTVVLNGFEVIKEALIQ
KSEDTADRPPFNLYEILGFVGNNKAVVLANYGQSWKDLRRFTLSTLRDFGMGKKSLEE
RVRDEAGYLCDAFQSEQGGPFDPHVLINTAVSNVICSIIFGERFEYDDHKFLKLLCLI
EESIKAESGPVPQIISSLPWSSKVPGLARLFFQPRIHMLQYLQEIINEHKQTWDSGHT
RDFIDAFMLEMKKAKGVKDSNFNDQNLLLTTADLFSAGSETTTTTLRWGLLFMLLYPD
VQRKVQEEIDQVIGRTRKPTMGDVLQMPYTNAVIHEIQRYADIIPLSVPHMAYRDTHI
KGFFIPKGTVIMTNLSSVLKDEKVWEKPFQFYPEHFLDRDGKFVKREAFMAFSAGRRV
CLGEQLARMELFLFFTSLLQRFSFQIPDGEPCLREDPVFVFLQVPHDYKICAKVR
>CYP2D.1 scaffold_160:807096-818137 (-) strand UCSC browser
52% to 2D6
MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDSSSLSNSFRQ
(0)
LKKQYGDVFSLQFYWQNVVVLNGYEAIKEALLQKSEDFADRPPFELYEGIGFTGNNK (1)
GVVTAKYGQSWKDLRRFTLSTLRDFGMGKKSLEERVGEEAGYLCDAFLSEQ (1)
GQLFDPHYKLNTAVANIISFIVFGDRFDYDDYKFQKLLNLNQAMFEVESGTMAQ (0)
IATAIPWLAKLPGLAKMIYRPHVDVLEYLQKIISDHQKTWNPACTRDLIDAFTLQMEK (0)
AKGDKENHFNEKNLLFTTFDLFTAGSETSTTTLRWGLLYMLQYPDVQ (1)
RKVQEEIDKVIGKSRKPVMADVLQMSYTNAVIHEIQRCADLVPLSLIHMTYRDTEVQGFSIPK (0)
GVAVIPNLSSVLKDEKVWEKPFQFYPEHFLDADGKFVKQEAFLPFST (1)
GRRACLGERLARMELFLFFTSLLQRFSFQIPDGEPCPRDDPIVYIVQIPHPYKLCAKIR*
>CYP2D.2 scaffold_160:866974-882965 (-) strand UCSC browser
DR873330.1 Trace archive 408392602, 234381521
MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDFSSLSFRQ
(0)
LRKQYGDVFSLQLGWQNVVVLNGYEAIKEALLQKSEDFADRPPFELYEGIGFTGNNK
(1)
GVVLANYSQSWKDLRRFTLSTLRDFGMGKKSLEEKVREEAGYLCDAFQSEQ
(1)
GQLFDPHYKLNTAVANIMNSIVFGDRFDYDDYKFQKLLNLNQEMFEVEFGTMAQ
(0)
IATAIPWLAKLPGLAKMIYRPHVDVLEYLQKIISDHQKTCNPACTRDLIDAFTLEMEK
(0)
VKGDKENYFNEKNLLFTAFDLFTAGSETSSTTLRWGLLYMLLYPDVQ
(1)
RKVQEEIDQVIGKSRKAAMADVLQMSYTNAVIHEIQRCADLVPLSVTHMTYRDTEVQGFSIPK
(0)
GVAVCPNLSSVLKDEKVWEKPFQFYPEHFLDADGKFVKQEAFLPFST
(1)
GRRACLGERLARMELFLFFTSLLQRFSFQIPDGEPCPRDDPIVYIVQFPHPYKLCAKIR
>CYP2D.3 scaffold_160:1923301-1927860 (+) strand then a gap
UCSC browser
Trace archive to fill in seq gap with exon 4 387743496
241672823 to finish exon 7
418485537 walking down, 479264026 = exon 8, 248788894 = exon 9
5aa diffs to 2D45
MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDFRDLHNSFKQ
(0)
LSKQYGDVMSLQVFWKSMVVLNGFEVIKEALIQKSEDTADRPPFNLYEILGFVGNNK
(1)
AVVLANYGQSWKDLRRFTLSTLRDFGMGKKSLEERVRDEAGYLCDAFQSEQ
(1)
GGPFDPHVLINTAVSNVICSIIFGERFEYDDHKFLKLLCLIEESIKAESGPVPQ
(0)
IISSLPWSSKVPGLARLFFQPRIHMLQYLQEIINEHKQTWDSGHTRDFIDAFMLEMKK
(0)
AKGVKDSNFNDQNLLLTTADLFSAGSETTTTTLRWGLLFMLLYPDVQ
(1)
RKVQEEIDQVIGRTRKPTMGDVLQMPYTNAVIHEIQRYGDIIPLSVPHMAYRDTHIKGFFIPK
(0)
GTVIMTNLSSVLKDEKVWEKPFQFYPEHFLDRDGKFVKREAFMAFSA
(1)
GRRVCLGEQLARMELFLFFTSLLQRFSFQIPDGEPCPREDPVFVFLQVPHDYKICAKVR*
>CYP2D.4 scaffold_160: 1936385-1938199 (+) strand (exons 6-9)
BX707908.1, CX969358 mate = CX969359 3aa diffs to 2D45
MSLLSQLCPFALGCNVVTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDFRDLHNSFKQ
LSKQYGDVMSLRVFWKPTVVLNGFEVIKEALIQKSEDTADRPPFNLYEILGFVGNNK
AVVLANYGQSWKDLRRFTLSTLRDFGMGKKSLEERVRDEAGYLCDAFQSEQ
GGPFDPHVLINTAVSNVICSIIFGERFEYDDHKFLKLLCLIEESIKAESGPVPQ
IISSLPWSSKVPGLARLFFQPRIHMLQYLQEIINEHKQTWDSGHTRDFIDAFMLEMKK
AKGVKDSNFNDQNLLLTTADLFSAGSETTTTTLRWGLLFMLLYPDVQ
RKVQEEIDQVIGRTRKPTMGDVLQMPYTNAVIHEIQRYGDIIPLSVPHMAYRDTHIKGFFIPK
GTVIMTNLSSVLKDEKVWEKPFQFYPEHFLDRDGKFVKREAFMAFSA
GRRVCLGEQLARMELFLFFTSLLQRFSFQIPDGEPCPREDPVFVFLQVPHDYKICAKVR*
There are some assembly difficulties at scaf 160 in this region. Some duplicate exons exist. The D.3 and D.4 may be the same gene. Only 4 aa diffs.
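Difference counts like the one above can be checked by comparing two sequences position by position. This is a minimal sketch under stated assumptions: the helper names strip_markup and count_diffs are hypothetical, the two sequences must be the same length after cleanup (no alignment is performed), and the input must contain sequence lines only, not annotation lines. The two strings in the example are the second exons of CYP2D.3 and CYP2D.4 as listed in this file.

```python
import re


def strip_markup(seq_text):
    """Remove intron phase markers such as (0), (1), (2), then drop
    coordinates, whitespace and the '*' stop symbol."""
    no_phase = re.sub(r"\(\d\)", "", seq_text)
    return re.sub(r"[^A-Za-z]", "", no_phase).upper()


def count_diffs(seq_a, seq_b):
    """Count mismatched positions between two equal-length sequences."""
    a, b = strip_markup(seq_a), strip_markup(seq_b)
    if len(a) != len(b):
        raise ValueError("sequences differ in length; align them first")
    return sum(1 for x, y in zip(a, b) if x != y)


d3_exon2 = "LSKQYGDVMSLQVFWKSMVVLNGFEVIKEALIQKSEDTADRPPFNLYEILGFVGNNK (1)"
d4_exon2 = "LSKQYGDVMSLRVFWKPTVVLNGFEVIKEALIQKSEDTADRPPFNLYEILGFVGNNK"
print(count_diffs(d3_exon2, d4_exon2))
```

Running the same comparison exon by exon shows where the differences cluster, which may help decide whether two entries are one gene assembled twice.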
>DN017333.1 51% to 2C8, Ramy Naguib Attia
cannot extend in the ESTdb
MSPSIFTLLIFVLLVLLSIMWWKKNLKDRSLLPPGPTPLPFLGNLLQVKPKEFLKALDK (0)
LKEKHGSVFTVYFGARPTVILCGYQTVKEALIDQADTFSSRGKMALAEHILKGY
(1)
GITGSNGERWKQLRRFALTTLRNFGMGKRTIEKRIQEETTFLIEEFRNAE
(1)
GMPFDPTFYLGCAVSNIICSIVFGERFDYNDKQFLFLLKNINKVLRFMNSTWGV
(0)
VFFTFDKIMCHIPGPHQKAMKHLVDLKAFVQQRVRESKEILDINSPQHFIDCFLIKMQE (0)
EQENPHSEFHMDNLIGSALNLFFAGTETVSTTLRYGILILLKWPHIQ
(1)
GRIQEEIDDVIGRQQCPKIEDRSKMPYTDAVIHEIQRFSDIVPTGLPHTATQDTTFRGHTIPK
(0)
GTDVFALLTTVLKDPEVFQNPEEFNPERFLDENGILKKSQAFMPFSA
(1)
GKRMCPGESLARMEIFLFLTTLLQKFTLIPTVPSVDLDVTPEISSSGHLPREYKMCVLPTT*
>CYP2Q2 CX972427.1 best hit to CYP2A13 in ESTdb X.tropicalis S.Hill, A.Bolen
NM_001010998.1 89% to CYP2Q1, 55% to 2A6 same as CX972427.1
from refseq database
MDTSWLWTLLLCLLISAMLIYSTWNKMYRKRNLPPGPTPIPLFG
NVMQIKRGEMVKSLIELGKKYGDIYTLYFGPSPVVILCSYRAIKEALIDQAEEFSGRG
AIPSFDQYFQGYGVVFTNGEEWKNLRRFSLSTLRNFGMGKRGIEERIKEEAQFLVAEI
KSYKEKPFDPTNILVQ
CVSNVICSVVFGNRFEYANKDFQNLLSLFQSVFQETSSSWGQLLNMLPAVMNHVPGPHKNIIRDMNKLEDFVLQRVKENEKTVDPNSPRDLIDSFLIKMQQENKNPTSPFHMKNLIATILSIFFAGTETVSTTLRHGFLILLIHPEIEAKLQEEIDRVVGQNRSPTIEDRNKMPYTDAVIHEIQRLSDVIPMNVPHLVTKDTKFRGYTIPKGTNIYPLLCAVLRDPEQFDTPSKFNPNHFLDDKGCFKSNDGFMPFSTGKRICLGEGLARMELFLFLT
NILQNFKLHSESGLTEDNIAPKMKGFANYPTSYQLSFIPR
>CYP2Q3
NM_001010999.1 54% to 2A6 79% to NM_001010998.1, 78% to 2Q1
from refseq database
MDTTWLWSLQLFLLIATMLIYSTWNKMYRKRNLPPGPTPIPLFG
NVLQIKRGEMVKSLLELGKKYGPVYTLYFGPSPVIILCDYQSIKEALNDQAEEFSGRG
KIPSWDQFFQGYGESFSNGDEWKQLRRFSLTTLRNFGMGKRGIEERIQEEAQFLVAEI
KSYKGKPFDPTKILVQCVSNVICSVVFGQRYEYSNKDFHKLLYMFQAVFEDTSSTLGQ
LMTLLPNIMNHIPGPHKTVVNKLNKVNDFILQRVKENEKTLDPNSPRHFIDSFLIQMQ
KEKDNPVTKFHWKNLLCTIMNLFFAGTETVSTTLRHGFLMLLIHPEIEEKLHEEIDRV
VGQDRSPTIEDRSKMPYTDAVIHEIQRFSDVLPMSLPHLVMKDTQFRGYTIPKGTDVY
PLICAALRDPKQFATPNKFNPQHFLDDNGLFKSSNAFLPFSTGKRICLGEGLARMELF
LFLTNILQNFKLHSENQFAEDDIAPKMNGFANYPLSYEFSLIPRVQSLLVL
>CX329225.2
DR834894.1 CX379987.2 Yun Peng 74% to human 2R1
MFPPVPLVALVAAALLIGGFLVRQIVKQRKPRGFPPGPPGLPLIGNILA
LASDPHVYMKKQSKIHGQ
(0)
IFSLDLGGISTVVLNGYDAV
KECLVRQSDVFADRPSLPLFKKLTNMGGLLNAKYGRCWTEHRKLAVSCFRTFGCSQKSFE
SKISEECLFFLDAIDSYKGKALDPKHLVTIAVSNVSNLILFGERFRYDDNDFLHMIEIFS
ENIELATSAWVFLYNAFPLIGFLPFGKHQQLFRNASEVYDFLLQIIGRFSENRKPQSPRH
FIDAYMDEMERNEAD
PDSTYSMENLIFSVGELIIAGTETTTNVLRWAMLFMALYPNIQGQVQKEIDGVVGLNRMPTFEEKSRMPYTEAVLHEILRYCNIAPLGIFHATSRDTVVRGYSIPEGTTVITNLYSVHFDEKYWTDPEIFYPERFLDSAGQFTKKEAFVPFSLGRRHCLGEQLARMEMYLFFTALLQRFHLHFPQGFVPNLRPKLGMTLQPHPYVICAERR*
>CX850388.1
different from 2R1 above 91%
MFPPVPLVALVAAALLIGGFLVRPIVKQRKPRGFPPGSPGLPLIGNILALASDPHVYLKK 184
QSKIHGQIFSLDLGGISTVVLNGYDAVKECLVRQSDVFADRLSLPLFKKLTNMGGLLNAK 364
YGRCWTEHRKLAVSCFRTFCCSQKSFESKISEECLFFLEANDSYE
>CYP2U1
CX851239.1 CX439683.1 CX959423.1 DR836116.1
best hit to CYP2U1 in ESTdb X.tropicalis M.Puljic
best match in human = CYP2U1 63%, CYP2U1 ortholog
MSDLAQDSMSGTLDWKQMGYASWSLLGDCASVSALLLYIALFLGLYLLMGSLWRYYQI
IHSNAPPGPTPWPIVGNFAFMLMPGWLM
QLLNFGIAKGKLRRVPAGATRRGAFLYPHIVLTEMAKMYGKIYGLYIGTRLMVILNDFNS
VKDALVSHSEVFSDRPSVSLVTIITKRKGIVFAPYGPIWRQQRRFSHSTLRYFGLGKLSL
EPKIIEEFKYVKAEMLKFGNKGFSPFEIINNAVSNVICSISFGKRFNYEDKEFKTMLSLM
SRGLEISVNSEAVLICLCSWLYYLPFGPFKELRQIVIDITAFLKRIIAE
HQVTLDPANPRDFIDMYLLHIKEEQKGQAESIFNTEYLFYIIGDLFIAGTDTTTNTLLWS
LLYMCLYPDVQEKVQAEIDTVIGRDRPPSLTDKSQMPFTEATIMEVQRMTVVVPLSVPHM
ASESSVFHGYTIPKGSVVMANLWSVHRDPKVWEKPNDFMPKRFLDENGQILKKEAFIPFG
IGRRVCMGEQLAKMELFLMFVNLLQSFSFSLADDTFKPSLEGRFGLTLAPYPFDIKITKR
*
>DN060997.1
DR833173.1 DR842090.1 CF374775.1
best hit to CYP2S1 in ESTdb X.tropicalis H.Penmatsa, K. Iyer, G. Vasser
best match in human = 2C18 55%, 47% to 2S1, not a 2S ortholog
MEILGATAVLLVICAF
FLLLNTIQVIRRQGKGKLPPGPTPLPFLGNFLQLRGEEVFKSLLEFGKKYGPVYTIHLGM
EPVVVLCSFDIVKEALNDNGDEFGARGHMPLLEKISHGGHGVVASNGERWKQLRRFSLMT
LRNFGMGKRSIEERIQEEAHFLTNEFKYTKGQPVDP
TFYFSKAVSNVICSVVFGDRFEYEDTEFLRLLGLLNQVFRGFSSVWGQLYNIFPKV
MGKLPGPHNMIFKSVNSLQEFIMQRINMHQET
LDPSSPRDFIDCFLIKMQQEKDVPQTEFHMQGALNTTFDMFGAGTETVSTTLRYGLLILL
KHPDIEERIQKEIDSVIGRNRAPCIEDRSRMPYTDAVIHEIQRFVDIIPMGIPHKVTRDI
QFQGYFIPKGTTVYPMLSSVLHDPKQFKYPDIFNPGHFIDENGKFCKNDGFMPFSSGKRI
CVGEGLARMELFLFITTILQNYTLRSPVDTEDLDLTPELSGFGNIPRPYKLCFIPR*
>NM_001001212.2
51% to 2C18
MDWALEINGLPILLLIAALLLLLARKVGKKVKGCLPPGPKPLPI
LGNLLQLKSREIHKPLLEFNKKYGPVYTLYMGSMPAVVLCGYEAVKEALVDNAEKFSG
RAEVPIVNLTTQGYGIAFSNGERWKELRRFSLTTLRNFGMGKRSIEERIQEEIHFLLE
AFHETQGSFFSPAFIIRRSVSNVICSVVFGKRFDYTDQKLQILLDLIAENLRRVDNIW
VQVYNFIPKLLNILPGPHHKLTENYKAQLRYVEEIVQEHGKTLDPSAPQDYIDAFLLK
MEQERKKAHTEYNVQNLLSCSLDIFFAGQESTSSTLGYGLLILMKYPHIKEKVQAEIE
SVIGRSRRPCMDDRAKMPYTEAVIHEIMRFIDFFPLGVPHSVTEDTLYRGYVIPKGTT
IFPFLHSVLFDPSMFERPQEFYPGHFLNQDGSFRKNEGFMAFSAGKRACPGKSLARVE
IFLYLTSILQQFDPQPALSPKDIDLSPEYSGFGKMAPSFQLKLVPH
>NM_001005711.1
45% to 2C8
MEPLTIFLCLFIFLLLLFTWKTHKRRVQLPPGPYPLPLLGNVLQ
GITVLYDSYRKLSEQYGPVFTVWLGSTPMVVLCGYEVLKDALINHSQEFGARGAFPVP
ERLTDGYGVISTNGTRWQQLRRFSVTVLRNFGMGKRSMEERIHEETQHLIQAVQHTGG
EAFDPLYLLGRAVNNIINLIVFGRRWDYKDKMMIKLFNIINSILLFLRSPLGVIYSAL
YQIMQHLPGPHQKIFHDSETVKSFIREQINSHKETLDSDSPRDYIDCFLIKANQEKDH
HSSEFSQENLVNTVFDFFVAGTETATNTIQFSLLVIITYPHIQAQVQKEIDKVVGPDR
LPGIADRAQMPYTNAVIHEIHRFLDLVPLSLPHMATQDTVCRGFRIPKGTTVIPLIGS
ALCDPAHWETPEEFNPEHFLNQNGEFYIPPAFMPFSAGKRVCLGEGLARMEIFLFFTA
LLQKFTIRVANQTDTFNLRTLRRAFRKKGLFYQLRAMPRTCTVEK
>NM_001004777.1
(gap missing C-helix, 22 aa)
CX454308.2 69% to NM_001035117
MDRKQPYKTLMEVSKKYGSVFSVRVGPLKMVVLCGYDTVKDALLNYPDEFADRPALPLFD
ELVKGHGIIFSNGENWKVMRRFSLSTLRDFGMGKKTIESKIIEECDHLVQKFNSYGGKPFDNTM
IMNAAAANIIASILLSHRFHYENPTLLRLLKLVNENMRLMASPIALLYNTYPSIMRWV
PGCHKTIYNNAQELMEFIRETFSKHKVELDINDQRNLIDAFLSRQQEEKPHSAKYFHD
DNLTILVIDLFAAGMETTSTTLRWALLLMMKYPEIQKKVQDEIEKVIGSVEPRAEHRK
EMPYTDAVLHEIQRFANITPMNGPHATTKDVTFRGFFLPKGTYVIPLLASVLKDENYF
EKPNEFYPEHFLDSEGHFMKNEAFLPFSAGRRSCAGENLARMELFLFFTSLLQNFTFQ
APPGEELDLTPDVGGTVPPRPHTVCALPRS
>NM_001004878.1 66% to NM_001035117, 51% to 2K17 zebrafish
MDRKQPYKALLKVSKKYGPVCSFQIGPLKTVVLCGYDTVKDALLNDEFADRPAMPMLD
DVAKGHGILSSNGENWRVMRRFALSTLRDFGMGKKTIESKINEE
CDHLVQKFSSYGGKPFDTTMIMNAAVANIIASILLSHRFHYENPTLLRLLKLVNENTK
FMASRIAMLYNTFPSIMRWIPGCHKSIYKNAQELLEFIRETFSKQKVELDINDQRNLI
DAFLSRQQEPNSGKYFHDDNLTILVFDLFVAGMETTSTTLRWALLLMMKYPEIQKKVQ
DEIEKVIGSAEPRAEHRKEMPYTDAVIHEIQRFANIFPMNGPHATTKDVTFRGFLIPK
GTFVIPLLASVLKDENYFKKPNEFYPEHFLDSEGHFVKNDAFLPFSAGRRSCAGENLA
RMELFLFFTSLLQNFTFQAPPGEELDLTPDVGIATPPMQHTVCALPRA
>scaffold_21945:198-3026
1 aa diff to scaffold 55 (-) 506398-495841
198
EKVHDEISRVIGSAHPTYSHRTQMPFTNAVIHEILRFADIVPLSVPHETTRDVHFKGYFIPK
(0) 1158
2607 GTYIIPLLSSVLKDKTQFDAPEEFNPNHFLDSEGNFLKKEAFMPFSA (1) 2747
2847 GRRACPGEILARMELFIFFTSLLQKFSFRPPPGVTNINLSSDVGFTSVPLEGMICAIPRA
3026
>scaffold_2219:41253-41363 (-) exon 2 partial
41363 LWKKYGSIFSVQIGSQKMVVLCGYETVKDALVNYAEE 41253
>scaffold_3861: 233-373 (+) exon 8, this scaffold has large gaps
100% to second exon 4-9 86% to 21818_prot
233 GTFVIPLLTSVLYDQTRFEKPKEFYPQHFLDSEGNFVKNEAFLPFSA 373
>scaffold_3861:8120-8284 (+) exon 2, this scaffold has large gaps
100% to 21818_prot
8120 LWKKYGSIFRVQIGSQKMVVLCGYETVKDALINHGEEFSERPRLPIFQVIANGY 8284
>scaffold_3433:6638-6781 (+) exon 6, 95% to DT436641.1
6638 AKHPETYSYFHNENLVRLVRNVFSAGVETTSTALRWALLLMIKYPDIQ 6781
>scaffold_3433:23829-23954 (+) exon 4
100% to Green = DT436641.1 75% to 21819_prot
23829 FVFSLGKPFDNTMIMNAAVANIIVSIVLGHRFDYQDPKFLRL 23954
>scaffold_2615 : 2-55 (+) exon 9
2 VGFTSVPLEGMICAIPRA 55
>scaffold_2899 : 33841-33948 (+)
94% to 49369_prot scaffold_996:232793-245538
33841 GTQVIPLLASVLQDETYFEKPEEFYPQHFLDSEGLF 33948
>scaffold_590 : 150217-150324 exon 8, 83% to $$$$$$8
150217 LWDKPYFEKPDEFYAQHFLDSEGNFVKNEAFLPFSA 150324
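The numbers flanking each exon fragment above appear to be nucleotide coordinates on the scaffold, so the span should be roughly three times the peptide length. The sketch below is a quick consistency check for transcription errors, assuming that reading of the coordinates. The function name span_matches_peptide and the tolerance of 2 nt (to allow for a codon split by a phase 1 or phase 2 intron) are illustrative choices, not part of the original annotation.

```python
def span_matches_peptide(start, end, peptide, tolerance=2):
    """Return True if the nucleotide span is about 3x the peptide length.

    start/end may be given in either order, so (-) strand exons work too.
    """
    span_nt = abs(end - start) + 1
    return abs(span_nt - 3 * len(peptide)) <= tolerance


# scaffold_3861 exon 8, (+) strand
exon8 = "GTFVIPLLTSVLYDQTRFEKPKEFYPQHFLDSEGNFVKNEAFLPFSA"
print(span_matches_peptide(233, 373, exon8))

# scaffold_2219 exon 2 partial, (-) strand
exon2 = "LWKKYGSIFSVQIGSQKMVVLCGYETVKDALVNYAEE"
print(span_matches_peptide(41363, 41253, exon2))
```

A False result would suggest a mistyped coordinate or a truncated peptide rather than a real biological feature.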
>14945_prot 55% to DN017333.1, 52% to 2C8
scaffold_1232:575-12890 (+) exon 3 partial
575
MAMDSAGTVLLAACVIVLFYLVKWRGNNKRKNLPPGPTAFPLLGNFLQVSTTEIPSSCVE 754
1024
LSKTYGPVFTLYLGGHRSIILIGYDAVKEALIDNSDVFSDRGEGGVSEMIFKNY 1185
3914
GVILSNGERWKTMRRFTLTTLRNFGMGKRSVEERIQEEARSLEEAFRKKK 4063
5417
DEPFDPIYLLGLAVSNIICSIIFGERFDYEDEKFMTLLMYIREFVKLLNSFFGM 5578
5936
LFNFFPNLFCYIPGPHQNIFTYFNKLKQFVKDEAKSHKDTLDANCPRDFIDCFLIRME 6109
8038
QEKNNPNSEFHYENLFGTILDLFLAGTETTSSTLRYAFLILLKYPEIQ 8181
9338
ENVYKEIVQVIGQHRYPSVEDRSKMPYTEAVIHEVQRIGDILPLGLEHAASKDTTFRGYDIPK 9526
11964 GTLIFPLLTSVLKDPKYFKNPDQFDPEHFLDENGCFKKNDAFMPFST 12104
12708
GKRVCAGEGLARMELFIFLTTILQKFILKSTVATEEIKITPEPNTNGSRPWPYKMFVVPRC 12890
>scaffold_16683:748-5473 = BC092552.1
MDPVSVLLSVVVCIFLFKVFYDGEKESQNFPPGPKPLPLIGNLHIINMEKPYLTFME
LAEKYGSVFSFHLGTEKVVVLCGTDAVRDALINHAEEFSGRPKVAIFDQIFKGH
GIIFADGENWKVMRRFSLSTLRDFGMGKKTIEEKISEESDCL
VETFKSHGGKPFDNTMIMNAAVANIIVALLLSQRFDYQDPTLLKLVKSINKIVRITGSSMVMLYNTF
PSIMQWIPGSHQNVVKNAEKIYTFLIETFTKHRHQLDVNDQRDLIDTFLIKQQEEKSSST
KFFHDENLKVLLLNLFGAGMETTSTTLRWGILLMMKYPEVQKKVQDEIDRVIGSAEPRLE
HQKQMPYTDAVIHEIQRFADLVPNNVPHATTKDVTFRGYFIPK
GTHVIPLLTSVLKDKDYFKKPNEFYPEHFLDSEGHFVKNEAFLPFSA
GRRICAGETLAKMELFLFFTNLLQNFTFQPPPGVEVQLTRGVAITSIPTEHKICALPRS*
>52542_prot scaffold_1232:27024-44511 (+)
MELGVTWSLILAVIVSFLVYSFTWRRKLRKINMPPGPPLYPLLGNMLQIS
AKEFPQSLVKLSEKYGTVFTVYLPSKPAVILSGYDCIKEALLDNNESFGA
RGESPLGYLLFKDYGVIFSNGERWKQLRRFSLSCLRDFGMGKKSIEERIQ
EEARCLVEELGKNGDTPMDPTYMLTLAVSNVICTVVFGERFDYKDEKFMT
LISLLKIVSRDFSSAWGIRSRRPRTRSCAQKLLNLFPNTLSRLPGPPQRL
FRNFDKLKAFVAESLKSHQETLNSDCPRDFIDCFLIKMEKEKNNPQTEFH
SDNLFGTVLDLFFAGTETTSITLKYSFLMLLKYTEVTRKAMEEIDNIIGQ
ERCPFYEDRIKMPYTNAVIHEIQRMADIVPLGVPHATTHDIIFRGYNIPK
DTIIFPLMTSVLKDPKYFNDPKQFDPAHFLDENGSFKKNDAFQPFSIGKR
SCLGEGLARMEIFLFITSILQAFNLKSDTAPQDIDITPEPDKNGAIPRTY
KMYFVPK
>14947_prot scaffold_1232:47102-62978 (+)
MAVLGIETLFLVCSFTFLVFLFSRRQRHARLPPGPTPLPLLGNVLQLDFSKQVKEFVKLGSQY
GPVSMVYLGPYPVLVLNGYDVVKEAFVDNGEVFSNRGKNAFIEMIFKGR
GVAFSNGERWRQMRRFSLSTLRDFGMGKRRVEERVQEE
ACALVEEFKKTKGTPFNSTYLMTLAVSNVICSVVFGERFDYQNETFLSVL
ALLKDTFKIITSPWTQLFSFAPGLLKHLPGPHKKAAENLDRLKTFVTEFV
ASHEETLEENFPRDYIDCFLIKMRQEKDNVNTEFDYENLFVTLMNLFFAG
TETTSITLQYGMLILLKYPDIQKKIHEEIDSVIGFNRCPSMEDRPKMPYT
DATIHEIQRFADIVPMGVPRSTNKDTTLRGYDIPKGTTVFPMLTCILKDP
RYFKDPESFNPCHFLDEKGCLKKTDAFIPFSIGKRVCLGEGLARMEIFLF
LTSILQRFELKCHMDPKDIDISPVPSKSAYMPRPYELYITPR
>52545_prot scaffold_1232:71267-83903 (+) short seq
exon 7 gap filled in by DT419848.1, missing exon 2
82% to 52547_prot scaffold_1232:122253-139910
71267 MDVAGLGTFLLVLITFILTLSSWNTMYKKVNLPPGPTPLPLIGNLMNIKKGKMVNSLMK
(0)
GLSFSNGERWRQMRHFTLKTLKNFGMGKKSIEEKIQEEALCLVEEIRKSG (1)
ETPVDPSKLIMDAVSNVFCSIMFGRRFEYNEEKFANLLTNVNEIFRLMSNTWGQ (0)
LESIFPSVMAYIPGPHKKKNTLSEELISFLHERVKSNQETFDPSAPRDFIDEYLMKIEQ (0)
EKKNPNSEFTMRNTLLTFFSIFLGGTETSTTTIKHGLLLLIKYPEIQ (1)
79425 AKLHMEIDHVIGRNRIVNINDRNAMPYMEAVINEIQRFSDIAPLNAPRKVTKDVQFRGYSIPK
(0) 79511
DTEIYPLLCTVHRDPKYFSSPYEFNPSHFLDEQGRFRKSEAMMAFSA (1)
GKRICPGESLARMELFLFFTTILQNFTLTSPTHFTEDDVAPKMAGFMNHPIQYKASFISR*
83903