Xenopus tropicalis cytochrome P450s
 
 
The 2006 Bioinformatics class searched ESTdb for Xenopus tropicalis matches 
to human P450s.  ESTdb holds over 1 million ESTs for this frog species.  The 
yellow sequences were turned in for assignment #1.  Additional sequence was 
added later to complete them, by walking upstream or downstream on more 
X. tropicalis ESTs.  The refseq_rna database was also searched.
(in progress, Jan. 23, 2006) 
Note: there are 22.9 million reads for X. tropicalis in the Trace Archive.
Incomplete sequences can be completed by chromosome walking with the MegaBlast server.
This file now has 55 sequences: 52 from X. tropicalis and 3 from X. laevis.
Of the 52, 50 are full-length CYP sequences.
I have just added a cluster of 25 genes and some pseudogenes on scaffold 55.
That will push the gene count up near 75.
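
The records in this file follow a loose FASTA-like layout: a header line starting with ">", free-text notes, and the protein split over several lines. The Python sketch below shows one way such records might be read back into whole sequences. It is an illustration only, not the procedure used to build this file. The name parse_records and the rule for recognizing a sequence line (a run of at least 5 amino acid letters, optionally flanked by coordinates or followed by a parenthesized digit) are assumptions.

```python
import re

# Assumed rule for a sequence line: optional leading coordinate, a run of at
# least 5 amino acid letters (optional "*" stop) or a lone "*", an optional
# parenthesized digit, and an optional trailing coordinate.
SEQ_LINE = re.compile(
    r"^\s*(?:\d+\s+)?"
    r"((?:[ACDEFGHIKLMNPQRSTVWY]{5,}\*?)|\*)"
    r"\s*(?:\(\d\))?\s*(?:\d+)?\s*$"
)


def parse_records(text):
    """Split notes-style text into records with header, notes, and protein."""
    records = []
    current = None
    for line in text.splitlines():
        if line.startswith(">"):
            current = {"header": line[1:].strip(), "notes": [], "protein": ""}
            records.append(current)
            continue
        if current is None:
            continue  # text before the first header, e.g. the introduction
        match = SEQ_LINE.match(line)
        if match:
            current["protein"] += match.group(1)
        elif line.strip():
            current["notes"].append(line.strip())
    return records


# Two short records copied from this file.
SAMPLE = """\
>scaffold_2219:41253-41363 (-) exon 2 partial

41363 LWKKYGSIFSVQIGSQKMVVLCGYETVKDALVNYAEE 41253

>scaffold_3433:23829-23954 (+) exon 4

100% to Green = DT436641.1 75% to 21819_prot

23829 FVFSLGKPFDNTMIMNAAVANIIVSIVLGHRFDYQDPKFLRL 23954
"""

records = parse_records(SAMPLE)
for record in records:
    print(record["header"], len(record["protein"]), record["notes"])
```

A note line that happens to consist only of capital amino acid letters would be misread as sequence under this rule, so the output should be checked by eye.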
 

>BX728777 CX904306.1 CYP1A1 N. Abdletawab

552208048 411550065 409289324 388847477

62629_prot from UCSC browser scaf 287 (+) 1408174-1414975

62% to 1A1 57% to 1A2, 90% to 1A6, 91% to 1A7

MMDNSTTTEVLVASIVFAIVFLVIRSQRVKLPPGTKKLPGP

MPYPVIGNLLSLSKNPHLSLTKMSETYGDVFQIQIGTKPMLVLSGLETLRQALIRQSDEF

AGRPDLFTFRLVGDGQSMTFSSDSGEV

WRARRRLAQNALKTFATSPSPTSSNSCLVEENIITEAEYLIRKFKELIDDKGEFDPYRYV

VVSVANVICGMCFGKRYNHDDEELLNVVNLTDEFGAAAASGNPADFIPILQYFPNSSMKA

FKEINQKFLAFMQKFTKEHYKTFDKNHIRDITDSLIQHSQEKRVDENSDIQLSNEKIVNI

VNDLFGAGFDTITTALSWSLMYLVAHPNIQQRIQDELDQVIGRERRPRLSDRAQLPYTE

AFILEMFRHSSFMPFTIPH (1)

CTTKDTMLNGYFIPKGICVLINQWQVNHDP(2)

NLWQDPFKFCPERFLNNDGTMVNKTEMEKVMIFGL

GKRRCVGEAIGRMEVFLFLTTMLQQMQFFKQDGEKLDMSPQYGLTMKHKR

CHLTAKLRFALLTN*

 

>DN053435 DN024870 51% to CYP1A8P ortholog DN024871 mate pair to DN024870
DN025714.1

MESAVKKTLMDMMPMLLKASISFLTVLLVMSILWKKRNSLPGPWAVPI

VGNFFQLGDQIHITLTDMRNRYGDVFQIKLGLMPIVVVSGLETVKRVLLKEGENFADRPN

FYSFSLFSNGSSMTFSEKYGESWKIHKKIMKNALRNLSNESTNSSNCSCRLEEYVCAEAS

DLVQELTDLSAEKVAFDPSQSIVITVANVVCALSFGKRYDHHDKEFLTLIDFNNDLRKA

AGGGLLADFIPILRFIPSSSVKALKKFVQSFHSFIAKCVKDHFATFEENNIRDITDA

LIQLCKERKSEDKNQLLSDDQIISTVNDIFGAGFDTITSALLWAIFYLLRYPEFQDKIHK

EIEEKIGCNRAPRFNDRKDLHYTEAFINEVLRHSSFVPFGLPHCTTMDTKLNGYFLPKGT

CVFTNLYQVNHDNTVWKDADMFMPERFLDQNGQIIKSLTEKVLVFGMGVRKCLGEDVARN

EMFVIMTIMMQRLKLVKSTKHELDPIPVYGLTLKPKPYYLVAKVRT*

 

>CX846813.1 C.Blackwell 1B1 as query 55% to 1B1 ortholog
CL126458.1 from GSS, Trace archive 483147144 391272900 233714403
422555774 (from Trace search with Human DNA for last part) 483233841
MNWKIWEDLGQSSVPKLLLSFLCALTVAHILKWIHEWIIPRWIRS

SQPPGPFPWPLFGNALQMGSYPHLAFIDLAKRYGNIFQIKLGSQKIVVLNGDLVIRHALL

HKGEDFAGRPKFTSYQFVSGGRSLAFGCYTEKWKAHRKLAHSTVRAFSTGNPQTKRCLAE

NVLKEARDLIALFSELGQGGKYFYPGRHTVVSVANVMSAVCFGRRYQHGDLEFQSLLSNN

DKFTRSVGAGSLVDVMPWLQRFPNPVRSVFRSFQQ (1)

VNYEFYDFVYKKFLLHRNTANQAV

TRDMMDAFIHILITKEGKVRADDADGGEEKGKNGQYFFHSLEAEHVPS

TVTDIFGASQDTLSTALQWVIFFLVR (2)

YPEIQTKLQDEMDRVIGKDRLPCIEDQPKLPYLMAFLYEF

MRFSSFVPITIPHATTKNTTIMGYQIPKDTVVFVNQWSVNHDPQKWSNPGEFNPSRFLDD

NGLINKDLVSNIMIFSVGKRRCIGEELSKIQLFMFSSILLHQCIFTALPADNLNPKGDYG

LSIKPKPFRISMTLRHGSMDLLNNSVLSGMAE*

 

>CYP2D45

NM_001015719.1 CX969358.1  54% to 2D6 E.Mahrous

MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPP

SPPSWPFVGNLLQMDFRDLHNSFKQLSKQYGDVMSLRVFWKPTVVLNGFEVIKEALIQ

KSEDTADRPPFNLYEILGFVGNNKAVVLANYGQSWKDLRRFTLSTLRDFGMGKKSLEE

RVRDEAGYLCDAFQSEQGGPFDPHVLINTAVSNVICSIIFGERFEYDDHKFLKLLCLI

EESIKAESGPVPQIISSLPWSSKVPGLARLFFQPRIHMLQYLQEIINEHKQTWDSGHT

RDFIDAFMLEMKKAKGVKDSNFNDQNLLLTTADLFSAGSETTTTTLRWGLLFMLLYPD

VQRKVQEEIDQVIGRTRKPTMGDVLQMPYTNAVIHEIQRYADIIPLSVPHMAYRDTHI

KGFFIPKGTVIMTNLSSVLKDEKVWEKPFQFYPEHFLDRDGKFVKREAFMAFSAGRRV

CLGEQLARMELFLFFTSLLQRFSFQIPDGEPCLREDPVFVFLQVPHDYKICAKVR

 

>CYP2D.1 scaffold_160:807096-818137  (-) strand UCSC browser

52% to 2D6

MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDSSSLSNSFRQ (0)

LKKQYGDVFSLQFYWQNVVVLNGYEAIKEALLQKSEDFADRPPFELYEGIGFTGNNK (1)

GVVTAKYGQSWKDLRRFTLSTLRDFGMGKKSLEERVGEEAGYLCDAFLSEQ (1)

GQLFDPHYKLNTAVANIISFIVFGDRFDYDDYKFQKLLNLNQAMFEVESGTMAQ (0)

IATAIPWLAKLPGLAKMIYRPHVDVLEYLQKIISDHQKTWNPACTRDLIDAFTLQMEK (0)

AKGDKENHFNEKNLLFTTFDLFTAGSETSTTTLRWGLLYMLQYPDVQ (1)

RKVQEEIDKVIGKSRKPVMADVLQMSYTNAVIHEIQRCADLVPLSLIHMTYRDTEVQGFSIPK (0)

GVAVIPNLSSVLKDEKVWEKPFQFYPEHFLDADGKFVKQEAFLPFST (1)

GRRACLGERLARMELFLFFTSLLQRFSFQIPDGEPCPRDDPIVYIVQIPHPYKLCAKIR*
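
Several records, including the one above, list one exon per line with a digit in parentheses at the end. Assuming that digit is the intron phase at the end of the exon (an interpretation, not something this file states), the Python sketch below collects the exons and their phases. The names parse_exons and phase_pattern are hypothetical.

```python
import re

# Assumption: a trailing "(0)", "(1)", or "(2)" gives the intron phase at the
# end of that exon; the last exon has no marker and may end in "*".
EXON_LINE = re.compile(r"^\s*([A-Z]+\*?)\s*(?:\((\d)\))?\s*$")


def parse_exons(lines):
    """Return a list of (peptide, phase) tuples; phase is None if unmarked."""
    exons = []
    for line in lines:
        match = EXON_LINE.match(line)
        if not match:
            continue  # skip blank lines and annotation lines
        peptide, phase = match.group(1), match.group(2)
        exons.append((peptide, int(phase) if phase is not None else None))
    return exons


def phase_pattern(exons):
    """Collect only the phases, e.g. for comparing gene structures."""
    return [phase for _, phase in exons]


# Last four exons of CYP2D.1 as listed above.
CYP2D1_TAIL = [
    "AKGDKENHFNEKNLLFTTFDLFTAGSETSTTTLRWGLLYMLQYPDVQ (1)",
    "RKVQEEIDKVIGKSRKPVMADVLQMSYTNAVIHEIQRCADLVPLSLIHMTYRDTEVQGFSIPK (0)",
    "GVAVIPNLSSVLKDEKVWEKPFQFYPEHFLDADGKFVKQEAFLPFST (1)",
    "GRRACLGERLARMELFLFFTSLLQRFSFQIPDGEPCPRDDPIVYIVQIPHPYKLCAKIR*",
]

exons = parse_exons(CYP2D1_TAIL)
print(len(exons), phase_pattern(exons))
```

Comparing the phase pattern of two paralogs is one quick way to see whether their exon boundaries line up.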

 

>CYP2D.2 scaffold_160:866974-882965 (-) strand UCSC browser

DR873330.1 Trace archive 408392602, 234381521

MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDFSSLSFRQ (0)

LRKQYGDVFSLQLGWQNVVVLNGYEAIKEALLQKSEDFADRPPFELYEGIGFTGNNK (1)

GVVLANYSQSWKDLRRFTLSTLRDFGMGKKSLEEKVREEAGYLCDAFQSEQ (1)

GQLFDPHYKLNTAVANIMNSIVFGDRFDYDDYKFQKLLNLNQEMFEVEFGTMAQ (0)

IATAIPWLAKLPGLAKMIYRPHVDVLEYLQKIISDHQKTCNPACTRDLIDAFTLEMEK (0)

VKGDKENYFNEKNLLFTAFDLFTAGSETSSTTLRWGLLYMLLYPDVQ (1)

RKVQEEIDQVIGKSRKAAMADVLQMSYTNAVIHEIQRCADLVPLSVTHMTYRDTEVQGFSIPK (0)

GVAVCPNLSSVLKDEKVWEKPFQFYPEHFLDADGKFVKQEAFLPFST (1)

GRRACLGERLARMELFLFFTSLLQRFSFQIPDGEPCPRDDPIVYIVQFPHPYKLCAKIR

 

>CYP2D.3 scaffold_160:1923301-1927860  (+) strand then a gap  UCSC browser

Trace archive to fill in seq gap with exon 4 387743496

241672823 to finish exon 7

418485537 walking down, 479264026 = exon 8, 248788894 = exon 9

5aa diffs to 2D45

MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDFRDLHNSFKQ (0)

LSKQYGDVMSLQVFWKSMVVLNGFEVIKEALIQKSEDTADRPPFNLYEILGFVGNNK (1)

AVVLANYGQSWKDLRRFTLSTLRDFGMGKKSLEERVRDEAGYLCDAFQSEQ (1)

GGPFDPHVLINTAVSNVICSIIFGERFEYDDHKFLKLLCLIEESIKAESGPVPQ (0)

IISSLPWSSKVPGLARLFFQPRIHMLQYLQEIINEHKQTWDSGHTRDFIDAFMLEMKK (0)

AKGVKDSNFNDQNLLLTTADLFSAGSETTTTTLRWGLLFMLLYPDVQ (1)

RKVQEEIDQVIGRTRKPTMGDVLQMPYTNAVIHEIQRYGDIIPLSVPHMAYRDTHIKGFFIPK (0)

GTVIMTNLSSVLKDEKVWEKPFQFYPEHFLDRDGKFVKREAFMAFSA (1)

GRRVCLGEQLARMELFLFFTSLLQRFSFQIPDGEPCPREDPVFVFLQVPHDYKICAKVR*

 

>CYP2D.4 scaffold_160: 1936385-1938199 (+) strand (exons 6-9)

BX707908.1, CX969358 mate = CX969359 3aa diffs to 2D45

MSLLSQLCPFALGCNVVTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDFRDLHNSFKQ

LSKQYGDVMSLRVFWKPTVVLNGFEVIKEALIQKSEDTADRPPFNLYEILGFVGNNK

AVVLANYGQSWKDLRRFTLSTLRDFGMGKKSLEERVRDEAGYLCDAFQSEQ

GGPFDPHVLINTAVSNVICSIIFGERFEYDDHKFLKLLCLIEESIKAESGPVPQ

IISSLPWSSKVPGLARLFFQPRIHMLQYLQEIINEHKQTWDSGHTRDFIDAFMLEMKK

AKGVKDSNFNDQNLLLTTADLFSAGSETTTTTLRWGLLFMLLYPDVQ

RKVQEEIDQVIGRTRKPTMGDVLQMPYTNAVIHEIQRYGDIIPLSVPHMAYRDTHIKGFFIPK

GTVIMTNLSSVLKDEKVWEKPFQFYPEHFLDRDGKFVKREAFMAFSA

GRRVCLGEQLARMELFLFFTSLLQRFSFQIPDGEPCPREDPVFVFLQVPHDYKICAKVR*

 

There are some assembly difficulties on scaffold 160 in this region, and some exons are duplicated.  CYP2D.3 and CYP2D.4 may be the same gene: they differ at only 4 aa.
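
The count of amino acid differences between CYP2D.3 and CYP2D.4 can be rechecked with a simple position-by-position comparison. The Python sketch below assumes the two peptides are already aligned without gaps, which holds here only because the exons listed above have equal lengths. The function name aa_differences is hypothetical. Only the first two exons are compared, since the remaining exons appear identical in the two listings.

```python
def aa_differences(seq_a, seq_b):
    """List (position, residue_a, residue_b) for mismatches, 1-based.

    Assumes the two peptides are already aligned without gaps, so they
    must have the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length for an ungapped comparison")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# First two exons of CYP2D.3 and CYP2D.4 as listed above.
D3_EXON1 = "MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDFRDLHNSFKQ"
D4_EXON1 = "MSLLSQLCPFALGCNVVTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDFRDLHNSFKQ"
D3_EXON2 = "LSKQYGDVMSLQVFWKSMVVLNGFEVIKEALIQKSEDTADRPPFNLYEILGFVGNNK"
D4_EXON2 = "LSKQYGDVMSLRVFWKPTVVLNGFEVIKEALIQKSEDTADRPPFNLYEILGFVGNNK"

exon1_diffs = aa_differences(D3_EXON1, D4_EXON1)
exon2_diffs = aa_differences(D3_EXON2, D4_EXON2)
print("exon 1:", exon1_diffs)
print("exon 2:", exon2_diffs)
print("total:", len(exon1_diffs) + len(exon2_diffs))
```

Sequences of unequal length would need a real alignment tool first; this comparison deliberately refuses them.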

 

>DN017333.1 51% to 2C8, Ramy Naguib Attia

cannot extend in the ESTdb

MSPSIFTLLIFVLLVLLSIMWWKKNLKDRSLLPPGPTPLPFLGNLLQVKPKEFLKALDK (0)

LKEKHGSVFTVYFGARPTVILCGYQTVKEALIDQADTFSSRGKMALAEHILKGY (1)

GITGSNGERWKQLRRFALTTLRNFGMGKRTIEKRIQEETTFLIEEFRNAE (1)

GMPFDPTFYLGCAVSNIICSIVFGERFDYNDKQFLFLLKNINKVLRFMNSTWGV (0)

VFFTFDKIMCHIPGPHQKAMKHLVDLKAFVQQRVRESKEILDINSPQHFIDCFLIKMQE (0)

EQENPHSEFHMDNLIGSALNLFFAGTETVSTTLRYGILILLKWPHIQ (1)

GRIQEEIDDVIGRQQCPKIEDRSKMPYTDAVIHEIQRFSDIVPTGLPHTATQDTTFRGHTIPK (0)

GTDVFALLTTVLKDPEVFQNPEEFNPERFLDENGILKKSQAFMPFSA (1)

GKRMCPGESLARMEIFLFLTTLLQKFTLIPTVPSVDLDVTPEISSSGHLPREYKMCVLPTT*

 

>CYP2Q2
CX972427.1 best hit to CYP2A13 in ESTdb X.tropicalis S.Hill, A.Bolen

NM_001010998.1 89% to CYP2Q1, 55% to 2A6 same as CX972427.1

from refseq database

MDTSWLWTLLLCLLISAMLIYSTWNKMYRKRNLPPGPTPIPLFG

NVMQIKRGEMVKSLIELGKKYGDIYTLYFGPSPVVILCSYRAIKEALIDQAEEFSGRG

AIPSFDQYFQGYGVVFTNGEEWKNLRRFSLSTLRNFGMGKRGIEERIKEEAQFLVAEI

KSYKEKPFDPTNILVQ

CVSNVICSVVFGNRFEYANKDFQNLLSLFQSVFQETSSSWGQLLNMLPAVMN
HVPGPHKNIIRDMNKLEDFVLQRVKENEKTVDPNSPRDLIDSFLIKMQQENKNPTSPFHM
KNLIATILSIFFAGTETVSTTLRHGFLILLIHPEIEAKLQEEIDRVVGQNRSPTIEDRNK
MPYTDAVIHEIQRLSDVIPMNVPHLVTKDTKFRGYTIPKGTNIYPLLCAVLRDPEQFDTP
SKFNPNHFLDDKGCFKSNDGFMPFSTGKRICLGEGLARMELFLFLT

NILQNFKLHSESGLTEDNIAPKMKGFANYPTSYQLSFIPR

 

>CYP2Q3

NM_001010999.1 54% to 2A6 79% to NM_001010998.1, 78% to 2Q1

from refseq database

MDTTWLWSLQLFLLIATMLIYSTWNKMYRKRNLPPGPTPIPLFG

NVLQIKRGEMVKSLLELGKKYGPVYTLYFGPSPVIILCDYQSIKEALNDQAEEFSGRG

KIPSWDQFFQGYGESFSNGDEWKQLRRFSLTTLRNFGMGKRGIEERIQEEAQFLVAEI

KSYKGKPFDPTKILVQCVSNVICSVVFGQRYEYSNKDFHKLLYMFQAVFEDTSSTLGQ

LMTLLPNIMNHIPGPHKTVVNKLNKVNDFILQRVKENEKTLDPNSPRHFIDSFLIQMQ

KEKDNPVTKFHWKNLLCTIMNLFFAGTETVSTTLRHGFLMLLIHPEIEEKLHEEIDRV

VGQDRSPTIEDRSKMPYTDAVIHEIQRFSDVLPMSLPHLVMKDTQFRGYTIPKGTDVY

PLICAALRDPKQFATPNKFNPQHFLDDNGLFKSSNAFLPFSTGKRICLGEGLARMELF

LFLTNILQNFKLHSENQFAEDDIAPKMNGFANYPLSYEFSLIPRVQSLLVL

 

>CX329225.2 DR834894.1 CX379987.2 Yun Peng 74% to human 2R1

MFPPVPLVALVAAALLIGGFLVRQIVKQRKPRGFPPGPPGLPLIGNILA

LASDPHVYMKKQSKIHGQ (0)

IFSLDLGGISTVVLNGYDAV

KECLVRQSDVFADRPSLPLFKKLTNMGGLLNAKYGRCWTEHRKLAVSCFRTFGCSQKSFE

SKISEECLFFLDAIDSYKGKALDPKHLVTIAVSNVSNLILFGERFRYDDNDFLHMIEIFS

ENIELATSAWVFLYNAFPLIGFLPFGKHQQLFRNASEVYDFLLQIIGRFSENRKPQSPRH

FIDAYMDEMERNEAD

PDSTYSMENLIFSVGELIIAGTETTTNVLRWAMLFMALYPNIQGQVQKEIDGVVGLNRMP
TFEEKSRMPYTEAVLHEILRYCNIAPLGIFHATSRDTVVRGYSIPEGTTVITNLYSVHFD
EKYWTDPEIFYPERFLDSAGQFTKKEAFVPFSLGRRHCLGEQLARMEMYLFFTALLQRFH
LHFPQGFVPNLRPKLGMTLQPHPYVICAERR*

 

>CX850388.1 different from 2R1 above 91%

MFPPVPLVALVAAALLIGGFLVRPIVKQRKPRGFPPGSPGLPLIGNILALASDPHVYLKK  184

QSKIHGQIFSLDLGGISTVVLNGYDAVKECLVRQSDVFADRLSLPLFKKLTNMGGLLNAK  364

YGRCWTEHRKLAVSCFRTFCCSQKSFESKISEECLFFLEANDSYE

 

>CYP2U1

CX851239.1 CX439683.1 CX959423.1 DR836116.1

best hit to CYP2U1 in ESTdb X.tropicalis M.Puljic

best match in human = CYP2U1 63%, CYP2U1 ortholog

MSDLAQDSMSGTLDWKQMGYASWSLLGDCASVSALLLYIALFLGLYLLMGSLWRYYQI

IHSNAPPGPTPWPIVGNFAFMLMPGWLM

QLLNFGIAKGKLRRVPAGATRRGAFLYPHIVLTEMAKMYGKIYGLYIGTRLMVILNDFNS

VKDALVSHSEVFSDRPSVSLVTIITKRKGIVFAPYGPIWRQQRRFSHSTLRYFGLGKLSL

EPKIIEEFKYVKAEMLKFGNKGFSPFEIINNAVSNVICSISFGKRFNYEDKEFKTMLSLM

SRGLEISVNSEAVLICLCSWLYYLPFGPFKELRQIVIDITAFLKRIIAE

HQVTLDPANPRDFIDMYLLHIKEEQKGQAESIFNTEYLFYIIGDLFIAGTDTTTNTLLWS

LLYMCLYPDVQEKVQAEIDTVIGRDRPPSLTDKSQMPFTEATIMEVQRMTVVVPLSVPHM

ASESSVFHGYTIPKGSVVMANLWSVHRDPKVWEKPNDFMPKRFLDENGQILKKEAFIPFG

IGRRVCMGEQLAKMELFLMFVNLLQSFSFSLADDTFKPSLEGRFGLTLAPYPFDIKITKR

*

 

>DN060997.1 DR833173.1 DR842090.1 CF374775.1

best hit to CYP2S1 in ESTdb X.tropicalis H.Penmatsa, K. Iyer, G. Vasser

best match in human = 2C18 55%, 47% to 2S1 not a 2S ortholog
MEILGATAVLLVICAF

FLLLNTIQVIRRQGKGKLPPGPTPLPFLGNFLQLRGEEVFKSLLEFGKKYGPVYTIHLGM

EPVVVLCSFDIVKEALNDNGDEFGARGHMPLLEKISHGGHGVVASNGERWKQLRRFSLMT

LRNFGMGKRSIEERIQEEAHFLTNEFKYTKGQPVDP

TFYFSKAVSNVICSVVFGDRFEYEDTEFLRLLGLLNQVFRGFSSVWGQLYNIFPKV

MGKLPGPHNMIFKSVNSLQEFIMQRINMHQET

LDPSSPRDFIDCFLIKMQQEKDVPQTEFHMQGALNTTFDMFGAGTETVSTTLRYGLLILL

KHPDIEERIQKEIDSVIGRNRAPCIEDRSRMPYTDAVIHEIQRFVDIIPMGIPHKVTRDI

QFQGYFIPKGTTVYPMLSSVLHDPKQFKYPDIFNPGHFIDENGKFCKNDGFMPFSSGKRI

CVGEGLARMELFLFITTILQNYTLRSPVDTEDLDLTPELSGFGNIPRPYKLCFIPR*

 

>NM_001001212.2 51% to 2C18

MDWALEINGLPILLLIAALLLLLARKVGKKVKGCLPPGPKPLPI

LGNLLQLKSREIHKPLLEFNKKYGPVYTLYMGSMPAVVLCGYEAVKEALVDNAEKFSG

RAEVPIVNLTTQGYGIAFSNGERWKELRRFSLTTLRNFGMGKRSIEERIQEEIHFLLE

AFHETQGSFFSPAFIIRRSVSNVICSVVFGKRFDYTDQKLQILLDLIAENLRRVDNIW

VQVYNFIPKLLNILPGPHHKLTENYKAQLRYVEEIVQEHGKTLDPSAPQDYIDAFLLK

MEQERKKAHTEYNVQNLLSCSLDIFFAGQESTSSTLGYGLLILMKYPHIKEKVQAEIE

SVIGRSRRPCMDDRAKMPYTEAVIHEIMRFIDFFPLGVPHSVTEDTLYRGYVIPKGTT

IFPFLHSVLFDPSMFERPQEFYPGHFLNQDGSFRKNEGFMAFSAGKRACPGKSLARVE

IFLYLTSILQQFDPQPALSPKDIDLSPEYSGFGKMAPSFQLKLVPH

 

>NM_001005711.1 45% to 2C8

MEPLTIFLCLFIFLLLLFTWKTHKRRVQLPPGPYPLPLLGNVLQ

GITVLYDSYRKLSEQYGPVFTVWLGSTPMVVLCGYEVLKDALINHSQEFGARGAFPVP

ERLTDGYGVISTNGTRWQQLRRFSVTVLRNFGMGKRSMEERIHEETQHLIQAVQHTGG

EAFDPLYLLGRAVNNIINLIVFGRRWDYKDKMMIKLFNIINSILLFLRSPLGVIYSAL

YQIMQHLPGPHQKIFHDSETVKSFIREQINSHKETLDSDSPRDYIDCFLIKANQEKDH

HSSEFSQENLVNTVFDFFVAGTETATNTIQFSLLVIITYPHIQAQVQKEIDKVVGPDR

LPGIADRAQMPYTNAVIHEIHRFLDLVPLSLPHMATQDTVCRGFRIPKGTTVIPLIGS

ALCDPAHWETPEEFNPEHFLNQNGEFYIPPAFMPFSAGKRVCLGEGLARMEIFLFFTA

LLQKFTIRVANQTDTFNLRTLRRAFRKKGLFYQLRAMPRTCTVEK

 

>NM_001004777.1 (gap missing C-helix, 22 aa) CX454308.2 69% to NM_001035117

MDRKQPYKTLMEVSKKYGSVFSVRVGPLKMVVLCGYDTVKDALLNYPDEFADRPALPLFD

ELVKGHGIIFSNGENWKVMRRFSLSTLRDFGMGKKTIESKIIEECDHLVQKFNSYGGKPFDNTM

IMNAAAANIIASILLSHRFHYENPTLLRLLKLVNENMRLMASPIALLYNTYPSIMRWV

PGCHKTIYNNAQELMEFIRETFSKHKVELDINDQRNLIDAFLSRQQEEKPHSAKYFHD

DNLTILVIDLFAAGMETTSTTLRWALLLMMKYPEIQKKVQDEIEKVIGSVEPRAEHRK

EMPYTDAVLHEIQRFANITPMNGPHATTKDVTFRGFFLPKGTYVIPLLASVLKDENYF

EKPNEFYPEHFLDSEGHFMKNEAFLPFSAGRRSCAGENLARMELFLFFTSLLQNFTFQ

APPGEELDLTPDVGGTVPPRPHTVCALPRS

 

>NM_001004878.1 66% to NM_001035117, 51% to 2K17 zebrafish

MDRKQPYKALLKVSKKYGPVCSFQIGPLKTVVLCGYDTVKDALLNDEFADRPAMPMLD

DVAKGHGILSSNGENWRVMRRFALSTLRDFGMGKKTIESKINEE

CDHLVQKFSSYGGKPFDTTMIMNAAVANIIASILLSHRFHYENPTLLRLLKLVNENTK

FMASRIAMLYNTFPSIMRWIPGCHKSIYKNAQELLEFIRETFSKQKVELDINDQRNLI

DAFLSRQQEPNSGKYFHDDNLTILVFDLFVAGMETTSTTLRWALLLMMKYPEIQKKVQ

DEIEKVIGSAEPRAEHRKEMPYTDAVIHEIQRFANIFPMNGPHATTKDVTFRGFLIPK

GTFVIPLLASVLKDENYFKKPNEFYPEHFLDSEGHFVKNDAFLPFSAGRRSCAGENLA

RMELFLFFTSLLQNFTFQAPPGEELDLTPDVGIATPPMQHTVCALPRA

 

>scaffold_21945:198-3026 1 aa diff to scaffold 55 (-) 506398-495841

 198 EKVHDEISRVIGSAHPTYSHRTQMPFTNAVIHEILRFADIVPLSVPHETTRDVHFKGYFIPK (0) 1158

2607 GTYIIPLLSSVLKDKTQFDAPEEFNPNHFLDSEGNFLKKEAFMPFSA (1) 2747

2847 GRRACPGEILARMELFIFFTSLLQKFSFRPPPGVTNINLSSDVGFTSVPLEGMICAIPRA 3026

 

>scaffold_2219:41253-41363 (-) exon 2 partial

41363 LWKKYGSIFSVQIGSQKMVVLCGYETVKDALVNYAEE 41253

 

>scaffold_3861: 233-373 (+) exon 8, this scaffold has large gaps

100% to second exon 4-9 86% to 21818_prot

233 GTFVIPLLTSVLYDQTRFEKPKEFYPQHFLDSEGNFVKNEAFLPFSA 373

 

>scaffold_3861:8120-8284 (+) exon 2 this scaffold has large gaps

100% to 21818_prot

8120 LWKKYGSIFRVQIGSQKMVVLCGYETVKDALINHGEEFSERPRLPIFQVIANGY 8284

 

>scaffold_3433:6638-6781 (+) exon 6, 95% to DT436641.1

6638 AKHPETYSYFHNENLVRLVRNVFSAGVETTSTALRWALLLMIKYPDIQ 6781

 

>scaffold_3433:23829-23954 (+) exon 4

100% to Green = DT436641.1 75% to 21819_prot

23829 FVFSLGKPFDNTMIMNAAVANIIVSIVLGHRFDYQDPKFLRL 23954

 

>scaffold_2615 : 2-55 (+) exon 9

2 VGFTSVPLEGMICAIPRA 55

 

>scaffold_2899 : 33841-33948 (+)

94% to 49369_prot   scaffold_996:232793-245538

33841 GTQVIPLLASVLQDETYFEKPEEFYPQHFLDSEGLF 33948

 

>scaffold_590 : 150217-150324 exon 8, 83% to $$$$$$8

150217 LWDKPYFEKPDEFYAQHFLDSEGNFVKNEAFLPFSA 150324
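
The scaffold fragments above give a nucleotide coordinate on each side of the peptide. One sanity check on such a line is that the nucleotide span equals three times the number of residues. The Python sketch below does that arithmetic. It assumes inclusive coordinates and whole codons at both ends, which will not hold for every exon in this file. The names COORD_LINE and check_span are hypothetical.

```python
import re

# Assumed layout: "<coordinate> <PEPTIDE> <coordinate>" with nothing else.
COORD_LINE = re.compile(r"^\s*(\d+)\s+([A-Z]+)\s+(\d+)\s*$")


def check_span(line):
    """Return (nucleotide_span, expected_span, consistent) for one line."""
    match = COORD_LINE.match(line)
    if match is None:
        raise ValueError("line is not in 'coordinate PEPTIDE coordinate' form")
    first, peptide, second = int(match.group(1)), match.group(2), int(match.group(3))
    span = abs(second - first) + 1   # inclusive; works for either strand
    expected = 3 * len(peptide)      # three nucleotides per residue
    return span, expected, span == expected


# Lines copied from the fragments listed above.
LINES = [
    "41363 LWKKYGSIFSVQIGSQKMVVLCGYETVKDALVNYAEE 41253",
    "233 GTFVIPLLTSVLYDQTRFEKPKEFYPQHFLDSEGNFVKNEAFLPFSA 373",
    "2 VGFTSVPLEGMICAIPRA 55",
]

results = [check_span(line) for line in LINES]
for line, (span, expected, ok) in zip(LINES, results):
    print(line.split()[0], span, expected, "ok" if ok else "check")
```

A mismatch does not by itself mean an error: an exon that ends inside a codon, or a typing slip in a coordinate, would both show up the same way.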

 

>14945_prot 55% to DN017333.1, 52% to 2C8

scaffold_1232:575-12890 (+) exon 3 partial

  575 MAMDSAGTVLLAACVIVLFYLVKWRGNNKRKNLPPGPTAFPLLGNFLQVSTTEIPSSCVE 754

 1024 LSKTYGPVFTLYLGGHRSIILIGYDAVKEALIDNSDVFSDRGEGGVSEMIFKNY 1185

 3914 GVILSNGERWKTMRRFTLTTLRNFGMGKRSVEERIQEEARSLEEAFRKKK 4063

 5417 DEPFDPIYLLGLAVSNIICSIIFGERFDYEDEKFMTLLMYIREFVKLLNSFFGM 5578

 5936 LFNFFPNLFCYIPGPHQNIFTYFNKLKQFVKDEAKSHKDTLDANCPRDFIDCFLIRME 6109

 8038 QEKNNPNSEFHYENLFGTILDLFLAGTETTSSTLRYAFLILLKYPEIQ 8181

 9338 ENVYKEIVQVIGQHRYPSVEDRSKMPYTEAVIHEVQRIGDILPLGLEHAASKDTTFRGYDIPK 9526

11964 GTLIFPLLTSVLKDPKYFKNPDQFDPEHFLDENGCFKKNDAFMPFST 12104

12708 GKRVCAGEGLARMELFIFLTTILQKFILKSTVATEEIKITPEPNTNGSRPWPYKMFVVPRC 12890

 

>scaffold_16683:748-5473 = BC092552.1

MDPVSVLLSVVVCIFLFKVFYDGEKESQNFPPGPKPLPLIGNLHIINMEKPYLTFME

LAEKYGSVFSFHLGTEKVVVLCGTDAVRDALINHAEEFSGRPKVAIFDQIFKGH

GIIFADGENWKVMRRFSLSTLRDFGMGKKTIEEKISEESDCL

VETFKSHGGKPFDNTMIMNAAVANIIVALLLSQRFDYQDPTLLKLVKSINKIVRITGSSMVMLYNTF

PSIMQWIPGSHQNVVKNAEKIYTFLIETFTKHRHQLDVNDQRDLIDTFLIKQQEEKSSST

KFFHDENLKVLLLNLFGAGMETTSTTLRWGILLMMKYPEVQKKVQDEIDRVIGSAEPRLE

HQKQMPYTDAVIHEIQRFADLVPNNVPHATTKDVTFRGYFIPK

GTHVIPLLTSVLKDKDYFKKPNEFYPEHFLDSEGHFVKNEAFLPFSA

GRRICAGETLAKMELFLFFTNLLQNFTFQPPPGVEVQLTRGVAITSIPTEHKICALPRS*

 

>52542_prot scaffold_1232:27024-44511 (+)

MELGVTWSLILAVIVSFLVYSFTWRRKLRKINMPPGPPLYPLLGNMLQIS

AKEFPQSLVKLSEKYGTVFTVYLPSKPAVILSGYDCIKEALLDNNESFGA

RGESPLGYLLFKDYGVIFSNGERWKQLRRFSLSCLRDFGMGKKSIEERIQ

EEARCLVEELGKNGDTPMDPTYMLTLAVSNVICTVVFGERFDYKDEKFMT

LISLLKIVSRDFSSAWGIRSRRPRTRSCAQKLLNLFPNTLSRLPGPPQRL

FRNFDKLKAFVAESLKSHQETLNSDCPRDFIDCFLIKMEKEKNNPQTEFH

SDNLFGTVLDLFFAGTETTSITLKYSFLMLLKYTEVTRKAMEEIDNIIGQ

ERCPFYEDRIKMPYTNAVIHEIQRMADIVPLGVPHATTHDIIFRGYNIPK

DTIIFPLMTSVLKDPKYFNDPKQFDPAHFLDENGSFKKNDAFQPFSIGKR

SCLGEGLARMEIFLFITSILQAFNLKSDTAPQDIDITPEPDKNGAIPRTY

KMYFVPK

 

>14947_prot scaffold_1232:47102-62978 (+)

MAVLGIETLFLVCSFTFLVFLFSRRQRHARLPPGPTPLPLLGNVLQLDFSKQVKEFVKLGSQY

GPVSMVYLGPYPVLVLNGYDVVKEAFVDNGEVFSNRGKNAFIEMIFKGR

GVAFSNGERWRQMRRFSLSTLRDFGMGKRRVEERVQEE

ACALVEEFKKTKGTPFNSTYLMTLAVSNVICSVVFGERFDYQNETFLSVL

ALLKDTFKIITSPWTQLFSFAPGLLKHLPGPHKKAAENLDRLKTFVTEFV

ASHEETLEENFPRDYIDCFLIKMRQEKDNVNTEFDYENLFVTLMNLFFAG

TETTSITLQYGMLILLKYPDIQKKIHEEIDSVIGFNRCPSMEDRPKMPYT

DATIHEIQRFADIVPMGVPRSTNKDTTLRGYDIPKGTTVFPMLTCILKDP

RYFKDPESFNPCHFLDEKGCLKKTDAFIPFSIGKRVCLGEGLARMEIFLF

LTSILQRFELKCHMDPKDIDISPVPSKSAYMPRPYELYITPR

 

>52545_prot scaffold_1232:71267-83903 (+) short seq

exon 7 gap filled in by DT419848.1, missing exon 2

82% to 52547_prot scaffold_1232:122253-139910

71267 MDVAGLGTFLLVLITFILTLSSWNTMYKKVNLPPGPTPLPLIGNLMNIKKGKMVNSLMK (0)

 

GLSFSNGERWRQMRHFTLKTLKNFGMGKKSIEEKIQEEALCLVEEIRKSG (1)

ETPVDPSKLIMDAVSNVFCSIMFGRRFEYNEEKFANLLTNVNEIFRLMSNTWGQ (0)

LESIFPSVMAYIPGPHKKKNTLSEELISFLHERVKSNQETFDPSAPRDFIDEYLMKIEQ (0)

EKKNPNSEFTMRNTLLTFFSIFLGGTETSTTTIKHGLLLLIKYPEIQ (1)

79425 AKLHMEIDHVIGRNRIVNINDRNAMPYMEAVINEIQRFSDIAPLNAPRKVTKDVQFRGYSIPK (0) 79511

DTEIYPLLCTVHRDPKYFSSPYEFNPSHFLDEQGRFRKSEAMMAFSA (1)

GKRICPGESLARMELFLFFTTILQNFTLTSPTHFTEDDVAPKMAGFMNHPIQYKASFISR* 83903