Xenopus tropicalis cytochrome P450s
The 2006 Bioinformatics class searched the ESTdb for matches to human P450s in the species Xenopus tropicalis. There are over 1 million ESTs for this frog species in ESTdb. The yellow sequences were turned in for assignment #1. Additional sequence was added later to complete the sequences, by walking upstream or downstream on more X. tropicalis ESTs. The refseq_rna database was also searched. (in progress, Jan. 23, 2006)
Note: there are 22.9 million reads for X. tropicalis in the Trace Archive. Incomplete sequences can be completed by chromosome walking with the MegaBlast server.
This file now has 55 sequences: 52 from X. tropicalis and 3 from X. laevis. 50/52 are full-length CYP sequences. I have just added a cluster of 25 genes and some pseudogenes on scaffold 55. That will push the gene count up near 75.
>BX728777
CX904306.1 CYP1A1 N. Abdletawab
552208048 411550065 409289324 388847477
62629_prot from UCSC browser scaf 287 (+) 1408174-1414975
62% to 1A1, 57% to 1A2, 90% to 1A6, 91% to 1A7
MMDNSTTTEVLVASIVFAIVFLVIRSQRVKLPPGTKKLPGP
MPYPVIGNLLSLSKNPHLSLTKMSETYGDVFQIQIGTKPMLVLSGLETLRQALIRQSDEF
AGRPDLFTFRLVGDGQSMTFSSDSGEV
WRARRRLAQNALKTFATSPSPTSSNSCLVEENIITEAEYLIRKFKELIDDKGEFDPYRYV
VVSVANVICGMCFGKRYNHDDEELLNVVNLTDEFGAAAASGNPADFIPILQYFPNSSMKA
FKEINQKFLAFMQKFTKEHYKTFDKNHIRDITDSLIQHSQEKRVDENSDIQLSNEKIVNI
VNDLFGAGFDTITTALSWSLMYLVAHPNIQQRIQDELDQVIGRERRPRLSDRAQLPYTE
AFILEMFRHSSFMPFTIPH (1)
CTTKDTMLNGYFIPKGICVLINQWQVNHDP(2)
NLWQDPFKFCPERFLNNDGTMVNKTEMEKVMIFGL
GKRRCVGEAIGRMEVFLFLTTMLQQMQFFKQDGEKLDMSPQYGLTMKHKR
CHLTAKLRFALLTN*
>DN053435 DN024870 51% to CYP1A8P ortholog DN024871 mate pair to DN024870 DN025714.1
MESAVKKTLMDMMPMLLKASISFLTVLLVMSILWKKRNSLPGPWAVPI
VGNFFQLGDQIHITLTDMRNRYGDVFQIKLGLMPIVVVSGLETVKRVLLKEGENFADRPN
FYSFSLFSNGSSMTFSEKYGESWKIHKKIMKNALRNLSNESTNSSNCSCRLEEYVCAEAS
DLVQELTDLSAEKVAFDPSQSIVITVANVVCALSFGKRYDHHDKEFLTLIDFNNDLRKA
AGGGLLADFIPILRFIPSSSVKALKKFVQSFHSFIAKCVKDHFATFEENNIRDITDA
LIQLCKERKSEDKNQLLSDDQIISTVNDIFGAGFDTITSALLWAIFYLLRYPEFQDKIHK
EIEEKIGCNRAPRFNDRKDLHYTEAFINEVLRHSSFVPFGLPHCTTMDTKLNGYFLPKGT
CVFTNLYQVNHDNTVWKDADMFMPERFLDQNGQIIKSLTEKVLVFGMGVRKCLGEDVARN
EMFVIMTIMMQRLKLVKSTKHELDPIPVYGLTLKPKPYYLVAKVRT*
>CX846813.1 C.Blackwell 1B1 as query 55% to 1B1 ortholog
CL126458.1 from GSS, Trace archive 483147144 391272900 233714403 422555774 (from Trace search with Human DNA for last part) 483233841
MNWKIWEDLGQSSVPKLLLSFLCALTVAHILKWIHEWIIPRWIRS
SQPPGPFPWPLFGNALQMGSYPHLAFIDLAKRYGNIFQIKLGSQKIVVLNGDLVIRHALL
HKGEDFAGRPKFTSYQFVSGGRSLAFGCYTEKWKAHRKLAHSTVRAFSTGNPQTKRCLAE
NVLKEARDLIALFSELGQGGKYFYPGRHTVVSVANVMSAVCFGRRYQHGDLEFQSLLSNN
DKFTRSVGAGSLVDVMPWLQRFPNPVRSVFRSFQQ (1)
VNYEFYDFVYKKFLLHRNTANQAV
TRDMMDAFIHILITKEGKVRADDADGGEEKGKNGQYFFHSLEAEHVPS
TVTDIFGASQDTLSTALQWVIFFLVR (2)
YPEIQTKLQDEMDRVIGKDRLPCIEDQPKLPYLMAFLYEF
MRFSSFVPITIPHATTKNTTIMGYQIPKDTVVFVNQWSVNHDPQKWSNPGEFNPSRFLDD
NGLINKDLVSNIMIFSVGKRRCIGEELSKIQLFMFSSILLHQCIFTALPADNLNPKGDYG
LSIKPKPFRISMTLRHGSMDLLNNSVLSGMAE*
>CYP2D45
NM_001015719.1 CX969358.1 54% to 2D6 E.Mahrous
MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPP
SPPSWPFVGNLLQMDFRDLHNSFKQLSKQYGDVMSLRVFWKPTVVLNGFEVIKEALIQ
KSEDTADRPPFNLYEILGFVGNNKAVVLANYGQSWKDLRRFTLSTLRDFGMGKKSLEE
RVRDEAGYLCDAFQSEQGGPFDPHVLINTAVSNVICSIIFGERFEYDDHKFLKLLCLI
EESIKAESGPVPQIISSLPWSSKVPGLARLFFQPRIHMLQYLQEIINEHKQTWDSGHT
RDFIDAFMLEMKKAKGVKDSNFNDQNLLLTTADLFSAGSETTTTTLRWGLLFMLLYPD
VQRKVQEEIDQVIGRTRKPTMGDVLQMPYTNAVIHEIQRYADIIPLSVPHMAYRDTHI
KGFFIPKGTVIMTNLSSVLKDEKVWEKPFQFYPEHFLDRDGKFVKREAFMAFSAGRRV
CLGEQLARMELFLFFTSLLQRFSFQIPDGEPCLREDPVFVFLQVPHDYKICAKVR
>CYP2D.1 scaffold_160:807096-818137 (-) strand UCSC browser
52% to 2D6
MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDSSSLSNSFRQ
(0)
LKKQYGDVFSLQFYWQNVVVLNGYEAIKEALLQKSEDFADRPPFELYEGIGFTGNNK (1)
GVVTAKYGQSWKDLRRFTLSTLRDFGMGKKSLEERVGEEAGYLCDAFLSEQ (1)
GQLFDPHYKLNTAVANIISFIVFGDRFDYDDYKFQKLLNLNQAMFEVESGTMAQ (0)
IATAIPWLAKLPGLAKMIYRPHVDVLEYLQKIISDHQKTWNPACTRDLIDAFTLQMEK (0)
AKGDKENHFNEKNLLFTTFDLFTAGSETSTTTLRWGLLYMLQYPDVQ (1)
RKVQEEIDKVIGKSRKPVMADVLQMSYTNAVIHEIQRCADLVPLSLIHMTYRDTEVQGFSIPK (0)
GVAVIPNLSSVLKDEKVWEKPFQFYPEHFLDADGKFVKQEAFLPFST (1)
GRRACLGERLARMELFLFFTSLLQRFSFQIPDGEPCPRDDPIVYIVQIPHPYKLCAKIR*
>CYP2D.2 scaffold_160:866974-882965 (-) strand UCSC browser
DR873330.1 Trace archive 408392602, 234381521
MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDFSSLSFRQ
(0)
LRKQYGDVFSLQLGWQNVVVLNGYEAIKEALLQKSEDFADRPPFELYEGIGFTGNNK
(1)
GVVLANYSQSWKDLRRFTLSTLRDFGMGKKSLEEKVREEAGYLCDAFQSEQ
(1)
GQLFDPHYKLNTAVANIMNSIVFGDRFDYDDYKFQKLLNLNQEMFEVEFGTMAQ
(0)
IATAIPWLAKLPGLAKMIYRPHVDVLEYLQKIISDHQKTCNPACTRDLIDAFTLEMEK
(0)
VKGDKENYFNEKNLLFTAFDLFTAGSETSSTTLRWGLLYMLLYPDVQ
(1)
RKVQEEIDQVIGKSRKAAMADVLQMSYTNAVIHEIQRCADLVPLSVTHMTYRDTEVQGFSIPK
(0)
GVAVCPNLSSVLKDEKVWEKPFQFYPEHFLDADGKFVKQEAFLPFST
(1)
GRRACLGERLARMELFLFFTSLLQRFSFQIPDGEPCPRDDPIVYIVQFPHPYKLCAKIR
>CYP2D.3 scaffold_160:1923301-1927860 (+) strand then a gap
UCSC browser
Trace archive to fill in seq gap with exon 4 387743496
241672823 to finish exon 7
418485537 walking down, 479264026 = exon 8, 248788894 = exon 9
5aa diffs to 2D45
MSLLSQLCPFALGCNVFTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDFRDLHNSFKQ
(0)
LSKQYGDVMSLQVFWKSMVVLNGFEVIKEALIQKSEDTADRPPFNLYEILGFVGNNK
(1)
AVVLANYGQSWKDLRRFTLSTLRDFGMGKKSLEERVRDEAGYLCDAFQSEQ
(1)
GGPFDPHVLINTAVSNVICSIIFGERFEYDDHKFLKLLCLIEESIKAESGPVPQ
(0)
IISSLPWSSKVPGLARLFFQPRIHMLQYLQEIINEHKQTWDSGHTRDFIDAFMLEMKK
(0)
AKGVKDSNFNDQNLLLTTADLFSAGSETTTTTLRWGLLFMLLYPDVQ
(1)
RKVQEEIDQVIGRTRKPTMGDVLQMPYTNAVIHEIQRYGDIIPLSVPHMAYRDTHIKGFFIPK
(0)
GTVIMTNLSSVLKDEKVWEKPFQFYPEHFLDRDGKFVKREAFMAFSA
(1)
GRRVCLGEQLARMELFLFFTSLLQRFSFQIPDGEPCPREDPVFVFLQVPHDYKICAKVR*
>CYP2D.4 scaffold_160: 1936385-1938199 (+) strand (exons 6-9)
BX707908.1, CX969358 mate = CX969359 3aa diffs to 2D45
MSLLSQLCPFALGCNVVTLGIIFTLLLLLLDFMKRRKPCTDFPPSPPSWPFVGNLLQMDFRDLHNSFKQ
LSKQYGDVMSLRVFWKPTVVLNGFEVIKEALIQKSEDTADRPPFNLYEILGFVGNNK
AVVLANYGQSWKDLRRFTLSTLRDFGMGKKSLEERVRDEAGYLCDAFQSEQ
GGPFDPHVLINTAVSNVICSIIFGERFEYDDHKFLKLLCLIEESIKAESGPVPQ
IISSLPWSSKVPGLARLFFQPRIHMLQYLQEIINEHKQTWDSGHTRDFIDAFMLEMKK
AKGVKDSNFNDQNLLLTTADLFSAGSETTTTTLRWGLLFMLLYPDVQ
RKVQEEIDQVIGRTRKPTMGDVLQMPYTNAVIHEIQRYGDIIPLSVPHMAYRDTHIKGFFIPK
GTVIMTNLSSVLKDEKVWEKPFQFYPEHFLDRDGKFVKREAFMAFSA
GRRVCLGEQLARMELFLFFTSLLQRFSFQIPDGEPCPREDPVFVFLQVPHDYKICAKVR*
There are some assembly difficulties at scaf 160 in this region. Some duplicate exons exist.
The D.3 and D.4 may be the same gene. Only 4 aa diffs
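The amino acid difference count noted for CYP2D.3 and CYP2D.4 can be checked with a position-by-position comparison. The following is only a minimal Python sketch, not part of the original annotation: the function name aa_differences is hypothetical, and it assumes the two peptides are the same length and already aligned without gaps. It compares the second-exon peptides of CYP2D.3 and CYP2D.4 exactly as they are listed in this file.

```python
# Second-exon peptides copied from the CYP2D.3 and CYP2D.4 records in this file.
D3_EXON2 = "LSKQYGDVMSLQVFWKSMVVLNGFEVIKEALIQKSEDTADRPPFNLYEILGFVGNNK"
D4_EXON2 = "LSKQYGDVMSLRVFWKPTVVLNGFEVIKEALIQKSEDTADRPPFNLYEILGFVGNNK"


def aa_differences(seq_a, seq_b):
    """Return (1-based position, residue in seq_a, residue in seq_b) for each mismatch.

    Assumption: the sequences are equal-length and gap-free, so no alignment
    step is needed. Anything else raises an error instead of guessing.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length for a direct comparison")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


if __name__ == "__main__":
    for pos, a, b in aa_differences(D3_EXON2, D4_EXON2):
        print(f"position {pos}: {a} (D.3) vs {b} (D.4)")
```

On these two exon peptides the comparison reports three mismatches; the fourth difference between the two records lies in the first exon (F versus V). Sequences of unequal length would need a real alignment step first, which this sketch deliberately does not attempt.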
>DN017333.1 51% to 2C8, Ramy Naguib Attia
cannot extend in the ESTdb
MSPSIFTLLIFVLLVLLSIMWWKKNLKDRSLLPPGPTPLPFLGNLLQVKPKEFLKALDK (0)
LKEKHGSVFTVYFGARPTVILCGYQTVKEALIDQADTFSSRGKMALAEHILKGY
(1)
GITGSNGERWKQLRRFALTTLRNFGMGKRTIEKRIQEETTFLIEEFRNAE
(1)
GMPFDPTFYLGCAVSNIICSIVFGERFDYNDKQFLFLLKNINKVLRFMNSTWGV
(0)
VFFTFDKIMCHIPGPHQKAMKHLVDLKAFVQQRVRESKEILDINSPQHFIDCFLIKMQE (0)
EQENPHSEFHMDNLIGSALNLFFAGTETVSTTLRYGILILLKWPHIQ
(1)
GRIQEEIDDVIGRQQCPKIEDRSKMPYTDAVIHEIQRFSDIVPTGLPHTATQDTTFRGHTIPK
(0)
GTDVFALLTTVLKDPEVFQNPEEFNPERFLDENGILKKSQAFMPFSA
(1)
GKRMCPGESLARMEIFLFLTTLLQKFTLIPTVPSVDLDVTPEISSSGHLPREYKMCVLPTT*
>CYP2Q2 CX972427.1 best hit to CYP2A13 in ESTdb X.tropicalis S.Hill, A.Bolen
NM_001010998.1 89% to CYP2Q1, 55% to 2A6 same as CX972427.1
from refseq database
MDTSWLWTLLLCLLISAMLIYSTWNKMYRKRNLPPGPTPIPLFG
NVMQIKRGEMVKSLIELGKKYGDIYTLYFGPSPVVILCSYRAIKEALIDQAEEFSGRG
AIPSFDQYFQGYGVVFTNGEEWKNLRRFSLSTLRNFGMGKRGIEERIKEEAQFLVAEI
KSYKEKPFDPTNILVQ
CVSNVICSVVFGNRFEYANKDFQNLLSLFQSVFQETSSSWGQLLNMLPAVMNHVPGPHKNIIRDMNKLEDFVLQRVKENEKTVDPNSPRDLIDSFLIKMQQENKNPTSPFHMKNLIATILSIFFAGTETVSTTLRHGFLILLIHPEIEAKLQEEIDRVVGQNRSPTIEDRNKMPYTDAVIHEIQRLSDVIPMNVPHLVTKDTKFRGYTIPKGTNIYPLLCAVLRDPEQFDTPSKFNPNHFLDDKGCFKSNDGFMPFSTGKRICLGEGLARMELFLFLT
NILQNFKLHSESGLTEDNIAPKMKGFANYPTSYQLSFIPR
>CYP2Q3
NM_001010999.1 54% to 2A6 79% to NM_001010998.1, 78% to 2Q1
from refseq database
MDTTWLWSLQLFLLIATMLIYSTWNKMYRKRNLPPGPTPIPLFG
NVLQIKRGEMVKSLLELGKKYGPVYTLYFGPSPVIILCDYQSIKEALNDQAEEFSGRG
KIPSWDQFFQGYGESFSNGDEWKQLRRFSLTTLRNFGMGKRGIEERIQEEAQFLVAEI
KSYKGKPFDPTKILVQCVSNVICSVVFGQRYEYSNKDFHKLLYMFQAVFEDTSSTLGQ
LMTLLPNIMNHIPGPHKTVVNKLNKVNDFILQRVKENEKTLDPNSPRHFIDSFLIQMQ
KEKDNPVTKFHWKNLLCTIMNLFFAGTETVSTTLRHGFLMLLIHPEIEEKLHEEIDRV
VGQDRSPTIEDRSKMPYTDAVIHEIQRFSDVLPMSLPHLVMKDTQFRGYTIPKGTDVY
PLICAALRDPKQFATPNKFNPQHFLDDNGLFKSSNAFLPFSTGKRICLGEGLARMELF
LFLTNILQNFKLHSENQFAEDDIAPKMNGFANYPLSYEFSLIPRVQSLLVL
>CX329225.2
DR834894.1 CX379987.2 Yun Peng 74% to human 2R1
MFPPVPLVALVAAALLIGGFLVRQIVKQRKPRGFPPGPPGLPLIGNILA
LASDPHVYMKKQSKIHGQ
(0)
IFSLDLGGISTVVLNGYDAV
KECLVRQSDVFADRPSLPLFKKLTNMGGLLNAKYGRCWTEHRKLAVSCFRTFGCSQKSFE
SKISEECLFFLDAIDSYKGKALDPKHLVTIAVSNVSNLILFGERFRYDDNDFLHMIEIFS
ENIELATSAWVFLYNAFPLIGFLPFGKHQQLFRNASEVYDFLLQIIGRFSENRKPQSPRH
FIDAYMDEMERNEAD
PDSTYSMENLIFSVGELIIAGTETTTNVLRWAMLFMALYPNIQGQVQKEIDGVVGLNRMPTFEEKSRMPYTEAVLHEILRYCNIAPLGIFHATSRDTVVRGYSIPEGTTVITNLYSVHFDEKYWTDPEIFYPERFLDSAGQFTKKEAFVPFSLGRRHCLGEQLARMEMYLFFTALLQRFHLHFPQGFVPNLRPKLGMTLQPHPYVICAERR*
>CX850388.1
different from 2R1 above 91%
MFPPVPLVALVAAALLIGGFLVRPIVKQRKPRGFPPGSPGLPLIGNILALASDPHVYLKK 184
QSKIHGQIFSLDLGGISTVVLNGYDAVKECLVRQSDVFADRLSLPLFKKLTNMGGLLNAK 364
YGRCWTEHRKLAVSCFRTFCCSQKSFESKISEECLFFLEANDSYE
>CYP2U1
CX851239.1 CX439683.1 CX959423.1 DR836116.1
best hit to CYP2U1 in ESTdb X.tropicalis M.Puljic
best match in human = CYP2U1 63%, CYP2U1 ortholog
MSDLAQDSMSGTLDWKQMGYASWSLLGDCASVSALLLYIALFLGLYLLMGSLWRYYQI
IHSNAPPGPTPWPIVGNFAFMLMPGWLM
QLLNFGIAKGKLRRVPAGATRRGAFLYPHIVLTEMAKMYGKIYGLYIGTRLMVILNDFNS
VKDALVSHSEVFSDRPSVSLVTIITKRKGIVFAPYGPIWRQQRRFSHSTLRYFGLGKLSL
EPKIIEEFKYVKAEMLKFGNKGFSPFEIINNAVSNVICSISFGKRFNYEDKEFKTMLSLM
SRGLEISVNSEAVLICLCSWLYYLPFGPFKELRQIVIDITAFLKRIIAE
HQVTLDPANPRDFIDMYLLHIKEEQKGQAESIFNTEYLFYIIGDLFIAGTDTTTNTLLWS
LLYMCLYPDVQEKVQAEIDTVIGRDRPPSLTDKSQMPFTEATIMEVQRMTVVVPLSVPHM
ASESSVFHGYTIPKGSVVMANLWSVHRDPKVWEKPNDFMPKRFLDENGQILKKEAFIPFG
IGRRVCMGEQLAKMELFLMFVNLLQSFSFSLADDTFKPSLEGRFGLTLAPYPFDIKITKR
*
>DN060997.1
DR833173.1 DR842090.1 CF374775.1
best hit to CYP2S1 in ESTdb X.tropicalis H.Penmatsa, K. Iyer, G. Vasser
best match in human = 2C18 55%, 47% to 2S1 not a 2S ortholog
MEILGATAVLLVICAF
FLLLNTIQVIRRQGKGKLPPGPTPLPFLGNFLQLRGEEVFKSLLEFGKKYGPVYTIHLGM
EPVVVLCSFDIVKEALNDNGDEFGARGHMPLLEKISHGGHGVVASNGERWKQLRRFSLMT
LRNFGMGKRSIEERIQEEAHFLTNEFKYTKGQPVDP
TFYFSKAVSNVICSVVFGDRFEYEDTEFLRLLGLLNQVFRGFSSVWGQLYNIFPKV
MGKLPGPHNMIFKSVNSLQEFIMQRINMHQET
LDPSSPRDFIDCFLIKMQQEKDVPQTEFHMQGALNTTFDMFGAGTETVSTTLRYGLLILL
KHPDIEERIQKEIDSVIGRNRAPCIEDRSRMPYTDAVIHEIQRFVDIIPMGIPHKVTRDI
QFQGYFIPKGTTVYPMLSSVLHDPKQFKYPDIFNPGHFIDENGKFCKNDGFMPFSSGKRI
CVGEGLARMELFLFITTILQNYTLRSPVDTEDLDLTPELSGFGNIPRPYKLCFIPR*
>NM_001001212.2
51% to 2C18
MDWALEINGLPILLLIAALLLLLARKVGKKVKGCLPPGPKPLPI
LGNLLQLKSREIHKPLLEFNKKYGPVYTLYMGSMPAVVLCGYEAVKEALVDNAEKFSG
RAEVPIVNLTTQGYGIAFSNGERWKELRRFSLTTLRNFGMGKRSIEERIQEEIHFLLE
AFHETQGSFFSPAFIIRRSVSNVICSVVFGKRFDYTDQKLQILLDLIAENLRRVDNIW
VQVYNFIPKLLNILPGPHHKLTENYKAQLRYVEEIVQEHGKTLDPSAPQDYIDAFLLK
MEQERKKAHTEYNVQNLLSCSLDIFFAGQESTSSTLGYGLLILMKYPHIKEKVQAEIE
SVIGRSRRPCMDDRAKMPYTEAVIHEIMRFIDFFPLGVPHSVTEDTLYRGYVIPKGTT
IFPFLHSVLFDPSMFERPQEFYPGHFLNQDGSFRKNEGFMAFSAGKRACPGKSLARVE
IFLYLTSILQQFDPQPALSPKDIDLSPEYSGFGKMAPSFQLKLVPH
>NM_001005711.1
45% to 2C8
MEPLTIFLCLFIFLLLLFTWKTHKRRVQLPPGPYPLPLLGNVLQ
GITVLYDSYRKLSEQYGPVFTVWLGSTPMVVLCGYEVLKDALINHSQEFGARGAFPVP
ERLTDGYGVISTNGTRWQQLRRFSVTVLRNFGMGKRSMEERIHEETQHLIQAVQHTGG
EAFDPLYLLGRAVNNIINLIVFGRRWDYKDKMMIKLFNIINSILLFLRSPLGVIYSAL
YQIMQHLPGPHQKIFHDSETVKSFIREQINSHKETLDSDSPRDYIDCFLIKANQEKDH
HSSEFSQENLVNTVFDFFVAGTETATNTIQFSLLVIITYPHIQAQVQKEIDKVVGPDR
LPGIADRAQMPYTNAVIHEIHRFLDLVPLSLPHMATQDTVCRGFRIPKGTTVIPLIGS
ALCDPAHWETPEEFNPEHFLNQNGEFYIPPAFMPFSAGKRVCLGEGLARMEIFLFFTA
LLQKFTIRVANQTDTFNLRTLRRAFRKKGLFYQLRAMPRTCTVEK
>NM_001004777.1
(gap missing C-helix, 22 aa)
CX454308.2 69% to NM_001035117
MDRKQPYKTLMEVSKKYGSVFSVRVGPLKMVVLCGYDTVKDALLNYPDEFADRPALPLFD
ELVKGHGIIFSNGENWKVMRRFSLSTLRDFGMGKKTIESKIIEECDHLVQKFNSYGGKPFDNTM
IMNAAAANIIASILLSHRFHYENPTLLRLLKLVNENMRLMASPIALLYNTYPSIMRWV
PGCHKTIYNNAQELMEFIRETFSKHKVELDINDQRNLIDAFLSRQQEEKPHSAKYFHD
DNLTILVIDLFAAGMETTSTTLRWALLLMMKYPEIQKKVQDEIEKVIGSVEPRAEHRK
EMPYTDAVLHEIQRFANITPMNGPHATTKDVTFRGFFLPKGTYVIPLLASVLKDENYF
EKPNEFYPEHFLDSEGHFMKNEAFLPFSAGRRSCAGENLARMELFLFFTSLLQNFTFQ
APPGEELDLTPDVGGTVPPRPHTVCALPRS
>NM_001004878.1 66% to NM_001035117, 51% to 2K17 zebrafish
MDRKQPYKALLKVSKKYGPVCSFQIGPLKTVVLCGYDTVKDALLNDEFADRPAMPMLD
DVAKGHGILSSNGENWRVMRRFALSTLRDFGMGKKTIESKINEE
CDHLVQKFSSYGGKPFDTTMIMNAAVANIIASILLSHRFHYENPTLLRLLKLVNENTK
FMASRIAMLYNTFPSIMRWIPGCHKSIYKNAQELLEFIRETFSKQKVELDINDQRNLI
DAFLSRQQEPNSGKYFHDDNLTILVFDLFVAGMETTSTTLRWALLLMMKYPEIQKKVQ
DEIEKVIGSAEPRAEHRKEMPYTDAVIHEIQRFANIFPMNGPHATTKDVTFRGFLIPK
GTFVIPLLASVLKDENYFKKPNEFYPEHFLDSEGHFVKNDAFLPFSAGRRSCAGENLA
RMELFLFFTSLLQNFTFQAPPGEELDLTPDVGIATPPMQHTVCALPRA
>scaffold_21945:198-3026
1 aa diff to scaffold 55 (-) 506398-495841
198
EKVHDEISRVIGSAHPTYSHRTQMPFTNAVIHEILRFADIVPLSVPHETTRDVHFKGYFIPK
(0) 1158
2607 GTYIIPLLSSVLKDKTQFDAPEEFNPNHFLDSEGNFLKKEAFMPFSA (1) 2747
2847 GRRACPGEILARMELFIFFTSLLQKFSFRPPPGVTNINLSSDVGFTSVPLEGMICAIPRA
3026
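The numbers flanking each exon peptide in records like the one above appear to be nucleotide coordinates on the scaffold. As a rough consistency check, the span between the two coordinates can be compared with three times the peptide length. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes the coordinates are inclusive and cover only the complete codons that were translated (no stop codon and no split-codon bases), which is an assumption about this file's convention rather than something stated in it.

```python
def coding_span_length(start, end):
    """Length in nucleotides of an inclusive coordinate span on either strand."""
    return abs(end - start) + 1


def span_matches_peptide(start, end, peptide):
    """True if the span length equals 3 nt per residue of the peptide.

    Assumption: a trailing '*' (stop) is not counted as a residue.
    """
    residues = peptide.rstrip("*")
    return coding_span_length(start, end) == 3 * len(residues)


# Two exons from the scaffold_21945 record, with the coordinates given there.
EXON_A = (2607, 2747, "GTYIIPLLSSVLKDKTQFDAPEEFNPNHFLDSEGNFLKKEAFMPFSA")
EXON_B = (2847, 3026, "GRRACPGEILARMELFIFFTSLLQKFSFRPPPGVTNINLSSDVGFTSVPLEGMICAIPRA")

if __name__ == "__main__":
    for start, end, pep in (EXON_A, EXON_B):
        status = "consistent" if span_matches_peptide(start, end, pep) else "check coordinates"
        print(f"{start}-{end}: {len(pep)} aa, {coding_span_length(start, end)} nt, {status}")
```

A mismatch does not necessarily mean an error in the record: a coordinate may include intron sequence, a stop codon, or bases from a codon split across exons, so a flagged exon is only a prompt to look at the scaffold again.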
>scaffold_2219:41253-41363 (-) exon 2 partial
41363 LWKKYGSIFSVQIGSQKMVVLCGYETVKDALVNYAEE 41253
>scaffold_3861: 233-373 (+) exon 8, this scaffold has large gaps
100% to second exon 4-9 86% to 21818_prot
233 GTFVIPLLTSVLYDQTRFEKPKEFYPQHFLDSEGNFVKNEAFLPFSA 373
>scaffold_3861:8120-8284 (+) exon 2 this scaffold has large gaps
100% to 21818_prot
8120 LWKKYGSIFRVQIGSQKMVVLCGYETVKDALINHGEEFSERPRLPIFQVIANGY 8284
>scaffold_3433:6638-6781 (+) exon 6, 95% to DT436641.1
6638 AKHPETYSYFHNENLVRLVRNVFSAGVETTSTALRWALLLMIKYPDIQ 6781
>scaffold_3433:23829-23954 (+) exon 4
100% to Green = DT436641.1 75% to 21819_prot
23829 FVFSLGKPFDNTMIMNAAVANIIVSIVLGHRFDYQDPKFLRL 23954
>scaffold_2615 : 2-55 (+) exon 9
2 VGFTSVPLEGMICAIPRA 55
>scaffold_2899 : 33841-33948 (+)
94% to 49369_prot scaffold_996:232793-245538
33841 GTQVIPLLASVLQDETYFEKPEEFYPQHFLDSEGLF 33948
>scaffold_590 : 150217-150324 exon 8, 83% to $$$$$$8
150217 LWDKPYFEKPDEFYAQHFLDSEGNFVKNEAFLPFSA 150324
>14945_prot 55% to DN017333.1, 52% to 2C8
scaffold_1232:575-12890 (+) exon 3 partial
575
MAMDSAGTVLLAACVIVLFYLVKWRGNNKRKNLPPGPTAFPLLGNFLQVSTTEIPSSCVE 754
1024
LSKTYGPVFTLYLGGHRSIILIGYDAVKEALIDNSDVFSDRGEGGVSEMIFKNY 1185
3914
GVILSNGERWKTMRRFTLTTLRNFGMGKRSVEERIQEEARSLEEAFRKKK 4063
5417
DEPFDPIYLLGLAVSNIICSIIFGERFDYEDEKFMTLLMYIREFVKLLNSFFGM 5578
5936
LFNFFPNLFCYIPGPHQNIFTYFNKLKQFVKDEAKSHKDTLDANCPRDFIDCFLIRME 6109
8038
QEKNNPNSEFHYENLFGTILDLFLAGTETTSSTLRYAFLILLKYPEIQ 8181
9338
ENVYKEIVQVIGQHRYPSVEDRSKMPYTEAVIHEVQRIGDILPLGLEHAASKDTTFRGYDIPK 9526
11964 GTLIFPLLTSVLKDPKYFKNPDQFDPEHFLDENGCFKKNDAFMPFST 12104
12708
GKRVCAGEGLARMELFIFLTTILQKFILKSTVATEEIKITPEPNTNGSRPWPYKMFVVPRC 12890
>scaffold_16683:748-5473 = BC092552.1
MDPVSVLLSVVVCIFLFKVFYDGEKESQNFPPGPKPLPLIGNLHIINMEKPYLTFME
LAEKYGSVFSFHLGTEKVVVLCGTDAVRDALINHAEEFSGRPKVAIFDQIFKGH
GIIFADGENWKVMRRFSLSTLRDFGMGKKTIEEKISEESDCL
VETFKSHGGKPFDNTMIMNAAVANIIVALLLSQRFDYQDPTLLKLVKSINKIVRITGSSMVMLYNTF
PSIMQWIPGSHQNVVKNAEKIYTFLIETFTKHRHQLDVNDQRDLIDTFLIKQQEEKSSST
KFFHDENLKVLLLNLFGAGMETTSTTLRWGILLMMKYPEVQKKVQDEIDRVIGSAEPRLE
HQKQMPYTDAVIHEIQRFADLVPNNVPHATTKDVTFRGYFIPK
GTHVIPLLTSVLKDKDYFKKPNEFYPEHFLDSEGHFVKNEAFLPFSA
GRRICAGETLAKMELFLFFTNLLQNFTFQPPPGVEVQLTRGVAITSIPTEHKICALPRS*
>52542_prot scaffold_1232:27024-44511 (+)
MELGVTWSLILAVIVSFLVYSFTWRRKLRKINMPPGPPLYPLLGNMLQIS
AKEFPQSLVKLSEKYGTVFTVYLPSKPAVILSGYDCIKEALLDNNESFGA
RGESPLGYLLFKDYGVIFSNGERWKQLRRFSLSCLRDFGMGKKSIEERIQ
EEARCLVEELGKNGDTPMDPTYMLTLAVSNVICTVVFGERFDYKDEKFMT
LISLLKIVSRDFSSAWGIRSRRPRTRSCAQKLLNLFPNTLSRLPGPPQRL
FRNFDKLKAFVAESLKSHQETLNSDCPRDFIDCFLIKMEKEKNNPQTEFH
SDNLFGTVLDLFFAGTETTSITLKYSFLMLLKYTEVTRKAMEEIDNIIGQ
ERCPFYEDRIKMPYTNAVIHEIQRMADIVPLGVPHATTHDIIFRGYNIPK
DTIIFPLMTSVLKDPKYFNDPKQFDPAHFLDENGSFKKNDAFQPFSIGKR
SCLGEGLARMEIFLFITSILQAFNLKSDTAPQDIDITPEPDKNGAIPRTY
KMYFVPK
>14947_prot scaffold_1232:47102-62978 (+)
MAVLGIETLFLVCSFTFLVFLFSRRQRHARLPPGPTPLPLLGNVLQLDFSKQVKEFVKLGSQY
GPVSMVYLGPYPVLVLNGYDVVKEAFVDNGEVFSNRGKNAFIEMIFKGR
GVAFSNGERWRQMRRFSLSTLRDFGMGKRRVEERVQEE
ACALVEEFKKTKGTPFNSTYLMTLAVSNVICSVVFGERFDYQNETFLSVL
ALLKDTFKIITSPWTQLFSFAPGLLKHLPGPHKKAAENLDRLKTFVTEFV
ASHEETLEENFPRDYIDCFLIKMRQEKDNVNTEFDYENLFVTLMNLFFAG
TETTSITLQYGMLILLKYPDIQKKIHEEIDSVIGFNRCPSMEDRPKMPYT
DATIHEIQRFADIVPMGVPRSTNKDTTLRGYDIPKGTTVFPMLTCILKDP
RYFKDPESFNPCHFLDEKGCLKKTDAFIPFSIGKRVCLGEGLARMEIFLF
LTSILQRFELKCHMDPKDIDISPVPSKSAYMPRPYELYITPR
>52545_prot scaffold_1232:71267-83903 (+) short seq
exon 7 gap filled in by DT419848.1, missing exon 2
82% to 52547_prot scaffold_1232:122253-139910
71267 MDVAGLGTFLLVLITFILTLSSWNTMYKKVNLPPGPTPLPLIGNLMNIKKGKMVNSLMK
(0)
GLSFSNGERWRQMRHFTLKTLKNFGMGKKSIEEKIQEEALCLVEEIRKSG (1)
ETPVDPSKLIMDAVSNVFCSIMFGRRFEYNEEKFANLLTNVNEIFRLMSNTWGQ (0)
LESIFPSVMAYIPGPHKKKNTLSEELISFLHERVKSNQETFDPSAPRDFIDEYLMKIEQ (0)
EKKNPNSEFTMRNTLLTFFSIFLGGTETSTTTIKHGLLLLIKYPEIQ (1)
79425 AKLHMEIDHVIGRNRIVNINDRNAMPYMEAVINEIQRFSDIAPLNAPRKVTKDVQFRGYSIPK
(0) 79511
DTEIYPLLCTVHRDPKYFSSPYEFNPSHFLDEQGRFRKSEAMMAFSA (1)
GKRICPGESLARMELFLFFTTILQNFTLTSPTHFTEDDVAPKMAGFMNHPIQYKASFISR*
83903
>scaffold_1232:89713-89889 extra exon 1
MDVTGLGTILLVLISCVLIFSSWKTFYQKHNLPPGPTPLPLMGNLMNIKKGKLVSSLMK
>scaffold_1232
90% to 52547_prot
scaffold_1232:122253-139910, 54% to 2Q2
93883
MDVTGLGTILLVLISCVLIFSSWKTFYQKHNLPPGPTPLPLMGNLMNIKKGKLVSSLMK 94059
96262
LWEQYGAVYTLYFGTQPVIVLCGYDAVKEALVDQAEAFGARGKISSLDPVTQGY 96423
96988 GIGFSNGERWRQMRHFTLKALRDYGMGKKSIEEKIQEEALCLVEEFRKSG
97137
98027
EMPINPSTHIMKAVANIFFSIMLGNRFEYNNETFSALLATLEEMYTLMNNTWSQ 98194
99835
IENVLPKLMAYIPGPHKKRDALAKELILFFHERVKANQETFDPSAPRDFIDEFLIKMEQ 100014
101345 EKKNPNSEFTMRNILMTFFSIFIGGTETSTTTLKHGLLLLIKYPEIQ 101485
116449 AKLHMEIDNVIGRNRTVNLNDRNSMPYMEAVINEIQRFSDIAPLNLPRKVTKDVQFRGYCIPK
116637
119036 DTEIYPLLCTVHRDAKYFSSPYEFNPSHFLDEQGRFKKNDALMAFSA
119176
119933 GKRMCPGESLARMELFLFFTTILQNFTLTSPTHFTEDDVAPKMTGIINHPIQYKASFIA
120109
extra exons 7,8,8
113766 AKLHMEIDNVIGRNRTVNLNDRKFMPYMEAVIN 113864
115357 DTEIYPLLCTVHRDAKYFSSPYEFNPSHFLDEQGRFKKNDALMAFSA
115479
115792
DTEIYPLLCTVHRDAKYFSSPYEFNPSHFLDEQGRFKKNDALMAFSA 115932
>52547_prot scaffold_1232:122253-139910 (+) poor model revised
missing exons 2,3 found on DT436730.1
58% to 52548_prot scaffold_1232:145476-158239, 55% to 2Q2
MYVAGLGTILLVLISCVLIFSSWKTLYQKHNLPPGPTPLPLIGNLMNIKRGKLVSSLMK (0)
LWEQYGAVYTLYFGIQPVIVLCGYDAVKEALVDQAEDFGARGKISSLDPVTQGY
GLSFSNGERWRQLRHFTLKALRDFGMGKKSIEEKIQEEALCLVEEFRKSG
EMPTDPEKPIMKAVSNIFFTIVLGNRFEYNDETFSALLAKVEEMFRLMSNTWSQ (0)
IENVLPKLMAYIPGPHKKRDALGKQLILFLHERIKANQETFDPSAPRDFIDEFLIKMEQ (0)
EKKNPNSEFTMKNTLLTFYSIFLGGTETSTTTLKHGLLLLIKYPEIQ
AKLHMEIDNVIGRNRTANMIDRNSMPYMEAVINEIQRFSDIIPLNVPRKVTKDVQFRGYCIPK
DTEIYPLLCTVHHDAKYFSSPYEFNPSHFLDEQGKFKKNNAMMAFSA
GKRICPGESLTRMELFLFFTTILQNFTLTSPTHFTDNDVAPKMTGFINHPIQYKASFISR
>52548_prot scaffold_1232:145476-158239 (+) 67% to 2Q2
MDITGLGTLVLILLISCIVIYSTWNSMYRKRNLPPGPTPLPLIGNLLQIKRGEMVKSLTE
FGKQYGPVYTLYLGPRPVIVLNGYQAVKEALIDQGEEFSGRGKLVVADLIFGGF
GVVFSNGDRWKQLRRFSLMTLRDFGMGKRSIEERIKEEAQCLQVELHKYK (1)
QTPTDPQNILVQAVSNVICSVVFGNRFEYENSEFLKLLRLFNETFQMMSSTWGQ
LQQIIPFIMNYIPGPHQKIDKVVARQLEFVSERVKKNQETIDFNSPRDFIDCFLIKMQQ
ETQNPTSEFNLKNLLMTVLNLFVAGTETVSSTLRNGILLLLKYPHIQ
EKLHKEIDVVIGQNRSPNIDDRSKMPYMDAVIHEIQRFTDILPMNLPHSVIKDTAFQGYTIPK
DTDVYPMLCSVLRDPTQFTTPENFNPEHFLDDSGCFKKSDAFMPFST
GKRICLGEGLARMELFLFLTTILQNFTLTSETQITESDITPRMAGFANVPISYKVSFVPR
>4371_prot scaffold_1232:170631-190051 (+) = CYP2Q3
MDTTWLWSLQLFLLIATMLIYSTWNKMYRKRNLPPGPTPIPLFGNVLQIKRGEMVKSLLE
LGKKYGPVYTLYFGPSPVIILCDYQSIKEALNDQAEEFSGRGKIPSWDQFFQGY
GESFSNGDEWKQLRRFSLTTLRNFGMGKRGIEERIQEEAQFLVAEIKSYK
GKPFDPTKILVQCVSNVICSVVFGQRYEYSNKDFHKLLYMFQAVFEDTSSTLGQ
LMTLLPNIMNHIPGPHKTVVNKLNKVNDFILQRVKENEKTLDPNSPRHFIDSFLIQMQK
EKDNPVTKFHWKNLLCTIMNLFFAGTETVSTTLRHGFLMLLIHPEIE
EKLHEEIDRVVGQDRSPTIEDRSKMPYTDAVIHEIQRFSDVLPMSLPHLVMKDTQFRGYTIPK
GTDVYPLICAALRDPKQFATPNKFNPQHFLDDNGLFKSSNAFLPFST
GKRICLGEGLARMELFLFLTNILQNFKLHSENQFAEDDIAPKMNGFANYPLSYEFSLIPRVQSLLVL*
>4372_prot scaffold_1232:202751-216554 (+) = CYP2Q2
MDTSWLWTLLLCLLISAMLIYSTWNKMYRKRNLPPGPTPIPLFGNVMQ
IKRGEMVKSLIELGKKYGDIYTLYFGPSPVVILCSYRAIKEALIDQAEEF
SGRGAIPSFDQYFQGYGVVFTNGEEWKNLRRFSLSTLRNFGMGKRGIEER
IKEEAQFLVAEIKSYKEKPFDPTNILVQCVSNVICSVVFGNRFEYANKDF
QNLLSLFQSVFQETSSSWGQLLNMLPAVMNHVPGPHKNIIRDMNKLEDFV
LQRVKENEKTVDPNSPRDLIDSFLIKMQQENKNPTSPFHMKNLIATILSI
FFAGTETVSTTLRHGFLILLIHPEIEAKLQEEIDRVVGQNRSPTIEDRNK
MPYTDAVIHEIQRLSDVIPMNVPHLVTKDTKFRGYTIPKGTNIYPLLCAV
LRDPEQFDTPSKFNPNHFLDDKGCFKSNDGFMPFSTGKRICLGEGLARME
LFLFLTNILQNFKLHSESGLTEDNIAPKMKGFANYPTSYQLSFIPR
>21828_prot two genes fused scaffold_55:606485-680452
second part has some P450 exons poor model (revised)
upper part is rhesus blood group glycoprotein rhag
DT438894.1
603844
MDLVYSPSVCLLLATAVFIILYTLIDWARSSARNFPSGPLALPLIGHLHIINLKRPSEALNK 603659
602714 ISKTHGNIFRIQMGTVEMVVLAGYEAVKEALIDNAEAFAGRPFVPILDDIFHGY
602553
593338 GIPFSHGDNWKEMRRFTLSTFRDFGMGKRTIEDKIIEECGFLIKEIEVYK (1)
593189
590829 DEPVELKEFISVAVGNIISSIVLGHRFDNYQHPTLLRVLELVHENFRLLGSPSVI (0)
590665
588598 LYNIFPIMRFFPGDHKKIMKNLEELHCFLRETFLKHLKVLERDDQRGYIDAYLVRQLE
(0) 588425
586539 EKGNPKSYFHEQNLLSILATLFAAGTDTTIASIRWAISFMVKNPLIQ (1) 586399
584383
KRVHEEIDRVIGSSQPQFHHRKSMPYTNAVVHETQRVANVVPMNLPHATTRDINFRGYHLPK (0) 584198
581601 GTYIVPLLESVLFDKTQFERAEEFYPEHFLDSDGKFVMRPAFLPFST (1) 581464
GKRICIGETLAKMELFIFFTSLMQKFSFHPPPGDPNFDVKPAIGLTSPPLPRKLCIVPRS
>6348_prot scaffold_55:566754-603844 DT450622.1 83% to 21828_prot
578469
MDLVYSPSMCLLLAAVVFIILYTLIDWARSSARNFPSGPLALPLIGHLHIINLKRPSEALNK 578284
577359 ISKTHGNIFRIQMGTVEMVVLAGYETVKEALIDNAEAFAGRPFVPILDDIFHGY
577198
575649 GIPFSNGENWKEMRRFTISRFRDFGVGKRTMEDKITEESVCLIKEMEVLK 575500
575200 DEPVELTPYISVAVGNIIASIVLGHRFDDYKNPTLLRVLQLTSENLSYLGSPSVL
575036
574075 LYNVFPILRFFPGDRNKLLKNLKELHCFLRETFMKHLKVLERDDQRGYIDAFLVKQLE
573902
572545 EKENSNSYFHEKNLICILVSLFSAGTDTTIASIRWALTFMVKNPHIQ 572408
571008
QRVHEEIDRVIGSSQPQFHHRTSMPYTNAVVHETQRVANVVPMNLPHATTTDVNFRGYHLPK 570823
569487 GTYVVPLLESVLFDKTQFERAEEFYPEHFLDSDGKFVMRPAFLPFST 569347
566936
GKRICIGETLAKMEVFIFFTTLMQKFSFHAPPGEPDIEIKRGIGLTSPPLPQKLCIVRRS 566757
>scaffold_55 exon 7, same as NM_001015757.1
558700 MDFTFSLATYLVLVVTVLYILSNWKRKALNNFPPGPKGWPLVGNVFSIDLKKPQRTYIE
558524
550333
EKVHDEIARVIGSAHPTYSHRTQMPFTNAVIHEMLRFADIVPLSVPHETTRDVHFKGYFIPK 550148
>21822_prot 2 P450s fused scaffold_55:465695-533227 (-)
99% to DT436641.1
549227 GTYIIPLLTSVLKDKTQFDAPEEFNPNHFLDSEGNFLKKEAFMPFSA 549087
548988 GRRACPGEILARMELFIFFTSLLQKFSFHPPPGVTNINLSSDVGFTSVPLEGMICAIPRA
548809
>scaffold 55 partial seq 100% to DT436641.1
543754 LSKKYGPVFSVQMGRKKMVILVGYETVKDALVTHAEEFGGRASIPVNKNLEK 543599
542568 GMIFSNGENWKAMRRFTITTLKDFGMGKSTIEETIAHECSYLVQYFASFK 542419
542089 GKPFDDSTILITSVANIIVAILLGHRMEYEDPVFLRLVNLNSEYVKLLGSPMVT
541928
541086 IYNMFPALGFLPGCHKTIEKNIKELYAFVRRTFVEHQKHLDIHDQRSFIDAFLARQKEE
540910
540653 AKHPETYSYFHNENLVRLVRNLFSAGMETTSTALRWALLLMIKYPDIQ 540510
>DT436641.1 DT433530.1 DT443285.1 DN045517.1 S.Sarva 95% to NM_001015757
MDFTFSLATYLVLVVTVLYILSNWKRKALNNFPPGPKGWPLVGNVFSIDLKKPQRTYIE
LSKKYGPVFSVQMGRKKMVILVGYETVKDALVTHAEEFGGRASIPVNKNLEKGL
GMIFSNGENWKAMRRFTITTLKDFGMGKSTIEETIAHECSYLVQYFASFK
GKPFDDSTILITSVANIIVAILLGHRMEYEDPVFLRLVNLNSEYVKLLGSPMVT
IYNMFPALGFLPGCHKTIEKNIKELYAFVRRTFVEHQKHLDIHDQRSFIDAFLARQKEE
AKHPETYSYFHNENLVRLVRNLFSAGMETTSTALRWALLLMIKYPDIQEKVHDEISRVIGSAHPTYSHRTQMPFTNAVIHEILRFSDILPLGVPHETTRDVHFKGYFIPKGTYIIPLLSSVLKDKTQFDAPEEFNPNHFLDSEGNFLKKEAFMPFSA
GRRACPGEILARMELFIFFTSLLQKFSFHPPPGVTNINLSSDVGFTSVPLEGMICAIPRA*
>scaffold 55 (-) 100% to DT436641.1
539574 GMIFSNGENWKAMRRFTITTLKDFGMGKSTIEETIAHECSYLVQYFASFK
539425
539096 GKPFDDSTILITSVANIIVAILLGHRMEYEDPVFLRLVNLNSEYVKLLGSPMVT
538935
538098 IYNMFPALGFLPGCHKTIE
538042
>scaffold 55 (-) 96% to DT436641.1
532638 MDFTFSLATYLVLVVTVLYILSNWKRKALNNFPPGPKGWPLVGNVFSIDLKKPQRTYIE
532462
530541 LSKKYGPVFSVQMGRKKMVILVGYETVKDALVTHAEEFGGRASIPVNKNLEKGL
530383
527019 GITFSNGENWKAMRRFTITTLKDFGMGKSTIEETIAHECSYLVQYFASFK
526870
525259 GKPFDNSTILSTSVANIIAPILFGHRMEYEDPVFLRLVNLNSEYVKLLGSPMVT
525098
524043 IYNMFPALGFLPGCHKTIEKNLKELYAFVRRTFVEHQKHLDIHDQRSFIDVFLARQKE
523870
521860 EAKHPETNSYFHNENLVRLVRNVFSAGMETTSTALRWALLLMIKYPDIQ
521714
521136 EKVHDEISRVIGSAHPTYSHRTQMPFTNAVIHEILRFADIVPLSVPHETTRDVHFKGYFIPK
520951
519207
XXXXXXXXXXXLKDKTQFDAPEEFNPNHVLDSEGNFLKKEAFMPFSA 519100
519001
GRRACPGEILARMELFIFFTSLLQKFSFHSPPGVTNINLSSDVGFTSVPLEGMICAIPRA 518822
>scaffold 55 (-) 94% to NM_001015757.1 95% to DT436641.1
506398 MDFTFSLATYLVLVVTVFYILSNWKRKALNNFPPGPKGWPLVGNVFSIDLKKPQRTYIE
506222
504460 LSKKYGPVFSVQMGRKKMVILVGYETVKDALVTHAEEFGGRAYIPVNKDLEKGL
504299
500885 GITFSNGENWKAMRRFTITTLKDFGMGKSTIEEKITHECSYLVQYFAFSK 500736
500410 GKPFDNSTILITSVANIIVAILLGHRMEYEDPVFLRLLNLNSEYVKLLGSPMVT
500252
499245 IYNMFPALGFLPGCHKTIERNMKELYAFVRRTFVEHQKNLDIHDQRSFIDAFLARQKEE
499069
497715 AKHPETKSYFHNENLVRLVRNVFSAGVETTSTALRWALLLMIKYPDIQ 497572
497006 EKVHDEISRVIGSAHPTYSHRTQMPFTNAVIHEILRFADIVPLNVPHETTRDVHFKGYFIPK
496821
496260 GTYIIPLLSSVLKDKTQFDAPEEFNPNHFLDSEGNFLKKEAFMPFSA 496120
496020 GRRACPGEILARMELFIFFTSLLQKFSFRPPPGVTNINLSSDVGFTSVPLEGMICAIPRA 495841
>Green = DT436641.1
75% to 21819_prot
trace archive for gap
243598069 431692585 (both run into gap)
484940
MFLGDPVTVLLAVALCLIVAITLYRQKRDSSKNFPPGPKPLPIIGNIHNINLKRPYLTYLE 484758
481692 LWKKYGPIFRVQIGSQKMVVLCGYETVKDALVNYAEEFSERPVVPIFLDVVKEY
481531
seq gap
479226 GKPFDNTMIMNAAVANIIVSIVLGHRFDYQDPKFLRLMSLINENLRLTGSPTVM
479065
477865 LYNVFPSVMRWLPGNHQTVGKNAAENQRFIRETFIKHKEKLDVNDQRNLVDAFLVKQQE
477689
474586 KNGNAVYFHDDNLTMLVSNLFAAGMETTSTSVRWGLLLMMKYPEIQ
474449
470107
KNVQNEIEKVIGQSRPQTEHRKSMPYTDAVIHEIQRFGNIIPMNLPHATAQDVTFRGYFLPK 469922
466117 GTYIIPLLSSVLKDKTQFDAPEEFNPNHFLDSEGNFLKKEAFMPFSA
465977
465877 GRRACPGEILARMELFIFFTSLLQKFSFHPPPGVTNINLSSDVGFTSVPLEGMICAIPRA
465698
>$$$$2 100% to $$$$3 and 21819_prot
454570
MFLGDPVTVLLAVALCLIVANTLYRQKRDSYKNFPPGPKPLPIIGNIHNINLKRPYLTYLE 454388
>$$$$3 100% to $$$$2 and 21819_prot
449457 MFLGDPVTVLLAVALCLIVANTLYRQKRDSYKNFPPGPKPLPIIGNIHNINLKRPYLTYLE
449275
>6347_prot scaffold_55:435508-454570 (-) = first exon of seq below
join with scaffold_55:422403-435585 (-) between 6347 and 21819
84% to 21819_prot
duplicated exons 5 and 6
436471 LWKKYGPIFSVQIGSQKMVVLCGYETVKDALVNYAEEFSERPVVPIFLDAVKEY
436310
435614
GIIFSHGENWKVMRRFTLSTLRDFGMGRRTIEDRINEECDFLVEQFKSFK 435465
434353
GEPFENTMIMNAAVANIIVSIVLGHRFDYQDPIFLRLMSLINENIRLMGSPTVM 434192
432802
LYNVFPSVMRWLPGNHQTVGKNAAENRRFLRETFTKHRDKLDINDQRNLVDAFLVKQQ 432629
432003 EKNGNAVYFHDENLTMLVSNLFAAGMETTSTSVRWGLLLMMKYPEIQ
431863
430611 LYNVFPSVMRWLPGNHQTVGKNAAENRRFIRETFTKHRDKLDVNDQRNLIDAYLVRQQ
430438
429812 EKNGNAVYFHDDNLTVLVSNLFAAGMETTSTSVRWGLLLMMKYPEIQ
429672
428838
ENVQNEIEKVIGQSRPQTEHRKSMPYTDAVIHEIQRFGNIIPMNLPHATAQDVTFRGYFLPK 428653
427205 GTYVIPLLTSVLYDQTRFEKPKEFYPQHFLDSEGNFVKNEAFLPFSA
427065
426319 GKRSCAGENLAKMELFLFFTSLLQNFTFQAPPGEELDLTPAIGITTPPLPHNICALPRT
426143
>21819_prot parts of two genes long last intron has more exons
scaffold_55:413156-422625 (-) corrected gene model
81% to scaffold_55:314488-344970
422625
MFLGDPVTVLLAVALCLIVANTLYRQKRDSYKNFPPGPKPLPIIGNIHNINLKRPYLTYLE (0) 422443
421910
LWKKYGSIFSVQIGSQKMVVLCGYETVKDALVNHGEEFSERPEIPIFHVIAKGY
(1) 421749
420605
GVIFSHGENWKVMRRFTLSTLRDFGMGKKSIEDKINEECDSLVEKLRSY (1) 420456
419591
GKAFENSVTINAAVANIIVSLLLGRRFDYEDPTFLRLMSLMNANFRLMGSPMVM 419430
417270
LYNLYPSIIRWLPGSHKTVGKNAAETQRFIRETFTKRREKLDVNDQRNLIDAFLVRQQ 417097
416953
ETKEDGCSFHDDNLTVLVSNLFAAGMETTSSTLRWGLLLMMKYPEIQ (1) 416813
415727
KNVQNEIEKVIGQSRPQTEHRKSMPYTDAVIHEIQRFGNIIPMNLPHATAQDVTFRGYFLPK (0) 415542
414638 GTYVIPLLTSVLYDKDHFEKPNEFYPQHFLDSEGNFVRNEAFLPFSA
(1) 414498
413332
GKRSCAGENLAKMQLFLFFTSLLQNFTFQAPPGEELDLTPTTGFTTPPLLHNICALPRT 413156
394115
NFSFQAPPGEELDLTPTTGFTTPSLLHNICALPHT* 394233
394006
NFTFQAPPGEELDLTSTTGFTTPPLPHNICALPRT* 394114
393899 NFTFQAPPGEELDLTPTTGFTTPPLPHNICALPRT* 394005
393791
NFTFQAPPGEELDLTPTTGFTTPPLPHNICALPRT* 393898
>second exon 4-9 86% to 21818_prot CX463658.2
CR436794.1 CR426826.1
84% to 21819_prot, N-term from ESTs
MFLGDPVTVLLAVALCLIVANTLYRQKRDSYKNFPPGPKPLPIIGNIHNINLKRP
YLTYLELWKKYGPIFSVQIGSQKMVVLCGYETVKDALVNYAEEFSERPVIPIFLDAVKEY
GVIFSHGENWKVMRRFTLSTLRDFGMGRRTIEDRINEECDFLVEQFKSFK
393678 GKPFDNTMIMNAAVANIIVSIVLGHRFDYQDPIFLRLMSLINENVRLTGSPKAM 393517
391361 LYNVFPSVMRWLPGNHQTVGKNAAEYHRFIRETFTKYRDKLDINDQRNLVDAFLVKQQ 391188
390087
EKNGNAVYFHDDNLTVLVSNLFVAGMETTSTSVRWGLLLMMKFPEIQ 389947
373288 ENVQNEIEKVIGQSRPQTEHRKSMPYTDAVIHEIQRFGNIIPMNLPHATAQDVTFRGYFLPK 373103
GTFVIPLLTSVLYDQTRFEKPKEFYPQHFLDSEGNFVKNEAFLPFSA
369454 GKRSCAGENLARMELFLFFTSLLQNFTFQAPPGEELDLTPAIGITTPPLPHNICALPRT
369278
>scaffold_55 fragment of exon 5 same as 389947 exon 5
368850 TTSTSVRWGLLLMMKFPEIQ 368791
>21818_prot 72% to NM_001004878.1 82% to 21819_prot
scaffold_55:344864-351768 (-) CF344279.1
354925
MFLGDPVTILLAVVLCLIVANTLYRGKKDGVGNLLPGPKPLPIIGNIHILNLKKPYLTYLK (0) 354743
LWKKYGSIFRVQIGSQKMVVLCGYETVKDALINHGEEFSERPRLPIFQVIANGY
351804 GVAFSHGENWKVMRRFTLTALRDFGMGRRTIEDRINEECDFLVEAFKSYK 351655
350789 GKPFENLMILNAAVANIIVSIVFGHRFDYQNPTFLRLMRLINENARLLGSPTAM
350628
348627 LYNVFPSVMRWLPGSHKTLRKNVDEIKIFIRETFTKQRDKLDVNDQRNLIDAFLVKQQ
348454
347806 EKNGNGPYFHDENLTTLVNNLFSAGMETTSSTLRWGLLLMMKYPEIQ 347666
347063 KNVQNEIEKVIGQSRPQIEHRKSMPYTDAVIHEIQRFGNIIPMNLPHATAQDVTFRGYFLPK
346956
346172 GTYVIPLLTSVLYDQSHFEKPNEFYPQHFLDSEGNFVKNEAFLPFSA 346032
345043 GKRSCAGENLANMELFLFFTSLLQNFTFQAPPGEELDLTPGTGLSAPPLPHNICALPRT
344867
>scaffold_55:314488-344970 81% to 21819_prot
339702
MFLGDPVTLLLAVVLSLIVANTLYRKERVNVQNFPPGPKPLPIIGNIHNINAKRPYLTYLE (0) 339520
337426
LWKKYGSVFSVQIGSQRMVLLCGYETVKDALVNHAEEFSDRPIIPLFHEITKGN 337265
333747
GVVFANGENWKVMRRFTILALRDFGMGRRTIEYRINEECDFLVEKIKSYRG 333595
333068
GEPFENTMIMNAAVANIIVSILLGHRFDYQDPTILRLLSLINQSVKITGSPMVM 332907
331666
LYNMFPSVMRWLPGSHKTLAINVAEIQSFIRETFTKYRDKLEINDQRNLIDAFLVKQQE 331490
330183
NKENGLYFHDDNLTMLVSNLFTAGMETTSSTLRWGLLLMMKYPEIQ 330046
328774
ENVQNEIEKVIGQSRPQTEHRKSMPYTDAVIHEIQRFGNIIPMNLPHATAQDVTFRGYFLPK 328589
327361 GTFVIPLLMSVLYDQSHFENPNEFYPQHFLDSEGNFVKNEAFLPFSA
327221
321805 GKRSCAGENLARMELFLFFTSLLQNFTFQAPPGEELDLTPGTGLSAPPSPYKICALPCS
321629
>21816_prot 77% to NM_001035117 correct seq 80% to 21810_prot
scaffold_55:303150-314597 (-)
314597 MDPISILLSIAVCVFLLNLFYGGKGDSKMFPPGPKPLPLIGNLLIMNMKKPHLTFME
(0) 314427
314228 LAEKYGSVFSVQLGTEKVVVLCGTDAVKEALINHADEFSERPKIPIFEDVSKGY
314067
312244 GLIFSHGENWKVMRRFTLTTLRDFGMGKKTIEERICEESDCLVEAFKSYK 312095
310744 GKPFENTLIMNAAVANIIVSILLGHRFDYQDTALLKLIKIINENVRLMGSPMVM
310583
308441 LYNTYPSVMQWLPGKHKTVAENTLKLFKFLEETFTKHRDQLDVNDQRDLVDTFLVKQQE
308265
307774 EKPSSSKFFHDQNLTLLVSNLFGAGMETTSTTLRWGLLLMMKYPDIQ 307634
306167
KKVQDEIDKVIGSAEPQTEHRKLMPYTDAVIHEIQRFANIAPSNLPHATTTDVTFRGYFIPK 305985
304474 GTQVIPLLTSVLQDKNYFKKPEEFYPEHFLDSEGHFMKNEAFLPFSA 304334
303329 GRRSCAGETLAKMELFLFFTKLLQNFTFQPPPGVEVQLTSGEGFTSSPLQHNICALPRT
303153
>21813_prot scaffold_55:257853-278655 (+) 90% to 21811_prot
part of exon 2 in seq gap, trace archive for exon 2 586458683
262853 MDPVSVLLSVVVCIFLYKVFYGGKERPENFPPGPKPLPLIGNLHIMNMRKPHLTFME
(0) 263023
265029 LAKTYGSVFSFQLGLEKIVVLCGTDTVKDALINHAEEFSERAKIPVFEDIAKGH
269321 GIVFAHGENWKVMRRFTLSALRDFGMGKKTIEDKICEESDCLVETFKSYN 269470
270333 GKPFDNTFILNSAAANIIVTILLGDRFDYKDPKMLNLIKVVNQNMRIGGGFMVR
270494
273263 LYNTYPTIMRWIPGSHQTVSKNVATIFKFLNETFTEHRKVLDVNDQRDLIDAFLVKQQE
273439
274172 EELSSKKFFYNQNLTVLVTNLFAAGMETTSTTLRWGLLLMMKYPEIQ 274312
276404
KKIQKEIDQVIGSAQPRLEHRKQMPYTDAVIHEIQRFANIAPINIPHETTQDVTFRGYFIPK 276589
277716 GTQVIPLLASVLRDKAYFKKPEEFYPEHFLDSEGNFVKNEAFLPFSA 277856
278476 GKRSCAGETLAKMELFLFFTKLLQNFTFQPPPGVEVQLTCGVALTSIPLDHKICALPRS
278652
>scaffold_55:287553-290430 exons 1-3 (+) 89% to 21811_prot
21815_prot exons 4-8 missing exon 9 scaffold_55:291301-297995 (+)
287561 MDPVSVLLSVVVCIFLYKVFYGGEKESQNFPPGPKPLPLIGNLHIMNMRKPHLTFME
(0) 287731
289748 LAKTYGSVFSVQLGLRKTVVLCGADTVRDALINHAEEFSERARIPVFEDITKGHG
289912
290242 GIVFAHGENWKVMRRFTLSTLRDFGMGKKTIEDKICEESDSLVEIFKSYN
290391
291900 GKPFDNTLILNSAVANIIVTILLGDRFDYKDPTLLKLVKVVNQNIRIGGGFMAR
292061
292702 LYNIYPSVMRWIPGDHKTVFKNIAKVYKFLNKTFTEHRKVLDVNDQRDLIDAFLVKQQE
292878
294169 EKLSSKKFFHNQNLTVLVANLFAAGMETTSTTLRWGLLLMMKYPEIQ 294309
295416
KKIQEEIDRVIGSAEPRLEHRKLMPYTDAVIHEIQRFANIAPNNVPHETTQDVTFRGYFIPK 295601
296551 GTQVIPMLTSVLRDKAYFKKPEEFYPEHFLDSEGKFVKNEAFLPFSA 296691
297892 GRRSCAGETLAKMELFLFFTKLLQNFTFQPPPGVEVQLTCGVAMTSIPLYHNIC
ALSRS* 298071
>21812_prot scaffold_55:242882-252687 (-) 90% to 21809_prot
DN029946.1 fills seq gap
252687 MFSFEPITLFMAIVICLLIYLVYGGKGTPPNFPPGPKPLPLIGNLHIMNLKKPYMTLME
(0) 252511
250984 LGKKYGSVFSVQLGTEKVVVLCGYDAVKDALINHAEEFSDRPIIEAFHRRSNGH
250823
250732 GITFSHGENWKVMRRFTLATLRDFGMGKRTIEDKINEECISLVETFQSYK 250583
GEPFENSLILNAAVANIIVSILLGHRFEYQDPTLLKLIRLINEIARILGTPIVM
LYNAYPSVMRWLPGSHHNVEKNTQKSHTFI
247704 KETFAEHKAQLDINDQRDFIDAFLIKQSE 247618
245612 EKSATGRFFHNENLVSLVDSLFSAGMETTSTTLRWSLMLMMKYPEIQ 245472
245315
KKVQEEIDKVIGSAQPQMEHRKQMPYTDAVIHEIQRFADIVPTNLPHSTTKDVTFRGYLIPK 245130
243475 GTQVIPLLTSVLRDKAYFERPYEFYPQHFLDSEGNFVKNEAFIPFSA 243335
243067 GKRSCAGETLAKMELFLFFTKLLQNFTFQSPPGQDLHLTPLVGFTSAPMVHKICALSRTLD*
242882
>21811_prot scaffold_55:216779-238524 (+) 90% to 21813_prot
78% to NM_00103511
223244 MDPVSVLLSVVICIFLYKVFYGGKETSKNFPPGPKPLPLIGNLHIMNMKKPHLTFME
(0) 223414
225164 LAEKYGSVFSFEFGLRKTVVLCGTDTVRDALINHAEEFSERARIPVFEDITKGH
(1) 225325
225574 GIVFAHGENWKVMRRFTLSTLRDFGMGKKTIEDKICEESDCLVEIFKSYN
(1) 225723
227389 GKPFDNTLIMNSAVANIIVTILLGDRFDYKDPTMLKLVKVVNQNIRITGGLMAR
(0)227550
230255 LYNIYPSIMRWIPGSHQTVSKNMAKVFKFLNETFTEHRKQLDVNDQRDLIDAFLVKQRE
(0) 230431
232470 EKLSAKTFFHNDNLTVLVTNLFGAGMETTSTTLRWGLLLMMKYPVIQ
(1) 232610
234765 KKVQKEIDQVIGSAQPRLEHRKQMPYTDAVIHEIQRFANIAPINIPHETTQDVTFRGYFIPK
(0) 234947
236825 GTQVIPVLTSVLQDKAYFKKPEEFYPEHFLDSEGKFVKNEAFLPFSA
(1) 236965
238345 GKRSCAGETLAKMELFLFFTKLLQNFTFQPPPGVEVQLTCGVALTSIPADHKICALPLS
238521
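The exon listings above put a scaffold coordinate on either side of each translated piece. As a rough consistency check, and assuming those flanking numbers are inclusive start and end positions of the translated span (an assumption, not something the notes state), the span should be three nucleotides per residue shown. The helper name `exon_span_matches` is hypothetical.

```python
def exon_span_matches(start: int, end: int, peptide: str) -> bool:
    """Return True if the inclusive coordinate span equals 3 nt per residue.

    Only the absolute distance is used, so plus and minus strand
    listings are handled the same way. Treating start and end as
    inclusive coordinates is an assumption about the notation.
    """
    span = abs(end - start) + 1
    return span == 3 * len(peptide)


# Exon 1 of 21811_prot as listed above (plus strand).
exon1 = "MDPVSVLLSVVICIFLYKVFYGGKETSKNFPPGPKPLPLIGNLHIMNMKKPHLTFME"
print(exon_span_matches(223244, 223414, exon1))
```

A mismatch does not by itself mean an error: a listed piece may omit split codons at a phase 1 or phase 2 boundary, or include a stop codon.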
>21810_prot scaffold_55:188843-209598 (-) 80% to 21816_prot
209598 MDPVSVLLSVVVCIFLFKVFYGGKRTLENFPPGPKPLPLIGNLHMMNMKKPHLTFME
(0) 209428
207937 LAEKYGSVFSVHLGTEKVVVLCGTDTVRDALINHAEEFSERAKMPIFEDFSKGL
(1) 207776
206533 GVVFGHGENWKVMRRFTLSTLRDFGMGKKTIEERISEESDCLVETIKSYE
(1) 206384
205040 GKPFDNTLIMNAAVANIIVHILLNHRFDYQDPTLLKL
LINIVIDNIKIGGSPIVM 204879
200634
LYNTYPSVVRWIPGSHKTLGENTAQLYKFLEETFTQHREQLDVNDQRDLIDAFLVKQQE 200458
198405 EKPSSAKFFHNENLVALLANLFVAGMETSSTTLRWGLLLMMKYPDIQ 198265
192757
KKVQDEIDKVIGSAEPRLEHRKLMPYTDAVIHEIQRFANIAPISLPHATTTDVTFRGYFIPK (0) 192572
191365
DTQVMIVLTSVLQDKDYFKKPEEFYPEHFLNSKGNFVKNEAFLPFSA (1) 191222
189019
GRRRICAGETLAKMELFLFFTKLLQNFTFQPPPGVEVDLTCADAMTSKPQEHQICALPRG* 188843
>21809_prot bad model green parts OK 90% to 21812_prot
77% to NM_001035117
(lower case)
scaffold_55:163837-187896 (-)
184541
MVSFEPITLFLAIVICLFLIYLVYGGKGTPPNFPPGPKPLPLIGNLHIINLKKPYMTFME (0) 184362
173558 LGKKYGSVFRVQLGTEKVVVLCGYDAVKDALINHAEEFSDRPIIETFHRRSNGH
173308 GITFSHGENWKVMRRFTIATLRDFGMGKRTIEDRINEECHSLVETFQSYK
173159
171488 GEPFETNLIMNAAVANIIVSILLGHRFEYQDPTLLKLIGLSNEMVRILGSPIVL
171339
169346 LYNAYPSVMKWLPGSHHNVIKNTQKSHTFIKETFTEHKAQLDINDQRDFIDAFLAKQSE
169170
167042 KKPNPGLFFHNENLVSLVDGLFVAGMETTSTTLRWGLLLMMKYPEIQ 166899
166538 KVQDEINKVIGSAQPQTEHRKQMPYTDAVIHEIQRFADIIPANLPHATTKDVTFRGYFIPK
166356
164553 GTQVIPMLTSVLRDKDYFERPYEFYPQHFLDSEGNFVKNEAFLPFSA
164413
164016 GKRSCAGETLAKMELFLFFTNLLQNFTFQPPPGQDLNLTTTGGFTSIPMVHKICALSRN
163840
>21808_prot New exons 3-5 scaffold_55:152750-158460 (+) no ESTs 83% to 21810_prot
exon 7 decaying, probable pseudogene
154072 MDPVSVLLSVVICIFLYKIFYGGKETPENSPPGPKPLPLIGNLHMINMKKPHLTFME
(0) 154242
seq gap
155983 GIVFAHGENWKVMRRFTLSTLRDFGMGKKTIEDRISEESDCLVGVFKSYE
(1)156132
156898 GKPFDNTMIMNAAVANIIVHILLNHRFEYQDPTLLKLIKIVSENIRIGGSPIVM (0) 157059
157308
LYNTYPSIMRWIPGRHKTVGANTAKLYDFLKETFTRHREHLDVNDQRDLIDVFLVKQQE (0)157484
158540 KKLSSTKFFHDENLTVLLGNLFGAGMETTSTTLRWGLLLMMKYPEVQ
158680
159012 LYNAFPSVMGWLPGRQQRLFENSQTFHESI KHKSQLDISDQRDLL 159147
>scaffold_55: 150542-150360 (-) 100% to NM_001004878.1
150542
MLAADPMTILLSAFICLLLGFVLFGRKRNVCQNFPPGPRPLPVIGNLLLMDRKQPYKALLK (0) 150360
>NM_001004878.1 66% to NM_001035117, 51% to 2K17 zebrafish
21807_prot (extra N-term piece) P=Q in browser
scaffold_55:119438-130135 (-)
130135 MLAADPMTILLSAFICLLLGFVLFGRKRNVCQNFPPGPRPLPVIGNLLLMDRKQPYKALLK (0) 129953
126756 VSKKYGPVCSFQIGPLKTVVLCGYDTVKDALLNDEFADRPAMPMLDDVAKGH 126601
124037 GILSSNGENWRVMRRFALSTLRDFGMGKKTIESKINEECDHLVQKFSSY 123891
123551 GKPFDTTMIMNAAVANIIASILLSHRFHYENPTLLRLLKLVNENTKFMASRIAM
123390
123231 LYNTFPSIMRWIPGCHKSIYKNAQELLEFIRETFSKQKVELDINDQRNLIDAFLSRQQE
(0) 123055
122583 PNSGKYFHDDNLTILVFDLFVAGMETTSTTLRWALLLMMKYPEIQ 122449
121271
KKVQDEIEKVIGSAEPRAEHRKEMPYTDAVIHEIQRFANIFPMNGPHATTKDVTFRGFLIPK 121089
119957 GTFVIPLLASVLKDENYFKKPNEFYPEHFLDSEGHFVKNDAFLPFSA 119817
119617 GRRSCAGENLARMELFLFFTSLLQNFTFQAPPGEELDLTPDVGIATPPMPHTVCALPRA 119441
>exon 1 with frameshift at end same as 21811_prot
115608 MDPVSVLLSVVICIFLYKVFYGGKETSKNFPPGPKPLPLIGNLHIMNMKKPHLTX
115769
115769 ME (0)115774
>$$$$$ 93% to NM_001004777
112992
MLAADPMTILLSAFICLLLGFVLFGRKRNVCQNFPPGPRALQVIGNLLLMDRRQPYETLIE (0) 112810
>NM_001004777.1 (gap missing C-helix, 22 aa) CX454308.2 69% to NM_001035117
scaffold_55:85652-94653 (-)
94653 MLAADPMTILLSAFICLLLGFVLFGRKRNVCQNFPPGPRALPVIGNLLLMDRKQPYKTLME (0) 94471
93921 VSKKYGSVFSVRVGPLKMVVLCGYDTVKDALLNYPDEFADRPALPLFDELVKGH 93760
GIIFSNGENWKVMRRFSLSTLRDFGMGKKTIESKIIEECDHLVQKFNSYG
91987 GKPFDTTMIMNAAAANIIASILLSHRFHYENPTLLRLLKLVNENMRLMASPIAL
91826
89951
LYNTYPSIMRWVPGCHKTIYNNAQELMEFIRETFSKHKVELDINDQRNLIDAFLSRQQE 89775
88013
EKPHSAKYFHDDNLTILVIDLFAAGMETTSTTLRWALLLMMKYPEIQ 87873
87405
KKVQDEIEKVIGSVEPRAEHRKEMPYTDAVLHEIQRFANITPMNGPHATTKDVTFRGFFLPK 87220
86157
GTYVIPLLASVLKDENYFEKPNEFYPEHFLDSEGHFMKNEAFLPFSA 86017
85828 GRRSCAGETLARMELFLFFTSLLQNFTFQAPPGEELDLTPDVGGTVPPRPHTVCALPRS
85652
>scaffold_55
82615
GRRSCAGKTLAKMKLFLFFTSILQNFTFQAPPGVEPDLTPAISGTRTHKPHTVCALPRA 82439
>$$$$$4 95% to NM_001004777.1
78951 MLAADPMTILLSAFICLLLGFVLFGRKRNVCQNFPPGPRALPVIGNLLLMDRKQPYKTLME (0) 78769
78170
VSKKYGPIFSVRAGPQKMVVLCGYDTVKDALLNYPDEFADRPALPLFDEVVKGH 78009
76552
GIFFSNGENWKVMRRFGLSALRDFGMGKKTIESKINEECDHLVQKFNSYG 76403
75689
GKPFDTTMIMNAAAANIIASILLSHRFQYENPTLLRLLKLVNENIRLMASPIAL 75528
74153
LYNTYPSIMRWVPGCHKTIYKNAQELMEFIRVTFSKHKAELDINDQRNLIDAFLSRQQE 73977
67696
EKPHSAKYFHDDNLTILVFDLFAAGMETTSTTLRWALLLMMKYPEIQ 67556
67082
KKVQDEIEKVIGSVEPRAEHRKEMPYTDAVLHEIQRFGNITPMNGPHATTKDVTFRGFFLPK 66897
69093
GTYVIPLLASVLKDENYFEKPNEFYPEHFLDSEGHFVKNEAFLPFSA 68953
68767
GRRSCAGETLARMELFLFFTSLLQNFTFQAPPGEELDLTPDVGGTVPPRPHTVCALARS 68591
>$$$$$5
Note frameshift in exon 1
55169 MDPVSVLLSVVVCIFLYKVFYGGKEASQ 55086
55084 NFPPGPKPLPLIGNLHMMNMKKPHLTFME 54998
53620
FSKKYGPVFSIQLGLNKAIVLCGADAVKDALINHGDEFSGRPKIPVFDQISKGY 53459
52239 GVVFADGENWKVMRRFALSTLRDFGMGRKTIEDTIVEEXXXXXXXXXXXX
52126
51713
AKPFDNTLILNAAVANIIVHILLNHRFEYQDPTLIKLIKSVSENVKIAGSPIVM 51552
50894
LYNTYPSIMGWIPGSHKTVFENFQKLSNFLKETFTKRRDQLDVNDQRDLIDAFLVKQQE 50718
50601
LALQFQEKSSSKKFFHDENLKVLLGDLFAAGMETTSTTLRWGILMMMKYPDIQ 50443
49280
KKVQDEIDRVIGSAEPRLEHRKQIPYTDAVIHEIQRFANLVPIVLPHSITEDVTFRGYFLPK 49095
48788
GTQVIPLLISVMQDKDYFQKPEEFYPEHFLDSKGNFVKNEAFLPFSV 48648
48515
GKRSCVGETLAKMELFLFFTKLLQNFTFQPPHGVEVQLTCGDALTSIPLDHKICALPRS 48339
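The frameshift noted in exon 1 of $$$$$5 is visible in the coordinates: the first piece ends at 55086 and the second begins at 55084, so one base is skipped. A minimal sketch of that arithmetic follows, assuming the numbers are inclusive scaffold positions; the function names are hypothetical, and the check is only meaningful for two pieces of the same exon, not across an intron.

```python
def skipped_bases(prev_end: int, next_start: int) -> int:
    """Number of bases lying strictly between two adjacent coding pieces."""
    return abs(next_start - prev_end) - 1


def breaks_frame(prev_end: int, next_start: int) -> bool:
    """True if the skipped stretch is not a multiple of 3 nt."""
    return skipped_bases(prev_end, next_start) % 3 != 0


# The two pieces of exon 1 in the record above (minus strand).
print(skipped_bases(55086, 55084))
print(breaks_frame(55086, 55084))
```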
>$$$$$$6 nearly identical to $5
37714
GVVFADGENWKVMRRFALSTLRDFGMGRKTIEDTIVEESGCLVETFKSHE 37565
37222
GKPFDNTLILNAAVANIIVHILLNHRFDYQDPTLIKLIKSVSENVKIAGSPIVM 37061
36400
LYNTYPSIMGWIPGSHKTVFENFQKLSNFLKETFTKRRDQLDVNDQRDLIDAFLVKQQE 36224
36089
EKSSSKKFFHDENLKVLLGDLFAAGMETTSTTLRWGILLMMKYPDIQ 35949
34787
KKVQDEIDRVIGSAEPRLEHRKQIPYTDAVIHEIQRFANLVPIVLPHSITEDVTFRGYFL K 34608
34295
GTQVIPLLISVMQDKDYFQKPEEFYPEHFLDSKGNFVKNEAFLPFSV 34155
34022
GKRSCVGETLAKMELFLFFTKLLQNFTFQPPHGVEVQLTCGDALTSIPLDHKICALPRS 33846
>$$$$$$7 duplicate exons to $8
29803
LAKKYGPVFSVQLGTKKTVVLCGTDAVKDALINYADEFSGRPKTPLSEQASKGN 29642
28967
GIIFANGENWKVMRRFTLSTLRDFGMGKKTIEDRISEESDCLVETFKSHKGR 28812
28033
GKPFDNTLILNAAVANIIVHILLNHRFDYQDPTFLKLIKSVNDNVRNGARPIIVVSKLWP 27854
>$$$$$$8 no ESTs
26641
LAKKYGPVFSVQLGTKKTVVLCGTDAVKDALINYADEFSGRPKTPLSEQASKGN 26480
25807
GIIFANGENWKVMRRFTLSTLRDFGMGKKTIEDRISEESDCLVETFKSHKGR 25652
22131
LYNAFPSIIRWIPGTHKRIFASSQNFFNFLKEIFMKRKDQLDVNDQRDLVDAFLVKQQE 21955
21874
EKSSSTKFFHDENLKVLIGNLFGAGMETTSTTLRWGILLMMKYPEIQ 21734
20135
KKVQDEIDRVMGSTEPRPEHRKQMPYTDAVIHEIQRFADLVPNGVPHATTTDVTFRGYFIPK 19950
19461
GTQVFPLLTSVLRDKAYFKKPDEFYPEHFLDSEGNFLKNEAFLPFSAG 19318
>21803_prot scaffold_55:62-7002 (-) 84% to seq at 28033
DN017398.1 DT401910.1 DN087618.1 DN099678.1 DN087299.1
Seq completed by ESTs
49361_prot scaffold_996:1053-7259 same seq as 21803_prot
scaffold_55:62-7002
7002 MDPVSVLLSVVVCIFLFKFFYGGEKGSQNFPPGPKPLPLIGNLHMINMKKPYLTFME (0)
6832
6071 LAEKYGPVFSVHLGANKAVVLCGTDAVKDALINYADEFSGRPKTPLFEQTFKGN (1)
5910
4393 GIVFADGENWKVMRRFTISTLRDFGMGKKTIEDRIIEESCCLVETFKSHK (1) 4244
2832 GKPFDNTMILNAAVANIIVHILLKHRFEYQDPTLLKLIKGVNENVRNGARPIVM (0)
2671
LYNAFPSIIQWIPGTHKRIFANTQNFFNILKEIFIEHRDQLDVNDQRDLIDTFLVKQQE
EKSSSTKFFHDENLKVLIGNLFAAGMETTSTTLRWGILLMMKYPEIQ
661
KKVQDEIDRVIGSAEPRLEHRKLMPYTDAVVHEIQRFANLVPNGLPHATTTDVTFRGYFIPK (0) 476
GTQVIPLLTSVLRDKAYFKKPEEFYPEHFLDSKGNFLKNEAFLPFSA
GKRTCAGETLAKMELFLFFTKLLQNFTFQPPPGVEVQLTRGVSLTSIPLDHKICALSRS*
The cluster of 25 P450 genes on scaffold 55 continues on scaffold 996, upstream of 21803_prot.
One side of this cluster has genes that are homologous to chr 6p21 in humans. The other side
of the cluster has a methylmalonyl CoA mutase (MUT) gene, also on 6p21. There are no P450 gene
clusters in humans on chr 6, but CYP21A2 is at 6p21. The CYP21A2 gene is at 32 Mb and the MUT
and rhag genes are at 49.5 Mb, so they are not in a syntenic region.
>49362_prot scaffold_996:115436-122968
86% to 21803_prot scaffold_55:62-7002
MDLVSVLLSVVVCIFLYKVFYGGEKESQNFPPGPKPLPLIGNLHIMNMKK
PFLTFMELAEKYGPVFSVQLGTKKVVVLCGTDAVKDALVNHADEFSGRPK
IPMFDQTSKGHGVIFADGENWKVMRRFTLSTLRDFGMGKKTLEDRIGEES
GCLVETFKSHEGKPFDNTLILNAAVANIIVHILLNHRFDYQDPTLLKLIK
SVSENVRIGGRPIVMLYNTYPSIMQWVPGSHKSIYENSQNLLNFLKETFT
EHRHQLDVNDQRDLIDTFLVKQQEEKSSSTKFFHDENLTILLSNLFGAGM
ETTSTTLRWGILLMMKYPDIQKKVQDEIDQVIGSAEPRLEHRKQMPYTDA
VIHEIQRFANLAPNGLPHATTTDVTFRGYFIPKGTQVIPVLTSVLRDKAY
FKKPEEFYPEHFLDSEGKFLKNEAFLPFSAGKRICAGETLAKMELFLFFT
KLLQNFTFQPPPGVEVQLTCGDAITSIPLDHKICALSRS
>49364_prot scaffold_996:134740-168381 poor model, missing exons 6,7
same seq as:
NM_001035117 CYP2 family member, 50% to 2K21 zebrafish
from refseq database
83% to 49362_prot scaffold_996:115436-122968
MDPVSVLLSVVVCIFLFKVFYDGEKESQNFPPGPKPLPLIGNLHIINMEKPYLTFME
LAEKYGSVFSFHLGTEKVVVLCGTDAVRDALINHAEEFSGRPKVAIFDQIFKGH
GIIFADGENWKVMRRFSLSTLRDFGMGKKTIEEKISEESDCLVETFKSH
GGKPFDNTMIMNAAVANIIVALLLSQRFDYQDPTLLKLVKSINKIVRITGSSMVM
LYNTFPSIMQWIPGSHQNVVKNAEKIYTFLIETFTKHRHQLDVNDQRDLIDTFLIKQQE
EKSSSTKFFHDENLKVLLLNLFGAGMETTSTTLRWGILLMMKYPEVQ
KKVQDEIDRVIGSAEPRLEHQKQMPYTDAVIHEIQRFADLVPNNVPHATTKDVTFRGYFIPK
GTHVIPLLTSVLKDKDYFKKPNEFYPEHFLDSEGHFVKNEAFLPFSA
GRRICAGETLAKMELFLFFTNLLQNFTFQPPPGVEVQLTRGVAITSIPTEHKICALPRS
>4055_prot
scaffold_996:168592-176929
14029_prot scaffold_996:176841-181757 join these two
89% to 49367_prot scaffold_996:195103-207745
172383 MDLVSVLLSVVICIFLYKVFYGGEKESQNFPPGPKPLPIIGNFHMINMKKPHLTFME
172553
172634 LAKKYGSVFSIQLGPEKLVVVCGADAVKDALVNHADEFSARPTIPVFDKTSKGH
172795
174055 GVFFANGENWKVMRRFTLSTLRDFGMGKKTIEDRICEESDFLMETFKSYK
174204
174922 GKPFDNTMIMNAAVANIIVHILLNHRFDYQDPTLLKLINIVSENISIAAKPIVL 175080
176775 LYNAYPSIMEWVPGTHKSVAENMLKLYNFLRETFTQHRDQLDVNDQRDLIDVFLVKQQE
176951
177972 EKPSSTKFFNDQNLTVLLADLFGAGMETTSTTLRWGLLFIMKYPDIQ 178112
179041
KKVQDEIDKVIGSAQPRLEHRKKMPYTDAVIHEIQRLGNLAPNVGHETTTDVTFRGYFIPK 179223
180149 GTQVIILLTSVLQDKDYFKKPEEFYPEHFLDSEGNFVKNEAFLPFSA 180289
181578 GRRICVGETLAKMELFLFFTKLLQNFTFQPPPGVEVDLTCADAITSKPLEHQICALPRS*
181757
>49367_prot scaffold_996:195103-207745
80% to 49362_prot scaffold_996:115436-122968
MDLVSVLLAVVICFFIFKVFYGGKNAFQNFPPGPKPLPIIGNFHMINMKKPYLTFME
LAEKYGPVFSIQLGTEKVVVLYGADAVKDALINHGDEFSGRPTIPVFDRISKGH
GLFFANGENWKVMRRFTLSTLRDFGMGKKTIEDRICEESDFLMETFKSYK
GKPFDNTMIMNAAVANIIVHILLNHRFDYQDPTLLKLINTISENVRIAGKPMVV
LYNAYPSIMQWFPGIHKSVAESILQFYDFLRETFTQHRDQLDVNDQRDLIDVFLVKQQE
EKSSSTKFFNDHNLTALVADLFGAGMETTSTTLRWGLLFMMKYPDIQ
KKVQDEIDRVIGSAQPRLEHRKTMPYTDAVIHEIQRLGNLAPFIGHETTTDVTFRGYFIPK
GTQAIVLLASVLQDKDYFKKPEEFYPEHFLDSEGNFVKNEAFLPFSA
GRRMCVGETLAKMELFLFFTKLLQNFTFQPPPGVEVDLTCGDAVTSKPLDHQICALPRS
>49368_prot scaffold_996:213773-226056, 81% to 21809_prot
MFPLEPTTLFVAIVLCLFLIYLLLHNGKGTPPNFPPGPKPLPFIGNLHIM
NLNKPHKTYMELGNKYGSVFSVQLGTEKVVVLCGYDAVKDALINHAEEFS
ERAVSTLSRKRLKGYGIIFSHGENWKVMRRFTLATLRDFGMGKRTTEDTI
NEECNFLMETFKSYKGEPFETNLIMNAAVANIIVSILLGHRFEYQDPTLL
KLIGLVNEIVKLSGRPIIMIYDAFPSVVSWLPGSHQKVLENTRGLRNFIK
ETFTEHKARLDINDQRDLIDVFLVKQREEKPNPGLFFHNENLISLVSNLF
VAGMETTSTTLRWGLLLMMKYPEIQKKVQNEIDKVIGSAQPQMEHRKQMP
YTDAVIHEIQRFADIVPTNLPHATTMDVTFRGYLIPKGTRVIPLLTSVLR
DKAYFEKPYEFYPEHFLDSEGNFVKNEAFIPFSAGKRICAGETLAKMELF
LFFTNLLQNFTFRSPPGQDLPLTTAEGFTSIPMVHKICAVSRA
>49369_prot scaffold_996:232793-245538 missing exon 4, 78% to $$$$$4
MLVADPMTILLSAFICLLLGFVLVGNKRHIYRKFPPGPRALPFIGNIQMIYVKQPYKTLLE
LSKTYGSIFSIQVGTEKMVVLCGYDTVKDALLNYPDDFADRPALPLIDDLAKRH
GVFFSNGENWRVMRRFALSALKDFGMGKKRMEKTIIEECDHLVQKFNSYGGQYH
Seq gap at repeat seq.
LYHTYPSIMRWVPGCHKTVYKNGRELYHFLKETFSKHKADLDINNQRNLIDAFLSKQQK
EKSKPDGFFHDDNLTTLLFDLFTAGMETIANTLRWAILLMMKYPEVQ
KKVQDEIEKVIGSAEPRVEHRKNMPYTDAVIHEIQRFANITPMNCPYATSQDVTFKGYFLPK
GTQVIPLLASVLQDEAYFEKPEEFYPQHFLDSEGHFVKNEASIPFSA
GRRSCAGENLARMELFLFFTSLLQNFTFQAPPGEELDLTPDVGLSTPPMQHTTCALSRACS
A flanking gene on scaffold 996 is MUT (methylmalonyl CoA mutase). It is on human chr 6p21,
just like the rhag gene at the other end of this cluster. There are no P450s at this
location on human chr 6.
>NM_001015757.1
47% to 2K6 zebrafish
from refseq database
MDFTFSLATYLVLVVTVLYILSNWKRKALNNFPPGPKGWPLVGNVFSIDLKKPQRTYIE
LSKKYGPVFSVQMGRKKMVILVGYETVKDALVTHAEEFGGRAYIPVTKDLEKGL
GMIFSNGENWKAMRRFTITTLKDFGMGKSTIEETIAHECSYLVQYFASFK
GKPFDNSTILITSVANIIVAILLGHRMEYEDPVFLRLVNLNSEYVKLLGSPMVT
IYNMFPALGFLPGCHKTVKKNLKELYAFLKRTFVEYQKNFDIHDQRSFIDVFLARQKEE
AKHPETYSYFHNENLVRLVRNLFSAGMETTSTALRWALLLMIKYPDIQ
EKVHDEIARVIGSAHPTYSHRTQMPFTNAVIHEMLRFADIVPLSVPHETTRDVHFKGYFIPK
GTYIIPLLTSVLKDKTQFDAPEQFNPNHFLDSEGNFLKKEAFMPFSA
GRRACPGEILARMELFIFFTSLLQKFSFRPPPGVTNINLSSDVGFTSVPLEGMICAIPRA
>CYP3A81
NM_001015786.1 59% to 3A5
from refseq database
MNLIPHLSTGTWILLAALLVLILLYGIWPYGYFKKMGIPGPTPL
PFIGTFLEFRKGMVQFDTECFKKYGKMWGTYDGRQPVLAIMDPAIIKTILVKECYTNF
TNRRNFGLNGPFESAITIAEDEQWKRIRNVLSPTFTSGKLKEMFQIMKDYSDILVKNI
QGYVEKDEPCATKDVIGAYSMDVITSTSFSVNIDSLNKPSDPFVIHMKKLLKTGLLSP
LLILVVIFPFLRPILEGLNLNFVPKDFTEFFMNAVTSFREKRKKGDHSGRVDLLQLMM
DSRTTGGNDLSNKHKALTDAEIMAQSVIFIVAGYETTSTALSYLFYNLATHPDVQQRL
HEEIDSFLPDKASPTYDILMQMEYLDMVIQETLRLFPPAGRLERVSKQNVEINGVSIP
KGIVTLIPAYVLQRDPEYWPEPEEFRPERFSKENRATHTPFTFLPFGDGPRNCIGLRF
ALLSMKVAIVTLLQNFSVRPCAETLIPMEFSTIGFLQPKKPIVLKFLSRAAAHE
>CYP3A82
BX779027
CX415432.1 93% to 3A81 N. Abdletawab
CX415433 = mate pair
Trace archive 243619963 419527443 418746304
476067541 (exon 1)
416306382 joint before I-helix
MNLIPHLSTGTWILLAVLLVLILL (2)
YGIWPYGLFKKMGIPGPTPLPFIGTFLEFRKGMIQFDIECFKKYGR
MWGMYDGRQPVLAIMDPAIIKTILVKECYTNFTNRRNFGLNGPLESAITVAEDEQWKRIR
SVLSPTFTSGKLKEMFQIMKDYSDILVKNVQVSVDKDEPCATKDVIGAYSMDVITSTSFS
VNIDSLQNPSNPFVIHIKNLLKTGFLSPVIIFAVIFPFLRPIFEVLNISFFPKDFTQFFM
NAVTSFREKRKKGDHS (0)
GRVDLLQLMVDSGTTEGNDSSNQHKALTDAEIMAQSLIFIFAGYETTSTALSYLFYNLATHPDVQQKLHEEIDSFLPDKASPTYDILMQMEYLDMVIQETLRLFPPAGRLERVSKQNVEINGVIIPKGTVAMIPAYVLQRDPEYWPEPEEFRPERFSKENRATQTPFTFLPFGDGPRNCIGLRFALLSMKVAIATLLQNFSVRPCAETLIPMEFSTIGVLQPKKPIVLKFLSRAAAHE*
>CYP3A83 CX982440.1 DT453227 G.Vasser
60% to 3A80 chicken, 56% to 3A81 57% to 3A82
MTFLPDFSMATWTLLVLLLTLLAYYAIWPYKLFKRYGIPGPTPIPFIGTFL
GNRHGLMEFDMECFKKYGKVWGIYEGQKPLLAIVDPVIIKSIMVKECYTNFTNRRDFGLS
GPLKSSVLISKDEQWKRIRTVLSPTFTSGKLKQMFPLMNHYGELLVKNIHKKINNKEPLD
MKHIFGSYSMDIVLSTSFSVNVDSMNNPNDPFVTNARNLFTFSFFNPLFLISILCPFLVP
LLDKMNFCFLSLKILNFFKDAVASIKKKRQKGTH
EDRVDFLQLMVDAQSNEGKSVPEEEKHGYKE
LSDTEILAQSLIFIMAGYETTSTTLMFLAYNIATHPDVQRKLEEEIDALLPNKAPPTYDA
LMKMEYMDMVINESLRMFPPAIRIDRVCKKTMEINGVTIPAGVVIVVPLFALHLNPEIWP
EPEEFQPERFSKENQKNQDPYNFLPFGVGPRNCIGMRFALVNMKVALTILLQNFRLETCK
DTPVPLKICTKGYLKPTKPIILNLIPKVGQTVEE*
>CYP4T8
DR860523.1
CX967683.1 DT412234.1 DR834559.1
best match to 4B1 in ESTdb X.tropicalis
A.Bolen, K. Iyer, E. Mahrous, S. Aggarwal
89% to CYP4T4 of Xenopus laevis
best match in human = 4B1 55%, probably a 4B1 ortholog
MWNTLVWQQVA
ALLCLLAVLLKATQLYLSQKRQENIFKQFPGPPRHWLLGNVDQIRRDGKDLDLLVNWTHK
NGGAFPVWFGNFSSFLFLTHPDYAKVIFGREEPKSSISYDFLVPWIGKGLLVLTGPKWFQ
HRRLLTPGFHYDVLKPYVNLISKCTTDMLDNWEKLITKQKTVELFQHLSLMTLDSIMKCA
FSYDSNCQKDSNNAYIKAVFDLS
YVANLRLRCFPYHNDTVFYLSPHGYRFRKACRITHEHTDKVIQQRKESMKLEKELEKIQQ
KRHLDFLDILLFARDEKGHGLSDEDLRAEVDTFMFEGHDTTASGISWILYCMAKYPEHQQ
KCREEIKEVLGDRQIMEWEDLGKIPYTNMCIKESLRMYPPVPGVARQLRNPVTFFDGRSV
PAGTVIGLSIYAIHKNPAVWEDPEVFNPLRFSPENSANRHSHAFLPFAAGPRNCIGQNFA
MNEMKVAVALTLNRFHLAPDLENPPIRIPQLVLKSKNGIHVHLTKVQ*
>CYP4T9
NM_001017348.2 55% to 4B1 human, 73% to DR860523.1 C. Blackwell
79% to CYP4T3 of Xenopus laevis
from refseq database
MASTLWKALSSPWLSVNIYQIGQFVALLCVVLLLLKAYALYSRG
RRFAAALVPFPGPPAHWLYGHVNQFRRDGKDLDRLMVWVNKYPNAFPLWIGKFFGTLI
ITDPDYAKVVFGRSDPKTSTGYNFLVPWIGKGLLILSGNTWFQHRRLITPGFHYDVLK
PYVSLISDSTKIMLDELDVYSNKDESVELFQHVSLMTLDSIMKCAFSYHSNCQTDKDN
DYIQAVYDLSWLTQQRIRTFPYHSNLIYFLSPHGFRFRKACRIVHLHTDKVIGQRKKL
LESKEELEKVQKKRHLDFLDILLCSKDENGQGLSHEDLRAEVDTFMFEGHDTTSSGIS
WILYCMATHPEHQQKCREEISEALGERQTMEWDDLNRMPYTTMCIKESLRLYPPVPSV
SRELAKPITFHDGRSLPAGMLVSLQIYAIHRNPNVWKDPEIFDPLRFSPENSSKRHSH
AFVPFAAGPRNCIGQNFAMNEMKVAVALTLKRFELSPDLSKPPLKQPQLVLRSKNGIH
VYLKKAS
>CYP4F.a2
AL648475.2, CF786466.1, AL896688.2 S.Hill, N.Liao
NM_001016020.2 62% to 4F22, 1 aa different to NM_001015770
90% to NM_001015810, same as DT404766.1
from refseq database
MLPFLDHFLDSLNMSRTSFRVYIFAAVILMFCLIMCRTIFKMAI
YIYAYIINARRLRCFPEPPRRSWLLGHLGMFMPTEEGLTEISSAICNLRRTLLTWLGP
IPEVSLVHPDTVKPVVAASAAIAPKDELFYGFLRPWLGDGLLLSRGEKWGQHRRLLTP
AFHFDILKNYVKIFNQSTDIMLAKWRRLTAEGPVSLDMFEHVSLMTLDTLLKCTFSYD
SDCQEKPSDYISAIYELSSLVVKREHYLPHHFDFIYNLSSNGRKFRQACKTVHEFTAG
VVQQRKKALQEKGMEEWIKSKQGKTKDFIDILLLSKNEDGSQLSDEDMRAEVDTFMFE
GHDTTASGLSWILYNLACHPEYQEKCRKEITELLEGKDIKHLEWDELSKLPFTTMCIK
ESLRLHPPVVAVIRRCTEDIKLPKGDILPKGNCCIINIFGIHHNPDVWPNPQVYDPYR
FDPENLQERSSYAFVPFSAGPRNCIGQNFAMAEMKIVLALILYNFQVRLDETKTVRRK
PELILRAENGLWLQVEELKR
>CYP4F.a3
CX980073.1, CX968437.1, CF589328.1
NM_001015770.1 62% to 4F22
from refseq database
note there are two seqs that differ at only 2 nucleotides,
but there are three ESTs that support each sequence, so these
are probably alleles.
MLPFLDHFLDSLNMSRTSFRVYIFAAVILMFCLIMCRTIFKMAT
YIYAYIINARRLRCFPEPPRRSWLLGHLGMFMPTEEGLTEISSAICNLRRTLLTWLGP
IPEVSLVHPDTVKPVVAASAAIAPKDELFYGFLRPWLGDGLLLSRGEKWGQHRRLLTP
AFHFDILKNYVKIFNQSTDIMLAKWRRLTAEGPVSLDMFEHVSLMTLDTLLKCTFSYD
SDCQEKPSDYISAIYELSSLVVKREHYLPHHFDFIYNLSSNGRKFRQACKTVHEFTAG
VVQQRKKALQEKGMEEWIKSKQGKTKDFIDILLLSKNEDGSQLSDEDMRAEVDTFMFE
GHDTTASGLSWILYNLACHPEYQEKCRKEITELLEGKDIKHLEWDELSKLPFTTMCIK
ESLRLHPPVVAVIRRCTEDIKLPKGDILPKGNCCIINIFGIHHNPDVWPNPQVYDPYR
FDPENLQERSSYAFVPFSAGPRNCIGQNFAMAEMKIVLALILYNFQVRLDETKTVRRK
PELILRAENGLWLQVEELKR
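The note that CYP4F.a2 and CYP4F.a3 are probably alleles rests on how few positions separate the two records. A minimal sketch of a position-by-position comparison is given below; the function name `residue_differences` is hypothetical, and it only handles sequences of equal length (no alignment is attempted). Only the first line of each record is used here.

```python
def residue_differences(seq_a: str, seq_b: str):
    """List (position, residue_a, residue_b) for each mismatch, 1-based."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length for this simple comparison")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# First line of the CYP4F.a2 and CYP4F.a3 records as given above.
a2 = "MLPFLDHFLDSLNMSRTSFRVYIFAAVILMFCLIMCRTIFKMAI"
a3 = "MLPFLDHFLDSLNMSRTSFRVYIFAAVILMFCLIMCRTIFKMAT"
print(residue_differences(a2, a3))
```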
>CYP4F.b1
NM_001015810.1 64% to 4F22
from refseq database
MLPSLDHFLDSLNMSRSSFRVYIFAAVILLFCLIMFRTILKMAI
YIYAYIINARRLRCFPEPPRRSWLLGHLGLFMPTEEGLTEVSNTISNFRKSFLTWMGP
ISLVSMVHPDTIKPMVAASAAIAPKDELFYGFLRPWLGDGLLLSRGEKWGRQRRLLTP
AFHFDILKNYVKIFNQSTDIMLAKWRRLAAVGPVSLDMFEHVSLMTLDTLLKCTFSYD
SDCQEKPSDYIAAIYELSSLVVKREHYLPHHFDFIYNLSSNGRKFHQACKTVHEFTAG
VVQQRKKALQEKGIEEWIKSKQGKTKDFIDILLLSKDEDGNQLSDEDMRAEVDTFMFE
GHDTTASGLSWILYNLACHPEYQEKCRKEITELLEGKDTKHLEWDELSQLPFTTMCIK
ESLRLHPPVTAVSRRCTEDIKLPDGKVIPKGNSCLISIYGTHHNPDVWPNPQVYDPYR
FDPEKLQERSSHAFVPFSAGPRNCIGQNFAMAEMKIVLALTLYNFYMRLDETKTVRRK
PELILRAENGLWLQVEELKQ
>CYP4V
CX972921.1
AL876066.2 CR409635.1 DT527908.1
best hit to CYP4V2 in ESTdb X.tropicalis Xin Liu
best match in human = 4V2 62%, a 4V ortholog
MELGGEVHLLVWVAAAVVLLTLLALSILPALQDYVRKRRILKPIPGPGPNYPLIGDALFLKN
NGGDFFLQICEYTESYRLQPLLKVWIGTIPFIVVYHADTVEPVLSSSKHMDKAFLYKFLHPW
LGKGLLTSTGEKWRSRRKMITPTFHFAILSEFLEVMNEQSKILVEKLQTHVDGESFDCFM
DVTLCALDIISETAMGRKIQAQSNRDSEYVQAIYKMSDIIQRRQKMPWLWLDFLYAHLRD
GKEHDKNLKILHSFTDKAILERAEELKKMGEQKKEHCDSDPESDKPKKRSAFLDMLLMAT
DDAGNKMSYMDIREEVDTFMFEGHDTTAAALNWSLFLLGSHPEAQRQVHKELDEVFGKSD
RPVTMDDLKKLRYLEAVIKESLRIYPSVPLFGRTVTEDCSIRGFHVPKGVN
VVIIPYALHRDPEYFPEPEEFRPERFFPENASGRNPYAYIPFSAGLRNCIGQRFALMEEK
VVLSSILRNYWVEASQKREELCLLGELILRPQDGMWIKLKNRETAPTA*
>CX843268.1
CX920105.1 CX846214.1 54% to CYP5A1 human, Ramy Naguib Attia
missing last exon in seq gap, not in ESTdb or UCSC genome browser
Trace archive 414884422 418775930
scaffold_1778:46661-92162
MGESWLWGLDGCTVTLTLVAGFLGLLYWYSVSAFWQLEKAGIKHPKPLPFIGNI
MLFQKGFWEGDRHLLKTYGPICGYYMGRRPMIVIAEPDAIKQVLQKDFVNFTNRMRLNLV
TKPMSDSLLCLRDDKWKRVRSVLTPSFSAARMKEMCPLINQCCDVLVENLMEYASSGEAC
NVQRCYACFTMDVVASVAFGTQVDSQRDSDHPLVQNCKRFLELFTPFKPVVLLCLAFPSI
MIPIARRLPNKHRDRINSFFLKVIRDIIAFRENQPPNERRRDFLQLMLDAR
DSAGHVSVDHFDIVNQADLSVPQNQDRGQDPPRKSTQKTLNEEEILGQAFIFLIAGYETT
CSLLSFASYLLATHPDCQEKLLKEVDEFSQEHEEADYNTVHDLPYMEMVINETLRMYPPA
YRFAREAARDCTVMGLGIPAGAVVEIPIGCLQNDPRFWHEPEKFNPER (2)
FTAEEKQKRHPFLFLPFGAGPRSCIGMRLALLEAKITLYRVLRKFRFQTCDLTQ (0)
(MISSING last exon)
>BC060001.1 Xenopus laevis CYP5A1 for comparison, 92% to X. tropicalis
MEEPTLWGLDGCTVTFALVAGFLGLLYWYSVSAFSQLDKVGIKH
PKPLPFIGNVMLFKKGFWEGDRHLIKTYGPICGYYMGRRPMIVIAEPDAIKQVLQKDF
VNFTNRMKLNLVTKPMSDSLLCLRDDKWKRVRSVMTPSFSAIRMKEMCPLINQCCDVL
VDNLLEYASSGEACNVQRCYACYTMDVVASVAFGTQVDSQRDPDHPLVQNCKRFLELF
TPFKPLILLCLAFPSIMIPIARRLPNKQRDRINSFFLKVIRDIIAFRENQPPNERRRD
FLQLMLDAQDSVSHVTVDHFDIVNQADLSVPQNSPSEKQDRGQDPPRKSSKKLNKEEI
LGQAFIFLIAGYETTCSLLSFTSYLLATHPDCQEKLLKEVDEFSQEHKEADYNTVHDL
PYMDMVINETLRMYPPAYRFAREAARDCTVMGQNIPAGAVVEIPIGCLQNDPRFWHEP
EKFNPERFTAEEKQKRHPFLFLPFGAGPRSCIGMRLALLEAKITLYRVLQKFRFQTCD
LTQ IPLQLSAMSTLRPKDGVYVTVVAR*
>CYP7A1
5102_prot 64% to human 7A1 scaffold_8:3335708-3339166 UCSC browser
MLTVSLIWGLVVALCCFFWLIVGIRRRQPGEPPLENGLIPYLGCALQFGA
NPLEFLRVRQNKFGNVFTCKIAGQFVHFVTDPFSFNSVMRHGRHFDWQKF
HFATSAKAFGHSNIDSSDSEVTQNVHDSFLKTLQGDALDPLISNMMENLQ
HTMLQNSSYKVNSKDWVTEGLYAFCYRVMFEAGYLTLFGKEFNSPEDKNL
ARQEAQRALILNAIENFKEFDKIFPALVAGLPIHVFKSAYSARENLAKDL
LHENLRKRNNISELISLRMFLNESMTSLNDMEKAKTHLALLWASQANTLP
ATFWSVFYLLRCPHAMKASTEEVQRVLEKASQKVNCDGRYIFLNRHELDD
MPVLDSIIKEAMRLSSASLNIRVAKENFVLHIDDKQAFNIRKDDIVALYP
QMVHLNPDIYEDPNNFKYDRYLGEDGKEKTSFFLNGRKLKYYYMPFGSGK
TKCPGRQFAVHEIKQMLTLIICYFDMELVDKNIRSPPLDQSRAGLGILQP
THDVDFRYKLKAH
>CYP7B1
CX908537.1 CX805060.1 CX814354.1
BX713865.1 CN080250.1
52% to 7B1 Y. Peng
MLDTLLYTVTGLVLGGLLLLLLLPRRQ
RREGEPPLENGWIPFIGLAYEFHKNALEFLISRQQKYGDIFTVHVAGKYITFIMDSTQFQ
YVIKHGKQLDFHEFANSLSSRTFDHPRLTEAKFPHLNDKLHRIYKIMQGRALDKLTDSMM
GNLQRVFKWKFSQATDWKAEKMYQFCCCIMFEASFMTLYGRDPIADGHKVISEIREK
FTKFDAKFPYLVINIPIALLGATKKIREELIHFFFPNKMEKRSEISEVVQERKNVLEQYELQDYDRAAHHFAFLWASVGNTIPATFWAMYYLVRHPEALAAVRDEIDHLLQSTGQKKGPEYDIHITREQLDSMVLLGSAIKESFRLCAASMNIRLVQEDFDLELEGNQTIRLRKDDFIALYPPALHMDPEIYEDPERYKYDRFVENGKEKILFYKKGKKLKEYLMPFGSGTSKCPGRFF
AMNEIKQFLAVLLIYVEMELVEHKALGHDNRRSGLGILLPNSDIMFRFKPRTLDL*
>CYP8A1
DT406915.1
DN063760.1 best hit to CYP8A1 in ESTdb X.tropicalis H.Penmatsa
best match in human = 8A1 55%, an 8A1 ortholog
cannot extend in ESTdb
29605_prot model missing internal exon(s) UCSC browser
MVWAGVFSLLLLILCIGFCYYRFLHRTR (2)
QPNEPPLDRGSIPWLGYALEFGKDAAKFLSDMKEKHGDIFT (0)
IQVAGKFITVLLDPHSYDAVFWAPSNQLDFGKYARMLMDRMFDVRLPPSGGNEEKTLLAS (2)
HFQGSNLTKLTRSMFHNLSTILLKDRRLPNTEWTDQGLFDFIYGVMLR (2)
AGYLTLFGTESEQYTSTYSPMRDLKHSEDVYKEFRKLDWLLMKAARNTLST (1)
GEKEEASLVKNRLQKLISIKSHKGKCCKSSWFEYYQQHLEEIQATEDMQSRALVLQLWATQ (0)
GNAGPATFWLVLYLLKHPEAMAAVQAEFESIFQRNLQEKRHIEEMNQDLLDKMIIL (1)
DNVLNETLRLTAAPFISREVLTDMTLKLADGHQYQLRRGDRLCLFPFVSPQMDPEVHQQPE (0)
VFQHNRFLNADGTEKTEFYKKGKRLKYYNLPWGAGSNVCVGKKHAVNSIKQ
(2)
FVCLLLFYFDFELKTPAEKIPEFNRTRYGFGLLQPEHDILFRYRRKA*
>CYP8B1
CX973251.1 best hit to CYP8B1 in ESTdb X.tropicalis M.Puljic
best match in human = 8B1 54%, an 8B1 ortholog
cannot extend in ESTdb
Trace archive 413359888 234313629 248374609 continues on CF343441.1
MALFLPIILALLVSVIGGLYLLGMFRKRRPDEPPLDKGTIPWLGYALDFRKNTSTFLQKMHKK
HGDIFTVQIAGYYFTFVMDPLSFGPIIKESKGNLDFEEFAKDLVLRVFGYQSFTNDHKML
EKSSTKHLMGDGLIVMTQAMMENLQNLMVHNIGSGKGEREWQQDGLFNYSYNIVFRAGYL
ALYGNEPAKNKGSKEKAKEFDRKHSDELFYEFRKYDQLFPRLAYAVLPPKDKIEAERLKR
LFWNMLSVKKTLQKENISGWIGEQHQ
QRAEQGLPEYMQDRFMFLLLWASQGNTGPASFWFLLYLLKHPEALKAVREEVEAVLKETG
QEVKPGGPLINLTRDMLMKTPVMDSAVEETLRLTAAPVLIRAVKQDMKIKMASGKDFSMR
KGDRVALFPYIAVQMDPEIHAEPEKFKYNRFLNEDGTKKTEFFKNGKKVKYYTMPWGAGS
TICPGRFFATNELKQFAFLMLTYFEFELVNPNEEIPSIDPNRWGFGTMQPTRDVQFRYRLRY*
>CYP11A1
53% to CYP11A1 human 62% to 11A1 Fugu
CX377330.2 CX940317, Q. Tran, S. Sarva
cannot extend in ESTdb
22200_prot
scaffold_62:97483-110117 model short at N-term
propose a GC boundary at the end of exon 1 to preserve length
MLLLRRLPAVPSGLRMISHHSVVGAGPEMGTLSQVD
TPLPYNQMPGNWKRGWLELYRFWRKDGFHNIHYHMMENFQRFGPIYR (2)
EALGIYDSVFIQLPEDAATLFHVEGLH
PERLRVPPWYEYRDYRNRRYGVLLKKGEDWRSHRIALNREVLSMSAMSRF
LPLLDSVGQDFVHRAHIQVERSGRGKWTADLTNELFRFALESVCYVLYGQ
RLGLLQDYIDPESQQFIDSVSLMFNTTAPMLYLPPSLLRKINSSIWKDHV
RAWDAIFTHADRCIQQIYSSLRQQSDSTYSGVLSSLLLQDQMPLEDIKAS
VTELMAGGVDTTSMTLQWAMYELARTPSVQEKLRSEVIAARDASGKDLTA
LLKRIPLVKAALKETLRLHPVAITLQRYTQRDTVIRNYIIPQGTLVQVGL
YAMGRNPDIFALPQRFSPERWLGGGPTHFRGLGFGFGPRQCIGRRIAEIE
MQLFLIHILENFKIEINRMVDVGTTFNLILFPSKPIHLTLRPLK*
>54949_prot X. tropicalis predicted protein from UCSC browser
scaffold_1888:38981-41244
trace archive 483095886 413360250 242989364 mate = 242988788
584719101 584719005 234592190
50% to 11B2 human 49% to 11B1
scaffold_8238:179-3044 UCSC browser
387238891 419638572 mate = 419631756
MMAALVCGGTCSWDTVRGLRTKSTHFSTVQLAQDSQSLTSAKAQS
LPFKSIPCTGRNAWANLARYWKNNSFQQLHLVMEGHFQNLGPIYR (2)
ETLGTHSSVNIIHPQDVARLFQSEGVFPRRMGIEAWAAHRDLRNHKCGVFL (2)
LNGEDWRSDRLILNKEVLSLTGVKKFLPFLDEVANDFVSFLMRRINKNTRGTLTVDLYADLFRFTME
(1)
ASGYVLYGQRLGLLEEHPNEDSLRFIRAVETMIKTTLPLVYLPHQLLRLTDSALWTQHMEAWDVIFQQ (1)
ADRCIQNIYQEFCLGQERGYSGIMAELLLQGELPLDSITANVTELMAGGVDT (0)
TAMPLLFTLFELARNPSVQQELRAEIKRAEGQCPKDMNQLLNSMPLLKGAIKETLR (2)
LYPVGITVQRYPMKDIVLQNYHIPAG (0)
TLVQVGLYPMGRSSELFQNPLRYDPTRWMRRDETNFKALAFGFGSRQCIGRRIAETEMMLFLMH
(0)
VSIMNTDIKAQTYSGH*
>AF449175.1 Xenopus laevis steroid 11-beta-hydroxylase protein (CYP11B1)
FAFGPIYRENLGTHSSVNIIHPHDVARLFQSEGIFPRRMGIGVWAAHRDLRNHKCGVFL
LNGEEWRSDWLILNKEVLSLAGVKKFLPFLDEVANDFVSFLMRRINKNTRGTLTVDLYADLFRFTME
ASGYVLYGLRLGLLEEHPNEDSLRFIRAVETMIKTTLPLLYLPHQLLRLMDSSLWIQHMEAWDIIFQQ
TDRCIQNIYQEFCLGQERGYSGIMAELLLQGELPLDSIKANVTELMAGGVDT
TAMPLLFTLFELARNPITS
>CYP17
CX931022 DR883840.1 CX957932.1 50% to CYP17A1 human, Guo Zhu
MISYVAAAVLLAFGLALLSIWKFAGGKPRGAKYPNSLPCLPFIGSLLHLASHLAPHI
LFNKLQEKYGSLYSFKMGSHYIVIVNHHEHAKEVLLKKGKTFGGRPRAVTTDLLTRNAKD
IAFADYSPTWKFHRKLVHAALSMFGEGTVAIEKIISREAASLCQTLITFQGSPLDMAPEL
TRAVTNVVCALCFNARY
KRCDPEFEEMLAYSKGIVDTVAKDSLVDIFPWLQIFPNKDLEILKRSVAIRDKLLQKKLK
EHKEAFCGEEVNDLLDALLKAKLSMENNNSNISQEVGLTDDHLLMTVGDIFGAGVETTTT
VLKWAVAYLLHYPKVQAKIQEELDVKVGFGRHPVLSDRRILPYLDATISEVLRIRPVAPL
LIPHVALHESSIGEYTIPQDARVVINLWSLHHDPNEWXNPEEFIPDRFLDENGNHLYTPS
QSYLPFGAGIRVCLGEALAKMEIFLFLSWILQRFTLEVPAGDSLPDLDGKFGVVLQVKKFRVT
AKLREVWKNIDLTT*
>CYP19
CX885719.1
CX885718.2 DN051995.1
best hit to CYP19 in ESTdb X.tropicalis Z.Zhang
best match in human = CYP19 71%, CYP19 ortholog
5697_prot from UCSC browser scaffold_27:1527865-1546466
MEALNPVQYNSTEAVPTLAPATTVSLLLFIFLLIILWNQEETCLIPGPAYCMGLGP
LISYGRFLLTGIGKAANYYNNMYGEFVRVWINGEETLVISKASATFHIMKHSHYISRFGS
KLGLQCIGMNENGIIFNSNPSLWKVIRPFFIKALSGPGLMQTTEICIRSTKRYLDNLGNV
TNELGNVDVLKLMRLIMLDTSNNLFLRIPLD
ENEIVLKIQKYFDAWQALLLKPDIFFKISWLYKKYEKSANDLK
EAIEILIEQKRQKLSSSEKLDENMDFASELIFAQNHGDLTAENVNQCILE
MLIAAPDTMSVSLFFMLVLVAQHPKIEEGIMNEIDNVIGDRDVESNDIPN
LKVLENFIYESMRYQPVVDLVMRKALEDDMIDGYYVKKGTNIILNLGRMH
RIEYFPKPNEFTLENFEKTVPYRYFQPFGSGPRAC
AGKYIAMVMMKVILVTLFKRYKVQTLGGRCLENIQNNNDLSMHPDESQPCLEMIFIPKNTAELKQ*
>CYP20
DR851274.1
CX431022.2 CF240500.1 DT417160.1
best hit to CYP20 in ESTdb X.tropicalis Z.Zhang
best match in human = CYP20 72%, CYP20 ortholog
MLDFAIFAIT
FLLILVGAVLYLYPSSRQACGIPGLAPTEEKDGNLQDIVNSGSLHEFLVNLHERFGPVAS
FWFGRRLVVSLGSLDLLKQHINPNKTSDPFQMMLKSLLGYQSGVIGEAAESHVQKKLFEN
GIIKALHSNFSVVIKLSEELLAKWLTYPQSQHVPLCQHMLGFAMKSVTQTAMGSSFDDDQ
EVIHFRRNHDAIWSEIGKGFLDGSIERSPNRKKLYEDALMEMETVLKKAIKERKVKNPGR
HVFVDSLLQGNLSDKQVLEDSMIFSLAGCVITANLCTWAIYFLT
TSEEVQDKLFKEVTRVIGKGPITMDKLEQLSYCRQILCETVR
TASLTPISARLQELEGRVDQHIIPKETLVLYALGVVLQDNTAWPLAYRFDPDRFNDETAK
QSLTLLGFSGSQECPELRFAYMVAMVLLSVLVRKLHLLPVKGQVMETKYELVTSPKEEAW
ITVSKRS*
>CYP21 41% to human CYP21A2, 45% to Fugu CYP21, 46% to zebrafish CYP21
50686_prot scaffold_1026:150552-153289
50687_prot
MALLLLLLLFLVLLSLQWAKKYFLGFSNPHVHYPPCPPPLPFLGNLLHLA
HKDLPIHLLHLSRKYGSIYRLSFWGKDFVVLNNSNLIREALLKKWADFAG
RPKSYIGDLISLGGKDLSLGDYTPVWKVQRRLTHTSLQNCVRNDLENVLI
REARLLCQDCLNLNGEPVDISRSFSLRTCRIIAELTFGTTYDLSDPKFQE
IHKCIVNIIKLWESPSVTALDFIPFLQ (0)
KFPNQTLKLLMDTAKQRDSFIKSQVEAHKAHLPSSKCDEDILDGMMRFLL
EKSGDDSSGMSEFSEDHLHMAVVDLFIGGTETTASLLTWTVAYLMHYPEA
QDKIHQEIIGAVGMERYATYTDRNSLPYLNATVSEMLRLRPVVPLAVPHC
TIRDTSIAGYTIPKGTTVIPNIYAAHLDETIWDNPTQFYPENSHSSRALL
PFSVGARLCIGETLARMEVFFFLSHLLRDFRLLPPSPELLPELSGVFGIN
FKCRPFLVCISPRENTPKIQDLNNKT*
>CYP24A1
DT404490 67% to CYP24A1 human, Guo Zhu
Cannot extend in ESTdb
26514_prot scaffold_125:1595529-1618234 UCSC browser (errors in model)
MTSRIKRDFLGMLLKSRSISVQHSIPTATAVCDLKEKELPAPSSCPHSLA
ALPGPTKLPILGSLLDILRKGGLKRQHEALASYHKQFGKIFRMKLGSFDS
VHIGAPCLLEALYREESNYPKRLEIKPWKAYRDYRDEAYGLLILEGKDWQ
RVRSAFQQKLMKPTEVGKLDTKINEVLVDFMKRIDSVCDEDGT
IEDLYCELNKWSLESICLVLYEKRFGFLQPNLGEEAQNFITAIKM
MMSTFGLMMVTPVELHKSLNTKIWK
DHTHAWDSIFKTAKCHIDRRLTKLSSKGSEDFLCTIYNDSKLSKKEMYAT
ITEMLIGAVETTANSLLWAIFNISRNPHIQKKLLEEIESVLLPDQVPTAD
DIRNMPYLKACLKESMRITPSIPFTTRTLDKETVLGDYVLPKGTVLTINS
HVLGSNQECFDNWNQFRPERWLQQKNTINPFAHVPFGIGKRMCIGRRLAE
LQLQLTLCW (0)
LIRKYEIVATDNDPVETLHLGTLMPSRELPVAFHRR*
>CYP26A1
NM_001016147.2 (seq gap) CX456558.2 CX423895.2 CX424757.1
69% to human CYP26A1, L. Zhu
MDLYTLLTSALCTLALPVLLLLTAAKLWEVYCLSRKDASCRNPL
PPGTMGLPFFGETLQMVLQRRKFLQVKRRKYGRIYKTHLFGSPTVRVTGAENVRQILL
GEHKLVSVHWPASVRTILGAGCLSNLHDSEHKYTKKVIMQAFSREALANYVPLMEEEL
RRSVNLWLQSDSCVLVYPAIKRLMFRIAMRLLLGCDPQRLGREQEETLLEAFEEMTRN
LFSLPIDVPFSGLYRGLRARNIIHAQIEENIKEKLQREPDGQCRDALQLLIDHSRRTG
EPVNLQALKESATELLFGGHGTTASAATSLTTFLALHKDVLEK
VRKELESQGLLSNKPEEKKELSIEVLQQLKYTSCV
IKETLRLSPPVAGGFRVALKTFVLNGYQIPKGWNVIYSIADTHGEAELFPDKDE
FNPDRFLTPLPGDSSRFGFIPFGGGVRCCVGKEFAKILL
KVFIVELCRNCDWELLNGSPAMKTSPIICPVDNLPAKFKPFASSI
>CYP26B1
CX905408.1
CX388776.2 CX940676.2 82% to human 26B1 L. Chen
MIFQSFDLVSALATLAACLVSVALLLAVSQQLWQ
LRWAATRDKSCKLPIPKGSMGFPLVGETFHWILQGSDFQSSRREKYGNVFKTHLLGRPLI
RVTGAENVRKILMGEHHLVSTEWPRSTRMLLGPNSLANSIGDIHRHKRKVFSKIFSHEAL
ESYLPKIQLVIQDTLRVWSSNPESINVYCEAQKLTFRMAIRVLLGFR
LSDEELSQLFQVFQQFVENVFSLPVDVPFSGYRRGIRAREMLLKSLEKAIQEKLQNTQGKDYADALDILIESGKEHGKELTMQELKDGTLELIFAAYATTASASTSLIMQLLKHPSVLEKLREELRGNSILHNGCVCEGALRVETISSLHYLDCVIKEILRLFSPVSGGYRTVLQTFELDGFQIPKGWSVLYSIRDTHDTAPVFKDVDVFDPDRFGQDRTEDKDGRFHYLPFGGGVRNCLGKHLAKLFLKVLAIELASMSRFEL
ATRTFPKIMPVPVVHPADELKVRFFGLDSNQNEIMTETEAMLGATV*
>CYP26C1
CX830022.1
CN120927.1 CR567555.1 CX376643.2 CR567556.1
65% to human 26C1 L. Chen
note this seq has an insertion compared to human, but the insertion
is supported by several ESTs and is real; it is also seen in X. laevis (see below)
MFLLEISYTSFFEAALTSALSLVLLLAASHQLWS
LRWHSTRDRGSSLPLPKGSMGWPFFGETLHWLVQGSSFHSSRREKYGNIFKTHLLGKPVIRVTGAENIRKILLGEHHLVSTQWPQSTQIILGSNTLSNSIGELHRQKRKMMSKVLSSAALESYLPRIHEAVRWEVRSWCRGVGPVSMLSCAKALTFRIAARILLGLSLTDTQFQELTRTFEQLVENLFCLPLDIPFSGLRKGMKARDTLHQYMEEAIKEKLSKRDPDACEDALDYLINSA
KEGGKEINMQELKESAIELIFAAFLTTASASTSLVLLLLKHPSAIHKIRQELASHGL
SEHCEQCLPATENPNNNILQDNGHQCLTAGCQLPLVMGTEGQVKTLWEQTKQLLTDRTDK
DPQNSLSSKNLVNGENRIQEAPCSHDKSNCSPVPGKLQNSVFEGT
CQQNISLEKLKSLHYLDC
VVKEVLRLLPPVSGGYRTALQTFELDGYQIPKGWSVMYSIRDTHETAAVYQNAEMFDPER
FSTERDEGKLGRFNYIPFGGGARSCIGKELAQIILKILAMELVTTAKWELATPSFPKM
QTVPVVHPVDGLQLSFSFLGSNDSDKAARNRSLANP*
>CYP26C1 BC111476 Xenopus laevis cDNA
MFLLEISYTSFFEATLTSVLSLVLLLAASHQLWSLRWHSTRDRG
STLPLPKGSMGWPFFGETLHWLVQGSSFHSSRREKYGNVFKTHLLGKPVIRVTGAENI
RKILLGEHSLVSTQWPQSTQMILGSNTLSNSIGELHRQKRKVMSKVLSSAALECYFPR
IQEAVRWEVRGWCRGVGPVSMFACAKALTFRIASRILLGLSLTDSQFHELARTFEQLV
ENLFSLPLDIPFSGLRKGIKARDTLHQYMEEAIKEKLTRRDPDACEDALDYLINSAKE
GGKEINMQELKESAIELLFAAFLTTASASTSLVLLLLKHPSAILKMRQELASHGFSKQ
CQCLPDMENPNNNILQDNGHRCLTAGCQLPLLMGTEGHLKTQGEQTEQLLTDKTDPQN
SLSSKNPLKGKNRIQEAPCSHDKSTCTPVPGKLQSPVSEGT SQQNSNLEKIKSLHYLE
CVVKEVLRLLPPVSGGYRTALQTFELDGYQIPKGWSVMYSIRDTHETAAVYQNAEMFD
PERFSSERDEGKLGKFNYIPFGGGVRSCIGKELAKVILKILAMELVTTAKWELATPSF
PKMQTVPVVHPVDGLQLSFSFLSSSDRDRAARNGSLA
>CYP27A1
DR832386.1
CX969640.1 DR852196.1
best hit to CYP27A1 in ESTdb X.tropicalis Xin Liu
best match in human = CYP27A1 55%, probably a CYP27A1 ortholog
cannot extend in ESTdb
Trace archive 570051728(+), walked to 411568263(+) 494948503(+)
MPSASKLGFLPLGRCRWLLHTGRGVSVSQGRAVAGAAVGAVGEEKKMKTF
EDLPGPSLLTNIYWVFLRGYILYTHELQAIYKKNYGPMWKST
LGRYKTVNIADVDILETVLRQEGKYPMRSDMEVWKEHRRQRDLSLGPFTEEGHKWHTLRS
VLNKRMLKPAEAMLYTGVVNEVVTDFLVRLEEMRSETPSGDMVNDIPNALYRFAFEGISY
ILFETRIGCLEKQIPVETQRFIDSIGAMLKNSIFVTIFPPWTNNLLPYYKRYMDSWDNIF
AFGNKLINEKMKKIEARLERDEEVQGEYLTYLISSGKLTDKEIYGSVAELLLAGVDTTSN
TLSWALYHLAREPEIXNALYQEVIGVVPGQNIPTSEDISSMPLLRAVIKETLRL
>CYP27A like CX393015 DR862140.1 DN030991.1 49% to 27A1 human
scaffold_31:1,728,368-1,736,213 19854_prot UCSC browser
MIAQRLQTGA
QALLQQSCRASVQTVRKKATLGVSGATVVEGKTLKTLDDLPGPSPLKLLYWIFLRGYLFR
THELQVIFRKTYGPMWKMSDRQHAMVTVASPDLLESLLRKEGKYPTRADMFIMREHRDLR
GHSYGPVTEEGHQWHRIRTILNQRMLKPRETVVYAGSMNEVVSDLLLKIKELTAQSSSGT
QVNGVAELMYKFAFESICTVLFETRLGCLNKEILPETQKFIDSIGIMLEHLTMLTRLPQW
TKGILPYWGRYIEAWDTIFDFGKKLIDKKMEDIEGRLKRGEEVEGEYLTYLLSSGKLSME
EVYGSVVELLQAGVDTTSN
TLTWALYQLSRNPEIQNNLYQEVIRVIPGETIPDSEAIARMPLLKAVIKETLRL
FPVVPENARMINEKEVTIKDYVFPVKTQFILGHYAISRDETTFPEADRF
LPERWLRDSGMKHHPFGSIPFGYGVRACVGRRIAELEMHLALSRIIKMFQVIPDPDLGEVGAKNRAVLVANRPVNLRFIERQPRPE*
>CYP27?
AL787054.2 DT421274.1 CR427561.1 Quynh Tran
50% to 27A1 human, 42% to 27B1 human and 38% to 27C1 human
Also BC094536.1
MRKGCHALLWKTCWANVQTGRE
KATLGVAGAVAEQEKKLKMSTDLPGPSTLNILYWVFLRGYVFESHKLQVIWKKRYGPLW
KTCIGSHRLVNVASPELLETLLRQEGKYPMRTDMFMWKEHRDLQDFSYGPLTEEGHRWHT
LRRVLNQRMLKPKEAVRYTESFNDVVTDLLVVIKEITAQSPNRTTVDGVANLMYKFAFES
ICTVLFETRIGCLKKEIPPETEKFINSIAIMLENQTRMEKLPRWTRGIFPYWRRFVEGWD
NIFIYGKKLIDKKMEEIEGRLKRGEEVEGEYLTYLLSSGKLSMEEICGSVAELLQAGVDTTSNT
LTWALYQLARNPEIQHNLHQEVIGVTPGDTIPDSEAIARMPLLRAVIKETLRLYPVVPEN
GRVVTEKDVILNDYIIPKNSQFVLCHYALSRDETQFPEPDRFLPERWLR
DSGMKHHPFSSIPFGYGVRACAGRRIAELEMHLALSRIIKMFQVVPDPELGEVGTKNRTV
LVSSRPINLQFIER*
>CYP27B1
NM_001006906.1 CYP27B1 54% to human
MAQTLKLGSSRSSQLFRGLQELWAETVLKNSEKVIKGHKSLADM
PGPSTVSFISDLFCRRGLARLHELQLEGKAKFGPVWKASFGPILTVHVAEPSLIEQVL
RQEGKHPIRSDLSSWKDYRQCRGHSYGLLTAEGEEWQQFRSILGKHMLKPKEVEAYSD
VLNDVVGDLIKKINYQRSQNQNNVVKDIAKEFYMFGLEGISSVLFESRIGCLEPTVPK
ETEKFIQSINTMFVMTLLTMAMPKFLHKIFRKPWQKFCESWDYMFAFAKGHIDKRMKD
VAQKLAQGEKVEGKYLTYYLAQEKIPMKSIYGNVTELLLAGVDTISSTLSWSLYELAQ
HPDIQSAVYSEVEEILQGKQIPSPSDVARMPLLKAVVKEVLRLYPVIPGNARVVADRD
IQVGDYIIPKKTLITLCHYATSRDENVFSNPNEFQPDRWLKKEDTHHPYASLPFGFGK
RSCIGRRIAELEVYLALARILSHFEVKPEQPGSLVMPMTRTLLVPEKEINLQFLER
>CYP27C1
NM_001011341.1 short at N-term (58 aa)
CX494774.2 68% to human
from refseq database
MAALGQLLRGSARLEGLARSFHRFPGAQAAGQALEHEQAEGVLGATVKGSPMVKNLKE
MPGPSTMANLVEFFWRDGFGRIQEIQQKHARQYGRIFKSHFGPQ
FVVSIADKDMVAQVLRAERDAPQRANMESWHEYRELRGRSTGLISAEGEKWLNMRSVL
RQKILRPRDVAMYSGGVNEVVEDLVKRIRKLRVQESDGLTVTNVNDLYFKYSMEAIAT
ILYECRLGCLDDQIPQQTKEYIEALELMFSMFKTTMYAGAIPKWLRPLIPKPWREFCR
SWDGLFKFSQIHVDDRLRQIESQLEKGEEVQGGVLTHLLLSKELDLEEIYANMTEMLL
AGVDTTSFTLSWATYLLAKNPGIQEAVYQQIVQNFGKDQVPTAEDVPKMPLVRAVVKE
TLRLFPVLPGNGRVTQDDLVVGGYFIPKGTQLALCHYSTSYDAECFPAAEEFRPERWI
RSGNLERKENFGSIPFGYGIRSCIGRRVAELEMHLLLIQLLQNFEIKPSPQTTTVLPK
THGLLCPGGKINVRFVDRQ
>CYP39A1
CX851900.1
CX931743.2 CX956889.1
best hit to CYP39 in ESTdb X. tropicalis N.Liao
best match in human = CYP39 51%, probably a CYP39 ortholog
MDPIASVSSALLSPTAALGLLVALLTAVLVRYLLPNG
SQKPPYPPCIRGWIPWFGAAFDMGKAPLEFIARAREKHGPIFTVLAAGNRLTFLSGKEGI
SAFFSSKEADFQQAVQKPVQHTASINKEDFLKSHSSIHETIKLRLSQNRLHLYFDRIRNEFSTRIE
LLNPEGTEDLFALVKKVMYPAVADTLFGKGLCPTGKGKLEEFAEHFWKFDEGFEYGSQLP
EFLLRDWSQSKQWLLRLFKKIVIEAEMNNPLEETSKTLHQHLLDTLKGNSTYNNSLLLLW
ASQANANPVTFWTLGFIISDPLVYKAAMDEIHSVFGKAGNKELNMNEAELKRLPFIKTCV
LEAIRLRSPGAITRKAVQPLKINNYLVPAGDLLMLSPYWLHRDPTLFPEPEMFR
PERWSKANLEKNVFLEGFVAFGGGKYQCPGRWFALMEMHMLVVMMLYKYEFSLLDPLPKQ
SNLHLVGTQQPDGPCRVRYKLRK*
>CYP46A1
NM_001032346 CYP46 ortholog 54% to human CYP46, 82% to NM_001032348
from refseq database, note: this frog has two CYP46 genes
MGLWALIGWAALLLLALILICFLLFSGYIHYIHMKYDHIPGPPR
DSFFLGHSPTMLRLMKNNLLMYDHFLGWVQKYGPVVRINGLHRVIILVVSPEAVKELL
MSPKYSKDKFYDVIANMFGVRFMGKGLVTDRDYDHWHKQRRIMDPAFSRTYLMGLMGP
FNEKAEELMEKLMEKADGKCEIKMHDMLSRLTLDVIGKVAFGMELNSLNDDLTPFPKA
ISLVMKGIVEMRNPMVRYSLAKRGFIRKVQESIRLLRQTGKECIERRQKQIQDGEEIP
VDILTQILKGAAMEEECDPEILLDNFVTFFIAGQETTANQLSFVVMELGRNPEILEKA
QAEIDEVIGSKRDIEYEDLGKLQYLSQVLKETLRLYPTAPGTSRGLTEDMVIDGVKVP
ENVTIMLNSYIMGRMEQYYSDPLTFNPDRFSPDAPKPYYSYFPFSLGPRSCIGQVFSQ
MEAKVVMAKLLQRYEFELAEGQSFKILDTGTLRPLDGVICRLRPRTSKKAATLQ
>CYP46A4
NM_001032348.1 CYP46
again 53% to human CYP46
from refseq database, note: this frog has two CYP46 genes
zebrafish also has 2 CYP46 genes
MGLWALFGWASLLLLALTLICFLLFCGYIQYIHMKYDHIPGPPRD
GFIFGHSPTILRLMKNNKVVYDQYLDWVQ
YGPVVRINALHRVIVLITSPEGVK
EFLMSPKYSKNDIYDRVATLYGM
RFMGKGLVTDKDHDHWYKQRRIMDPAFSR
TYLMDLMGPFNEKAEELMERLSEQADGKSDTEMHNLFSRVTLDVIAK
VAFGMELNSLKDDLTPLPQAISLVMNGI
VETRNPMIKYSLAKRGFIRKVQESIRLLRQTGKECIERRQKQIQDGEEIP
MDILTQILKGAALEEDCDPETLLDNFVTFFIAGQETTANQLSFAVMELGRNPEILQKA
QKEIDEVIGSRRFIEHEDLSKLHYLSQVLKETLRLYPTAPGTSRGLKEEIVIEGVRIP
PNVNVMFNSYIMGRMEQNYTDPLTFNPDRFSPGAPKPYYTYFPFSLGPRSCIGQVFSQ
MEAKVVMAKLLQRYDFELAEGQSFSIFDTGSLRPLDGVICRLRPRTSNTATTNKYIF
>CYP46A5 CX981536.1 CX970619.1 CX970620.1 CX370643.2
S. Aggarwal 87% to 46A1 Xenopus trop. and 84% to 46A4
54% to 46A1 human
scaffold_588:627624-672691
MGLWAILGWAALLLLALILICFLLYCGYIHYIHMKYDHIPGPPRDR (2)
SFIFGHSTALLKLVNENLLMYDYFLDW (2)
VHKYGPVMRINGLHKVAVLVASPEGIK (0)
EFLMSPKYLKDEFYDFFGSLFGER (2)
LMGKGLLTDRDYDHWHKQRRIMDPAFSRT (2)
YLMGLMGPFNEKAEELMEKLSENSDRKCEVNMHDMFSKVTLDVIGK (0)
VGFGMELNSLNDDQTPFPRAISLVMKGSVEIRNPMIK (0)
YSLAKRGLIRKVQESIRLLRQTGKECIERRQKQIQDGEEIPVDILTQILRGA (1)
ALEKDCDPETLLDNFVTFFIA (1)
GQETTANQLSFAVMSLGRNPEILKK (2)
AQAEIDEVIGSKRDIEYEDLGKLSYLSQ (0)
VLKETLRLYPTAPGTSRTLENEIVIDEVRIPGNVTLM (0)
LNSYVMGRMEQYYKDPLMFNPDRFSPDAPK (2)
PYFTYFPFSLGPRNCIGQVFSQ (0)
MEAKVVMAKFLQRYEFELAEGQSFKILDTGTLRPLDGVICRLRSRTNNKKANK*
>CYP51A1
NM_001016194.2 80% to CYP51 human, CYP51 ortholog L.Zhu
MLLSLWEAGGTLLEEAVGGSLASRILIPCTFLLALAYVSKLAFK
HLQAEDPGNVKYPPFISSNIPFLGHAIAFGKSPISFLENAYDKYGPVFSFTMVGKTFT
YLVGSDAAALLFNSKNEDLNAEDVYSRLTTPVFGKGVAYDVPNPIFLEQKKMLKTGLN
IAHFKTHVQMIEEETQEYFERWGDSGVRNLFEALSELIILTASRCLHGKEIRSMLNER
VAQLYADLDGGFTHAAWLLPGWLPLPSFRRRDRAHREIKNIFYQVIQKRRNSAEREDD
MLQTLLDATYKDGTPLNDDEIAGMLIGLLLAGQHTSSTTSAWMGFFLAKNKSLQAQCF
AEQKAVCGEDLPPLNYDQLKDLQALDRCIKETLRLRPPIMTMMRMARTPQSVAGYNIP
PGHQVCVSPTVNHRLRDTWDKNTDFNPDRYLHDNPAAGEKFAYVPFGAGRHRCIGENF
AYVQIKTIWSTMLRMYEFELVDGYFPTINYTTMIHTPNNPVIRYKRRKN