472 cytochrome P450 sequence pieces from Amphioxus. Very fragmentary with two haplotypes for many genes.
From JGI Branchiostoma assembly Jan 30, 2008
Search for P450 at 1.0e-5 or less (481 results, some false positives)
This file has clans 7, mito, 19, 20, 26, 46, 51, 74
CYP7 clan (12 sequences) includes CYP39, no CYP8 sequences found
$$$$$$$$$$
>fgenesh2_pg.scaffold_10000055|Brafl1
41% to CYP7A1
MPCSSCLAVMIKVMIPTTSTDHGWNLYPPVSLCRQGGVTPPTGWVVTPESYAILCPSWQQVFCEQYQCLSAATLLPAGLL
EAPPLFSYYSGARGVVHGGQGPVGHAPPPDSEMFSENVEYSGAITEQELWFEIYCLFQWNKTVFLFRVRGSSSSFEVTQT
NSPDSVRQALITAGLSSTRLRRAGGTCSTRDDYTYIMWKFLITLTSCFCMKRSNKVSPVEDGDVKEETAPGEEEMNRGTT
ECPVVTAQPPMSQPRSSKSSADVLAELRQDGLLPLNTRGESVAFQVPASEPDAPPRRPVKLAKLEETLQERRERVKKEPA
GSRTKLRQQLSDAANRRDEMLQNRSRKLAESSRRAKAKARAAKKERKSTAFVISSVSDTDAIVPRDSEKAQALEKRLSKR
RKRVAKRITAEDMKKQQELAAERRRRSNKVSPVEDGDVEKEPAPGEEEMVRGTTECPVVTAQPPMSQPRSSKSSADVLAE
LRQDGLLPLNTRGESVAFQVPLVKPASEPDAPPRRPVKLAKLEETLQERRERVKKEPAGSRSKLRQQLSDAANRRDEMLQ
NRSRKLAESSRRARAKARAAKKEGKSTAFVISSVSDTDAIVPRDSEKAQALEKRLSKRRKRVAKRITAEDMKKQQELAAE
RRRCHIDRLAYLSTYSK
MVTELLGVCLAVVLVFVLLQVTTRRRRPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWKKYG
DVFTVRLAGHYTTFVLDPHSFTHAIRNSKVLDFRVFSSKIAHRAFGMPIVYGTHRDWVRADSDALYPKELQGQGLEKVTE
VMMNNLQSAMLAATDVKDKWNKGELWSFVYRIMFSASYKTLFGRHKEDEEETARLLHAMEEFQKYDKRFPEIISNVPWWL
MGQTKKRYEYLKSMVSPTELSQRGVSDFIRMRQEIYADGNLSPDEMTGFNFATMWASLSNTVPAAFWTLFYLLKDPVAMD
AVREEVNQILKETGQSLETVKEAGEMLHVTREQLNDMKCLGSAINEALRMCSASIIIRVATEDAELALESGSTFRIRKGD
RVALYPGFLHMDPEVFDDPETFKYDRFLENGMEKTTFYKNGRKLRHYLLPFGHGASMCPGRFFALNEIKQFVTIVVCYFN
MELMEKQTPPKDQSRAGLGTLAPLKECLFRYSLK*
>fgenesh2_pg.scaffold_63000051|Brafl1
38% to CYP7A danio
96% to fgenesh2_pg.scaffold_10000055|Brafl1
MVTELLGVCLAVVLVFVLLQVTTRRRRPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWKKYGDVFTVRLAGHYTTFVLD
PHSFTHAIRNSKVLDFRVFSSKIAHRAFGMPIVYGTHRDWVRADSDALYPKELQGQGLEKVTEVMMTNLQSAMLAATDVK
AEWNKGELWSFVYRIMFSASYKTLFGRHKEDEEETARLLHAMEEFQKYDKRFPEIISNVPW
(gap)
CLMGQTKKRYEYLKSNTVP
AAFWTLFYLLKDPVAMDAVRAEVDQILKETGQSLETVKEAGKMIHVTREQLNDMKCLGSAINEALRMCSASIIIRVATED
AELALESGSTFRVRKGDRVALYPGFLHMDPEVFDDPETFKYDRFLENGMEKTTFYKNGRKLRHYLLPFGHGVSMCPGRFF
ALNEIKQFVTIVVCYFNMELMEKQTPPKDQSRAGLGTLAPLKECLFRYSLK*
>fgenesh2_pg.scaffold_1047000003|Brafl1
only 6 aa diffs to fgenesh2_pg.scaffold_10000055|Brafl1
MVTELLGVCLAVVLVFVLLQVTTRRRRPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWRKYGDVFTVRLAGHYTTFVLD
PHSFTHAIRNS STGGTEDQRSTTCVQIR ASYKTLFGRHKEEKDETALLLHAMEEFQKYDKRFPEIISNVPWWLMGHTKKR
YEYLK
DTCRHDRQCISRHGVGSCCAPRRPIFSPLPVCKSAGQVGDTCQRSGERLAYPTSVGRRQYIFTCPCAEGLQCELF
SGYADIGTCVPVQY*
>estExt_fgenesh2_pg.C_4350040|Brafl1
25% to CYP8b.c, 23% to CYP7B1 human
MGSVLGTLQLLGWNNQMLKPNREEDFVEKNIGFPCRVVTGNKTVQSVFDIDLFKKEEFCFGVVGEVRKDFTEGVCPCILS
NGKIHEKNKGFLMEVIAKAGEDIPPSTALSVLSNISKWGSTPMSDFESKLTDVAADAFLPNIFGESTHFHGEEIRLYRSG
AI AVRLSIVKALTGRNLDEERRAMTSILEKIKTSERYQQLLDLGKSYGLGEKEATAQLLFPVFINGAYGLAAHLVCTFAC
LDTISAEDREELREEALAALKNHRGLTRESLEEMPKIESFVLEVLRFCPNPVFWSTIATCPTTVEYTTDSGEHTLKIEEG
ERVYASSYWALRDPAVFDKPEDFMWRRFLGPEGDALRKHHVTFHGRLTDTPAVNNHMCPGKDVSLSALKGSIAIFNTFFG
WELQEPPFWTGKKLSRGSLPDNEVKIKSFWVQHPE
DLKEIFPSHFQDIVNEVDDVGDIDVLVKTKTGKYSGSGTNSNVYI
RLFDDKGHQSRELQLDVWWKDDFEKGQEGQYKLKDIKVAAPIVKIELFRDGCHPDDDWYCESVSVQLNPDNNGPTYDFPV
NRWIRQNDHVWLSPGGGEPPKDDVNPIDD*
>CYP7 estExt_fgenesh2_pg.C_10470002|Brafl1 40% to CYP7B1
no allele
MISGILAGCLVVLVVAILVQAVGRKRDPNEPPLESGPVPYLGVALQFAMDSLKFIRSRQKKYGDVFTVKLAGKYTTFVLD
PHSYSDVMRQHKILDFKTVGMDIVERGFGTTHFEKTGRAHVLHTADAYFPVHLQGNALDPLTNTMMGHLQTAMLADIGEAA*
$$$$$$$$
>estExt_fgenesh2_pg.C_1950037|Brafl1
27% to CYP7D1, 30% to CYP7B1
MGGVWSNTYGFIKGVTDGVHMMKPEGEHPSVVRTNPGLPVVALMNQDTIHYAINPETYKKEPYSFGPVGVSKDVLRGHCP
SMFSNDEDHRRKKALLVDAYKQGEKSLPSILFNQIKAHFGEWSRLKDVPDFEERVFHIMSETLTEALFGRKIDGQLCFTW
LNGLITEAKTWIPMPSLAWKRRQAIKAIPELLKAIETAPKYRELVQLCHTHGVEVEEGIFTILYGTLFNGCAAQTAAIVS
SVARLHTLSDAEKN
EIIQTTLQVLEKHGGVSEESLGEMKTLESFILEVLRLHPPVFNYWVLARKDLVISPEKENIKVRKG
ERMLGCCFFAQRDGSVFPDPDRFRWNRFLDEQGGQKKHLFFPRGSFTEAADLNSHQCPGQDIGFFMMKTTLSVFLCYCSW
ELKDAPVWSDKPIRVGNPDDPVRLVRFNFRSEQ
AGRALTQGNRLVLIRAQVCLAVWTLTHLSVSRLVLKLDATTMPRNQR
APGSGGLPVSERRTRGHEKEIEAGWERSKFNEFVSDLVSLERSLPDTRPVRCHKAQVLDNLPTTSVIICFCEEAVSTLLR
SVHSVINRSPPHLLKEIILVDDASTAAYLKEDLDTYMSKFPQVKIVHLPEREGLIRARLRGAEIATGDVLTFLDSHIECN
VGWLEPLLDRIGRNRTTVPCPSIDRINDNTFGYEAANENMRGGFNWGMKFDWVSLPPGEDDRRYQDIWSQNEIIKSPTMA
GGLFSIDRRFFWELGGYDPGFQIWGAENLEISFKDIFYALNPHVENEIANAGDVSDRKRMREQLGCKSFQWYIDHVYPEI
TIPDLRAKARGEVKNRAMSLCLDAVYGEKVGAYFCHGEGGQQSFTLRMDDKIMLRWFFSVCLAAGLPIRNHKGAFLLTKK
PCTAPEVIAWNHTKGGPLVDQKTGKCLGVVNLSPEEHLVALRPCNQQRVQDWTFQNYLVDM*
>estExt_fgenesh2_pg.C_3320046|Brafl1
27% to CYP7D1
MGGVWSNTYGFIKGVTDGVHMMKPEGEHPSVVRTNPGLPVVALMNQDTIQYALNPETYKKEPYSFGPV
GVSKDVLRGHCP
SMFSNDEDHRRKKALLVDAYKQGEKSLSSILFNQIKAHFGEWSRLKDVPDFEERVFHIMSETLTEALFGRKIDGQLCFTW
LNGLITEAKTWIPMPSLAWKRRQAIKAIPELLKAIETAPKYRELVQLCHTHGVEVEEGIFTILYGTLFNGCAAQTAAIVS
SVARLHTLSDAEKNEIIQTTLQVLEKHGGVSEESLGDMKTLESFILEVLRLHPPVFNYWVLARKDLVISPEKENIKVCKG
ERMLGCCFFAQRDGSVFPDPDRFRWNRFLDEQGGQKKHLFFPRGSFMEAADLNSHQCPGQDIGFFMMKTTLSVLLCYCSW
ELKDAPVWSDKPIRVGNPDDPVRLVRFNFRSE QAGRALVNTSAKKI*
>estExt_fgenesh2_pg.C_1940045|Brafl1
27% to CYP7D1
MGGVWSDTFGFIKGLVHGPHMMKPEGEHPSVFRANPGVPAVVLLNRDTIQYAFNPETYEKEPYSFGPVCAAKDVVGGHCP
SMFSNDEDHRRKKALLIDVYKQGQKTLPSVFFSQIKAHFEEWSRLEDVPDFEERVFHITSETLTEALFGKKIDGRLCYTW
GNGIPTDFRTWIPIPPAARKRRQAVEVLPALLKAIKETPKYQELVQLCHTHGVEVEEGILTILYGTLFNGCGAQTATIIS
SVACLHTLSDAEKNEIIQTTLQVLEKRGG
ISEESLSEMKTLESFILEVLRLHPPVFNYWALARKDLVISPEKENIKVCKG
ERMVGSCFWAQRDGSVFPDPDRFRWNRFLDEDEQGGQKKHLFFPRGSWTEAADLDSHYCPGQDIGFFILKVLLAVLLGYC
SWELKDAPV WSDNTFRLGNPDDPVRLARFNFRSEQAGRALGIRPDNIAPNAI*
>estExt_fgenesh2_pg.C_510020|Brafl1
30% to CYP4V6
90% to estExt_fgenesh2_pg.C_1940045|Brafl1
87% to estExt_fgenesh2_pg.C_3320046|Brafl1
87% to estExt_fgenesh2_pg.C_1950037|Brafl1
MKPKGEHPSAFRMNNGVPAVVLLTRDTIQYAFNPETYEKDPYSFGPGGVSKDVVRGHCPSMFSNDEDHRRKKALLIDVYK
RGQKTLPSVFFSQIKEHLEEWSRLEDVPDFEERVFHIMSETLTEALFGRKIDGELCFTWLNGLLTDFKTWIPIPSMSRKR
RLAIEALPALLKAIKEAPKYQELVQLCHTHGVEVEEGIFTILYGTLFNGCAAQCAAIVSSVARLHTLSDTEKNDIIQTTL
QVLEKHGG VSEESLGEMKTLESFILEVLRLHPPVFNFWCLARKDLVISPEKENIKVCKGERMVGCCFWAQRDESVFPDPD
RFRWNRFLDEDKQGGQKKHLFFPRGSWTEAPDLDSHQCPGQDIGFFMMKALLAVLLGY CSWELTAAPMWSDKTIRVGNPD
DPVRLARFNFRSEQAGRALGIRPDNIAPNAI*
$$$$$$$$$
>CYP39 amphioxus 49% to CYP39 zebrafish, start MET not certain, 2 choices
MATTIGEHSPGDELYNAFKY
MILFSLCFAFFSWRNIVKKGRPPCMDGWIPWFGCAIDFGKAPLDFIEETKRK
(0)
LGPVFTIVAAGRWMTFVTEPEDITTFFQSPNLDFQKAVQDPVSHT
(1)
ASVSTESFFQHHTKIHDTIKGRLAPANLHSFCSNLWGEFKQQLEQLEHHGKDDLNTLVRR
(2)
CMFAAVVNNLFGAENVPTDKDRIQEFSDIFVKYDADFEYGSQLPPFFLR
(2)
EWAESKKWLLSLFSRSIANMERKETESQ
(0)
TLLQSLTKMVDRPHAPNYALLMLWASQANAVP(0)
MSFWVLAMILSNEDVHAAVKKEVQDNLGSP
(1)
GDEPITEEDLKKLPLLKRCIMETIRLRSPGVITRAVDKPLRIR
(0)
KYIVPKGHLLMMSPYWAHRNPNFFPEPDKFLP (0)
DRWLDADLEKNLFLDGFVGFGGGRYQCPGR
(2)
WFALMEMQMLLAMMIQMFDFKLLGEVPKEVCQNFNYLISIHII*
>fgenesh2_pg.scaffold_124000018|Brafl1
45% to CYP39A1
MATTIGEHSPGDELYNAFKYMILFSLCFAFFSWRNIVKKGRPPCMDGWIPWFGCAIDFGKAPLDFIEETKRKLGPVFTIV
AAGRWMTFVTEPEDITTFFQSPNLDFQKAVQDPVSHTASVSTESFFQHHTKIHDTIKGRLAPANLHSFCSNLWGEFKQQL
EQLEHHGKDDLNTLVRRCMFAAVVNNLFGAENVPTDKDRIQEFSDIFVKYDADFEYGSQLPPFFLRSIANMER
(gap)
KETESQT
LLQSLTKMVDRPHAPNYALLMLWASQANAVPMSFWVLAMILSNEDVHAAVKKEVQDNLGSPGDEPITEEDLKKLPLLKRC
IMETIRLRSPGVITRAVDKPLRIRKYIVPKGHLLMMSPYWAHRNPNFFPEPDKFLPDRWLDADLEKNLFLDGFVGFGGGR
YQCPGRWFALMEMQMLLAMMIQMFDFKLLGEVPKESPLHVVGTQQPVGPCPVEWTKI*
>CYP39A1 fgenesh2_pg.scaffold_124000030|Brafl1 45% to
N-term
MATTIGEHSPGDELYNAFKYMILFSLCFAFFSWRNIVKKGRPPCMDGWIPWFGCAIDFGKAPLDFIEETKRKLGPVFTIV
AAGRWMTFVTEPEDITTFFQSPNLDFQKAVQDPVSHTASVSTESFFQHHTKIHDTIKGRLAPANLHSFCSNLWGEFKQQL
EQLEHHGKDDLNTLVRRCMFAAVVNNLFGAENVPTDKDRIQEFSDIFVKYDADFEYGSQLPPFFLREWAESKKWLLSLFS
RSIANMERKETESQTLLQSLTKMVDRPHAPNYALLMLWASQANAVP
(gap)
KYIVPKGHLLMMSPYWAHRNPNFFPEPDKFLP
LSDLDGETRSMVEKMMYDQRQKAMGLPTSDEQKKEDVLKKFMEQHPEMDFSKAKFC*
Mito clan (28 sequences, some duplicates)
$$$$$$$$
>CYP11amphi
mixed seq 43% to Gene C, 35% to Gene B, 34% to gene D
36% to 27B1 fugu, 38% to 11A1 fugu, 33% to CYP24 fugu, 32% to 27C1 fugu
37% to chicken CYP11A1, 39% to catfish Ictalurus punctatus 11A1
This is a probable CYP11A gene
(2)
EAKPFSALPGPPSVPVLGNFLHMWWEGLLEKEKLNKNHIMFTDFFRQYGPIFR (2)
(2)
LKIVNVDMVSIKDPVAVQELFRKEGKYPARIDIKPWRRYREISGKATGVFLS (2)
(2)
NGKDWQKNRSIMARPMLRPKHVSTYVSNLDTVSADMIKRLRVLQARADGIEV PNISDELFKWALE (1)
(1)
SICTVLFNERMGYLQDNISQDAQDFIQGIHTIFLTTNTVIFPDADVHRFLRTKPWRQSVEAWDTVFRV(1)
(1)
GEKVMVRKLQEALEREERGEGEDDQPNFLAFVNSTGRLTKDEIYSNTIELMGAAIDT (0)
(0)
TSNTLLWTLYELSRRPELQDRLYQEVTQVIGQDKVMTWDHLKDLHLLKAIIKETLR (2) 885
(2)
MYPVVHNVSRLLQEDTVLMGYRLPAK (0)
(1)
TCVVAQVYAMGRDPQLFPDPDEFKPERWLRTGEAHDEINPYSSLPFGFGPRSCL (1)
(1)
GRRVAEVELQLLLAK (0)
(0)
MSQQFVLSQVEPEEISSVAQPLLMPETPLHLRFVDRK*
>fgenesh2_pg.scaffold_28000018|Brafl1
34% to CYP27C1 98% to 11amphi above 6 aa diffs
MMSVPVISGSRQRLSAVVGRAVSPWRPQGHIRVRALVGYRSGLVGPRTVPSPVQTYSTAAVGSTSHHNDDSEAKPFSALP
GPPSVPVLGNFLHMWWEGLLEKEKLNKNHIMFTDFFRQYGPIFRLKIVNVDMVSIKDPVAVQELFRKEGKYPARIDIKPW
RRYREISGKATGVFLSNGKDWQKNRSIMARPMLRPKHVSTYVSNLDTVSADMIKRLRVLQARADGIEVPNISDELFKWAL
ESICTVLFNERMGYLQDNISQDAQDFIQGIHTIFLTTNTVIFPDADVHRFLRTKPWRQSVQAWDTVFRVGEKVMVRKLQE
ALEREERGEGEDDQPNFLAFVNSTGRLTKDEIYSNTIELMGAAIDTTSNTLLWTLYELSRRPELQDRLHQEVTQVIGQDK
VMTWDHLKDLHLLKAIIKETLRMYPVAPNVSRVLQEDTVLMGYMLPAKTCVVAQVYAMGRDPQLFPDPDEFKPERWLRTG
EAHDEINPYSSLPFGFGPRSCLGRRVAEVELQLLLAKMSQQFVLSQVEPEEISSVAQPLLMPETPLHLRFVDRK*
$$$$$$$$$$$
this block related to gene B
>Gene B 84% to Gene D, 35% to CYP11 amphi, 33% to Gene C
30% to CYP24 Fugu, 30% to 27A3 fugu, 27B fugu, 27C fugu, 30% to 11A fugu
in nr blast best mammal hit is CYP24 mouse, but Drosophila hits are better.
34% to 49A1 D. melanogaster
MYQLLSAARHQGQSLFRVCRARSLAALKTTYRPQSNKAEESVTYDTAARPFEEIPGPKGLPLIGTALEYTPF(1)
(1)
GQFKMITNLRESFRERTRTYGSIYRERIGPLDLVVISDPKEIEKVFRNE
GRYPERIELASIKVYREIKKLPTGLINL (2)
(2)
NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVQNFINYVYRWALE (1)
(1)
AISVVVLDKRLGCLTLGDLEPGSDAKLMIDGVNDFFDAFVKLEMSATGL YKYISTPTWRKFAKAVDQFHR (2)
(2)
VAEKLLKEKLAKTTTEDGKPAESDTDFLQSLLSRNDVTFEEAMEMAVDLLSAGIDT
(0)
SGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFR (2)
(2)
VYPTVLNNVRRLDQDIVLSGYVVPAK (0)
(0)
TTILLAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCL (1)
(1)
GRRFAEQELHLGLIR (0)
(0)
IVQNFHVGWAGEDMKQDNRIILAPDRDSFVFSERT*
estExt_gwp.C_8820003|Brafl1 34% to CYP24
2134 6.9e-224 1
>fgenesh2_pg.scaffold_214000064|Brafl1
3 genes fused,
31% to CYP24
MYQLLSAARHQGQSLFRVCGARSLAALKTPCRPQSNKAEESVTYDTAARPFEEIPGPKGLPLIGTALEYTPFGQFKMITN
LRESFRERTRTYGSIYRERIGPLDLVVISDPKEIEKVFRNEGRYPERIELASIKVYREIKKLPTGLINLNGPEWQRVRSS
VQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVQNFINYVYRWALEAISVVVLDKRLGCLTLGDLQPGSDA
KLMIDGVNDYFASLVKLEMSATGLYKYISTPTWRKFAKAIDQWHFVAAKLLKEKLAKSATKDGKPAESDTDFLQSLLSRS
DVTFEEAMLMAVDLMAAGIDTSGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFRVYP
TVLNNVRRLDQDIVLSGYVVPAKTTILLAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCIGRRF
AEQELHLGLIR
30% to CYP24
SNKAEESVTYDTAARPFEE
IPGPKGLPLIGTALEYTPFGQFKMITNLRGSFRERTRTYGSIYRERIGPL
DLVVISDPTEIEKVFRNEGRYPERIELASIKVYREIKKLPAGLINLNGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTR
DLVDVIRALIGKEESGGQVQNFINYVYRWALEAISVVVLDKRLGCLTLGDLQPGSDAKLMIDGVNDYFASLVKLEMSATG
LYKYVSTPTWRKFAKAIDQWHLVAAKLLKEKLAKTATKDGKPAESDTDFLQSLLSRSDVTFEEAMLMAVDLMAAGIDTSG
NTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFRVYPTVLNNVRRLDRDIVLSGYVVPAK
TTILMAHDVISSLPEYYPEPEVYRPERWLRDDESSSVQPFTLLPFGYGPRMCIDPNKKVRMY
31% to CYP27A1
RLQRAVRHQGQSLFRVCG
ARSLAALKTTVTQTQSTRAEESGVYDTAARPFEEIPGPKGLPFIGTGWDYSPFGRFPIKTNFRDSFRERTRTYGSIYRER
IGPLDLVVISDPKEIGKVFRNEGKYPERPPMGSIKTYREVRKLPTGIANLNGPEWQRVRSSVQKDLMRPKTVGAYASLQD
DVTRDLVDVIRALIGREESGGQVQNFTNYVYRWALEAISVVVLDKRLGCLTLGDLEPGSDAKLMIDGVNDFFDAFVKLEM
SATGLYKYISTPTWRKFAKAVDQFHSVAEKLLKEKLAKTTTEDGKPAESDTDFLQSLLSRNDVTFEEAMEMAVDLLSAGI
DTTGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVRQETFRIYPTALSNMRTLDRDMVLSGYA
VPAKTIVLMAHDVISSLPEYYPEPEVYRPERWLRDDESSGVQPFTLLPFGYGPRMCIGRRFAEQELHLGLIRIVQNFHVG
WAGEDMKQVHRLILSPDRDTFVFSERT*
>fgenesh2_pg.scaffold_214000063|Brafl1
34% to CYP24
MSLLQRAVRQQGQSLFRVCGVRSLAALKTTYRLQSTRAEESVADDTAARPFEEIPGPKGLPLIGTALEYSPFGRFPIKTN
LRSSYRERTKIFGSIYREKIGPLDLVVISDPKEIEKVFRNEGRYPERLPLESIKAYRELKKLPAGVVNLNGPEWQRVRSS
VQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVQNFINYVYRWALEAISVVVLDKRLGCLTLDDLEPGSDA
KLMIDGVNDFFDSFVVLETSATGLYKYISTPTWRRFEKAIDQWHTVAAKLLKEKLAKGATEEGKPAESDTDFLQSLLSRN
DVTFEEAMMTVVELLAGGIDTTGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIGDKVLNRMHYLRAVVKETFRVYP
TVPNNLRKLDRDIVLSGYRVPAKTTVFMVDDVISSLPEYYPEPEVYRPERWLRDDESSSVQPFTLLPFGYGPRMCIGRRF
AEQELHLGLIRIVQNFHVGWAGEDMKQVNRMVFAPDRDTFVFSERT*
>fgenesh2_pg.scaffold_214000062|Brafl1
two genes fused
30% to CYP27A3 31% to CYP24
MQTLFSDWTGFSAFWTGQIFPKTPHTIDDFDSGLGSQSTRAEESVAYDTAARPFEEIPGPKGLPLIGTGLDYAPFGRFPL
KTHLRESFRERTKAYGSIYREKLGPLDLVVISDPKEIEKVFRNEG
(gap)
RNGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTR
DLVDVIRALIGKEGSGGQVQNFTNFVYRWALEAISVVVLDKRLGCLTLDDLEPGSDAKLMIDGVNDFFNAAVKLELSGAG
RLYKYISTPTWRKFANAIDQWHGVAAKLLKEKLTKSAAEDGKPAESDTDFLQSLLSRNDVTFEEAMLMAVDLMAAGIDTT
GNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFRLCPTVGNNIRTLDRDMVLSGYVVPA
KTKIFMAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCIGRRFAEQELHLGLIRLAVRHQG
QSLLRVCGARSLAALKPTY
25% to CYP24
RLQSTRAEESVADGTAARPFEEIPGPKGLPLIGTALDYTPFGRFPLKTNFRESFRERTRTYGSIY
REKIGPRELVVISDPKDIQKVYRNEGRYPERPQVDSIKTYREMKKLPAGIVVLNGPEWQRVRSSVQKDLMRPKTVGAYAS
LQDDVTRDLVDVIRALIGKEGSGGQVHNFINYVYRWTLESIGVVVLDKRLGCLTLGDLEPGSDAQLMIGGVNDFFNAFSK
LEMSATGLYKYISTPTWRKFQKAIDQWHTVAAKLLKEKLTQSTIEDGKPAESDTDFLQSLLSRNDVTFEEAMEMALDLL
(gap, missing I-helix) 37% to 27C1
VYPTFLNNVRTLDRDIVLSGYVVPGKTIIIIGNDIISSLSEYYPEPEVYKPERWLRDDEFSSVQPFTLLPFGYGPRMCIGR
RFAEQELHLGLIRIVQNFH
VGWAGEDMKQENRMVFAPDRDTFVFSERT*
>fgenesh2_pg.scaffold_214000072|Brafl1
33% to CYP27A3
MATGRATSRRNGQWGATLAIREINGPEWQR
VRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKKESGGQVQNFT
NYVYRWALEAISMVVLDKRLGCLTLNDLEPGSDAKLMIDGVNDFFDAFVKLEMSATGLYKYISTPTWRKFAKAFDQWHAV
AEKLLKEKLAKSAAEEGKPAESDTDFLQRLLSSKDITFEEAMMMAVDLMAAGIDTTGNTLMFNLFCLAKNPEAQEKLYRE
IQEVVPAGQPIDDKVLNRMHYLRAVRQETFRFYPTVLSNTRILDRDVVLSGYFVPAKTIVLMAHDVISSLPVYYPEPEVY
KPERWLRGDESSSVQPFALLPFGYGPRMCIGRRLAEQELHLGLIRIVQNFHVGWAGEDMKQNNRIILAPDRDTFVFSART*
>e_gw.882.7.1|Brafl1
RERTKIFGSIYREKIGPLDLVVISDPKEIEKVFRNEGRYPERLPLESIKAYRELKKLPAGVVNLNGPEWQRVRSSVQKDL
MRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVHNFINYVYRWALEAISVVVLDKRLGCLTLDDLEPGSDAKLMID
GVNDFFDSFVVLETSATGLYKYISTPTWRRFEKAIDQWHTVAAKLLKEKLAKSAAEDGKPAESDTNFLQSLLSRSDVTFE
EAMMTVVELLAGGIDTTGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIGDKVLNRMHYLRAVVKETFRVYPTVPNN
LRKLDRDIVLSGYRVPAKTTVFMVDDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCIGRRFAEQEL
HLGLIRVGSFAV*
>estExt_gwp.C_8820003|Brafl1
34% to CYP24
MSLLPRVVRHHGRLFNVCSARSLVTYRSQSTRAEESVAYDTAARPFEKIPGPKGLPLIGTGLDYAPFGRFPLKTHLRESF
RERTKAYGSIYREKLGPLDLVVISDPKEIEKVFRNEGRYPERVQLESVRTYREIKKLPIGVVNLNGPEWQRVRSSVQKDL
MRPKTVGAYASLQDDVTRDLVDVIRALIGKEGSGGQVQNFTNFVYRWALEAISVVVLDKRLGCLTLDDLVPGSDAKLMID
GVNDFFNAAVKLEMSGAGRLYKYISTPTWRKFANAIDQWHGVAAKLLKEKLAKSAAEEGKPAESDTDFLQSLLSRSDVTF
EEAMLMAVDLMAAGIDTTGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAEQPIDDKVLNRMHYLRAVVKETFRLCPTVGN
NIRTLDRDMVLSGYVVPAKTKIFMAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCIGRRFAEQE
LHLGLIRVSFVALFRH*
$$$$$$$$$
>Gene D 84% to gene B, 34% to CYP11 amphi, 30% to gene C
31% to CYP24 fugu
MSLLPRVVRHHGRLFNVCSARSLVTYRSQSTRAEESVAYDTAAR
PFEKIPGPKGLPLIGTGLDYAPF (1)
(1)
GRFPIKTNLRDSYRERTKTYGSIYREKIGPRELVVISDPKDIQKVYRNE
GRYPERPQVDSIKTYREMKKLPAGIVVL
(2)
(2)
NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIKALIGKEESGGQVHNFINYVYRWTLE (1)
(1)
AISVVVLDKRLGCLTLGDLEPGSDAQMMIGGVNDFFNAFAKLEMSATGL YKYISTPTWRKFQKAIDQWHT (2)
(2)
VAAKLLKEKLTQSTIEDGKPAESDTDFLQSLLSRNDVTFEEAMEMALDLLSAGIDT (0)
(0)
TGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAEQPIDDKVLNRMHYLRAVVKETFR (2)
(2)
LCPTVGNNIRTLDRDMVLSGYVVPAK (0)
(0)
TKIFMAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCI (1)
(1)
GRRFAEQELHLGLIR (0)
(0)
IVQNFHVGWAGEDMKQVNRLVLSPDRDSFVFSARA*
$$$$$$$
>GENE F 61% TO GENES D AND B
MSRILQIVGRRAAFTQAGLQNVPVWRPLGGRNGRGAASSAAATEQTTVQDGAARPFDEIPGPRGLPFIGTALDYSPF
(1)
(1)
GRFPIHTKMANSTIERYQTYGKIYREKIGLRDMVFVCDPKDIETVFRSDGRLPERPIPESIATYRRLKNKPLGVALL
(2)
(2)
NGEEWFRLRRSVNKDMMRPKAVGAYATMQDEVSRELVGLIQGVVRKGKTAGQVPDFTKLLYKWGLE (1)
(1)
ALSLVVLGKRLGCLTLDQLPEDSDAQRMIGAVNDFFYSFAKLQMSFPLFRYIRTPGWTTFERAMDTVSS (2)
(2)
ITEKMIGERLEKLRQMEEPPDEADFLTSLLSREDMNLDEAIQMSVDLLQGAIDT (0)
(0)
TAHTLVFNLYCLAKNPDAQQKLYEEILEVVPPEQPIDDRVLNKMHYLRAVVKETFR (2)
(2)
MYPTLLSTARTLTRDVVLSGYHVPAK (0)
(0)
TNVMLAQNVISTLPEYYPEPESYIPERWLRTESSNVQSFSLLPFGYGPRMCI (1)
(1)
GRRFAEQELYLGLVR (0)
(0)
IIQNFHVGWDGEDMKQVWRIFNAPDRDTFVFSERKS*
>fgenesh2_pg.scaffold_119000067|Brafl1
29% to CYP11A1
same as gene F above
MTHTGNADGSVHGIEILANGSLQDKYSLSQGDMDGPIVPVNETITADGVQRNVILVNDQFPGPTLEVMEGAQVVVTVVNE
LLREATSLHFHGMYMRGVPYMDGVPYVTQCPILPMHSFTYRFKAEPAGTHWYHSHLGSQKEDGLYGAFIVHKNSIPTTPS
LPMFLQDWWHDDFNTIDVDSAYMEHRGPGRFFGPWQERGFSFEGTELTALNFKSALINGRGRYNNNSAPLTRFEISSGET
LRFRLINAGAEYTFRVSIDAHSMTVVANDGHDVEPVHVQSILVFPGESYDFEVVGDPSNSGTYWIRAQTLWAGKGPDVEP
EDRLQEVRAILAYDNAPTDEDPNSAMQTCTENSPCRVLNCPFPAFPAGSNTECIYVSDLNSTEEYSMSDESETEEYFFNF
GYQIGSSVNGRKFDTPKKPLIFKAPYDITPCEATCETDGCKCTYMVEIPLGKTIRFVLMDLGVESEGHHLIHLHGYDFRV
LAMGFPVHNETTGRWISQNADIDCGNDNKCNMASWNVTRPNLNYNKPPIRDTVVIPARGYTVIEFRSNNPGFWYFHCHQT
THMNEGMSMIIAEALDKLPALPYGFPTCGDFTGTEKPPGRGRTVAAMEQSVTKVELDHTQLVIIIVISAAMSATIALAAV
GIYNARAKVNAFQRQVVKRSYVVCDQALGPQVLTTDKPLDTRHKPRGIMHLLNAFILPCLCVTMATTQRCTDDVCEFTLV
VRYARTMTHTERDGEVHGIEILTNGSLQDKYSLSQGDMDGPIVPVEETITADGVQRNVIVVNDQFPGPTLEVIEGAQVVV
TVVNNLLREATSLHFHGMYMRGVPYMDGVPYVTQCPILPMHSFTYRFMAEPAGTHWYHSHLGSQKEEGLYGAFIVHKNSM
PTTPSLPMFLQDWWHDDFNNIDVDSAFMEHRGPGRFFVPWQNRGFSFDGNKLSSVRFISALINGRGRYNNNSAPLTRFEI
SPGETLRFRLINAGAEYTFRVSIDAHSMTVVANDGHDVEPVQVQSILVFPGESYDFEVVGDPSNSGTYWIRAQTLWAGKG
PDVEPEDRLQEVRAILAYDNDPTDEDPNSDMQNCTENSPCRVLNCPFPAFPAGSNTECVYVSDLNSTEEYSMPDESETEE
YFFNFGYQIGSSVNGRKFATPKKPLIFKAPYDITPCEATCETDGCTCTYTTEIPLGKTIRFVLMSLGFGSGGHHVIHLHG
YDFRVLAMGFPEYNETTGRWITQNDDINCGDDNKCNMAAWNVARPNLNYNKPPTRDTVVIPARGYTVIEFRSNNPGFWLF
HCHQTTHMKEGMSMIIAEALDKLPALPYGFPTCGDFTGTEKPPGRGRTAAAMEQSVTLVELDNTQLVIIIVVSAAMSATI
ALAAVGIYNARVNKSKEKMIDTP
IVGRRAAFTQAGLQNVPVWRPLGGRNGRGAASSAAATEQTTVQDGAARPFDEIPGPR
GLPFIGTALDYSPFGRFPIHTKMANSTIERYQTYGKIYREKIGLRDMVFVCDPKDIETVFRSDGRLPERPIPESIATYRR
LKNKPLGVALLNGEEWFRLRRSVNKDMMRPKAVGAYATMQDEVSRELVGLIQGVVRKGKTAGQVPDFTKLLYKWGLESLS
LVVLGKRLGCLTLDQLPEDSDAQRMIGAVNDFFYSFAKLQMSFPLFRYIRTPGWTTFERAMDTVSSITEKMIGERLEKLR
QMEEPPDEADFLTSLLSREDMNLDEAIQMSVDLLQGAIDTTAHTLVFNLYCLAKNPDAQQKLYEEILEVVPPEQPIDDRV
LNKMHYLRAVVKETFR
CAINSIMARHRTLHHGHRRKLSFIIPVLLVYVLVSAFLDLTYSGYMAKHVSDGDSHQTITTTEG
TNMTKLLWEGLSRLEQMDQQRANLTEKLKNIAKMANVSEEAIGPWLSQLRPMTIVDAPAGNRTALLTCQDIAEIRISNPM
GKGVTKVVELGNYQGHGVAVKRVLPTVKDVRECKRTIERSGWNKCFVFPNYKLLKEILLLQQLKHPNIVQLLGYCVQNEE
TDENLAEHGVVSVTEMGTKFHVGRARKMDWKMRLKMAIDLASLLDYLEHSPMGSLLMADFKVEQFVWVGGKVKLTDLDDV
SNVERKCAVDSDCWVDKKDVGVPCTNGSCRGLNAKHNMNGAYKTILRHIMVHTGTEETALREDLRSVSISAASLHSRLLQ
LLDKELAIDSPTHR*
$$$$$$$
>Gene G 55% to amphi 11
MFLGLMRCQTPSQTYSTGPQAASHPQLDPP
AKPFSALPEPMKGLPGILKTLVVLCTGGMSRKAQLKSHVVIGQLFQMYGPILR
(2)
NRFGNFDMVNICDPDAAREVFKVEGKYPERLDIAPWRLHREDAGKELAVLLG
(2)
NDKKWHKNRTVVSRPMLRPQSVAAYVLKIDDVATDMLQHIRSVRAGPDGTEVLDLENELFKWALE(1)
SISAVLFNERMGLLQDNIPQDAQDFINGMHDAFDSLTRAMTPDARLHKLLNTKSWQKNKQAWDT
(0)
GEKVMDRQLQRAEERQARGEADDGQLDFLWFISSREKLTKEEIYANAIELMGAAIDT
(0)
TSTTLLWTLYQLCHRPDLQDKLYQEVTQVIGQDEVITYDHLKNLHLFKAVIKETLR
(2)
LHPVAFAITRVIQQDTVLMGYKIPAK
TVVMVSLYDMARDPRLYKNPEEYRPERWLRGAEDYVDTHPYAYLPFGFGTRSCI
(1)
GRRVAETELQVLLAK
(0)
ICQQFVLKQRNPRVIPAMTKGILMPAEKMDICFIERQ*
>e_gw.241.76.1|Brafl1
33% to CYP27C1, 99% to gene G above
FGNFDMVNICDPDAAREVFKVEGKYPERLDIAPWRLHREDAGKELAVLLGNDKKWHKNRTVVSRPMLRPQSVAAYVLKID
DVATDMLQHIRSVRAGPDGTEVLDLENELFKWALESISAVLFNERMGLLQDNIPQDAQDFINGMHDAFDSLTRAMTPDAR
LHKLLNTKSWQKNKQAWDTVFKIGEKVMDRQLQRAEERQARGEADDGQLDFLWFISSREKLTKEEIYANAIELMGAAIDT
TSTTLLWTLYQLCHRPDLQDKLYQEVTQVIGQDEVITYDHLKNLHLFKAVIKETLRLHPVAFAITRVIQQDTVLMGYKIP
AKTVVMVSLYDMARDPRLYKNPEEYRPERWLRGAEDYVDTHPYAYLPFGFGTRSCIGRRVAETELQVLLAKICQQFVLKQ
RNPRVIPAMTKGILMPAEKMDICFIERQ*
>fgenesh2_pg.scaffold_140000032|Brafl1
31% to CYP11A2
90% or more to gene G above
MIRLCALTQRRSAATIVGRWLDFHRGARAASQGLLRCQTPNQPYSSGPQAASHPQLDPPVKPFSALPEPMKGMPGILKFL
VVLCTGGMSRKAQLKSHMMIGQLFQMYGPILRNRFGNFDMVNTCDPDAAREVFKVEGKYPERLDIAPWRLHREDAGKELA
VLLGNDKKWHKNRTVVSRPMLRPQSVAAYVLKIDDVATDMLQHIRSVRAGPEGTEVLDLENELFKWALESISAVLFNERM
GLLQDNIPQDAQDFINGMHDAFDSLTRAMTPDARLHKLLNTKSWQKNKQAWDTVFKIGEKVMDRQLQRAEERQAR
(gap) 37% to CYP27A.c
GEADDGQLDFLSFISSREKLTKEEIYANAIELMGAAIDT
VNSTSMSITLSQLVTDTVHE
TSTTLLWTLYQLCHRPDLQDKLYQEV
TQVIGQDEVITFDHLKNLHLFKAVIKETLRLHPVAFAITRVIQQDTILMGYEIPAKTVVMVSLYDMARDPRLYKHPEEYR
PERWLRGAEDYVDTHPYAYLPFGFGTRSCIGRRVAETELQVLLAKICQQFVLKQRNPRVIPAMTKGILMPAEKMDICFIE
RQ*
$$$$$
>fgenesh2_pg.scaffold_283000056|Brafl1
29% to CYP24
MSHILKIAGRRTAVRHQLRLPGFWRFCGRQGVRGAATTATAAEQVAPEETVRPFQE
IPGPKGLPFIGTALDYSPFGRFPIHTQLGNSAIERY
KTHGKIYREKLGPGREMVFVCDPKDIGTVFRSDGRLPERPPVNSIATYRKMRKKPPGLGNLMGEDWHR
VRSSVNKEMMRPKSVGAYATMQDDVSREMAELIQTVVRKGDSGGQVDNFMNLMHKWGLESLSLVILGKRMGCLTLDQLAE
DSDAQRMISAVLEFFLYFGKLEMSLPFYRYFSTPAWKKFETAMDTMN
(Gap)
SLLSQKDMTLDEAVMMAIELLTGAFESTANTLA
FNLYCLAKNPAAQQKLYEEIMNVVPPGQPIDDRVLNKMSYLRAVFKETSRLYPTIFFNARTLTRDVVLSGYHVP
AKIIQKFHVGWDGEDMKQIYKIFNTPDRDTFIFRERE*
>e_gw.77.176.1|Brafl1
33% to CYP24
93% to fgenesh2_pg.scaffold_283000056|Brafl1 (allele)
KTYGKIYREKLGPGREMVFVCDPRDIGTVFRSDGRLPQRPPVNSLATYRKMRKKPLGLGNLMGEDWHRVRSSVNKEMMRP
KSVGAYATMQDDVSREMAEQIQTVVRKGDSGGQVDNFMNLMHKWGLESLSLVILGKRLDCLTLDQLAEDSDAQRMISAVL
EFFLYFGKLEMSLPLYKYFNTPAWKRFVRALDTMN
RYAICPIQERILTELSKLEEPPQETDFLSN LLSQKDMTLDEAVMM
AIELLTGAFESTANTLAFNLYCLAKNPAAQQKLYEEILEVVPPGQPIDDRVLNKMSYLRAVFKETSRLYPTIFFNARTLT
RDVVLSGYHVPAKTQIIMANNVISTLPEYYPDPEAYIPERWLRTESSAANVQAFALLPFGYGARMCVGRFLPVKNRSVS*
$$$$$$$$$
>fgenesh2_pg.scaffold_191000017|Brafl1
27% to CYP27C1
MGITGVLGRRCDAVMRSGRVFNGQWKCGRSSLRNVGLCILRKSSSTVTNVGMETCVDPTANKTDVAVRPFHEIPGPKGLP
IIGSLWEYTFLGKLDPRRFDEVLWNRYQEYGKIYKEDLGPRGTFVRIADPGDIETVYRNEGRYPHRPSFPLVRESMEAAG
QELLKHRARSESSFNGQGLEWYRTRSAVNRTLLRRSGVALFHPTLNEISDDFLTLLKRSLDENNTVPDITWQIRRHNTEV
AGTTIFGRRPGCLEPDFSGSCQTSEMIKSIDDFFASWLKLEIGFPLTKYLLKDTWNGYMNAHRNILRIVKYHMDLDVEYE
DSRPSVLGYLLSESSLSDTDAAMSAVELFVGGMQSSSHADMFQLYELARHPHVQETIRREVTEALPKGEAVTSAHLHKLP
YLKAFVKETFRFHPVGLLHMRILDRDVVLSGYRVPAHTTIEIPMSVLGRLEELYPQADRFLPERWLRRGPNGFRSRMFSH
VTPFGHGPRACIGRRLAEDKFYIQIAKLVQNFDLHCDEEVGTVTGCFQELSPTPNIRFTPR*
$$$$$$$$
>estExt_GenewiseH_1.C_30140|Brafl1
33% to 11A2, 33% to CYP27A1
MGGWMDKFHLHMQNRWRQYGSIYKENIGPQEIVCMFDPEDVAPVLRAEGRYPRRYAFDSFYLAREIMGHKLGVFLENDEK
WQQYRTVMNKKLLRPQQAAAFTPLMDEAASNFMSYLRRKRDQGGMVTDLQAHLFRWAMESGCTAMFNQHLGLLSEDPPQL
AKDFISSTMAVLDTTNTMMTIPPKVHKALNTKAWKEHLEGWQTSFRVTKQLIEEIMERGLEKESEEDEEIPDLVSYLLSV
KLRPEEVLANIVDVLGGAVDTTSNTMAFTMHTLARHPDIQEKLHDEVMRVAPDHQAPVTQEQVHKMPYLRGVIKEVLRLY
PVAYVFSRVLNHDAVVHGYKIPAGTNLVVCPYVMGRDPNSYDDPEEFRPERWYRENSKSVKAFSWLPFGFGARGCVGRRI
AETEMHLVLIRICQNFLLEQEKDEELVGRIRLVLIPDKSVDLKLIDRN*
>e_gw.29.150.1|Brafl1
32% to CYP27C1
92% to estExt_GenewiseH_1.C_30140|Brafl1
IAQNRWQQYGSIYKENIGPQEIVCMFDPEDVAPVLRAEGRYPRRYAFDSFYLAREIMGHKLGVFLENDEKWQQYRTVMNK
KLLRPQQAAAFTPLMDEAASNFMSYLRRKRDQGGMVTDLQAHLFRWAMESGCTAMFNQHLGLLSEDPPQLAKDFISSTMA
VLDTTNTMMTIPPKPGVKTYCTNVAPGSFLSSLELVFIMERGLKKESEEDEEIPDLVSYLLSVKLRPEEVLANIVDVLGG
AVDTTSNTMAFTMHTLARHPNIQEKLHDEVMRVAPDRQAPVTQEQVHKMPYLRGVIKEALRLYPVAYVFSRVLNHDAVVH
GYKIPAGTNLVVCPYVMGRDPNSYDDPEEFRPERWYRENSKSVKAFSWLPFGFGARGCVGRRIAETEMHLVLIRICQNFL
LEQEKDEELVGRIRLVLIPDKSVDLKLIDRN*
>e_gw.3.68.1|Brafl1
33% to CYP11A2
89% to estExt_GenewiseH_1.C_30140|Brafl1
GQEGATAKPFEAIPGPKGLPLVGTALHAAMGGWMDKFHLHMQNRWRQYGSIYKEIIGPQEIVCMFDPEDVAAVLRAEGRY
PRRHSVDSFYLAREIMGHKLGVLLENDEKWQQYRTVMNKKLLRPQQAAAFTPMMDEAASNFMSYLRRKRDQGGMVTDLQA
HLFRWAMESGCTAMFNQHLGLLSEDPPQLAKDFISCSMAILDTTNTMMTIPPKVHKALNTNAWKEHLEGWQTSFRVTKQL
IEEIMERELKKENEEDEEISDLVSYLLSVKLRPEEVLANIVDVLGGAVDTTSNTMAFTMHTLARHPDIQEKLHDEVMRVA
PDRQAPVTQEQVQKMPYLRGVIKEILRLYPVAYIFSRVLNHDAVVHGYKIPAGTNLVVCPYVMGRDPKSYDNPEEFRPER
WYRENRESVKAFSWLPFGFGARGCVGRRIAETEMHLVLIRICQNFVLEQKKDEELVGRIRLVLIPDKSVDLKLTDRN*
$$$$$$$$$
>fgenesh2_pg.scaffold_410000012|Brafl1
27% to CYP24
MQTRVKATVPTLRETGRYGVGKLHERHLDLHRQYGDICREKLLGREIVHVFSREIAQEVFMQEGRYPGRTVIEPDALYRT
TRGIPLGLLSLQDAEWHRLRRLAQDRILRPAVQSAVLPNMDRIAQEFVMRTDMLRSPGSDVMERNYKDELHLWSLEW
(gap)
KLIFSLPLYKVVPTPTWRKLAAAQDTFFRLSENYIKQVLTDSGDGDPETQDSLLLHLLRKSELSKEEVSATMTDLFQGGIDTT
TNGMMYSLFALAKNPEVQELVCQEIRTHLPEGARVTPEVLGKMKYLKAVIKETFRVCLPGCCRLWPVIFGTARQYDYDVV
LGGYDVPAKTEILVHHRVMCRQDKYFRDPLTFDPTRWLRDEKTPRVPTYLFMPFGHGVRMCIGMLNIILTIRRRFAEQQL
QLLVIRMLQRFHVECEEAELRQVFSLVLLPDRNPRFIFRRRQGETA*
>gw.501.20.1|Brafl1
30% to CYP24
LKKLHESFFERYRQFGKISKETIGNKCFVSVYDPRDIETLFRTEGPNPSWMQLMALGEVRKRLGKPLGMINETGQKWRQL
RYAAQSKLLNPKSVSSFVPVLDEISRDFVEKLRTGRSAATLEPTIDLDAELRKWSLESVVSATLGIRLGCLQKHRQIPDK
DTEDLLQSSDAFLDTWSKLELGPPLYMLYPTKTWRKFLRANELWLSAAGRMIDRSLDRSESERDPLQPEVTLLEHIVTRK
ELTPDDVVMIITELIFAGIESTAVAMTYNLYTMAKNQHVQEKVRREVNAVVGKSGKVTQDALKSLKYVKACIKETSRVLP
AFSMRNRILDKEIVLAGYRVPPNVIIRVLTHVTGQLPEYVVEPDRFAPERWLRDDTTIPKPHPFAVRPFGVGTRSCIGQR
LAEQELGILLAKV
>fgenesh2_pg.scaffold_44000117|Brafl1
87% to gw.501.20.1|Brafl1
MAFAVLMMMAAAVLPNFARSAITLIPMGSTYLPYGFDPAGAPLYGMGDRGAVEQLTYDADNYRIYTVGEARILNVIDISD
PKNAALVYQLQLPGGATDVDSCGRFVAVSIHDDFKVLPGTVLIYSMYDTTRKNMTLLHQIQVGALPDMVKFTKDCMTLVT
CNEGEPGLDESGNFVDPEGSASVIAFQSTNLGQESAPTVRTATFRKFDSLAEEYNSRGVRWTLPMIQVGSEVMEFNLSQT
LEPEYVAYNSDGSKAYIALQENNAIAVLDMATATFDDIYPLGSKYWGTASIDTSNEDGGSLVSRNLKSQRVQKAMNLTSQ
LGCAVFSSIDGLDPENPDKYSSLHLFGGRGFSVWDADDLSLVWDSGDDVERMVAKYYPTIFNSDYDEEFFNSTPAARFDH
RSCKKGPETESLAIGEVDGKTAFFVGNERSSTILVYSLADEDIITPVFQSIHFSGRTDLTWRQAYQDRVVGDIDPEDMRF
VSTRDSPTNSPLLLVAGTVSGTVSVYEVAESDDDGVSTAGKMKRAWLQHLVAKKLGADAVSIGRSGGETSTFSPPVTRY
25% to CYP27B1
RQFGKISKETIGNKTFVSVYDPRDIETLFRTEGPNPSWMQLMALGEVRKRLGKPLGMINETGQKWRQLRYAAQSKLLNPKS
VSSFVPVLDEISRDFVEKLRTGRSAATLEPTIDLDAELRKWSLESVVSATLGIRLGCLQKHRQIPDKDTEDLLQSSDAFL
DTWSKLELGPPLYMLYPTKTWRKFLR
ANELW 38% to CYP27B1
LRVLPAFSMRNRILDKEIVLSGYRVPPNVIIRVLTHVTGQLPEYVVEPD
RFAPERWLRDDTTIPKPHPFAVRPFGVGTRSCIGQRLAEQELGILLAKMIQQFHIE
CDGEMEQIFNIANKPDLSGTFKFTEL*
>Gene C 38% to CYP11 amphi, 34% to Gene E, 34% to Gene B
42% to 27B1 Fugu, 38% to 27C1 fugu, 42% to 27A1 fugu (but not first exon)
37% to 11A1 fugu, 36% to CYP24 fugu (Best match to CYP27B)
42% to Xenopus trop. 27B1, 41% to Xenopus laevis 27A1
MAQQILRNSSVCSLVRPNSRALVSVAPAATVQQNRPLKEMPGPTNKLGQLWWGFKNRSRMHEAQ (0)
(0)
LEQERKYGRMWQSSFGFNPNVNVAHVALAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNQ (2)
(2)
NGPEWRHLRTAVSKRIMRPKEVPR (2)
(2)
YGDSMNEVVTDMIDRFKDLRDTTGGGKTVPDLTNELYKWAME (1)
(1)
SIATVLFDTRLGCLEREMPEKTQQFIDSIATMFRTAFLVSALKPWMLTYLGLGVWKRHVEAWDVIFSV(1)
(1)
AHENIDRKVLDIDARLSRGEDLVGSFLTYMLTGTDVTKKDLYATVTELLLAGVDT (0)
(0)
TSNTMVWTLYELARHPELQERLHQEVTSVVSPGQIPTVDDVKNMALLKNVIKEILR (2)
(2)
VYPVLPANGRVLDKDIVLDGYNIPKG (0)
(0)
TQFAILHYNMTRDPEVFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCA (1)
(1)
GRRLAEMEMYLVLAR (0)
(0)
LVQTFEVRQLTPGEVVRPVTRALLVPGDPVHLEFIDRP*
>CYP27 40% to 27B1 Fugu, 37% to 27C1 fugu, 40% to 27A1 fugu (but not first exon)
35% to 11A1 fugu, 34% to CYP24 fugu (Best match to CYP27B)
MAQQILRNSSVCSLVRPNSRALVSVAPAATVQQNRPLKEMPGPTNKLGQLWWGFKNRSRMHEAQ (0)
(0)
LEQERKYGRMWQSSFGFNPNVNVAHVALAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNQ (2)
(2)
NGPEWRHLRTAVSKRIMRPKEVPR (2)
(2)
YGDSMNEVVTDMIDRFKDLRDTTGGGKTVPDLTNELYKWAME (1)
(1)
SIATVLFDTRLGCLEREMPEKTQQFIDSIATMFRTAFLVSALKPWMLTYLGLGVWKRHVEAWDVIFSV(1)
(1)
AHENIDRKVLDIDARLSRGEDLVGSFLTYMLTGTDVTKKDLYATVTELLLAGVDT (0)
(0)
TSNTMVWTLYELARHPELQERLHQEVTSVVSPGQIPTVDDVKNMALLKNVIKEILR (2)
(2)
VYPVLPANGRVLDKDIVLDGYNIPKG (0)
(0)
TQFAILHYNMTRDPEVFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCA (1)
(1)
GRRLAEMEMYLVLAR (0)
(0)
LVQTFEVRQLTPGEVVRPVTRALLVPGDPVHLEFIDRP*
>CYP27 fgenesh2_pg.scaffold_25000096|Brafl1
MAQQILRNSSVCSLVRPNSRALVSVAPAATVQQNRPLKEMPGPTNKLGQLWWGFKNRSRMHEAQLEQERKYGRMWQSSFG
FNPNVNVAHVALAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNHNGPEWRHLRTAVSKRIMRPKEVPRYGDSMNEV
VTDMIDRFKDLRDTTGGGKTVPDLTNELYKWAMESIATVLFDTRLGCLEREMPEKTQQFIDSIATMFKTAFLVSALKPWM
LTYLGLGVWKRHVEAWDVIFSVAHENIDRKVLDIDARLSRGEDLDGSFLTYMLTGTDVTKKDLYATVTELLLAGVDTTSN
TMVWTLYELARHPELQERLHQEVTSVVSPGQIPTVDDVKNMALLKNVIKEILRVYPVLPANGRVLDKDIVLDGYNIPKGT
QFAILHYNMTRDPEVFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCAGRRLAEMEMYLVLARLVQTFEVRQLTPGE
VVRPVTRALLVPGDPVHLEFIDRP*
>CYP27 e_gw.25.105.1|Brafl1
QLEQERKYGRMWQSSFGFNPNVNVAHVSLAEQLMRQEGKYPKRIEVNFMQQYRDLRGYSYGLLNHNGPEWRHLRTAVSKR
IMRPKEVPRYGDSMNEVVTDMITRFKDLRDTTGGGKTVPDLTNELYKWAMESIATVLFDTRLGCLEREMPEKTQQFIDSI
ATMFRTAFLVSALKPWMLTYLGLGVWKRHVEAWDVIFSVGESHENIDRKVLDIDARLSRGEDLDGSFLTYMLTGTDVTKK
DLYATVTELLLAGVDTTSNTMVWTLYELARHPELQDRLHREVTSVVSPGQIPTVDDVKNMALLKNVIKEILRVYPVLPAN
GRVLDKDIVLDGYSIPKGTQFAILHYNMTRDPEAFEEPDRFNPDRWTRMGTEKVNTFSSVPFGFGPRQCAGRRLAEMEMY
LVLARLVQTFEVRQLTPGEVVRPVTRALLVPGDPVHLEFIDRP*
CYP19 clan (2 subfamilies)
>CYP19 amphioxus 37% to CYP19 zebrafish ovarian, 38% to brain form
41% to e_gw.484.33.1 so there are two CYP19 subfamilies in Amphioxus
two possible start METs
MLQFLVIESRGSFPLNRSRTRHGITSQIEADGCS
MDTGEGWDVLLVVLLVVLVWYYIRETWTSGIDGIFPP (1)
(1)
GPPYIPLLTPLWTLWVFLHDGIWAATAGYAAKYGDFVRVWLGTEQTFIISR (2)
(2)
ASAAAHVLKSSKYRARFGDPSGLAQIGMNGSGVIFNNDVQSWKFLRFFFVK (1)
(1)
VLDRAAGVSAIATRRQLANIRDIASSNPDGAVDVVTLMRRITLEIGNRLFLGVNIEN (1)
(1)
DLEVVNTINGYFAAWEFFMIRPKVLQLIYPTLYRKHQTAV (2)
(2)
RALQDVVGKLVDKKRAVMNGDEAEEEFSIPKGEHDFAAALIQAQ (0)
(0)
EFGQVSASCVRQCVTEMLLAGPDTMSVHIYFILLHIAEHGLENGILREIREVL (1)
(1)
GDRDPTRDDLSKMVFLDHVIN ESMRARPVVTFVMRHAEEEDHVDGYVIPKG
TNVIINLVAVHQDPRHFP
EPETFDPDHFKEK (0)
(0)
VPSTQFMPFGLGVRSCVGRTIAPLQMKAVLITLLRMYQLSPSRDHQSLEVSRNLSEHPTEPGSMFLYPRLETI*
>estExt_gwp.C_90165|Brafl1
45% to CYP19
96% to assembled seq above
MFSLQECGQVSASCVRQCVTE
MLVAGPDTMSVNIYFILLHIAEHGLENGILREIREVLGDRDPTRDDLSKMVFLDHVINE
SMRARPVVTFVMRHAEEEDHVDGYVIPKGTNVIINLVAVHQDPRHFPEPETFDPDHFKEKVPSTQFMPFGLGVRSCVGRT
IAPLQMKAVLITLLRMYQLSP
>estExt_fgenesh2_pg.C_90115|Brafl1
39% to CYP19a C-term
98% to estExt_gwp.C_90165|Brafl1
MLVAGPDTMSVNIYFILLHIAEHGLESGILREIREVLDRDPTRDDLSKMVFLDHVINESMRTRPVVTFVMRHAEEEDHVD
GYVIPKGTNVIINLVAVHQDPRHFPEPETFDPDHFKEKVPSTQFMPFGLGVRSCVGRTIAPLQMKAVLITLLRMYQLSP
SRDHQSLEVSRNLSEHPTEPGSMFLYPRLETI*
>CYP19 scaffold_484 96% to first 3 exons below on e_gw.484.33.1|Brafl1
290078
MSGVMSVLTEQLQTWSAGLTCVTAVIVTGAALVLTWGGWASGRSVDVP (1) 289935
288972
GPPWLLGFGPLMSFARFIWMGVPVAAAHYGARYGDFVRVWIAGERTYVITR (2) 288820
288346
PSAPWPVLKSTNSCRRFGSRTGLRPIGMYQNGIIWNGDDGWRVLRGFFQK (1) 288197
>CYP19 e_gw.484.33.1|Brafl1 38% to CYP19 human and danio (-) strand
first exon is a guess, no frameshifts exist in e_gw.1098.5.1 so it may be correct
294278
MSGVMSVLTEQLQTWSAGLTCVTAVIVTGAALVLTWGGWASGRSVDVP 294135
293172
GPPWLLGFGPLMSFARFIWMGVPVAAAHYGARYGDFVRVWIAGERTYVITR
(2) 293020
292546
PSAAWHVLKSNNYCRRFGSRTGLSTIGMYQNGIIWNGDDGWRVLRGFFQK
(1) 292397
287888
ALNADTLNRATSAAVDATYRQMGNIAALQQKAADGKIEALDFLRRITLEVTNNLTLGVHIAD
(1) 287703
287339
PDDLVERIVRYFKAWEFFLLRPPIMYLMTPKLYWKHCQAV
(2) 287220
286970
NDLNDAIAELLTNKRQELKTAPPSDKPDFATCLLQAE
(0) 286860
286169
ERGEVSPAHVQQCVLEMLLAGTDTSSVSMYYLLVSVAENPQVELKVLEEMRDIL
(1) 286008
286565 ERGEVSPAHVQQCVLEMVL 286509 (duplicate exon 7 seq)
285823
GERDPTKADLPQLVYLEQVIKEAMRIKPVGPVIMRQAKEDDR
(2) 285695
285428
IDGIETPAGTNIILNLADMHRRQDNFPAPDDFNPQHFDNK
(0) 285309
284605
DFKGEYVPFGTGPKGCIGQFLAMIEMKAIMCTLLRKHHLRAIPGESLEGIETHWDIAQQPVNASYMYFEERN*
284387
>CYP19 e_gw.1098.5.1|Brafl1
95% to e_gw.484.33.1|Brafl1
yellow exon 9 is wrong, exon 9 is in a seq gap
49213
MSGVMYVLTEQLQAWSAGLTCVTAVIVTGAALVLTWGGWASGRSVDVP (1) 49070
48196
GPPWLLGFGPLMSFARFIWMGVPVAAAHYGARYGDFVRVWIAGERTYVITR 48044
47584
PSAAWHVLKSNNYCRRFGSRTGLSTIGMYQNGIIWNGDDGWRVLRGFFQK
47435
47273
ALNADTLNRATSAAVDATYRQMGNIAVLQQKTADGKIEALDFLRRITLEVTNNLTLGVHIAD
(1) 47088
46510
PDDLVERIVRYFKAWEFFLLRPPIMYLMTPKLYWKHCQAV
46397
46144
NDLNDAIAELLTNKRQELKTVPPSDKPDFATCLLQAE
46034
45737 ERGEVSPAHVQQCVLEM 45687 (duplicate exon 7 seq)
45639
ERGEVSPAHVQQCVLEMLLAGTDTSSVSMYYLLVSVAENPQVELKVLEEMRDIL
45478
45352
GERDPTKADLPQLVYLEQVIKEAMRIKPVGPVIMRQAKEDDR
(2) 45227
SVFITIPLLYGNVNISITLYYALTKLLTHPPLQ
44076
DFKGEYVPFGTGPKGCIGQFLAMIEMKAIMCTLLRKYHLRAIPGESLEGIETHWDIAQQPVNASYMYFEERN*
43858
$$$$$$$
CYP20 clan
>CYP20 e_gw.479.56.1|Brafl1 39% to CYP20
MLDYAIFAITFVVFLIAAVLYLYPGSNKITTIPGLEPSDPKDGNMDDVGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVS
LAAPELWKQHERAFDRPPLLFKGFEPLWGTMSITYANGVDGRTRRKLYDPSFGHEAMKHYFSIFQELGQEMAKNWASMEG
DQHIPLQAHMLALTTKATTRCSFGDAFKDEKECVQFSRNFNICWCDVEERVNGSHPTEGSPREKKFQEARGKLQATIGRV
VKYRRENPPPPQEQLFIDVLIEGDLPEEQVFGDAITYMVGGFHTTANLLTWALYFIATHEEVEEKLYQELSDVLGKKGEV
TPDNIPQLVYLRQVLDETLRCAVVTPWGARYMDLDAEIGGHIVPAKTPVIHAFGVVLQDERFWPEPNKFDPERFDAENSK
GRHKLAFQPFGSAGGRKCPGYRFTYVETTVFLSILCRQFKLHLVDGQVVKPRHGLVTRPVDEIWITVTKRD*
>CYP20 estExt_GenewiseH_1.C_860218|Brafl1
88% to e_gw.479.56.1|Brafl1
MLDYAIFAITFVVFLIAAVLYLYPGSNKITTIPGLEPSDPKDGNMDDVGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVS
LAAPELWKQHERAFDRPPLLFKGFEPMFGAMSITYANSVDGRTRRKLYDPSFGHEALKHYFSIFQELGQEMASKWESTKG
DQHIPLHAHMMALALKTFTRSSFGDSFKDEKECVQFGRNYGICWNDMEERIKGSHPTEGSPREKKFKEALGKLHATIARV
AKYRRENPPPPQEQLFIDVLIEGNLPEEQVLCDAMTFTVGGFHTSGNLLTWALYYIATHEEVEEKLHQELSDVLGKKGEV
TPDNISQLVYLRQVLDESLRCAVIAPWGARYMDLDAEVGGHIVPAKTPVIHAFGVVLQDERIWPEPNKFDPDRFDAENSK
GRHKLAFQPFGFAGGRKCPGYRFAYTWTSVFLSILCRQFKLHLVDGQVVKPCHGFVTRPVDEIWITVTKRD*
>CYP20 fgenesh2_pg.scaffold_86000110|Brafl1
87% to e_gw.479.56.1|Brafl1
MLDYAIFAITFVVFLIATVLYLYPGANKITTIPGLEPSDPKDGNLGDLGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVS
LGAPELWKQHERIFDRPRFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQELGQEMAKKWESMKGDQHIP
LHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDICWNDMEERIKGSYPTEGSPREKKFEEAKGKLHATIARVAKYRR
ENPPPPQEQLFIDVLIEGDLPEEQVLCDAMTYMVGGFHTSGNLLTWALYFIATHEEVEEKLYQELSDVLGKKGEVTPDNI
SQLVYLRQVLDESLRCAVVAPWGARYMDLDAEVGGHIVPAKTPVIHAFGVVLQDERIWPEPNKFDPERFDAENIKGRHKL
AFQPFGFAGGRKCPGYRFTYVETTVFLSILCRQFKFHLVDGQVVTPWHGLVTRPLDEIWITVTKRD*
>CYP20 e_gw.89.28.1|Brafl1
83% to e_gw.479.56.1|Brafl1
MLDYAIFAITFVVFLIATGLYLYPGPNKITTIPGLEPSDPKDGNLGDIGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVS
LGAPELWKQHERIFDRPPLLFKGFEPLIGAKSIQYANGLDGRTRRKLYDPSFGHNAMKYYYSIFQELGQEMAQKWESMEG
DQHIPLRAHTIDLTMKAITRCSFGDTFKDEECLQFSRNYDICWDDINERTKGNYPVEGSPREKKFQEALGRLHTTIGRVA
KYRRENPPPPQEQLFIDLLIEGDLPEEQVRAKSHTYWTISSVMTLYHCLLLTWALYFIATHKEVEEKLYQELIDVLGKKE
DVTPDNISQLVYLRQVLDETLRCAVVGPWGARYMDLDIEIGGHIVPAKTPVIHAFGVVLQDERIWPEPNKFDPERFDAES
SKGRHKLAFQPFGFAGGRKCPGYKFSYAETSVFLSILCRQFKLHLVDGQVVTWHGIIMITRPVDEIWITVTKRD*
>CYP20 e_gw.86.147.1|Brafl1
83% to e_gw.479.56.1|Brafl1
MLDYAIFAITFVVFLIAAVLYLYPKSNKITTIPGLEPSDPKDGNLGDVGRAGALHEFLLKLHAEYGDIASFWWGQQLVVS
LGAPELWKQHERIFDRPPLLFKGFEPLIGAMSIQYANHVDGMTRRKLYDPSFGHEAMKHYYSIFQELGQEMAKKWETMEG
DQHIPLHAHMIALAMKAITRSSFGDSFKDEKECVQFGRNDDICWNDMEERVKGSYPTEGSPREKKFQEALGKLHTTIRRV
VKYRRENPPPPQEQLFIDVLIEGDLPEEQVLCDAMTFMVGGFHTSGNLLTWALYFIATHEEVEEKLYQELSDVLGKKGEV
TPDNISQLVYLRQVLDESLRCAVITPWGARYMDLDAEIGGHIVPAKTPVIHAFGVVLQDERIWPEPNNLEFESATGFYSL
INLAHSSPIFPPPPGYRFSYIETSVFLSILCRQFKLHLVDGQVVTPWHGCVTRPLEEIWITVTKRD*
>CYP20 estExt_GenewiseH_1.C_4790081|Brafl1
86% to e_gw.479.56.1|Brafl1
MLDYAIFAITFVVFLIATVLYLYPGANKITTIPGLEPSDPKDGNLGDVGRAGSLHEFLLKLHTEYGDIASFWWGQQLVVS
LGAPELWKQHERIFDRPPLLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQELGQEMAKKWESMKG
DQHIPLHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDICWNDMEERIKGSYPTEGSPREKKFEEGLTFQQLHATIA
RVAKYRRENPPPPQEQLFIDVLIEGDLPEEQVLCDAMTYMVGGFHTSGNLLTWALYFIATHEEVEEKLYQELSDVLGKKG
EVTPDNISQLVYLRQVLDESLRCAVVAPWGARYMDLDAEVGGHIVPAKTPVIHAFGVVLQDERIWP
Amphioxus has 8 CYP20 sequences. This complicates the CYP20 story, since most other
species have one. Two of the contigs may represent a recent duplication with
differential gene loss. 479.a and 479.b/c are nearly identical to 86.a and 86.b.
The 479.d and 86.c sequences do not match, which suggests a larger, possibly
four-gene cluster, with loss of the 479.d seq in 86 and loss of the 86.c seq in 479.
The scaffold 89 seq is unique. This leaves five distinct CYP20s in Branchiostoma.
I am still looking for a gliomedin-like neighbor.
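The allele-versus-duplicate calls above rest on counting amino acid differences between matching exons. A minimal sketch of that count in Python follows. It assumes the two peptides are already aligned and equal in length (no gaps). The function names `aa_diffs` and `percent_identity` are hypothetical, and the example input is exon 1 of the scaffold 86.a and 86.b sequences listed in this file.

```python
def aa_diffs(seq_a, seq_b):
    """Return (position, residue_a, residue_b) for each mismatch.

    Assumption: the sequences are pre-aligned and equal in length.
    Positions are 1-based.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


def percent_identity(seq_a, seq_b):
    """Percent of aligned positions that carry the same residue."""
    mismatches = len(aa_diffs(seq_a, seq_b))
    return 100.0 * (len(seq_a) - mismatches) / len(seq_a)


# Exon 1 of scaffold 86.a and 86.b, copied from the records in this file.
exon1_86a = "MLDYAIFAITFVVFLIATVLYLYP"
exon1_86b = "MLDYAIFAITFVVFLIAAVLYLYP"

print(aa_diffs(exon1_86a, exon1_86b))
print(round(percent_identity(exon1_86a, exon1_86b), 1))
```

Sequences with seq gaps or missing exons would need a real alignment step first; this sketch only covers the gap-free case.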
>CYP20 amphioxus 39% to CYP20 Danio from trace archive (hybrid seq)
MLDYAIFAITFVVFLIATVLYLYP
(0)
(0)
GANKITTIPGLEPSDPK (2)
(2)
DGNLGDVGRAGSLHEFLLKLHTEYGDIASFWWGQQLVVSLGAPELWKQHERIFDRP (1)
(1)
ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE (0)
(0)
LGQEMAKKWESMKGDQHIPLHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDI (0)
(0)
CWNDMEERIKGSHPTEGSPREKKFKE (1)
(1)
ALGKLHATIARVAKYRRENPSPPQEQLFIDVLIEGNLPEEQ (0)
(0)
VLCDAMTFTVGGIHTSGN (1)
(1)
LLTWALYYIATHEEVEEKLHQELSDVLGKKGEVTPDNISQLV (2)
(2)
YLRQVLDESLRCAVIAPWGARYMDLDAEVGGHIVPAK (0)
(0)
TPVIHAFGVVLQDERIWPEPNK (2)
FDPDRFDAENSKGRHKLAFQPFGFAGGRKCP (1)
(1)
GYRFTYTWTSVFLSILCRQFKLHLVDGQVVKPCHGLVTRPVDEIWITVTKRD*
>amphioxus CYP20 like scaffold 479.a seq gap in exon 12/13
99% to 86.a, 37% to CYP20 danio, 38% to CYP20 fugu, 37% to CYP20 human
possible allele to 86.a (only 6 aa diffs)
432510
MLDYAIFAITFVVFLIATVLYLYP (0) 432581
432736
GANKITTIPGLEPSDPK (2) 432786
433046
AGNLGDVGRAGSLHEFLLKLHAEYG 433120 (duplicate seq)
433166
DGNLGDVGRAGSLHEFLLKLHAEYG 433240 (duplicate seq)
433286
DGNLGDVGRAGSLHEFLLKLHTEYGDIASFWWGQQLVVSLGAPELWKQHERIFDRP (1) 433453
434119
ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE (0) 434265
434881
LGQEMAKKWESMKGDQHIPLHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDI (0) 435048
435360
CWNDMEERIKGSYPTEGSPREKKFEE (1) 435437
435701
AKGKLHATIARVAKYRRENPPPPQEQLFIDVLIEGDLPEEQ (0) 435823
436815
VLCDAMTYMVGGFHTSGN (1) 436868
437343
VLTWALYFIATHEEVEEKLYQELSDVLGKKGEVTPDNISQLV (2) 437468
437729
YLRQVLDESLRCAVVAPWGARYMDLDAEVGGHIVPAK (0) 437839
438253
TPVIHAFGVVLQDERIWPEPNK (2) 438318
VDPERFDAENIXXXXXXXXXXXXXXXXXXXX
439186
XXXXXXXXXLVFLFILCRQFKFHLVDGQVVTPWHGLVTRPLDEIWITVSKRD* 439317
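The numbers flanking each exon in records like the one above are genomic coordinates, so they can be checked against the peptide length. Below is a minimal Python sketch of that check. It assumes the coordinates are inclusive and that each listed residue (including a terminal `*`) accounts for three nucleotides. That assumption holds for the exons used here, but exons with split codons at phase 1 or 2 boundaries may be reported differently elsewhere in the file. The function names are hypothetical.

```python
def exon_span_nt(start, end):
    """Nucleotide length of an exon from inclusive coordinates.

    abs() lets the same call handle minus-strand records, where the
    coordinates are listed in descending order.
    """
    return abs(end - start) + 1


def span_matches_peptide(peptide, start, end):
    """True if the span equals three nucleotides per listed residue."""
    return exon_span_nt(start, end) == 3 * len(peptide)


# Exon 1 of scaffold 479.a (ascending coordinates).
exon1_479a = "MLDYAIFAITFVVFLIATVLYLYP"
print(exon_span_nt(432510, 432581),
      span_matches_peptide(exon1_479a, 432510, 432581))

# Exon 1 of CYP19 e_gw.1098.5.1 (descending coordinates).
exon1_cyp19 = "MSGVMYVLTEQLQAWSAGLTCVTAVIVTGAALVLTWGGWASGRSVDVP"
print(exon_span_nt(49213, 49070),
      span_matches_peptide(exon1_cyp19, 49213, 49070))
```

A mismatch would flag either a transcription slip in the coordinates or an exon whose boundary codon is split across the intron.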
>amphioxus CYP20 like scaffold 479.b
2 aa diffs to 86.b
441870
MLDYAIFAITFVVFLIAAVLYLYP 441941
442088
GSNKITTIPGLEPSDPK 442138
444041
DGNMDDVGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVSLAAPELWKQHERAFDRP 444208
446694
ALLFKGFEPMFGAMSITYANSVDGRTRRKLYDPSFGHEALKHYFSIFQE (0) 446840
447423
LGQEMASKWESTKGDQHIPLHAHMMALALKTFTRSSFGDSFKDEKECVQFGRNYGI (0) 447590
exons 6,7,8,9,10 in a seq gap
456021
TPVIHAFGVVLQDERIWPEPNK 456086
456446
FDPDRFDAENSKGRHKLAFQPFGFAGGRKCP (1) 456538
457265
GYRFTYTWTSVFLSILCRQFKLHLVDGQVVKPCHGFVTRPVDEIWITVTKRD* 457423
EST matches trace file
>BW799748 Amphioxus Branchiostoma floridae 1 aa diff to 479.b
NSKGRHKLAFQPFGFAGGRKCP
GYRFTYTWTSVFLSILCRQFKLHLVDGQVVKPCHGLVTRPVDEIWITVTKRD*
>amphioxus CYP20 like scaffold 479.c
1 aa diff to 86.b
note: the top three exons are identical to 479.b
the missing exons 11,12,13 are in 479.b
I suspect this is assembled incorrectly and 479.b and 479.c are one gene
The combined fragments would have only 3 aa diffs to 86.b (allele)
460211
MLDYAIFAITFVVFLIAAVLYLYP 460282
460433
GSNKITTIPGLEPSDPK 460483
460715
DGNMDDVGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVSLAAPELWKQHERAFDRP 460882
exons 4,5,6,7 in a seq gap
461613
VLCDAMTFTVGGFHTSGN 461666
462121
VLTWALYYIATHEEVEEKLHQELSDVLGKKGEVTPDNISQLV 462246
462794
YLRQVLDESLRCAVIAPWGARYMDLDAEVGGHIVPAK (0) 462904
no exons 11,12,13
>CYP20 amphioxus scaffold 479.d
87% to 86.a, 86% to 86.b, 85% to scaf89, 88% to 86.c, 86% to 479.a
37% to CYP20 human, 38% to CYP20 danio
464840
MLDYAIFVITFVVFLIATVLYLYP (0) 464911
465054
GLNKITTIPGLEPSDPK (2) 465104
465455
DGNLGDVGRAGSLHEFLLKLHAEYG 465529 (duplicate seq)
465568
GGNLGDVGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVSLGAPELWTQHKRIFDRP (1) 465735
467328
ALLFKGFEPLWGTMSITYANGVDGRTRRKLYDPSFGHEAMKHYFSIFQE (0) 467474
467704
LGQEMAKNWASMEGDQHIPLQAHMLALTTKATTRCSFGDAFKDEKECVQFSRNFNI (0) 467871
468141
CWCDVEERVNGSHPTEGSPREKKFQE (1) 468218
468460
ARGKLQATIGRVVKYRRENPPPPQEQLFIDVLIEGDLPEEQ (0) 468582
469033
VFGDAITYMVGGFHTTAN (1) 469086
469528
LLTWALYFIATHEEVEEKLYQELSDVLGKKGEVTPDNIPQLV (2) 469653
469951
YLRQVLDETLRCAVVTPWGARYMDLDAEIGGHIVPAK (0) 470061
470256
TPVIHAFGVVLQDERFWPEPNK (2) 470321
470700
FDPERFDAENSKGRHKLAFQPFGSAGGRKCP (1) 470792
471108
GYRFTYVETTVFLSILCRQFKLHLVDGQVVKPRHGLVTRPVDEIWITVTKRD* 471266
CYP20 sequences are also found on scaffold 86 and scaffold 89 in amphioxus
>scaffold 89 e_gw.89.28.1 [Brafl1:221840] model has error in exon 8
Brafl1/scaffold_89:1792029-1799319
84% to 479.d, 88% to 86.a, 84% to 86.b, 88% to 86.c, 87% to 479.a
38% to CYP20 danio, 37% to CYP20 human
MLDYAIFAITFVVFLIATGLYLYP
GPNKITTIPGLEPSDPK
DGNLGDIGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVSLGAPELWKQHERIFDRP
PLLFKGFEPLIGAKSIQYANGLDGRTRRKLYDPSFGHNAMKYYYSIFQE
LGQEMAQKWESMEGDQHIPLRAHTIDLTMKAITRCSFGDTFKDEECLQFSRNYDI
CWDDINERTKGNYPVEGSPREKKFQE
ALGRLHTTIGRVAKYRRENPPPPQEQLFIDLLIEGDLPEEQ
VLCDAMTYMVGGFHTSGN
LLTWALYFIATHKEVEEKLYQELIDVLGKKEDVTPDNISQLV
YLRQVLDETLRCAVVGPWGARYMDLDIEIGGHIVPAK
TPVIHAFGVVLQDERIWPEPNK
FDPERFDAESSKGRHKLAFQPFGFAGGRKCP
GYKFSYAETSVFLSILCRQFKLHLVDGQVVTWHGIIMITRPVDEIWITVTKRD*
>scaffold 86.a fgenesh2_pg.scaffold_86000110 [Brafl1:79625] corrected exon 4
Brafl1/scaffold_86:2108590-2115417
MLDYAIFAITFVVFLIATVLYLYP
GANKITTIPGLEPSDPK
DGNLGDLGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVSLGAPELWKQHERIFDRP
ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE
LGQEMAKKWESMKGDQHIPLHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDI
CWNDMEERIKGSYPTEGSPREKKFEE
AKGKLHATIARVAKYRRENPPPPQEQLFIDVLIEGDLPEEQ
2112464
VLCDAMTYMVGGFHTSGN 2112517
LLTWALYFIATHEEVEEKLYQELSDVLGKKGEVTPDNISQLV
YLRQVLDESLRCAVVAPWGARYMDLDAEVGGHIVPAK
TPVIHAFGVVLQDERIWPEPNK
FDPERFDAENIKGRHKLAFQPFGFAGGRKCP
2115259
GYRFTYVETTVFLSILCRQFKFHLVDGQVVTPWHGLVTRPLDEIWITVTKRD* 2115417
>scaffold 86.b estExt_GenewiseH_1.C_860218 [Brafl1:265292]
Brafl1/scaffold_86:2118885-2129038
MLDYAIFAITFVVFLIAAVLYLYP
GSNKITTIPGLEPSDPK
DGNMDDVGRAGSLHEFLLKLHAEYGDIASFWWGQQLVVSLAAPELWKQHERAFDRP
PLLFKGFEPMFGAMSITYANSVDGRTRRKLYDPSFGHEALKHYFSIFQE
LGQEMASKWESTKGDQHIPLHAHMMALALKTFTRSSFGDSFKDEKECVQFGRNYGI
CWNDMEERIKGSHPTEGSPREKKFKE
ALGKLHATIARVAKYRRENPPPPQEQLFIDVLIEGNLPEEQ
2123453
VLCDAMTFTVGGFHTSGN 2123506
LLTWALYYIATHEEVEEKLHQELSDVLGKKGEVTPDNISQLV
YLRQVLDESLRCAVIAPWGARYMDLDAEVGGHIVPAK
TPVIHAFGVVLQDERIWPEPNK
FDPDRFDAENSKGRHKLAFQPFGFAGGRKCP
2128631
GYRFAYTWTSVFLSILCRQFKLHLVDGQVVKPCHGFVTRPVDEIWITVTKRD* 2128789
>scaffold 86.c e_gw.86.147.1 [Brafl1:220957] exon 12 in model is wrong, corrected below
Brafl1/scaffold_86:2141287-2148370
MLDYAIFAITFVVFLIAAVLYLYP
KSNKITTIPGLEPSDPK
DGNLGDVGRAGALHEFLLKLHAEYGDIASFWWGQQLVVSLGAPELWKQHERIFDRP
PLLFKGFEPLIGAMSIQYANHVDGMTRRKLYDPSFGHEAMKHYYSIFQE
LGQEMAKKWETMEGDQHIPLHAHMIALAMKAITRSSFGDSFKDEKECVQFGRNDDI
CWNDMEERVKGSYPTEGSPREKKFQE
ALGKLHTTIRRVVKYRRENPPPPQEQLFIDVLIEGDLPEEQ
2144990
VLCDAMTFMVGGFHTSGN 2145043
LLTWALYFIATHEEVEEKLYQELSDVLGKKGEVTPDNISQLV
YLRQVLDESLRCAVITPWGARYMDLDAEIGGHIVPAK
TPVIHAFGVVLQDERIWPEPNK
(2)
Exon 12 in a seq gap
2148212
GYRFSYIETSVFLSILCRQFKLHLVDGQVVTPWHGCVTRPLEEIWITVTKRD* 2148370
$$$$$$
CYP26 clan
(5 sequences)
>e_gw.480.39.1|Brafl1
22% to CYP26B1 magenta probably right seq
scaffold 480 463546:469390 (5845 bp)
MAELPGYVGYPWVGDNSLEFYRDPVSFMEKRVQDYSSRIFQARFINRPTVFVGSAEAVKKLLNEKSQHFEMGYKALWQGL
YGDNVLFSDGWSNFFVFIT VSILHIMSHCNNCNHYPNQTCGDPQPIQDLNLLMR
PVKVYELLKQMSTEISMGLFLDIERE
TDNSLAPLVSQLMTQHWHGIISMPANLKLPSWGGNWESGYSKAQEAKDELLKIIGERIGKNKHNNVLGLMKTAGFRSEDE
IYRHLLLFVSALVPKAFSSLFTSFTLQLAGPSKVSMRQKALEDETFLEHILLEVQRLWPPFIGGRRLVRQEFTLAGYRIP
KEHGLMYVTHTAHRDPQIFPEPNSFKPERWSTCNAGHEGYLCAFGGGPRRCIGTQLVQLVLKHVTKYLLHNFHW
QVTQAE
IPPYKWLPVSRP
>estExt_GenewiseH_1.C_4470006|Brafl1
scaffold 447 85292:91028 (5737 bp)
90% to e_gw.480.39.1|Brafl1
about 10 aa diffs to green region of e_gw.480.39.1|Brafl1
MAELPGYVGYPWVGDNSLEFYRDPVSFMEKRIQDYSSRIFQARFINRPTVFVGSAEAVKKLLNEKTQHFEMGYKALWQGL
YGDNVLFSDGWSNFFVFIT NLSSSFSGKKLLTSLQIPGHLPHMTVNLR PVKVYELLKQMSTEISMGLFLDIERETDNSFA
PLVSQLMTQHWHGIISMPANLKLPTWGGNWESGYSKALEAKDELLKIIGDRIGKNKHNNVLGLMKTAGFRSEDEVYRHLL
LFVSALVPKAFSSLFTSFTLQLAGPSKASMRQKALEDETFLEHILLEVQRLWPPFIGGRRLVRQEFTLAGYRIPKEHGLM
YVTHTAHRDPQIFPEPNSFKPERWSTSNAGHEEYLCAFGGGPRRCIGTQLVQLVLKHVTKYLLHNFHW EVTQAEIPPYKW
LPVSRPTVEDQVIFTPRDSPDQEVEVGVEVAETSL*
Hybrid seq
MAELPGYVGYPWVGDNSLEFYRDPVSFMEKRVQDYSSRIFQARFINRPTVFVGSAEAVKKLLNEKSQHFEMGYKALWQGL
YGDNVLFSDGWSNFFVFIT VSILHIMSHCNNCNHYPNQTCGDPQPIQDLNLLMR
PVKVYELLKQMSTEISMGLFLDIERE
TDNSLAPLVSQLMTQHWHGIISMPANLKLPSWGGNWESGYSKAQEAKDELLKIIGERIGKNKHNNVLGLMKTAGFRSEDE
IYRHLLLFVSALVPKAFSSLFTSFTLQLAGPSKVSMRQKALEDETFLEHILLEVQRLWPPFIGGRRLVRQEFTLAGYRIP
KEHGLMYVTHTAHRDPQIFPEPNSFKPERWSTCNAGHEGYLCAFGGGPRRCIGTQLVQLVLKHVTKYLLHNFHW
EVTQAEIPPYKWLPVSRPTVEDQVIFTPRDSPDQEVEVGVEVAETSL*
$$$$
>CYP26 fgenesh2_pg.scaffold_164000029|Brafl1 44% to CYP26B
Brafl1/scaffold_164:756408-763969
MLEEVVGYLVFPAFLMVLSWKLWGRYATPSDPACALPLPAGTTGFPIIGETLSFILEGADFSRKRHALYGDIFKTHILGR
PTIRVRGADNVRKILRGENDIVGTMWPDNFRMVLGTENLAMCGSGPLHRQRKKIVMRAFRHDALEIYTDSMQAMIADTLR
VWCRGPQPLAVYPAAREMMFRLAIAVLVGFHQDEEEARRVGSLFRTAVKNIFSLPLNVPGSALRKALQCRQEIDEWLKRH
IHEKHAQIWSGEVPDDVLSFIISSAKEEGKAVDQQQLLDTAVELLFAGHETTSSAATSLIMHLALQPQVVQKVQEDLEKH
GLLQPDQPLSLEQVGRLTYVGQVVKEVLRISPPIGGGFRKALKTFAIGGFQVPEGWAVMYSIRDTHSASQLFSSPQQFDP
DRWAAADSTAIRYDFLPFGAGPRACAGKEFAKLQLKLLCVELVRSCRWELADGKVPEMKSVPVLHPANGLPVNFVSLDDV
TVKREDADGLAAPAHAPLMNTDLVTRSDPCLTLDKNGNLYPTSEQNSPDTVTVVGPDLSNIV*
>CYP26 fgenesh2_pg.scaffold_164000030|Brafl1
Brafl1/scaffold_164:786544-802905
62% to fgenesh2_pg.scaffold_164000029|Brafl1, 43% to CYP26A1
MLVELVTVLVLPCVALLLSWKLWTQYYTWSDPGPDTPLPPGSMGLPFIGETLSLVTQGGKFSSSRHAQYGDVFKTHILGR
PTIRVRGATNVRKILLGENHIVTSLWPQTFRTVLGTGNLAMSNGEEHRLRRKVIMKAFNYEALERYVPIMQEILREAVQR
WCGAPQPVTVWPMAREMAFRVASAVLVGFQHSDEEIQHLTSLFTNMVKNLFSLPVKLPGSGLSNGLFYRQAIDEWMMNHI
QRKKEFVLQGGDSGDVLSHIMNNAKDNGEKLSDQEIQDTVVELLFAGHETTSSAATSLIMHLALQPQVVQKVQEDLEKHG
LLQPDQPLSLEQVGRLTYVGQVVKEVLRRRPPIGGGYRRALKSFDIGGFHVPKGWAVLYSIRDTHEASQIFSSPELFDPD
RWTPETSQAPLARYDMVTFGGGPRACVGKEFAKLLLKLLCVELTRRCRWKLADDKLPDMKLIPIVYPADGLPVIFTPIGG
KSPGDENKNGVPYEERTRGKDCPILCSVSFEKDINVAT*
>CYP26 estExt_fgenesh2_pg.C_1640031|Brafl1
Brafl1/scaffold_164:838132-841029
45% to CYP26C1, 42% to CYP26B1, 42% to CYP26A1
64% to fgenesh2_pg.scaffold_164000029|Brafl1
MLAELLINAAVPLVLVWTLWTLWKHYSTQGDPACDLPLPKGSMGLPFIGETLAFVTQGADFSRSRHELYGDVYKTHILGR
PTVRVRGADNVRKILHGENTLVTTIWPYSIRAVLGTQNLGMSFGEEHRFRKRVVMKAFNQNAMESYLRSTQTVLRETVAQ
WCVQPQPVVVYPASREMALKIAAASLIGVHTGQEDAQRVTVLFQNMIDNLFSLPVKIPFGGLSKALRYRQIIDEWLEGHI
KRKQRDIDNGDIGTDALSRLILAARDVGHDLNSQEIQDTAVELLFAGHETTSSAATSLIMHLALQPQVVQKVQEDLEKHGL
(gap)
LQPDQPLSLEQVGRLTYVGQVVKEVLRISPPIGGGFRKALKTFELDGFQVPAGWTVTYSIRDTHGSVGNVSSPDQFDPD
RWAADSDGSRRGRHHYIPFGAGPRACAGKEFAKLQLKLLCVELVRSCRWELADGKVPAMTAIPVPRPVNGLPVQFTPCEP
ITNNTLSDATEQNTNLSVCYSSNVPGPSHTSPKQQDFDAPCQIVMARKSEACGA*
$$$$$$$$
CYP46 clan
(1 sequence)
>CYP46 from ESTs and trace archive
CF917908
BI377382 Amphioxus 5-6 hrs cDNA 52% to CYP46 fugu
ATGI35342.g1 ATUP100909.x1 ATUP374858.g2
ATUP181014.g1
AFSA27081.b2
ATUP181014.b1 AFPZ282931.x1
(possible exon 2)
walked upstream to ATWW110117.g1 ATUP762936.y1 (N-term)
MAVVAVLMVLGVLAVVGLAVAGVVYLGYIYYMHRKYDHLPGPPRKS
(2)
MNLLTYPICTTFLTLYSDENEYLTAFLD
(0)
LLFCERYGPIVRLNFLHRVIIFVSSPEAVR
(0)
ELLVTGKYIKPPDQYERIGSIFGERQ
(0)
FLGEGLVTETNQERWHKRRRIMDPAFSRKYLQTL
MDKFNTSGDLFVEKLQTLADGVTPVSMVDMFGRVTLDVIAK
(0)
VAFSMDLNTILDDHTPFPMATYITLSALIQQFRHPFME
(0)
YNPFQRDYIRKVREACRLLRKTGHSVLQERQDQIRRGEQLPNDIMTLILKAN
(1)
DEDSGLTVEKLVDDFVTFFIA
(1)
GSETTANQLSFTLMELGRYPDVLEK
(2)
LRAEMREVCGNKEYITYEDIGKLQYMGQ
(0)
VLKESLRMYPPATGTSRLVEEEMELCGHRIPGDTVLI
(0)
TSTYIMSHMEKHYPEPYTFNPDRFTPDADRPLYTYFPFSLGSRSCIGQHFSQ
(0)
IEAKVLLCKFLQKLEFELDPNQSFAVSEASTLRPKGGCICMLSKLEK*
>CYP46 estExt_fgenesh2_pm.C_1290004|Brafl1 48% to CYP46
MHRKYDHLPGPPRKR
(gap)
CERYGPIVRLNFLHRVIIFVSSPEAIRELLVTGKYIKPPDQYERIGSIFGER add this line to lower seq
FLGEGLVTETNQE
RWHKRRRIMDPAFSRKYLQTLMDKFNTSGDLFVEKLQTLADGVTPVSMVDMFGRVTLDVIAKVAFSMDLNTILDDHTPFP
MATYITLSALIQQFRHPFMEYNPFQRDYIRKVREACRLLRKTGHSVLQERQDQIRRGEQLPNDIMTLILKANDEDSGLTV
EKLVDDFVTFFIAGSETTANQLSFTLMELGRYPDVLEKLRAEMREVCGNKEYITYEDIGKLQYMGQVLKESLRMYPPATG
TSRLVEEEMELCGHRIPGDTVLITSTYIMSHMEKHYPEPYTFNPDRFTPDADRPLYTYFPFSLGSRSCIGQHFSQIEAKV
LLCKFLQKLEFELDPNQSFAVSEASTLRPKGGCICMLSKLEK*
>CYP46 estExt_fgenesh2_pg.C_1860050|Brafl1
same as estExt_fgenesh2_pm.C_1290004|Brafl1
MAVVAVLMVLGVLAVVGLAVAGVVYLGYIYYMHRKYDHLPGPPRKS
FISGHIDDMTKCLQDGKFTQDMILEW
(gap)
FLGEGLVT
ETNQERWHKRRRIMDPAFSRKYLQTLMDKFNTSGDLFVEKLQTLADGVTPVSMVDMFGRVTLDVIAKVAFSMDLNTILDD
HTPFPMATYITLSALIQQFRHPFMEYNPFQRDYIRKVREACRLLRKTGHSVLQERQDQIRRGEQLPNDIMTLILKANDED
SGLTVEKLVDDFVTFFIAGSETTANQLSFTLMELGRYPDVLEKLRAEMREVCGNKEYITYEDIGKLQYMGQVLKESLRMY
PPATGTSRLVEEEMELCGHRIPGDTVLITSTYIMSHMEKHYPEPYTFNPDRFTPDADRPLYTYFPFSLGSRSCIGQHFSQ
IEAKVLLCKFLQKLEFELDPNQSFAVSEASTLRPKGGCICMLSKLEK*
CYP51 clan
(1 sequence)
>CYP51 estExt_fgenesh2_pm.C_2170007|Brafl1
MLVEMGNLLLENALETVQELGSGTVALTTIVVLLGVTYFGRQFVSSVGKAEKLPPVVPHTIPILGHGYNFYKNPIGFLEE
AYKKYGPVFTITMAGSKFTYLVGSDAAATLFNSKNEDLNAEEVYSRLTTPVFGKGVAYDVPNPVFLEQKKMFKTGLNIAR
FRTHVSLIEEETKEYFKRWGDSGERDLFEALAQLTILTASRCLHGKEVRSMLHEGIAQLYADLDGGFTQMAWLLPGWLPL
PSFRKRDRANREMKKVFKKIIQQRRESGDCDDDMLQTLMESTYKDGRPLTDDEITGMMIGLLMAGQHTSSTTSTWMGFFL
AKHKDIQARAYQEQLDICGEDLPPLNYDDLKEMALLDKCLAETLRLRPPIMTMMRMCKTPQQVKGYTIPVGHQVCVSPTV
NQKLEDTWEEAGTWNPNRFLEGNASTGKFSYVPFGAGRHRCIGENFAYVQIKTIWAVLLREFEFELIDGHFPSINFETMI
HTPSQAIIRYKKR*
>CYP51 estExt_GenewiseH_1.C_4170037|Brafl1
1 aa diff to estExt_fgenesh2_pm.C_2170007|Brafl1 (allele)
MLVEMGNLLLENALETVQELGSGTVALTTIVVLLGVTYFGRQFVSSVGKAEKLPPVVPHTIPILGHGYNFYKNPIGFLEE
AYKKYGPVFTITMAGSKFTYLVGSDAAATLFNSKNEDLNAEEVYSRLTTPVFGKGVAYDVPNPVFLEQKKMFKTGLNIAR
FRTHVSLIEEETKEYFKRWGDSGERDLFEALAQLTILTASRCLHGKEVRSMLHEGIAQLYADLDGGFTQMAWLLPGWLPL
PSFRYRDRANREMKKVFKKIIQQRRESGDCDDDMLQTLMESTYKDGRPLTDDEITGMMIGLLMAGQHTSSTTSTWMGFFL
AKHKDIQARAYQEQLDICGEDLPPLNYDDLKEMALLDKCLAETLRLRPPIMTMMRMCKTPQQVKGYTIPVGHQVCVSPTV
NQKLEDTWEEAGTWNPNRFLEGNASTGKFSYVPFGAGRHRCIGENFAYVQIKTIWAVLLREFEFELIDGHFPSINFETMI
HTPSQAIIRYKKR*
$$$$$$$
CYP74 clan
(10 sequences, 7 distinct sequences, two pairs of alleles, plus one nearly identical duplicate on scaffold 120)
>fgenesh2_pg.scaffold_781000005 [Brafl1:110589] 23% to CYP74B2
Scaffold 781
96% to fgenesh2_pg.scaffold_402000022
MGVSMSNTKGLVRHVKAGPRALKPGGEHPAAVRTNVGIPVVALLNQDTIHHVFNTDLVDKEQYCLGYVGV
RSELLRGHCPSMFANGQEHRRKKAFLIDVFRGRQKTLPPVLSRQIMAHFKEWSRLEALADFEDKVFFLMS
DILTETVFGRKLDGRLALHWLQGLPSVRTWIPFPTKAKQDLAASALPVLLKSIEESPNYEELIQLSYLHD
IEEEDAIDNILFVIVFNAVAAVSAVIVTFITRLHTITEADRNVLLKTTLQALLKHESLSEESLGDMKALD
SFLLEVLRLHPPVFNFFGVAKKDFAIPTGVDKNVEVRQGEQLMGSCFWAQRDAKVFLSPNVFRCYRFMDS
KELLVDREQDGGKKRHLIFGHGS
(2)
LTEAADL (frameshift)
DSHQCPGQDIAFYLMKATLAVLLCYC
SWELEALPVWSDKTARLGRPDDLVSLTWFNFDSDTARHVLESYDLNCEK*
>fgenesh2_pg.scaffold_402000022 [Brafl1:102192]
Scaffold 402
57% to fgenesh2_pg.scaffold_107000039
96% to fgenesh2_pg.scaffold_781000005|Brafl1 (probable allele)
MGVSMSNTKGLVRHVKAGPRALKPGGEHPAAVRTNVGIPVVALLNQDTIHHVFNTDLVDKEQYCLGYVGV
RSELLRGHCPSMFANGQEHRRKKAFLIDVFRGRQKTLPPVLSRQIMAHFKEWSRLEALADFEDKVFFLVS
DSLTETVFGRKLDGRLALHWLQGLPSVRTWIPIPTKARQDLAASALPVLLKSIEESPNYEELIQLCYLHD
IEEEDGIVNILFTIVFNAVAAVSAVIVTFITRLHTIIEADRNILLKTTLQALLKHESLSEESLGDMKVLD
SFLFEVLRLHPPVFNFFGVAKKDFAIPTGVDKNVEVRQGEQLMGSCFWAQRDAKVFLSPNVFRCYRFMDS
KELLVDREQDGGKKRHLIFGHG
SLTEAADLDSHQCPGQDIAFYLMKATLAVLLCYC
SWELEALPVWSDKTARLGRPDDLVSLTWFNFDSDTASHVLESYHLNCEK*
>fgenesh2_pg.scaffold_107000039 [Brafl1:81984] 27% to CYP7D1
Scaffold 107
61% to fgenesh2_pg.scaffold_402000022|Brafl1
51% to estExt_fgenesh2_pg.C_1950037
816169
MGACMSDTSGLLNTKKSGPHVLNPRGEHPTIVRTNVGIPCVGLLSQETIQYVFDPELVDKEPCCFGYSEVPGDV
RRGHCPSMFANGQEHRRKKAFLVDVFKECRDKIQTVLFKTILEDFEEWSRVKTVPDFEDRVYFLISKAVT
EAVFGTKLDGRLALTWLEGAIQLKTWLPIPNYAKRHRLAVAALGELMKTIEESPKYEELIRMCHLHDLEA
EDGMMTLMHAILFNGCGAVTTTIITSVARYQTIPAGERKDLQTSVLQEVEKFGSITEESLGEMEFLESFL
LEVLRMHPPVADFWGVAKKDFTVSAGEIKEEIRKGERLLGSCFWAQRDVSVFLRPGLFRSRRFLDEKE
KRSNLLFPHGSFLEAASLDSHQCPAMDIAFILMKATLAVLLCYCKWELQDTPEWSDKITRLGKPDGLVSLTS
FGFDLVEARRVLEL*
$$$$$$$
>estExt_fgenesh2_pg.C_1200087|Brafl1
22% to CYP74B2, 25% to CYP74F1 rice
(29% to Nematostella XM_001636310.1 12 exons)
1244338 MGNCCSNYAGMWRALQQGNYSIKEINYGGADATVLRRNIGVTVVSLLDQHNIRYVFDMDLVEKVPFTLGNTALRPAVLGG
HCPGMLSNGVEHVRRKEFAMAVIQRSLTNSLFSTMVEQLHAHTSMWATVGHNIYDFEDRVNRFCADAVSTVILGTTLPYE
SVRAWQNGLHSHRPRVPTLGRYLAKSHALRALPVLLRNIRNAPAYEDIIHLGKTCGLTEEEATHEILYTIVGHALPQVQN
PLLACLAAYAAMPDLDRRQMWEEMNK
(0) 1245135
1247089
VLHNVGTFTETVLGSMTCVESFILEVLRLRPPMEMFFGRARKDFIVKTRDREIFQ
(0) 1247256
1247668 VHEGEVVCGSAFWAGRDPTSFRVPIMFRRNRFACPGSEALRGSLIFGRGPLTFLPTNENHQCPGLELAMGVLKPSMAWL
LMFCKWKLTEEPKWSGKKRSRCGKPDNPMGMVTFKYYPTDVANYYPLPGVTPSNEKGKPGKDNSPNVSSFVSSIL*
1248132
Two genes nearly identical 45 kb apart (possible assembly error?)
estExt_fgenesh2_pg.C_1200094
1303480
MGNCCSNYAGMWRALQQGNYSIKEINYGGADATVLRRNIGVTVVSLLDQHNIRYVFDMDLVEKVPFTLGNTALRPAVLGG
HCPGMLSNGVEHVRRKEFAMAVIQRSLTNSLFSTMVEQLHAHTSMWATVGHNIYDFEDRVNRFCADAVSTVILGTTLPYE
SVRAWQNGLHSHRPRVPTLGRYLAKSHALRALPVLLRNIRNAPAYEEIIHLGKTCGLTEEEATHEILYTIVGHALPQVQN
PLLACLAAYAAMPDLDRRQMWEEMNK
(0) 1304277
1306472
VLHNVGTFTEAVLSSMTCVESFILEVLRLRPPMEMFFGRARKDFIVKTRDREIFQ
(0) 1306636
1307061
VHEG 1307072
end in a sequence gap
>estExt_fgenesh2_pg.C_1950037 [Brafl1:125761] two genes fused
Scaffold 195
Neighbor to amphioxus on right side scaffold 195 also on another scaffold
P450 like Nematostella/CYP74
MGGVWSNTYGFIKGVTDGVHMMKPEGEHPSVVRTNPGLPVVALMNQDTIHYAINPETYKKEPYSFGPVGV
SKDVLRGHCPSMFSNDEDHRRKKALLVDAYKQGEKSLPSILFNQIKAHFGEWSRLKDVPDFEERVFHIMS
ETLTEALFGRKIDGQLCFTWLNGLITEAKTWIPMPSLAWKRRQAIKAIPELLKAIETAPKYRELVQLCHT
HGVEVEEGIFTILYGTLFNGCAAQTAAIVSSVARLHTLSDAEKNEIIQTTLQVLEKHGGVSEESLGEMKT
LESFILEVLRLHPPVFNYWVLARKDLVISPEKENIKVRKGERMLGCCFFAQRDGSVFPDPDRFRWNRFLD
EQGGQKKHLFFPRGSFTEAADLNSHQCPGQDIGFFMMKTTLSVFLCYCSWELKDAPVWSDKPIRVGNPDD
PVRLVRFNFRSEQAGRALVNTSAKKI*
>estExt_fgenesh2_pg.C_3320046 [Brafl1:128846] 3 exons
Scaffold 332 also 98% identical to upper seq but gene neighbors are different
MGGVWSNTYGFIKGVTDGVHMMKPEGEHPSVVRTNPGLPVVALMNQDTIQYALNPETYKKEPYSFGPVGV
SKDVLRGHCPSMFSNDEDHRRKKALLVDAYKQGEKSLSSILFNQIKAHFGEWSRLKDVPDFEERVFHIMS
ETLTEALFGRKIDGQLCFTWLNGLITEAKTWIPMPSLAWKRRQAIKAIPELLKAIETAPKYRELVQLCHT
HGVEVEEGIFTILYGTLFNGCAAQTAAIVSSVARLHTLSDAEKNEIIQTTLQVLEKHGGVSEESLGDMKT
LESFILEVLRLHPPVFNYWVLARKDLVISPEKENIKVCKGERMLGCCFFAQRDGSVFPDPDRFRWNRFLD
EQGGQKKHLFFPRGSFMEAADLNSHQCPGQDIGFFMMKTTLSVLLCYCSWELKDAPVWSDKPIRVGNPDD
PVRLVRFNFRSEQAGRALVNTSAKKI*
>estExt_fgenesh2_pg.C_1940045 [Brafl1:125747] 81% to estExt_fgenesh2_pg.C_1950037
scaffold 194
MGGVWSDTFGFIKGLVHGPHMMKPEGEHPSVFRANPGVPAVVLLNRDTIQYAFNPETYEKEPYSFGPVCA
AKDVVGGHCPSMFSNDEDHRRKKALLIDVYKQGQKTLPSVFFSQIKAHFEEWSRLEDVPDFEERVFHITS
ETLTEALFGKKIDGRLCYTWGNGIPTDFRTWIPIPPAARKRRQAVEVLPALLKAIKETPKYQELVQLCHT
HGVEVEEGILTILYGTLFNGCGAQTATIISSVACLHTLSDAEKNEIIQTTLQVLEKRGGISEESLSEMKT
LESFILEVLRLHPPVFNYWALARKDLVISPEKENIKVCKGERMVGSCFWAQRDGSVFPDPDRFRWNRFLD
EDEQGGQKKHLFFPRGSWTEAADLDSHYCPGQDIGFFILKVLLAVLLGYCSWELKDAPVWSDNTFRLGNP
DDPVRLARFNFRSEQAGRALGIRPDNIAPNAI*
Seq downstream similar to a Rickettsia seq
>estExt_fgenesh2_pg.C_510020 [Brafl1:120723]
Scaffold 51
87% to estExt_fgenesh2_pg.C_1940045
83% to estExt_fgenesh2_pg.C_1950037
623366
MGGVWSDTFGFVKGLVYGPHM
MKPKGEHPSAFRMNNGVPAVVLLTRDTIQYAFNPETYEKDPYSFGPGGVSKDVVRGHCPSMFSNDEDHRR
KKALLIDVYKRGQKTLPSVFFSQIKEHLEEWSRLEDVPDFEERVFHIMSETLTEALFGRKIDGELCFTWL
NGLLTDFKTWIPIPSMSRKRRLAIEALPALLKAIKEAPKYQELVQLCHTHGVEVEEGIFTILYGTLFNGC
AAQCAAIVSSVARLHTLSDTEKNDIIQTTLQVLEKHGGVSEESLGEMKTLESFILEVLRLHPPVFNFWCL
ARKDLVISPEKENIKVCKGERMVGCCFWAQRDESVFPDPDRFRWNRFLDEDKQGGQKKHLFFPRGSWTEA
PDLDSHQCPGQDIGFFMMKALLAVLLGYCSWELTAAPMWSDKTIRVGNPDDPVRLARFNFRSEQAGRALG
IRPDNIAPNAI*
>fgenesh2_pg.scaffold_163000045 [Brafl1:87575]
Scaffold 163
73% to estExt_fgenesh2_pg.C_1940045 pseudogene
MGGVWSDTFGLIKGLVYGPHMLKSEDEYPTAFRTNNGVPAVVLLNRDTIQYVFNPEMYEKEPFYFGYLGT
SKDVMRGHCPSMFLNGEEHRQKKALLIDAYKQGQKALPSVLFKQIKAHFGEWSRLDEVPDFEDRVFHFFS
EALTEALFGRKVDGQLCRTWLNGLLNDFKTWIPMPSMARKRRLAIEAIPVMWKAIEEAP
KY*ELVQLCDTHGVEAEEGIFTILCGTIFNGIAAERAAIVSSVARLHTLSDAEKNEIIQTTLQVLEKHGGVS (frameshift)
GEMKTLESFLLEVLRLHPPVFNLWGLARKDFIISPEKENIQV (1)
IA
IIRKGEQLLGSCFWAQRNGSVFPDPDRFRWNRFVGEDEQGEQK (2?)
(0)
KHLFLPRS (2)
NWTEAYDFDSH
HCAGQDIAFLTMKATLAVLLCYCSWELKDAPVWSDKTLRVGNPDDPVRLTRFSFRSEQAGRALGIRPDNT
YPNSI*
unassigned
(1 sequence)
>gw.549.14.1|Brafl1
heme signature like seq possible pseudogene fragment no allele
CVYWDFKLNDDKGGWSSEGCNVYYAADTHTVCHCNHLTNFALLMDVYGSTAKLSEGNQKALSIISLIGCAVSSAGLLFAL
ITFLLFRTLRRDNPTKILINLCVALLLVNLTFVTLSHPEQFHAGFMCKTHAMVMHYALLAAIAWMGIEAVNMYLAFVKVF
DTYYTNFVMKICLAGWGRLLKYVQFSMGMKSCVARACVCVCVCVCVCVCVNLCYLSGIAFYAAFVAPVCVVLIFNTTMYG
LVLRHVVRMRGKVEKSELSEVITKLKRAAGLCVLLGVTWLFAMLAIDKAAVFFSYVFAICNSLQGFFIFVFHCVLRKSAR
KRWMALLPC