Aphid P450s from Acyrthosiphon pisum
Oct. 13, 2009
D. Nelson
CYP2 clan
>CYP15A1 LOC100162751
49% to CYP15A1 Tribolium,
51% to CYP15A1 Diploptera punctata
52% to CYP15A1 Reticulitermes flavipes
MLFFVTLVISLVLLFLILDTIKPRRYPP (1) GPKWLPIGVN (tribolium N-term)
MFFVAVVISVFIIVCILDIITPHKYPI (1) GPTRVPLLGNYLE
IRKLRNKLGFYHLVWDHLAKYYGKVFSVKLGRIEAVVVSGYDAVRQVLCKDDFDGRPDGFFFRFRAFYKRLGIVFVDGPTWTEQRKFCMQHLRKMGFGGDLMERIIIEEVNDLMLDISRKCENGKPIEVYGLFDVSVLNGLWAMLAGHRFALNDSRLARLMELVHVSFRMLDMSGGILNQMPFIRFFAPKCSGYKYLKQIINEFYTFLKESVEEHKCRANDQEDDFISAFLKEIEKNKESPGSFSEEQLLVILLDLFLAGSETTSSMLSFVILLLLKHQDIQAKVHAELDAVVGDREIHLADKNRLNYLEAVLMEVQRHSNVAPLAIAHRTIRKTSLQEYTIPKDTLVLASIWSVHMDEQHWGDPKVFRPERFLDSSGKIINDSWFMPFGVGRRRCLGEILAKTNIFMFIAKLIQHFEIRIPQGAQLPDKPQDGVTISPSPFSAIFIPRRCLSQ.
>CYP15A2P LOC100169165 whose N-term 100 or so aa are wrong
49% to CYP15A1 Tribolium
87% to LOC100162751
missing N-term exon, revised middle
Since the rest of the protein is so conserved (87%), the N-term exon seems to be gone. Therefore this is probably a pseudogene.
GPSRIPFIGNYLEIRKLRNELGFYHLVWHQLAKCYGQVFSVKLGRIEAVVVSGYDAVRQVLCKDDFDGRPDGFFFRFRAFYKRLGIVFVDGPTWNDQKKFCMQHLRKMGFGGDLMEKIIIEEVHDLMVDITIKSENGKPIKVHGLFDISILNGLWAMLAGQRFALNDSRLARLMELVHVSFRMLDMSGGILNQMPFIRFLAP
NSSGYEHIKQILNEFYTFLK (0)
ESVEEHKCG
ENYQEDFISAFLMEIEKNKESPESFSEEQLLVILLDLFLAGSETTSSMLSFAVLLLLKHQDIQDKVHAELNAVVGDREIQLADKKKLNYLEAVLMEVQRHSNVAPLAIAHRTIRKTSLQEYIIPKDTLVLASIWSVHMDEHHWGDPEVFRPERFLDSTGNIIKDSWLMPFGIGRRRCLGEILAKANVFMFIANLIQNFEIRIPNGVQLPDRPQDGVTISPSPFSAIFIPRR.
>CYP15A3P LOC100160402p which has either 2 annotation errors, or is a pseudogene.
44% to CYP15A1 Tribolium,
79% to LOC100162751
middle region revised
missing N-term exon
Since the rest of the protein is so conserved (79%), the N-term exon seems to be gone. Therefore this is probably a pseudogene.
GPTRVPLLGNFLEIQKLKNKLGFYHLVWDKLAKCYGQVYSVKFGPIETVVVSGYDAVREVLSKDDFDGQADGFFFRTRAFYKKLGIVFVDGPMWTEQRKFCMRHLQKLGFCGDVMEKIVIEEVNDLVLDITRKYENGKSIEVRGLFEVSVLNGLWAMLAGGRFSLNDSRLARVVELIHESLRILDMPGGILNQWPFIRYLAP
LSRNKHLKQIINELYILLK (0)
ESVEEHKCSENDQE
DFISAFLMDIEKNKKSLGSFSEEQLVVILLDLFLAGSETTSITLSSVILHLLMNQDIQTKVRAELDAVIGDREILPSDRKRLNYLEAVFMEVQRHSNVVPLAIATNRTIRKTTLQDYIIPKDTLVLASIWSVHMDEQHWGDPEVFRPERFLDSKGKIINDSWLMPFGVGKRRCLGEKLAKTYIFMFIAKLIQHFEIRIPTDIQLPDKPQNGVNISQTPVSVFFIPRRCLKAN.
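The pseudogene calls for CYP15A2P and CYP15A3P above rest on percent identity to CYP15A1 over the region that is still present. This file does not say how those percentages were obtained (a BLAST-type alignment is likely). The Python sketch below is only an illustration of the idea, assuming the two sequences are already aligned to equal length; the function name percent_identity is hypothetical and not part of any annotation pipeline.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identical residues between two pre-aligned, equal-length strings.

    Columns where either sequence has a gap ('-') or an unknown residue ('X')
    are skipped, so only comparable positions are counted.
    ASSUMPTION: the inputs are already aligned; no alignment is done here.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-X" or y in "-X":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# The 13 residues that follow the "(1)" marker in the CYP15A1 record,
# and the first 13 residues of the CYP15A2P record (copied from above).
cyp15a1 = "GPTRVPLLGNYLE"   # CYP15A1 LOC100162751
cyp15a2p = "GPSRIPFIGNYLE"  # CYP15A2P LOC100169165
print(round(percent_identity(cyp15a1, cyp15a2p), 1))
```

A short window like this one will not reproduce the whole-protein figures quoted in the notes; those need a real alignment over the full shared region.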
>CYP18A1 LOC100163652 = 26 hydroxylase (not a Halloween gene)
MTAETMSDGDGYSRELWLNAVAAALGLTYSAYRQLRAARTLPPGPWGVPFLGYAPFLSNHCTYLKYNELARRYGPICSFTQRGNTVILLSDHKLIKTAFDMKQITGRPNDGYMDIIGGYGAVNSTGKLWESQRKFLHLVLRHMGMTFTGHNRLNMENRIMIEVSTLTETFHKTCGKPIDLNAGSLCLAITNVISSLTMSVRFEPNDPRFERYMHMVDEGFKLFGMLRPVSLFLPRRHITDERNIQEKIKNNHQEIAKYFQSIIEEHRSTFDPNSIRDLVDAYLLEIKRSQEAGTMDQLFQGLDPNRQVQQILGDLFSAGMETIKNTILWAMVYMLHYPDVMTKVQDEIDSVVGQYKSPVLDDYPNLPYTQATLYEVLRKSSITPLGTTHATTSDVTLNGYHIPTGAQIIPLQHFVHNDPNLWDEPEAFKPERFINAEGKVKKPDCFLPFGVGRRKCLGETLAQMELYLFFSTLLHEFDVCLPDGDELPSMDGQVGITLTPQSFKVVMKARNK.
>CYP303A1 LOC100162206 = nompH (not a Halloween gene)
MWILVLVLFSVVVALLSYLDMRKPKNYPPGPKWLPILGSALTVNSLRKQTGYLYRATICLAESYGPIVGLKVGKDRQVVCCGYNAIKEMLTKEEFDGRPQGPFYETRTWGTRRGLLLTDEEFWVEQRRFVLRHLREFGFGKRTMAELVQDEAVQLVEDFKEKIAMSKNGNGEIFEMRDAFSVGVLNTLWSMMASKRYNADDIELKNLQALLTELFANIDMVGALFSQFPVLRFIAPEASGYKSFVNIHQQVWKFLKAELDDHKETFIINQPRDLMDVYLQMLHSEDKKESYSESQLLAICMDMFMAGSETTSKSLGFGFLYLLLNPEVQKKAQEEIDRVVGRDRLPTLNDRPNMPYLEALVLESVRVFMGRTFSIPHRALKDTTLQGYHIPKDTMVIANFAALLNDDDVWDHPDRFWPERFIGCDGKLIVPDEYLPFGYGKHRCMGQTLARSNIFLFSACLLQNFDFSVPDGQAPPSTLGVDGVTPSPGEFNAYVSLRPR
>CYP305E1 LOC100168939 SCAFFOLD13 coords:75476-86851
41% to CYP305A1 Drosophila melanogaster
MAWYFVCFVTVILLLIALRTCRKPKNYPPGPKWIPFVGNTYQLSKLAATKNGQYLAFEELRQRYKSDIIGLKLGREYVVIVFGNDLLNETFHRDEFQGRPDNFFMRLRTMGKRRGITMTDGDLWKVHRSFAVRHLKLLGLGQRRVDELIHDEYQLMVDRLFDATKSVTPTLYLQSAVMNVLWELTAGTKFEDPKLLTLMRKRSSAFDMAGGLLNQIPWLRYLAPTRTGFSLITEINQQLYSLISNIIVEHKKTITHTTRDFIDAYLNQMKKEEIYNTMFTEEQLIAVCLDLFIAGSSTTSSTLDFAILAMARWPDVQAKVQSTLDEIQPPGTYITAEQILKNRYVEAVLLETKRLNHVTPIIGPRRVLRNTNLNGYNIPKNTTILMSLYSVHQDQLKWGDPEVFRPERFMDTNGKINTTEDMYFFGFGKRRCPGEALAQRFVNLAFANLIHDFTIEIDQLPDGVNCGILLTPKPYKIKMTKRK
>CYP306A1 LOC100165691 = phm
25 hydroxylase
MFWIIGVILFGVLCAGYLWRSNRNLPPGPWGVPIFGYLPWLNPTEPYKTLTALASKYGPIYSIQMGKHFAVVMSDPTLVRMALARNELADRTNFEVVNEIMQEHGLIFTHGPLWKEQRKFVCNWLKVIGVTKFGDKKNNLQLLIADAVSTTISKLRQSNNRPIDTGTFFLVHIGDFINLIVLGKAWPEDDPNWIYLRNLAEDGSKKFAIATPLSVLPILKIIPKYRNTVFEVIEGVKNTHLIYKTLMEKRGNEIHESDDLMAMFMKEMTKRKNDKDSHYFTEKQCCFLLSDLFGAGVETTVNTLRWFLLYMALNQEIQNDLQKLLDSACTDGGLIGLEQIESIPLLKACVSETMRLRPVAPSGIPRSVNTEITISGYRIPKGTMVLPLQWAMHHDEKYWTDPETFRPKRFLDDEGNMINHKAFMPFQAGKRACVGDTLSYWILYLFGANIIHNFNVSAEQGLSEKEINTIMDGEFGITLSPATHKVVFKSRI.
There are three paralogs of the Halloween gene CYP307 (spo), which has a complex history
(see the papers by Rewitz and by Sztal), so one might expect different expression patterns.
>CYP307A1 LOC100160204
85% to LOC100160738
57% to CYP307A1 Tribolium,
49% to Cyp307a2 Drosophila melanogaster
39% to CYP307B1 Tribolium
MDTAKGVVAAAADNVTVVLLLLLSVVLLILAVKSASGRGPWTSRRRPGKSTAAVALTAVPDGPTAYPVIGALHAMDGHRDKPFHRFTELSHKYGPVFSMTMGSMPCVIVNDFDSIKEVLITNGSKFGGRPDFSRYNVLFAGDRNNSLALCDWSWLQETRRKIARKYCSPKVCSSNYGLLDSISSDELDVFLESLAAVTIRGFECEVQLKKQLLMACANMFIRFMCSTQFEYGDPKFQNMVRTFDEIFWDINQGYAVDFLPWLKPFYAGHMRKLSKWSTQIRRFIMDTVVSKRYAADDVDEQEPIDFTDALLMSLRKEPGLKMNHVLFELEDFIGGHSAVGNMIMLALSMVATRPHVAQAIRDEAEQVTGGQRLVRLYDKPDMPYTEATLFETLRFISSPIVPHVATEDTTIKGFKISKGTCIIINNYEINTSPAYWDNPEVFDPNRFVHRESGTKPCIRKPEYFLPFSTGKRTCIGQQLVSGFGFVLLAGILQRYEVKATAQLAIPEARLALPPDTYPLILKPLDGSR
>CYP307A3 cLOC100160738
52% to CYP307A1 Tribolium,
50% to Cyp307a2 Drosophila melanogaster
37% to CYP307B1 Tribolium
MDTTNGIVAGADTVTVALSLLLPVVLLMLAVAWACGPLAAHRRPGTSTAAVLDGPKSFPIIGSLHAMDGHQDSPFRRFTELSHQYGPVFAMTMGSMPCVVVNDYDSIKEVLITNGSKFGGRPDFTRYNALFAGDRNNSLALCDWSSLQETRRKIARTYCSPKVYSSNYCLLDSISSNELDVFLDSLATVSVRGSECEVQLKQLLLMASANMFIRFMCSTQFEYGDPEFQNMVRTYDEIFWDINHGYAVDFLPWLKPFYAGHMRKLSKWSTQIRQFIMDMVVSKRSSYAKAQEPTDFTDALLMSLRKEPGLKMNHVLFELEDFIGGHSAVGNMVMLALSMVATRPHVAQAIRDEAEQVTGGQRLACLYDKPDMPYTEATLLETLRFISSPIVPHVATEDTTIKGFKISKDTCIIINNYEINTSPAYWDNPEVFDPNRFVHRKFGAKPCIRKPEYFLPFSTGKRTCIGQQLVSGFGFVLLAGVLQRYEVKATAELAIPEARMALPPDTYPLILKPLDGSR
>CYP307C1 LOC100159333
46% to CYP307B1 Tribolium,
43% to CYP307A1 Tribolium, 40% to CYP307A2 D. melanogaster
42% to LOC100160204, 40% to LOC100160738
MEFVFSSLTYLLLFVLTAVLLFLIRDELKTKQVDHRAGLVDPPAPKAWPIIGHLYLMARYKVPYRVFDEIMADLGSVFRLDLGSVPCVVVNGLNNIREVLMIKGDHFDSRPSFRRFNQLFKGDKNNSLAFCDWSQLQKTRRELLRAHTFPNTTSNMYTRLDTCLKTELADLTDTLDTMANTECVDIKNMLLHTCANVFMSYFCSTRFSRSYDKFREFIRNFDDVFYEVNQGAPCDFLPSLMPLYHWHFKKIRSWSSKIRNFMETEIFNKRKAAWVPGTKPVDFVDNLLDAVTQPDRDDGFDMDIGLFSLEDIIGGHSAITNFIVKTLGFLVDRPDVQRRIQEESDAVVRASGSVGLSDRSQMPYTEAVVYESLRLIASPIVPHLANRDTSVDGVRIRKGTTVFLNNYSLHMSPELWNNPEHYSPERFINAEGRLEKPEYFIPFSGGKRSCMGYKLVQLLSFCTISTLLNKYTLLPVEDVSYAVPKGNLALPFVTFPFRLRPRNFRKQ
CYP3 clan
>CYP6CY1 LOC100159226.pro SCAFFOLD10025:26656..31321 (+ strand)
82% to LOC100163313.pro (adjacent), 77% to LOC100168115.pro (2 genes away)
MFTASWWINVITPCTIIVTITYYFCVSTFKKWEKLNVPYIKPIPLFGNFLNVALGKNHPLEFYNKIYHEFAGQKYAGVFQMRTPYLMVRDPEIINDVMIKNFSSFPDRGIYSDFVAEPLTNNLLLMENPQWKIIRNKLTPAFTAGKLKTMYDQIKECGDELMKNIDIDLNRTSNEIEVKDIMGKYSTDVIGTCAFGLKLNAINDDESPFRKYGKLIFKPSLRVLMRELCVMITPALLKVVRLKKFPTAATDFFHAAFNETMTYRLENNIVRNDFVHYLMQARNDLVLNTDLPKHEKFAESQIVANAFVLFAAGFETVSSAISYCLYELALNKSIQDRVRKEIQLQLSKNNGQINHELLIDLNYLDMVIAETLRKYPPLVALFRKASQTYRVPNSSLIIEKGQKIIIPIYAIHYDNKYYSDPEKFIPERFSAEEKAKRPSGVYLPFGDGPRICIDSEWLRNVLMLQ
>CYP6CY2 LOC100163313.pro SCAFFOLD10025:31376..37086 (+ strand)
MFTANWWINFITPCTIIVTIAYYFCVSTFKKWEKLNVPYIKPIPLFGNLLNVAVGKDHPLDFYNKIYHKFAAHKYAGVFQMRTPYLMVRDPEIINDMLIKDFSSFPDRGIYSDFVAEPFSNHLFFMENPQWKIIRNKLTPAFTSGKLKMMYDQIKECGDELMKTIDIELIKNDDEIEVRDIIGKYSTDVIGTCAFGLKLNAIKDDESPFRKHGKTLFEPSLRALFKELCLMIAPALLKVIKVKDFPTDATDFLHTVFKETITYRQKNKIVRNDIFQCLIQVRNDLVLNADLSKNEKFTETQIVANAFAMFAAGFETVSSAISYCLYELALNKSIQDRVREDIELKLSNNDGQINHELLIDLNYLDMVIAETLRKYPPVVALFRKASQTYRVPNDSLIIEKGQKIIIPIYALHYDSKYYTDPEKFIPERFSAEEKAKRPSGIHLPFGDGPRICIGKRFAEMEMKLAFVEILTKFEVFPCEKTEIPLKYSNKVFTLMPKHGIWLRFKRIN
>CYP6CY3 LOC100168115.pro SCAFFOLD10025:41696..46319 (+ strand)
37% to CYP6A14, 38% to CYP6Y1, 39% to CYP6AQ1 bee
42% to CYP6AX1
MAYDLNEIKIKLNYVNIINVIIVVLYHYLFKIRFREKKYAEKYPLIEFDTKRENSVDSCVFCVGTLFNIQNCVNYRKGHMYPSSSSATDWWIYIVTPCLVAVTITYYFCISTFNKWEKLNVPYIKPIPLFGNFLKVALAKDHPLEFYDKIYYKFSGLKYGGLFQMRTPYLMVRDPEIINNMLIKDFSSFPNRGIYSDLAANPLSDNLFFMENPRWKTIRSKLTPAFTSGKLKIMYDQIKECGDKLMKNIDNDLKGKNDEIEVRDIMGKYSTDVIGTCAFGLKLNSISDDESPFRKYGKSIFIPSLRTLFRELCLMVSPALLKVVRVKDFPTDATAFFNAAFKETITYRLENKIVRNDFVNCLMQARNDLTLNTNLPKHERFSESQIVANAFVMFAAGFETTSTTLSYCLYELALNIHIQDKVRQEIQLKLSKSDGQIDNEFLMGLNYLDMVIAETLRKYPPLIALFRKASQTYRLPDNLILEKGQKIVIPIYSIHFDSKYFEDPLKFNPERFSSEERAKRPNCVYLPFGDGPRTCIGKRFAELEMKLALVEMLTKFEVLPCGKTEVPLKYSNKALTLMPKHGIWLRFKKIV
>CYP6CY4 LOC100164042.pro SCAFFOLD10025:137606..141395 (+ strand)
87% to LOC100167264.pro, 85% to LOC100163313.pro
80% to LOC100168115.pro 90kb downstream of LOC100168115.pro
MFTANWWINVITPCTIIVTIAYYFCVSTFKRWEKLNVPYIKPIPLFGNFLNIALGKDHPLEFYNKIYYEFAGRKYGGLFQMRTPYLMVRDPEIINDVMIKDFSSFPDRGIYSDFTANPLSNNLFFMENPQWKTIRNKLSPAFTSGKLKTMYDQIKKCGDELMKNIDIDLNKNGNEIEVRDILGKYSTDVIGTCAFGLKLNAISDDESPFRKYGKSIFTPSLRMLFRELCLMITPALLKVIRVKDFPTAATDFFHAAFKETMTYRIENKIVRNDFVHCLMQARNDLVLNTDLPKHEKFTETQIVANAFVMFAAGFETVSTTVSYSLYELALDKSIQDRAREEIQLKLSKNDGQINHEFLMDLNYLDMVIAETLRKYPPLVALFRKASQTYRIPNDSLIIEKGQKIIIPIYAIHYDTKYYPEPEKFIPERFSVEEKAKRPSGIYLPFGDGPRMCIGKRFAEMEMKLAFVEILTKFEVFPCEKTEVPLKYSNKVLTLMPKHGIWLRFNRIN
>CYP6CY5 LOC100167264.pro SCAFFOLD13514:28216..31057 (+ strand)
MFTANWWINVITPCTIIVTIAYYFCVSTFKKWEKLNVPYIKPIPLFGNFLDIALGKAHPLEFYGKIYNEFAGRKYGGLYQMRTPYLMVRDPEIINDMLIKDFSSFPDRGIYSDFVANPLSNGLFFMENPQWKIIRNKLTPAFTSGKLKTMYDQIKECGDELMKTIDMDLIKNGKEIEVRDIMGKYSTDVIGTCAFGLKLNAINDDESPFRKHGKSIFTPSLRSLFRELCLMVTPALLKVVRVKDFPTDATDFFHAVFKETITYRLENKIVRNDFVQCLIQARNDLVLNADLPNHDFVLEKFTESQIVANAFGMFAAGFETVSSTISYCLYELALNKSIQDRLRKEIQLKLSKNDGQINPEFLMDLNYLDMVIAETLRKYPPLVALFRKASQKYRLPNDSLIIEKGQKIIIPIYALHYDNKYFTDPENFIPERFSAEEKAKRPNGIYLPFGDGPRICIGKRFAEMEMKLAFVEMLTKFEVFPCDKTDIPLKYSNNVITLVPKHGIWLTFKRIN
>CYP6CY6 LOC100165240.pro SCAFFOLD13514:36706..45131 (+ strand)
downstream of LOC100167264.pro
MFTDNWWIYVITPCTIIVTIVYYFCVSTFKKWENLNVPYIKPVPLFGNFLNVALGKEHHIDFYNKFYHKFAGHKYAGVFQMRLPILMIIDPEIINDVLIKDFSSFPNRGFSVDFKANPLSNNLFLMENPQWKIIRNKLTPAFTSGKLKVMYDQIKECGEELMKNIDIDLKKSGDEIEVRDIMGKYSTDVIGTCAFGLKLDAINDDESPFRKHGKSIFAPSLRQLFREMCMLISPVLVKVVRVKDFPKDATDFFHAAFKETMKYRHENKIVRNDLVHCLMQARNDLVLNTDLPKHEIVLEKFTESQIVANAFIMFAAGFETVSSAISYCLYELALNKSIQDRVREEIQLKLSKNDGQINHEFLMELHYLDMVLAETLRKYPPLVFLMRKALQTYRLPNDSLTIEKDQKVIIPVYAIHHDSKYYPEPENFIPERFSTEEKAKRPNGTYMPFGDGPRICIGKRFAEVEMKLAMVEMLTKFEVFPCEKTEVPLKYSHKTITLMPKHGIWLKFKKIN
>CYP6CY7 LOC100160895.pro SCAFFOLD17283:187176..190201 (+ strand)
MIDVISCSIIGLLSSVYILYATVFLSIAYYLCTSTHDKWRKLNVPYTKPLPLFGNSMNLVLAREHPMDFFTGLYNRFPDEKLCGFYQMTTPFLMIRDPKLINNIMVRDFSYFTDHGFDTDPSVNILANSLFMLNGDRWRTMRQKLSPGFTSGKLKDTHDQIKECTDQLINIVDDNLKVSDHFEIRELVGNFSTDVIGMSAFGLKLDTIRNGNLDFRKFGKKIFQSDFKQLFVQAMMLFCPKLVTILKLKQFPDDAADFYGSMFRDVLEYRDRNNVIRNDVTQTLIQAKKDLVTNNDGDDSTSKNKWTEMDIVGNAILMFVAGAETVSITICFCLYQLALNKDIQDKLREEIVTTNAKHGGQLNNDFLTNLHYMNMVLEEVSRMYSITMILFRQATKNYEVPGQSLVIEKGQKIIIPAYCIHNDPKYYPNPGTFDPERFSTEEKAKRLNGTYIPFGDGPRLCIGKRFAELEMKLVLSKILLKYEVLPCEKTEVPINIRGAGSIVNPKNGVWLSFKPIVAN
>CYP6CY8 LOC100159248 SCAFFOLD5532:14697..15575 (- strand) and SCAFFOLD5532:7975..9925 (- strand) (seq gap in middle of gene)
MLIFANFWMDFIILVTVLFSIIYYYCTSTFNVWKKLNVPYVRPIPLFGNYLKVALGIENPMETYKNIYYELAGFQYGGMFQMRTPYLMIRDPEIVNNILIKDFSYFTDRGIHVDFKAEPLSEVLFLMENPRWKKLRSKLSPAFTSGKLKQMYSQIEKCGQDMIINIFAELKKNPNEIDIRDILAKYSIDVIGSCAFGLALNVASDDTSLFRSYGKTAFSETLRKYPLLFALFRVATKTYRVPNDSLIIEKGQKIIIPTFSLHFDPRYFSDPEVFNPERFSTKEKAMRPNGVYLPFGDGPRLCIGKRFAEMEMKLALVEILSKFEVEPCEKTEIPIQFSKLSVVVIPKDEKILLKLNPLSE
>CYP6CY9 LOC100161627 SCAFFOLD1099:1136..6561 (+ strand)
MSASQLLVDLAAGWWTVAVLALLAATVYHFCTSTFGYWRDRGVPYVRPTVPLFGNIGGLALGVEHQARMFGRIYDGFRGQRYGGFFQMRTPHLMVCDPALVNRVLIGDFAHFTDHGMYTAGPDENPLANGLFNMNGAQWKIMRQKLSPVFTAGKLRHMRGQVTECSEQLMRNVAADVPAGGGQMEIRDVLGKYSTDVIGTCAFGLHLNAINDERSSFRKHGKAVFAPSFRVLLKELAWMVTPALRRALRIGDMPPDAAQFFTAAFTDTMKYREEHGIVRDDFMQSLIQARTDLVVNKTEPSVEFLETDIVANAFILFAAGFETVSTAMSFCLYELALKKPIQDKVREEMNTTKKKHNAEIDNDFLKDLHYLEMVLAETLRKYPPLLTLFREATQDYQVPDDTFVIEKGTKVLIPAYAIHHDYRYYPDPETFDPERFSPEEKAKRPNGTYMPFGDGPRLCIGKRFAEMEMKLALTELLTTYEVEPCEKTDIPMRFSKRSLIITPENGIWLKFKPIHTSK
>CYP6CY10P LOC100159387 SCAFFOLD6634 coords:14145-18091 (- strand) pseudogene
upstream part not found in region up to next gene
47% to CYP6A13, 48% to CYP6AQ1 C-term only CYP3 clan
67% to LOC100168007.pro
MESQILSNAFGFFAAGFDTTSTSISYCLYELALKKNIQDRVREEIKLTKSKYNGVIDNEFLNDLHYLDMVIAESLRKYPLMFALFRVATKTYRVPNDSLIIEKGQKIIIPTFSLHYDPKYFSDPEVFNPERFSPKEKAMRPNGVYLPFGDGPRLCIGKRFAEMEMKLALVEILSKFEVEPSEKTMIPVQFSKLSVVVIPRDEKILLKLNPLSE
>CYP6CY11P SCAFFOLD12002:AUG4_SCAFFOLD12002.g13.t1 pseudogene
SCAFFOLD12002:309999..313284 (- strand)
86% to LOC100163195.pro not in the collection
MISCLIDFLLGTPAIAVTVLMAFVYYYTTNTYDKWLKLNLRYDPPWPLVGNTMKMVTLIE
HQLATIDGIYKRLAGEKYCGFYQTKTPFLMIRDPELINNILIKDFLNFANRGFHKDPALN IIANGLFFMEGPKWDVMRQKLSSGFTSGKLKLAHNQIAECSDELMRFIAAKMKENDQIEVK
*TMSKYSTDVIGTCAFGLSL
>CYP6CY12 LOC100163195.pro 39% to CYP6AX1
SCAFFOLD12002:306085..309474 (- strand)
MISCLIYVLFGTPAIAAVAVLAAILYYYTTNTYDKWLKLKVPHDPPWPLVGNTAKMMTLIEHQLTTIDGIYKRFSGEKYCGFYQMKTPFLMIRDPELINNILIKDFSNFADRGFHKDPALNIIANGLFFMEGPKWKMMRQKLSPGFTSGKLKLAHNQIAECSDELMRFIAAKMKENDQIEVK
ETMSKYSTDVIGTCAFGLKLDTVKNEGSDFRLYGRKILKLSFRFLLAEMVSPKILKLLGVAEFPPDASAFYESAFKEVIRYREENGIVRHDVAQSLIEARKELVLDSTDENGFTEQHIIANAILMFLAGFETVSSTLSFCLYHLALNQDVQEKIRDEMNSKLKQHGKINNDFLVNLHYTDMVLAETERMYVVTNALFREAVKTYHVPGDTLVIEKGTKIMIPIYSIHHDPTYYPEPYIFDPQRFSPEEKAKRQSSTYLPFGDGPRFCIGKRFAELEMKMVLSQIITTFRILPCEKTEVPLKLQNGLPMMVAKNGIWLRFQSISE
>CYP6CY13 LOC100167704.pro SCAFFOLD7563:47979..58025 (- strand)
only P450 on this 60 kb contig
MISWMFNCLIDSFTLICTTVIGLLFYYYSTSTYKKWRKANVPHTKPVPFFGNFFRSTLGFETINDTYHNIYKQFPDKKFCGFYQMRTPTLMIRDPELINNVLIKDFSHFTDHGLDMDPSVNFLASSLFFTRGQKWKIMRQKMSAGFTSGKLKLMHSQIKDCSKEMIDYIDRKSKTTDQFDMHDIMNKYATDVIGTCAFGLKLGSMKDEDNEFRKFTKLLFKPSFRLIFTNILSLISPKTSNILKIKTSSPEVMEYFTTSFQNVIEYREKNNMDRNDVAQTLMRARKELKFTEMDIISNAILMYLAGAEPVSDTLGFCLHELAINKHVQDKLRKHINTKRKEHGGEFTNDYLMDLHYADMVLTETLRKCNGTIVLFRKATKAYQVPDSSLVIEKGQQIIIPTYSIHHDPKYYTNPDVFDPERFSPEEKSKRPSSTELLFGDGPRFCIGKRLAELEMKLGLSEIISKFEILPCEKTENPVQLANAGGAIKPKNGIWLI
>CYP6CY14 LOC100161480.pro SCAFFOLD8603:3827..8475 (- strand) model short
MISYLTNLLFDYIFLSLIIVCTFLYYYTTSTYDTWRKLNVPFAKPVPFFGNIFKMFTGLERQVDAFGRIYQQFPDEKFCGFYQMSTPFLMLRDPELINTVIIKDFSYFTDHGIDMNPSVNVMARSLFFATGQKWKTMRQKLSPGFTSGKLKGTHEQIRECSDQLTNCIYEKSQKTDAIEVYELVGNTATDVIGTCAFGMKLDTINNDNSSFRQNVKKVFKPSGKVIFAQILGVLFPKIVKFLKLQTSPVDVDAVNFFHSVFGEVIEYRTKNDVVRNDLTQTLMKARQDLVVSSDYKGEEKYCELDIIANAMLLFTAGSETVTATASFCFYELALNKVIQDRLRDEIISSKIKHGGQLNNEFLEDLHYADMVLDXXIEKGQKILIPIYSIHHDPKYYPNPETFDPERFTAEEKSKRPNGTFLPFGDGPRHCIGKRFAELELKLILSKILTKFEISPCEKTEIPLQMNKERGITSPKNGIWLNFRPIVE
>CYP6CY15 LOC100163900.pro SCAFFOLD5222 coords:2410-5346
MYFLTDWLLDNFTYLSLIAVFTGFYYYSTSTYGKWQKLNIPYIPPVPLFGNAFRMVTKLECPMDMYDRLYKQFPDVKLLGFYQMTEPMLLIRDPELINAILIKDFPYFTDHGFVMDPSTTVMAKSLFFSNGQRWRTMRQKLSPGFTSGKLRDTYLAINECSNQMVSSIVEKLGKTDRLAIRSIISGFSNDVIGMCAFGIQLDSMNNEDSDFRRYSERIFEKTTKQIIVQAVTTIFPFVINLFKIQMFSAEATNFFRKVFADVINYREKNNIVRNDLTQTLLQARKELVLKENSTAEGIVFADQFTDDDIIGNAIVLFADGAETISSIVSFCLYELALNKEIQDKMRAEICSMKAKHDGQFNNDFLMDLRYTNMVLEETGRKYSIASILMREATKTYTLPDESFVIEKGQKLIIPMFSIHRDPKYYPDPLIFDPERFSKEQKSQRPNGIYMPFGDGPRMCMGKRFAELEMKLVLSNVLSKFEVLPCEETEIPLEITDETGVIAPKRDLVLKFRPIIED
>CYP6CY16 LOC100162372 SCAFFOLD1019:69426..73624 (+ strand)
MISFMTDWLHDNVTCLSLIAVLASFYYYSTSTYGKWRILNIPYVPPVPLFGNTTRMMLRLEHPIDMFERFYNSFPDVKLFGFYQMRDPVLLVRDPELINAILVKDFSYFTDHGIDLDSSTSVLANSLFFANGQKWRTMRQKLSPGFTSGKLKDTHGQINECSDEMVSGIVESIKKKTDQIDVKTITGGFSTDVIGTCAFGMKLDTIKNDDSDFRRYVKIMFQSTPKQMIVQVLLMICPWVIKVLKINMFSVEATNFFHNVFTDVFKYREEHNVIRNDLTQTLMQARKELVLKENSSIEDKFTDADIIGNAILMFTAGSETISSMLSFCLYELALNIEIQDRLRSEICSMKAKHDGHLNNDYLMDLYYTNMVLEETARKYSIAFNLMRVATKTYTLPDESFVIEKGQKLIIPMFSIHRDPKYYPDPLRFDPERFSTEQKSQRPNGIYMPFGDGPRLCIGKRFAESEMKLVLSNVLSKFEVLPCEKTEIPVNIRSMSGFITPKNGIVLKFRPIVEH
>CYP6CY17 LOC100164459mod.pro (N-term trimmed) SCAFFOLD2510:85936..93942 (+ strand)
MNDIKLLNGLVSGAVDSPDLLSRIGFRIPDKYSNRKRPLPLNKPIMISFLIDCLVNNVTCLSLIVIFTGSFYYYSTSTYNKWRKLKIPYVPPVPLFGNTFRMLARLEHPIDTFDKIYNHFPDFKLFGFYQMREPMLLVRDPELINMILVKDFLYFTDHGVDIDPSMSTLAKSLFFANGQKWRTMRQKLSPGFTSGKLKGTYCQINECSDEMVSSIVEAIGKKTDRIELKTITGRFSTDVIATCAFGLKLDSIKNGDSEFRRYVKILFQTTTKQAIILILSLICPRVVKILRLQFFSLEATNFFSKVFADVIKYREDHNVSRNDITQTLIEARKELVLKEISTTEDKFTDDDIIGNAIFLFSAGSETISSLVCFCLYELALNKEIQDKLRAEIYSMKAKHNGKLNNDYLVDLRYTNMVLEETGRKYSIAFNITRVATKTYTLPDESFVIEKGQKLIIPMFNIHRDPKYYPDPLRFDPERFSMEQKSQRPNGTYIPFGDGPRLCIGKRFAEAEMKLVLSKVLSKFEVQPCEQTEIPLDIRSGSGLLSPKNGLVLKFKPIIEH
>CYP6CY18 LOC100168007.pro SCAFFOLD1502:67256..70800 (+ strand)
39% to CYP6AX1
73% to LOC100164042.pro
MHRTTLISLGGLFMHNINLLSNNIIVSVVKMSLYNLYLYCLYGCNVIYKLDKFSGITIYHNENPSSIQYRITNDIFTLYLMKIAEANAVLTLAYVKLVVFCVFLIMFSFIFIWWINIITPCLFIFTITYYFCTLTYSKWEKINVPYIQPIPLFGNFLDVALGMQHPIDFYRKIYYELAGYKYGGLFQMRTPYLMIRDPEIINNVLIKDFSNFPNRGIYSDFSANPLSNQLFFMENPQWKIIRKILSPAFTSGKLKLMYDQIKECGDELMKNIHKNLTKTDNKMEVRDILGKYSTDVIGTCIFGLKLNAVSDDNSTFRKYGKSLFLPSLRTHLRELSLMITPALLNILRFKDFPADATEFFHSAFHETITYREKNNIVRNDFVQTLIQARNDLVLNKNIPQRERFLESQIVANAFVMFAAGFETVSTAISFCLYELSFCLYELSLKKHIQDKVREEINLKLSKNNGLINNDLLIDLNYLDMVLAETLRKYPPTFALFRKASQTYHVPNDSLTIEKDQKVIIPIYSLHYDPKYFADPEVFDPERFSPEEKSKRISGTYLPFGDGPRICIGKRFAELEMKLALVEILTKFETEPCERTEVPIRFSKKALITMPENGIWLTFKKITNQ
>CYP6CZ1 LOC100165972.pro SCAFFOLD1502:79066..88489 (+ strand)
downstream of LOC100168007.pro, 46% to LOC100168007.pro
MFEFVYELFDLKMLLVTAFLGAIYVYSTWTHSHWSKLGISSPSAPVPLFGHAMPSMLGQMHFMDVLHNLYKELGDQRFGGIYTMRTPQLLVKDPELIGHILIKDFNNFTDRGLYAGTHTNPLNNNIFFTRGERWKTMRQKLSPTFTANKLKYMNEQVKECSDGLLSTIGKNLDDDAGRIEIREMMAKYSTDVIGSCAFGLKLDAINDPDSEFRKHGKTVFQPSLRSKIRVAVIFMQPSLLSIFRVHHYSHRTIRFFHDAFQQTIEYREKHNEDRKDFVQHLMKAREDLVLNPNLKPE
EKFTEMDIVANAYILFIAGFETVSTSMSFCMYELALRKDVQDKVRKEILEVKSKYNGQMNSECLNELHYMGMVIKETLRKYPPLVTLNRVVTKPYVIPGTQIKLKIGTKIVVPVHAIHYDPKYYSDPEAFEPDRFSDENIHNIQPNTYMPFGDGPRFCIGKRFAEFEMKMALSEVLTNYEVMACDKTQIPIKYVIGSFVNIPESVWLKFRKVNT
>CYP6DA1 LOC100159636.pro SCAFFOLD17790 coords:18123-22112
41% to CYP6AX1
MSGVWSIPFVQLCAAAVLLVTFLGYMYLTYHYGKWTGLGVPHAAPSPPFGSLRDVVMGRVPLVDAIHSLYRRFDGQRYFGIYEGRQPLLVVCDPQLVHTIMVKDFRSFVDRNAGKVSFVHDKLFDHLVNLRGEQWKAIRAKLSPTFSAAKLKSMLGDINVCTARLIDNLNGQITKNSGIVDVSEASAQFTTDTIGSCAFGLDCNALSNPDSEFRRTGRAIFTPSLRSNLLNITRLVGFGRLLDVFRIRGMSGNIYDFFDNLLDTTMEQHKSGENTRNDFIALLVKLKDEEKQKEHGQKLFTDDILAANSFVFFVAGFETTASTISYCLYELAMNPEIQVKLRENIKKTLDANDGKLAYDTLKDMKYLDMVINETFRLHPPVPVLNRVCTQKYTITDSNITLNVGDKLIIPTYSLHHDSKYYSDPEIFDPERFTEENISSRPHGTFLPFGDGPRICIGLRFAMMEAKTGLAEILSKFEVSPCKETQTPIKIKPRSILLTPNESIRLSFKSIDQ
>CYP6DA2 LOC100168454.pro SCAFFOLD17790:22986..29493 (+ strand)
downstream of LOC100159636.pro
39% to CYP6AX1, 39% to CYP6a2
MICFSCWLEIVPIAAIASALLTYVYCTRYYGHWTALGVPHTKPAPLLGHFAGPTMGRESGTITVDTLYRRFVGHRYFGVYQLRHPMLVVRDPVLVHAVLATEFGSFHDRVMSRTSFEHDGLFNSLVNLRGDKWKAVRAKLSPTFTVAKLKAMFASLHVCTGQLTDKLLLLTSGGQGIVNVTDVSSKFTIDTIGRCAFGINCNTLFDSNTEFQRAGQAVFTPTLKSSVLNFMRLIDLGWLVDLFRLRSMPDLVYEFYLNLFQDTLELRKNEKEDRNDFVSILVKLRNDEKINNSRVELFTDDVLASNAFIFFAAGFETTASAMSYCLYELALNQDIQVELRKQIQHTLNENGGILTYDVLKDMKYLDMVLNETLRMHPPGPGLLRVCTKKFKIPDSDITLDTGMKVLIPTYSLHHDPAYYPNPELFDPLRFTEDNKALRPNGTFLPFGDGPRICIGLRFALMEAKTGLAEIISKFEIFPCKYTKIPIKLNPRSILLTPNEPISLLFKQIA
>CYP6DB1 AUG5s6612g1t2 XR_045850 40% to CYP6K1
missing about 60 aa at N-term
added
MISKEIIFVYTTCAIVVLFTAVYLYYRNIYSYWKKLGVYHLEPLFFFGNAKERVLFKKSFHEFHRD
MYFKFKGHRYAGYYLGRRASLVILDPEIIKCIMIKDFNHFTDRQTMRFRTSEYITEMLINLKGSKWKRMRGQLTPAFTSGKLRTMEHLVDVCCNNMSDFLNENIKSEQGYDLEMKDFFGKFTLDVIATCAFGVESNSLKDVNGGFASRVSKFASLSIMKRLTLYIVLLFMPGIARFVPLSFFNMEVIQFLANVIKEAKKCRKSTGQKRNDFLQLLLDSETDLDKNDKTKSKEDVLTEAQVVAQSVLFLIAGFETSSTLLTFTCYELAINQTIQDKLREEICSVLKRFNGKCTYEAMQEMPLLDMVLMETLRMHPPVAQLERVSTQDYTLPDSNLLLKKGMTVQIPVIGLHYDPEYYPDPYKFEPYRFSPEEKAKRSHYVFLPFGTGPRNCIGLRFALMSTKRGMVHLLKDFSIDLSKEMTVPYEYSKHSMLLKAKDGIRLSFNKLSA
>CYP6DC1 AUG5s9515g6t1p
41% to CYP6AM1 Hodotermopsis sjostedti
MVFFDSSLLNLVAYGIVISTTFYFYLRYRYTFWQRQGCPVPLKPHIIYGHTKEVTKMKTWVGKHYANIYYNTDGYKFVGFYQFQKPKLMLRDLNIIKDVFTKEFSTFPNRGIVFDDKLEPLTGNLLTLEGHRWKVLRNKLTPAFTIGKIKNMIDLIDGRAQEMVRVLEKSAVIGEQVEFKELLARFSTDVISIVAFGFETNSLTNPDAEFRRVGRMLFSTSLETIIRNALNALAPSLIGLLKVRSIKKEYADFFYNVVNDTVKYREENGIQRNDFLDLLMKIKRGQNLASDEDNFKFTMDVLAAQCFVWFIGGYETSSVTLTFTFFELAQNLDVQMRAQDEIDSVLSKYDGKLTYEILQEMPYLDMIVSEALRKYPPVPNLTRKAVKPYKLPNSDFTLDKGLQVVIPVYGIHNDPEYWPEPEKFIPERFTEEEKRNRPQYAYLPFGAGPRLCIGMRFGMMQVKVALFRILSTYNISLSKSMKLPMKMNPKTIPANPDGGMFLHITKRKN
>CYP6DD1 LOC100162594.pro SCAFFOLD16233:102870..132505 (- strand)
cyan region may be too long, 46% to CYP6AY1, 42% to CYP6M11
MMLYINFKFIQLRAIIPKHEAALYIRFI
MFPAVVIIVACCTTVILFLYKYTTYTYKYWKSKSVTFATPVPLFGNIKDHVTLKMTQGECLKNIYNDFPREKFVGMYQLQTPTLLLRDPETIRLFLVKSFAHFTDRGFSYDGHREPLTKHLVNLEGDTWKILRQKLTPTFSSGKIKSMLGLLQGCGVQLIEYMDATIESGKTEFEIRDLTAKFTTDVIGTCAFGLECNSLKDSQSEFRRMGCAVLNSSASLALAKMVRVFFPKLFKALKLRTFPAEVQQFFMGIVKQTIDFRNTNRVRRNDFIQLLLEIKNQNHNQENAIKSIELTEELIAAQVFVFFLAGFETSSTTLSFCLHEMAVNQDIQNRVYDEINETANMYGLPFSYEAISSMNYLEQCLKETMRKYPPVQALARVCTKQFRVPGTDLDLDVGTAVLIPVYAIHHDPQYYPEPDTFNPDRFAKDGDGGGGDNGRPSGVFLPFGDGPRICIGMRFAMLEMKLALAQFLHRYLVTLSDKSCTRIEFEPASFLSCPKGGIWLNVNKRKA
>CYP6-un1 AUG5s17796g1t1p SCAFFOLD17796:3506..66376 (+ strand) possible pseudogene
36% to CYP6BX1 Dendroctonus ponderosae C-term no Cys
MFVPRKNSRFSIIALSFLLRNLVADVNILAVRISFLFIPPSADVADKQVRLDMFEFVTGTIEQTAALVTGALYELARDQNIQNHLREHLDSVLDEHQEQVTIDQWXXXXXXXXXXXXXXXXXXNQTLRKYPLAAVVRRVATKPYVVPGTGGRGTIEPDSLIVVPVYELHHDAEHFTEPEKFQPHRFPGQLSSAYMPHGSGPQSYIGKHFVELEAKLVIAMLMSRYEVHVDSATPGTPPDPLDRKSFEGVRMTVANRSSVRESTMGASLRQFHMSNGVNSGDRKFLFF
CYP4 clan
>CYP4G51 LOC100164072
CYP4G like 64% to CYP4G16
MVTNVQGVNPLFALSAFNLFFYLLTPAIVLWYIYFRMSRKQLYDLASKIPGSEGLPLLGNALDFMQDPHTIFEKIYERSFEFEKNSPIKMWIGPRLLVFLTDPRDVEVILSSNVYIDKSPEYRLFEPWLGNGLLISTGDKWRAHRKLIAPTFHLNVLKSFVTLFNVNSRDTVSKLRKMGSSTFDIHDFMSECTVEILLETAMGVSKKTQKKSGFEYAAAVMKMCDILHMRHTNLWLKPDFIFNFTKYAKEQVGLLDLIHGLTNNVLAKKKEEFLKKKSLMKEVSDIPAASEEIVETSSTLEVEEVPYGNSFGQSAGLKDDLDVEDDGIGEKKRVAFLDLLIECSENGVVLSDEEVREQVDTIMFEGHDTTAAGSSFFLCLMGAHQDVQQKVVDELYSIFGDSDRPVTFQDTLQMKYMERCIMETLRMYPPVPIISRQIKEKVKLGEDITLPVGATIVIATFKIHRNEDVFPNPEVFNPDNFLPEKSASRHYYAYVPFSAGPRSCVGRKYAMLKLKIILSTILRNFKINSNLTEKDWKLQADIILKRTDGFKLSLEPRKSLAKTAA
>CYP4CH1 LOC100167623p SCAFFOLD15279:22883..50385 (- strand)
70% to LOC100163721
small duplication near C-term
MWYLTVITPIAILVVIIMILRLEGRRKRVLANKIPGPDGSIFIGMLPLFLQGPEQLILKGLKVYQKYEKSLFKVWVLNNLYIVLTRPEDIEMVLTNPKLQKKSKEYLVLQESIMGQGIFSIDDIKKWKSNRKMVSGGFNFTIIKSFIPIFYEESNVLNDILKQKCDLKSNECDISVPVSMATMEMIGKTALGVKFNAQNGGRHRFVENLQTAMHAWEYRISHP
WYLSKTLFQLSSVKKKHDQSQKIINEFTDEIINKKLDELNQNANN
KNKVETDDEDVCRKTKTVIEILLGNYHEMSHEQIRDELVTIMI
GGQETTAMANACAIFMLAHHPDVQNKVFEELQSIFSTGDHNRPPTYEDLQQMEYLERVIKETLRIFPPLPVFGRSLEEEMKIGEHLCPAGSTLMVSPLFVHSSGQYYTDPEKFNPDNFLPDTCRGRHPYSFIPFSAGYRNCIGIKYG
ILQMKTVISTLVRK
NTFSPSERCPTPKHLRVMFLSTLKFVDGCYVKIVPRTS*
>CYP4CH2 LOC100167777p SCAFFOLD9588:3006..11080 (+ strand) added blue
30% to CYP4G7
missing N-term, might be a pseudogene
YEKSMFKAWLFNKLYIVLTRPEDIEFVLASPKFLRKAKEYMVLQQSIMGQGIFTIEDINKWKINR
(2?)
XXXXXXXXXXXXXXXXXXXXXXXXVLAEILGD NSDSTSKECDISVPVSMATMEMIGKTALGVTFNAQKGGCNRFVENLLTAMHAWEYRITHPWYLSSTLFQFSSIKQKHDHSQKIINEFTDEIIKSKIVEINNSGSENGVNADDDDIGRNTKTLTKIFLENPHENMTLEQIRDELVTVMIGGQETTAMANACVVFMLAHHQDVQDKVFKEQESIFSIGDRNRPITYNDLLQMEYLERVIKETLRLFPPLPVFGRDLNEDTTIGDHLCPAGSTLIICPLFLHSSPQHYGSTAHGPDAFDPDNFLPEACHERHAYAYIPFSTGPRNCIGIKYAMLQMKTVASTLVRHHRFLPSDRCPTPDQLRLVFLTTLKLADGCYVKVEPRRPQ
>CYP4CH3 LOC100163721 SCAFFOLD9588:25696..30352 (+ strand)
revised middle
70% to LOC100167623p, downstream of LOC100167777p
MXXXXXXXXXXXXXXXXXILVITIVILILKGRKNRILANRIPGPNGWFLVGMLPLFLQGPEKLIKNILREYRIYEKHIVKFWLFNNLYIVLTRHQDIELVLGNPKFLRKSKDYMVLQESIMGQGIFSIDDIEKWKNNR
KMVMKGFNFTPTKSFIPIFYQEAN
VLAEILQEKCVLKSNECNISGPVSMATMEMIGKTALGVTFNAQTGGCNQFVEHLQTAMHAWEYRVTHP
PWYLNNTLFRFSSVKREHDRSQKIINKLTDEIIKQKIIELSQNINNS
EKKIESDNEECCQKSKTVLEILLGSSHKMDHEQIRDEIVTVMI (1)
GGQETTAMAITCTIFMLAHHQDVQNKVFEELQSIFVNGDRNVPPTYKDFQQMKYVEMVIKETLRLFPPLPFLGRRLDEDMKIGEYMCPAGAALIICPIFVQSSPLYYTDSEKFNPDNFLPDACGSRHSYAYIPFGAGLRNCIGIKYAMLQIKTVISTLVRK
IKFSPSERCPTPEDLRLMFLMTLKLVDGCYIKMEPRT
>CYP4CJ1 LOC100166760.pro SCAFFOLD7010:99236..111631 (+ strand)
MIFSNVIGALTSDSNTQWMALLSLVVLGVYFLFSDRFSENRGRQISLLPSITRSQWTSLILSLKLASFGPRDILPYFDNVIKKYGSLIHLKIIARHYIIINDPDDIKVLLSSVQHITKGPDYEMLEPWLNKGLLTSTDQKWHSRRKLLTNTFHFKILETYVPSLNKHSRSLVKNLINASDNGKSIADIDSHVTLCALDIVCETIMGVNLRTQEGKSMNYVKAIKNVSQILIKRIFTFWYWNEIVFNLSSIGREFRKSLKLLHDFTENVIRERRKILENVEQKKVDENGKKRIYSFLDLLVGVSKENPGAMTDKDIREEVDTFLFEGHDTSSIAITMAIIHLGLDQNIQNLVRDELYEIFGDSDRDATMEDLKAMTNLERVIKETMRLYPSVTGITRTLKQPLHLDKYTIPSKSVMVVVPHLLHRDKNIYPNPEKFDPDRFLPEQCNGRHPYAYIPFSAGPRNCIGQKFAMYQMKTVLSTILRYTNVETLGTQKSIVISTQLILRADYLPSGVRDFPVKSFSLV
>CYP4CJ2 LOC100164743.pro SCAFFOLD7010:120446..129462 (+ strand)
adjacent to LOC100166760.pro
MLETNVYNFVLIPLAILISYAIWSRLRKPLEYRQISSHVPSVTKNLWSELLFSCSIAMKHPRDLLPFFMEIFLNNGPVVHCNITGRSYVLLNDPDDIKILLSSTQYINKGPEYKMLKPWLNDGLLLSSGLKWQNRRKLLTNTFHFKTLDMYNPAVNKHAKVFTKKLLEACEDDKEISVMEYVTLCSLDIICETIMGTEMNAQKGKSIQYVYSIKSACRSVIDRVFKFWLWNDLIYRISESGRSFFKSIRVLHDFTDSVIKRKQSLLKTSGNTIVQPESKPAEKRKTKSFLDLLLDVLKDNPDQMTIKDIREEVDTFLFEGHDTSSISMTMTLLLLGMHQDIQDRAREELHSIFGDSDRDATMEDLNAMRYLDAVIKESLRLYPSVPSFTRELETTLQLENYKIPPMTTMVIFPYILHRNENIFPKPEDFIPERFLDEDNKSKFLFGYIPFSAGARNCIGQKYAMNQMKTVVSTVLRNAKIVSSGCKEDIKISMQLLIRIESLPKVIFRPL
>CYP4CJ3 LOC100162661.pro SCAFFOLD7010:131616..139794 (+ strand)
41% to CYP4C3 adjacent to LOC100164743.pro
MIEVNFYSVVLVPLAGLISYAIWSRLRMPVEYRQISSHVPSVTKSFWSEMVLSWKLAMLQPK
DILPFVTDLFKENGPVVHFNLSGRSYVLLNDPDDLKVLLSNTQYIKKGPEYEMLKPWLNEGLLLSSGQKWHNRRKLLTNTFHFKTLDMYNPSINKHSRILVDKLFEASANDDKEISIAEYVTLCSLDIICETIMGTEMNAQKGKSAEYVHSIKSACKSVIERIFKFWLWNDLVFRMSGSGQSFFKSIKILHEYTDNVIKSKRASLNNSGIEKIRSDSKFEKTKKKSFLDLLLNVLNDTPDQMSDRDIREEVDTFLFEGHDTSSIAMTMILVLLGMHPEIQDRARDELRSIFGYSTRDATMEDLNAMKYLEAVIKESLRMYPSVPAFTRELDKPLQLNKYIIPPMTTITVYPFILHRNEDIYPDAEEFIPERFLDEENKAKFIFGYLPFSAGARNCIGQKYAMNQMKIVVSTILRNAKFESLGRKEDIQISTQLIIRIESLPKMKFYKL
>CYP4CJ4 LOC100160629 SCAFFOLD7010:141476..155172 (+ strand)
adjacent to LOC100162661.pro
MIEVNFYSVVLMPLAGLISYAIWSRLRMPVEYRQISSHVPSATKTFWSEMVLSWKLAMMQPK
DILPFLTDLIRNNGPVVHFNLSGRSYVLLNDPDDLKILLSNTQNIKKGPEYEMLKPWLNEGLLLSSGQKWHNRRKLLTNTFHFKTLDMYNHSINKHSRILVDKLLDASANSNKEISIADYVTLCSLDIICETIMGTEMNAQEGKSVQYVHSIKCACKSVIERIFKFWLWNDLIYKISGSGQSFFKSIKALHEFTDNVIKSKRALLNNSGIEEMQSDKKTKKKSFLDLLLNVLNDTPDQMNDRDIREEVDTFLFEGHDTSSISMTMTLVLLGMYPDIQDRARDELHSIFGDSDRNATMEDLNAMKYVEAVIKESLRLYPSVPGITRELQTPLQLKNYIIPPMTTIAVYPFILHRSENIYPNAEEFIPERFLDEENKAKFQFGYLPFSAGARNCIGQKYAMNQMKIVVSTILRNAKFESLGSKEDIQISTQLVLRIESLPKMKFFNL.
>CYP4CJ5 AUG5s3079g1t1 SCAFFOLD3079:13926..21110 (+ strand)
64% to LOC100166760.pro
43% to CYP4CE1, 43% to 4V16, 42% to 4C1
missing the N-term
DVLPFLSSILKQHGSLVHIHLLGHSYVLLNDPNDIKVLLSSPQHINKGPEYGLLKPWLNKGLLTSGSQKWQMRRKLLTYTFHFKILETYISSFNKHAQCLTKKLENMASNNQRVSIYTHMTLCALDLVCDTIMGTELRSQEGKSLEYVEAINTVTDITIKRIFKFWLWNGSIFNLSQIGRDFNKSLKILHTFTENVIKEKRAKLESVNCLETEELSFGKKRVESFLDLLIGISKQNPEKMTDMDIREEVDTFLFEGHDTSSTAMTMAFIQLGLNQDIQNSVREELYSIFGDSDREATMADLKSMTYLDRVIKETIRLYPSVPSVTRMLRQHLHIKEYDIPPQTVVVVVPYLLHREEKHFPNPLTFDPDRFLPEHSINRHPYAFIPFSAGPRNCIGQKFAMYQMKTIISTVIRKMKIETLGSQDDIKISAQLILRPESLPDIKLTKIK
>CYP4CK1 LOC100161405.pro SCAFFOLD15733:160611..183325 (- strand)
47% to CYP4BT1 Pediculus humanus
MNYHDHLTTRSRHCTTTVEEINDVERDDYNRSPLPLSTSGILIRTVEPILATVSLVFDKSFILFACFSKFSLQYFNELYLFLCQLLVDLHFFRLLLEWHSKFGDTYQLWIGLRPFIAMANADHIQQILKSTVHIDKNLEYNLLLPFIGTGLVTSSGSKWHTRRKLLSPTFHQNILEGFLPLIEKQMKTLVKVLRKEVNNVNGFDIKPYAKLAALDTIGNTAMGCEINSQENSQLDYVKALDELTAIMQKRFITPWLKPNLLFNLTSLSKRQKACIDVIHTFTRKVIKERKDNFKLFNNQTSDANKNEIHYEKKPNRALLDLLIEVSEDGKVLSDEDIQEEVDTFMFAGVDTTSVTLSWVMYVLGKHPHVQDKIVEELNQKIPNFGDGNLTLNILSSLDYLGRTIKEVLRLYPSVPFIGRQIYQPLTIGDHTILPGTSIFINVFALHRNEKHFENPEKFDPDRFLKEKKNDRHRFAFVPFSAGSRNCIGQKFAMIVLKIAVATVIKTYRVKSIDPEEKLGLVGEIVLNALNGIHVTLEERT
>CYP380A1 LOC100165004 SCAFFOLD17282:20108..54426 (- strand) top half only
N-term is on SCAFFOLD11661:5770-6260 (-) strand
see EST EE264487.1 Myzus persicae to confirm N-term
MYGKLSLPELIIYASVALILALWFHWRWKHRYFLDLAEKLPGPPCYPLIGTTSMYTSTYD
ETIAKLKENAEKYNYEPVGTWIGPIHYVSVVKPEDIQ
IVLNNSRALEKGQLYSF
LKSLLGEGLLTASVDRWRKHRRIISYAFNVKFLEQLYPVFNEKNKTLIKNLRKNINSTQP
FDLWDYIISTTFDTICQTAMDYRINEKKKKRPPR
MFGKLSSPELIIYTFVALILALWFHWRWKHRYFLDLAEKLPGPPSYPLIGTTSMFTHTYD
(1)
ETIAKLKENAEQYNYEPVGTWIGPIHYVSVVKPEDIQ
(0)
IVLNNSRALEKGQLYSFLKSLLGEGLLTASVDRWRKHRRIISYAFNVKFLEQLYPVFNEKNKILVKNLRKNINSTQPFDLWDYIISTTFDTICQTA
MDYRINEKHNKTEFLDLMTTIANQLVKTVNRPYLYPSLFFSIYRSMSGLGEKLELINKLPLQLIDEKKIDFRSKIVESDSYPEEFTNEKKNKFKTFIDTLLEASENDPDFTNADIRDEVITMMFAGSDTNATTECFCLLLLAIHQDIQDEVYDEIYNVVRDSDRELTPEDTANFSYLEQVIKETLRMYPTISVFTRQLVEDVKVTNYVLPRGASVTISPIVTHHCPHLYPNPEAFNPDNFSIENVAKRHKYSYIAFSGGPRGCIGMKYAMISMKLMITEILRNFSVHTDIKLSDVRIKMNDAFTRKVGGYPITIRPRDRRPSYVRRNTRVA
>CYP380A2P LOC100167018.pro SCAFFOLD17282:20108..54426 (- strand) bottom half only, pseudogene
55% to LOC100165004
VIKEKKAEFDQRLKATNDKVDVTNNDDEKYSKLFLDILFELNNNGGNFSDSDIRDEVVTMMT
()
GGSETSAITICFCLLMLAIDQDIQ
DKVYDEVYDIFGESDHIITIEDTTRLVYLEQVLKETLRLYPVGPVLLREIREDLKI
()
FSNDYVLPKGTTCVISPIATHHSPDLYPNPWSINPENFSPENVAKRHKYSFIPFSGGPRGCI
()
GSKYAMMSMKVTVSTFLRHFSVHTDIKLTDIKLKIDLLMRSVHGYPVTIRPRDKRPTYYNMRNQNKQG
>CYP380B1 LOC100161319 SCAFFOLD11137:17306..28967 (+ strand)
39% to CYP4G14 C-term half
52% to LOC100165004, 50% to LOC100167889
see model below for N-term
MGYNLNDQRTLSEFVLAMKKVSELSKCIVKPWLYIDQIFAVYTYLTGLNVYMSQLNRVSLQIIRDKKLEFKSIKLQQSTDKSHEVVPEKKRNSTKVFLDKLLKLNDEGADFTDEDLKDEVITMTVAGSDTSAISECFCILLLAMHQDIQDKVYDEIYSVLGDSDREVIPEDIFRFKYLEMVLKESLRLFPPGAIFSRKINENVKLTNFELPKGSNVFVSPYVTHRCPQLYPNPDTFNPENFSAENEANRHKFSFLAFSGGPRGCLGVKYAMISMKLMMVAVLRRYSVHTDCKLSEIEMQIDLLAKKANGYPITIRPRERTQDR
>SCAFFOLD11137:AUG4_SCAFFOLD11137.g1.t1 = LOC100161319
MFAQMRMAIHNAAHALPMTKSELYFYASIVIFVVLWCRMRWQYRQFYRLADKLKGPPSYP
LKGSIFDLSTTPEKLMYNFKESAEKYNYEPVKLWVGPFFFVGVYKPEDVQIVLNSSKALE
KGMIYHIIRHAVGEGVFTAPMGKWKKHRRVIASIFSSKFLDQLYPIFNENNKKLVENISK
HVGETQPFDIWDYIISCNLNNVSQAA
MGYNLNDQRTLSEFVLAMKKVSELSKCIVKPWLY
IDQIFAVYTYLTGLNVYMSQLNRVSLQIIRDKKLEFKSIKLQQSTDKSHEVVPEKKRNST
KVFLDKLLKLNDEGADFTDEDLKDEVITMTVAGSDTSAISECFCILLLAMHQDIQDKVYD
EIYSVLGDSDREVIPEDIFRFKYLEMVLKESLRLFPPGAIFSRKINENVKLTNFELPKGS
NVFVSPYVTHRCPQLYPNPDTFNPENFSAENEANRHKFSFLAFSGGPRGCLGVKYAMISM
KLMMVAVLRRYSVHTDCKLSEIEMQIDLLAKKANGYPITIRPRERTQDR
>CYP380C1 LOC100162836.pro SCAFFOLD10061:3374..10265 (- strand)
small sequence gap
MIEIIVYIIVVIFIV
MWCYFKWHNRPFEKLAARMPGLPAYPFIGSLYTCIGVTSEQLRSRILDLVKDYNLGPIKCWMGPYFGVFIVRPEDIQIVLNSSNALQKGFVYNFFKVILGEGLFTAPIDKWRIHRRMISPFFNGKLLEQFFPVFIEKNRILIRNVAKQLNETQVFDLWDYIAPFAFDTICQNTLGYNIDTQTNKNECEFAKAIVKTLDLEGMRIYKPWLYPEFVFSMYLKLTGQQRVFETVRKFPLQVIKEKKAEFDQRKKLIDAKIDVTNSNEHQSKLFLDTLFELNNGGGNFSDSDIRDEVITMLAA
XXXXXXXXXXXXXXXXXXXXXXX
DKVYDEIYDILDDSDHMISIEDTTRLVYLEQVLNETLRLFPAGPMQLKEIQEDLKISSSDYVLPKGTMCVISPLVTHISPDLYSNPRDFNPENFSPENIAKRHRYSFIPFSGGPRGCIGSKYVMMIMKVTVSTFLRHFSVHTNIKLTDIKLKLDVLMRSVDGYPVTIQPRHKRPTYKRNKKPLR
>CYP380C2 LOC100160808 SCAFFOLD10061:16344..23495 (- strand)
49% to LOC100165004
MIEIIVVIFVM
MWCYIKWHNRPFEKLAARMPGFPAYPFIGTGFQFIGLTPEQIMNRILDYEKDYNLEPFKIWIGPYFGVFIVKPEDLQIVLNSSKALQKGCVYDFFKHVTGEGLFTAPVDKWRIHRRMISPLFNGKLLEQFFPVFIEKNRILIRNVANQLNETQVFDLWDYIAPFALDTICQNTLGYNLDTQTNKNGCEFAEAIVTTTDLEGMRIYKPWLYPEIVFSMYLKLTGQQRVFETVRKFPLQVIKEKKAEFDQRKKLIDAKIDVTNNNEHQSKLFLDTLFELNNDGGNFSDSDIRDEVVTMLTGGSETSAITVCFCLLMLAIHQDIQDKVYDEIYDIFDESDHMISIEDTTRLVYLEQVLKETLRLFSVGPLLLREIQEDLKIFSSDYVLPKGTTCVLAPIGTHLSPNLYSNPRDFNPENFSPENIAKRHRYSFIPFSGGPRGCIGSKYAMMSMKVTVSTFLRNFRVYTDIKLTDIKLKLGLLMRSVDGYPVTIRLRDKRPTYKRNKKPPR
>CYP380C3 LOC100158738.pro SCAFFOLD10061:35318..41755 (- strand)
MIEIIVYIIVIIFVVTWCYFKWHNRPFEKLASR
MPGPPAYPFIGTLYQFIGLTSEQIVSRILDYVKDYNLEPFKFWMGPYFGVVIVKPEDLQIVLNSSKALQKGYVYDFFKDIGGEGLFTAPVDKWRIHRRMISPLFNGKLLAQFFPAFIEKNQILIRNVAKQLNETQVFDLWDYIAPFALDTICQNTMGYNLDTQTNKNECEFAEAI
(seq gap)
VIKEKKAEFDQRKKLNDAKMDVTNSNEHQSKLFLDTLFELNNGGGNFSDSDIRDEVITMLIAGSETSAITVRFCLLMLAIHQDIQDKVYDEIYDIFDESDHMISIEDTTRLVYLEQVLKETLRLFSVGPLLLREIQEDLKIYDDYVLPKGTMCIISSIATHHSPDLYPNPWSFNPENFSPENVVKRHKYSFIPFSSGPRGCIGSKYAMMSMKVTVSTFLRHFSVHTDIKLTDIKLKLGLLMKSVNGYPVTIRPRDKRPTYKRNLKPLR
>CYP380C4 LOC100159590 SCAFFOLD17803:1..16255 (- strand)
missing C-term in a seq gap
MIEIIVYIIVVIFVVMWCYFKWHNRPFEKLAARMPGPPAYPFIGTLYGCIGLTSGQIVSRILDYVKDYNLEPFKFWMGPYFGVFIVKPEDLQIVLNSSNAFQKGFVYDFFKVILGEGLFTAPVDKWRIHRRMISPFFNGKLLEQFFPVFIEKNRILIRNVGKQLNETQVFNLWDYVAPFALDVICENTMGYNLDTQTNKNECEFAKAIVIKEKKAEFDQRKKLNDAKMDVTNSNEHQSKLFLDTLFELNNGGGNFSDSDIRDEVITMLAAGSETNAITVCFCLLLLAIHQDIQDKVYDEIYDIFDESDHMISIEDTSRLVYLEQVLKETLRLLPAAPFLLREIQEDLKIFSSDYVLPKGTMCIISPLATHRSPDLYSNPRDFNPENFSPENIAKRHRYSFIPFSGGPRGCI
>CYP380C5v1 LOC100162710.pro SCAFFOLD12542:91226..95809 (+ strand)
runs off the end of the contig
& = frameshift
74% to CYP380C2 LOC100160808
MIEIIAYIIGIVLVMVWCYFKWQNRRFEKLAAIMPGPPAYPIIGIGYTFFGSSEHVMSKIIDLVKEYNLSPIKLWLGPYFAVSISKPEDLQIILNNSKALQKDRMYDFFKYAVGEGLFTAPVDKWKRHRRMITPAFNAKLFEQFFPVFNEKNKILIKNVTKELNKTQMFDLWHYVAPAALDTICQTTMGYNLDTQSNNKECEFGEAIVM
(2)
ASEVAAMRIYKPWIYPEMVFSMYLKLTGHQRVFETVK &
KFPLQVIKEKKDEFDQRKKAINAKVDLANNKDENQSKLFLDILFELNNTGGNFSDSDIRDEVVTMMTGGSETSAITICFCLLMLAIHQDIQDKVYDEIYDIFGGSEETITIEDTTKLVYLEQVLKETLRLYPVRPVLLRELQDDVKIFSNDYVLPKGTTCVLCPITTHHCPVIYPNPWSFNPENFTPENVAKRHRYSFIPFSGGPRGCIGSKYAMLSMKVTVSTFLRHFS
VHTDI
>CYP380C5v2 aLOC100167486 SCAFFOLD4690:1..2605 (- strand)
96% to CYP380C5 LOC100162710.pro runs off the end of the contig
MIEQIAYIIGIVLVWSYFKWQNRRFEKLAAIMPGPTAYPIIGIGYKFFGSSEDVMSKIIDLVKEYNLSPIKLWLGPYFAVSISKPEDLQIILNNSKALQKDQMYDFFKYAVGEGLFTAPVDKWKRHRRMITPAFNAKLFEQFFPVFNEKNKILIKNVTKELNKTQMFDLWHYVAPAALDTICQTTMGYNLDTQSNNKECEFGEAIVM
ASEVAALRIYKPWLYPEMVFSMYLKLTGHQRVFETVKKFPLQ
>CYP380C6 LOC100167889 SCAFFOLD12103:1756..16564 (+ strand)
49% to LOC100165004
MEVSQDFPVSSLKHSAGGPRMTSTELTAYGVISFIVVLWCHYKWNRRHFERLASKMTGPPAYPIIGAGLEFVGTPQQVIERIIKLFDIYGSEPFKVWMGTSLGVTISKPEDVQIVLNSSKALEKDQFYKFFKNTVGEGLFSAPVHKWRRHRRLITPVFNANLLDQFFPVFNEKNRILTRNLKKELGKTQPFDLWDYIADTTLDIICQTAMGYNLDTQLNNESEFAEALTKASELDSMRIYKPWLHPDIIFSIYGKLTGLHNVYKTLHKLPNQVIKEMKETYAQRKIDNKSNTIDVNDDDKKRLKVFLDTLLDLNEAGANFSDEELRDEVVTMMIGGSETSAITLCFCLLLLAIHPEIQDKVYDEIYEVLGDGDQTITIEDTTKLVYLEQCLRETLRLYPIGPLLLRQLQDDVKIFSGDHTLPKGTTCIISPICTHHIPELYPNPWSFNPDNFDAENVSKRHKFSFIAFSGGPRGCIGSKYAMLSMKVLVSTFLRNYSVHTNVKLSDIKLKLDLLMRSANGYPVTIRPRDRRPTYKKNTHCSTVNL
>CYP380C7 LOC100168315 SCAFFOLD2534:44976..48195 (- strand)
MENIIQSVRDFRLTTSEVIVYQLIVCFVVIWCQFKWIRRNFESVAAKMKGPKGYPFIGSSFDFIGTPEQVMEKVLKIDDKYSPGPIKIWVGPYFGVIVIKPEDVQAVLNNSKALQKDRVYDFIKNIFGEGLLTAPVHKWRKHRRLITPSFNASLLNQFFPVFNEKNKILIRNLKKELGKTTPFDLWDYIAPTTLNLICQTAMGYNLDTQSEYGTEFENAMIKASELDSLRMKTPWLYLSFMFKLYLKLKGHSDVFNTLYKLPIKMIQEKKEAFAQRKILNKPSAVDVTDNEREKLKVFLDTLFELNEAGANFSDDDIKDEVVTMMIGGSETSAITICFSLLMLAIHPDIQDKVYDEIYEVFHDDNETITIEDTNKLVYLEQVLKETLRLFPVLPLVFRKLEDDIKIASDDLVLPKGTTCIISILGTHHFSESYPNPWTFNPENFNPENITNRHKYSFIAFSGGPRGCIGSKYAMMSMKVAMSTFLRNYSVHTHYTFDDIKLKIDLLLRSANGYPVTIQLRDRRPTYIRNKKL
>CYP380C8 LOC100165148p SCAFFOLD1571:20813..37353 (- strand)
also SCAFFOLD17147:5399..9863 (- strand) C-term
see EST ES224491.1 Myzus persicae for N-term
VLLLLNARYCRIEATMQSVSGFRLTITEVFA
YTIICTLAILWCRFKWNRRHLDRLAAGLE
GPPAYPIIGSALQFIGTPEEVLNNLVQLIEDYCPGPFKIWMGPYFGVAIVKPEDLQIVLN
SSRTLQKDRFYNFIKNIFGEGLLTAPVDKWRKHRRLITPSFNSILLNEFFPV
Y
P
T I C
F V V I
LWCRYKWNRRHLDKLAAGLKGPPAYPIIGSALQFIGTPEEPNLFQIVLNSSRALQKDRFYNFVKNIFGEGLLTAPVDKWRKHRRLITPSFNSILLNEFFPVYNEKSKMLIRNLKSELNKTQPFDLWDYIAPITLNLICQNAMGYNLDSQSKSGSEFEKAMIKASELDSIRVSKPWLYPSIMFSLYLKLKGYSNVFNSLYKLPLKMIHKKREEFAQKKIGNESNYLDVTDNERKHSKVFLDTLFELNEAGANFSYDDIRDEVVTMMIGGSETNAITLCFCVLLLAIYPSIQDKVYDEIYDVLGDGDQTITIEDTSKLLYLDQVLKETLRLFPVIPLILRQLQGDVKIISNNIVLPKGSTCYLSPLATHRDSDSYPNPTSFDPENFSPENIAKRHKYSFIGFSGGPRGCIGSKYAMLSMKVLVATFLRNYSVHTDCKFNDIKLRLDLLLRSSNGYPVTIRTRDRRP
VYKFKLEYI
>CYP380C9 LOC100162179 SCAFFOLD17731:37803..45605 (- strand)
N-term may be too long
MGLGDYVQLYNALETGAII
MQSVGEFRLAVSEVLLYSAIISVVVFWCSCKWNNRHINKLDSKMKGPPAYPIIGSALELLGTPELDKWRKHRRLITPLFNANLLSQFFPVFNEKNKILIRNLKKELGKTQPFDLWDYIAPTTLNLICQNAMGYNLDSHSQCGSEFEKAMIKASELDSIRIYKPWLFPNIFFSLFLRLQGQSNVFKTLKKLPLKMINEKKEVFAQKKIVKETIVMNNTDGEKKNLKVFLDTLFELNETGANFSDNDILDEVVTMMIGGSETSAITLCFSLLLLAIHPDIQNKVYDEIYDVLGDGDQTITTEDTIKLVYLEQVLKETLRLFPVLPLVIRKLQDDVKIISGNHLLPKGTTCYIAPLFTHRDCDSYPNPLNFNPENFSQENISKRHKYSFIAFSGGPRGCIGSKYAMLSMKVMMSMFLRNYSVHTNCKFNDIKLKLDLLLRSANGYPVFIQSRDRRPSYKLNKT
Mito clan
>CYP302A1 LOC100165806 dib 22-hydroxylase
MPSAKCFLGCTNVRYGARIVSILDFKSTLFQILRFSSTETTAVKEFNEIPGPTSLPLVGTLYQYLPVFGKYKFDRLHHNGLAKLRQYGPVVREDIVPGVSIVWIFKPEDIETLYRKEGRYPERRSHLALQKYRLSKPDVYNTGGLLPTNGSDWWRLRKAFQKHLSKVQCIKRYVDSTNTVVGEFIDRRIKRAELRDDFGPELSRLFLELTYYVAFDERLQRFKDEEWDSDSECSKLIKAAHDINSAIMKTDNGPQLWRKFDTPMYKSIQKGHEQIEKIALRVVNEKLISIKTTDSKTSLLGEYLSSDDTDFKDVIGMTVDTLLAGIDTATYSCCFGLYHLSSNPDVREKMFDESRALLPDNHTPVTDRVLERAVYAKAVVKEMFRMNPISVGVGRILPEECVFSGYRVPAGTVVVTQNQVSCRLEEYFRRPNEFLPERWIKGSAEYEPVSPYLVLPFGHGPRTCIARRLSEQFLQVVLIKIVRNFEMTWTGPKLDSESLLINKPDGPISIIFKTRD.
>CYP315A1 LOC100159616 sad 2-hydroxylase
MANRYCSLVLVNSTKKRFMSTSNLKTVITESKKEIPIVKGLPLVGTMFSILAAGGGRKLHEYIDKRHQKYGSVFREKLGSVDAIWISNPLDMKLLFAQEGKFPKHILPEAWLLYNDTYGQKRGLYFMNGKEWWKYRQIFNKVMLKDLNVNFIKSYKVVINDLLNEWELSNGQVIPNLIADLYKISISFMVAHLVGRVYDDCKNDLSNDINCLAQCIQKVFQCTVKFTVIPAKTSKLLKLNIWNDFVIAVDNSIESANNLVSKLMSLNGDGLLNSVLNVHDIPIDMIKRLMIDFIIAAGDTTAYSTQWSLYTLGLHKSIQNNLRHSLLKTDFLECDYLNNILKEVLRMYPLAPFITRIPPSDIYLTDHKIPANSLVIMSMFTSSRNGKYFNSPNEFIPDRWNRLKNNKYNGVNEPFATLPYGFGARSCIGQKMAHVQMCLTLSE
>CYP301A1 LOC100164600
MKNIRQFQIHSIRWRSTATQHAHSPHVSAGSPEALEVTNDLITAKHYSQVPGPTPWPIIGNTWRMLPIIGPYQISDLANVSYILYKQYGKIAKLGNLVGRPDLLFVYDADEIEKVYRQEGDTPFRPSMPCLVKYKSQVRGQFFGRLPGVVGVHGEAWREFRTKVQKPVLQPQTVKKYIQPIEEVSDYFIKRMQEMKNENSEMPADFDNEIHKWALECIGRVALDARLGCLNPDLPKNSEPQKIIDAAKYALRNVALLELKYPFWRYLPSTLWKKYVSNMDYFIEICMKYIDDAMLRLKNKSQSVNESELSLVERILANEPDPKTAYILALDLILVGIDTISMAVCSMLYQIATRPEEQEKIHQEILKILPNKDDKLDASKLEKMVYLKAFIKEVLRMYSTVIGNGRTLQKDMVICGYRIPKGIQLVFPTIVTGNMEEYVTDCKQFKPERWLKQSTDYIHPFASLPYGHGPRMCLGRRFADLEMQVFLAKLIRSHKLEYLHKPLEYKVTFMYAPDGELKFKMTERPTS.
>CYP49A1 LOC100161793
MSVLARRLRNLRITVDHANKSTEVFTSVSQGDVDFVKDYSELPGPKSLPLLGNNWRFMSYIGDYKVTEIDKLSLRLWKEYGDIVKIEKLLGRPDMVFLYDADEIEKVFRNEELMPHRPSMPSLNYYKHVLRKDFFGDLAGVIAVIKKIKNKDQEVPDDFLNEIHKWSLESIAKVALDQKLGCLEDEHAVDSDTQNLIDAINTFFANVPELELKIPFWKLFSTPTWRKYINALDTITNVTSKHINRSMDRLLSQKSFCPDSQSSLLQRVLSLDPSNPKLAQILSLDMFIVGIDTTSAALASILYQLSRHPDKQKKLREEIRTVLPNADSKLTSSKLEQLQYLKACIKETLRMYPVVIGNGRCMTKETIISGYKIPKGVQVVFQHYAISNSSKYFSQPDQFLPERWLKGSGYKHHPFASLPFGYGKRMCLGRRFADLELQTVVSKIFQNFEVKYEYGDLEYTVHPIYMPDGPLKFKMIED.
Three other mitochondrial P450s are paralogs of CYP314A1 shd 20-hydroxylase
>CYP314A1a LOC100167431 SCAFFOLD4030:67656..77815 (- strand)
45% to CYP314A1 Manduca sexta
MVQKNFWTKIGGACCIVVACITALVKLVLKYVVGTYSNVEYPSEAQQKIYKTIADIPGPRSLPVFGTRWIYWKFCLYKLNAVHLAYEDMFNRYGDIIREEALWNIPVISVKNRDFIERVLRQSGKYPIRPPNEVTANYRKSRPDRYTNTGLVNEQGEVWAMLRNKLTPELTSPRTIRRFLPEVNQLADDFNNLISLARDGNNVVKEFEAYCNRMGLESTCTLILGRRFGFLDGEISETATRLADSVTSQFRASQEAFYGLPLWKLIPTKAYKDFVASEDALYNIVSEIVDSALIDEQQSCTDVRSVFVSILQTSELDNRDKKAAIIDYIAAGIKTLGNTLVFLLYLVAKHPEVQEKIYNEISRLAPAGTSVTAEHLHKATYLRACITEAHRLKPTAPCIARVLESEIEYDNYRLPPGSVVLLHTGLACLDENNFKDATSYRPERWLDELTKKSPFLVAPFGCGKRMCPGKRFVDLELQIVLAKMVKQFEIDFEGQLKTEFEFLLTPVDSNFILRDRIC.
>CYP314A1b LOC100165833 SCAFFOLD10596 coords:34935-48215 (+ strand)
94% to CYP314A1a, 66% to CYP314A2
44% to CYP314A1 Manduca sexta
a recent gene duplication may include some flanking genes as well
MVQKNFWTKIGSACCIVIACITTLVKLVLKYVVGTYSNHENPSDAQQKIYKTIADIPGPRALPFFGTRWIYWKFCLYKLNAVHLAYEDMFNRYGDIICEEALWNIPVISVKNRDFIERVLRQSGKYPIRPPNEVTANYRKSRPDRYTNTGLVNEQGEVWAMLRNKLTPELTSPRTIRRFLPEVNQLADDFNNLISLARDGNNVVRGFEGYCNRMGLESTCTLILGRRIGFLDGEVSETATRLADSVTSQFRASQEAFYGLPLWKLIPTKAYKDFVASEDALYDIVSEFVESALIDEQQSFTDVRSVFVSILQASELDNRDKKAAIIDYIAAGIKTLGNTLVFILYLVAKHPEVQEKIYNEVSLLAPAGTPITSEHLHKATYLNACIIEAHRLKPTAPCIARVLESEIEYDNYRLPPGTVVLLHTGLACLDENNFKDATSYRPERWLDELAKKSPFLVAPFGCGKRMCPGKRFVDLELQIVLAKMVKQFQIDFEGQLKTEFEFLLTPVDSNFILRDRIY.
>CYP314A2 LOC100169172 SCAFFOLD543:7676..15068 (+ strand)
40% to CYP314A1 Manduca sexta
MALQKIIRKIWTSIKVTCFIVLACVTALVKFVSKNTLGIYRKFRKPADAQRRIYKTVADIPGPRSFPIIGTRWIYWKFGSYKLNAVHLGFEAMFLCFGDIIREETLWNSPVISVINRDCIEKVLRQSGKYPIRPPNEVIANYRRSRPDRYTNTGVSNEQGVIWNSLRKRLTSKMTSPDVVQGVFPEIKSMVDDFIHLLCQARNKNNIVKGFEGLSNRMGLESSCMLILGRRNRFLDRVVNETAMRLTDAVTTQFRASQKTFYGHPFWKIIPTKLYKEFIASEETFYEIMSEIIDFALSDETQSGISENSVFGSILRAPNMDMKEKKAAIIEFIGAGIKTFGNTLVFVLYLIAKHPEVQEKLYNEISRLAPADTPITNEHLKQAKYLNACIMEAHRYSPTAPCIARVLESQIIYDGYCLPKGTTVLMQTGLACLDERNFKDATSYIPERWMNKETYDSLFLVAPFGCGKRICPGKKFVELALKIVLAKMVKQFHIGYEGQLETVFEFILTPVNANFILRDRIN.
>CYP353B1 LOC100161881 is too short at the C-term
yellow is probable C-term exon
45% to Tribolium castaneum XM_969024.1 CYP353A1 mito clan
MTFRQFKPFSSIPEPKRWPLLGHTHLFIPKIGPYDSQHLTEAMGDIERMLGPVFKLMLGGKTMVVTTRVEEAKTLFAHEGKHPARPIFPALNLLRKKPFGTGGLVSE
(2)
NGVEWYRLRKAIAPLMSKNIYESYIPQHKKAAVDFIDYIKLNRNKDKCLKDMFYHLTKFSVE
(1)
AISIVSPGLRIKCLNTTMSECFVEAGNKFMDGLYNTLKEPPIWKFYKTNAYRNLESSHSTCKNFIDEYLKQTHEHNALVNAINTNSNLTNTDINLLVLEIFFGGIDA
(0)
TATTLAMTLFYISQDESVQKACEEDVLQGTNAYIKACIKETLRLSPTAGANARYLPKTTVIGGYEIPANTLVMAFNSLTSTKEKYFKAPLEYQPSRWLRNSNIQKFDPYASLPFGHGPRMCPGRHVAMQEMTILLSE
(0)
LIKNFKISLPAEHAKNIGMIYRMNRIPDSRIDIIFNNK*
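The exon-by-exon layout above can be joined into a single protein string with a few lines of Python. This is only a sketch: it assumes the parenthesised digits such as (2), (1) and (0) are intron phase markers sitting between exon translations, and that a trailing * marks the stop codon. The function name assemble_exons is hypothetical, and the example segments are shortened to the first six residues of each CYP353B1 exon for brevity.

```python
import re

# Assumption: a line consisting only of "(0)", "(1)" or "(2)" is an intron phase marker.
PHASE_MARKER = re.compile(r"^\((\d)\)$")


def assemble_exons(lines):
    """Join exon translations and collect the intron phases between them.

    Returns (protein, phases). A trailing '*' (stop) is dropped from each segment.
    """
    exons = []
    phases = []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        match = PHASE_MARKER.match(text)
        if match:
            phases.append(int(match.group(1)))
        else:
            exons.append(text.rstrip("*"))
    return "".join(exons), phases


# First six residues of each CYP353B1 exon segment, truncated for illustration only.
segments = ["MTFRQF", "(2)", "NGVEWY", "(1)", "AISIVS", "(0)", "TATTLA", "(0)", "LIKNFK*"]
protein, phases = assemble_exons(segments)
print(protein)
print(phases)
```

Run on the full segment lines instead of the truncated ones, the same function would give the complete predicted protein, including the probable C-term exon.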
Compare to
>gi|225029066|gb|GO270917.1| N4(1)G11 Suppressive subtractive hybridization screening of prediapausing Leptinotarsa decemlineata Leptinotarsa decemlineata cDNA, mRNA sequence.
Length=619
Score = 55.5 bits (132), Expect = 3e-07, Method: Composition-based stats.
Identities = 27/53 (50%), Positives = 35/53 (66%), Gaps = 2/53 (3%)
Frame = -2
Query 1   MCPGRHVAMQEMTILLSELIKNFKISLPAEHAKNIGMIYRMNRIPDSRIDIIF 53
          MCPG+ +A  E+ ILL ++++  K SL       IGM+YRMNRIPD  IDI F
Sbjct 480 MCPGKRLAENEIVILLKQILR--KYSLEVSDTSPIGMVYRMNRIPDRVIDIRF 328
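The identity and gap counts reported for the alignment above can be rechecked directly from the two aligned strings. The following is a minimal sketch, not part of BLAST itself: alignment_stats is a hypothetical helper, percentages are truncated to whole numbers to match the reported figures, and the Positives count is left out because it needs a substitution matrix.

```python
def alignment_stats(query: str, sbjct: str) -> dict:
    """Count identities and gap columns in two equal-length aligned strings."""
    if len(query) != len(sbjct):
        raise ValueError("aligned strings must have the same length")
    length = len(query)
    # An identity is a column where both residues match and neither is a gap.
    identities = sum(1 for q, s in zip(query, sbjct) if q == s and q != "-")
    # A gap column has '-' in either sequence.
    gaps = sum(1 for q, s in zip(query, sbjct) if q == "-" or s == "-")
    return {
        "length": length,
        "identities": identities,
        "gaps": gaps,
        # Integer division truncates, as in the reported 50% and 3%.
        "identity_pct": 100 * identities // length,
        "gap_pct": 100 * gaps // length,
    }


query = "MCPGRHVAMQEMTILLSELIKNFKISLPAEHAKNIGMIYRMNRIPDSRIDIIF"
sbjct = "MCPGKRLAENEIVILLKQILR--KYSLEVSDTSPIGMVYRMNRIPDRVIDIRF"
stats = alignment_stats(query, sbjct)
print(stats)
```

For these two strings the counts agree with the reported Identities = 27/53 (50%) and Gaps = 2/53 (3%).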