D. Nelson
Some names not previously assigned are now assigned. Some sequences have been
revised, while others are known but remain confidential.
Revised Feb. 9, 2007
D. Nelson
Some sequences are given placeholder names like CYP6AEaa until official names can be assigned.
There are 51 complete sequences; 9 more are nearly complete, missing only the N- or C-terminal or a small internal piece. This makes 60 genes with strong, intact or nearly intact assemblies.
Another 12, in closely related gene subfamilies, have all or nearly all exons accounted for, but the exon connections cannot be made because the small contigs do not overlap. This makes at least 72 genes.
There are 7 more partials that are currently less than half a gene. If these can be completed, that would make 79 P450s in silkworm, which is comparable to Drosophila.
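The running totals quoted above (60, 72, 79) can be re-derived from the four category counts. The short Python sketch below does only that; the dictionary keys and the function name are labels invented here for illustration, not terms from the page.

```python
# Counts as stated in the text; the category labels are illustrative only.
counts = {
    "complete": 51,
    "nearly_complete": 9,
    "exons_found_but_unjoined": 12,
    "partial_under_half": 7,
}

def running_totals(counts):
    """Return the cumulative gene count after each category, in order."""
    totals = []
    total = 0
    for n in counts.values():
        total += n
        totals.append(total)
    return totals

print(running_totals(counts))
```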
Parts of some genes were assembled from confidential sequences. The Beijing Genomics Institute has now set up a database with a BLAST server to make this data public, so I have included it here.
SilkDB: a knowledgebase for silkworm biology and genomics
Nucleic Acids Research 33, 399-402 (2005)
http://silkworm.genomics.org.cn/
July 20, 2004
Note: trees have been built and naming is being done, but some sequences are still uncertain. The old CYP301B sequence has had its C-terminal exon changed based on an Apis sequence and it is now CYP49A1; the old CYP49A1 is now CYP49A2.
>CYP4G22 old CYP4Gxx 64% to 4G19
AV404689 Bombyx mori prothoracic gland EST spans two contigs
AV405174 Bombyx mori prothoracic gland EST
AV404871 Bombyx mori prothoracic gland EST
BP183989 P5PG Bombyx mori cDNA clone
BP182770 cDNA clone NRPG1026
BP182055 NRPG Bombyx mori cDNA clone
BP183011 cDNA clone NRPG1327
BP183771 P5PG Bombyx mori cDNA clone
AU004478 EST
BP183321 NRPG Bombyx mori cDNA clone
BAAB01118673.1 BAAB01085960.1 84% to 4G20
MSYTNAENVVPTSTFSAINLFYVLLVPAVILWYAYWRMSRRRLYELADKLNGPPGLPLLGNALEFVGGSA
()
DIFRNIVQKSADYDHESVVKIWIGPRLLVFLYDPRDVEVILSSHVYIDKAEEYRFFKPWLGNGLLIST (1?)
GQKWRSHRKLIAPTFHLNVLKSFIDLFNANSRAVVDKLKKEASNFDCHDYMSECTVEILL
(1)
234
ETAMGVSKSTQDQSGFEYAMAVMKMCDILHLRHTKIWLRPDLLFKFTDYAKNQTKLLDIIHGLTKK 431 (0)
988
VIKRKKEEFASGKKPSNLNETATTSEPSTGKLTSVEGLSFGQSSGLKDDLDVDDDVGQK 1164
1165 KRLAFLDLLLESSQSGVAISDEEIKEQVDTIMFE
1266 ()
1340
GHDTTAAGSSFFLSMMGIHQDIQDKVIEELDQIFGDSDRPVTFQDTLEMKYLERCLMET 1516
1517
LRLYPPVPIIARQVNQEITL 1576 ()
1680
SNGKKIPAGTTLVIATYKLHRRPDVYPNPNKFDPDNFLPERSANRHYYAFVPFSAGPRSCV 1862 (1)
1960
GRKYAMLKLKVILSTILRNFRVISDLKESDFKLQADIILKRAEGFQVRLQPRKRMAKA* 2136
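Records on this page interleave peptide segments with bare integers and parenthesised markers such as (0), (1), (1?) and (). The Python sketch below separates those three kinds of token. It assumes the integers are positions on the contig and that the parenthesised values mark boundaries between segments; the page does not define either. The name split_tokens is made up for illustration, and the caller must pass only sequence-body lines, not the header or accession lines.

```python
import re

# One token is a parenthesised marker, an integer, or a run of residue letters.
TOKEN = re.compile(r"\((?P<marker>[^)]*)\)|(?P<coord>\d+)|(?P<pep>[A-Za-z*]+)")

def split_tokens(lines):
    """Split sequence-body lines into peptide text, coordinates and markers."""
    peptide, coords, markers = [], [], []
    for line in lines:
        for m in TOKEN.finditer(line):
            if m.group("marker") is not None:
                markers.append(m.group("marker"))   # "" for an empty "()"
            elif m.group("coord") is not None:
                coords.append(int(m.group("coord")))
            else:
                peptide.append(m.group("pep"))
    return "".join(peptide), coords, markers

# A few body lines copied from the CYP4G22 record above.
body = [
    "1165 KRLAFLDLLLESSQSGVAISDEEIKEQVDTIMFE",
    "1266 ()",
    "1517",
    "LRLYPPVPIIARQVNQEITL 1576 ()",
]
seq, coords, markers = split_tokens(body)
print(seq)
print(coords, markers)
```

Keeping the markers and coordinates instead of discarding them makes it possible to re-check segment boundaries later.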
>CYP4G23 old CYP4Gyy 58% to 4G19 59% to CYP4Gxx
CK508129 rswdd0_001928.y1 swd Bombyx mori cDNA
CK505752 rswcc0_009305.y1 swc Bombyx mori cDNA
AV401408 BP122820 BP123076 BP183420 CK509955 CK505752 CK505553
BAAB01050777.1 BAAB01096227.1 BAAB01156775.1 BAAB01003438.1
BAAB01073763.1 BAAB01110168.1 BAAB01050777.1
MTSLVDETEGYHVNSRVIFYPLLGLTTAIWILYRWQQNSHMHKLAELLPGPASIPIFGNALTLMRKNPHE
()
LVNLALGYAQTFGNVIRVWLGSKLIVFLVDADDIEIILNSHVHIDKATEYRFFKPWLGEGLLISS
(1?)
GPKWRSHRKMIAPTFHINILKSFVGIFNQNSNNVVEKLKSEVGKTFDVHDYMSGTTVDILL
()
ETAMGISRKTQDESGFDYAMAVMK
()
MCDIIHQRHYKFWMRSEIVFKLTSFFKQQTKLLGIIHGLTNK
()
VIKNKKETYLENKAKGIIPPTLEEFTHHSGEILANNAKTLSDTVFKGYRDDLDFNDENDV
()
GEKKRLAFWDLMIESSQNGTNKISDHEIKEEVDTIMFE
()
GHDTTAAGSSFVLCLLGIHQDVQARVYDELYQIFGDSDRPATFADTLEMKYL
ERVILESLRLYPPVPVIARKLNRDVTI
()
STKNYVIPAGTTVVIGTFMLHRQPKYYKDPEVFNPDNFLPENTQNRHYYSYIPFSAGPRSCV
(1)
GRKYALLKLKILLSTILRNFRTISEIPEKEFKLQGDIILKRAEGFQMKVEPRKRVPTNVAR*
>CYP4G24 old CYP4Gzz 71% TO CYP4Gyy
FIRST TWO EXONS ARE JOINED WITHOUT OVERLAPPING EVIDENCE,
BUT THEY ARE THE ONLY OTHER 4G N-TERM SEQUENCES WITHOUT A PARTNER
BAAB01149226.1 BAAB01135847.1 BAAB01073803.1 BAAB01199362.1
MSSLLNYNFEFDVPPIFHTLLLAIMIMWM
LHRWQQSSRLFRLGNKLPGPMALPLVGNSLLILGKKAEGW
= BAAB01031491.1
VKYALDYSEKYG
TVVRAWAGPKLVVFLTDANDVEVILNSQIHIDKSPEYRFFKPWLGEGLLISSG
= BAAB01016193.1
XKWRSHRKMIAPTFHINILKSFMGVFNENSKSVVKKLRSEVGKTFDVHDYMSCVTVDILL
ETAMGITKTTQDAASFDYAMAVMK
MCNIIHQRHYKVWLHFDAIFKLTSLFKKQRELLKTIHGLTNK (0)
VIKKKKFMYLQNKEKGIIPPTIEELTKIKDTNDSIMEDSAKTLSDTVFKGYRDDLDFNDEQDV
()
GEKKRLAFLDLMIESAQNHTCNISDHEIKEEVDTIMFE ()
GHDTTAAGSSFVLCLLGIHQEIQSKV
YDELFEIFGDSDRLVTFADTLQMKYLERVILESLRLYPPVPAIARKLTRDVQI
()
VTNNYIIPAGSTVVIGTFKIHRDPKYHKNPNVFNPDNFLPENTQNRHYYSYIPFSAGPRSCV
(1)
GRKYALLKLKVLLSTILRNYKTTSEISEDQFVLQADIILKRYDGFKIRIEPRNKNHSNTV*
CYP4L subfamily sequence
>CYP4L6 67% to 4L4
CK535040 EST 5aa diffs BP182824 EST
BAAB01121615.1 BAAB01097697.1 BAAB01093085.1 BAAB01107041.1
MYLILLICLLVLVLTVSWWSMLNRICKSNVPGPFPLPIIGNAHQFVVRST
(1)
EFLGLLKSFTDKYGDVFRVHFFSYPYVLISHPKYAE
(0)
932 ALVSSADLITKGRSYSFLKAWLGEGLLTAS 1018
(1)
1506
GPRWRLHRKFLTPAFHFNILQNFLPVFCKKSEILRDKIRRLADGQPIDLFPITALAALDNVAESIM 1688 (1)
GVSVNAQQNSESEYVRAIEX
()
LSQITTLRMQIPLLGEDFIFNLTSYKKKQNIALEVVHGQTKKVIEARRCELEKNNKTNISGTNE
()
IGIKNKHAFLDLLLLAEIDGKL
()
12 MDEQSVREEVDTFMLEGHDTTTSGI 86
85 LVYTLFGLSKHPDIQEKWYEEQLTIFGEEMDRTPAY ()
NELAQMKVLEWVIKESLRMYPSVP
264
265
LIERWITKDAE
VGGLKLSKGTSVVLNIFQMHRNPEVFEKPLEFIPERFDSLEHKNPFSWLAFSAGPRNCI
()
GQKFAMMEMKVTLSTLVRNFKLVPVDIEPILCADLILRSQNGVKVGFLPRTQSNSKT
CYP4M subfamily sequences
>CYP4M5 EST
BP125511
BAAB01099467.1
BAAB01097680.1 BAAB01039373.1
MFVYLIFIASFFLLIHLAFNY
NSKAVMMNKVPGPKLSFILGNAPEIMMLSSVELMKLARKFASRWDGIYRIWAFPLSIINIYNPDDVEVIVSTTKHN
EKSSVYKFLKPWLGDGLLISK
GEKWQQRRKILTPAFHFNILRQFSVIIEENSQRLVESLEKCIGKPIDIVPVVSEYTLNSIC
ETSMGTQLSDKTEDAWKAYKDAIYELGPYFFQRFTRVYLYFDIIFYLTSLWRKMKKPLKSLHG
FTSTVIKERKIYVEQNGVKF
GEDVNDDDLYIYKKRRKTAMLDLLIAAQKDGEIDDHGIQEEVDTFMFE
EHDTTASGLTFCFMLLANHRAVQDKIVEEINYIMGDSTRRANLEDLSKMKYLECCIKESLRLYPPVHFISRNLNEPV
VLSNYEIPAGSFCHIHIFDLHRRADIYEDPLVYDPDRFSQENSKGRHPYAYIPFSAGPRNCI
GQKFAMIEMKSAVAEVLRKYELVPVTRPSEIELIADIILRNSGPVEITFNKRTK
>CYP4M9
BAAB01103156.1 BAAB01142776.1 BAAB01178932.1 57% to 4M5
BY921981.1
EST N-term
MWMYFILILLLLLTIHFLLNYNYRARLLRRIPGPRGYFIVGNALDVILSPAELFASTRK
NAAQWPNLNRFWSFGIGALNIYGPDEIEAIISSTQHITKSPVYNFLSDWLRDGLLLST
GTKWQKRRKILTPAFHFNILKQFCVILEENSQRFTENLKDTEGKSINVVPAISEYTLHSII
()
675
ETAMRTQLGSETSEAGRSYKNAICELGNQFVHRLARLPLHNNFIYNLYTLGKQNKH
LNIVHSFTKKVIKDRRQYIRGNGGNNFDDEKDTQADEHSIYFNKKKTAMLDLLLKAERDGL
IDEIGVQEEIDTFMFE
()
1150
GHDTTATGLTYCIMLIANHKSIQVGALRKYID 1245 ()
295
GESTRAADIEDLSKMRYLERCIKESLRLYPPVPSMGRILSEEI 423 ()
1467
VLDGYTVPAGTYCHIQIFDLHRREDLFKDPLVFDPDRFLPHNTEGRHPYAYIPFSAGPRNCI 1282 ()
GQKFAILEMKSLLSAVLRRYNLYPITKPEDLKFVLDLVLRTTEPVHVRFVKRNKV*
CYP4S subfamily sequences
>CYP4S5 66% to CYP4S4 from the moth Mamestra brassicae
BAAB01177041.1
BAAB01181662.1 BAAB01048279.1 BAAB01025317.1
AADK01013707.1
MIWAVVLLIIFIFLIWKALEDEENPLDSLPGPERKPIIGATMRFVNLNT
()
3009
GEMFIKLREYHAMYGTRYVVKIFKRRILHLSNERDVE (0)
365 VVLSHSKNIKKSKPYTFLSPWLGSDLLLST 454
GFKWHSRRKILTPTFHFNILKSFLEIIKDKSCDLVKRLEEYRGEEVDLMPVISDFTLFTIC
ETAMGTQLDSDKSAETSEYKMAILQIGSLLLDRLTKVWLHNDFIFRQFTVGRKFQKCLKQVHSFAHN
VIVERKRQRASGRDPTVVAEDVFG
RKKRLAMLDLLLEAEEKNEIDFEGIMDEVNTFMFE
(1?)
GHDTTAVALTFSLMLVAEDDQVQ
(0?)
DRIYKELQGIFGDSDRRPTISDVAEMKYLEAVVKETLRLYPSVPFIAREITEDFML
()
DDLKIKKGSEVAVHIYDLHRRKELFSDPEKFLPDRFLNGELKHPYSFVPFSAGPRNCI
(1)
GQRFATLEMKCVLSEICRSFRLEPRTKGWRPTLVAEMLLRPNEPIHVKFIKRKQS*
>CYP4S6 94% to CYP4S5
BAAB01121123.1
BAAB01054179.1 BAAB01073078.1 BAAB01008096.1 BAAB01081659.1
AADK01014846.1
164
MIWTVVVLIIFIFLIWKALEDEENPLDSLPGPERKPIIGAALRFVNLNT 18 ()
REMFIKLREYHAMYGTRFVVKIFKRRVLHLSNEKDAE (0)
VVLSHSKNIKKSKGYTFLSPWLGSGLLLST (1)
1053
GFKWHSRRKILTPTFHFNILKSFLEIIKDKSCDLVKRLEEYRGEEVDLMPVISDFTLYTIC 1235
1966
ETAMGTQLDSDKSAETKEYKMAILQIASLLLNRLTKVWLHNDLIFDQLTVGRKFQK 2133
2134
CLKQVHSFAHNVIVERKRQRASGRDYSVVAEDVFGRKKRLAMLDLLLEAEEKNEIDFEGI 2313
2314
MDEVNTFMFE 2343 ()
2862
GHDTTAVALTFSLMLVAEDDQVQ 2930 ()
3158
DRIYKELQGIFGDSDRRPTISDVAEMKYLEAVVKETLRLYPSVPFIAREITEDFML 3325 ()
578
DDLNIKKGSEVVVHIYDLHRRKELFADPEKFQPDRFLNGELKHPYSFVPFSAGPRNCI 751 ()
GQRFAMLEMKCVLSEICRSFRLEPRTKGWRPTLVAEMLLRPNEPIHVKFIKRKQS*
CYP4AU sequences
>CYP4AU2
BAAB01192732.1 CYP325 like N-term
AADK01032226.1
1927
MIILLLLFFVASLYWLYWTNKSKRMNKMTASLPVPPTLPILGNATLFIG
>CYP4AU2
35% to 4C3 aa 112-276 37% to 4G19 BAAB01022101.1 exons 2,3,4
AADK01025934.1
(1)
AIGDPEDAQVVLENCLDKDVVYRFLRPWLGHGLFVAP (1)
VCLWKSHRKVLLPVFHNKVVEQYLQMISVQADILVERLNEKANKGEFDVLKYITACTLDIVF
(1)
ETAMGERMDVQRSPDTPYLRARHTVMTILNMRFFKVWLQPDCIFNLTSYSKQQKDNIDLTHKFTDE
(0)
>CYP4AU2 62% to CYP4AU1 I-helix to end BAAB01055413.1
AADK01029434.1
Note: N-terminal is known but confidential (10/18/2007)
Only missing 23-24 aa in the middle between the two segments
(1)
GKVRPVLDMLFGREIEFTDEQLREHIDSITIAGNDTTALVMAYTLVRLGIHQNVQEKVYLE
()
QRTIFGDSKRGADKVDVAQMQYLERVLKESMRLYTVVPIIARNVHKDTYL ()
PRCGVTLPAGIGAVVGPFAIHRSKSVWGPDADEFDPDRFLPERSLNRHPAAFLPFSHGSRNCI
()
GRNFGMLIMKSIVSTISRSYRIEADELGPLKIEMLLFPIRGHQIRISRRETLA*
>CYP4AU-like BAAB01077742.1 Length = 3282 68% to BAAB01022101.1 exon 2
AADK01002646.1 69% to CYP4AU2
2223
AICDTEHAQIILDCLNKDMVYRFLRLWLEHGLFVAS 2330
CYP4AX sequences 50% identical to CYP4S4
>CYP4AX1
CYP4qq 50% to CYP4S4 53% to CYP4Sxx lower case is from CYP4rr
BAAB01092965.1
BAAB01135526.1 BAAB01183645.1, AADK01005955.1
AADK01030377.1
(frameshift after GQKF in last exon)
MWFSLVVVAVIYALWKLFYKEDDPIDSLPGPTKLPIIGNMLDMFNMTP ()
GEKFKYERQLSKTYKQRYMQKIFYRRIVYVHHPDDVE ()
VVLSHSKNITKNVNYDFLKPWLGTGLLLST
()
GSKWFKRRKILTSAFHFDILKDFASLFEEKSRRLVDQLRANNGEPISLLPVMSNYTLFTLC
(1)
ETALGTKLDTDRSVAAAEYKDAISKTAQISIYRLPRIWLYIDAIFNRTSAGREFAKNVDIIHSFADN
VIVQRKEQRLNSLDKGLVERDEFNRKKRTALLDLLLEAEAKREIDLEGIREEVNTFMFA
()
GHDTTGTALTFSLMLLSDHEEAQ
()
ERILEEYNEVMRGKETPTLSEFAEMKYLDAVIKETLRLYPNPHRVGRVLTEDITL
(1)
GGVPIKAGTEIGVQIIDLHHREDFFPEPEKFRPERFLRGEIQHPYSFVPFSAGPRNCL
()
GQKF
AMLEIKSVLTHICNNFKLVPMKRNWRVETVSDIVLKPAEPIYIKFVPR*
>CYP4AX2
CYP4rr 94% identical to CYP4qq lower case is from CYP4qq
BAAB01152933.1
BAAB01194854.1 BAAB01058563.1 BAAB01102208.1 BAAB01008375.1
MWFSLLVVAVIYALWKLFYKEDDPIDSLPGPAKLPIIGNMLDMFNMTP
()
GEKFKYERQLSKTYKQRYMQKVFYRRIVYLHHPDDVE
(0)
VVLSHSKNITKNVIYDFLKPWLGTGLLLST
()
GSKWYKRRKILTSAFHFDILKDFASLFEERSRRLVDQLRANNGEAISILPVMNNFTLLTIC
(1)
ETALGTKLDTDRSVNTAAYKDAISKIGQICIYRLSRIWLYIDAIFNRTSAGREFAKNVDIIHSFADN
VIVQRKEQRLNSLDKGLVERDEFNRKKRTVLLDLLLEAEAKREIDLEGIREEVNTFMFA
()
ghdttgtaltfslmllsdheeaq
()
ERILEEYNEVMRGKETPTLSEFAEMKYLDAVIKETLRLYPNPHRIGRVLTEDITL
(1?)
GGVPMRAGTEVCVLTIDLHYREDFFPEPEKFRPERFLRGEIQHPYSFVPFSAGPRNCL
(1)
GQKFAMLEIKSVLTHICNNFKLVPMKRNWRVETVSDIVLKPAEPIYIKFVPR*
&&&&&&&&&&&&&&&&&
>AM106362 Lutzomyia longipalpis Jacobina = best EST to these genes
MWGKFFFICDGLFWAVFFSPPLRGGLGSFFRGGGRKFPGPFYGS
PSLGRLPAQMGFSQKGFLGLPKVILSKITQSNENFGWAEAVC
()
AYFPPEECQYVFNANECLSRDDIYDYIKPFTGDGLVTLP
()
AETWKDHRKFLNPCFNLKILQSYMPIFNTEVKTLIGRLGQRIGKGSFDMYDYMDACALDVVC
()
QTTLGTQMNIQKNENMDYLDAANSLLATMTTRIFNPLYHSDFIFNLSKWAKMEQKNSDITFGFVDNILQR
KKAAYKKFQPSDEQNNLDEGTSFKSPQLFIDQLLKLSMEGKYFTDTDVKNEANTIVA
CYP6B sequence
>CYP6B29
51% TO 6B1 CK494244 EST 57% to 6B27 aa 144-500
BAAB01105883.1
BAAB01064450.1
2461
MAIIYILSASVVLPLLLYLYFTRHFNYWKKRNVPGPKPVPLFGNLMELALRKKNIGIVFKELYENFPNEK 2670
2671
VVGIYRMTTPCLLIRDLDVIKNIMIKDFDVFVDRGVELS*SGLGANLFHADGDTWRV 2841
2842
LRNRFTPLFTSGKLKN
MLHLMIERANKYIEHVEMLCDHQPEQDIHTLVQKYTMATIAACAFGLDIDTTDPNKDQLK
TLEEIDRLSLTANFAFELDMMYPGVLKKLNSTLFPGFVSRFFKDVVKTIIEQRNGKPTDR
NDFMDLILALRQLGDIQATKRNSEDKEYSIELTDELIEAQAFVFYIAGYETSATTMTFML
YQLALNPDIQDKVIAEIDQGLKESKGEVTYEMLQNLTYFEKAFNETLRMYSIVEPLQRNA
KIDYKIPDTDIVIEKGTTVLFSPLGIHHDEKYYPNPSKFDPERFSPANISARHPCAHIPF
GTGPRNCI
()
661
GMRFAKIQSRVCMVKMFSKFRFELAKNTPRNLDIDPTRLLLGPKGGIPLKIVRR* 825
CYP6AB subfamily sequences
>CYP6AB4 64% TO 6AB1 OVER WHOLE SEQ BAAB01081811.1 BAAB01162324.1
BAAB01031011.1, AADK01009517.1
MLTAAIFVIIVALVYLYSTRTFRYWEKRGIKHDKPVPFFGTDSEGYLLRKSMTQTAVDAY
LKYPNEKVIGFFRSSRPELIIRDPDIIKRILTTDFAYFYPRGLNPHKKVIEPLMRNLFFA
DGDLWRLLRQRMTPAFTSGKLRAMFPLVVERAERLQSRTLEIASQPLDARELMARYTTDF
IGACGFGLDADSLNDEDSAFRKLGAAIFNITVQQAIVAALKEIFPGIFKHFKYSSKYETD
FMSLVSSILKQRNYKPSGRNDFIDLLLECRMKGEIVVESIEKMKPDGTSEVVRMELTEQL
LAAQVFIFFAAGFETSSSATSFTLHQLAFHPEIQEKVQKELDQVLAKYNNKLCYDAIKEM
RYLESAFKEAMRMFPSLGFLIRECARQYTFPELNLTIDEGVGIIIPLQALHNDPEYFDSP
NEFRPERFMPSEYNHNKTKFVYLPFGDGPRGCI
()
GARLGLMQSLAGLAAVLSKFTVKPAPSTKRHPVVEPKSSVVQSIKDGLPLLFIERTKS*
>CYP6AB5 BAAB01021567.1 BAAB01206787.1 58% to 6AB3 72% to 6AB2
56% TO CYP6ABxx
MISSING C-TERM BEYOND A REPEAT SEQ.
possible C-term = BAAB01196051.1 66% to 6AB2
Bmb030331 from Li Bin
MFFYLLIVILITLYYYGVR
TFDYWKKKGVNHDPPLPFFGNNLRQFMQKASMAMMATETYKKYPEEKVVGFYRGTSPELV
VRDPELIKRILVTDFSSFYARGFNPHKKVIEPLLKNLFFADGDLWRLIRQRFTPAFSTAK
LKAMFHIITERAEKLQMIAENEAYENFCDVRELMARYTTDFIGACGFGLNIDSLSDENSQ
FRKLGKRIFKRDLSDAVRAALKLMFPELCKHLTFLTPELEKSMTYLVQNVIREKNYKPSG
RNDFIDLMLELKQKGKLLGESIEAKNANGTPKQVELEFDDLLMTAQVFVFFGAGFETSST
ASSYTLHQLAFNPECQEKTQKEIDEVLSKHNNKITYDAIKEMTYLEMAFNEAMRLYPSVG
YLVRMCTVPEYTFPEINLTINEDVKLMIPIQAIQKDEKYFKDPERFHPERFSSGAKANLK
PYTFLPFGEGPRACV
()
923
GERLGQMLSMAGLVAVLQKYTVEPVEISLRDPIPDPTTTVSEGFVHGLPLKLRRRERRI* 1102
>CYP6AB8 96% identical to CYP6AB4 20 aa diffs
whole seq known but confidential
CYP6AE sequences.
>CYP6AE2 Bombyx mori (silkworm) old CYP6AEcc BAAB01091139.1
contig444108, frameshift at 1529, whole gene in one contig
No accession number
Junwen Ai
submitted to nomenclature committee Jan. 31, 2007
Nearly identical to sequence CYP6AEcc on the Bombyx page, except that one region, amino acids 106-256, does not match.
The EST BY914225 agrees with the Ai sequence, so the old CYP6AEcc sequence appears to be a hybrid assembled from two genes.
663
MSVSALVFAAFVLLVTYIYYWSTRKFDYWKRKNVPYAKPVPFFGNYMRYITLQSFLGDVM 842
843
QKLCQQFPDRPYFGSFYGTEPALVLQNPEIIKQVFTKDFYYFNSRENRDYNHKEVFTQNL 1022
1023
FFANGDRWKVIRQNLTPLFSSSKMKNMFYLVEKCNHSLEDMLDKETKDLQSIEIRSAMIR 1202
1203
YTLDSICSSAFGIETNTLSEGAENSPFPSMGSTIFSSSITRGLKLIGRSMWPGIFYKLGL 1382
1383
RCFPTEIDDFFERLLTEVFENRGYKPTNRNDFVDLILSLKQNDYLTGDG 1529
1529
LVPKNVDAKKVTVKVDDALLIAQCVVFFGAGFETSATTLSAALYELAKNPEAQRRAQEEV 1708
1709
DELLLKHNNKLNYDCLAELPYLEACMNEAMRLYPVLPNITREAVTDYTFPDGLRIDKGMR 1888
1889
VHVPVYAIHRNPDNFPDPEEFRPERHLGDAKNDIKQFTYFPFGEGPRICI 2038 (1)
2917
GMRFGKMQTIAGMVTCLKKYNFELADGMSKTVPFRSTTVLTQPSTGLFLKATPRDGWKQRIFAR* 3111
>CK526561 a related EST rswfa0_003045.y1 swf Bombyx mori cDNA.
2aa diffs with BAAB01091139.1, 6aa diffs with CYP6AEbb
GFETSASTLSAALYELAKNPEAQRRAQEEVDELLLK
414
HDNKLNYDCLAELPYLEACMNEAMRLYPVLPNITREAVTDYTFPDGLRIDKGMRVHVPVY 235
234 AIHRNPDNFPDPEEFRPERHLGDAKNDIKQFTYFPFGEGPRICIG
>CYP6AE3P (old CYP6AEbb) BAAB01174895.1 95% to BAAB01091139.1 54% to 6AE1
there are probably two very similar genes
BAAB01172535.1 fills in missing sequence for this contig
Bmb026776 from Li Bin
MSVSALVFSAFVLLVTYIYYWSTRKFDYWKRKNVPYAKPVPFFGNYMRYITLQSFLGDVM
QKLCQQFPDRPYFGSFYGTEPALVLQNPEIIKQVFTKDFYYFNSRENRDYNHKEVFTQNL
FFANGDRWKVIRQNLTPLFSSSKMKNMFYLVEKCNHSLEDMLDKETKDLQSIEIRSAMIR
34
YTLDSICSSAFGIESNTLREGGENSPFANMGSIVFSSSITRGLKWISRSMWPGIFYKLGL 213
214
QCFPAEIDGFFER 252
LLTEVFENRGYKPTNRNDFVDLILSLKQNDYLTGDGLVPKNVDAKKVTVKVDDALLIAQC
VVFFGAGFETSATTLSAALYELAKNPEAQRRAQEEVDELLLKHNNKLNYDCLAELPYLEA
CMNEAMRLYPXXXXXTRETVADYTFPDGLRIDKGMRVHVPVYAIHRNPDNFPDPEEFRPE
RHLGDAKNDIKQFTFFPFGEGPRICI (1)
GMRFGKMQTIAGLITCLKKFNFELADGMPRTLAFRSTTLLTQPSTGLFLKATPRDGWEQRIFAR*
>CYP6AE4 (old CYP6AEaa) BAAB01211364.1
84% to CYP6AEff 53% to 6AE1
missing C-term
MLFLTLLFILSVCVYFIY
(frameshift)
YRVCNRRFDYWRKKNVSFVPPVPILGNYSGYILLKESISKVVHNLCKLFP 2567
2568
NDPYIGAFFGTEPTLIVKDPEFIKLVLTKDFYHFNGREGSKYTHNEVVTQNIFFTYGDRW 2747
2748
KVIRQNLTPLFSSLKMRNMFHIIEKCSGIFENLLDEESLAPEVEMKSLMSRFTMDCIGGC 2927
2928
AFGVDTKAMQEPKDNIFTTMGYLFFESTTYRGIKNVLRAIWPGIFYGLGLKVFPTDLNEF 3107
3108
FSKLLVGIFEARDYKPSSRNDFVDLLLNLKKNRHIVGDRLQKTTTGDEGADSKFELEVDD 3287
3288
GLLVGQCLAFFSAGFETTSTISNFTLYELAKNPDVQKRAQKEVDEYIKKHNNKLDYDCVK 3467
3468
ELPFVEACIDEALRLYPVLGVLTREVMEQYTFPTGLTLDKGDRVHIPVYHLQRDPEYFPE 3647
3648
PELFKPERFYGEEKKNIRPFTYLPFGDGPRICI 3746 (1)
>CYP6AE5 (old CYP6AEff) BAAB01096775.1 91% TO CYP6AEee
BAAB01211363.1 51% to CYP6AE1 Depressaria pastinacella 86% TO CYP6AEdd
AADK01005027.1
MILTIIFILSLCVYILYRISTRKFDYWQKKNVNYVQPTPFLGNYSGYILLKENLL
DVVYNLSKLFPNDPYVGAFFGTKPTLIVKDPEFIKLVLTKDFFYFTGKECFEYTHKEVIT
QGIIFTYGDRWKVIRQNVTPLFSSSKMRSMFRIIEHCSGVFENLLD
EESLAPEVEMKSLM
SRFTMDCIGGCVFGVDINAMQEPKDNIFTTMGCLFLETTTSRGIKNVVKAIWPEIFYGLG
FKVFPTDIHKFFSKLLVRIFEARDYKPSERTDFVDLLLNLKKNRHIVGDRLQKIKTGDEG
ADSKFELEVDDGLLVGQCLAFFSTGFETSSTISNFTLYELAKNPDVQKRAQKEVDEYIKK
HNNKLDYDCVKELPFVEACIDEALRLYPLFGVISRQTGERYTFPTGLTLDKGDRVHIPVY
HLQRDPEYFPEPELFKPERFYGEEKKNIRPFTYLPFGDGPRICI
(1)
11263
GMRFAKMQILAGLVTILKKYTVQLADGMPETIDIEPKAIVTQPAISLRLKFVPRNDLQKRIFA* 11069
>CYP6AE6P (old CYP6AEee) BAAB01091974.1 93% to CYP6AEff
BAAB01178163.1 = C-term exon 50% TO BAAB01091139.1
BAAB01149335.1 = N-TERMINAL (MOST PROBABLE SEQUENCE TO COMPLETE GENE)
1055
MILTIIFILSLCVYILYRISTRKFDYWQKKNVSYVEPAPFLGNYSGYILLKENLLDVVHN 1234
1235
LSKLFPNDPYVGAFFGTKPTLIVKDPEFIKLVLIKDFYYFHGREGSKYTHNEVITQGIFF 1414
1415
TYGDRWKVIRQNLTPLFSSSKIRNMFRTIEKCSGVFENLLD 1537
EESLAPEVEMRSVMSRFT
MDCIGGCVFGVDINAMQEPKDNIFTTMGCLFLETTTSRGIKNVVKAIWPEIFYGLGFKVF
PTDIHKFFSKLLVRIFEARDYKPSERTDFVDLLLNLKKNRHIVGDRLQKIKTGDEGADSK
LELEVNDDLLVAQCVSFFIAGFETSSNSLTFTLYELAKNPDVQKRAQKEVDEYIKKHNNK
LDYDCVKELPFVEACIDEALRLYPLFGVISR
(frameshift)
DKRERYTFPTGLTLDKGDRVHIPVYHLHHDPEYFPEPELFN
PERFYGEEKKNIRPFTYLPFGAGPRVCI
GERFAKMQMLAGLVPILKRYTVRLAEGMPETINFEPKAIASQPNIGVRLNLLPRNN
>CYP6AE7 (old CYP6AEdd) 53% to 6AE1 BAAB01210600.1 BAAB01149335.1 87% to CYP6AEaa
Bmb021626 from Li Bin
MILTIIFILSLCVYILYRISTRKFDYWQKKNVSYVEPAPFLGNYSGYILLKENLLDVVHN
LSKLFPNDPYVGAFFGTKPTLIVKDPEFIKLVLIKDFYYFHGREGSKYTHNEVITQGIFF
TYGDRWKVIRQNLTPLFSSSKIRNMFRTIEKCSGVFENLLDEESLAPEVEMRSVMSRFTM
DCIGGCAFGVDTNAMQEPKDNIFKTMGYLFFESTTHRGIKNVFKAIWPEIFYGLGFKVFP
TDLNEFFSKLLVGIFEARDYKPSSQNVFINLLLNLKKNRHIVGDRLLKIKTGNVRAESKI
KLEVDDELLVSQCVAFFIAGFETSSTISSFTLYELAKNPDVQKRAQKEVDEYIKKHNNKL
DYDCVKELPFVEACIDEALRLYPVLGVLTREVMEQYTFPTGLTLDKGDRVHIPVYHLQRD
PEYFPEPELFKPERFYGEEKRNIRPFTYLPFGAGPRTCI
(1)
GQRFAKMQMLAGLVTILKRYTVRLAEGMPETINFEQRAIVTQPNIGIRLNLLPRNN*
>CYP6AE8 (old CYP6AEgg) 53% to 6AE1
BAAB01205437.1 BAAB01169820.1 (52% to BAAB01211364.1)
BAAB01196919.1 contig74655, 50% to 6AE1 might be the last exon
MFLLINICVILFVIYYLVTKKYSYWRNRNVSHEKPVLLLGNYGDLILQKKNFGEMAQAIC
1747
QKFPGEPVVGAFFGTEPVLIPQDPEVIKTILTKDFYYFNGREISEHVHKELLSYNLFA 1574
1573
TYGDEWKILRQNLTPIFSTAKLKSMFTLIEKCSKSFQNLLEDETKISKELEVRTLMQRFT 1394
1393
IECIGSCIFGVDTDTLGNDKMNPFKAAGSQLSDFSRLVFVKGIVRAIWPTLFYALGFKTF 1214
1213
TTELDIFKKLVNAVFAQRKHKPTTRNDFVDLILTWKNNNTITGDSIGSFKNSDKTKFSI 1037
1036
DVNDDLLLAQCLVFFAAGFETSAMTSSYTLHELAKNQRALKKACDEVDAYLLRHGNKV 863
862
NYDCVTELPYLEACIEETLRLYPVLGIITREVMEDYVLLDKIHLKKGDRIHVPVFHLHH 686
685
NPEHFPNPEEYRPERFYGEEKRKVKPYTYLPFGEGPRICI 566 ()
2262
GMRFAKMQSIAGLITILKKFRLELPEGAPTKIEFKPEAFVTTPKDLIKIKFLEREGWQQRVF 2077
>CYP6AE9 (old CYP6AEhh) BAAB01207183.1 BAAB01033068.1 50% TO CYP6AEdd
Bmb007891 from Li Bin
530
MYLLTFLLHFILFLVLVVYYITKRNHDYWKNKKVPFEKPYPILGNYGDYILLKIF 366
365
FGNVTKRLCEKFPESPYVGTFYGTEPALVVQDPDLIRLILVKDFFYFHGREVCKYVDREI 186
185
TTQSIFFTYGDDWRVLRHSLTPLFTTSKMKN
MFHLIKNCCRVFENVLEEKATVKSFEINSIIKYFIMDCVGACLFGVEINAMEKHSQNPIV
KISKEIFPKSTTRGIINVLRGIWPSLFYALRLKLFPEVITMFFTDLLECAFKDRQRNLSQ
RQDFIDLFMKLRQNKYLVGDSIPDIKNGRVQKVNLEVTDELLIAQCVSFFGAAFETSSTT
LSLTFYELAKNPKYQKKAIEEVDDYFAKHNNEIEFDIVSDTPYLNACIDETLRLYPSLAN
LTREVMEDYTFPTGLKVEKGTRIHIPVYHLQRNPKYFPEPNKFDPRRFLPEAKQTIYPFT
YMPFGEGHRICIAMRFAKMQMLAGFATLLKKYEVAVDDTTPQELTIDPRIIVTTPIENIQ
LKLIARQHAL*
FOUR NEARLY IDENTICAL C-TERMINAL EXONS FOR CYP6AE SEQUENCES.
>BAAB01132789.1 Bombyx mori DNA, contig535134, whole genome shotgun sequence
Length = 2056
352
GMRFAKMQILAGLVTILKKYTVQLADGMPETIDIEPKAIVTQPAISLRLKFVPRND 519
>BAAB01134380.1 Bombyx mori DNA, contig538543, whole genome shotgun sequence
Length = 2850
218
GMRFAKMQILAGLVTILKKYTVQLADGMPETIDIEPKAIVTQPAISMRLKFVPRND 385
>BAAB01199026.1 Bombyx mori DNA, contig760785, whole genome shotgun sequence
Length = 626
460
GMRFAKMQILAGLVTILKKYTVQLADGMPETIDIEPKAIVTQPAISLRLKFVPRN 624
>BAAB01136322.1 Bombyx mori DNA, contig542966, whole genome shotgun sequence
Length = 1880
1736
GMRFAKMQILAGLVTILKKYTVQLADGMPETIDIEPKAIVTQPAISLR 1879
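In the four fragments above, each peptide is bracketed by two integers. A short Python sketch can check whether those integers behave like inclusive nucleotide coordinates, that is, whether the span equals three times the peptide length. That reading of the numbers is an assumption (the page does not define them), and span_matches is a made-up helper name.

```python
def span_matches(start, end, peptide):
    """True if the inclusive span start..end covers exactly 3 nt per residue."""
    span = abs(end - start) + 1   # abs() so a descending pair is measured the same way
    return span == 3 * len(peptide)

# (start, end, peptide) copied from the four contig fragments listed above.
fragments = [
    (352, 519, "GMRFAKMQILAGLVTILKKYTVQLADGMPETIDIEPKAIVTQPAISLRLKFVPRND"),
    (218, 385, "GMRFAKMQILAGLVTILKKYTVQLADGMPETIDIEPKAIVTQPAISMRLKFVPRND"),
    (460, 624, "GMRFAKMQILAGLVTILKKYTVQLADGMPETIDIEPKAIVTQPAISLRLKFVPRN"),
    (1736, 1879, "GMRFAKMQILAGLVTILKKYTVQLADGMPETIDIEPKAIVTQPAISLR"),
]
for start, end, pep in fragments:
    print(start, end, len(pep), span_matches(start, end, pep))
```

A mismatch would not by itself prove an error, since a stop codon or a frameshift inside the span changes the expected length.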
Other CYP6 family like sequences
>CYP6AN2 65% to 6AN1
join with BAAB01065359.1 C-term exon 67% to 6AN1
Bmb032387 from Li Bin 361-441 65% to 6AN1
AADK01063079.1, AADK01036854.1
Whole sequence known but confidential
KEIQCEIDEVLSRHDNKLCYDAILEMPLLT
MAFKEALRMFPSLGNLHRVCTRSYTIPELGITIDPGVRIIIPAQAIQNDAKYFDDPSEFR
PKRFAKDSEIKKFSFLPFGAGPRNC
205 GARLGEMQSLAGLAAILHKFSVEPAPSTVRKLRVKHTQNVVQGVEGGLPLLIKERK* 375
>CYP6AU1 new subfamily 40% to 6B1 43% to 6B27 BAAB01129923.1
AADK01004964.1
MLTAYLLYVFASIVVLIYFYLNNKYKYWKNKNIAGPEPEFLFGNLKESFHRRKHIATVFK
EIYDQYPDEKVVGMFRMTSPTLIVRDLDIVKQIMIKDYAKFNERGIKFSKEGLGDNLFHA
DADAWKAVRSHLTPMFSSGKLKNMVRVLSQTGDRFVDHVTNEVRLRPEQELVSLFLKYNI
ATIMACSFSMETDINDMQVYEMLNVAVFENSYISELDMVFPGILRKTNRSIFGPTVNNFC
YEIVEFIKNERNGKPANRNDMMDMLLGVNCDKLKMKNGSEDFVEITEHVMAGQVFVFFAAG
YVNNTITLTFSLYHLAKDQSIQ
ERLMREIETVLENHNNVLTLEAINEMSYLEMIYLETLRMHPTTNTLQRSALEDYTIP
GTDIEIEKGTLVLIPPLAFHHDEKIYPEPEKFDPERFSVENHKSRHACAFLSFGIGPRTCIG
>CYP6AV1 new subfamily 44% to 6AB1 BAAB01020563.1 BAAB01140967.1
Bmb035757 from Li Bin
MLLLIIVIIILYLYTTRNHSYWAKRGVKHERPIPFFGNHLPNILLQRDNTDLGLQMYNAY
ENEKIVGYYIGNIPQLIIRDPEIIKHMLNIDFNSFSERGLNLNPKKESLLNNLFFVSG
DNWRRMRNIFSGAFTSVKLKAVFPLIVKCTDKLVKKTTMRHVSDEIKTYELMTGYTMEFIAS
CGFGVDVDTISESNMHFVDVARSFFDKTSLQTFMINLDEICSILKIIPGTFEVNSEFNSF
ILNITKKVLAERNFKPSERNDFVDHLLQMVYWNGNEKLTTEADLESDSWFVAAQVFLFLV
AGFETSAVATSVVLHQLAYNQNLQDEVRKEIDSALKKYDNKLCYEAVCEMSLLDMTLKES
LRIQPPAGFIRRRCIKEYKIPGTNIAIDPGVKILIPIKALNHDPKYFDCPSEFRPQRFSP
EAEQNIPKFVYMPFGSGPRTCP
()
GARLASLQSMAGLASLLHHFVVEPTPNTSRHYKIKRSAFLVQIIEGGIPLYFKPRTNK*
3673
>CYP6AW1 new subfamily 39% to CYP6P5 aa 125-502 BAAB01195862.1
BAAB01153386.1 CK496233 EST
AADK01000856.1
MWLSVLLVVILLFLILLLTVFHYTTKGRKYWLLRNVPYREPCPLFGNFGATFTMRRSYTK
MLQFFYDNYYNEKYVGIFQARRPTLMLIDLELVKNVLSKEFPNFSDRISVTTDTQREPL
LRNLANMSGAEWKAMRQIVTPTFSSAKMKAMFPLIAECAQTLKNSLLKESLAEVNVPDFMTRFTTD
VIGSCAFGVDPGSLKDPESPFLKMSQKMFKIDRSTVLKRYCRTFFPKLFKFLNLRTYSKD
VETFFTTIIKKVLDERRATGVQRHDFLQLMLNVQKTETSFVMTDTLIISNSFIFMLAGLE
SSSTTLSFCLYEWAKDKHIQDDLRTEMVDCLERYGGINYEAVCSMRWVN
QAVLETLRLHPPTPLTTRLGTSACTLNGTDLSVRVRDPVLIPLYCIQRDAQHFPNPDKFNLERFKETNPPG
FLAFGEGPRSCPGARFAQLTVAAALAALLSSFEIEPCSMTTSTIVYDPRSVMLKNKGGIW
LKFVPL
>CYP354A1 Bombyx mori (silkworm)
BAAB01200346.1 BAAB01118814.1
BAAB01157630.1 BAAB01008306.1
Bmb029934, Bmb010351 from Li Bin
AV399740.1 EST N-term 203 amino acids
BAAB01211873.1 exons 1,2
AADK01025884.1 exons 3,4,5,6
BAAB01157630.1 exon 7
AADK01034838.1 exons 7,8
AADK01018486.1 exons 8,9
This sequence was assembled from genomic DNA and ESTs.
A tree of 155 CYP6 and CYP9 sequences places it outside CYP6 and CYP9, but it is a CYP3 clan member.
The stop codon in exon 4 seems to be a sequencing error when compared to the CYP354A2 cDNA sequence.
MNFSPGTVQILQFIQNDWKL
605
ILILTLLIFIYYYYTNTFDYFEKRGVPFKKPIIFLGNLGPRLKAVKSFHQYQLDIYQYFKGHPYG (1) 799
1277
GTFDGRRPVLHILDPELIKAIMIRDFDHFTDRNTLNSMEPRYLSRSLLNLK (0) 1429
4244
GLEWKGVRSTLTPAFSSSRLKNMIPLIQQCSKQMVEFLKKF (1) 4027
2588
DGKEIEMKQTMGHFTLEVIGACAFGIKCDALSNENSRFVK (0)
1717
VAEKFDYMPKYKRVILLMLLVFMPKMIR*LRLSFLNIEYTG 1598
ELVRMLQAAKAERRSSESK
(2)
KGDFLQILIDFAAKETAQNDTAKREI
(1)
LLDDDTIDAQSLLFLIAGY
ETSSTLLSFAIHVLATKPDLQETLRAHVQEMTKGKEMSYELLAQMDYLEAFLQ
(1)
ETLRIYPPVARVDRICTKPYIIPGTTVHVGV
GDAVAIPVYGLHMDEDIYPEAREFKPERFMDDQKKDRPSHLYLPFGAGPRNCI
(1)
GLRFAMISAKIAMVALMKNFKFSVCSKTMDPIDFDKRAVLLKSAKGLWVRIELIDLS*
>CYP354A2
AB265182.1 Antheraea yamamai
FTLEVIGACAFGIKCDALSHENAYFYKVAENFDYMPKIKRVLIF
LCMIFMPKLLTYLNVSFLHLKSTDELVRMLQAAKAERKRLNSRENDFLQILIDFAKKE
YTELENTNTAKYLDDDTIDAQCLLFLIAGYETSSTLLSFAIHELAINTQLQSKLRAHI
KEVTDGKEISYELLSELTYLDGFLLGDAVAVPVYGIHMDPKYYPEPHELRPERFMHSE
KKERPSHLFLAFGSGPRSCIGSRFAMISAKTAMMSLMKNYKFSTCSQTTYPIEFDKRS
VLLKSETGLWVRFEPL
CYP9A subfamily sequences.
>CYP9A19 old CYP9Aqq 66% to 9A13 BAAB01209004.1 BAAB01182505.1
BAAB01064142.1 BAAB01170681.1 BAAB01057984.1 BAAB01037494.1
Bmb022413 and Bmb022414 from Li Bin
MIIIIWTLVIGLALLLYLKQTYSYFSKHEIKSVTPLPILGNMGKIVFKFNHLVDDISQLYNNFPEER
()
FVGRYEFVNPVIYIRDIEIVKRITIKDFEHFLDHRTIVNEDSDPMFGRNLFSLK
()
GQEWKDMRSTLSPAFTSSKMKLMMPLIVEVGEQMINAIKENIKNS
()
GVGYVDIDTKDLTTRYANDVIASCAFGLKVDSLTEENNRFYAMGKAATNFSFKQILMLLGFISFPKLMK
()
MTKFRLFSEETSGFFKELIMGTMKDREMRKIIRPDMIHLLMEAKK ()
GKLVHDNKSSKDTDAGFATVEESAVGKKQIDR
()
VWTDDDIIAQAVLFFIAGFETVSSAMTFLLHELALNPEVQEKLVEEIKENKERNNGKFDYNSIQNMAYLDMVVS
()
ELLRLWPPAVSMDRICVQDYNLGKPNDKAKRDFI
(0)
LRKGTGVAIPVWAFHRNPEFFPDPQKFDPERFSEENKHNIKPFTYLPFGVGPRNCI
(1)
GSRFALCEVKVMAYQLLQHMEISPCEKTCIPSKLSKEIFNLRLEGGHWVRLKIRD*
>CYP9A20 old CYP9Arr 65% to 9A13
BAAB01013356.1 BAAB01000958.1 BAAB01096964.1
BAAB01123933.1 BAAB01125489.1 BAAB01165057.1 CK511220 EST
may be a hybrid sequence 68% to 9A13
joined Bmb025541, Bmb025542, Bmb025543
MILLIWAVVLIAAFVLFYKQAYSLFSKHGVKGFTPLPFFGNMGRIVIKMDHFSDHIQSLYDSFPEER
()
FVGRYEFLNPMVIIRDIELLKKITVKDFEHFLDHRTIINKDTDPFFGRSLFFLR
()
DQDWKDMRSTLSPAFTSSKMKLMMPFIVEVGEQMNKALKQRIQEAG
()
VGYVDIDSKDLTTRYANDVIASCAFGLKVDSITEENNQFYAMGKAASTFNFRQLLIFFGLASVPKLV
()
KILRITLFQKEIKTFFRELILGTMKNREAQNIIRPDMIHLLMEAKK
()
1129
GKLRHDEKSTKDSDAGFATVEESSVGKKDINR 1224 ()
1878
VWTDDDLVAQAVLFFVAGFETVSSAMTFL 1964
LHELALNPEVQEKLVEEIRENEKNNNGKFDYNSIQNMVYLDMVV
SEVLRLWPPVIALDRMCVKDYNLGKPNDKSKEDFI
()
IRKDVAVGIPVWGLHRDPEFFPNPLKFDPERFSEENKHNIKPFSYMPFGLGPRNCI
()
GSRFALCEVKVMTYQLLQHMEISPCEKTCIPSKLSKETFNLRLEGGHWIRLKIRN
>CYP9A21 old CYP9Ass BP115106 EST BAAB01063049.1 BAAB01018306.1
BAAB01114672.1 BAAB01001005.1 BAAB01079446.1
joined without overlapping fragment, but = complementary frags
could be a hybrid of two genes 66% to 9A13
MIIIIWTLAIGLAFLLYLKQIYCYFSKHEIKSITPLPILGNMGKIVFKINHFVDDISQLYNKFPEER
()
FVGRYEFVNPVIYIRDIEIVKRITIKDFEHFLDHRTIVNEETDPIFGRNLFSLK
()
GQEWKDMRSTLSPAFTSSKMKLMMPLIVEVGEQMINALKKNIKNSG
()
VGYVDIDTKDLTTRYANDVIASCAFGLKVDSLTEENNQFYAMGKAASNFSFKQILLLLGFISFPKMMK
()
MTKFTLFSEETSGFFKELIMGTMKDREMRKIIRPDMIHLLMEAKK
()
GKLVHDDKSSKDTDAGFATVEESAVGKKQIDR
()
VWTDDDIIAQAVLFFVAGFETVSSAMTFLLHELALNPEVQDKLVEEIKENKERNNGKFDYNSIQNMVYLDMVVS
()
EXXRLWPPGVSLDRLCVQDYYLGNPNEMAIRDFI
()
LRKGTGVAIPVWAFHRNPEFFPDPLKFDPERFSEENKHNIKPFAYLPFGVGPRNCI
()
GSRFALCEVKVMAYQLLQHMEISPCEKTCIPSKLSKETFNLRLEGGHWVRLKIRD
>CYP9A22 old CYP9Att 75% to 9A14 aa 421-529 BAAB01150296.1 AU002409 BAAB01097088.1
BAAB01211661.1 72% to 9A12 cannot walk upstream any further
join with N-term from BAAB01034175.1 (might be a hybrid seq)
MITLIWLGVLLVTLTLHLRKVYSRFKDYGVNHFTPIPVLGNAGPITVRLRHVAEDFDMVYKAFPEDR
(2?)
FTGRFDLLRPTVIIKDLDLIKQITIKDFEHFLDHRALVDDTADPFFGRNLFSLR
()
GQEWKDMRSTLSPAFTSSKMRGMVPFMVEVNNQMIDMIKKKIVANA
()
GYLDCEGKDLTTRYANDVIASCAFGVKVDSHTNEENQFYLMGRDMADFGFRKIMVFLGYSSFPKLMK
()
KFNAKLLSDETGHFFTDLVLRTMEDREVKEIVRPDMIHLLMEAKQ
()
GKLSYDEKSTKEADTGFATVEESDVGKKTINRI
WSNTDLIAQATLFFVAGFETISSAMSFALHELALNPEIQDRLVQEIKENYAKTGGKFDFN
CIQDLTYMDMFVSEVLRLWTPVVGMDRLCVQDYNLGRANKNATKDFI
()
LRKGEGLSIPTWSIHHNPEYYPEPYKFDPERFSEENKRNIKPFTYLPFGTGPRNCI
()
GSRFALCEVKVMLYQLLQQIEVLPSDKTKVRAKLAKDTFNVKIEGGHWIRLKLRD
CYP9G subfamily sequences
>CYP9G1 intact sequence submitted by Manabu Kamimura 3/22/99
BAAB01081406.1 BAAB01087326.1 BAAB01125517.1
first exon (69 amino acids) known but confidential
Bmb008984 from Li Bin
RYVGFTDGVCPGIFIRDPEIIKHVTVKEFDHFVNNKDLSPEGEESILKNSLIMLK
(1)
DEKWRKMRAALSPAFTASKMREMVPLITEISHNIVEYLK
()
EHLTEDIDLDDLMSRYSNDVIASAAFGLQINSLKERDNIFFKAGKDVFNFSLFQSIRMIFSDHFPSLSK
()
KLGFTVIPKSTSEFFRTLIASTVDYRIKNKVERRDMIQLLMQLST
(1)
EWTETELAAQLFVFFVAGFETTGNTLINCIHELALNPHIQDILYEELKAFKETKGNLVYENIGELKYLDCVLN
()
ETMRKWSAAIFVDRICTKPYVLPPPREGGKPCQ
()
LKPGEVIYNAVNSIHMDPKYYEEPEKFIPERFLDENKHKIKPFTFMPFGVGPRYCI
()
GSRFALMEMKILLFRLMLNFKVLKCAKTLDPIKMSPVGFNMNIWGGSWVKFQARKA*
>CYP9G3 59% to 9G1 C-term corrected by EST CK508550 BAAB01034600.1
N-term corrected by EST BP125176 AU000984 CK510649 (4 aa diffs) BP127468
BAAB01164136.1 BAAB01189844.1 BAAB01100167.1
GTLK exon is extra compared to CYP9G1, but it is seen in ESTs
The length of this sequence in other insect species ESTs is highly variable.
MLVEVIVFLITTLVAYYLYVYKKIHYFYDARGVKYQPGIPVLGNILKSSLGTGHFWEDIDKIYKAFPGER
()
YIGYIEGTTPILMIKDPEIIKNITVRDFDHFVNHKEFFPVEIDALFGGSLFMMK
()
DDKWRDMRTTLSPAFTGSKMRLMLPFMIDISKNIVEYLK
()
GHQLEDVDVDDLMRRYTNDVIASAGFGLQVNSLVDKDNEFYECGQAMFSTSWPQRFKMILAAQFPTLAK
()
KIGIKVFPQKVTRFFREIVTSTMDYRLKNNVERPDMIQLLMDAYK ()
GTLKNESNESDEKNVGFAMTEEMLKPKGNVR
()
KWTQDELTAQVFIFFLAGFESSANGLTLCIHELALNPEAQEKLYEAIIKFKEEKGPLTYDNIGELKYLDCVLN
()
ETSRKWSAAIIVDRVCSKPYELPPPREGGKPYK
()
LQPGDIVYNSVNSIQMDPEHHPDPEKFDPDRFLDENKHKIKPFTFLPFGAGPRNCI
()
GSRFALLELKVLIYYIVLNFKIIKTEKTLSPIKLQPGEFNIKVWGGTWTQFEARE*
>CYP9G3-de9b10b BAAB01034600.1 pseudogene detritus exons downstream of CYP9Gxx
2492
DFLFNVANSIHIDPIYHTEPDTFNPARFLVENEYQMREFTFMPFSIGPRNFF
4065
GKGFALL*IKLILYYLVLNFKVMKFMKTQNPIKLIQHEFNLEV 4193
Other CYP9/CYP6 like sequences
>CYP9AJ1 N-term 34% to 9A20 N-term, 76% to EB742536.1 Antheraea mylitta
AADK01023492.1, AADK01016060.1, BAAB01077546.1, BAAB01122451.1
Whole seq is known but confidential
3622
MLWYVVLIVTCLFYYSSRRLRYFSSRGIATLPTTPFFGNLTAVTFGRENFVEAIAKGYDAFKD 3458 ()
1564
RYFGLYQYLVPTLVPRDPELIRQILVRDFSAFADRGVHIDEECDPLFGRNLIMLR 1400
4355
GSKWRSMRVALSPAFSGARCRNMAPLMVESAKSVTNHLQKRIIDEKVIDINV (0) 4200
3464
ITMSYVNDVIASCAFGFAVDSLKEPDNCIYKLGQKAIIQDTTQVMKFFGYENIKTVMK
3290
VNPRLEYILGILKTQKKASKXXXXXXXXXXXXXX
RPDFIQILVDAMQGMLIYE
(2) 2694
>CYP9AJ1 BAAB01063629.1 CYP9A like I-helix region
(1)
617 FTDDDLVAQAVLFYVAGYDTTANLINYFLYEMAVNPHVQEKLNQEIAEMSDEGDVYEAIQRLKYLEMCVC 408
(1)
>CYP9AJ1 C-term BAAB01173477.1 BAAB01146424.1 BAAB01001094.1
41% TO 6G2 374-514 45% to 9E2
Bmb023483 from Li Bin
AADK01003334.1
MSPQPNNLSPLDILINVNKIYKTKYIKYIKLVEPHSVNE
EVLRLWPLVGSADRRSVIPYDFGPTYPDSKHSLI
APTGIHIWIPIYSIHRDEQYWPDPNTCIPERFSPENKGSIVPYTYLPFGTGPRHCI
GSRFAILAAKVFLVKFLKTYRTKTHREKTKLSPRAFILRPRDGFELTVERRIEND
>CYP9AJ2 Antheraea mylitta (wild silkmoth) from fat body
EB742536.1 76% to AADK01023492 seq above
RYFGLYQYLVPTLVPRDPELIRQIMVRDFHAFADRGVHISADCDPLFGRNLIMLT
GSNWRSMRVSLSPAFSGARCRCMAPLMADTARAVAQHLKHHITHEQLIDINT
ITMAYVNDVIASCAFGFAVNSLEDLQ
NGIFRLGQRAIVQDTT
CYP12 or CYP333 like mito sequences. CYP333 seems to be the lepidopteran (moths and butterflies) equivalent of CYP12 in flies and mosquitos (Diptera). These sequences will probably be in CYP333.
There are at least four different C-terminals here and some N-terminal fragments, but it is not clear which N-terminals belong to which C-terminals.
>CYP333B1 39% to 12B2 mito BAAB01158176.1 BAAB01035519.1
BAAB01083643.1 BAAB01120305.1 39% to 49A1 39% to 12F2
47% to CYP333A1 75% to combined (BAAB01158865.1 + BAAB01136746)
N-terminal is upstream of a repeat seq. I cannot identify it.
Bmb006234 from Li Bin
AADK01000081.1, AADK01010165.1, EST BJ985225.1
MIIALHYSKLRPVLNFSCLQQCVR
()
TVTVSAATEKLQQTELKSFREIPGPSSLPIMGPFLHFMP
GGSLHNINSTELTHKLYDIYGPIVRIDSMFSKDAIVLLYDAESAGI
ILRNENNMPIRISFKSLSYYRQKYKKSENDRTDRPTGLVSD
()
HGELWKSFRSAVNPVLLQPKTIRLYSSALEEVATDMVERL
()
RSLRDENNRIRGQFDQEMNLWSLESIGVVALGNRLNCFDSNLQDDSPVKRLIECVHQ
MFVLSNELDLKPSIG
(1?)
QLNYTKNIFKTRLIYFSLTKYFIKKALDDIKMNKSKSDDEKPVLEKLLDINEEYAYIMASDMLVAGVDT (0)
TSNTMSATLYLMAINQDKQQKLREEVMSKNGKRSYLRACIKEAMRILPVV
SGNMRRTTKEYNILGYHIPEN
()
VDIAFAHQHLSMMEKYYPRPTEFIPERWLTNKSDPLYYGNAHPFANSPFGFGVRSCI
()
GRRIAELEVETFLSKIVENFQVEWSGSSPRVEQTSINYFKGPFNFIFKDL*
&&&&&&&&&&&&&&&&&&
>CYP333B2 BAAB01158865.1
AADK01000081.1 BAAB01179470 BAAB01136746.1 BAAB01210389
whole seq 61% to 333B1
MIGLYILFITFFFLFQNLIYCQ
(1)
ATVSTSENVDVTNLKPFHEIPGPSSLPLIGPLLHFIPG (1)
GSLYTPDTKDFSAKLFKLYGPIVKLDPLFARNTLLMVYDPESAAN
(0)
VLRSENRIPYRGGFNSLAYYRKHIKKHENNHKKLTGLITE
()
GEEWWDLRSTVNPVLLQPKTIKLYSAAIDEVAQEMMNR
(2?)
MHRKLDENDRLQAKFDDEMNLWALESIGVVAFGIRLNCFDPNLAENSPEKKLIECVHQI
1459
FNLSNQLDFQPSLWHIFSTPTYKKAMKMFQLQEE 1358 ()
545
LSKYFINKAMRNINKNENKPDEQKGVLEKLLDINEEYAYIMATDMLVAGVDT 390 (0)
176
VANSIAATLYLFAKNPEKQEKLREEVMSKESKRYLKACIKETMRMMPVVSGNLRRTT
4119
KEYNILGYHVPKG 4081 (0)
3451
IDVAFAHQDLSSMEEHYPRPTEFIPERWLADKNDPLYYGKAHPFVMAPFGFGVRSCI 3281 ()
2571
GRRIAELEIETFLTKILENFRVEWYGPPPKIIQTSINYFTGPFNFVFNDIKKK* 2410
&&&&&&&&&&&&&&&&&&&
>CK494024 EST seq for N-term KYG motif CK493643 CK540020
BAAB01054952.1 BAAB01013929.1 BAAB01206514.1 28% to 49A1
MKMAKSTVVIRQSL
(2)
LQRQCRRHIAGSSSSRTSTSPQRRNAAATASATATCLKPFNEIPGPMALPMLRHSAHILPKI (1?)
GNFHHTVGLDVLENLRKKYGDLVRLSKATRTRPVLYVFHPEMMRE
(0?)
VYES
A middle region fragment
>BAAB01019704 Length = 2835 CYP12 like WALES motif (at ETAM site)
42% to CYP49A2
1227
RLEELRCKGNVLNEELETEIYRWALETVGMMLFGIRLGCLDARYVTQSMHTF
1072
&&&&&&&&&&&&&&&&&&&&&&
>CYP333A2 BAAB01120524.1 Length = 1355 38% to CYP49A1 C-helix region
126 VYRAEEANPLRPGFQVLDYYRTQLRKSRYGGLHGLINA
239 ()
1099
QGPEWREFRTKVNPALLLPKLVKLYAPGIDEIAQDFVQRYLFSFVIKYYNCVCNFLCSSIIPLR
1290
>CYP333A2 mitochondrial P450 CYP333 family I-helix aa 294-337
BAAB01104455.1 61% to CYP333A1
3
DNFELELTKFSLEATALVALGSRLGCLKDSLDSDHPARRLMKSTRDIFELTYKLEIRPSP 182
183 WRYIATPAYKMVIEAYDTQWE
245 ()
VSKYYFRISMMYINHARKKLEQRGYDIPEEEKSVLEKLIAIDEKVAVMMASEMLLAGIDT
753
Strongly suspect these two fragments are part of the same gene, with one
exon missing between them, from the AGIDT motif to the PKG motif.
>CYP333A2
BAAB01160330.1 Length = 986 also ovS022C05f from KAIKOBLAST
aa 262-365 75% to CYP333A1 BY922830.1
482
IDVIAPNEYLSRSEKFYPQPEEFIPERWLVEKSDPLYYGNAHPLVTLPFGFGVRSCIGRR 661
662
IAELEIELLIKRLIEEFKVTWNGPPIKIVNKLTNTFVKPYNFTFTSVK* 808
&&&&&&&&&&&&&&&&&&&
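The numbers flanking each translated exon appear to be nucleotide coordinates on the contig (the file does not say so; this is an assumption). If so, an inclusive span should be three times the peptide length, and a descending pair indicates the minus strand. The sketch below, with hypothetical helper names, checks this for the CYP333A2 fragment above. A mismatch does not necessarily mean an error, since the printed peptide may extend beyond the aligned span or an exon may begin or end mid-codon.

```python
def span_to_codons(start: int, end: int) -> tuple[int, int, str]:
    """Return (whole codons, leftover nucleotides, strand) for an inclusive span.

    Assumes the two numbers are 1-based inclusive nucleotide coordinates;
    a descending pair is read as the minus strand.
    """
    n_nt = abs(end - start) + 1
    strand = "+" if end >= start else "-"
    return n_nt // 3, n_nt % 3, strand


def span_matches_peptide(start: int, end: int, peptide: str) -> bool:
    """True when the span codes for exactly len(peptide) codons.

    A trailing '*' is counted as the stop codon.
    """
    codons, leftover, _ = span_to_codons(start, end)
    return leftover == 0 and codons == len(peptide)


# CYP333A2 C-terminal fragment on BAAB01160330.1 (coordinates 482-661, 662-808)
exon_a = "IDVIAPNEYLSRSEKFYPQPEEFIPERWLVEKSDPLYYGNAHPLVTLPFGFGVRSCIGRR"
exon_b = "IAELEIELLIKRLIEEFKVTWNGPPIKIVNKLTNTFVKPYNFTFTSVK*"

print(span_matches_peptide(482, 661, exon_a))
print(span_matches_peptide(662, 808, exon_b))
print(span_to_codons(594, 409))  # a descending pair, read as minus strand
```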
A related Ips pini sequence (NOT BOMBYX)
>CB407630
JH III-treated male I. pini midguts Ips pini cDNA 40% to 12F4
CYP12 like sequence from Ips pini (North American pine engraver) (N-term)
MLCSKSFSPLIVLRNNSTISAAGVNSFKKAEVVDNAKPEGWDQAKPFDQIPGIKP
LPLIGNNFRFLPGGEFHKVQGLDITRRLQQKYGRITTLSGLFGVMPIVHLFDPNDFEHVL
RNEGPWPIRKNVECVTYYRQRVRPEIFKGVDGVALTQGEEWLRERSVVNKILMQPRTIEM
YVGSMNEVANDLLELMKHFAKKDPNSEMPDNFQNELYR
>CB408781
JH III-treated male I. pini midguts Ips pini cDNA Length = 464
WALES region to I-helix. These two Ips pini cDNAs are probably from the same
gene. They overlap by 2 aa and are the right size to cover a CYP12F sequence
exactly in this region.
2
YRWTLESVGVVAYNRRIGCLDINMHKDSEGSRFISAVQEFFDLMYALEYRPSMWRIYSTKKW 187
188 KRFVELMDFITEINQRYINECLATIDPNSDIPDHERSALERLFKVDQQIAVVMASDMLVAGVDT
379
Joined sequences 37% to 12F1, 36% to CYP333B2
MLCSKSFSPLIVLRNNSTISAAGVNSFKKAEVVDNAKPEGWDQAKPFDQIPGIKP
LPLIGNNFRFLPGGEFHKVQGLDITRRLQQKYGRITTLSGLFGVMPIVHLFDPNDFEHVL
RNEGPWPIRKNVECVTYYRQRVRPEIFKGVDGVALTQGEEWLRERSVVNKILMQPRTIEM
YVGSMNEVANDLLELMKHFAKKDPNSEMPDNFQNEL
2
YRWTLESVGVVAYNRRIGCLDINMHKDSEGSRFISAVQEFFDLMYALEYRPSMWRIYSTKKW 187
188
KRFVELMDFITEINQRYINECLATIDPNSDIPDHERSALERLFKVDQQIAVVMASDMLVAGVDT 379
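The join of the two Ips pini ESTs above relies on a 2 aa overlap (the first fragment ends in YR and the second begins with YR). A minimal sketch of that kind of suffix/prefix join follows; the function name is hypothetical and only the ends of the two fragments are used. An overlap this short can easily occur by chance, so it supports the join only together with the size argument given in the note.

```python
def join_by_overlap(left: str, right: str, min_overlap: int = 1) -> tuple[str, int]:
    """Join two peptide fragments on their longest exact suffix/prefix overlap.

    Returns (joined sequence, overlap length). Raises ValueError if no
    overlap of at least min_overlap residues exists.
    """
    max_k = min(len(left), len(right))
    for k in range(max_k, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:], k
    raise ValueError("fragments do not overlap")


# End of CB407630 and start of CB408781 (truncated here for brevity)
left_tail = "DNFQNELYR"
right_head = "YRWTLESVGV"

joined, overlap = join_by_overlap(left_tail, right_head)
print(joined, overlap)
```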
CYP15 like sequence
>CYP15C1
40% to CYP15 made from non-overlapping pieces, may be hybrid
note: CYP15A1, B1 and C1 are all probably orthologs. Each species
seems to have only a single CYP15 (not found in Drosophila).
BAAB01071346.1
BAAB01157546.1 BAAB01036905.1 BAAB01022841.1
BAAB01157214.1 BAAB01068330.1,
AADK01010670.1, AADK01007973.1
MLALIALCFILFFYIISRRHRGLCYPP (1)
2393
GPTPLPVVGNLISVLWESRKFKCHHLIWQSWSQKYGNLLGLRLGSINVVVVTGIELIKEV 2214
2213
SNREVFEGRPDGFFYTMRSFGKKL 2142 (1)
GLVFSDGPTWHRTRRFVLKYLKNFGYNSRFMNVYIGDECEALVQLRLADAGEPILVNQMFHITIVNILWRLVAGKR
()
YDLEDQRLKELCSLVMRLFKLVDMSGGFLNFLPFLRHFVPRLIGFTELQEIHNALHQYLR ()
YQEIIKEHQENLQLGAPKDVIDAFLIDMLESQDDKR
()
ATLDDLQVVCLDLLEAGMETVTNTAVFMLLHVVRNEDVQRKLHQEIDDIIGRDRNPLLDDRIR
()
MVYTEAVILETLRISTVASMGIPHMALNDAKLGNYIIPK
()
one exon not identified
326
GKRRCIGEGLARSELFMFLTHILQKFHLRIPKNEPLPSTEPIDGLSLSAKQFRIIFEPRKTFKSI* 523
CYP18A1 sequence and a related sequence
>CYP18A1
60% TO CYP18A1 ORTHOLOG AU005208 EST BAAB01190855.1 BAAB01187772.1
Bmb024372
from Li Bin
MITMLTNSKILWALWQVMNYCVSRTSVMLIIVTCTALLLTQFLKLVRDIRKLPP
GPWGPPVVGYLPFLGVRHKTFLQLARNYGALFSARLGNQLTIVMSDYKIIREAFRREEFTGRPSTPLMHTLDGL
()
GIINSEGRLWKNQRRFLHEKLREFGMTYMGNGKKLMEDRIQ
()
NEIHELIVSLHRAQGAPIDVNPLLALCVSNVICGITMSVRFSNGDVRFERLNHLIEEGMRLFGEVHYGEYIPLYN ()
YLPGKALAQEKVAKNRDEMFAFYQTLIDEHRETLDINNARDLIDVYLIEIEKAKSEGRAGELFEGRDH
()
ELQLKQILGDLFSAGMETIKSSLLWMIVFMLRNPDVKRRVQEELDAVI
GRERLPSIDDISSLPYTETTILETLRLSSIVPLATTHSPTR
()
DVQINGYKIPAGSQVIPLINCVHMDPNLWDEPNKFNPSRFIDATGKIRRPEYFMPFGVGRRMCLG
DVLARKEMFMFFSCMMHQFDLEMAEGDALPSLEGIVGATIAPKAFRVKFLARSPVPLVPTTLSADSSHLRHVGSH
CYP18B1 sequence
>39% to CYP18A1 BAAB01081335.1 BAAB01007952.1
BAAB01142048.1
BAAB01138082.1 BAAB01045663.1
note: this sequence is made from several non-overlapping contigs.
MLIYTRIISTLGMWTKLDTYIELYNGYMQDILDRKHRTMDLLFLVLLGLLFLFLKRTICYYMYLPP
(1)
GPWGVPFLGYLPFMKSSPHAMYSRMADKYGEISSIKLGNHLVVCLNSPKLVKELFS
RSDSIARPRTPLNEIMEGR
(1?)
GIVLSEGILWQKQRQFLHEKFRALGVKVWPKQRFEKFII
(0)
1739
MEIEEFITDLIKLNGAPVDPTLLLGRHVHNIICQLMMSFRFEEDDQEFGIFNEKISRGMKL 1557
1556
YGSIHASEYVRHYL 1515 (0)
KLPGKKTILDEMKRSLADISEFHANKIRERIDYRATHPYDEPADLLDYYLDNIEARKSRKRFPDIFPGVDP
(1)
EKQVVQVMNDLFSAGMETSRTTLSWTLLMMIHEPDVAAKVRAQLTETVHPGELVTLDHRPELPY
LEAVLFETLRCVSLVPLGTTHVNTT
()
SEWKVDKYVIPKGAHIIPLIGKMNNDPKVYPEPDKFKPERFLRDGQFHIPDSYMAFGVGQ
RLCLGIQLARMQLFLFFANIMNRFEFSLPEGAEMPPLEGFMAATHTPLPYSLCFHKIDN*
CYP49A like sequences
>CYP49A1
44% to 301A1, BAAB01133495.1 BAAB01150577.1 BAAB01091102.1 BAAB01198981.1
BAAB01133297.1
BAAB01093561.1 possible N-term
note: the C-term exon of this gene was changed 7/20/04 because it
matches the CYP49A1 Apis mellifera sequence better.
Bmb020819
from Li Bin
MIMINLLGAKNSMLSTCPVHIQRARSTHVVDATAFEVSSPVKPWEDVPGPKPLPLLGNTWRFTPYI (1?)
GSYSVEHIDRICVSLRAKYGKCVKMAGLLGRPDMLFVFDANEVERVFRGEDAAPHR
()
PSMPSLNYYKHTLRKDFFSAEENCAGV
()
HGDSWSAFRTKVSRVALSAGAAAQYTAPVAEVADCFVER ()
IRKIRDENMETPEDFLNEIHKWSLE
()
SLGLIALDTRLGCFEACEGSESQRLIDAV
KTFFLCVGELELRAPCGGSTPPPCSDERRCFRHYPQV
()
SVTLRHVDKALEEIKLNGSSKSLLQDLVTAAGARVAAVAALDMFLVGIDT
()
TSTAVASILYQLSSRPHVQEKIYEEVTKALQGRPMSPGDLNQMPYLKATVKEVLR
()
MYPVVIGNGRQLTKDTVICGYNIPKG
()
TQVIFQHYVMGNSEEYFKDASQFRPERWLKRTAQRHHAFASLPFGYGKRMCLGRRFAELEIHTVICK
(0)
>BAAB01052936.1
possible C-term for this gene
LLQKYKLEYHYGDLEPTRSFIARPKRALKLRFIDRI*
>CYP49A2
BAAB01014940.1 BAAB01083735.1 51% to CYP49A1 Drosoph.
BAAB01056249.1,
AADK01003288.1 first exon is a guess
MLSGACTAHYTKHIRCVIIVMKYQTTVAAALESFRNF (1)
ARCYTVMPGPRPLPILGNSWRFALGW
KPWRTKRLDLTLWCLRSLAGAGGAAKVAKLFGHPDLVFPFCAEETARIYRREDAMPHR
AAAPCLKHYKQELRKDFFGDEPGLIGI ()
HGEPWSRFRSKVSKALIAPEAARAAVPELDYVANDFVIR
()
LEHLLDLNRELPKDFLTELYKWALE
()
SVGAWALGTRLGCLNDTKTDAKEMIK
CIHGFFHSVPELELSAPLWRIYSTPAYKTYIEALDSFRL
()
LCLKRLTDKGVCAQVAKNCGQKVATILALDLMLVGVDTTAA
()
AAASSLYLLANVPRAQRALQKELDTNLPKNRILNDKDLDKLPYLKACIKEALR
(2?)
MKPVILGNGRCIQSDTTISGYKVPKG (0?)
THIVFPHYVLSNEERYFPSPHEYVPERWLRENDIAG
(gc boundary?)
VCRKQKEIGIHPFASLPFGFGRRMCVGKRFAEVELQLLLAR
(0?)
IFQKYNVLWRHPELTYSVTPTYIPNESLKFTLNKRNE*
CYP301A1 sequence
>CYP301A1
65% to Drosophila 301A1, 64% to Apis 301A1
BAAB01030444.1
BP117781 BAAB01134599.1 BAAB01155686.1 BAAB01102862.1
BAAB01093671.1
BAAB01132355.1 BAAB01006099.1
MGRALRSFAAYARPIQLQSTRNSSSCPFSKRQRSQIAPTAELNEEIFA
NARPYSEVPGPRPIPILGNTWRMVPVIG
QFDISEFAKVTKQFLDTYGRIVRLGGLIGRPDLLFVYDADEIER
MYRREGPTPFRPAMPCLVKYKSEVRKDFFGELPGVVGV
()
HGDQWRRFRSKVQRPILQPQTVKKYVAPIELVTEDFIKYMVDARDENGDLPHEFDNDIHRWSLEC
()
IGRVALDVRLGCLSPQLNSNSEPQRIIDAAKFALRNVAVLELKA 306
307
PYWRYIPTPLWSK YVNNMNFFVE ()
368
ICSRYINEALERLKTKKVTSENDLSLLERVLRSEGDPKIATIMALDLILVGIDT 529 ()
ISMAVCSILYQAATRLEQQDKMAEEIRRVLPDPSKPLSYSDLDKLHYTKAFVREVFR ()
MYSTVIGNGRTLQDDDVICGYHIPKG
()
QVVFPTIVTGNMEQFVSDPLEFKPERWLEGGGKLHPFASLPYGFGARICLGRRFADLEIQVLLAK
(0)
LLSRYRLEYHHEPLDYAVTFMYAPDGPLRLRMIER*
CYP302A1 like sequence
>CYP302A1
46% to 302A1 Anoph., BAAB01119552.1 BAAB01040509.1 BAAB01022270.1 CK534186
BAAB01132138.1 missing N-term or N-term not identified
AADK01003274.1
MFVRLTVKNNIPYRARKCVYRRASENFVGSEHASKVNEQGDNLM
8340
NFEDIPGPRSYPIIGTLHKYLPLI (1) 8411
GDYDAEALDKNAILNWRRYGSLVREKPIVNLVHVYDPDDIEAVFRQDHRYPARRSHTAMNYYRTNKPNVYNTGGL
()
LRSNGPDWWRLRSIFQKNFTSPQSVKTHVSDTDNIAKEFVEWIKRDKVSSKNDFLTFLNRLNLE
()
IIGVVAFNERFNSFALSEQDPESRSSKTIAAAFGSNSGVMKLDKGFLWKMFSTPLYKKLVNSQIYLEK
()
ISTDILIRKINLFESDDSKNDKSLLKTFLQQPQLDHKDIMGMMVDILMAAIDT
(0)
463
TAYTTSFVLYHIARNKRCQDEMFEELHTLLPKKDDEITADVLSKASYVRSSIKE 536
SLRLNPVSIGIGRWLQKDIVLKGYSIPKG
(0?)
VIVTQNMTSSRLPQFIRDPLTFKPERWMRGSPQYETIHPFLSLPFGHGPRSCIARRLAEQNICIILMR
()
LIREFEIQWAGEELGVKTLLINKPNKPVSLNFIPRSS
CYP303A1 like sequence
>CYP303A1
45% to CYP303A1 BAAB01045446.1 BAAB01180906.1
BP126135
BAAB01031702.1
Bmb002871
Bmb015732 from Li Bin
MWLAALAVLLAVCLYLFLDTLKPRKFPPGPKWTPILGCAKEVYKLREK ()
TGYLYKAVRELSLTYCKETPVLGLRIGKDRIVMVNSLEANKEMLFNEDIDGRPKGIFYQTRTWGERRGVLLT
DGELWKEQRRFLIKHLKEFGFGRSGMGETAKLEAEHIVIDVMHMIGDRGSAVIQMHN
FFYVYILNTLWTMMAGNRYNPSDPQMKILQ
SMLFDLFAAVDMVGTAFSHFPILSIVAPTLSGYR
NFIKTHKRIWKFLREELARHKDSFQPDKEDKDFMDVYIRALREHGEVNTYSEGQ
LVAMCMDMFMAGTETTSKSMSFCFSYLVREQEVQRKAQEEIDRVVGKDRVPSVNDRPN (2)
MPYNEAIVHECVRHFMGRTFGVPHRALRDTTLAGCHIPE
(0)
DTMVVSNYTNILLDENYFPEPYSFKPERFLVNGRVSLPDHYFPFGLAKHRCMGDALAKCNIF
VFTTTMLQRFSLVPVPGEGLPSLDHMDGATPSAAPFKALVIPRI*
CYP305A sequence
There is no 305A sequence.
CYP305B1 sequence
>Bombyx mori CYP305B1 mRNA AB044900 BAAB01125622.1 Bmb009083 Bmb026475
BY931349.1
EST., AADK01021449.1, BAAB01102353.1, BAAB01149133.1
BY922832.1,
BP117067 ESTs, BAAB01028853.1, BAAB01141906.1, AADK01007172.1
MLPVLVCIIVIVLICCNVIRSVIKPEKFPPGPIWYPFFGSSSIV
QQMTSKHGSQWKALLELSKQWSTQVLGLKLGRELVVVVYGEKNVRQVFSESEFDGRPN
SFFYKLRCLGKRLGVTFVDGPLWREHRQFTVKHLKNVGFGKTSMELEIQNELKLLREY
INDNKHKPIKVDSMFSSAVMNVLWKYVAGERIREDKLERLLELFYLRSKAFTLTGGLL
SQIPWCRFIIPGLSGYKLIVDLNQQISEIIEEAIKKHLNKEVQQNDFIYSFLDEMNEE
NKASFTYDQLKTVCLDLIIAGSQTTGNAVKFALLSVLRNKNIQEKIFNEIENTIGDSM
PCWADSSKLVYTSAFLLEVMRIHTIAPLAGPRRVLQDTVIDGYVIPKETTVLISLADI
HLDPNLWPDPHEIKPERFIDEKGLSKSNEHIYPFGSGRRRCPGDSLARSFVFIIFVGI
LQKYRIDCVNGVLPSNEADIGLLAAPKPFVANFVSRE
related to BX561870.1 Glossina morsitans (50%)
GTRKGITGVDGPLWYEHRHFSMKQLRNVGFGRTPMEKHIERETDDLLAYIEGLNGLPVCP
SSFLAHVVINVLWTMVANKHFAYEDKRLEKLLNLLHRRSQAFDMSGGLLSQY
PWLRFIAPKKTGYNIICQLNNELHEFFMETIEKTQAHTDARKC
CYP306A1 sequence
>CYP306A1 mRNA for cytochrome P450 monooxygenase, complete cds.
BAAB01049676.1 BAAB01150476.1 BAAB01016115.1 BAAB01136488.1 BAAB01075467.1
ACCESSION AB162964
Niwa,R., Matsuda,T., Yoshiyama,T., Namiki,T., Mita,K., Fujimoto,Y. and Kataoka,H.
TITLE CYP306A1, a cytochrome P450 enzyme, is essential for ecdysteroid
biosynthesis in the prothoracic glands of Bombyx and Drosophila
JOURNAL J. Biol. Chem. (2004) In press
41% to Drosophila m. 306A1 AV404609 Bombyx mori prothoracic gland cDNA
MDLYFIWLVTFVAGFWIFKKIKEWQNLPPGPWGLPIVGYLPFID
RYHPHITLTNLSKTYGAIYGLKMGSIYAVVLSDHKLVGDTFSKDSFSGRAPLYLTHGL
MNGNGIICAEGGLWRDQRKLITSWLKSFGMSKHSVSREKLEKRIASGVYEILENIEKT
SDAALDLPHMLTNSLGNVVNEIIFGFKFPPEDKTWQWFRQIQEEGCHEMGVAGVVNFL
PFIRHVSPSTRKTIEVLVRGQAQTHTLYASMIDRRRKMLGLEKPKGAEYAPHENLLKL
YPNGHIKCIKYSKVSPNTEHFFDPNTLIPTEGDCILDNFLLEQKKRFESGDPTALYMR
DEQLHFLLADMFGAGLDTTSVTLAWFLLYMALFPEEQEEIRKEILSVYPYDDDVDSSR
LPLLMAAICETQRIRSIVPVGIPHGCIEDAYLGNYRIPKNAMVIPLQWAIHMDPNVWE
EPEKFKPRRFLAQDGSLLKPQEFIPFQTGKRMCPGDELSRMLSCGLVSRLFRKQRIRL
ASKIPTAEEMRGTVGVTLAPPPVKYYCEPI
CYP307A1 sequence
>CYP307A1
51% to 307A1 probable ortholog BAAB01102325.1 BAAB01020228.1
Bmb008079
from Li Bin
MSSLIIVLFVFALAVYKLLRRKTVRWVKTNKYGGVETAILRTAPGPVCWPIIGSLHLLGG
HESPFQAFTELSKKYGDIFSVKLGSADCVVVNNLSLIREVLNQNGNVVAGRPDFLRFHKL
FAGDRNN
()
SLALCDWSNLQLRRRNLARRHCGPKQHTDSHARIGTVGTFESVELIQTLKGLT
SRSDASIDLKPILMKSAMNMFSNYMCSVRFDDEDLEFQKIVDHFDEIFWEINQGYAVDFL
PWLAPFYKKHMEKLSNWSQDIRSFILSRIVEQREINLDTEAPEKDFLDGLLRVLHEDPTM
DRNTIIFMLEDFLGGHSSVGNLVMLCLTAVARDPEVGRKIRQEIDAVTRGKRPVGLTDRS
HLPYTEATILECLRYASSPIVPHVATENANISGYGIEKGTVVFINNYVLNNSEQYWSEPE
KFDPSRFLEKTRVRTRRNSQCDSGLESDSERAPVGKPDVEREMLSVKKNIPHFIPFSIGK
RTCIGQTMVTSMSFTMFANIMQSFEVGVENINDLKQKPACVALPKNTYKMHLIPRK
CYP314A1 sequence
>hypothetical CYP314A1 ortholog from many non-overlapping fragments
49% to 314A1 BP179750 BP179443 BAAB01069738.1 BAAB01063661.1
BAAB01099804.1
BAAB01099804.1 BAAB01122030.1 BAAB01134118.1
MQSTSPPLLDWSCVPTLVLAVIAVVVAVTALLTRTSDAKHSCR
LPGPQPLPFLGTRWLFWSRYKMNKLHEAY
()
ADMFKRYGPVFMETTPGGVAVVSIAERTALEAVLRSPAKKPYRPPTEIVQMYRRSRPDRYASTGLVNE
1435
QGEKWYHLRRNLTTDLTSPHTMQNFLPQLNTISDDFLELLNTSRQSDGTVYAFEQLT 1265
1264
NRMGLE
AVCGLMLGSRLGFLERWMSGRAMALAAAVKNHFRAQRDSYYGAPLWK 783
784 FAPTALYKTFVKSEETIHA 840
2337
VPDRIVSELMEEAKSKTTGMAQDEAIQEIFLKILENPALDMRDKKAAIIDFITAGIET 2510
87
LANSLVFLLYLLSGRPDWQRKINSELPPYAMLCSEDLAGAPSVRAAINEAFR 242
243
LLPTAPFLARLLDSPMTIGGHKIPPGTFVLAHTAAACRREENFWRAEEYLPERWIK 410
411 VQEPHAYSLVAPFGRGRRMCPGKRFVELELHLLLAK
(0)
IMQKWRVEFDGELDIQFDFLLSAKSPVTLRLVEW*
CYP315A1 sequence
>CYP315A1 mRNA for homolog of shadow, complete cds.
ACCESSION AB167737 BAAB01119643.1
Niwa,R., Matsuda,T., Yoshiyama,T., Namiki,T., Mita,K., Fujimoto,Y. and Kataoka,H.
TITLE CYP306A1, a cytochrome P450 enzyme, is essential for ecdysteroid
biosynthesis in the prothoracic glands of Bombyx and Drosophila
JOURNAL J. Biol. Chem. (2004) In press
MHRFPSMSSIRSAVRSRNSNRCSMSTKPHKSLRTIDEMPHKKSL
PIIGTKFDLFSAGGGKNLHKYIDMRHKQLGPIFYERLTGKTKLVFISDPTHMKSLFLN
LEGKYPAHILPEPWVLYEKLYGSKRGLFFMDGEDWLINRRIMNKHLLREDSDVWLRAP
IRTAVFHFICNWKLRAQSGNFSPNLESEFYRFSTDVILAVLQGNSALLKPTPEYEMLL
LLFSEAVKKIFSTTTKLYALPVEFCQRWNLKVWRNFKQSVDDSISIAQKIVYEMLHTK
DAGDGLVKRLKDENMSDELITRIVADFVIAAGDTTAYTSLWILFLLSNNTEILTEMND
NDQYVKNVVKEAMRLYPVAPFLTRILPKQCVLGPYLLEEGTPVIASIYTSGRDEQNFS
KADQFLPYRWDRNDQRKKDLVNHVPSATLPFAFGARSCIGKKMAMLQMTELISQIVKN
FDLKSMNNSDVDAVTSQVLVPNKDIKVLILPRSISK
CYP324A sequence
>CYP324A1 45% to 324A1 Trichoplusia ni BAAB01098782.1 BAAB01016062.1
BAAB01088903.1
BAAB01068119.1 BAAB01136435.1
Bmb031911,
Bmb031912 from Li Bin
MLFIIIFIFILLLLTWWLIRWQQVKSYWAARNVPHEPPHPVLGSLTFLQKENP
()
SIWLIKLYKKFPFPYIGIWLFWKPALIINSPELARQILTKDADTFRNRFSNAGKSDPVGALNLFMIN
()
DPVWSSVRRRLTPVFTKLKLQALYPILIRKSNDLKKRIKEDTEKNIKINLR ()
SLFVDYSTDILGEAAFGVSSNSITTGESAMREVTKDFMKFDWLRGLQWSCIFFFPELADFFR
(2)
CKLFPKESLEILRKIYRTMVAERSKSQSISGKSKDLLDALMAMKIEAAAENE
()
VYNEDLLFAQATLFVQAGFETTSSAITFAIYELAYNPEIQ
()
ERLYREIVEAKQKMEGNELDGVVLSNLQYLNCVIN
()
ETLRKYPSLGWLDRVSSQSYKVDDTLTVPAGTAVY
VNVAGIQSDPQLFPKPEEFIPERFNTDNNNIKPFTFIPFGEGPRQCI
()
GIRFGYQAIQFGLSAIILNFKLRPIEGSPLPNNCHIESKGFVYTADHPLHIQFVPRN*
CYP341 sequences, weakly similar to CYP325
These sequences are closely related and are weakly similar to CYP325.
There appear to be six genes.
5 exon 1 seqs., 6 exon 2 seqs., 6 exon 3 seqs.,
6 exon 4 seqs., 7 exon 5 seqs., 6 exon 6 seqs.,
5 exon 7 seqs., 6 exon 8 seqs., 6 exon 9 seqs.
>CYP341A1
9 exons 39% to 4C3 BAAB01068196.1 BAAB01181661.1 BAAB01053162.1
BAAB01098630.1
BAAB01068157.1 BAAB01166031.1 BAAB01162916.1
Bmb018467
from Li Bin, AADK01007855.1
MIVILLIVLVLGWFSVFRYRRRNMYKLAAAIPDVDKHIPLLGIAHKFTGNTE
()
VLSNPVDSEVVLKTCLEKDDLHRFIRAIIGYGGIFAP ()
VSIWRRRRKIMVPAFSPRIVQSFVGIISEQSEKLASNLGKRVGKGMFSSWPFLSAYTLDSVC
()
ETALGVKINAQGDKDSSFLKSMNRILNIVCMRIFHLWLQPVWLFKLFPVFNEHQSCIEMLHDFVDK
(0)
VIQNKREEIKRENNSKTEVDYEY
(1)
NLGSYKCKTFLDLLITLSGAEKGYTNIELREEVLTLTVAGTDTSAVAIGFTLELLSKYPEIQEKVYKE ()
LREVFCDSERPLIKEDLEKMKYLERVVKESLRLFPPVPFIIRKVLEDMTL
(1)
PSGSVLPAGSGIVVSIWGIHRDPKYWGPEAEHFDPDRFLPERFNVEHPCCYLPFSSGPRNCL
(1)
1097
GYQYAMISIKTSLSAILRRYKVVGEPEKGPVPRIRVKLDIMMKSVEGCQVALERRPTK* 921
>CYP341A2
Papilio xuthus
MFLCLLCLSVVLGMVLFKLKRRRLYRLASKIPGSDDELPFIGLA
HKFTGTTEDILNSLQKYSYEAMKNNGILRGWLGHILYFIVVDPVDVEVILKTSLEKDD
LHRFIRNVIGNGLIFAPVSIWRRRRKITVPAFSPKIVDTFMEVFAEQSEKLVSVLAAC
AGNGYIAMEPYLCRYTLDSVCETTMGITTNAQNNPNAPYLKALKNILNLVCERIFHLW
LQPDWLYKFFSQSKSHQKYTKEMQGFVDEVIQNKRREIKKEKDLKSEVDRNFGLSNYK
TQSFLDLLIEFSGGENGYTDLELREEILTLTIAGTDTTGISIGYTLKLLAMYPKVQDK
LYQELLDVFGTSDRRIVKEDLSKLKYLERIVKESLRLYPPGPFIIRKVLEDISLPSGR
VFPAGSGAAVSIWGLHRDPKYWGPDAEVFDPDRFLPERFNLKHACSYIPFSSGPRNCI
GYQYALMSMKTVLSAIVRRYKIMGEESGPVPHIKSKIDIMMKAVDDYKICLEKRFK
>CYP341A3
AADK01027984.1 96% to 341A1 exons 1,2
MIVILLIVLVLGWFSVFRYRRRNMYELAAAIPDVDEHIPLLGIAHKFTGNTE 2243
4155
PRDLELVLKTCLEK 4196
>CYP341A3
AADK01029147.1, BAAB01096639 exons 4,5
1917
ETAMGVKVNAQGDKDSNFLISMNRILNIVCMRIFHLWLQPAWLFKLFPVFHEHQKCKKLLHDFVDE 1720
1030 AIQKKREEIGTENNSAIEVDYKY (1) 962
>CYP341A3
BAAB01031047.1 43% to 4G15 342-572 45% to 4AU1, exons 6-9
AADK01017121.1
4094
LGSYKSKTFLDLLITLSGAEKGYTNIELREEVLTLTVAGTDTSAVAIGFTLELLSKYPEIQEKVYKE 3894
3561
LCEVFGDSERPLVKEDLEKMKYLERVVKESLRLFPPAPFIIRKVLEDITL 3412 (1)
2636
PSGVMVSIWGIHRDPKYWGPEAEHFDPDRFLPERFNVEHPCCYMPFSSGPRNCV 2451
2226
GYQYAMISIKTSLSAILRRYKVVGEPEKGPVPQIRVKLDIMMKSVNGCQVALEKRPTK* 2050
>CYP341A4
35% to 4G1 BAAB01061620.1 (C-term), BAAB01046550.1 BAAB01151967 BAAB01098630,
BAAB01016786.1 exon 3
BAAB01023249.1
BAAB01137180.1
BAAB01039204.1
BAAB01138999 42% TO 4G15 341-572 45% to 4AU1
AADK01003016.1
whole gene plus adjacent gene
88% to
341A1
8810
MIVILLIVLVLGWFSVFRYRRRNMYKLAAAIPDVDERIPLLGIAHKFTGNTE 8965 (1)
11577
ALLNPVDMEVVLKTCLEKDDLHRFMRVVIGYGGIFAP 11687 (1)
14661
VSIWRRRRKIMVPAFSPKIVQSFVGIISEQSEKLASNLGKCVGTGKFSSWPFLNAYTLDSIC 14846 (1)
15391
ETAMGVKVNAQGDNDSIFLKSLNRMLNIVCMRIFHLWLQPTWLFKLFPVFHEHQKGKKLLYDFVDE 15588(0)
16204
AIQKKREEIRTENNSGTKVDYKY (1)
17580 DLGSYKSKTFLDLLITLSGAEKGYTNIELREEVLTLTVAGTDTSAVTIGFTLELLSKYPEIQEKVYKE (2)
18316
LCAVFGDSERPLVKEDLEKMKYLERVVKESLRLFPPVPFIIRKVLEDITL (1)
19211
PSGNILPAGSGVVVSIWGIHRDPKYWGPEAEYFDPDRFLPERFNVEHPCCYMPFSSGPRNCL (1)
19622
GYQYAMISIKTSLSAILRRYKVVGEPEKGPVPQIRVKLDIMMKSVNGCQVALERRPTK* 19798
>CYP341A5
BAAB01076430.1 Length = 917 exon 4, AADK01008138.1 only
one exon in 13kb
150
ETAMGVNINAQAKADSEFLKSLNRLLNVICERIFHLWLHPDWLFKQLPVYDEHQKCIKVLHEFIDQ 347 (0)
>CYP341A5
AADK01003016.1
exon 5,6,7,8,9 BAAB01039204.1 BAAB01036273.1
BAAB01189227.1 BAAB01150672.1 BAAB01106539
note: mariner transposase is upstream
Bmb022600 from Li Bin
570 VIQNKREEFQIEKISKIEVDTQY 638
1347
DLGLYKRKTFLDLLITFSGDEKGYTNVELREEMLTLTVAGTDTSAVAIGFTLELLAKYPKIQDKVYQE 1550
2502
LYEIFDGSERALVKEDLEKMEYLDRVVKESLRLFPPVPFIIRKVLEDTRL 2651
3624 PSGNVLPAGSGIMVSIWGIHRDPKYWGPEAEHFDPDRFLPERFNLEHPCCYMPFSSGPRNCL
3809
4037
GYQYAMISIKTSLSAILRRYKVVGEPEKGPVPQIRVKLDIMMKSVNGCQVALERRPTK* 4213
>CYP341A6
AADK01011365.1, 95% to BAAB01046229.1, same as BAAB01090513
same as BAAB01110300.1, Bmb029009 from Li Bin
AADK01018246.1 exons 2-7
AADK01015317.1 same as AADK01032246 at aa level, 97% at nucl. level
MIALLLILVGLGWVLVFRYRRRNMYKLAAAIPTPDETNLLVGVAHKMMGNTE
2420
VSSNPFDLEVILKTCLEKDDSHRFFRPGIGNGGIFAP 2530
3077
VSIWRRRRKIMVPAFSPKIVHSFVGIISEQSEKLVSSLSKCVGKGMFSSWPFLSAYTLDSVC 3262
3934 ETAMGVNINAQAKAESELLKSMNRFLNVICERMFHVWLHPDWLFKQFPVFNEHQKCIKVWHEFIDQ
4131
6718 VIQNKREEFQMEKISKDLEVDTHY 6789
7230
DLGSHKSKTFLDLLITFSGDEKGYTNVELREEMLTLTVAGTDTSAVGIGFTLELLAKYPK IQE 7418
8335 LYGIFDGSERALVKEDLEKMKYLERVVKESLRLFPPVPFIIRKVLEDTRL 8484
>CYP341A6
BAAB01059505.1 exons 8,9
BAAB01018260.1
exact overlap 15 aa
44% to
CYP325F 395-482
2796
SGNVLPAGSGIFVSIWGIHRDPKYWGPEAEHFDPDRFLPERFNVEHPCCYMPFSSGPRNCL 2978
1565
GYQYAMISMKTSLSAILRRYKVVGEPEKGPVPRIRVKLDIMMKSVDGCQVALERRPTKLC* 1383
>CYP341A6?
3 aa diffs
AADK01021477.1
= BAAB01167086.1 exons 8,9 only 2 exons in 6kb
4834
ASGNVLPAGSGIVVSIWGIHRDPKYWGPEAEHFDPDRFLPERFNVEHPCCYMPFSSGPRNCL (1) 4649
4404
GYQYAMISMKASLSAILRRYKVVGEPEKGPVPRIRVKVDIMMKSVDGCQVALERRPTKL * 4225
>CYP341A7
AADK01032246.1 = BAAB01046229.1
Note exons 1,5,6 same as AADK01011365, this seq might be a hybrid
If 5 and 6 are removed this could join with AADK01003016.1
2328
MIALLLILVGLGWVLVFRYRRRNMYKLAAAIPTPDETNLLVGVAHKMMGNTE 2483
1432
VSSNPFDLEVILKTCLEKDDLHRFFRPAIGYGGIFAP
1542
1804
VSIWRRRRKIMVPAFSPKIVHSFVGIISEQSEKLVSSLSKCVGKGMFGSWPFLSAYTLDSVC 1989
2671
ETAMGVNINAQAKADSEFLKSLNRLLNVICERIFHVWLHPDWLFKQFPVFNEHQKCIRVLHEFIDQ (0) 2868
VIQNKREEFQMEKISKDLEVDTHY (1)
DLGSHKSKTFLDLLITFSGDEKGYTNVELREEMLTLTVAGTDTSAVGIGFTLELLAKYPKIQENIFQE
&&&&&&&&&&&&&&&&&&&&&&
>CYP341B1
AADK01018717.1 exons 1,2, 43% to AADK01032246
BAAB01172821.1,
BAAB01064393.1
4511
MLVQLILCIFVALWLLSQRYKKKEMMKVWEQLKNDYTALPLIGHAYMFFGSQE 4353
2836
VISEPVMAEYVLKTCLEKDDILKCSRFLVGNGSVFAP
2726
>CYP341B1
BAAB01092271.1 Length = 1739 25% to 4S4 ETAM exon 4
GC
boundary
708
ETTLGVKVNAQGNSEQPFLRAFEIICRLDSSRFCQPWLHNDTVYKMMPQYQQHKDSKDFLCNFIDQ 905 (0)
These two sequences probably join with one more exon between
>CYP341B1
44% TO 4M8 BAAB01210990.1 47% to 4H23
41% to
4AA1 anoph
DAHRNGLKSFLELLIESSGGNKGYTDLELQEETLVLVLAGTDTSAVGVAFTSVMLSRHQDVQEKVYEE
()
LKEVFGDSDRPIVADDLPKLKYLEAVIKETMRLYPPVPLIVR
KVDKDVTLPTGLTLVKNCGIVINIWAVHRNPLYWGDDADIFRPERFIDTPIKHPAAFMAFSHGPRACI
GYQYATMSMKTATANLLRHFRLRPAEPTDPTYKHEKNKPLRVKFDVMMKDMDNFTVQLEPRYK*
>CYP341B1
joined sequences BAAB01092271.1 BAAB01210990.1
Bmb003145
from Li Bin
AADK01012599.1,
BAAB01092272
2366
(1) VSIWRPRRKILAPTFSPKNLTHFVDIFSKQSSYMVKYLGKAAKTGNFSIWKYINTYSMDSIC 2184
708 ETTLGVKVNAQGNSEQPFLRAFEIICRLDSSRFCQPWLHNDTVYKMMPQYQQHKDSKDFLCNFIDQ 905 (0)
VIKSKRNSLEEQKDSTEADQ(1)
NAHRNGLKSFLELLIESSGGNKGYTDLELQEETLVLVLAGTDTSAVGVAFTSVMLSRHQDVQEKVYEE
()
LKEVFGDSDRPIVADDLPKLKYLEAVIKETMRLYPPVPLIVR
KVDKDVTLPTGLTLVKNCGIVINIWAVHRNPLYWGDDADIFRPERFIDTPIKHPAAFMAFSHGPRACI
GYQYATMSMKTATANLLRHFRLRPAEPTDPTYKHEKNKPLRVKFDVMMKDMDNFTVQLEPRYK*
&&&&&&&&&&&&&&&&&&&&&&
>CYP341C1
BAAB01149680.1 36% TO 4H18 AA 111-182 C-HELIX exon 3
594
VNIWHRRRKLLNPHFGTKNQNNFMETFIKQSAVLVNNLRKEADNGTFSVWDYLTAYTLDSVC 409
>CYP341C1
44% to 4d 299-345 I-helix 77% to 4H8 61% to gene 1 above
BAAB01135732.1,
AADK01005588.1
EATLGVQMNSQAHSNLEFLRSFDVCSSLGAARICQPWLHSDIIYHRLQRYKTYQKNTEYVLDFVKQVSIF
tgt (1)
592
PTSDNGMKTVLELLILNKTFNDVELQEEAFVMIIAGTDTSAIGISFTLLMLARYPDVQEKVYQE 783
IQELFGDSERPPEVEDIHRLLYLDAVIKEAMRLYPPAPVIIRKVERETKL
(1) cgt
&&&&&&&&&&&&&&&&&&&&&&
>55%
to CYP341B1 BAAB01045792.1
45% to
325J 66% to BAAB01210990.1, AADK01018755.1
(1)
PSGLILPRDIGILIPIWSVNRNPKYWGDDADVFRPERFLDGTKKHPTAFMTFSQGPRACL
GFKFAMNSAKAALASILRHYRIKPPSELTSMSPGQYPPIRVKFALMTRDVDNFRIQLESRS*
>seq not assigned AADK01004743.1 one aa diff to BAAB01133870 exon 5, only one exon in 17kb
10917
AIQNKRQSIKSTNNSYYRVFNLY
10849
>CYP341-un1
pseudogene fragment
42% TO
313A4 aa 398-442 perf to heme BAAB01044466.1, AADK01012171.1
1181 VPRGTVCAVSAMVMGRARRLWGPDAAEYRPERWLAPPHAQPAAFLAFSYGRRACI 1017 ()
CYP332A sequence
>64%
to CYP332A1, 37% to 324A1, 32% to 6G and 6d
CK502566
Bombyx mori cDNA corrected C-term
CK518690
Bombyx mori cDNA
CK526599
Bombyx mori cDNA
CK522682
Bombyx mori cDNA
BAAB01119140.1
BAAB01018054.1 BAAB01100160.1 BAAB01098663.1
BAAB01157576.1
MDVSGPLQLFLIFIIFCLTA
IYLLFNRNYQYWEKRGVPYEKPFFLFGSLSFILRKSFWDYFYELSKRHTGDYVGIFLGFK
PTLMVQTPEIARRILVKDNAHFNNRYCYSSYGVDPLGSLNLFTVK
()
NPKWSNIRHELSPMFTSLRLKTICELMNVNAKELVLKIQRDYIDNNEDVNLK
()
ELFSMYTSDTVGYTVFGLRVSALNDPSSPLWFITNHMVKWDFWRGFEFTAIFFVPALARFLR
()
LKFFSQPATEYIMRLFRTVVDERKKTNQNTDKDLVNHLLKLKENLKLGADI
()
KLADEIMMAQAAVFILGSIETSSSTLSYCLHELAYHPEEQQ
()
KLFEEVDDAIKETGKEILDYENLQELKYLSACILETLRKYPPVPHIDRVC
NKTYKLNDELTIEEDIPVFVNVLAIHRNEKYYPEPDQWRPERMIGVTDNDNLQYTFLPFGDGPRFCI
()
GKRYGLLQMRAAIAQMIHKYKFEAAEPHSTPSDPYSVILSPKSGGRIKFVPR*
CYP337 new family sequences
>CYP337A1
60% to 337A2 29% TO 6A21 30% to 6P3 36% to CYP321A1
BAAB01181176.1
BAAB01007828.1 BAAB01049389.1
Bmb030264
from Li Bin
MLLLLLISFTLLLIFFWKQNNYWKSRNVKQVTGTLFKFTFGSRSLPEYYKEIYDKHNESQ
IGIYLGRRPAIILKDLRDIQAVLAGDFQSFHSRGIILSEKETLADSILFIDDLPRWKILR
QKLSPAFSSLRLKTMFEGIERSARDFVEFIENSGNDQDLEEMPFNAIYKYTTGSIGAAVF
GVDVDQNTLDTPALNITRKALEPTLKSIMTFFLAGTFPRLIKWLDMMNFDNYETSFIDAV
KKVLENRRAGEKQYDFIDVCLELQNHEVLRDLVTGYEIVPTDELLAAQAFFFFVAGADTT
ANVMHFTLLELSSNPSVLKKLHAEIDEVFQDGKKTLNFEDMDKFKYLEMVVNETMRKYPP
IGLLQRICTKETNLPSNNLRISKDTIAVVPVLALHRDERFYPKSDAFDPQRFAPENFNEI
NKFSFLPFGEGNRVCI
()
GAKFARLQLRAGLAWLLRKYTLVPQDYKPVKFERSPFAVRDTKAKYRLINRTN
>CYP337A2
BAAB01100580.1 BAAB01137171.1 new family 60% to 337A1
31% TO
CYP6as 35% to CYP6AB1
BP123442
EST 37% to CYP321A1
MLPVLVIVSLVAILILYWQSSNYWKKRNVREVNGTVLKFTFGNCSLPEYYKQIYDKYNEN
QIGFHLGASPALVLRDLQDVQAVLASNFQSFYRRGFAVNDADVLGGNMLFLDDLPRWKIL
RQKLSPAFSSLRLKAMYEAIEKTARDFTDYIATDERAIKEPFDAVLKYTTASIGVAVFGL
DEKGESLIDLPLSHVAGNALKPSLKANIVFFIGSTFPRLFKWLNMTFFGEYEDVFIGAVK
KVLKNRRKLDKRLDIVDVYLDMQSSGRLRDVVTGFEMEPTDEVIAAQSFFFYVAGADTTA
NAIHFILLELSANPHVLNKLHAEIDTVLPKGTEVLTFEDIDRLTYMDMVISEAMRKYPPI
GFIQRLCTKDAILPSNNLRISKDTVTVVPILAIHRDERFYPNPDVFDPERFTPEKIKERN
KFSYLAFGEGNRICIGARFSRLQVKACITWLLRKYTLKPQEYKPERFERSAFSLRDTKSK
YEFIRRTN
CYP338 new family sequence
>CYP338A1
new family LIKE CYP6
BAAB01074380.1
(one exon covers first 408 aa)
C-term
lies beyond a repeat sequence. No ESTs
cover it
AADK01008949.1
MFFESFLLNLSVLIVLIVALVFDYVTKFFSYWYVRHVPYKTPIPFFGSDYHRVLGLTNST
DEVVKLYNEHPGDKFVGRLKNRIPDLIVKDPDAIKRMLSTDFAYFHSRGLGLDKSRDVCI
RNNLFYADAEKWTLLRQGLEAVLNGLCREFDIHACLSEANGDTNVQQLLSVVLDSVFDNL
LLGDEGTSIKELRTTLQKRSMIVKLKSYLKNIFPSIYVTFGLSTLPNNVLKNTRKYMESS
KLQRLIDDSGYMHQVSLKDKHVYAFENEFASSTLAL
FVTEGYIPCLYTLTALFYELAVNP
QIQEKARNSIEKDKGVNYLDAIIKETMRLHPSHSIISRQCVKMYQYPDSNLTIDRNVTIN
VPVEAIHKDKEHYENPEVFNPERFFDDHGPTKHSYSYLPFGAGPRKCI
CYP339 new family sequence in the mito clan
>CYP339A1
new family in mito clan,
CK534983
CK537464 CK535691 BAAB01021114.1 31% to
301A1 anoph. BAAB01141557
BAAB01170021.1
BAAB01154229
note: C-terminal exon was changed 7/20/04. This is still a guess.
(2
agt)NGEEWSRQRSIIYTPLHNAVTYHIQGIDDICEYFSQKIYNMRNHQDEIPKDLYKDLHKWAFDCL (1)
PGLQESNSYKDLHKWALDCL
GLILFSKKFSMLDTDLVYSQCDMSWMYHSLEKATEAVIKCESGLQWWKILSTPAWYSWVKYCDSWD
()
SLIGKYVLEAEQAISYKAKEIEEQY
PNSNIWINARLLGQEKMNPEDIATVIMDMWLMGVNT
ITSSTSFLLYYLAKYQKAQKILYKEIQENFPEQKIMDLTKIREQTPYLQACIKETLR
()
LTPPIPVLTRILPKNITLDKYN ()
IPRGTLIIMSTQDASLKESNYDDANTFCPERWLKSDSNEYHLFASIPFGYGARKC
LGQNIAETMMSLLTVK (0?)
>BAAB01062474.1
Length = 869 C-term exon for CYP49/301 like gene
217 MIQAYKIEYRREPLEYHIHPMYTPNGPIRLRLVER 321
CYP340 new family only 30-31% identical to CYP4s or CYP325s
There are at least 8 genes in the following set. Most of the individual exons
do not exist on overlapping contigs, so they cannot be joined in a single complete gene.
5 exon 1 sequences, 5 exon 2 sequences, 7 exon 3 sequences, 8 exon 4 sequences,
8 exon 5 sequences, 9 exon 6 sequences, 13 exon 7 sequences,
8 exon 8 sequences, 9 exon 9 sequences, 5 exon 10 sequences
This first gene is complete
>CYP340A1
old CYP340Aaa CYPnew1 new family BAAB01050176.1
BAAB01056719.1
revised to include exon 2 BAAB01160534.1 (exons 3,4,5)
BAAB01199634.1
exon 3 BAAB01056719.1 exon 3 BAAB01196580.1 exon 6
BAAB01158445.1
exon 6
BAAB01006148.1
exon 7 with one stop codon
BAAB01212205.1
exon 8 BAAB01008628.1 exon 8 BAAB01212925.1 exon 9
BAAB01103225.1
exon 10 BAAB01050816.1 exon 9 1 aa diff
BAAB01009268.1
exon 10 1 aa diff and a frameshift
BP124291
MFIPIIVVVVCVLLLFYSIAERHSNVPLCDNYLPVIGHTHMIIGG
SGLLQTVKYACEETNKKGGVAILKLGLSNYY
VITDPEDNLTVANGTLQKHFVYQFASNWLGDGLITSS
GETWKRHRKLLNPAFSQQMLNIYTVVFNRKSRNLISAIEIQMKSGPVLIDTVFREMALNTLL
STAFGIEEEDSDFNKKYIHAVDVILALLTRRFQNPLLHYPFFYKLSALKKKEEEVIETILTASKK
IIKNKREALNKERSNENGYAT
ERKFKSMLELLLKDSDGDALTDEEIRDEVDTLILAGSDTSSQLTLVVVMVLGSYPEIQDKVYQE
VASVCGVSDTDVEKHQHPRLVYTEAVLKETLRLYPTVAVVLRKPENEIKL
KNYTIPANSNCVLGIYGLNRHPVWGPDAHTFRPERWLEPGGVPGNPNAFAGFSVGKRNCI
GKTYALISTKIILAHLVRRYKVTADISKIEFKMDVIMTPSDNCYVDFELRK
>CYP340A1-de9b
BAAB01213048.1 Length = 2727 new exon 9 94% to CYPnew1bb
750
NYTIPSDSNCVLGIYGLNRHPVWGPDAHTFRPERWLELGGVPDDPNAFAGFSVGKRNCI 926
>AADK01017983.1
nearly identical to BAAB01213048, one frameshift and 1 aa diff
2 aa diffs to CYP340A1-de9b
231
NYTIPSDSNCVLGIYGLNRHPVWGPDAHTFRPERWLELGGVPDDPNAFAGFT 386
388 VGKRHCI (1)
>CYP340A2
AADK01003526.1 identical to BAAB01106541.1 (2 genes)
nearly identical to BAAB01003905.1 (3 diffs in N-term)
identical to CYPnew1dd
78% to 340Aaa, 92% to 340Acc, 66% to 340Abb
old CYPnew1dd 91% to CYPnew1cc
BAAB01128878.1
exon 7
BAAB01171305.1
Length = 1430
BAAB01040719.
Length = 645 exon 3
BAAB01163949.1
Length = 1132 exons 3,4,5
BAAB01116297
BAAB01160956
64%
to 340A3, 78% to 340A1
1679
MFILIVLVVVCVVLLFYSIKKSNSNVPLCDNYLPVIGHTHMLIGGGK (1) 1545
490 KLLRTVKYACEEANKKGGVVILQLGLENYY (1) 401
VITDPEDNLTIANGALQKHSFYQFASNWLGDGLVTSA
GETWKRHRKLLNPAFSQQMLNIYTVVFNQESKNLISAIEIQMKSGPVLITAALKEMALKTLL
STAFGIEVEECDFNQKYMHAVDEVMAVLTRRIQNILLHNSFVYKLSALKKKEDELIETITTMSNK
VINSKRRALKNKQESHENGCAS
120
EKKTQSMSDLLLKGLDDEAFTDKEIRNEVDTLIFTGSDTSSQIMTVVVMLLGSYPEIQDKVYQE 311
IKSVCGDSDADVDKLQHPR 8576
8575 LVYTEAVLKETLRLYPITPVVLRKTENEIKL 8483
7374
SYTIPANSNCMLGIYGLNRHPVWGPDAHTFRPERWLELGGVPDDPNAFAGFSVGKRNCI 7198
6278
GKTYALISMKTTLVHLLRRYKVTADISKIEFKLDALMVPSDNCYAKFESRK* 6123
>CYP340A3
old CYP340Abb CK503564 CK511251 CK504721 BAAB01089735.1
new
family CYP4 like
BAAB01006192.1
BAAB01077706.1 BAAB01092224.1 67% to CYPnew1
complete
3080
MLIPVFLIILCVILYYFYWRDKSINNVPLCDKYLPIIGHTHLFIGNTK (1) 3223
3906 ILRTVKSICEDTNDKGGVTVARLALQDYY 3992
VITDPVDNLKIANGTFLKHFAYRFTSHWLGDGLITSS
(1)
GETWKRHRKLLNPAFNQQILNSFIGVFNDESRKLVSEIGNEMAKGPVEVTTPFRQMAFRLLF
()
LTAFGIPVEDSDFNQKYIHSVDKLLSMLIYRFQNVLLHNSFIYKISGLKKKEEQMVETVHAMSNM
(0)
IIKRKREASKNKSPTDEHCYDT
(1)
TAHRYKSILDLLLKGLDGDALTDKEIRDEVDTIIVAGYDTSSWVLTLVMMALGSYPEIQNKVYQE
()
VSSMFGDSEADVDKSHYPGLVYVEAVLKETLRLYPIVPIALRQTESDIELK
()
NYTIPADSNCVLGIYGLNRHPVWGPDAHTFRPERWLEPGGVPDDPNAFAGFSVGKRTCI
(1?)
GKVYALMSMKTTLVHLIRRYKVTADISKVEFKMDVLMTPVNNCNVKFELRK*
(-) strand second gene on AADK01020942.1, 2 aa diffs to CYP340A2
1237
MFILIVLVVVCVVLLFYSIKKSNSSVPLCDNYLPVIGHTHMLIGDGK (1) 1106
>CYP340A4
old CYP340Acc CYPnew1cc BAAB01206095.1 BAAB01104811.1 BAAB01080812.1
BAAB01021854.1
91% to
CYP340A2
AADK01031494.1
identical to exon 9 adds exon 10
VITDPEDNLTVANGELQKHYFYQFTSNWVGDGLVTSA (1)
GETWKRHRKLLNPAFSQQMLNIYTAVFNRESRNLISAIEIQMKSGPVLITAAFKEMALKTLL
(1)
STAFGIEEEECDFNQKYMHAVDELMAVLTRRLQNILLHNSFIYKLSALKKKEDELIETIMAMSNK
(0?)
AINSKRRALKNKQESHENGCAS
(1)
EKKTQSLSDLLLKGLDDEAFTDKEIRNEVDTFIAAGSHTSSQLMTVVVMVLGSYPEIQDKVYQE
()
IKSVCGDSDADVDKLQHPRLVYTEAVLKETLRLYPIAPVVLRKTENEIKLK
NYTIPANSNCLLGIYGLNRHPVWGPDAHTFRPERWLEPGGVPDDPNAFAGFSVGKRNCI
2581
GKTYALISMKTTLVHLVRRYKVTADISKIEFKMDVIMVPSDNCYVKFESRK* 2426
&&&&&&&&&&&&&&&&&&&
>CYP340A5P
AADK01040486.1 5 aa diffs to CYP340A1
more seq is known, but confidential
1341 VASVCGVSDDDVEKHHHPRLVYTEAVLKETLRLYPTIPLVLRKPENEIKL 1490
&&&&&&&&&&&&&&&&&&&
>BAAB01190249.1
exon 6 85% to CYP340A1
144
IIKNKRESLNKERSNENSYTT 206
>BAAB01092011.1
Length = 2603
3 aa diffs with CYP340A1
1939
MFIPIILVVVCVLLLFYSIAERHSNVPLCNNYLPVIGHTHLIIGG 2073
>BAAB01106113.1
Length = 2553
90% to BAAB01003905.1, 1 aa diff to CYP340A2
2098
MFILIVLVVVCVVLLFYSIKKSNSSVPLCDNYLPVIGHTHMLIG 2229
CYP340A3
facing away from each other
60% to BAAB01003905.1
255
MLIPVFLIILCVILYYFYWRDKSINNVPLCDKYLPIIGHTHLFIG 121
&&&&&&&&&&&&&&&&&&&&&&&&
>BAAB01199634.1
Length = 696 exon 2, 2 aa diffs to CYP340A1 same?
25
SGLLQTVKYACEESNKKGGVAILRLGLSNYY 117
&&&&&&&&&&&&&&&&&&&&&&&&
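Notes such as "2 aa diffs to CYP340A1" can be re-checked by a position-by-position comparison of equal-length exon peptides. The sketch below (hypothetical function name) compares exon 2 of CYP340A1 with the BAAB01199634.1 exon 2 listed above. It assumes the two peptides are already aligned without gaps, which holds here but would not for fragments containing indels or frameshifts.

```python
def aa_diffs(ref: str, query: str) -> list[tuple[int, str, str]]:
    """List (1-based position, ref residue, query residue) for each mismatch.

    Assumes an ungapped alignment of equal-length peptides.
    """
    if len(ref) != len(query):
        raise ValueError("peptides differ in length; align them first")
    return [
        (i, r, q)
        for i, (r, q) in enumerate(zip(ref, query), start=1)
        if r != q
    ]


cyp340a1_exon2 = "SGLLQTVKYACEETNKKGGVAILKLGLSNYY"
baab01199634_exon2 = "SGLLQTVKYACEESNKKGGVAILRLGLSNYY"

diffs = aa_diffs(cyp340a1_exon2, baab01199634_exon2)
print(len(diffs), diffs)
```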
>BAAB01023413.1
Length = 2453 exon 2, 66% to BAAB01003905.1
this seq not assigned
1235 SELLRSVKYICGETNKKGGVARCKLGLDYY
1146
&&&&&&&&&&&&&&&&&
>BAAB01038139.1
Length = 841 exon 4 new 2 aa diffs to CYP340A1
118
GETWKRHRKLLNPAFSQQMLNIYTAVFNRKSRNLISAIEIEMKSGPVLIDTVFREMA 288
&&&&&&&&&&&&&&&&&&&&
>CYP340A6
BAAB01162196.1 Length = 1626 exons 3,4 1 aa diff to Bmb020206
Bmb020206
36% to 313B1 80-182 BAAB01162196.1 exons 3,4
1 aa diff to BAAB01162196.1
all on AADK01002164.1 single gene
Bmb020204 33% to 312A1 205-384 90% to Bmb035473 BAAB01073561.1
BAAB01209311 4 aa diffs to Bmb037924
95% to CYPnew1dd
1401 VITDPEDNLTVANGALQKHFFYQFASNWLGDGLVTSS 1291
(1) 1 aa diff to 340A6
5391
VITDPEDNLTVANGALHKHFFYQFASNWLGDGLVTSS
5281 CYP340A6
1016
GETWKRHRKLLNPAFSQQMLNIYTVVFNRESRNLISAIEIQMKSGPVLISAAFKEMALKALL 831 CYP340A6
5006
GETWKRHRKLLNPAFSQQMLNIYTVVFNRESRNLISAIEIQMKSGPVLISAAFKEMALKALL 4821 CYP340A6
ATAFGIEVEECDFNQKYMHAVDEVMAVLTRRFQNILLHNSFVYKLSALKKKEDELIETITTMSNK
()CYP340A6
2997
ATAFGIEVEECDFNQKYMHAVDEVMAVLTRRFQNILLHNSFVYKLSALKKKEDELIETITTMSNK 2803 340A6
1436 VINSKRRALKNKQESHENGCAS 1371 CYP340A6
VINSKRRALKNKQESHENGCAS ()CYP340A6
EKKTQSMSDLLLKGLDDEAFTDKEIRNEVDTLIFTGYDTSSQIMTVVVMLLGSYREIQDKVYQE ()
2 aa diffs to 340A6
514
TEKKTQSMSDLLLKGLDDEAFTDKEIRNEVDTLIFTGYDTSSQIMTVVVMLLGSYREIQDKVYQE 320
2 aa diffs to 340A6
IKSVCGDSDADVDKLQHPRLVYTEAVLKETLRLYPITP
CYP340A6
276
IKSVCGDSDADVDKLQHPRLVYTEAVLKETLRLYPITPVVLRKTENEI 103 CYP340A6
&&&&&&&&&&&&&&&&&&&&
>CYP340B1
C-helix 45% to CYPnew2aa
BAAB01186081.1
VPLWKQQRKALNPAFKQQILNNFMDIFNNQGRRLIMQLAAHGPGSFDHHHPILINNLESSL
32% to
4C3 aa 307-370 49% to CYPnew2aa I-helix exon BAAB01091571.1
SEETISLLDFILDQDKSKQFLTDEEIREQIDIFLIASFDTSATALIYLLTVVGSYPEVQQKIYDE
>CYP340B1 AADK01006724.1
larger seq identical to BAAB01024995.1 exon 9 (-) strand
13326 NVGKLKETNIFMFRILAEVGDNKDVTKDVLPKLVYLEAVIKETLRMYSVVPAIARKTDIDLKLSK
(2?) 13132
11570
KNYTIPKGSSIGIMLGTLHQHPQWGPDAHQFRPERWFTENLNVFAPFSMGKRNCI (1) 11409
10588 GKVYAMMSMKVLLVHVFRHYKVTGDISNTEHRLGVLLKPATGHHIKLDKRNKT* 10427
&&&&&&&&&&&&&&&&&&&&&&&&&&&
>CYP340C1
BAAB01092938.1 exon 5 Length =
2807 a related ETAM exon
1613
ETALGIKMEDHSAVNQQYEQALHDIFAVLTERFQKFWLHYDFVYNRTKLKEREDQIIKVL 1792
1793
HNMSNTV 1813
>CYP340C1
BAAB01203952.1 Length = 3702 new exon 8 52% to CK533198
2542
VRRVLGDAERDVTKEDYLRLEYLEAVLKESMRMYPVAPVIARYSDAEVKL 2691
>CYP340C1 AADK01006724.1
second exon 9 on same contig (+) strand identical to exon 10 BAAB01125095.1
identical
to exon 8 BAAB01203952.1
3243 (0)
CGVSCPHRVRRVLGDAERDVTKEDYLRLEYLEAVLKESMRMYPVAPVIARYSDAEVKL(1?) 3392
4640
KNYTAPAGSGFILLLWGVHQHRIWGADADQFRPERWLEAATLPDPSFFAGFSTGRRSCI (1) 4816
7545
GKVYAMMSMKTTLSFLLRRYRVSSDVTDLEFKLEAILRPHRGHYIAIERRSKDDK* 7712
&&&&&&&&&&&&&&&&&&&&&&&&&&&
>CYP340D1
CYPnew2aa 46% to CYPnew1bb BAAB01026721.1 (1 aa diff)
BAAB01035425.1
Length = 1039 BAAB01188536 BAAB01172813
Identical to AADK01004833.1 (adds exon 1)
Bmb018392
from Li Bin
9506
MIILTVSLVIIIAVLGSWMLFYRHYKDSPPFHKGLLPIIGHSYLFMGDTT (1) 9360
SIWRNLKNLAQDCTDKGGVLQIIMGFKRHY
IITDPKDALTVANACLQKHFVYSFGSRWLGNGLITGS (1)
GEIWKRHR
KLLSPSNSPQILNTYLGIFNENSRQLVTDLAPMVGEGLKDLSFHIRKMAMNTIF ()
RTAFGVYTNEDENFTKAYMSAVDEILTIITKRFTRVWLQIDFIYNLTSIKKREDELIRIVNEMSNK
()
IIARKRSELASGVAAANET
DNKYQGLLDLMLKLAKEDALTDQEIREEVDTAIMAGFDTSSWILVYVLVKLGSFPEVQDRAYNE
()
2658
IMDVFGDSDRDLEKEDLSKLVYTEAVLKETMRLYTMGPVSLRHIEEDVKL 2807
>CYP340D1
AADK01018708.1 same as BAAB01096550.1
runs off the end
(1) kNFTLKAGTDCSISLFGINRHTVWGEDVNEF
6993 KPERWLDPQKIPDNAFAAFSIGKRSCT (1)
6346
GRAYALTLMKINLAHLLRKYKISGDVSKLDFKFNFIIKPISGSDIGLEYRK*
6191
&&&&&&&&&&&&&&&&&&&&&&&&&&&
>CYP340E1 AADK01004833 second gene upstream of new2aa = CK533198
CK533198 rswgb0_001141.y1 BAAB01126021.1
BAAB01203780.1
BAAB01034164.1
exon 10 overlaps EST seq., 45% to CYPnew1bb
17347
VITHPETAVTVADSTTSKHYIYSFFRRWLREGLLTSS (1)17237
16897 GGVWKRHRKLVSPSFNLRVLYSFMGVFNSGSKKLINHFRDEMNKGPLNVLPLIKVVSLESICR
16709
15510
ISENTFGIEESSDLEFKKKFMIAVDKMLFIVIERIRKIWLHNDFLYNYCTSLKKEEDNVV 15331
15330
KVLTDMTNK 15304
14587
VLHKKQALLKMKSPTEQPEQSKKS 14516
160
EKKFKALLDLLLDLSAEDSLTGPEILEELNTAVFGGYDTTSCTLTYVLMNLGTYQDVQQKMYEE
683 IKNVFGDSERDVESDDLSKLVYTEAVIKESLRLYPIAPIIFRELDQDVKINNYTLK
519
520
AGRGCAINVYGINRHAMWGDDADKFRPERRLEPDGVPSVY*RVATFGVGKRACI
12885
GRTYAMMSMKTTLVHILRQFRIAADITKIEFKIEIILVPQTPCYLTLERRS* 12730
>CYP340E1
BAAB01132907.1 Length = 1852 exons 3,4 new 51% to CYPnew1bb
563
VITHPETAVTVADSTTSKHYIYSFFRRWLREGLLTSS 673 (1)
1013
GGVWKRHRKLVSPSFNLRVLYSFMGVFNSGSKKLINHFRDEMNKGPLNV 1159
>CYP340E1
CK533198 rswgb0_001141.y1
BAAB01126021.1 BAAB01203780.1
BAAB01034164.1
exon 10 overlaps EST seq., 45% to CYPnew1bb
ENTFGIEESSDLEFKKKFMIAVDKMLFIVIERIRKIWLHNDFLYNYCTSLKKEEDNVVKVLTDMTNK
VLHKKQALLKMKSPTEQPEQSKKS
160
EKKFKALLDLLLDLSAEDSLTGPEILEELNTAVFGGYDTTSCTLTYVLMNLGTYQDVQQKMYEE
683
IKNVFGDSERDVESDDLSKLVYTEAVIKESLRLYPIAPIIFRELDQDVKINNYTLK 519
520
AGRGCAINVYGINRHAMWGDDADKFRPERRLEPDGVPSVY*RVATFGVGKRACI
1006
GRTYAMMSMKTTLVHILRQFRIAADITKIEFKIEIILVPQTPCYLTLERRS* 1161
&&&&&&&&&&&&&&&&
>CYP340F1
BAAB01090784.1 Length = 3359 new exons 7,8,9,10 45% to CYPnew1bb
2886
SLIDLLLNLSGDQVFSQEDIREEIDTVIVAAFDTTSWSLTYALLVLGSMPEIQEKVXXX 2719
2611
IEEVLGPTDRNLDKDDLQKLVYTEAMIKEVLRLYNVLPAVLRDITEKVQL 2462
1515
NYTMYPGDQCMILINNLNRHKAWGLDVEQFRPERWLGEAELPEHHSAYFATFGIGRRFCI 1336
GRIFAMYSMKTILVHILRRYSVKSDLAKLRLTSDYVTKPVSGHFITITRRINAVN*
&&&&&&&&&&&&&&&&
>CYP340-un1
pseudogene AADK01025706.1 = like CYP4S3 Bmb021357
I-helix region 277-344 BAAB01059276.1 44% to 4L4
CYP4C like
Missing C-term and internal frag, possible pseudogene
TWRNNRRLLNQAFKHTILDGFVDVFNNQARSLVDELAAEADLEKIYILNKLSLHSLRLVF
TILGVSDQEVTPEKFNSYLETNNILHNIFTKRFQKFWLHSDIVFNRTKLKIEQDYAVAVLHG
ELISNCGSQPSTYRPILLSSVMVIYFNSAFIVAGEQ
KPFIDLLIEIAEEKGLSDMEVIHELATFIAAGHDTVPYTLLYTLMCVGSHPPVQQRIYEE
AADK01040671.1
LQQVLGSDDVTKQNLSSLVYLEATIKETMRLYPIAPVVSRVTDCDVKLSKY
missing about 14 aa at PKG motif
LHHHPVWGSDVEHFKPERWLDPTTLPENAFAGFSTGKRNCI
GKTFAMMSMKTTLAHLLRQTR
C-term Perf to heme region 49% to 4S3 396-458
end of this seq is 76% identical to BAAB01034164.1 above
&&&&&&&&&&&&&&&&&&&&&&
>AADK01029578.1 = BAAB01012981.1 Length = 2183 partial exon 7, 77% to CYP340A2
412
MLGLLLKGLDDDALTDREIRNEVDTLITAGSDTSS 308
exon 8 sequences
>AADK01065909.1 = BAAB01083548.1 Length = 2923 new exon 8 1 aa diff to BAAB01023742.1, 79% to CYP340A4
286
SVANVDKHQHPRLVYREAVVKETLRLYPIGPILLRKAENEIKL 158
>AADK01030164.1 = BAAB01023742.1 Length = 888 new exon 8 1 aa diff to BAAB01083548.1, 79% to CYP340A4
500
SVANVDKHQHPRLVYREAVVKETLRLYPIGPILLRKPENEIKL 628
>BAAB01117890.1 Length = 1882 new exon 8 1 aa diff to BAAB01083548.1, 76% to CYP340A4
1814
SVANVDKHQHPRLVYREAVVNETLRLYPIGPILLRKAENEIKL 1686
>BAAB01108434.1 Length = 2407 frameshift, 4 aa diffs to CYP340A4 exon 8
CK536690.1 EST also has frameshift AADK01041079.1
1156
IKSVCGDSDADVDKLQHPRLVYTEAVVKET 1245 LRLYQIYSVVLRKTENEIKL 1303
exon 9 sequences
More new families
>CYP365A1
N-term 34% TO 9A1 BAAB01029726.1 BAAB01072071.1
AADK01000274.1
Whole sequence known but confidential
THQFYQRT
LVIRDPELIKRVCVNDFQHFVDRGFFFNKEVDPLAGSVLFLRGNEWKKLRAKISPIFS
()
PNKLRGMFPLIDNTADEFVIRLRKRIKNKNKDIENVDSNSNIED
AYALVDSEELVGGYTADAIVPCAFGLKSLVMDKPDDPFAVALQAFYEKSLYNVFEK ()
TMRQFWPAFVLFFRMRIIPKKTHDFFHNVVTSVLRARESGKF
>CYP365A1
C-term 42% TO 6D4 AA 308-458 BAAB01113890.1 BAAB01206962.1
Bmb040074
From Li Bin
AADK01026272.1
Whole sequence known but confidential
DIVISANAFIIFLGGFETTSSTLAFLFLELAADQRVQEKMRQEIREVLSRNDGKMTFEILQELIYMEMVIQ
()
ETLRLYPPFPSIQRMCTKDYLIPGTEKVVEKGTIVLFPTLGIQRDNQ
YFERADDFYPERWSGGFVPPPGVYMPFGDGPRYCI
>Similar to CYP6CJ1 Louse P450 EST 40%
CB887321 Lice_03_N06_T3 Body louse cDNA library Pediculus humanus cDNA.
Length = 704
MDYYYVTLLIIIISALTWFIYALANRNCNY
WKNRNVPYVSPVPFFGNLKSLLFLEKNVGEFLGDLYNSEKLPDHGFYGIYLTNNDPA
LIIRDPELIRNICIKDFQHFSNRNATSDEKNDPLGYNNLFTLTGSKWKFIRTKLTPTFTSGKM
KQMFTLVKETSDELIRYLNENSKEKNYTIMETKRF
>and Pea aphid P450 EST 40%
CF587676 USDA-FP_120600-031 Acyrthosiphon pisum, Pea Aphid Acyrthosiphon pisum cDNA clone WHAP-003_H03 5'.
MLIFANFWIDFIILITVLFSIIYYYCTSTFNVWKK
LNVPYIRPIPLFGNYLRVALGIENPMETYRKIYCELAGFKYGGMFQMRTPY
LMIRDPEIINNILIKDFSYFTDRGIYVDFKTEPLSEVLFLMNNPRWKKFRSKLSPAFSSGKLKQMFNQI
EKCGHDMINNIFAELKKNPHDIDMRDVVAKYSMDVIGSCAFGLTLM
>CYP366A1
an--0429 new N-term http://www.ab.a.u-tokyo.ac.jp/silkbase/
BAAB01010488.1
BAAB01018170.1 BAAB01015588.1 BAAB01015589.1
AADK01020747.1, BY914303.1 EST
28% to 4G15
MLIMSEVLWCLLMLCACVWWWCSAPRRTRKLLAALPSFPQLPLLGNIHQIPRNSI
NLFQFLEKIAVTCDTTEMPFVFWLGPRPIL
FISDPEDVKGVNNAFIEKPHYYSFAKVWLGNGLVVAP
PEIWKNSIKKLSGTFTPSIVEGYQEVFAGRAADLV 545
546
QRLKARTTEEPFDIMHDLAYTTLEAICQTAFG
641
>CYP366A1
35% to 4C37 BAAB01132146.1 BAAB01000816.1 ETAM exon
Bmb038385
from Li Bin
AADK01020849.1
best matches to CYP4C middle region
ETAFGFPKISESIVTKEYYDAFHRCLELLIRRGLNPLLHLDFMYRLTPASKELQKCVGILHNVSNT
()
VITKLIQERECAKQDNRNEI ()
BAAB01011842.1
VPNRRRFKSFLDLMLEMQVSTPELSVEQIRSEVDTVLFGGHETVATTLFYALLMIGRDKNLQDK
LYNEVRDVVGDGGRPVTGADLPRLRYCEATVLETLRLFPPFPAVLRMADKDLQLSSGI
>CYP367A1
BAAB01019896.1 BAAB01107379.1
Bmb008200
from Li Bin
AADK01004845.1
BAAB01014226 4 fam like 27% to 4C15
MLWPVIYLLALVFAYWPYWRWKNRRLLRLSASMPGPRALPIIGNGLLIVVNTGGKFNKTI (0) 4269
5077 LFRTYGDYCKIWLGPELNICVKNPDDIR (0)
LLLTSNKVNQKGPAYEIMKAAIGPGILTG (1)
6651
GPTWRNHRKIVTPSYNKRAVKLYSAVFNREAEVLANLLLKKQSGVTFNVYYDVVEITTQCVC (1) 6836
8492
QTLIGLSKDDSRNVDGMSDLILETQN (2) 8569
9222
LYELLFTKMTVWWLQIPFVYWITGRKATENAYVKKIDRLTSDFLKKRRTALKGGNVDEESMGIVDRYI
ASGELTEQEIKWETMTLFTTSQEASAKITSAVLLFLAHLPDWQEKVYKEIVEVIGSGRND
VTAEHLKHLHYLDMVYQEALRYLSIAALIQRTVEEEITINNGKFTLPIGT
TLVIPIHDLHRDPRYWDEPLKVKPERFLPENVKKRSPNVFIPFSLGAMDCLGRVYAEPLIKTLVVWAVRE
VQLEAEGCVEDLKLHVAISVKFANGYNLKVKPRI*
>CYP367B1
BAAB01152562 N-term 4 fam like
127 aa
more seq known but confidential
MISYMILIVIVFALMWSGWKQKNKKFMEMANQFPGPQALPFIGNALRFMCEPE