Last modified Dec. 3, 1999 (138 different sequences, 128 from D. melanogaster). These have been unified into contigs. Internal asterisks are used to indicate exon boundaries. I am starting to analyze the new Drosophila genomic sequence from Celera; see the N-terminal alignment for 82 distinct genes. 59 are full length. CYP4C3 complete AL072439 AI457023 AI296945 AI106780 AL051651 AC007974 clone BACR48H21 U34323 AC008219 AC014779 MAESILLSKVGQVISGYSPITVFLLGSILIFLVVYNKRRSRLVKYIEKIPGPAAMPFLGNAIE* MNVDHDELFNRVIGMQKLWGTRIGINRVWQGTAPRVLLFEPETVE* PILNSQKFVNKSHDYDYLHPWLGEGLLTSTDRKWHSRRK* ILTPAFHFKILDDFIDVFNEQSAVLARKLAVEVGSEAFNLFPYVTLCTLDIVC* ETAMGRRIYAQSNSESEYVKAVY GIGSIVQSRQAKIWLQSD FIFSLTAEYKLHQSYINTLHGFSNMVIRERKAELAILQENNNNNNNN APDAYDDVGKKKRLAFLDLLIDASKEGTVLSN* EDIREEVDTFMFEGHDTTSAAISWTLFLLGCHPEYQERVVEELDSIFGDDKETPATMKNLMDMRYLE CCIKDSLRLFPSVPMMAKMVGEDVN* IGGKIVPAGTQAIIMTYALHRNPRVFPKPEQFNPDNFLPENCAGRHPFAYI PFSAGPRNCIGQKFAILEEKAVISTVLRKYKIEAVDRREDLTLLGELILRPKDGLRVKITPRD* AA952162 AI388993 AA538617 AA439290 AA940746 AA941522 AI106700 45% to 4c3 RLFPSSPCLAFAHADGVDLNFGGQLVACRLEAI 370 AC007928 Drosophila melanogaster chromosome 3 clone BACR04O02 47% with 4C14 probable 4 family member may be in new subfamily I-helix to heme AC007809 Drosophila melanogaster chromosome 3 clone BACR45M03 (D718) AC007809 AA698109 AL060049 AL076303 intron-exon boundaries from C-helix to I-helix not certain MLTINLLLAVGALFWIYFLWSRRSKLYFLMLKIPGPIGLPIL GSSLENIITYKR*YGSTILTWMGPVPFIVTRDPKVVEDIF SSPDCHNKSQHIVNAITSCMGNGLLGKIDPHWLDRRKHFNPSFKQDLLLS FFHIFDAETKVLMNLLDTYVDKGEIDVVPEMLRWSFKIAA* ETTMGSEVKHDEHFKNGSLV NRKRNFKQTETNFIYLKCDNPIHFYRLISHSTLNILMPLVQNRMIS KICGYDKLRADNFSRIQKMLDNV VNPLPKTDSDPESNIVINRAMELYRKGDITYM DVKSECCIMIAAGYDTSALTVYHALFLLANHPEHQEAVFEELNGVFPDAGHFGITYPDMQ KLDYLERVIKETLRLIPAIPITARETKNDVRLSNGVLIPKGVVIGIDMFHTHRNPEVWGP DADNFNPDNFLAENMEQKHPYAYIPFARGKRNCIG SKYAMMSSKFALCRILRNYKISTSTLYKDLVYVDNMTMKLAEYPRLKLQRRG* AC007724 Drosophila melanogaster chromosome 3 clone BACR30N15 C-helix from one small
fragment of unordered sequences, this might be real 7042-7600 PMGFPFIGLNVFTQVRYACVPFYLKYME KR*YGKTVLTWIGLTPVLVTCEPKILEDIFTSPNCSNRSSVVDKAISSCLGLGLL *VSIDNHWNERRKLLLPSFKNNAVLSFVPVLN NEANFLVTLLAEFVDGGDINLLPELNKWSFKIAA* AC007724 Drosophila melanogaster chromosome 3 clone BACR30N15 (D693) RPCI-98 119054 SHTVYYALVLLAMFPEHQEMVFNEIKEHFPLAKGIEVTHTDLQQLVYLDRVLNETLRLMP 119233 119234 SVPFSSRETLEDLRLSNGVVIPKGMTISIDIFNTQRNTDYWGSKAAQFNPENFLPEKIHD 119413 119414 RHPYAFIPFSKGKRNCIG 119467 XKLALIKILRNYKLKTSFPYKNLEFVDHMVIKLAQSPQLAF* Score = 125 bits (310), Expect = 1e-28 Identities = 55/89 (61%), Positives = 68/89 (75%) Frame = -3 Query: 71 RXLNETLRLMPSVPFSSRETLEDLRXSXGVVIPKGMTFSIDIFNTQRNTXXWGSEAAQFN 130 R +NETLR +PSVPF+ RET D R S GVVIPKG+ IDIF T RN WG++ + FN Sbjct: 100724 RVVNETLR*IPSVPFTPRETKRDFRLSSGVVIPKGVGIGIDIFATHRNRDHWGTDPSSFN 100545 Query: 131 PENFLPEKIHDRHPYAFIPFSKGKXNCIG 159 P++FLP+ + DRHPYA+IPFSKG+ NCIG Sbjct: 100544 PDHFLPDNVRDRHPYAYIPFSKGRRNCIG 100458 AL076582 AL074717 AL072640 N-term to C-helix also AC007724 71656-71049 MDTFQLLLAVGVCFWIYFLWSRRRLYMMHFKIPGPMGLPILGIAFEYLITYKR* YGSTCLVWVGPTPFVITRDPKIAEEIFLSPECLNRSSIFSKPVNSCTGDGLLSLEGN KWVDRRKNLNPAFKQNVLLSFLPIFNSEAKTLVAFLDSLVGQGEKKVRDDIVRWSFRIAT* ETTVGTDVKKDASFKNDSVL (this last line is the ETAM exon) GQLTP***STPSYLLLLI*PQNHFHFRIKPLPNSLRQNSV*NHSKVFVYFVFGLESLPWIRFNCF*PWECVSGYTSYGVVAGSI*C TLRYQDQWDYQFWA*PLSI**PINVNFEYTS*KGSGLIEHFVNTLGKMSIRTKYMDIYGSTCLVWVGPTPFVITRDPKIAEEIFLS PECLNRSSIFSKPVNSCTGDGLLSLEGNEN*ALTYTLITYFIYVSI*MG*SSQELKSGIQAKCFDEFSSHLQL*GENFSCLPGLTC RSG*KKGPRRHSEMVF*NSHS*VKSALKWDVMSTAILYHFSTETTVGTDVKKDASFKNDSVLKSYET*VLPIAQKCSNNIQSFLYF *VYENNCYECFVAIHSQQDIFNTGRI*NAESFGQVQC*QNDRHGEFCLA*LH RSAYTLVIKHTVIPIIINLTTESLPLSYKTFAEFAEAKQCLEPLQGVCLLRVRIRVLTMDTFQLLLAVGVCFWIYFLWSRRRLYMM HFKIPGPMGLPILGIAFEYLITYKREF*IHFIERQWIN*TLCEHFR*NEHTNQVYGYLRLHMFGVGGTDAICNHSGSKDCRGNLFV PRVPK*ELDIFQAG**LYRRWIIIIRG**KLGFNLHINNIFHLC*HLNGLIVART*IRHSSKMF**VFFPSSIMRRKL*LPSWTHL 
SVRVKKRSATT**DGLLE*PLVSEVCAKMGCDVYCNIVSFFHRNHSGN*CKEGCVLQERFSFEKLRDVSVTNCAKV**QYPKLLVL LGL*K*LL*MFCCHSLTTRYFQHWADLKRRKLWPSPMLTK**AR*VLLGLIA KVSLHPSNKAHRHTYYY*FNHRITSTFV*NLCRIR*GKTVFRTTPRCLFTSCSD*SPYHGYVSTAFSRGSVFLDILPMESSPALYD AL*DTRTNGITNSGHSL*VSDNL*T*ILNTLHRKAVD*LNTL*TL*VK*AYEPSIWIFTAPHVWCGWDRRHL*SLGIQRLPRKSFC PQSA*IGARYFPSRLIAVQEMDYYH*RVMKTRL*LTH**HISFMLASKWVDRRKNLNPAFKQNVLMSFLPIFNYEAKTLVAFLDSL VGQGEKKVRDDIVRWSFRIATRK*SLR*NGM*CLLQYCIIFPQKPQWELM*RRMRPSRTIQF*KVTRRKCYQLRKSVVTISKASCT FRFMKIIVMNVLLPFTHNKIFSTLGGFETQKALAKSNVNKMIGTVSFAWLNC Cyp4d1 AF016992 see also AF016993 to AF017004, CYP4D1 HAS ALTERNATIVE SPLICING MWLLLSLVLLLAIIALEMRRFLRNMRTIPGPLPLPLLGN AHIFLGLTPAEACLKIGELAERHGDTFGLFLGPSYSVMLFNPRDVERVLG SSQLLTKSQEYSFLGRWLNEGLLVSNGRKWHRRRKIITPAFHFRILEPYV EIFDRQSLRLVEELALRISRGQERINLGEAIHLCALDAICETA*MGVSIN AQSNADSEYVQAVKTISMVLHKRMFNILYRFDLTYMLTPLARAEKKALNV LHQFTEKIIVQRREELIREGSSQESSNDDADVGAKRKMAFLDILLQSTVD ERPLSNLDIREEVDTFMFEGHDTTSSALMFFFYNIATHPEAQKKCFEEIR SVVGNDKSTPVSYELLNQLHYVDLCVKETLRMYPSVPLLGRKVLEDCEIN GKLIPAGTNIGISPLYLGRREELFSEPNSFKPERFDVVTTAEKLNPYAYI PFSAGPRNCIGQKFAMLEIKAIVANVLRHYEVDFVGDSSEPPVLIAELIL RTKEPLMFKVRERVY CLOSEST FIRST 4D1 EXON (NOT USED, SKIPPED IN 6/7 ESTS) AI062205 MFLVIGAILASALFVGLLLYHLKFKRLIDLISYMPGPPVLPLVGHGHHFI GKPPHEMVKKIFEFMETYSKDQVLKVWLGPELNVLMGNPKDVEVVLGTLR FNDKAGEYKALEPWLKEGLLVSRGRKWHKRRKIITPAFHFKILDQFVEVF EKGSRDLLRNMEQDRLKHGESGFSLYDWINLCTMDTIC MORE DISTANT FIRST CYP4D1 EXON (THIS ONE IS USED) 6 ESTS MATCH THIS ONE AI257879 AI406141 AI295319 AI064088 AI387842 AI135092 MWLLLSLVLLLAIIALEMRRFLRNMRTIPGPLPLPLLGN AHIFLGLTPAEACLKIGELAERHGDTFGLFLGPSYSVMLFNPRDVERVLG SSQLLTKSQEYSFLGRWLNEGLLVSNGRKWHRRRKIITPAFHFRILEPYV EIFDRQSLRLVEELALRISRGQERINLGEAIHLCALDAICETA* AC015418 = AI261013 in ordered fragments 46% to 4D1 91-492 2330-378 missing 56 N-terminal amino acids 2330 ELREKHGPVFRIWFGKDLMVMFTDPEDIKQLLGNNQLLTKSRNYELLEPWLGKGLLTNGGESWHRRRK 2127 2126 LLTPGFHFRILSEFKEPMEENCRILVRRLRTKANGESFDIYPYITLFALDAICETAMGI 1950 1949 KKHAQLQSDSEYVQAV 1902 1849 
HSICRVMHKQSFSFWQRLNVFFKHTKPGKEREAALKVLHDETNRVIRLRREQL 1676 1675 IQERNEWKPEAEQDDVGAKRRLAFLDMLLLTQMEGGAELSDTDIREEVDTFMFEGH 1508 1507 DTTSSAIAFALSLLSKNPDVQQRAFEEASELEGREKESMPYLEAVIKETLR 1355 1354 IYPSVPFFSRKVLEDLEVGKLTVPKGASISCLIYMLHRDPKNFPDPERFDPDRFLVNE 1181 1180 KQMHPFAFAAFSAGPRNCIG 1085 533 QKFAMLELKTSLAMLLRSYRFLPDKDHQPKPLAELVTKSGNGIRLRILPRDENGTTA* 378 Cyp4d2 AL009194 14092-15781 also X75955, AL023401 Cyp4d2 STS missing N-terminal first exon in this genomic DNA AI113485 GH09810, AA697893 HL03421 MLGVVGVLLLVAFATLLLWDFLWRRRGNGILPGPRPLPFLGNLLMYRGLD PEQIMDFVKKNQRKYGRLYRVWILHQLAVFSTDPRDIEFVLSSQQHITKN NLYKLLNCWLGDGLLMSTGRKWHGRRKIITPTFHFKILEQFVEIFDQQSA VMVEQLQSRADGMTPINIFPVICLTALDIIAETAMGTKINAQKNPNLPYV QAVNDVTNILIKRFIHAWQRVDWIFRLTQPTEAKRQDKAIKVMHDFTENI IRERRETLVNNSKETTPEEEVNFLGQKRRMALLDVLLQSTIDGAPLSDED IREEVDTFMFEGHDTTTSAISFCLYEISRHPEVQQRLQQEIRDVLGEDRK SPVTLRDLGELKFMENVIKESLRLHPPVPMIGRWFAEDVEIRGKHIPAGT NFTMGIFVLLRDPEYFESPDEFRPERFDADVPQIHPYAYIPFSAGPRNCI GQKFAMLEMKSTVSKLLRHFELLPLGPEPRHSMNIVLRSANGVHLGLKPRA AI402187 GH09810 94% identical to Cyp4d2 7 diffs probably is 4d2 PPVPMIGRWFAEDVEKRGKHIPAGTNLTMGIFVLPRDPEYFESPDEFRPE RFDADVPQIHPYAYIPFSAGPRNCIGQKFAMLEMKSTVSKLLRQFELLPL APEPRQLMNIVLRSANGVHLGLKPRA Cyp4d8 AI403093 GH22459.5prime GH head pOT2 Drosophila 51% to 4d2 AC010113 chromosome 3L/66A5 clone RPCI98-10H23 AC010557 chromosome 3L/65F1 clone RPCI98-9E21 AC010112 chromosome 3L/71C3 clone RPCI98-9C21 AC010015 chromosome 3L/71C3 clone RPCI98-9C21 U34329 only three diffs with AC010113 above MLLFLLVVLLFGAGWIIHLGQADRRRKVGNLPGPICPPLIGAMQLMLRLNPK TFIKVGREYVLKFGHLQRVWIFNRLLIMSGDAELNEQLLSSQEHLVKHPVYKVLGQWLGN GLLLKDGKVWHQRRKIITPTFHFSILEQFVEVFDQQSNICVQRLAQKANGNTFDV YRSICAAALDIIAETAMGTKIYAQANESTPYAEAVNE* CTALLSWRFMSVYLQVELLFTLTHPHLKW RQTQLIRTMQEFTIKVIEKRRQALEDQQSKLMDTADEDVGSKR RMALLDVLLMSTVDGRPLTNDEIREAVDTFM FEGHDTTTSALSFCLHELSRHPEVQAKMLEEIVQVLGTTRSRP VSIRDLGELKYMECVIKESLRMYPPVPIVGRKLQTDFKYS* DGVIPAGSEIIIGIFGVHRQPETFPNPDEFIPERHE NGSRVAPFKMIPFSAGPRNCIGQKFAQLEMKMMLAKIVREYELLPMGQRVECIVNIVLR
SETGFQLGMRKRKHN* AC004717 DS01529 also AC005834 comp(100737-101282) P1s DS01589 DS02501 44% to Cyp4d14, 46% to 4d2, 39% to 4d10, 38% to 4d1, probable new subfamily AA698702 AA567823 AA698531 AI405296 MWILLGIAVLIMTLVWDNSRKQWRVNTFEKSRILGPFTIPIVGNGLQALT LRPENFIQRFGDYFN*KYGKTFRLWILGECLIYTKDLKYFESILSSSTLL KKAHLYRFLRDFLGDGLLLSTGNKWTSRRKVLAPAFHFKCLENFVEIMDR NSGIMVEKLKNYADGKTCVDLFKFVSLEALDVTT*ETAMGVQVNAQNEPN FPYTKALK*SVVYIESKRLASVSMRYNWLFPLAAPLVYRRLQKDIAIMQD FTDKVIRERRAILERARADGTYKPLSTLT*DDIGGKAKMTLLDILLQATI DNKPLSDVDIREEVDVFIFAGDDTTTSGMSHALHAISRHPKVQECISEHV VSVLVPDPDASVTQTKLLEVKYLECVIKQTMRLHPPVPILGRYIPEDLII GEIAIPGNTSI*LMPYYVYRDPEYFPDPLVFKPERWMDMKTTSNTPPLAY IPFSSGPKNCIGQKFANLQMKALISKVIRHYELLPLGADLKATYTFILSS STGNNVGLKPRTRVK* AI106625 AI517120 N-term 65% to AC005834 AC010015 Drosophila melanogaster chromosome 3L/71C3 clone RPCI98-9C21 AC010014 Drosophila melanogaster chromosome 3L/62E5 clone RPCI98-19D21 AL078839.1|CNS00LWJ BAC:BACR48N09 C-term 63% to AC005834 MWLTLITGALILLLTWDFGRKRQRVLAFEKSAIPGPISIPILGCGLQALHLGAENIIGWV GEKFDKYGKTFRFWILGESLIYTKDLQYFETILSSTTLLEKGQLYEYLRPFLNDGLLVST GRKWHARRKIFTHAFHFKVLEHYVEIMDRHSNVMVDNL RKVADGKTAVDMLKYVSLAALDVIT* EAAMGVQVNAQNDPDFPYIKALKSVVYIQPDRMFRFSRRYNWLFPLAAPLLHRQLLSDIR VMHDFTDKVISERRETVRRAKADGTYRPLSGCT AEIGSKSQMALLDILLQSSINNQPLSDADIREEVDTFMFEGDDTTSSG VSHALYAIARHPEVQQRIFEELQRVLGPDASASGTQAQLQDLKYLDCVIKETMRLY PPVPAIGRHAQKELEIGDKTIPANTSIYLVLYYAHRDANYFPDPLSFRPERFLEDQEQGH NTFAYVPFSAGPKNCIGQKFAVLEMKVLISKVLRFYELLPLGEELKPMLTFILRSASGIN VGLRPRKALR AC010113 Drosophila melanogaster chromosome 3L/66A5 clone RPCI98-10H23, Query: 18 113685-114071 N-term MILTATFICFCLASAFNY (from AC010557 65609-65663) FRARRQRSLIKNLKGPFTWPLMGAMHKLLFLTAGRFLSAQT 113855 EYLTKYGTFSRCWVFHRLFIPLADLELSRQLLENDTHLETGYELMKDWLVGGVLMCQS 114029 EQWQKRHSLISGLF 114071 114541-115249 C-term KDHKSLLEILLESKDPQLTGEEICGELNTCNYLGYQLCSPALCFCLVTIARNPSVQQK 114714 CLDELNLAQIKDQGWDLEKLNYLDAVLHETMRLYPPQVIVGRQLKKDFP VGDAELPCGSEIYINLYELQRNEVRYPKANHFDAQR 115041 
YSLGPRCCPARKFSMQLLKTLLAPILGNFEVLPYGDEVRLDLRLVLGSSNGFQLALKPR 115249 Cyp4d10 U91634 Drosophila mettleri MSLSLPPLIAVACLVVALARISWLPLRSWLRRRRRTHQLAAQLPGPRNLP LLGNFHMFFGLEPWQVPHLINQLAKKYDGTFKLKMGSNFSLMMFQPRDIE VVLGSSQLLDKAVEYSFLRGWLNDGLLLSGGRKWHRRRKIITPAFHFRIL ESYDEIFDRQTRLLIHKWQQTLGHSFDLGHDVHLFTLDVICETAMGVSTN AQTNADSDYVRAVKTISTVLHKRMFNIFYRFDLTYMLTPLAWAERRALNV LHKFTEKIIVQRREELLRGGVTQTTDGADVGAKSKMVFLDILLQSNIDDK PLTNLDIREEVDTFMFEGHDTTSSGITFFFYNIALYPECQRKCVEEIVSV LGKDTETPVTYDLLNNLNYMDLCIKETLRMYPSVPLLGRKVLQECEINGK IIPAGTNIGISPLFLGRSEDISSEPNTFKPERFDVVTSAEKLNPHAYIPF SAGPRNCIGQKFAMLEIKAIAANVLRHYEIEFVGNAEESPVLIAELILRT KDPLMFKLKKRVI AC010003 Drosophila melanogaster chromosome 3L/75B13 clone RPCI98-6L11, 244-509 66531-67394 = AC0015424 AC007573 AC009369
MFWLGFGLLLLALSLYLLYVFERQSRIDRLTHKWPAPPALPFIGHLHILAKLVGPHPLRRATEMINEHLHDHRAKLWMGTKLYLVDCNPKDIQALCSAQQLLQKTNDYRVFENWLCEGLFTSGFEKWSHRRKIVMPAFNYTMIKQFVAVFEKQSRILLTNVAKFAESGDQIDFLQLISCFTLDTICETALGVSVGSQSSAKSEYLDAVKS*ILVIIDKRLKNIFYRNSFIFKRTSHYKR
EQELIKTLSGFTEGIIQKRIDEINQDAENRNYQSSDAELDGVKRTLCFLDTLLLSKGPDG
MFLTVKDIREEVDS
FLFGGFDLTATTLNSLMYNMTLHPEHQQRCREEVWSVCGKDKSEPISIEQVRQLEFLEAC
IKETLRMYPSGPLTARKATANCTISKYIIPKGSDVIISPI
YMGRCKDFFPDPMVFKPDRWAIGAEPKLRHHF
FIPFMAGARSCMGQRYAMVMLKMVLAHLLRNFLFEPLXERQVELKLNFVITLHTVDPIYVELK
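The page's stated convention is that internal asterisks mark exon boundaries. The following is a minimal Python sketch of how an annotated entry could be split into its exon segments and rejoined into one protein string. The function names `split_exons` and `join_exons` are hypothetical and not part of this page. The sketch assumes that whitespace and line breaks inside an entry are only layout, and that an asterisk at the very end of an entry marks the stop rather than a boundary. It is not meant for the raw translated reading frames listed elsewhere on the page, where asterisks are stop codons.

```python
def split_exons(annotated: str) -> list[str]:
    """Split an annotated protein entry into exon segments.

    Assumption: internal '*' characters mark exon boundaries and a
    trailing '*' marks the end of the protein. Whitespace is layout only.
    """
    compact = "".join(annotated.split())  # drop spaces and line breaks
    body = compact.rstrip("*")            # discard the terminal marker(s)
    return [segment for segment in body.split("*") if segment]


def join_exons(annotated: str) -> str:
    """Return the protein sequence with all boundary markers removed."""
    return "".join(split_exons(annotated))


if __name__ == "__main__":
    # Short fragment taken from the AC010003 entry above.
    fragment = "DAVKS*ILVIIDKRL"
    print(split_exons(fragment))
    print(len(join_exons(fragment)))
```

Counting the segments gives a quick check on how many exons an entry is annotated with, and the joined string is what would normally be passed to an alignment program.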
Cyp4d14 AL009194 152A3.2 comp(10682..12421) 55% identical to Cyp4d2 MYLELFAILLATALAWDYMRKRRHNKMYAEAGIRGPKSYPLVGNAPLLIN ESPKTIFDMQFRLIAEFGKNIKTQMLGESGFMTADSKMIEAIMSSQQTIQ KNNLYSLLVNWLGDGLLISQGKKWFRRRKIITPAFHFKILEDFVEVFDQQ SATMVQKLYDRADGKTVINMFPVACLCAMDIIAETAMGVKINAQLQPQFT YVQSVTTASAMLAERFMNPLQRLDFTMKLFYPKLLDKLNDAVKNMHDFTN SVITERRELLQKAIADGGDADAALLNDVGQKRRMALLDVLLKSTIDGAPL SNDDIREEVDTFMFEGHDTTTSSIAFTCYLLARHPEVQARVFQEVRDVIG DDKSAPVTMKLLGELKYLECVIKESLRLFPSVPIIGRYISQDTVLDGKLI PADSNVIILIYHAQRDPDYFPDPEKFIPDRFSMERKGEISPFAYTPFSAG PRNCIGQKFAMLEMKSTISKMVRHFELLPLGEEVQPVLNVILRSTTGINC GLKPRVY Cyp4e1 K00045, AC005451 22741-24838 AI135768 AI293313 AI108382 MWIVLCAFLALPLFLVTYFELGLLRRKRMLNKFQGPSMLPLVGNAHQMGN TP*TEILNRFFGWWHEYGKDNFRYWIGYYSNIMVTNPKYMEFILSSQTLI SKSDVYDLTHPWLGLGLLTSTGSKWHKHRKMITPAFHFNILQDFHEVMNE NSTKFIDQLKKVADGGNIFDFQEEAHYLTLDVICDTAMGVSINAMENRSS SVVQAFKD*ITYTFNMRAFSPWKRNKYLFHFAPEYPEYSKTLKTLQDFTN EIIAKRIEVRKSGLEVGIKADEFSRKKMAFLDTLLSSKVDGRPLTSQELY EEVSTFMFEGHDTTTSGVGFAVYLLSRHPDEQEKLFNEQCDVMGASGLGR DATFQEISTMKHLDLFIKEAQRLYPSVPFIGRFTEKDYVIDGDIVPKGTT LNLGLLMLGYNDRVFKDPHKFQPERFDREKPGPFEYVPFSAGPRNCIGQK FALLEIKTVVSKIIRNFEVLPALDELYDPILSASMTLKSENGLHLRMKQR LVCDST* Cyp4e2 U56957, X86076, U34332 AC005451 AA949109 AI404961 AI405011 AI546418 AI062772 AI403566 AI062848 AA141362 AA142039 AA940590 AI511759 AA940642 AA141985 AI388306 MWFVLYIFLALPLLLVAYLELSTFRRRRVLNKFNGPRGLPLMGN AHQMGKNPSEILDTVFSWWHQYGKDNFVFWIGTYSNVLVTSSKYLEF ILSSQTLITKSDIYQLTHPWLGLGLLTSTGSKWHKHRKMITPAFHFNILQ DFHEVMNENSTKFIKHLKTVAAGDNIFDFQEQAHYLTLDVICDTAMGVSI NAMENRSSSIVQAFKDMCYNINMRAFHPLKRNELLYRLAPDYPAYSRTLK TLQDFTNEIIAKRIEAHKSGAVSTNAGDEFTRKKMAFLDTLLSSTIDGRP LNSKELYEEVSTFMFEGHDTTTSGVSFAVYLLSRHQDEQRKLFKEQREVM GNSELGRDATFQEISQMKYLDLFIKEAQRVYPSVPFIGRFTEKDYVIDGD LVPKGTTLNLGLVMLGYNEKVFKDPHKFRPERFELEKPGPFEYVPFSAGP RNCIGQKFALLEIKTVVSKIIRNFEVLPALDELVSKDGYISTTIGLPDAE RKKRDPYRHKYDPILSAVLTLKSENGLYIRLKERH AC007291 Cyp4e3 BACR02I05 comp(115634-117642) also U34330 MWLAVLALLVLPLITLVYFERKASQRRQLLKEFNGPTPVPILGNANRIGK 
NP*TEILSTFFDWWYDYGKDNFLFWIGYSSHIVMTNPKQLE*YILNSQQL IQKSTIYDLLHPWLGHGLLTSFGSKWHKHRKMITPSFHFNILQDFHEVMN ENSAKFMTQLKKASAGDTIIDFQEHANYLTLDVICDTAMGVPINAMEQRD SSIVQAFREMCYNINMRAFHPFKRSNRVFSLTPEFSAYQKTLKTLQDFTY DIIEKRVYALQNGGSKEDHDPSLPRKKMAFLDTLLSSTIDGRPLTRQEIY EEVSTFMFEGHDTTTSGVSFSVYLLSRHPDVQRKLYREQCEVMGHDMNRS VSFQEIAKMKYLDLFIKEAQRVYPSVPFIGRYCDKDYDLDGSIVPKGTTL NLALILLGYNDRIFKDPHHFRPERFEEEKPAPFEYLPFSAGPRNCIGQKF ALLELKTVISKVVRSFEVLPAVDELVSTDGRLNTYLGLAPDEKLKREAGR HKYDPILSAVLTLKSDNGLHLRLRERRS* Cyp4e4 AL009194 152A3.6 16210-17867 also U34331 AI405607 AI404522 AL051240 MLVVLLVALLVTRLVASLFRLALKELRHPLQGVVPSVSRVPLLGAAWQMR SFQPDNLHDKFAEYVKRFGRSFMGTVLGHVVMVTAEPRHIDALLQGQHQL KKGTMYFALRGWLGDGLLLSRGKEWHTMRKIITPTFHFSILEQFVEVFDR QSSILVERLRTLSYGNEVVNIYPLVGLAALDIITETAMGVNVDAQGADSE VVHAVKDLTNILATRFMRPHLLFPHLFRLCWPSGFRKQQAGVICLHEFTN GIIEQRRRLLAREANQDKPTKPHALLDTLLRATVDGQPLTDKQIRDEVNT FIFEGHDTTTSAVSFCLYLLSRHEAVQQKLFEELRMHYGQDLFRGVILSD FATLPYLSCVVKESLRLYPPIPAVARCLEKDLVIDEGYIPVGTNVVVLLW QLLRDEAIFTDPLVFQPERHLGEEAPRLSPYSYIPFSAGPRNCIGQKFAL LEMKTMVTKVIRHYQLLPMGADVEPSIKIVLRSKSGVNVGLRPRLY AC005451 comp(14729-16905) AC005415 ESTs AA202364 (LD02646) AI297753 36% to 4e1, 37% to 4e2, 37% to 4e5, 35% to 4e3, 33% to 4e4 probable new family MFLIAIAIILATILVFKGVRIFNYIDHMAGIMEMIPGPTPYPFVGNLFQF GLKPAEYPKKVLQYCRKYDFQGFRSLVFLQYHMMLSDPAEIQNILSSSSL LYKEHLYSFLRPWLGDGLLTSSGARWLKHQKLYAPAFERSAIEGYLRVVH RTGGQFVQKLDVLSDTQEVFDAQELVAKCTLDIVCENATGQDSSSLNGET SDLHGAIKDLCDIVCENAVVQERTFSIVKRFDALFKFSQYRARINKTPKL ITSQIISQRRHQLAAENTCQQGQPINKPFLDVLLTAKLDGKVLKEREIIE EVSTFIFTGHDPIAAAISFTLYTLSRHSEIQQKAAEEQRRIFGENFAGEA DLARLDQMHYLELIIRETLRLYPSVPLIARTNRNPIDLDGTKVAKCTTVI MCLIAMGYNEKYFDDPCTFRPERFENPTGNVGIEAFKSVPFSAGPRRCIG EKFAMYQMKALLSQLLRRFEILPAVDGLPPGINDHSREDCVPQSEYDPVL NIRVTLKSENGIQIRLRKR* Cyp4e5 U78486 Drosophila mettleri MWFIVYILLALPIMLFVFLSCEWPKRNDAEQIEWSSGVPFLGNAHQMGKT PAEILNTFFEFWHKYNKDNFRIWIGYYANILVSNPKHLEVIMNSTTLIEK LDIYDMLHPWLGEGLLTSKGSKWHKHRKMITPTFHFNILQDFHQVMNENS AKFIKRLKEVSAGDNIIDFQDETHYLTLDAICDTAMGVTINAIEKRDTVD 
VVKAFKDMCHIINMRAFRPLQRSDFLYRFSPEYATYAKTLKTLKDFTNDI IAKRIKVHRTAAAKTNQEGSEFSRKKMLPDTLLSATIDGRPLNQQEIYEE VSTFMFEGHDTTTSGVAFAGYILSRFPEEQRKLYEEQQAVMGNELNRDAT FQEISAMKYLDLFIKEAQRVYPSVPFIGRYTDKDYNIHGTIMPKGTTLNL GIIVLGYDDRVFEEPHRFYPERFEKQKPGPFEYVPFSAGPRNCIGQKFAL LELKTVISKLVRTFEVLPAVDELVSKDGNLNTYVGLPKEEKERKERMGYK YDPILSAVLTLKSENGLHLRLR Cyp4g1 AL009188 also U34328, AI134468 AI238257 AI107737 AI134464 AI516961 AI106721 AI402621 AI063157 AI389355 AI517399 AI108390 AI063322 AI109353 AI517684 AI107940 AI402779 AI402886 AI402810 AI238329 AI135715 AI133930 AI293299 AI108856 AI064278 AI292972 AI292892 AI135496 AI062067 AI064555 AI107765 AI109996 AI107207 AI135779 AI110101 AI387969 AI403655 AI404082 AI293025 AI107892 AI107173 AI292654 AI109079 AI063239 AI292944 AI293033 AI386573 AI514270 AI238755 AI109680 AI106986 AI135407 AI389454 AI113770 AI109410 AI108845 AI517484 AA694758 AI404254 AI063697 MAVEVVQETLQQAASSSSTTVLGFSPMLTTLVGTLVAMALYEYWRRNSRE YRMVANIPSPPELPILGQAHVAAGLSNAEILAVGLGYLNKYGETMKAWLG NVLLVFLTNPSDIELILSGHQHLTKAEEYRYFKPWFGDGLLISNGHHWRH HRKMIAPTFHQSILKSFVPTFVDHSKAVVARMGLEAGKSFDVHDYMSQTT VDILLSTAMGVKKLPEGNKSFEYAQAVVDMCDIIHKRQVKLLYRLDSIYK FTKLREKGDRMMNIILGMTSKVVKDRKENFQEESRAIVEEISTPVASTPA SKKEGLRDDLDDIDENDVGAKRRLALLDAMVEMAKNPDIEWNEKDIMDEV NTIMFEGHDTTSAGSSFALCMMGIHKDIQAKVFAEQKAIFGDNMLRDCTF ADTMEMKYLERVILETLRLYPPVPLIARRLDYDLKLASGPYTVPKGTTVI VLQYCVHRRPDIYPNPTKFDPDNFLPERMANRHYYSFIPFSAGPRSCVGR KYAMLKLKVLLSTIVRNYIVHSTDTEADFKLQADIILKLENGFNVSLEKR QYATVA AL067059 (BACR015N20), AL057969 (BACR024M08) Drosophila melanogaster genome survey of 301-454 comp(143-604) probably new family MSLMFLGAECSSMVLAAFETSAHSVFFALVLLAMFPEHQXLVFXXFKEHFLLAKG IEVTHTVLQX 425 LXFXVRXLNETLRLMPSVPFSSRETLEDLRXSXGVVIPKGMTFSIDIFNTQRNTXX WG 251 SEAAQFNPENFLPEKIHDRHPYAFIPFSKGKXNCIG 143 Cyp4g15 AC013897 AI403684 AI388987 N-term AI134510 Cyp4g15 C-term MEVLKKDAALGSPSSVFYFLLLPTLVLWYIYWRLSRAHLYRLAGRLPGPRGLPIVGHLFD VIGPASSVFRTVIRKSAPFEHIAKMWIGPKLVVFIYDPRDVELLLSSHVYIDKASEYKFF
KPWLGDGLLISTGEKWRSHRKLIAPTFHLNVLKSFIELFNENSRNVVRKLRAEDGRTFDC HDYMSEATVEILLGE*TAMGVSKKTQDKSGFEYAMAVMRMCDILHARHRSIFLRNEFGFT LTRYYKEQGRLLNIIHGLTTKVIRSKKAAFEQGTRGSLAQCELKAAALEREREQNGGVDQ TPSTAGSDEKDREKDKEKASPVAGLSYGQSAGLKDDLDVEDNDIGEKKRLAFLDLMLESA QNGALITDTEIKEQVDTIMFEGHDTTAAGSSFFLSLMGIHQDIQDRVLAELDSIFGDSQR PATFQDTLEMKYLERCLMETLRMYPPVPLIARELQEDLKLNSGNYVIPRGATVTVATVLL HRNPKVYANPNVFDPDNFLPERQANRHYYAFVPFSAGPRSCVG*RKYAMLKLKILLSTIL RNYRVYSDLTESDFKLQADIILKREEGFRVRLQPRTS* Cyp4p1 U34327, AC008186 comp(2-88, 148-465) EST AI293255 TSIGLIFGLMNMSLNPDKQELCYQEIQEHIDDDLSNLDVGQLNKLKYLEY FMKETTRLFPSVPIMGREAVQETELANGLILPKGAQITIHVFDIHRNAKY WDSPEEFRPERFLPENVQDRHTYAYVPFSAGQRNCIGKKYAMQEMKTLMV VLLKQFKVLKAIDPQKIVFHTGITLRTQDKIRVKLVRRT* Cyp4p2 AL055774 BACR021H19 80% identical to Cyp4p1 ETAMGIKLDEMAEKGDRYRANFHIIDEGLTRRIVNPLYWDDCVYNMFTG HKYNAALKVVHEFSREIIAKRRVLLEEELENRRATQTADDDM*KETFAML DTLICAEKDGLIDDIGISEEVDTLMAEGYDTTSIGLVFGLMNMSLYAAEQ ELCYQEIQEHIXDDLSNLNLSQLSKLNYLGYFIKETMRLYPSIPIMGRQT LQETELENGLILPKRSQINIHVFDI Cyp4p3 AC008186 48361-48564, 48626-49174 Drosophila chromosome 2 clone BACR10M14 (D583) RPCI-98 No ESTs GSS AL063060 FAMLDTLILAEKDGLIDHIGICEEVDTLMFEGYDTTSIGVVLGKRNMSLYAAEQNLC SQEIQEHIXDDLSNLTLSQLSKLNYLGYFIKETRRRYLSIPIMGRQTLQEPELENGLFLPKRSQINI HVFDIHRNPKYWESPEEFRPERFLPQNCLKRHPYAYIPFSAGQRNCIGK KYAMQEMKTLMVVILKHFKILPVIDPKSIVFQVGITLRFKNKIKVKLVRRNCV* N-terminal of Cyp4p sequences, cannot match up these N-terms with C-terms Except that N-term B follows directly after Cyp4p3 so it is not the 4p3 N-term N-term A AI389883 56% to 4p Nterm about 15 aa short to overlap with AL055774 MMICLLWISVAILVVIHWIYKVNKDYNILAFFARRVQTKDGKP LDSLVPMIKGRTVFANCFDLLGKDTDQVFTHLRQLAKNSGDSYLQYSMGF SNFNVIDAHNAANILNHPNLITKGVIYNFLHPFLRTGVLTATEKKWHTRR SMLTRTFHLDILNQFQEIFIAESLKFVSQFQGQNEV N-term B follows Cyp4p3 sequence on AC008186 MLILWLVGAFIVLIQWIYRLNRDYCILGFFAKRIRTKNGQNPESIAPLVKGSTIFANSFD LYGKDHAGVFEHSRDCAKKLGKSYAEYAMGTAIYNVIDADSAERVLNDPNLINKGTIYDFLHPFLRTGLLTSTG N-term C AI405947 GH26104, AI109737 GH09012 2 mismatches 
with AL055774 Cyp4p related AL063060 BACR006L24 (N-term part) N-term of 4p sequence Note: there may be alternative splicing in this gene since exon three Starting with KKWHARR does not match genomic sequence AC008186 However, EST AI389883 does match this genomic sequence exon 3 but not exons 1 and 2 MIILWLILALSALLYWLHRANKDYHILSFFTKRIRLKDG TPVEIIAPIAKGKTIFGNTLDLYGRDHAGVFNYSRERAKEMGTSYIEYVF GKAIYNIIDADSAENVLNHPNLITKGLVYNFLHPFLRTGLLTSTGKKWHA RRKMLTPTFHFNILNQFQEIFKTESQKFLLQFEGQDEVTITLHDVIPRFT LNSICETAMGVKLDEMAEKGDRYR AC007149 Drosophila melanogaster chromosome 2 clone BACR49A03 (D586) RPCI-98 49.A.3 map 33A1-33A1 strain y; cn bw sp, WORKING DRAFT SEQUENCE, 83 unordered pieces Length = 123109 This may be part of the missing end of 4p2 since N-term B follows 4p3 There is only one difference with 4p3 Query: 446 KYAMQEMKTLMVVLLKQFKVLKAIDPQKIVFHTGITLRTQDKIRVKLVRR 511 KYAMQE+KTLMVV+LK FK+L IDP+ IVF GITLR ++KI+VKLVRR Sbjct: 104464 KYAMQELKTLMVVILKHFKILPVIDPKSIVFQVGITLRFKNKIKVKLVRR 104661 This is the same as N-term C Query: 1 MIILWLILALSALLYWLHRANKDYHILSFFTKRIRLKDGTPVEIIAPIAKGK MIILWLILALSALLYWLHRANKDYHILSFFTKRIRL DGT VEIIAPIAKGK Sbjct: 105098 MIILWLILALSALLYWLHRANKDYHILSFFTKRIRLNDGTSVEIIAPIAKGK Query: 49 TIFGNTLDLYGRDH 66 TI+GNTLDLYGRDH Sbjct: 105241 TIYGNTLDLYGRDH 105294 hybrid full length Cyp4p sequence made of 4p1, 4p2, 4p3 parts MIILWLILALSALLYWLHRANKDYHILSFFTKRIRLKDG TPVEIIAPIAKGKTIFGNTLDLYGRDHAGVFNYSRERAKEMGTSYIEYVF GKAIYNIIDADSAENVLNHPNLITKGLVYNFLHPFLRTGLLTSTGKKWHA RRKMLTPTFHFNILNQFQEIFKTESQKFLLQFEGQDEVTITLHDVIPRFT LNSICETAMGIKLDEMAEKGDRYRANFHIIDEGLTRRIVNPLYWDDCVYN MFTGHKYNAALKVVHEFSREIIAKRRVLLEEELENRRATQTADDDM*KET FAMLDTLICAEKDGLIDDIGISEEVDTLMAEGYDTTSIGLVFGLMNMSLY AAEQELCYQEIQEHIXDDLSNLNLSQLSKLNYLGYFIKETMRLYPSIPIM GRQTLQETELENGLILPKRSQINIHVFDIHRNAKYWDSPEEFRPERFLPE NVQDRHTYAYVPFSAGQRNCIGKKYAMQEMKTLMVVLLKQFKVLKAIDPQ KIVFHTGITLRTQDKIRVKLVRRT* AL063060 this fragment spans the C-term of one gene and the N-term of another Query: 165 
PFSAGQRNCIGK------------------------KYAMQEMKTLMVVILKHFKILPVI 200 P SA QRNCIGK KYAMQEMKTLMVVILKHFKILPVI Sbjct: 28 PXSAXQRNCIGKSTSLQNIRFSYICVLSISFNHKGQKYAMQEMKTLMVVILKHFKILPVI 207 Query: 201 DPKSIVFQVGITLRFKNKIKVKLVVR 226 DPKSIVFQVGITLRFKNKIKVKLV R Sbjct: 208 DPKSIVFQVGITLRFKNKIKVKLVRR 285 Score = 139 bits (346), Expect(2) = 3e-78 Identities = 66/66 (100%), Positives = 66/66 (100%) Frame = +3 Query: 1 MIILWLILALSALLYWLHRANKDYHILSFFTKRIRLKDGTPVEIIAPIAKGKTIFGNTLD 60 MIILWLILALSALLYWLHRANKDYHILSFFTKRIRLKDGTPVEIIAPIAKGKTIFGNTLD Sbjct: 49611 MIILWLILALSALLYWLHRANKDYHILSFFTKRIRLKDGTPVEIIAPIAKGKTIFGNTLD 49790 Query: 61 LYGRDH 66 LYGRDH Sbjct: 49791 LYGRDH 49808 Score = 174 bits (437), Expect(2) = 3e-78 Identities = 89/94 (94%), Positives = 90/94 (95%), Gaps = 21/94 (22%) Frame = +2 Query: 67 AGVFNYSRERAKEMGTSYIEYVFGKAIYNIIDADSAENVLNHPNLITKGLVYNFLHPFLR 126 AGVFNYSRERAKEMGTSYIEYVFGKAIYNIIDADSAENVLNHPNLITKGLVYNFLHPFLR Sbjct: 49868 AGVFNYSRERAKEMGTSYIEYVFGKAIYNIIDADSAENVLNHPNLITKGLVYNFLHPFLR 50047 Query: 127 TGLLTS---------------------TGKKWHARRKMLTPTFHFNILNQFQEIF 160 TGLLTS TGKKWH RR MLT TFH +ILNQFQEIF Sbjct: 50048 TGLLTSTGELDCGFITRRYSSYYRTPFTGKKWHTRRSMLTRTFHLDILNQFQEIF 50212 1 N-terminal = MIILWLILALSALLYWLHRANKDYHILSFFTKRIRLKDGTPVEIIAPIAKGKTIFGNTLDLYGRDH* AGVFNYSRERAKEMGTSYIEYVFGKAIYNIIDADSAENVLNHPNLITKGLVYNFLHPFLRTGLLTST* GKKWHTRRSMLTRTFHLDILNQFQEIF Score = 199 bits (500), Expect = 2e-50 Identities = 93/132 (70%), Positives = 109/132 (82%), Gaps = 2/132 (1%) Frame = -3 Query: 350 IQEHIXDDLSNLNLSQLSKLNYLGYFIKETMRLYPSIPIMGRQTLQETELENGLILPKRS 409 + DD+SNL++ QL+KL YL YF+KET RL+PS+PIMGR+ +QETEL NGLILPK + Sbjct: 483 LMRSFSDDMSNLDVGQLNKLKYLEYFMKETTRLFPSVPIMGREAVQETELANGLILPKGA 304 Query: 410 QINIHVFDIHRNAKYWDSPEEFRPERFLPENVQDRHTYAYVPFSAGQRNCIGKKYAMQEM 469 QI IHVFDIHRNAKYWDSPEEFRPERFLPENVQDRHTYAYVPFSAGQRNCIGK+ + + Sbjct: 303 QITIHVFDIHRNAKYWDSPEEFRPERFLPENVQDRHTYAYVPFSAGQRNCIGKRRSRFSL 124 Query: 470 KT--LMVVLLKQFK 481 LM+ + Q K Sbjct: 123 SNDFLMIAFILQVK 82 
DDMSNLDVGQLNKLKYLEYFMKETTRLFPSVPIMGREAVQETELANGLILPKGA QITIHVFDIHRNAKYWDSPEEFRPERFLPENVQDRHTYAYVPFSAGQRNCIG KKYAMQEMKTLMVVLLKQFKVLKAIDPQ Score = 59.7 bits (142), Expect = 2e-08 Identities = 29/29 (100%), Positives = 29/29 (100%) Frame = -2 Query: 461 GKKYAMQEMKTLMVVLLKQFKVLKAIDPQ 489 GKKYAMQEMKTLMVVLLKQFKVLKAIDPQ Sbjct: 88 GKKYAMQEMKTLMVVLLKQFKVLKAIDPQ 2 Score = 90.9 bits (222), Expect = 8e-18 Identities = 43/84 (51%), Positives = 55/84 (65%) Frame = +3 Query: 1 MIILWLILALSALLYWLHRANKDYHILSFFTKRIRLKDGTPVEIIAPIAKGKTIFGNTLD 60 M+ILWL+ A L+ W++R N+DY IL FF KRIR K+G E IAP+ KG TIF N+ D Sbjct: 124743 MLILWLVGAFIVLIQWIYRLNRDYCILGFFAKRIRTKNGQNPESIAPLVKGSTIFANSFD 124922 Query: 61 LYGRDHAGVFNYSRERAKEMGTSY 84 LYG+DH +F S + TSY Sbjct: 124923 LYGKDHCKLFGGSFAGLSMLMTSY 124994 MLILWLVGAFIVLIQWIYRLNRDYCILGFFAKRIRTKNGQNPESIAPLVKGSTIFANSFD LYGKDHAGVFEHSRDCAKKLGKSYAEYAMGTAIYNVIDADSAERVLNDPNLINKGTIYDFLHPFLRTGLLTSTG Score = 107 bits (265), Expect = 7e-23 Identities = 50/68 (73%), Positives = 57/68 (83%) Frame = +1 Query: 67 AGVFNYSRERAKEMGTSYIEYVFGKAIYNIIDADSAENVLNHPNLITKGLVYNFLHPFLR 126 AGVF +SR+ AK++G SY EY G AIYN+IDADSAE VLN PNLI KG +Y+FLHPFLR Sbjct: 125023 AGVFEHSRDCAKKLGKSYAEYAMGTAIYNVIDADSAERVLNDPNLINKGTIYDFLHPFLR 125202 Query: 127 TGLLTSTG 134 TGLLTSTG Sbjct: 125203 TGLLTSTG 125226 Score = 63.2 bits (151), Expect = 2e-09 Identities = 21/51 (41%), Positives = 41/51 (80%) Frame = -1 Query: 3 ILWLILALSALLYWLHRANKDYHILSFFTKRIRLKDGTPVEIIAPIAKGKT 53 +LW+ +A+ +++W+++ NKDY+IL+FF +R++ KDG P++ + P+ KG+T Sbjct: 98294 LLWISVAILVVIHWIYKVNKDYNILAFFARRVQTKDGKPLDSLVPMIKGRT 98142 AL063060 frame 1 contains C-term of one gene and frame 2 has N-term of another frame 1 possible C-term of Cyp4p2 XXGSITPTFPXSXXQRNCIGKSTSLQNIRFSYICVLSISFNHKGQKYAMQ EMKTLMVVILKHFKILPXIDPKSIVFQVGITLRFKNKIKVKLVRRNCV* frame 2 possible N-term of Cyp4p1 MIXLWLILALSALLYWLHRANKDYHILSFFTKXIXLXDGTPVEIIAPIAK GKTIXXNTLDLYGR AC008324 Drosophila melanogaster chromosome 2 clone BACR25K01 (D854) Length = 122061 this sequence is 
identical to AC008003 from the EECNK exon on But the C-helix exon before this does not match Gene 1 76245-78135 38% to 4d14 N-term is also on AC008325 MFLEVLFAAPLVIFIFRKLWAHLNRTYFILSLCKRIRTEDGSLLESKIYVAPSKTRFGNNFDLVNFT TESIFNFMRDASAKAKGRNYLWYFFHAP MYNIVRAEEAEEILQSSKLITKNMIYELLKPFLGEGLLISTDQKWHSRRKALTPAFHFKVLQSFLIIFK EECNKLVKVLHQSVNMELELNQVIPQFTLNNVC ETALGVKLDDLSEGIRYRQSIHAIEEVMQQRLCNPFFYNIVYFFLFGDYRKQV NNLKIAHEFSSNIIEKRRSLFKSNQLGQEDEFGKKQRYAMLDTLLAAEADGQI DHQGICDEVNTFMFEGYDTTSTCLIFTLLMLALHEDVQKKCYEEIKYLPDDSDDISVFQ FNELVYMECVIKESLRLFPSVPFIGRQCVEETVVNGMVMPKDTQISIHLYEIMRDARHFS NPDLFQPDRFFPENTVNRHPFAFVPFSAGQRNCIGQKFAILEIKVLLAAVIRNFKILPVT LLDDLTFENGIVLRTKQNIKVKLVHRENK* Gene 2 78359-79584 gene runs off end of fragment EST AA567719 matches until FLFAP then diverges this may indicate alternate splicing with a third gene Downstream of this one. MWIALLGSSLLIGALWLLLRQLNKTYFILSLCKRVRTADGSPLESKVFVVPGKTRFGNNLDLLNLT PANIFSYIRESTAKANGQNYIWNFLFAP MYNVVRPEEAEEVFQSTKLITKNMSYELIRPFLGDGLLISTDHKWHSRRKALTPAFHFNVLQSFLGIFK EESKKFIKILDKNVGFELELNQIIPQFTLNNIC ETALGVKLDDMSEGNEYRKAIHDFEIVFNQRMCNPLMFFNWYFFLFGDYKKYSRILRTIHGFSSGIIQRK RQQFKQKQLGQVDEFGKKQRYAMLDTLLAAEAEGKIDHQGICDEVNTFMFGGYDTT STSLIFTLLL AA567719 AA697618 AI517752, AL108342 ETAL exon matches may extend (many Frame shifts) MWIALLGSSLLIGALWLLLRQLNKTYFILSLCKRVRTADGS PLESKVFVVPGKTRFGNNLDLLNLT PANIFSYIRESTAKANGQNYIWNFLFAP EYNIVRAEDAEEIFQSTKITTKNMSYELIRPFLGDGLLISIDQKWHTRRKTLTPAFHFNILQSFLSIF KEESQKFIKILDKNVGFELELNQIIPQFTLNNICETALGVKLDDMSEGNEYRKAIHDFEIVFNQR AC008003 Drosophila melanogaster chromosome 2 clone BACR48D02 (D851) Length = 161114 same sequence as AC008324 gene 2 except for C-helix exon AI517752 GH28810 AA567719 HL01677 ETAL exon matches AI404516 GH24257 but not before QKWHTRRKTLTPAFHFNILQSFLSIFKEESKKFIKILDKNVGFELELNQIIPQFTLNNIC ETALGVKLDDMSEGNEYRKAIHDFEIVFNQRMCNPLMFFNWYFFLFGDYKKYSRILRTIH GFSSGIIQRKRQQFKQKQLGQVDEFGKKQRYAMLDTLLAAEAEGKIDHQGICDEVNTFMF GGYDTTSTSLI AC008325 Drosophila melanogaster chromosome 2 clone BACR05M06 (D855) 
RPCI-98 Length = 129759 AI404516 AA698190 76% to AI517752 same as AC008325 MWIALLGIPILLAVLTLLLKHINKTYFILSLTKRVRTEDGSPLESKVAIMPGKTRFGNNL DILNFTPASVFNFVRESTAKAKGQNYLWYFLYAPMYNVVRPEEAEEVFQSTKLITKNVVY ELIRPFLGDGLLISTDHKWHSRRKALTPAFHFNVLQSFLGIFK EECKKFLNVLEKNLDAELELNQVIPPFTLNNIC ETALGVKLDDMSEGNEYRKAIH Cyp6a2 M88009, S51248, U78088, AC007549 AI238008 GH13965, AI114198 GH10927, AA696656 GM08076 MFVLIYLLIAISSLLAYLYHRNFNYWNRRGLPHDAPHPLYGNMVGFRKNR VMHDFFYDYYNKYRKSGFPFVGFYFLHKPAAFIVDTQLAKNILIKDFSNF ADRGQFHNGRDDPLTQHLFNLDGKKWKDMRQRLTPTFTSGKMKFMFPTVI KVSEEFVKVITEQVPAAQNGAVLEIKELMARFTTDVIGTCRFGIECNTLR TPVSDFRTMGQKVFTDMRHGKLLTMFVFSFPKLASRLRMRMMPEDVHQFF MRLVNDTIALRERENFKRNDFMNLLIELKQKGRVTLDNGEVIEGMDIGE LAAQVFVFYVAGFETSSSTMSYCLYELAQNQDIQDRLRNEIQTVLEEQEG QLTYESIKAMTYLNQVISETLRLYTLVPHLERKALNDYVVPGHEKLVIEK GTQVIIPACAYHRDEDLYPNPETFDPERFSPEKVAARESVEWLPFGDGPR NCIGMRFGQMQARIGLAQIISRFRVSVCDTTEIPLKYSPMSIVLGTVGGI YLRVERI AL062684.1|CNS002GP Drosophila melanogaster genome survey sequence TET3 end of 43% to 6a2 LAGXXXSTTMGFTLYELACNQDVQDKLRAEIXSVLERYNGKLEYDSMQDLFYME KVINESLRKHPVXAHLARIATKPYQHSNPKYFIEAGTGVLVSTLGIHHDPEXY PEPEKFIPERFDEEQVKRASHLRFPTFGAGPRNCIGLRFGRMQVIIGLALLIHNXRF EXHPKTPVPMKYTINNLLLGSEGGIHLNITKVVRD* AC010578 21500-20681 AA696094, AA803578, AA202305, AI546241 63% identical to AI063421 MAILLGLVVGVLTLVAWWVLQNYTYWKRRGIPHDPPNIPLGNTGELWRTMP LAGILKRTYLKFRKQTDGPFAGFYLYAMKYIVITDVDFVKTVLIRDFDKF HDRGVYHNDKDDPLTNNLATIEGQKWKNLRQKLTHTFTSAKMKSMFSTVL NVGDEMIRVVDEKISSSSQTLEVTDIVSRFTSDVIGICAFGLKCNSLRDP KAEFVQMGYSALRERRHGWLVDLLIFGMPKLAGELGFQFLLPSVQKFYMK IVQDTIDYRMKRKVTRNDFMDTLIDMKQQYDKGDKENGLAFNEVAAQAFV FFLAG 20681 AI063421 GH03219 AI532649 SD04231, 63% identical to AA696094 LIVLLIGVITFVAWYVHQHFNYWKRRGIPHDEPKIPYGNTSELMKTVHFA DIFKRTYNKLRNKTDGPFVGFYMYFKRMVVVTDIDFAKTVLIREFDKFHD RGVFHNERDDPLSANLVNIDGQKWKTLRQKLTPTFTSGKMKTMFPTILTV GDELIRVFGETASADSDSMEITNVVARFTADVIGSCAFGLDCHSLSDPKAKF AA951440 LD31895, AA816508 LD01943, AA201305 LD04267 AI134994 AL076863, AL076873 (BACR039A20) AC010578 
MLDVVALLLIALAVGFWFVRTRYSYWTRRGIGSEPARFPVGNMEGFRKNKHFI DIVTPIYEKFKGNGAPFAGFFMMLRPVVLVTDLELAKQILIQDFANFEDR GMYHNERDDPLTGHLFRIDGPKWRPLRQKMSPTFTSAKMKYMFPTVCEVG EELTQVCGELADNAMCGILEIGDLMARYTSDVIGRCAFGVECNGLRNPEA EFAIMGRRAFSERRHCKLVDGFIESFPEVARFLRMRQIHQDITDFYVGIV RETVKQREEQGIVRSDFMNLLIEMKQRGELTIEEMAAQAFIFFAAGFDTS ASTLGFAXYELAKQP AC010578 Drosophila melanogaster chromosome 2 clone BACR03K23 (D1086) RPCI-98 C-terminal region that could match with either AC010578 N-terminal AI402220 exact match to AC010578 33500-34100 region AI258456 = Cyp6a17 EFLAQAIIFLGAGFETSSTTMGFGIYELGRNQDVQDKLREEIGNVFGKHNKEFTYEGIKEMKYLEQVVMETLRKYPVLAHLTRMTD TDFSPEDPKYFIAKGTIVVIPALGIHYDPDIYPEPEIFKPERFTDEEIAARPSCTWLPFGEGPRNCIGLRFGMMQTCVGLAYLIRG YKFSVSPETQIPMKIVVKNILISAENGIHLKVEKLAK* L46858, AL061295 (BACR004K24) 52% identical to cyp6a5 EFTYDSMQELRYMELVIAETLRKYPILPQLTRISRHLYAAKGDRHFYIEP GQMLLIPVYGIHHDPALYPEPHKFIPERFLADQLAQRPTAAWLPFGDGPR NCIGMRFGKMQTTIGLVSLLRNFHFSVCPRTDPKIEFLKSNILLCPAHGI YLKVQQLSQMSS* AL061650.1|CNS00613 Drosophila melanogaster genome survey sequence TET3 end of 370-507 comp(305-718) 60% to L46858 TLRGXPLLPRLTRFSGLLYAARGVRLFXFGPGLLLLXPVYGIXXVPALXPXPHRFI PERX 539 LAGRLAPRPAAAWLPFGVGPRXCVGMGFGRVPAAVGLVGLLRIFRFGVCPRPGP GVAFLR 359 SPFLLCPAXGFCLGVPRL 305 AA801503 HL02667 I-helix and before, 53% identical to Cyp12a2 TVQEYRSPNGFLLRLGRETSLYRYIPTPTYKKFSRAMDEIFDT CSMYVNQAIERIDRKSSQGDSNDHKSVLEQLLQIDRKLAVVMAMDMLMGG VDTTSTAISGILLNLAKNPEKQQRLREEVLSKLTSLHSEFTVEDMK Dm0590 STS from DS07966-Sp6 in vector ad10sacBII, AL058810 (BACR003F22) 350-383 comp(931-1032) exact to Dm0590 DFLXAQAXSFEVAGIETCSASMSFALYELAKQPLMQSRLRREXREAFA SNPNGRLTYEAVARMEFLDMVVEETLRKYPIVPLLERECT 419-505 comp(214-474) PNPXRFNPXXFRCGEQALQXPXXYXPFGAGPXNCIGMQIGLLQIKLGLVYFLHQ HRVEIC DRTVERIQFDAKFALLASEQRIYLKVDCL* AC007137 Drosophila melanogaster chromosome 2 clone BACR25C02 (D544) AI512580 306-457 K-helix to heme comp(85862-86538) exact match to W91838 attempting to assemble P450 sequence most like 6d1 missing some internal seq. 
MWLLLPILLYSAVFLSVRHIYSHWRRRGFPSEKAGITWSFLQKAYRREFR HVEAICEAYQSGKDRLLGIYCFFRPVLLVRNVELAQTILQQSNGHFSELK WDYISGYRRFNLLEKLAPMFGTKRLSEMFGQVQKVGDHLIHHLLDRQGQG CPQEVDIQQKLRV*YSVNIIANLIYGLDINNFEHEDHILTSYLSHSQASIQS* FTLGRLPQKSSYTYRLRDLIKQSVELREDHGLIRKDILQLLVRFRN*NRLM VSFEVCIVKSCTNSLFLDADKLLSIKRLAKVAEDLLKVSLDAVASTVTFT LLEILQEPLIVEKLRAEIKELSNENGQLKFEELNGLRYMDMCLK*ETLRK YPPLPIIERVCRKSYSLPNSKFTIDEGKTLMVPLLAMHRDEKYFSEPMKY KPLRFLQTANDVGQCEDKTKSNVFIGFGIGGSQCVGTYRIQIILAMFRQY FSHRNYRTELCKVGN* Cyp6a8 L46859, AL054065, AI258590 LP01819, AI107730 GH05558 MALAYILFQVAVALLAILTYYIHRKLTYFKRRGIPFVAPHLIRGNMEELQ KTKNIHEIFQDHYNKFRESKAPFVGFFFFQSPAAFVIDLELAKQILIKDF SNFSNKGIFYNEKDDPISAHLFNLDGAQWRLLRNKLSSTFTSGKMKLMYP TVVSVANEFMTVMHEKVPKNSVLEIRDLVARFTVDVIGTCAFAIQCNSLR DEKAEFLYFGKRSLVDKRHGTLLNGFMRSYPKLARKLGMVRTAPHIQEFY SRIVTETVAVREKEHIKRNDFMDMLIELKNQKEMTLENGDVVRGLTMEEV LAQAFVFFIAGFETSSSTMGFALYELAKNPDIEQHDQNFTYECTKDLKYL NQVLDETLRLYTIVPNLDRMAAKRYVVPGHPNFVIEAGQSVIIPSSAIHH DPSIYPEPFEFRPERFSPEESAGRPSVAWLPFGDGPRNCIGLRFGQMQAR IGPALLIRNFKFSTCSKTPNPLVYDPKSFVLGVKDGIYLKVETV AC008285 Drosophila melanogaster chromosome 3 clone BACR31P16 (D1002) 85740-87319 77% identical to 6a8 (this is a new 6a subfamily member) AI403838 GH23364 80% to Cyp6a8 N terminal MQLTYFLFQVAVALLAIVTYILHRKLTYFKRRGIPYDKPHPLRGNMEGYKKTRTVHEIHQ EYYNKYRNSKAPFVGFYLFQKPAAFVIDLELAKQILIKNFSNFTDKGIYYNEKDDPMSAH LFNLDGPQWRLLRSKLSSTFTSGKMKFMYPTVVSVAEEFMAVMHEKVSENSILDVRDLVA RFTVDVIGTCAFGIKCNSLRDEKAEFLHFGRRALLDSRHGNLVSGLMRSYPNLARRLGLC RNTAQIQEFYQRIVKETVTLREKENIKRNDFMDMLIGLKNQKNMTLENGEVVKGLTMDEI VAQAFVFFIAGFDTSSSTMGFALYELAKNPSIEQHDQKFTYECIKDLKYLDQVLSETLRH YTIVPNVDRVAAKRFVVPGNPKFVIEAGQSVIIPSSAIHHDPSIYPEPNEFRPERFSPEE SAKRPSVAWLPFGEGPRNCIGLRFGQMQARIGLAMLIKNFTFSPCSATPDPLTFDPHSAI LLGIKGGIQLKVEAI* AC004721 Cyp6a16 DS03308 complete 2-455 comp(2956-5070) 47% to 6a8 MDFTLLLLTSLLSFLLGYLRYRFTYWELRGIPQLRPHFLFGHFFRLQSVHYSELLQETYDAF RGSAKVAGTYVFLRPMAVVLDLDLVKAVLIRDFNNFVDRRSFHGDPLTANLFNLQGEEWRNL RTKLSPTFTSGKMKYMFGTVSTVAQQLGGTFDELVAVLELHDLMARYTTDVI 
GSCAFGTECSSLREPQAEFRQVGRRIFRNSNRSIRWRIFKMTYLSSLAKLGLPVRI LHPDITKFFNRIVRETVELRERENIRRNDFMDLLLDLRR KGLTMEQMAAQAFVFFVAGFETSSSNMSYALFELAKNQDVQQKLRMEINDSIGKHG KLTYEAMMEMPYLDQTI PETLRKYPALSSLTRLASEDYEIPSPDGGDPVVLEKGTSVHIPVLAIHYDPEVYPEP HEFRPERFAPDACRERHPTAFLGFGDGPRNCIGLRFGRMQVKVGLITLLR RFRFSLPPGSPTQLKVTKRNLILLPSDGVRLQVDPVESRLM* Cyp6a9 L46860, AL054861, AL053264, AL072094, AL055555 AL070586, AL054261, AI257903 AL078165 AL069773 MGVYSVLLAIVVVLVGYLLLKWRRALHYWQNLDIPCE EPHILMGSLTGVQTSRSFSAIWMDYYNKFRGTGPFAGFYWFQRPGILVLD ISLAKLILIKEFNKFTDRGFYHNTEDDPLSGQLFLLDGQKWKSMRSKLSS TFTSGKMKYMFPTVVKVGHEFIEVFGQAMEKSPIVEVRDILARFTTDVIG TCAFGIECSSLKDPEAEFRVMGRRAIFEQRHGPIGIAFINSFQNLARRLH MKITLEEAEHFFLRIVRETVAFREKNNIRRNDFMDQLIDLKNSPLTKSES GESVNLTIEEMAAQAFVFFGAGFETSSTTMGFALYELAQHQDIQDRVRKE CQEVIGKYNGEITYESMKDMVYLDQVISETLRLYTVLPDLNRECLEDYEV PGHPKYVIKKGMPVLIPCGAMHRDEKLYANPNTFNPIFFARTSEGSDSVE WLPFGDGPRLCIGMRFGQMQARSGLALLINRFKFSVCEQTTIPIVYSKKT FLISSETGIFLKVERV AL072844.1|CNS00H2C Drosophila melanogaster genome survey sequence TET3 end 404-507 19-330 42% with 6a5 LLIPTAAIHMDPGIYENPQRFYPERFXEQAXRSRPAAAFLPXGDGLRGCIAARFAEQ QLLVGLVALLRQHRYAPSAETSIPVEYDNRRLLLMPKSDIKLSVERVDXL* AC008288 = AL072844 AC009342 WGNIKGVVSGKRHAQDALQDIY TAYKGRAPFVGFYACLKPFILALDLKLVHQIIFTDAGHFTSRGLYSNPSGEPLSHNLLQLNGHKWRSLHAKSAEVFTPANMQKLLV RLSQISSRIQRDLGEKSLQTINISELVGAYNTDVMASMAFGLVGQDNVEFAKWTRNYWADFRMWQAYLALEFPLIARLLQYKSYAE PATAYFQKVALSQLQLHRRRDRQPLQTFLQLYSNAEKPLTDIEIAGQAFGFVLAGLGPLNATLAFCLYELARQPEVQDRTRLEINK ALEEHGGQVTPECLRELRYTKQVLNETLRLHTPHPFLLRRATKEFEVPGSVFVIAKGNNVLIPTAAI HMDPGIYENPQRFYPERFEEQARRSRPAAAFLPFGDGLRGCIAARFAEQQLLVGLVALLRQHRYAPSAETSIPVEYDNRRLLLMPK SDIKLSVERVDKL* AL069964.1|CNS00DFU Drosophila melanogaster genome survey sequence T7 end of 72% to 6a9 N-terminal MSVGTVLLTALLALVGYLLMKWRSXMRHWQDLGIPCEEPHILMG SMKGVRTARSFNEIWTSYYNKFRGSGPFAGFYWFRRPAVFVLEK SLXKQILIKEFNKFTDRGXFHNPEDDPLSGQLFLLDGQKWRTMR NSTSSTFTSGKMKY Cyp6a9 HOMOLOG Drosophila grimshawi U87164 DPEAEFRIMGRKSLTDQRHGNLGNALLNGFPNFSRRIHMKLTPEHIEKFF 
MRIVKETVDYREKNNVRRNDFMDQLIDLKNKPLMKSETGESMNLTIEEIS AQALVFFAAGFETSSTTMGFALYELARAEDVQNRLRKECNEVLARHNGDL TYESIKDMKYLDQVISETLRLYTVLPILNRQCLEDYVVP Cyp6a13 Drosophila melanogaster AC005457, DS08616, AA941155 LD25139 4-474 comp(7564-8986) 52% identical to CYP6A5 probable 6A protein AC007085 (BACR21H10) comp(108394-108855) MLTLLVLVFTVGLLLYVKLRWHYSYWSRRGV AGERPVYFRGNMSGLGRDLHWTDINLRIYRKFRGVERYCGYFTFMTKSLFIMDLELIRDI MIRDFSSFADRGLFHNVRDDPLT GNLLFLDGPEWRWLRQNLTQVFTSGKMKFMFPNMVEVGEKLTQACRLQVGEIEAKD LCARFTTDVIGSCAFGLECNSLQDPESQFRRMGRSVTQEPLHSVLVQAFMFAQPELARKL RFRLFRPEVSEFFLDTVRQTLDYRRRENIHRNDLIQLLMELGEEGVKDALSFEQIAAQALV FFLAGFDTSSTTMSFCLYELALNPDVQERLRVEVLAVLKRNNQKLTYDSVQEMPYLDQ VVAETLRKYPILPHLLRRSTKEYQIPNSNLILEPGSKIIIPVHSIHHDPELYPDPEKFDPSRFE PEEIKARHPFAYLPFGEGPRNCIGERFGKLQVKVGLVYLLRDFKFSRSEKTQIPLKFS SRNFLISTQEGVHLRME AC009844 45% to 6a2 no exact match in nr same as AL097801, AL057750 65863-64775 3 IN FRAME STOPS COULD BE A PSEUDOGENE OR THERE ARE FRAMESHIFTS missing about 30 AA at C-terminal MAVMIVLLIGVITFLAWYVHQHFNYWKRRGI FPR*APKLPTVIPAYLMKTRPFCGYFSRDPPTKLRTKPAGPFVGFLYVFQEDW*L*PNIDSAKPE LIREFDKFPVGGVFHNERED PLSATLVNIDGQKWKPLRQKLTPTFTSGKMKTMFPTILTVGDELIRVFGETASADSDSME ITNVVARFTADVIGSCAFGLDCHSLSDPKAKFVQMGTTAITERRHGKSMDLLLFGAPELA AKLRMKATVQEVEDFYMNIIRDTVDYRVKNNVKRHDFVDMLIEMKLKFDNGDKENGLTFN EIAAQAFIFFLAGFETSSTTMGFALYELACHQDIQDKLRTEINTVLKQHNGKLDYDSMRE MTYLEKVITETMRKRPVVGHLIRVATQHYQHTNPKYNIEKGTGVIVPTLAIHHDPEFYPE PEKFIPERFDEDQVQQRPXCTFLPFGDGPRNCIGLRFGRMQVIVGXALLIHNFKF AC009844 Drosophila melanogaster chromosome 2 clone BACR01L16 (D1052) Query: 1 MALAYILFQVAVALLAILTYYIHRKLTYFKRRGIPFVAPHLIRGNMEELQKTKNIHEIFQ 60 MAL YILFQVAVALLAILTYYIHRKLTYFKRRGIPFVAPHLIRGNMEELQKTKNIHEIFQ Sbjct: 77597 MALTYILFQVAVALLAILTYYIHRKLTYFKRRGIPFVAPHLIRGNMEELQKTKNIHEIFQ 77418 Query: 61 DHYNKFRESKAPFVGFFFFQSPAAFVIDLELAKQILIKDFSNFSNKGIFYNEKDDPISAH 120 DHYNKFRESKAPFVGFFFFQSPAAFVIDLELAKQILIKDFSNFSNKGIFYNEKDDPISAH Sbjct: 77417 DHYNKFRESKAPFVGFFFFQSPAAFVIDLELAKQILIKDFSNFSNKGIFYNEKDDPISAH 77238 Query: 121 
LFNLDGAQWRLLRNKLSSTFTSGKMKLMYPTVVSVANEFMTVMHEKVPKNSVLEIRDLVA 180 LFNLDGAQWRLLRNKLSSTFTSGKMKLMYPTVVSVANEFMTVMHEKVPKNSVLEIRDLVA Sbjct: 77237 LFNLDGAQWRLLRNKLSSTFTSGKMKLMYPTVVSVANEFMTVMHEKVPKNSVLEIRDLVA 77058 Query: 181 RFTVDVIGTCAFAIQCNSLRDEKA 204 RFTVDVIGTCAF IQCNSLRD KA Sbjct: 77057 RFTVDVIGTCAFGIQCNSLRDVKA 76986 Score = 284 bits (718), Expect(2) = 2e-87 Identities = 155/350 (44%), Positives = 230/350 (65%), Gaps = 17/350 (4%) Frame = -1 Query: 6 ILFQVAVALLAILTYYIHRKLTYFKRRGI-PFVAPHLIRGNMEELQKTKNIHEIF-QDHY 63 ++ + + ++ L +Y+H+ Y+KRRGI P AP L L KT+ F +D Sbjct: 65863 VMIVLLIGVITFLAWYVHQHFNYWKRRGIFPR*APKLPTVIPAYLMKTRPFCGYFSRDPP 65684 Query: 64 NKFRESKA-PFVGFFF-FQSPAAFVIDLELAKQILIKDFSNFSNKGIFYNEKDDPISAHL 121 K R A PFVGF + FQ +++ AK LI++F F G+F+NE++DP+SA L Sbjct: 65683 TKLRTKPAGPFVGFLYVFQEDW*L*PNIDSAKPELIREFDKFPVGGVFHNEREDPLSATL 65504 Query: 122 FNLDGAQWRLLRNKLSSTFTSGKMKLMYPTVVSVANEFMTVMHEKVPKNS-VLEIRDLVA 180 N+DG +W+ LR KL+ TFTSGKMK M+PT+++V +E + V E +S +EI ++VA Sbjct: 65503 VNIDGQKWKPLRQKLTPTFTSGKMKTMFPTILTVGDELIRVFGETASADSDSMEITNVVA 65324 Query: 181 RFTVDVIGTCAFAIQCNSLRDEKAEFLYFGKRSLVDKRHGTLLNGFMRSYPKLARKLGMV 240 RFT DVIG+CAF + C+SL D KA+F+ G ++ ++RHG ++ + P+LA KL M Sbjct: 65323 RFTADVIGSCAFGLDCHSLSDPKAKFVQMGTTAITERRHGKSMDLLLFGAPELAAKLRMK 65144 Query: 241 RTAPHIQEFYSRIVTETVAVREKEHIKRNDFMDMLIELKNQKEMTLENGDVVRGLTMEEV 300 T +++FY I+ +TV R K ++KR+DF+DMLIE+K + +NGD GLT E+ Sbjct: 65143 ATVQEVEDFYMNIIRDTVDYRVKNNVKRHDFVDMLIEMK----LKFDNGDKENGLTFNEI 64976 Query: 301 LAQAFVFFIAGFETSSSTMGFALYELAKNPDIE------------QHDQNFTYECTKDLK 348 AQAF+FF+AGFETSS+TMGFALYELA + DI+ QH+ Y+ +++ Sbjct: 64975 AAQAFIFFLAGFETSSTTMGFALYELACHQDIQDKLRTEINTVLKQHNGKLDYDSMREMT 64796 Query: 349 YLNQVLD 355 YL +V+D Sbjct: 64795 YLEKVID 64775 Score = 280 bits (708), Expect = 1e-74 Identities = 162/219 (73%), Positives = 172/219 (77%), Gaps = 33/219 (15%) Frame = +3 Query: 238 GMVRTAPHIQEFYSRIVTETVAVREKEHIKRNDFMDMLIELKNQKE---------MTLEN 288 G VRTAPHIQEFYSRIVTETVAVREKEHIKRNDFMDMLIELKNQKE + + Sbjct: 27510 
GKVRTAPHIQEFYSRIVTETVAVREKEHIKRNDFMDMLIELKNQKESDSREWRCGQRINH 27689 Query: 289 GD-------VVRGLTMEEVLAQAFVFFIAGFETSSSTMGFA------------LYELAKN 329 G V+ + +L G E+ S G LY L + Sbjct: 27690 GGGFGTSLCVLHCWL*DILLHHGICPIRTGKESGHSR*GSG*GGGGHRAT*PKLY-LRVH 27866 Query: 330 PDIEQHDQNFTYECTKDLKYLN-----QVLDETLRLYTIVPNLDRMAAKRYVVPGHPNFV 384 E F + + LK LN +L ETLRLYTIVPNLDRMAAKRYVVPGHPNFV Sbjct: 27867 QGSEVPQSGFRWYVNEYLK*LNPTDFNNILAETLRLYTIVPNLDRMAAKRYVVPGHPNFV 28046 Query: 385 IEAGQSVIIPSSAIHHDPSIYPEPFEFRPERFSPEESAGRPSVAWLPFGDGPRNCIGLRF 444 IEAGQSVIIPSSAIHHDPSIYPEPFEFRPERFSPEESAGRPSVAWLPFGDGPRNCIGLRF Sbjct: 28047 IEAGQSVIIPSSAIHHDPSIYPEPFEFRPERFSPEESAGRPSVAWLPFGDGPRNCIGLRF 28226 Query: 445 GQMQARIGPALL 456 GQMQARIG ALL Sbjct: 28227 GQMQARIGLALL 28262 Score = 196 bits (494), Expect(2) = 2e-94 Identities = 102/186 (54%), Positives = 140/186 (74%), Gaps = 12/186 (6%) Frame = +2 Query: 169 KNSVLEIRDLVARFTVDVIGTCAFAIQCNSLRDEKAEFLYFGKRSLVDKRHGTLLNGFMR 228 K+ ++E+RD++ARFT DVIGTCAF I+C+SL+D +AEF G+R++ ++RHG + F+ Sbjct: 69299 KSPIVEVRDILARFTTDVIGTCAFGIECSSLKDPEAEFRVMGRRAIFEQRHGPIGIAFIN 69478 Query: 229 SYPKLARKLGMVRTAPHIQEFYSRIVTETVAVREKEHIKRNDFMDMLIELKNQKEMTLEN 288 S+ LAR+L M T + F+ RIV ETVA REK +I+RNDFMD LI+LKN E+ Sbjct: 69479 SFQNLARRLHMKITLEEAEHFFLRIVRETVAFREKNNIRRNDFMDQLIDLKNSPLTKSES 69658 Query: 289 GDVVRGLTMEEVLAQAFVFFIAGFETSSSTMGFALYELAKNPDIE------------QHD 336 G+ V LT+EE+ AQAFVFF AGFETSS+TMGFALYELA++ DI+ +++ Sbjct: 69659 GESV-NLTIEEMAAQAFVFFGAGFETSSTTMGFALYELAQHQDIQDRVRKECQEVIGKYN 69835 Query: 337 QNFTYECTKDLKYLNQVL 354 TYE KD+ YL+QV+ Sbjct: 69836 GEITYESMKDMVYLDQVI 69889 Score = 181 bits (455), Expect = 5e-45 Identities = 93/188 (49%), Positives = 136/188 (71%), Gaps = 12/188 (6%) Frame = -2 Query: 172 VLEIRDLVARFTVDVIGTCAFAIQCNSLRDEKAEFLYFGKRSLVDKRHGTLLNGFMRSYP 231 VLEI DLVAR+T DVIG CAF + CNSL++ AEF+ GKR+++++R+G LL+ + +P Sbjct: 28950 VLEIVDLVARYTPDVIGNCAFGLNCNSLQNPNAEFVTIGKRAIIERRYGGLLDFLIFGFP 28771 Query: 232 KLARKLGMVRTAPHIQEFYSRIVTETVAVREKEHIKRNDFMDMLIELKNQKEMTLENGDV 291 KL+R+L 
+ +++FY+ IV T+ R + + KR+DFMD LIE+ +++ G+ Sbjct: 28770 KLSRRLRLKLNVQDVEDFYTSIVRNTIDYRLRTNEKRHDFMDSLIEMYEKEQA----GNT 28603 Query: 292 VRGLTMEEVLAQAFVFFIAGFETSSSTMGFALYELAKNPDIEQHDQNF 339 GL+ E+LAQAF+FF+AGFETSS+TMGFALYELA + DI++H+ F Sbjct: 28602 EDGLSFNEILAQAFIFFVAGFETSSTTMGFALYELALDQDIQKHNNEF 28423 Query: 340 TYECTKDLKYLNQVLDETLR 359 TYE K++KYL QV+ T + Sbjct: 28422 TYEGIKEMKYLEQVVMGTFQ 28363 VLEIVDLVARYTPDVIGNCAFGLNCNSLQNPNAEFVTIGKRAIIERRYGGLLDFLIFGFP 28771 KLSRRLRLKLNVQDVEDFYTSIVRNTIDYRLRTNEKRHDFMDSLIEMYEKEQAGNT 28603 EDGLSFNEILAQAFIFFVAGFETSSTTMGFALYELALDQDIQKHNNEF 28423 TYEGIKEMKYLEQVVMETLRKYPVLAHLTRMTQTDFSPEDPKYFIPKGTTGVIPALGIHYDPEIYPEP GEVKPERLTDEAIAARPSCTWL Sbjct: 42575 AEFVTIGKRAIIERRYGGLLDFLIFGFPKLSRRLRLKLNVQDVEDFYTSIVRNTIDYRLR 42754 Query: 264 EHIKRNDFMDMLIELKNQKEMTLENGDVVRGLTMEEVLAQAFVFFIAGFETSSSTMGFAL 323 + KR+DFMD LIE+ +++ G+ GL+ E+LAQAF+FF+AGFETSS+TMGFAL Sbjct: 42755 TNEKRHDFMDSLIEMYEKEQA----GNTEDGLSFNEILAQAFIFFVAGFETSSTTMGFAL 42922 Query: 324 YELAKNPDIE------------QHDQNFTYECTKDLKYLNQVLD---ETLRLYTIVPNLD 368 YELA + DI+ +H+ FTYE K++KYL QV+ RL+ Sbjct: 42923 YELALDQDIQDQLRAEINNVLSKHNNEFTYEGIKEMKYLEQVVMGK*NPKRLHNFT*R-- 43096 Query: 369 RMAAKRYVVPGHP-----------------------NFVIEAGQSVIIPSSAIHHDPSIY 405 A + P P N ++ G + +IP+ IH+DP IY Sbjct: 43097 ---AFSHFTPQKPFASIQFWRI*RE*PRQIFRLKILNTLLPKGTTGVIPALGIHYDPEIY 43267 Query: 406 PEP 408 PEP Sbjct: 43268 PEP 43276 Query: 356 ETLRLYTIVPNLDRMAAKRYVVPGHPNFVIEAGQSVIIPSSAIHHDPSIYPEPFEFRPER 415 ETLR Y ++ +L RM + P P + I G PSS E +PER Sbjct: 43120 ETLRKYPVLAHLTRMTQTDFS-PEDPKYFIAQGNHWCDPSSWYSL*SGNLSRTGEVKPER 43296 Query: 416 FSPEESAGRPSVAWL 430 + E A RPS WL Sbjct: 43297 LTDEAIAARPSCTWL 43341 Score = 174 bits (436), Expect = 8e-43 Identities = 83/152 (54%), Positives = 104/152 (67%), Gaps = 1/152 (0%) Frame = -2 Query: 343 CTKDLKYLN-QVLDETLRLYTIVPNLDRMAAKRYVVPGHPNFVIEAGQSVIIPSSAIHHD 401 C+ L +L + ETLRLYT++P L+R + Y VPGHP +VI+ G V+IP A+H D Sbjct: 35619 CSFSLSFLTFGFILETLRLYTVLPVLNRECLEDYEVPGHPKYVIKKGMPVLIPCGAMHRD 35440 Query: 402 
PSIYPEPFEFRPERFSPEESAGRPSVAWLPFGDGPRNCIGLRFGQMQARIGPALLIRNFK 461 +Y P F P+ FSPE R SV WLPFGDGPRNCIG+RFGQMQARIG ALLI++FK Sbjct: 35439 EKLYANPNTFNPDNFSPERVKERDSVEWLPFGDGPRNCIGMRFGQMQARIGLALLIKDFK 35260 Query: 462 FSTCSKTPNPLVYDPKSFVLGVKDGIYLKVETV 494 FS C KT P+ Y+ + F++ GIYLK E V Sbjct: 35259 FSVCEKTTIPMTYNKEMFLIASNSGIYLKAERV 35161 IERFFMRIVRETVAFREQNNIRRNDFMDQLIDLKNKPLMVSQSGESVNLTIEEIAAQAF 35817 VFFAAGFETSSTTMGFALYELAQNQDIEKCNGELNYESMKDLVYLDQ 35637 ETLRLYTVLPVLNRECLEDYEVPGHPKYVIKKGMPVLIPCGAMHRD 35440 EKLYANPNTFNPDNFSPERVKERDSVEWLPFGDGPRNCIGMRFGQMQARIGLALLIKDFK 35260 FSVCEKTTIPMTYNKEMFLIASNSGIYLKAERV 35161 Score = 171 bits (429), Expect(2) = 2e-94 Identities = 80/139 (57%), Positives = 97/139 (69%) Frame = +1 Query: 356 ETLRLYTIVPNLDRMAAKRYVVPGHPNFVIEAGQSVIIPSSAIHHDPSIYPEPFEFRPER 415 ETLRLYT++P L+R + Y VPGHP +VI+ G V+IP A+H D +Y P F P+ Sbjct: 69946 ETLRLYTVLPVLNRECLEDYEVPGHPKYVIKKGMPVLIPCGAMHRDEKLYANPNTFNPDN 70125 Query: 416 FSPEESAGRPSVAWLPFGDGPRNCIGLRFGQMQARIGPALLIRNFKFSTCSKTPNPLVYD 475 FSPE R SV WLPFGDGPRNCIG+RFGQMQAR G ALLI FKFS C +T P+VY Sbjct: 70126 FSPERVKERDSVEWLPFGDGPRNCIGMRFGQMQARSGLALLINRFKFSVCEQTTIPIVYS 70305 Query: 476 PKSFVLGVKDGIYLKVETV 494 K F++ + GI+LKVE V Sbjct: 70306 KKRFLISSETGIFLKVERV 70362 ETLRLYTVLPVLNRECLEDYEVPGHPKYVIKKGMPVLIPCGAMHRDEKLYANPNTFNPDN 70125 FSPERVKERDSVEWLPFGDGPRNCIGMRFGQMQARSGLALLINRFKFSVCEQTTIPIVYS 70305 KKRFLISSETGIFLKVERV 70362 Score = 149 bits (373), Expect = 2e-35 Identities = 89/205 (43%), Positives = 135/205 (65%), Gaps = 38/205 (18%) Frame = +2 Query: 204 AEFLYFGKRSLVDKRHGTLLNGFMRSYPKLARKLGMVRTAPHIQEFYSRIVTETVAVREK 263 AEF+ GKR+++++R+G LL+ + +PKL+R+L + +++FY+ IV T+ R + Sbjct: 42575 AEFVTIGKRAIIERRYGGLLDFLIFGFPKLSRRLRLKLNVQDVEDFYTSIVRNTIDYRLR 42754 Query: 264 EHIKRNDFMDMLIELKNQKEMTLENGDVVRGLTMEEVLAQAFVFFIAGFETSSSTMGFAL 323 + KR+DFMD LIE+ +++ G+ GL+ E+LAQAF+FF+AGFETSS+TMGFAL Sbjct: 42755 TNEKRHDFMDSLIEMYEKEQA----GNTEDGLSFNEILAQAFIFFVAGFETSSTTMGFAL 42922 Query: 324 YELAKNPDIE------------QHDQNFTYECTKDLKYLNQVLD---ETLRLYTIVPNLD 
368 YELA + DI+ +H+ FTYE K++KYL QV+ RL+ Sbjct: 42923 YELALDQDIQDQLRAEINNVLSKHNNEFTYEGIKEMKYLEQVVMGK*NPKRLHNFT*R-- 43096 Query: 369 RMAAKRYVVPGHP-----------------------NFVIEAGQSVIIPSSAIHHDPSIY 405 A + P P N ++ G + +IP+ IH+DP IY Sbjct: 43097 ---AFSHFTPQKPFASIQFWRI*RE*PRQIFRLKILNTLLPKGTTGVIPALGIHYDPEIY 43267 Query: 406 PEP 408 PEP Sbjct: 43268 PEP 43276 Score = 141 bits (352), Expect = 6e-33 Identities = 75/154 (48%), Positives = 103/154 (66%), Gaps = 18/154 (11%) Frame = -2 Query: 341 YECTKDLKYLNQVLDETLRLYTIVPNLDRMAAKRYVVPGHPN 382 Y+ +DL Y+ +V++E+LR + +V +L R+A K Y +P Sbjct: 71283 YDSMQDLFYMEKVINESLRKHPVVAHLARIATKPYQ-HSNPK 71107 Query: 383 FVIEAGQSVIIPSSAIHHDPSIYPEPFEFRPERFSPEESAGRPSVAWLPFGDGPRNCIGL 442 + IEAG V++ + IHHDP YPEP +F PERF E+ RP+ A+LPFG GPRNCIGL Sbjct: 71106 YFIEAGTGVLVSTLGIHHDPEFYPEPEKFIPERFDEEQVKKRPTCAFLPFGAGPRNCIGL 70927 Query: 443 RFGQMQARIGPALLIRNFKFSTCSKTPNPLVYDPKSFVLGVKDGIYLKVETV 494 RFG+MQ IG ALLI NF+F KTP P+ Y + +LG + GI+L + V Sbjct: 70926 RFGRMQVIIGLALLIHNFRFELHPKTPVPMKYTINNLLLGSEGGIHLNITKV 70771 YDSMQDLFYMEKVINESLRKHPVVAHLARIATKPYQHSNPK 71107 YFIEAGTGVLVSTLGIHHDPEFYPEPEKFIPERFDEEQVKKRPTCAFLPFGAGPRNCIGL 70927 RFGRMQVIIGLALLIHNFRFELHPKTPVPMKYTINNLLLGSEGGIHLNITKV 70771 Score = 122 bits (302), Expect = 4e-27 Identities = 67/109 (61%), Positives = 85/109 (77%), Gaps = 12/109 (11%) Frame = -3 Query: 246 IQEFYSRIVTETVAVREKEHIKRNDFMDMLIELKNQKEMTLENGDVVRGLTMEEVLAQAF 305 I+ F+ RIV ETVA RE+ +I+RNDFMD LI+LKN+ M ++G+ V LT+EE+ AQAF Sbjct: 35993 IERFFMRIVRETVAFREQNNIRRNDFMDQLIDLKNKPLMVSQSGESVN-LTIEEIAAQAF 35817 Query: 306 VFFIAGFETSSSTMGFALYELAKNPD------------IEQHDQNFTYECTKDLKYLNQV 353 VFF AGFETSS+TMGFALYELA+N D IE+ + YE KDL YL+QV Sbjct: 35816 VFFAAGFETSSTTMGFALYELAQNQDIQNRVRKECQEVIEKCNGELNYESMKDLVYLDQV 35637 Query: 354 L 354 + Sbjct: 35636 V 35634 IERFFMRIVRETVAFREQNNIRRNDFMDQLIDLKNKPLMVSQSGESVNLTIEEIAAQAF 35817 VFFAAGFETSSTTMGFALYELAQNQDIEKCNGELNYESMKDLVYLDQV 35637 Score = 83.1 bits (202), Expect = 2e-15 Identities = 44/170 (25%), Positives = 84/170 (48%), 
Gaps = 3/170 (1%) Frame = +3 Query: 106 KGIFYNEKDDPISAHLFNLDGAQWRLLRNKLSSTFTSGKMKLMYPTVVSVANEFMTVMHE 165 KG++ N+K DP+S L+ L G W+ +R KL + +M L+Y + A + + ++ Sbjct: 82008 KGLYCNQKSDPLSGDLYALRGESWKEMRQKLDPSLEGDRMSLLYDCLYEEAEQLLLTVNS 82187 Query: 166 KV--PKNSVLEIRDLVARFTVDVIGTCAFAIQCNSLRDEKAE-FLYFGKRSLVDKRHGTL 222 + +S + I+ ++ R+ + + C F + + E F + +L +HG L Sbjct: 82188 TLMSQPHSTVHIQKIMRRYVLSSLAKCVFGLNAEQRKTYPLEDFEQMTELALNSHKHGYL 82367 Query: 223 LNGFMRSYPKLARKLGMVRTAPHIQEFYSRIVTETVAVREKEHIKRNDFMDML 275 +N M P R L M RT +E++ +++T V RE + D++ +L Sbjct: 82368 MNLMMIRVPNFCRMLRMRRTPKQAEEYFIKLLTSIVEQRETSGKPQKDYLQLL 82526 KGLYCNQKSDPLSGDLYALRGESWKEMRQKLDPSLEGDRMSLLYDCLYEEAEQLLLTVNS 82187 TLMSQPHSTVHIQKIMRRYVLSSLAKCVFGLNAEQRKTYPLEDFEQMTELALNSHKHGYL 82367 MNLMMIRVPNFCRMLRMRRTPKQAEEYFIKLLTSIVEQRETSGKPQKDYLQLL 82526 Score = 73.7 bits (178), Expect(2) = 2e-21 Identities = 31/55 (56%), Positives = 44/55 (79%) Frame = +2 Query: 64 NKFRESKAPFVGFFFFQSPAAFVIDLELAKQILIKDFSNFSNKGIFYNEKDDPIS 118 +K++ S PF GF+FF + +A + DLEL K++LIKDF++F N+GIFYNE DDP+S Sbjct: 44717 SKYKRSVYPFAGFYFFFTRSAVITDLELVKRVLIKDFNHFENRGIFYNEIDDPLS 44881 Score = 60.9 bits (145), Expect(2) = 2e-87 Identities = 31/70 (44%), Positives = 44/70 (62%) Frame = -3 Query: 347 LKYLNQVLDETLRLYTIVPNLDRMAAKRYVVPGHPNFVIEAGQSVIIPSSAIHHDPSIYP 406 L+ + + ET+R +V +L R+A + Y +P + IE G VI+P+ AIHHDP YP Sbjct: 64673 LQITSILFTETMRKRPVVGHLIRVATQHYQHT-NPKYNIEKGTGVIVPTLAIHHDPEFYP 64497 Query: 407 EPFEFRPERF 416 EP +F PERF Sbjct: 64496 EPEKFIPERF 64467 Score = 56.2 bits (133), Expect = 3e-07 Identities = 26/82 (31%), Positives = 49/82 (59%) Frame = -2 Query: 1 MALAYILFQVAVALLAILTYYIHRKLTYFKRRGIPFVAPHLIRGNMEELQKTKNIHEIFQ 60 M++ +L +AL+ L + +++ GIP PH++ G+M+ ++ ++ +EI+ Sbjct: 90483 MSVGTVLLTALLALVGYLLMKWRSTMRHWQDLGIPCEEPHILMGSMKGVRTARSFNEIWT 90304 Query: 61 DHYNKFRESKAPFVGFFFFQSP 82 +YNKFR S PF GF++F+ P Sbjct: 90303 SYYNKFRGS-GPFAGFYWFRRP 90241 MSVGTVLLTALLALVGYLLMKWRSTMRHWQDLGIPCEEPHILMGSMKGVRTARSFNEIWTSYYNKFRGSGPFAGFYWFRRP 
AVFVLEKSLGKQILIKEFNKFTDRGFFHNPEDDPLSGQLFLLDGQKWRTMRNSTSSTFTSGK 6a9 MGVYSVLLAIVVVLVGYLLLKWRRALHYWQNLDIPCE EPHILMGSLTGVQTSRSFSAIWMDYYNKFRGTGPFAGFYWFQRPGILVLD ISLAKLILIKEFNKFTDRGFYHNTEDDPLSGQLFLLDGQKWKSMRSKLSS TFTSGKMKYMFPTVVKVGHEFIEVFGQAMEKSPIVEVRDILARFTTDVIG TCAFGIECSSLKDPEAEFRVMGRRAIFEQRHGPIGIAFINSFQNLARRLH MKITLEEAEHFFLRIVRETVAFREKNNIRRNDFMDQLIDLKNSPLTKSES GESVNLTIEEMAAQAFVFFGAGFETSSTTMGFALYELAQHQDIQDRVRKE CQEVIGKYNGEITYESMKDMVYLDQVISETLRLYTVLPDLNRECLEDYEV PGHPKYVIKKGMPVLIPCGAMHRDEKLYANPNTFNPIFFARTSEGSDSVE WLPFGDGPRLCIGMRFGQMQARSGLALLINRFKFSVCEQTTIPIVYSKKT FLISSETGIFLKVERV Score = 50.4 bits (118), Expect(2) = 2e-21 Identities = 23/60 (38%), Positives = 39/60 (64%) Frame = +1 Query: 6 ILFQVAVALLAILTYYIHRKLTYFKRRGIPFVAPHLIRGNMEELQKTKNIHEIFQDHYNK 65 +L + V +L++L + R+ Y++RRGIP PH I GN+++ K ++I IF+D+Y K Sbjct: 44542 LLLALIVVILSLLVFAARRRHGYWQRRGIPHDVPHPIYGNIKDWPKKRHIAMIFRDYYFK 44721 MLLLALIVVILSLLVFAARRRHGYWQRRGIPHDVPHPIYGNIKDWPKKRHIAMIFRDYYFK YKRSVYPFAGFYFFFTRSAVITDLELVKRVLIKDFNHFENRGIFYNEIDDPLS Score = 45.7 bits (106), Expect = 4e-04 Identities = 22/61 (36%), Positives = 36/61 (58%) Frame = +1 Query: 429 WLPFGDGPRNCIGLRFGQMQARIGPALLIRNFKFSTCSKTPNPLVYDPKSFVLGVKDGIY 488 W FG G R+CIG++F Q+Q R+ ALL+ ++FS ++ P ++ ++DGI Sbjct: 694 WFGFGVGARSCIGIQFAQLQLRLALALLLSEYEFSLNTRKP----------LINLEDGIA 843 Query: 489 L 489 L Sbjct: 844 L 846 WFGFGVGARSCIGIQFAQLQLRLALALLLSEYEFSLNTRKPLINLEDGIAL Score = 39.5 bits (90), Expect = 0.030 Identities = 26/75 (34%), Positives = 33/75 (43%) Frame = +1 Query: 356 ETLRLYTIVPNLDRMAAKRYVVPGHPNFVIEAGQSVIIPSSAIHHDPSIYPEPFEFRPER 415 ETLR Y ++ +L RM + P P + I G PSS E +PER Sbjct: 43120 ETLRKYPVLAHLTRMTQTDFS-PEDPKYFIAQGNHWCDPSSWYSL*SGNLSRTGEVKPER 43296 Query: 416 FSPEESAGRPSVAWL 430 + E A RPS WL Sbjct: 43297 LTDEAIAARPSCTWL 43341 AL057750.1|CNS00162 Drosophila melanogaster genome survey sequence TET3 end of 47% to 6a2 65% TO AL062684 YVIGKFSXGLDCHSXSDPKAKFVQMGTTAITERRHGKSMXLLLFGAPELAAXX RMKATVQEVEDFYMNIIRDTVDYRVKNNVKRHDFVDMLIEMKLKFDN 
GDKENGLTFNEIAAQAFIFFLAGFETSSTTMGFALYELACHQDIQDKLRTEINTVLKQHN GKLDYDSMREMTYLEKVITETMRKRPVVGHLIRVATQHYQHTNPKYNIEKGTGVIVPTLA IHHDPEFYPEPEKFIPERFDEDQVQQRPXCTFLPFGDGPRNCIGLRFGRMQVIVGXALLI HNFKF Cyp6a14 AC005457 P1 clone DS08616 AC007085 (BACR21H10) 16539-17204 13565-15095 57% identical to CYP6A5 MLFTIALVGVVLGLAYSLHIKIFSYWKRKGVPHETPLPIVGNMX 13699 RSTTSAISTKEFIRSLRGRVPSPECTFFKRTALITDLDFIKQVMIKDFSYFQ 13868 DRGAFTNPRDDPLTGHLFALEGEEWRAMRHKLTPVFTSGKIKQMSKVIVDVGLRL GDAM 14045 DKAVKEAKVEEGNVEIKDLCARFTTDVIGSCAFGLECNSLQDPSAEFRQKGREIFT RR 14219 RHSTLVQSFIFTNARLARKLRIKVLPDDLTQFFMSTVKNTVDYRLKNGIKRNDFIEQ MIE 14399 LRAEDQEAAKKGQGIDLSHGLTLEQMAAQAFVFFVAGFETSSSTMSLCLYELALQP DIQQ 14579 RLREEIESVLANVDGGELNYDVLAQMTYLDQVLSETLRKHP 14759 LLPHLIRETTKDYQIPNSDIVLDKGILALIPVHNIHHDPEIYPEPEKFDPSRFDPEEVK N 14939 RHPMAYLPFGDGPRNCIGLRFGKIQAKIGLVSLLRRFKFSVSNRTDVPLIFS 15095 KKSFLLTTNDGIYLKVE Cyp6a15p Drosophila melanogaster GenEMBL AC005457 P1 clone DS08616 comp(10139-9361) POSSIBLE PSEUDOGENE (MISSING N-TERMINAL) AFTVTNSKLAKKLKMKILRDDLTDFFLSVVKPALSGMTLWTSPPSGGRSSRQGGS KFDLS 9960 HNWTLE QMAAQAIVFFLAGFETSSSTMSSCKYELALQPEI*NQIRDEIERVLEGNAITYDALAK 9768 INYPEQVLSETLRKHPIQLIKFLLETQES 9681 FRVRNTELIVEKGTSLLIPVHSVHYDPHLYPHPKLFDSSRLKAYKSNSRHPFAYLP FGTF 9467 GPRSCIGLRFGKMQAKIGIVSLCQRFKFGDSDLTDIPL 9361 Cyp6a17 AL052842 (BACR001L16), AL074108 (BACR035C16) 49% with 6a2 AA699131 HL07765 = AI135898, GH13814, AI260230 LP03950, AI258456 LP01662 AI534145 SD06663, AI114010 GH10635, AI533952 SD06306 AI402220 AC010578 33000 region MLLLALIVVILSLLVFAARRRHGYWQRRGIPHDEVHPLFGNIKDWPNKRHIAEIFR DYYFKYKNSDYPFAGFYFFFTRTAVVTDMELLKRVLIKDFNHFENRGVFYNEIDDPLSATL FSIEGQKWRHLRHKLTPTFTSGKMKNMFPIVVKVGEEMDKVFRSKTAADRGQVLEVV DLVARYTADVIGNCAFGLNCNSLYDPKAEFVSIGKRAITEHRYGNMLDIFLF GFPKLSRRLRLKLNIQEAEDFYTKIVRETIDYRLRTKEKRND 338 FMDSLIEMYKNEQSGNSEDGLTFNELLAQAFIFFVAGFETSSTTMGFALYELARNQD 509 VQDKLREEIGNVFGKHNKEFTYEGIKEMKYLEQVVMETLRKVPVLAHLTR MTDTDFSPEDPKYFIAKGTIVVIPALGIHYDPDIYPEPEIFKPE 263 RFTDEEIAARPSCTWLPFGEGPRNCIGLRFGMMQTCVGLAYLIRGYKFSVSPETQIPMK 86 
IVVKNILISAENGIHLKVEKLA

AA567377 N-terminal to C helix
same gene as AA698035 AI513739 AI134544 AI389195 AI295735 AI134091
MLWEFFALFAIAAALFYRWASANNDFFKDRGIAYEKPVLYFGNMAGMFLRKRA
MFDIVCDLYTKGGSKKFFGIFEQRQPLLMVRDPDLIKQITIKDFDHFINHRN
VFATSSDDDPHDMSNLFGSSLFSMRDARWKDMRSTLSPAFTGSKMRQMF
QLMNQVAKEAVDCLKQDDSRVQENELDMKDYCTRFTNDVIASTAFGLQV
NSFKDRENTFYQ

This sequence is 71% identical to Drosophila mettleri CYP9 AF083947, so what follows is a BLAST search with that seq.

AC007594 Drosophila melanogaster chromosome 3 clone BACR28I14 (D680) RPCI-98 28.I.14 map 87B-87C strain y; cn bw sp, WORKING DRAFT SEQUENCE, 83 unordered pieces
Length = 100315

AC007594 assembled sequences: two different genes

Sequence A, exact match to AI259899
37628 DRDIVAQCFVFFFAGFETSAVLMCFTAHELMENQDVQQRLYEEVQQVDQDLEGKELTYEA 37807
37808 IMGMKYLDQVVNEVLRKWPAAIAVDRECNKDITFDVDGQKVEVKKGDVIWLPTCGFHRDP 37987
37988 KYFENPMKFDPERFSDENKESIQPFTYFPFGLGQRNCIGSR 38167
38168 FALLEAKAVIYYLLKDYRFAPANKSCIPLKLITSGFQLSPKGGFWIKLVQR 38320

The following sequence is in the same contig as the above seq., and so it is a different gene: sequence B.
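The numbers flanking each peptide segment in these assemblies appear to be nucleotide positions in the clone. If so, a segment of n residues should span 3n nucleotides, and descending positions indicate the complementary strand. The following is a minimal Python sketch of that consistency check, using one segment of sequence A and two of sequence B. The function name and the tuple layout are illustrative assumptions and not part of the original analysis.

```python
def check_segment(start, peptide, end):
    """Compare a coordinate span with the length of its translated peptide.

    Returns (strand, nucleotide_length, consistent), where `consistent`
    is True when the span is exactly three nucleotides per residue.
    """
    nt_len = abs(end - start) + 1           # inclusive span in nucleotides
    strand = "+" if end >= start else "-"   # descending coordinates: complement
    return strand, nt_len, nt_len == 3 * len(peptide)


# (start, peptide, end) copied from the AC007594 segments listed in this section
SEGMENTS = [
    (37628, "DRDIVAQCFVFFFAGFETSAVLMCFTAHELMENQDVQQRLYEEVQQVDQDLEGKELTYEA", 37807),
    (32607, "KFFGIFEQRQPLLMVRDPDLIKQITIKDFDHFINHRNVFATSSDDDPHDMSNLFGSSLFS", 32428),
    (39307, "LYTKGGSK", 39330),
]

for start, peptide, end in SEGMENTS:
    strand, nt_len, ok = check_segment(start, peptide, end)
    print(start, end, strand, nt_len, len(peptide), ok)
```

All three segments pass the check. A segment whose span does not equal three times its residue count would be reported as inconsistent, which may point to omitted residues, an intron inside the span, or a mistyped coordinate.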
39127 MLWEFFALFAIAAALFYRWASANNDFFKDRGIAYEKPVLYFGNMAGMFLRKRAMFDIVCD 39306
39307 LYTKGGSK 39330
the following sequence is different from Sequence A
The EST AI513739 is identical to these two sequences from the N-term up to NTFYQ
So these are probably from the same gene
32607 KFFGIFEQRQPLLMVRDPDLIKQITIKDFDHFINHRNVFATSSDDDPHDMSNLFGSSLFS 32428
32427 MRDARWKDMRSTLSPAFTGSKMRQMFQLMNQVAKEAVDCLKQDDSRVQENELDMKDYCTR 32248
32247 FTNDVIASTAFGLQVNSFKDRENTFYQMGKKLTTFTFLQSMKFMLFFALKGLNKI
32021 LKVELFDRKSTQYFVRLVLDAMKYRQEHNIVRPDMINMLMEARGIIQTEKTKASAVREWS 31842
31841 DRDIVAQCFVFFFAGFETSAVLMCFTAHELMENQDVQQRLYEEVQQVDQDLEGKEL 31674
31673 KYLDQVVSEVLRKWPPAIAFDRECNKE 31593
this sequence is missing an internal exon, probable pseudogene
31595 EAKAVIYYLLKDYRFAPAKKSCIPLELISSGFQLSPKGGFWIKLVQR 31455

reconstruction of sequence A
If we assume only two genes then the EST AI257340 must be the N-terminal of sequence A, since it is different from sequence B
AI257340
MLWEFFALFAIADALLYRWASANNDFFKDRGIAYEKPELYFGNMAGMFLRKRAMFDIVCD 205
LYTKGGSKKFFGIFEQRQPLLMVRDPDLIKQITIKDFDHFINHRNEFATSSDDDPHDMSN 385
LFGSSLVSMRDARWKDMRSTLSPAFTGSKMRHMVQLMNHEA 508
***** warning this sequence is from sequence B to make a complete seq.
There may be some minor variation in the real Seq.
A
KEAVDCLKQDDSRVQENELDMKDYCTRFTNDVIASTAFGLQVNSFKDRENTFY *****
AI259899 extends sequence A back toward the N-terminal
QMGKKLTTFTFLQSMKFMLFFALKGLNKILKVELFDRKSTQYFVRLVLDAMKYRQEHNIV 180
RPDMINMLMEARGIIQTEKTKASAVREWS
37628 DRDIVAQCFVFFFAGFETSAVLMCFTAHELMENQDVQQRLYEEVQQVDQDLEGKELTYEA 37807
37808 IMGMKYLDQVVNEVLRKWPAAIAVDRECNKDITFDVDGQKVEVKKGDVIWLPTCGFHRDP 37987
37988 KYFENPMKFDPERFSDENKESIQPFTYFPFGLGQRNCIGSR 38167
38168 FALLEAKAVIYYLLKDYRFAPANKSCIPLKLITSGFQLSPKGGFWIKLVQR 38320

reconstruction of sequence B (pseudogene)
39127 MLWEFFALFAIAAALFYRWASANNDFFKDRGIAYEKPVLYFGNMAGMFLRKRAMFDIVCD 39306
39307 LYTKGGSK 39330
the following sequence is different from Sequence A
The EST AI513739 is identical to these two sequences from the N-term up to NTFYQ
So these are probably from the same gene
32607 KFFGIFEQRQPLLMVRDPDLIKQITIKDFDHFINHRNVFATSSDDDPHDMSNLFGSSLFS 32428
32427 MRDARWKDMRSTLSPAFTGSKMRQMFQLMNQVAKEAVDCLKQDDSRVQENELDMKDYCTR 32248
32247 FTNDVIASTAFGLQVNSFKDRENTFYQMGKKLTTFTFLQSMKFMLFFALKGLNKI
32021 LKVELFDRKSTQYFVRLVLDAMKYRQEHNIVRPDMINMLMEARGIIQTEKTKASAVREWS 31842
31841 DRDIVAQCFVFFFAGFETSAVLMCFTAHELMENQDVQQRLYEEVQQVDQDLEGKEL 31674
This sequence has a small (8aa) deletion here
31673 KYLDQVVSEVLRKWPPAIAFDRECNKE 31593
this sequence is missing an internal exon, probable pseudogene
31595 EAKAVIYYLLKDYRFAPAKKSCIPLELISSGFQLSPKGGFWIKLVQR 31455

gb|AC009741.4|AC009741 Drosophila melanogaster chromosome 3 clone BACR44K17 (D976) RPCI-98 44.K.17 map 87B-87B strain y; cn bw sp, WORKING DRAFT SEQUENCE, 174 unordered pieces
Length = 194613
146199 QRQPLLMVRDPDLIKQITIKDFDHFINHRNEFDTSSDDDPHDMSNLFSSSLFSMRDARWK 146378
146379 DMRSTLSPAFTGSKMRQMFQLMNQVAKEAVDCLKQDDSRVQENELDMKDYCTRFTNDVIA 146558
146559 STAFGLQVNSFKDRENTFYQMGKKLTTFTFLQNMKFILLFALKSLNK 146699
146768 ILKVEIFDRKSTQYFVRLVLDAMKYRQEHNIGRPDMINML 146887

Score = 250 bits (633), Expect(2) = 2e-80
Identities = 124/166 (74%), Positives = 138/166 (82%), Gaps = 1/166 (0%)
Frame = +3

Query: 75
QRKPLLMIRDPELVKQITIKDFDHFINHRNIFGVDNND-PHDMDNLFGSSLFSMRDARWK 133 QR+PLLM+RDP+L+KQITIKDFDHFINHRN F ++D PHDM NLF SSLFSMRDARWK Sbjct: 146199 QRQPLLMVRDPDLIKQITIKDFDHFINHRNEFDTSSDDDPHDMSNLFSSSLFSMRDARWK 146378 Query: 134 DMRRPLSPAFTGSKMRQMFQLMDIVANEAVECLKRDDIPENGIELDMKDYCTRFTNDVIA 193 DMR LSPAFTGSKMRQMFQLM+ VA EAV+CLK+DD ELDMKDYCTRFTNDVIA Sbjct: 146379 DMRSTLSPAFTGSKMRQMFQLMNQVAKEAVDCLKQDDSRVQENELDMKDYCTRFTNDVIA 146558 Query: 194 STAFGLQVNSFKDRENQFYMMGKKLTPLQPLTNLKFLLFTSAQKIFK 240 STAFGLQVNSFKDREN FY MGKKLT L N+KF+L + + + K Sbjct: 146559 STAFGLQVNSFKDRENTFYQMGKKLTTFTFLQNMKFILLFALKSLNK 146699 Score = 70.6 bits (170), Expect(2) = 2e-80 Identities = 33/41 (80%), Positives = 37/41 (89%) Frame = +2 Query: 240 KALKISLFDRQSTNYFVRLVLDAMKYRQENNIIRPDMINML 280 K LK+ +FDR+ST YFVRLVLDAMKYRQE+NI RPDMINML Sbjct: 146765 KILKVEIFDRKSTQYFVRLVLDAMKYRQEHNIGRPDMINML 146887 Cyp6d2 AC004377 DS00837 MWTILLTILIAGLLYRYVKRHYTHWQRLGVDEEPAKIPFGVMDTVMKQER SLGMALADIYARHEGKIVGIYMLNKRSILIRDAQLARQIMTSDFASFHDR GVYVDEDKDPLSANLFNLRGASWGSVIFGLEIDSFRNPKNEFREISSSTS RDESLLLKIHNMSMFICPPIAKLMNRLGYESRILTSLRDMMKRTIEFREE HNVVRKDMLQLLIRLRNTGKIGEDDDQVWDMETAQEQLKSMSIEKIAAQA FLFYVAGSESTAAASAFTLYELSMYPELLKEAQEEVDAVLMKHNLKPKDR FTYEAVQDLKFLDICIMETIRKYPGLPFLNRECTEDYPVPGTNHIIAKGT PILISLFGMQRDPVYFPNPNGYDPHRFDSNNMNYDQAAYMPFGEGPRHCI GKALRMGKVNSKVAVAKILANFDLVQSPRKEVEFRFDAAPVLVTKEPLKL RLTKRK* AC007440 comp(89081-89665) region also on same fragment Cyp6g1 and one other 59% to 6D1 intron location at DFI*VMII not certain AC007440 comp(104661-104918) region C-term AC015208 18689-17472 + 17407-17141 MLLIWLLLLTIVTLNFWLRHKYDYFRSRGIPHLPPSSWSPMGNLGQLLFL RISFGDLFRQLYADPRNGQAKIVGFFIFQTPALMVRDPELIRQVLIKNFN NFLNRFESADAGDPMGALTLPLAKYHHWKESRQCMSQLFTSGRMRDVMYS QMLDVASDLEQYLNRKLGDRLERVLPLGRMCQLYTTDVTGNLFYSLNVGG LRRGRSELITKTK ELFNTNPRKVLDFMSVFFLPKWTGVLKPKVFTEDYARYMRHLVDDHHEPTKGDLINQLQHFQLSRSSNHYSQHPDFVASQAGI ILLAGFETSSALMGFTLYELAKAPDIQERLRSELREAFISTATLSYDTLMTLPYLKMVCLEALRLYPAAAFVNRECTSSASEG FSLQPHVDFI*VMIIPTYLCQSKF 
QFWPEPGVFDPKRFGPERSRHIHPMTYIPFGAGPHGCIGSRLGVLQLKLGIVHILKQYW VETCERTVSEIRFNPKSFMLESENEIYLRFCRSSL* AC015208 Drosophila melanogaster, *** SEQUENCING IN PROGRESS ***, in ordered 2-491 19014-20624 45% to 6g1 MELVLLILVASLIGIAFLALQQHYSYWRRMGVREIRPKWIVGNLMGLLNMRMSPAEFISQLY NHPDAENEPFVGIHVFHKPALLLRDPEMVRNILVKDFAGFSNRYSSSDPKGDPLGSQNIF FLKNPAWKEVRLKLSPFFTGNRLKQMFPLIEEVGASLDAHLRQQPLHNERMRCFDLEAK ELCALYTTDVIATVAYGVSANSFTDPKCEFRRHGRSVFEFNLLRAAEFTLVFFLPHLVPF VRFKVVPAEATRFLRKTINYVMSEREKSGQKRNDLIDILIEFRRSTQLAKASGIKDQF VFEGDILVAQAVLFFTAGFESSSSTMAFAMYELAKDTDVQQRLREEIKDALVESGGQVTL KMIESLEFMQMILLEVLRMYPPLPFLDRECTSGRDYSLAPFHKKFVVPKGMPVYIPC YALHMDPQYFPQPRKFLPERFSPENRKLHTPYTYMPFGLGPHG CIGERFGYLQAKVGLVNLLRNHMITTSERTPHRMQLDPKAIITQAKGGIHLRLVRDALGV* AC007441 Drosophila melanogaster chromosome 3 clone BACR10E03 (D690) RPCI-98 AI108995 AI517349 AI114209 AI403829 AI134370 AI238761 AI517348 AI107803 AI405169 AI109667 AA141600 AL065891 50% to Cyp6d2 MIGIYLLIAAVTLLYVYLKWTFSYWDRKGFPSTGVSIPFGALESVTKGK RSFGMAIYDMYKSTKEPVIGLYLTLRPALLVRDAQLAHDVLVKDFASFHD RGVYVDEKNDPMSASLFQMEGASWRALRNKLTPSFTSGKLKAMFETSDSV GDKLVDSIRKQLPANGAKELELKKLMATYAIDIIATTIFGLDVDSFADPN NEFQIISKKVNRNNIEDIIRGTSSFL probable end of gene on AI402085 AI107062 AA141601 AA141667 LYELTQNPEVMEKAKEDVRSAIEKHGGKLTYDAISDMKYLEACILETARKYPALPLLNRICTKDYPVPDSKLVIQKGTPIIISLIG MHRDEEYFPDPLAYKPERYLENGKDYTQAAYLPFGEGPRMCIGARMGKVNVKIAIAKVLSNFDLEIRKEKCEIEFGVYGIPLMPKS GVPVRLSLKK* AF083946 Cyp6g1 = AC007440, AL065705, AA698945, AI402390 GH21606, AI403823 GH23342, AI063794 GH03774 AI517356 GH28075 AI113449 GH09719 AI517823 GH28887 AI386932 GH17321 AI133818 GH10520 AI062549 GH01777 AL070300 (BACR31O06) AI402215 AC015208 3-150 22920-22477 154-491 22405-21326 note: lower case matches ESTs not AF083946 MVLTEVLFVVVAALVALYTWFQRNHSYWQRkgipYIPPtpiigNTKVVFK MENSFGMHlSEIYNDPRLKDEAVVGIYSMNKPGLIIRDIELIKSILIKDF NRFHNRYarcdphgdplgynNLFFVRDAhwkgiRTKLTPVFTSGKVKQMY TLmQEIGKDLELALQRRGEKNSGSFITEIKEICAQFSTDSIATIAFGIRA NSLENPNAEFRNYGRKMFTFTVARAKDFFVAFFLPKLVSLMRIQFFTADF 
SHFMRSTIGHVMEERERSGLLRNSLIDVLVSLRKEAAAEPSKPHYAKNQD FLVVSAGVFFTAGFETSSSTMSFALYEMAKHPEMQKRLRDEINEALVEGG GSLSYEKIQSLEYLAMVVDEVLRMYPVLPFLDREYESVEGQPDLSLKPFY DYTLENGTPVFIPIYALHHDPKYWTNPSQFDPERFSPANRKNIVAMAYQP FGSGPHNCIGSRIGLLQSKLGLVSLLKNHSVRNCEATMKDMKFDPKGFVH QADGGIHLEIVNDRLYDQSAPSLQ AC008257 43% identical to 6g1 possible frameshifts after EKDRRKA and GEDV identical to AI403674, AI237973, AI517561, AI107861, AI404012, AI109083, AI389275, AI113367, AI064268, AI064259, 1 diff with AI108091, 2 diffs with AI402430 MLLLLLLGSLTIVFYIWQRRTLSFWERHGVKYIRPFPVVGCTREFLTAKVPFFEQIQKFH EAPGFENEPFVGVYMTHRPALVIRDLELIKTVMIKKFQYFNNRVLQTDPHNDALGYKNLF FARSPGWRELRTKISPVFTSGKIKQMYPLMVK*IGKNLQDSAERLGSGTEVQVKDLCSRF TTDLIATIAFGVEANALQDAKSEFFYHNRAIFSLTLSRGIDFAIIFMIPALASLARVKLF SRETTKFIRSSVNYVLKEREKDRRKA*RNDLIDILLALKREAAANPGEDV*KEVDLDYLV AQAAVFQTAGFETSASTMTMTLYELAKNEALQDRLRQEIVDFFGDEDHISYERIQEMPYL SQVVNETLRKYPIVGYIERECSQPAEGERFTLEPFHNMELPHGMSIYMSTVAVHRDPQYW PDPEKYDPERFNSSNRDNLNMDAYMPFGVGPRNCIGMRLGLLQSKLGLVHILRNHRFHTC DKTIKKIEWAPTSPETFSTRRIISRFEAITGPAN* AC015002 Drosophila melanogaster, *** SEQUENCING IN PROGRESS ***, in 43% identical to 6g1 = AC011761 4-510 11535-8695 MVYSTNILLAIVTILTGVFIWSRRTYVYWQRRRVKFVQPTHLLGNLSRVLRLEESFALQLRRFY 11344 FDERFRNEPVVGIYLFHQPALLIRDLQLVRTVLVEDFVSFSNRFAKCDGRSDKMGALSLF 11164 LAKQPEWREIRTRLAPAFAGAKLKQMFSLMEE IGCDLEWYLKRLTRDLRRGDAERGAIVSIKDVCDLYNTDMIASIAFGLRSYSLRNTQSE 9947 IGSHCQDLFRPNVRRIIDLFVIFYLPKLVPLLRPKLFTEPHAEFLRRVIQLVIEERERGG 9767 DLRNDLIEMLLTLKKEADLQQDKSHFTHHRDFLAAQAASFEVAGIETCSASMSFALYELA 9587 KQPLMQSRLRREIREAFASNPNGRLTYEAVARMEFLDMVVEETLRKYPIVPLLERECTPI 9407 NKKRFYSLRPHAECYTRRGMPVFISNLAIHHDPK 9305 YWPDPDRFDPERFSAANKALQAPMSYMPFGAGPRNCIGMQIGLLQIK 8830 LGLVYFLHQHRVEICDRTVERIQFDAKFALLASEQRIYLKVDCL* 8695 AC006469 Cyp9b1 chromosome 2R DS02730 and DS07472 also U34324 1-505 comp(54916-56640) 74% identical to 9b2 MSFVEICLVLATIGLLLFKWSTGTFKAFEGRNLYFEKPYPFLGNMAASAL QKASFQKQISEFYNRTRHH*KLVGLFNLRTPMIQINDPQLIKKICVKDFD HFPNHQTLNIPNERLVNDMLNVMRDQHWRNMRSVLTPVFTSAKMRNMFTL 
MNESFAQCLEHLKSSQPIAAGENAFELDMKVLCNKLSNDVIATTAFGLKV NSFDDPENEFHTIGKTLAFSRGLPFLKFMMCLLAPKVFNFFKLTIFDSTN VEYFVRLVVDAMQYREKHNITRPDMIQLLMEAKKESKDNWTDDEIVAQCF IFFFAAFENNSNLICTTAYELLRNLDIQERLYEEVKETQEALKGAPLTYD AAQEMTYMDMVISESLRKWTLSAAADRLCAKDYTLTDDEGTKLFEFKAGD NINIPICGLHWDERFFPQPQRFDPERFSERRKKDLIPYTYLPFGVGPRSC IGNRYAVMQAKGMLYNLMLNYKIEASPRTTRDMWESARGFNIIPTTGFWM QLVSRK* AC006469 Cyp9b2 DS02730 and DS07472 also U34325, AI402118 COMP(57251-58955) AI063849 GH03949 AI108995 AI107803 AI114209 AI517349 AI403829 AI238761 AI134370 AI517348 AI403297 AI108823 AI386498 MALIEICLALVVIGYLIYKWSTATFKTFEERKLYFEKPYPFVGNMAAAAL QKSSFQRQLTEFYERTRQH*KLVGFFNMRTPMITLNDPELIKKVCVKDFD HFPNHQPFITSNDRLFNDMLSVMRDQRWKHMRNTLTPVFTAAKMRNMFTL MNESFAECLQHLDSSSKTLPGRKGFEVDMKVMCNKLSNDIIATTAFGLKV NSYDNPKNEFYEIGQSLVFSRGLQFFKFMLSTLVPKLFSLLKLTIFDSAK VDYFARLVVEAMQYREKHNITRPDMIQLLMEAKNESEDKWTDDEIVAQCF IFFFAAFENNSNLICTTTYELLYNPDVQERLYEEIVETKKALNGAPLTYD AVQKMTYMDMVISESLRKWTLAAATDRLCSKDYTLTDDDGTKLFDFKVGD RINIPISGLHLDDRYFPEPRKFDPDRFSEERKGDMVPYTYLPFGVGPRNC IGNRYALMQVKGMLFNLLLHYKIEASPRTIKDLWGSASGFNFTPRSGFWM HLVPRK* Cyp9b3 Drosophila mettleri AF083945 MDLILLLSIVGLIYFVYKWATARHNEFELRGLPFEKPLPIFGNN AAVVTGRASFQKSSPSSIARTRQHKMVGFFNFRTPMIQLNDPEIIKKITVKDFEYFP NHQLFFTTEERLINDMLSVMKDQRWKHMRNTLTPVFTSAKMRSMFSLMNESFAEC MDHLDQMSKTAVKPGGSFELELKEVCNRLSNDLIATTAFGLKVSSYKKPDNDFYEIGKSI VFFRGKALYKLFACPTTLPAVFKLLGFKIFDAQKTDFFIRLVVDAMKYREENNIVRPD MIQLLMEAKKESTEHWSDDELVAQCFIFFFAAFENNASLICTTAYELLNNPDVQQRLY EEVQETYDALKGEMLTYDAVTKMKYMDLVASESLRKWTLAASTDRECAKDYTLYDD DASKLFEFKAGDRINIPIVGLHLDDKFFPEPHKFIPERFSDGNKDQIVPYTYLPFGAGPRN CIGNRYALMQAKAMLYNLVLKYKIERSPKTVKDLLSDSRGFQLTPQSGYWVHLVPRK Cyp9c1 U34326, AL063862 (BACR007C22), AL058497 (BACR024A05) AL076220 (BACR038N13) AL055637 (BACR021M20) N-term AC007581 (BACR03I24) 1-43 8937-9143 may be two genes This sequence is from 19 unordered fragments AI296124 AI297882 AI543300 AI294370 AC007574 chromosome 2 clone BACR11C07 (D638) RPCI-98 MVFVELSIFVAFIGLLLYKWSVYTFGYFSKRGVAHEKPIPLLGNIPWSV 
LMGKESYIKHSIDLHLRLKQHKVYGVFN*LRDPLYYLSDPELIRQVGIK NFDTFTNHRKGITEGFNDTSVISKSLLSLRDRRWKQMRSTLTPTFTSLK IRQMFELIHFCNVEAVDFVQRQLDAGTSELELKDFFTRYTNDV IATAAFGIQVNSFKDPNNEFFSIGQRISEFTFWGGLKVMLYILMPKLMK VKTSPLHSFF NVDYFKKLVFGAMKYRKEQSIVRPDMIHLLMEAQRQFKAEQEGSAESAAQ QDKAEFNDDDLLAQCLLFFSAGFETVATCLSFTSYELMMNPEVQEKLLAE ILAVKEQLGEKPLDYDTLMGMKYLNCVVSESLRKWPPAFIVDRMCGSDFQ LKDEEGEVVVNLREDDLVHINVGALHHDPDNFPEPEQFRPERFDEEHKHE IRQFTYLPFGVGQRSCIGNRLALMEVKSLIFQLVLRYHLKPTDRTPA DMMSSISGFRLLPRELFWCKLESRGPA* Cyp9f1 Drosophila mettleri AF083947 MLVEFLALSVVVLLLAYRWATANYNFFKERGIPYHKPYPFVGNMGKMLLR QKSMFDLIVELYNRGDSKVFGIFEQRKPLLMIRDPELVKQITIKDFDHFI NHRNIFGVDNNDPHDMDNLFGSSLFSMRDARWKDMRRPLSPAFTGSKMRQ MFQLMDIVANEAVECLKRDDIPENGIELDMKDYCTRFTNDVIASTAFGLQ VNSFKDRENQFYMMGKKLTPLQPLTNLKFLLFTSAQKIFKALKISLFDRQ STNYFVRLVLDAMKYRQENNIIRPDMINMLLEARGLINSDKLKSSVVRDW SDRDIVAQCFVFFFAGFETSAVLMCFTAQELLENEDVQEKLYEEVAQVDS DLQGGQLTYEAIMGMKYLDQVVSEVLRKWPAAIAVDRECNKDITYEVDGK SVQIKKGEAVWLPTCGFHRDPKYFENPNKFDPDRFSEENKDKIQPFTYYP FGVGPRNCIGSRFALLEAKAVIYYLLREFRLVPAKKTCIPLVLSSSGFQL APKTGFWVKLIPRK Cyp9f2 AA735946 78% IDENTICAL TO Cyp9f1 AI113499 AI259899 DRENTFYQMGKKLTTFTFLQNMKFILLFALKSLN KILKVEIFDRKSTQYFVRLVLDAMKYRQEHNIVRPDMINMLMEARGIIQT EKTKASAVREWSDRSIVAQCFAFFFAGFETSAVLMCFTAHELMENQDVQQ RLYEEVQQVDQDLEGKELTYEAIMGMKYLDQVVSEVLRKWPPAIAFDREC NKDITFDVDGQKVEVKKGDVIWLPTCGFHRDPK AC007594 Drosophila melanogaster chromosome 3 clone BACR28I14 (D680) RPCI-98 28.I.14 map 87B-87C strain y; cn bw sp, WORKING DRAFT SEQUENCE, 83 unordered pieces Length = 100315 Score = 280 bits (708), Expect(2) = 3e-89 Identities = 140/152 (92%), Positives = 142/152 (93%) Frame = -3 Query: 36 ILKVEIFDRKSTQYFVRLVLDAMKYRQEHNIVRPDMINMLMEARGIIQTEKTKASAVREW 95 ILKVE+FDRKSTQYFVRLVLDAMKYRQEHNIVRPDMINMLMEARGIIQTEKTKASAVREW Sbjct: 32024 ILKVELFDRKSTQYFVRLVLDAMKYRQEHNIVRPDMINMLMEARGIIQTEKTKASAVREW 31845 Query: 96 SDRSIVAQCFAFFFAGFETSAVLMCFTAHELMENQDVQQRLYEEVQQVDQDLEGKELTYE 155 SDR IVAQCF FFFAGFETSAVLMCFTAHELMENQDVQQRLYEEVQQVDQDLEGKEL Sbjct: 31844 
SDRDIVAQCFVFFFAGFETSAVLMCFTAHELMENQDVQQRLYEEVQQVDQDLEGKEL--- 31674 Query: 156 AIMGMKYLDQVVSEVLRKWPPAIAFDRECNKD 187 KYLDQVVSEVLRKWPPAIAFDRECNK+ Sbjct: 31673 -----KYLDQVVSEVLRKWPPAIAFDRECNKE 31593 Score = 245 bits (619), Expect = 8e-65 Identities = 116/121 (95%), Positives = 117/121 (95%) Frame = +2 Query: 97 DRSIVAQCFAFFFAGFETSAVLMCFTAHELMENQDVQQRLYEEVQQVDQDLEGKELTYEA 156 DR IVAQCF FFFAGFETSAVLMCFTAHELMENQDVQQRLYEEVQQVDQDLEGKELTYEA Sbjct: 37628 DRDIVAQCFVFFFAGFETSAVLMCFTAHELMENQDVQQRLYEEVQQVDQDLEGKELTYEA 37807 Query: 157 IMGMKYLDQVVSEVLRKWPPAIAFDRECNKDITFDVDGQKVEVKKGDVIWLPTCGFHRDP 216 IMGMKYLDQVV+EVLRKWP AIA DRECNKDITFDVDGQKVEVKKGDVIWLPTCGFHRDP Sbjct: 37808 IMGMKYLDQVVNEVLRKWPAAIAVDRECNKDITFDVDGQKVEVKKGDVIWLPTCGFHRDP 37987 Query: 217 K 217 K Sbjct: 37988 K 37990 Score = 68.7 bits (165), Expect(2) = 3e-89 Identities = 31/36 (86%), Positives = 34/36 (94%) Frame = -2 Query: 1 DRENTFYQMGKKLTTFTFLQNMKFILLFALKSLNKI 36 DRENTFYQMGKKLTTFTFLQ+MKF+L FALK LNK+ Sbjct: 32190 DRENTFYQMGKKLTTFTFLQSMKFMLFFALKGLNKV 32083 AC005450 Cyp9h1 comp(17940-19601) = AC007453 comp(140861-142543) = AC005472 comp(8473- 9426) 42% identical to 9b3 66% identical to AA567377 47% identical to 9c1 MDQSMIALALFIILLVLLYKWSVAKYDVFSERGVSHEKPWPLIGNIPLKAMIGGMP VLKKMIELHTK HTGSPVYGIYALRDAVFFVRDPELIKLIGIKEFDHFVNHNSMHNNIQESILSKS LISLRDGRWKEMRNILTPAFTGSKMRIMYDLIQSCSEEGVIHIQEQLELSQDASIELE MK DYFTRFANDVIATVAFGISINSFRRKDNEFFRIGQAMSRISAWSVVKAMLYALFPRL MK VLRIQVLDTKNIDYFSSL VTAAMRYRQEHKVVRPDMIHLLMEAKQQRLADLSDKSKDELYYSEFTADDLLAQC LLFFF AGFEIISSSLCFLTHELCLNPTVQDRLYEEIISVHEELKGQPLTYDKLTKMKYLDMV VLE ALRKWPPSISTDRECRQDIDLFDENGQKLFSARKGDVLQIPIFSLHHDPENFEDPEF FNP ERFADGHALESRVYMPFGVGPRNCIGNRMALMELKSIVYQLLLNFKLLPAKRTSR DLL NDIRGHGLKPKNGFWLKFEARQ* cyp12a4 AC006091 AC015190 334-470 612-178 Drosophila melanogaster chromosome 3 clone BACR48G05 (D475) 1-533 comp(90442-92074) MLKVRSALSLIQSQKATLSLATQK* RWQTNVATAEAREDSEWLQAKPFEQIPRLNMWALSMKMSMPGGKYKNMELME MFEAMRQDY GDIFFMPGIMGNPPFLSTHNPQDFEVVFRNEGVWPN RPGNYTLLYHREEYRKDFYQGVMGVIPTQ 
GKPWGDFRTVVNPVLMQPKNVRLYYKKMSQVNQEF ILELRDPDTLEAPDDFIDTINRWTLESVSVVALDKQLGLLKNSNKESEALKLFHYL DEFFIVSIDLEMKPSPWRYIKTPKLKRLMRALDGIQEVTLAYVDEAIERLDKEAKEG VVR PENEQSVLEKLLKVDRKVATVMAMDMLMAGVDT TSSTFTALLLCLAKNPEKQARLREEVMKVLPNKNSEFTEASMKNVPYLRACIKESQ RLHP LIVGNARVLARDAVLSGYRVPAGTYVNIVPLNALTRDEYFPQASEFLPERWLRSPK DSE SKCPANELKSTNPFVFLPFGFGPRMCVGKRIVEMELELGTARLIRNFNVEFNYPTE NAFR SALINLPNIPLKFKFIDLPN* cyp12a5 AC006091 AC015190 315-470 3273-2746 Drosophila melanogaster chromosome 3 clone BACR48G05 (D475) 1-535 comp(93052-94626) 76% identical to other AC006091 SEQ. 58% TO 12A1, 12A2 MLKGRIALNILQSQKPIVFSASQQ*RWQTNVPTAEIRNDPEWLQAKPFEE IPKANILSLFAKSALPGGKYKNLEMMEMIDALRQDYGNIIFLPGMMGRDG LVMTHNPKDFEVVFRNEGVWPFRPGSDILRYHRTVYRKDFFDGVQGIIPS QGKSWGDFRSIVNPVLMQPKNVRLYFKKMSQVNQEFIKEIRDASTQEVPG NFLETINRWTLESVSVVALDKQLGLLRESGKNSEATKLFKYLDEFFLHSA DLEMKPSLWRYFKTPLLKKMLRTMDSVQEVTLKYVDEAIERLEKEAKEGV VRPEHEQSVLEKLLKVDKKVATVMAMDMLMAGVDTTSSTFTALLLCLAKN PEKQARLREEVMKVLPNKDSEFTEASMKNVPYLRACIKESQRVYPLVIGN ARGLTRDSVISGYRVPAGTIVSMIPINSLYSEEYFPKPTEFLPERWLRNA SDSAGKCPANDLKTKNPFVFLPFGFGPRMCVGKRIVEMELELGTARLIRN FNVEFNHSTKNAFRSALINLPNIPLKFKFTDVPN* AC012807 Drosophila melanogaster, WORKING DRAFT SEQUENCE, in ordered pieces 53569-51819 = AI403604 AI389173 AA801503 48% identical to 12a2 MLRLTVKHGLRANSQLAATRNPDASSYVQQL ESEWEGAKPFTELPGPTRWQLFRGFQKGGEYHQLGMDDVMRLYKKQFGDICLIPGLFGM 53300 PSTVFTFNVETFEKVYRTEGQWPV 53228 RGGAEPVIHYRNKRKDEFFKNCMGLFGN GAEWGKNRSAVNPVLMQHRNVAIYLKPMQRVNRQFVNRIREIRDKESQEVPGDFMNTINH 52788 LTFESVATVALDRELGLLREANPPPEASKLFKNIEVLMDSFFDLGVRPSLYRYIPTPTYK 52608 KFSRAMDEIFDTCSMYVNQAIERIDRKSSQGDSNDHKSVLEQLLQIDRKLAVVMAMDM 52434 LMGGVDTTSTAISGILLNLAKNPEKQQRLREEVLSKLTSLHSEFTVEDMKSLPYLRAVIK 52254 ESLRLYPVTFGNARSAGADVVLDGYRIPKGTKLLMTNSFLLKDDRLYPRAKEFIPERWLR 52074 RKDDDKSDVLMNKDLNAFIYLPFGFGPRMCVGKRIVDLEMELTVANLVRNFHIEYNYS 51900 TEKPYKCRFLYKPNIPLKFKFTDLKY* 51819 A cyp12 family member 45% to 12a5 AC008187 AA802777 AA567770 AA567809 AA698375 AA698380 AA698511 AA942250 AA990733 AL074984, AI402682 AI402683 
AI389804 LQLITKRNRMNTLSSARSVAIYVGPVRSSRSASVLAHEQAKSS*ITEEHKTYDEI PRPNKFKFMRAFMPGGEFQNASITEYTSAMRKRYGDIYVMPGMFGRKDWV TTFNTKDIEMVFRNEGIWPRRDGLDSIVYFREHVRPDVYGEVQGLVASQN EAWGKLRSAINPIFMQPRGLRMYYEPLSNINNEFIERIKEIRDPKTLEVP EDFTDEISRLVFESLGLVAFDRQMGLIRKNRDNSDALT
>AL074984 note this sequence has the end of an early exon at its beginning
AC008187
PWLXXXDRPMXLISRNRDDPDALT
LNIQPSMWRXISTPNFRMMMRLLDDILMFSQKMIKDTEDSVEKRRQ*
LQLIPKRNRMNTLSSARSVAIYVGPVRSXXXASVLAHEQAKSS
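The AL074984 fragments above carry X at unread positions, so a direct string comparison against the AC008187 genomic translation overstates the differences. The following is a minimal sketch, not the method used to build this page: the helper name `compare_with_unknowns` is hypothetical, it assumes the two peptides are already aligned with no gaps, and it simply skips any position where either sequence has an X.

```python
def compare_with_unknowns(read, reference, unknown="X"):
    """Compare two equal-length peptides position by position.

    Positions where either sequence carries `unknown` are skipped.
    Returns (identical, compared, mismatches); each mismatch is
    (0-based index, read residue, reference residue).
    """
    if len(read) != len(reference):
        raise ValueError("sequences must have equal length (gaps are not handled)")
    identical = 0
    compared = 0
    mismatches = []
    for i, (a, b) in enumerate(zip(read, reference)):
        if unknown in (a, b):
            continue  # unread residue: neither a match nor a mismatch
        compared += 1
        if a == b:
            identical += 1
        else:
            mismatches.append((i, a, b))
    return identical, compared, mismatches


# N-terminal fragment as read from AL074984 (X = unread residue)
al074984_fragment = "LQLIPKRNRMNTLSSARSVAIYVGPVRSXXXASVLAHEQAKSS"
# Corresponding AC008187 translation given elsewhere on this page
ac008187_reference = "LQLITKRNRMNTLSSARSVAIYVGPVRSSRSASVLAHEQAKSS"

identical, compared, mismatches = compare_with_unknowns(
    al074984_fragment, ac008187_reference
)
print(identical, compared, mismatches)
```

Skipping X positions rather than counting them as mismatches keeps low-quality reads from looking like a distinct gene; whether a remaining single-residue difference is a sequencing error or a real variant still has to be judged by eye.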
AC008187 N-term region SAVNSIFMQPKGLKTYYEFNN*FIER*ELIHTCIFRCPPEFYFFSNIKEIRDARTLEMPPGFRIPWLGGFRSPNGFDQSKSRRSRC IDCVPKHEARCSDMPFN*IYSHQCGDLSPLQTFG**CVFSMTS*CSHRK**RTPRTLWKSGAS**IK*KINKISVCKENKDQL*EL FLFFFNLVIYLEIVYR*TRIPYKFFKKYLNILEFFSSDPTK*VPGNVIVVIVVSNFSEGVFMLSSNALNLFF*PVTFICLFFQLSE VGNSRRLYRSADKKMKLMLCLLAFVFV*DFCILKHFKLFALLSFLRFRDKPLV*ERSRKWNCF*LQLITKRNRMNTL ISRQFNFYATQRIENLL*VQQLVH*ALRINSYLYISMST*VLLFLQH*RNSRC*NSGDAAWFSNPLAWWLSIAQWV*SVEIETIQM H*LCSKTRSTMFRYAFQLNIQPSMWRFISTPNFRMMMRLLDDILMFSQKMIKDTEDSVEKRRQLINKIKNK*NISLQREQGSTLRI VFIFL*FGYLS*NCLPLN*NPL*IF*KIFKYFRIFF*RSH*MSTG*CNSRNSSIKFL*GCIYVKFQRSQFVFLASHVHLFIFSVE* GRQFTAIIPFS**ENETHALFAGFCFCLRFLHFEAFQVIRSPLFSTF*R*AFGIRTFSKVELFLVAVDHQKE*NEYIE DQPSIQFLCNPKD*KLIMSSTTSSLSAKN*FILVYFDVHLSFTFSPTLKKFEMLELWRCRLVFESLGLVAFDRPMGLISRNRDDPD ALTVFQNTKHDVQICLSIEYTAINVEIYLHSKLSDDDASSR*HLNVLTENDKGHRGLCGKAAPVNK*NKK*IKYQFAKRTRINSKN CFYFSLIWLFILKLSTVELESLINFLKNI*IF*NFFLAIPLNEYRVM**S***YQISLRVYLC*VPTLSICFSSQSRSFVYFFS*V R*AIHGDYTVQLIRK*NSCFVCWLLFLFEIFAF*SISSYSLSSLFYVLEISLWYKNVLESGTVFSCS*SPKGIE*IH* AC008187 Drosophila melanogaster chromosome 2 clone BACR13D20 (D604) RPCI-98 = AC013857 86142 LQLITKRNRMNTLSSARSVAIYVGPVRSSRSASVLAHEQAKSS*ITEEH 85937 KTYDEIPRPNKFKFMRAFMPGGEFQNASITEYTSAMRKRYGDIYVMPGMFGRKDWVTTF 85761 85760 NTKDIEMVFRNEGIWPRRDGLDSIVYFREHVRPDVYGEVQGLVAS QNEAWGKLRSAINPIFMQPRGLRMYYEPLSNINNEF 85452 85390 IKEIRDPKTLEVPEDFTDEISRLVFESLGLVAFDRQMGLIRKNRDNSDALTLFQTSRDIF 85211 85210 RLTFKLDIQPSMWKIISTPTYRKMKRTLNDSLNVAQKMLKENQDALEKRRQAGEKINS 85037 85036 NSMLERLMEIDPKVAVIMSLDILFAGVDATATLLSAVLLCLSKHPDKQAKLREELLSIMP 84857 84856 TKDSLLNEENMKDMPYLRAVIKETLRYYPNGLGTMRTCQNDVILSGYRVPKGTTVLLGS 84680 84679 NVLMKEATYYPRPDEFLPERWLRDPETGKKMQVSPFTFLPFGFGPRMCIGKR 84524 84523 VVDLEMETTVAKLIRNFHVEFNRDASRPFKTMFVMEPAITFPFKFTDIEQ* 84371 duplicate C-terminal fragment Query: 405 CQNDVILSGYRVPKGTTVLLGSNVLMKEATYYPRPDEFLPERWLRDPETGKKMQVSPFTF 464 CQNDVILSGYRVPKGTTVLLGSNVLMKEATYYPRPDEFLPERWLRDPETGKKMQVSPFTF Sbjct: 14817 
CQNDVILSGYRVPKGTTVLLGSNVLMKEATYYPRPDEFLPERWLRDPETGKKMQVSPFTF 14638 Query: 465 LPFGFGPRMCIGKRVVDLEMETTVAKLIRNFHVEFNRDASRPFKTMFVMEPAITFPFKFT 524 LPFGFGPRMCIGKRVVDLEMETTVAKLIRNFHVEFNRDASRPFKTMF+MEPAITFPFKFT Sbjct: 14637 LPFGFGPRMCIGKRVVDLEMETTVAKLIRNFHVEFNRDASRPFKTMFLMEPAITFPFKFT 14458 Query: 525 DIEQ 528 DIEQ Sbjct: 14457 DIEQ 14446 Query: 45 ITEEHKTYDEIPRPNKFKFMRAFMPGGEFQNASITEYTSA 84 + E ++Y IP P K +F RAFMPGGEF +ASI +Y +A Sbjct: 13807 VVTEARSYKNIPLPRKQEFARAFMPGGEFHDASILDYAAA 13688 Query: 163 SAINPIFMQPRGLRMYYEPLSNINNEF-----------------------IKEIRDPKTL 199 SA+N IFMQP+GL+ YYE NN F IKEIRD +TL Sbjct: 87107 SAVNSIFMQPKGLKTYYE----FNN*FIER*ELIHTCIFRCPPEFYFFSNIKEIRDARTL 86940 Query: 200 EVPEDF 205 E+P F Sbjct: 86939 EMPPGF 86922 Query: 211 RLVFESLGLVAFDRQMGLIRKNRDNSDALTLFQ 243 RLVFESLGLVAFDR MGLI +NRD+ DALT+FQ Sbjct: 86932 RLVFESLGLVAFDRPMGLISRNRDDPDALTVFQ Query: 244 TSRDIFRLTFKLDIQPSMWKIISTPTYRKMKRTLNDSLNVAQKMLKENQDALEKRRQAG 302 T +FR F+L+IQPSMW+ ISTP +R M R L+D L +QKM+K+ +D++EKRRQ Sbjct: 86835 TRSTMFRYAFQLNIQPSMWRFISTPNFRMMMRLLDDILMFSQKMIKDTEDSVEKRRQLI 86656 Query: 303 EKINS--NSMLER 313 KI + N L+R Sbjct: 86655 NKIKNK*NISLQR 86617 AL063519.1|CNS002VH T7 end of BAC BACR07K20 STS G01307 is exact match 29% to 12A1 MXILXLNRQYGDIVLEVMPSNVPIVHLYNRDDLEKVLKYPSKYPFRPPTEIIVMYRQSRPDR YASVGIVNEQGPMWQRLRSSLTSSITSPRVLQNFLPALNAVCDDFIELLRARRDPDTLV VPNFEELANLMGLEAVCTLMLGRRMGFLAIDTKQPQKISQLAAAVKQLFISQRDSYYGLG LWKYFPTKTYRDFARAEDLIYE*VHR AL074984 note this sequence has the end of an early exon at its beginning DRPMXLISRNRDDPDALT and the start of a related gene at its end. KRNRMNTLSSARSVAIYVGPVRSSRSASVLAHEQAKSS(intron)ITEEXK There is no P450 like sequence in the region between these, so this might represent an alternative splicing option, with two similar exons adjacent to each other in the genome. Alternatively, the upstream fragment could be a pseudogene. 
XVXXPWLXXXDRPMXLISRNRDDPDALTVSQNTKHDVQICXSIEYTAINV EXYLHSKLSDDDASSR*HLNVLTENDKGHRGLCGKAAPVNK*NKK*IKYQ YAKRTRINSKNCFYFSLIWXFILKLSTVELESL*IFKKXXXXLDXXLAXP LNE*RVCNSRNSSIKFSEVYLL*VPTLSICFSSQSRSFVYFFS*VR*AIH GDYTVQLIRK*NSCFVCWLLFLFEIFAF*SISSYSLSSLFYVLEISLWYK NVLESGTVFSCS*SPKGIE*IH*AVRDLWRFTWVPFDLXXQPPFWHMNKL NLVLVCTKSSRNRFIMLLVSNYR*RRSXXPX*XPRPQSXX XFXXLGLXXXIAQXV*SVEIETIQMH*LCHKTRSTMFRYAXQLNIQPSMW RXISTPNFRMMMRLLDDILMFSQKMIKDTEDSVEKRRQ*INKIKNK*NIS MQREQGSTLRIVFIFL*FGXLS*NCLPLN*NPYKFLKKXQXX*XXX*RXH *MSNGYVIVVIVVSNSLRCIYCKFQRSQFVFLASHVHLFIFSVE*GRQFT AIIPFS**ENETHALFAGFCFCLRFLHFEAFQVIRSPLFSTF*R*AFGIR TFSKVELFLVAVDPQKE*NEYIEQCAICGDLRGSRSIFXXSLRFGT*TS* I*C*YAQNPVEIDLSCY*FPIIDNGGAXXLXEXRAHNXXX XXXPLAWXFXSPNXFDQSKSRRSRCIDCVTKHEARCSDMPXN*IYSHQCG DXSPLQTFG**CVFSMTS*CSHRK**RTPRTLWKSGASK*IK*KINKISV CKENKDQL*ELFLFFFNLXIYLEIVYR*TRILINF*KKXKXXRXFXSXPT K*VTGM**S***YQIL*GVFIVSSNALNLFF*PVTFICLFFQLSEVGNSR RLYRSADKKMKLMLCLLAFVFV*DFCILKHFKLFALLSFLRFRDKPLV*E RSRKWNCF*LQLIPKRNRMNTLSSARSVAIYVGPVRSXXXASVLAHEQAK SSVSMHKIQSK*IYHVISFQL*ITEEXKXSXXSAPTIXX AC007398 AC007418 mito clan best matches = 37-38% to 12a sequences note this P1 clone is incomplete and this sequence is made up from 2 unconnected fragments and an EST AA941795 The EST spans the gap between the fragments but it is not without stop codons at the end. It may have a retained intron. 
Sequence has long C-terminal extension 44% to AC007356 ESTs AA941795 AI258819 AA697999 MQRLRTGESSNPKKLNV* SQQPVTSVATTRTTASSLPA ETTSSPAAAVRPYSEVPGPYPLPLIGNSWRFAPLIG* TYKISDLDKVMNELHVNYGKMAKVGGLIGHPDLLFVFDG DEIRNI* THFLFAMELRPSMPSLRHYKGDLRRDFFGDVAGLIGV* HGPKWEAFRQEVQHILLQPQTAKKYIPPLNDIASEFMGR* IELMRDEKDELPANFLHELYKWALE* SVGRVSLDTRLGCLSPEGSEEAQQIIEAINTFFWAVPELELRMPLWRIY PTKAYRSFVKALDQFT*EITLHPRICMKNIGKTM DKADADEARGLSKSEADISIVERIVRKTGNRKLAAILALDLFLVGVDT TSVAASSTIYQLAKNPDKQKKLFDELQKVFPHREADINQNVLEQM PYLRACVKETLRMRPVVIANGRSLQSDAVINGYHVPKGVSFREMVTIW* DPAYFPEPKRFLPERWLKQSTDXXX SAGCPHANQKIHPFVSLPFGFGRRMCVGRRFAEIELHTLLAKV GFNHVLHLALPELPPFNLPYLIPGTDLPQIQGLVQFRRVC VPCELHVHSAVASEFQTNAEGRVSCLHQREPEAHHGAQSCSYTRRMCDIY SQSSAAVAMLHFN* AC007356 Drosophila melanogaster chromosome 2 clone BACR24H09 (D595) RPCI-98 Comp(74301-74834) most like mitochondrial sequences CYP12A and CYP12B AI297564 is an exact match MNNLSLKAWRSTVSCGPNLRQCVPRISGA GSRRAQCRESSTGVATCPHLADSEEASAPRIHSTSEWQNALPYNQIPGPK PIPILGNTWR LMPIIGQYTISDVAKISSLLHDRYGRIVRFGGLIGRPDLLFIYDADEIEK CYRSEGPTPFRPSMPSLVKYKSVVRKDFFGDLGGVVG RHGEPWREFRSRVQKPVLQLSTIRRYLQPLE VITEDFLVRCENLLDENQELPEDFDNEIHKWSLE GIGRV ALDTRLGCLESNLKPDSEPQQIIDAAKYALRNVATLELKAPYWRYFPTPLWTRYVK NMNFFV SVCMKYIQSATERLKTQDPSLRAGEPSLVEKVILSQK DEKIATIMALDLILVGIDTVSMAVCSM LYQLATRPVDQQKVHEELKRLLPDPNTPLTIPLLDQMHHLKGFIKEVFRMYSTVIG NG RTLMEDSVICGYQVPKGVQAVFPTIVTGNMEEYVTDAATFRPERWLKPQHGGTPG KLHPF ASLPYGYGARMCLGRRFADLEMQILLAKLLRNYKLEYNHKPLDYAVTFMYAPDG PLRFKMTRV* Cyp12b1 U78485 Drosophila acanthoptera MWKFAIHSQQPFCWQQLCNRRHLYVGNVQQQTHLELLDAAPTRSDDEWLQAKPY EKVPGPGTWQVLSYFLPG GKQYNTNLIQMNRRMREWYGDIYRFPGLMGKQDVIFTYNPNDFELTYRNEGVWP IRIGLESFTYYRKVHRPE VFGSIGGLVSEQGKDWAHIRNKVNPVQMRVQNVRQNLPQIDQISREFVDKLDTLR DPVTHILNDNFHEQLKM WAFESISFVALNTRMGLLSDRPDPNAARLAEHMTDFFNYSFKYDVQPSIWPYYKT PGFKKFLQTYDKITEIT TAYIDEAIKRFEIEKDSGNECVLQQLLSLNKKVAVVMAMYMLMAGIDTTSSAFVTIL YHLARNPHKQRQLHR ERRRILPDSDEPLTPENTKNMPYLRACIKECMRITSITPGNFRIATKDLVLSGYRVPR GEGVLMGVLELSNS 
EKYFGQSGQFMPERWLKADTDPDVKACPAARSRNPFVYLAFGFGPRTCIGKRIAE LEMETLLTRLLRRYQVS WLAEMPLQYESNIILSPHGIYVQVRAAC cyp12b2 AC004657 DS07069_3_b12 P1 DS07069 Also AC004345 P1 DS07069 MWKYSNKIIYRNVSGNQLWFNRNSSVGGTLSQQVRSWQKEQELLKSRNLFTNN GYICSQTQ LELADSRIDEKWQQARSFGEIPGPSLLRMLSFFMPG*GALRNTNLIQMNRLMREMY GDIYC IPGMMGKPNAVFTYNPDDFEMTYRNEGVWPIRIGLESLNYYRKIHRPDVFKGVGG LASE*Q GQEWADIRNKVNPVLMKVQNVRQNLPQLDQISKEFIDK*LETQRNPETHTLTTDFH NQLKM WAFESISFVALNTRMGLLSDNPDPNADRLAKHMRDFFNYSFQFDVQPSIWTFYKT AGFKKF LKTYDNITDITSNYIETAMRGFGKNDDGKTKCVLEQLLEHNKKVAVTMVMDMLM AGIDT*T SSACLTILYHLARNPSKQEKLRRELLRILPTTKDSLTDQNTKNMPYLRACIKEGLRI TSIT PGNFRITPKDLVLSGYQVPRGTGVLMGVLELSNDDKYFAQSSEFIPERWLKSDLAP DIQAC PAARTRNPFVYLPFGFGPRTCIGKRIAELEIETLLVRLLRSYKVSWLPETPIEYESTII LS PCGDIRFKLEPVGDLM
>AL063519 T7 end of BAC BACR07K20 STS G01307 is exact match
AC006496 AC012699
MAVILLLALALVLGCYCALHRHKLADIYLRPLLKNTLLEDFYHAELIQPEAPKRRRRGI
WDIPGPKRIPFLGTKWIFLLFFRRYKMTKLHE
YGDIVLEVMPSNVPIVHLYNRDDLEKVLKYPSKYPFRPPTEIIVMYRQSRPDR
YASVGIVNEQGPMWQRLRSSLTSSITSPRVLQNFLPALNAVCDDFIELLRARRDPDTLV
VPNFEELANLMGLEAVCTLMLGRRMGFLAIDTKQPQKISQLAAAVKQLFISQRDSYYGLG
LWKYFPTKTYRDFARAEDLIYE
ORF with I-helix
SQSSVISEIIDHELEELKKSAACEDDEAAGLRSIFLNI
LELKDLDIRDKKSAIIDFIAAGIET
ORF with EXXR motif
LANTLLFVLSSVTGDPGAMPRILSEFCEYRDTNILQDALTNA
TYTKACIQESYRLRPTAFCLARILEEDMELSGYSLNAG
ORF with PERW and heme motifs
TVVLCQNMIACHKDSNFQGAKQFTPERWIDPATENFTVNVDNASIVV
PFGVGRRSCPGKRFVEMEVVLLLAK
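The ORF labels above (EXXR, PERW, heme) can be checked mechanically. The sketch below is an illustration under stated assumptions rather than the procedure used for this page: the regular expressions are simplified stand-ins (E-x-x-R, the literal PERW, and F-x-x-G-x-R-x-C-x-G for the heme-binding region), real family members deviate from them, and the I-helix is left out because no simple pattern is given for it here. The function name `find_motifs` is hypothetical.

```python
import re

# Simplified, illustrative patterns; '.' stands for any residue.
MOTIFS = {
    "EXXR": re.compile(r"E..R"),
    "PERW": re.compile(r"PERW"),
    "heme": re.compile(r"F..G.R.C.G"),
}


def find_motifs(seq):
    """Return (motif name, 0-based start, matched text) for every hit in seq."""
    hits = []
    for name, pattern in MOTIFS.items():
        for m in pattern.finditer(seq):
            hits.append((name, m.start(), m.group()))
    return hits


# ORF fragments copied from the entries above
exxr_orf = "TYTKACIQESYRLRPTAFCLARILEEDMELSGYSLNAG"
perw_orf = "TVVLCQNMIACHKDSNFQGAKQFTPERWIDPATENFTVNVDNASIVV"
heme_orf = "PFGVGRRSCPGKRFVEMEVVLLLAK"

for label, orf in [("EXXR", exxr_orf), ("PERW", perw_orf), ("heme", heme_orf)]:
    print(label, find_motifs(orf))
```

Because E..R is only four residues long it will match by chance in many unrelated proteins, so a hit is supporting evidence for the annotation, not proof of it.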
Cyp18 U44753 AL062343 MLADSYLIKFVLRQLQVQEDGDAQHLLMVFLGLLALVTLLQWLVRNYREL RKLPPGPWGLPVIGYLLFMGSEKHTRFMELAKQYGSLFSTRLGSQLTVVM SDYKMIRECFRREEFTGRPDTPFMQTLNGYGIINSTGKLWKDQRRFLHDK LRQFGMTYMGNGKQQMQKRIMTEVHEFIGHLHASDGQPVDMSPVISVAVS NVICSLMMSTRFSIDDPKFRRFNFLIEEGMRLFGEIHTVDYIPTMQCFPS ISTAKNKIAQNRAEMQRFYQDVIDDHKRSFDPNNIRDLVDFYLCEIEKAK AEGTDAELFDGKNHEEQLVQVIIDLFSAGMETIKTTLLWINVFMLRNPKE MRRVQDELDQVVGRHRLPTIEDLQYLPITESTILESMRRSSIVPLATTHS PTRDVELNGYTIPAGSHVIPLINSVHMDPNLWEKPEEFRPSRFIDTEGKV RKPEYFIPFGVGRRMCLGDVLARMELFLFFASFMHCFDIALPEGQPLPSL KGNVGATITPESFKVCLKRSPLGPTAADPHHMRNVGAN AI294030 3 DIFFS WITH CYP18 MLADSYLIKFVLRQLQVQQDGDAQHLLMGFLGLLALDT AL067521 similar to CYP18 I-helix to K-helix region LLWXSVFHCXIRGCCARVXXALARVVGRLRLPXFXXXXXLPLPASPFLASXRRSXXVPLAPTPSPXR 515 AL077732 similar to CYP18 C-terminal Note: these three similar fragments may indicate a second CYP18 gene IDTEGKXATXSTSYPSXXAXXXSXAXFWRGVELFLFFASFIHCFXIXLPXGQPLPSLKGN 183 VGXTITPESFNVCLXXRPLWPTSADPHHMRNVGAN 288 AC012164 Drosophila melanogaster chromosome X clone BACR14D22 (D1120) RPCI-98 308-468 107436-107984 = AC012373 37% to CYP18 36% to CYP2L1 This is a CYP2 like sequence AC012373 308-468 153999-154499 = AC012164 107442-107942 see AC015216 below AC012164 293-399 27006-27380 this fragment identical to AC015216 73548-73922 RGLGKEELAGHATTLLLEGYETSAMLLAFALYELALNEDAQRHAGNLI 27185 DPGALGELRYSEAALLEALRLHPAMQALQKRCTKTFTLPDQKSGASSELKVHLGTELVLP 27365 VHAIH 27380 this seq frameshifts to the next seq AC012164 401-416 27445-27492 this fragment = AC015216 73987-74034 DSALYPAPNQFRPERF 27492 AC012164 477-537 109248-109066 not on AC015216 or AC012373 FLFFAPLKAWFGIGPPEGLPLARFKGKVGAPLPPGLVKVCPKGPPLGAPSPHSPPMGKVGA 109066 AC012164 Drosophila melanogaster chromosome X clone BACR14D22 (D1120) RPCI-98 CYP18 region 1-131 114600-114992 131-171 115830-115952 170-314 116024-116458 315-402 116562-116825 403-537 117262-117666 Note AC015216 is ordered pieces so there are two main genes here the first is an unknown gene from 48688-50877 the next 
gene is CYP18 from 51875-54944 on the minus strand a third gene is found in part at 73548-74034 AC015216 Drosophila melanogaster, in ordered fragments CYP18 54944-51875 minus strand new gene 48688-50877 this gene is identical to AC012373 152577-152804, 152894-153046, 153468-154541 this gene is identical to AC012164 84571-84344, 84254-84102, 107115-107984 31% to CYP18 an internal intron is present but cannot be precisely identified 48688 MSADIVDIGHTGWMPSVQSLSILLVPGALVLVILYLCERQCNDLMGAP 48832 PPGPWGLPFLGYLPFLDARAPHKSLQKLAKRYGGIFELKMGRVPTVVLSDAALVRDFFRR 49011 49012 DVMTGRAPLYLTH 49155 GIICAQEDIWRHARRETIDWLKALGMTRRPGELRARLERRIARGVDECV 49301 49723 VNPLPALHHSLGNIINDLVFGITYKRDDPDWLYLQRLQEEGVKLIGVSGVVNFLPWLRHL 49902 49903 PANVRNIRFLLEGKAKTHAIYDRIVEACGQRLKEKQKVFKELQEQKRLQRQLEKEQLRQS 50082 50083 KEADPSQEQSEADEDDEESDEEDTYEPECILEHFLAVRDTDSQLYCDDQLRHLLAD 50250 50251 LFGAGVDTSLATLRWFLLYLAREQRCQRRLHELLLPLGPSPTLEELEPLAYLRACIS 50421 50422 ETMRIRSVVPLGIPHGCKENFVVGDYFIKGGSMIVCSEWAIHMDPVAFPEPEEFRPERFL 50601 50602 TADGAYQAPPQFIPFSSGYRMCPGEEMARMILTLFTGRILRRFHLELPSGTEVDMAGES 50778 50779 GITLTPTPHMLRFTKLPAVEMRHAPDGAVVQD 50877 AC014292 AC009393 AC008359 AC014072 AI257867 MITETLLTICAAVFLCLSYRYAVGRPSGFPPGPPKIPLFGSYLFMLIINF KYLHKAALTLSRWYKSDIIGLHVGPFPVAVVHSADGVREILNNQVFDGRP QLFVAAMRDPGQDVRGIFFQDGPLWKEQRRFILRYLRDFGFGRRFDQLEL VIQEQLNDMLDLIRNGPKYPHEHEMVKSGGYRVLLPLLFNPFSANAHFYI VYNECLSREEMGKLVKLCQMGIQFQRNADD YGKMLSIIPWIRHIWPEWSGYNKLNESNLFVRQFFADFVDKYLDSYEEGVE RNFMDVYIAEMRRGPGYGFNRDQLIMGLVDFSFPAFTAIGVQLSLLVQYLMLYPAVLRRVQNEIDEVVGCGRLPNLEDRKNLPFTE ATIREGLRIETLVPSDVPHKALEDTELLGYRIPKDTIVVPSLYAFHSDARIWSDPEQFRPERFLDADGKLCLKLDVSLPFGAGKRL CAGETFARNMLFLVTATMCQHFDFVLGPNDRLPDLSQNLNGLIISPPDFWLQLQDRH* Cyp28a1 U89746 Drosophila mettleri MLEITLILVLLLLGLFYVFMTWNFGYWRKRKVPGPKPHCFTGNY PHMYNMKRHSVYDLNDIYSEYEHKFDAVGIYGARSPQLLVISPQVARRVFVSDFR HFH DNEISLMVDEKSDFIMANNPFSQIGDEWKQRRADITPGLTMGRIKTVYPVTQEVCQ KM TDWLRKQIRLPPSGGIDAKDMSLRFTSEMVTDCVLGLKAESFSDKPTPIMGYIKDL FA 
QSWTFIIYFVLVSTLPALRHVFKLRFVPLRIENFFVNLMQTAIDARRQQLAAGKQFE R VDFLDYILHLGKKKNLDTRHLTAHTMTFLLDGFETTALYDSCALVLSRDQEAQQKL RE ELEAHLDDKGIIDFEKLNELPFLDACVQESLRIFPPAFMSNKLCTEPIELPNKTGENF TVERGTTVVVPHYCFMMDEEFFPDPQAFKPERFMQPDAAKMYREQGVFMAFGDG PRVC IGMRFALTQIKGALVELLTKFIIRVNPKTRSDNEYDPTTFIGTCKGGIWLDFELRQ Cyp28a2 U89747 Drosophila mettleri MLVTLILLGLVVFLGYKFLIWNYDYWRKRKVPGPKPALFTGNYP HLFTGKQHPVYAVNEIYRKYKNDYDAVGIYISRMPQLLIVNPDLAHRVFVSNFKN FHD NEISALVVEKSDYIFANNIFSMTGDAWKERRSDITPGLTISRIKSVYPVTNQVCKKM T EYIKKQIRIAPKDGLNGKDLSLCFTTEMVTDCVLGLGAQSFTDNPTPVMAKMRNL FRQ DLPFLINTIAMALFPPLRRIIRLRFLSKTIVEFFVRFMETPLEERQKHISAGANINRV DMLGYIIQLSPKRNMDSLKITACTMSFLLDGDDTPPSLLSNTLLLLGRNPQGHQRL RE ELSEHLCDQGFIDFDKLVDLPYLNACVHESIRIFLTAVSSKLCTESIELSNRNGPNFT VEKGTVVLVPITCFMYDDDHFPNANEYNPERFLKPDSIKKYRDQGLFLGFGDGPRI CIGMRFGLAQAKAALVEILVNFDVSVNARTRKDNLYDPKNLLSTLEGGIWLDFAARS AC008324 Drosophila melanogaster chromosome 2 clone BACR25K01 (D854) RPCI-98 25.K.1 map 25C-25D strain y; cn bw sp, WORKING DRAFT SEQUENCE, 81 unordered pieces Length = 122061 Score = 172 bits (432), Expect = 9e-43 Identities = 93/93 (100%), Positives = 93/93 (100%), Gaps = 43/93 (46%) Frame = +2 Query: 122 NNPFVLTGEAWKERRAEVTPGLSANR---------------------------------- 147 NNPFVLTGEAWKERRAEVTPGLSANR Sbjct: 28130 NNPFVLTGEAWKERRAEVTPGLSANRVSLGCPRIVYSKLISC*CLFSLGALIPLSIEVLQ 28309 Query: 148 ---------VKAAYPVSLRVCKKFVEYIRRQSLMAPAQGLNAKDLCLCYTTEVISDCVLG 198 VKAAYPVSLRVCKKFVEYIRRQSLMAPAQGLNAKDLCLCYTTEVISDCVLG Sbjct: 28310 SINALPPQKVKAAYPVSLRVCKKFVEYIRRQSLMAPAQGLNAKDLCLCYTTEVISDCVLG 28489 Query: 199 ISAQSFTDNPTPMVGM 214 ISAQSFTDNPTPMVGM Sbjct: 28490 ISAQSFTDNPTPMVGM 28537 TKRVFEQSFGFIFYTVVANLWPPITKFYSVSLFAKDVAAFFYDL 28669 MQKCIQVRRESPAAQQRDDFLNYMLQLQEKKGLNAAELTSHTMTFLTDGFETTAQVL 28840 THTLLFLARNPKEQMKLREEIGTAELTFEQISELPFTEACIH 28966 aa362 Score = 144 bits (360), Expect(2) = 1e-35 Identities = 73/102 (71%), Positives = 84/102 (81%), Gaps = 19/102 (18%) Frame = +2 Query: 98 SDFRSFHDNEMAKFTDSK 115 +DFRSFH+NE F ++ Sbjct: 29560 TDFRSFHNNEWRNFVSTR 
29613 Query: 113 DSKTDPILANNPFVLTGEAWKERRAEVTPGLSANRV-------------------KAAYP 153 + KTD IL NNPFVLTG+ WKERR+E+ P LS NRV KA YP Sbjct: 29666 NKKTDMILGNNPFVLTGDEWKERRSEIMPALSPNRVRNHINHLSNQNSIFEFRKVKAVYP 29845 Query: 154 VSLRVCKKFVEYIRRQSLMAPAQGLNAKDLCLCYTTEVISDCVLGISAQSFTDNPTPMVG 213 VS VCKKFVEYIRRQ MA ++GL+A DL LCYTTEV+SDC LG+SAQSFTD PTP++ Sbjct: 29846 VSQSVCKKFVEYIRRQQQMATSEGLDAMDLSLCYTTEVVSDCGLGVSAQSFTDTPTPLLK 30025 Query: 214 M 214 M Sbjct: 30026 M 30028 Score = 139 bits (348), Expect(2) = 1e-51 Identities = 65/65 (100%), Positives = 65/65 (100%) Frame = -3 Query: 1 MCPISTALFVIAAILALIYVFLTWNFSYWKKRGIPTAKSWPFVGSFPSVFTQKRNVVYDI 60 MCPISTALFVIAAILALIYVFLTWNFSYWKKRGIPTAKSWPFVGSFPSVFTQKRNVVYDI Sbjct: 119395 MCPISTALFVIAAILALIYVFLTWNFSYWKKRGIPTAKSWPFVGSFPSVFTQKRNVVYDI 119216 Query: 61 DEIYE 65 DEIYE Sbjct: 119215 DEIYE 119201 Score = 112 bits (277), Expect = 1e-24 Identities = 45/69 (65%), Positives = 58/69 (83%) Frame = -3 Query: 1 MCPISTALFVIAAILALIYVFLTWNFSYWKKRGIPTAKSWPFVGSFPSVFTQKRNVVYDI 60 MCP++T L ++ +L L+YVFLTWNF+YW+KRGI TA +WPFVGSFPS+FT+KRN+ YDI Sbjct: 114532 MCPVTTFLVLVLTLLVLVYVFLTWNFNYWRKRGIKTAPTWPFVGSFPSIFTRKRNIAYDI 114353 Query: 61 DEIYEQYKN 69 D+IYE N Sbjct: 114352 DDIYEYVYN 114326 Score = 83.5 bits (203), Expect(2) = 1e-51 Identities = 39/46 (84%), Positives = 41/46 (88%) Frame = -1 Query: 58 YDIDEIYEQYKNTDSIVGVFQTRIPQLMVTTPEYAHKIYVSDFRSF 103 Y + + QYKNTDSIVGVFQTRIPQLMVTTPEYAHKIYVSDFRSF Sbjct: 119169 Y*LHHLIRQYKNTDSIVGVFQTRIPQLMVTTPEYAHKIYVSDFRSF 119032 Score = 25.8 bits (55), Expect(2) = 1e-35 Identities = 9/18 (50%), Positives = 13/18 (72%) Frame = +1 gb|AC008324.1|AC008324 Drosophila melanogaster chromosome 2 clone BACR25K01 (D854) RPCI-98 25.K.1 map 25C-25D strain y; cn bw sp, WORKING DRAFT SEQUENCE, 81 unordered pieces Length = 122061 Score = 201 bits (505), Expect(3) = 3e-79 Identities = 102/244 (41%), Positives = 163/244 (66%), Gaps = 43/244 (17%) Frame = +2 Query: 119 NNIFSMTGDAWKERRSDITPGLTISRI--------------------------------- 145 NN F 
+TG+AWKERR+++TPGL+ +R+ Sbjct: 28130 NNPFVLTGEAWKERRAEVTPGLSANRVSLGCPRIVYSKLISC*CLFSLGALIPLSIEVLQ 28309 Query: 146 ----------KSVYPVTNQVCKKMTEYIKKQIRIAPKDGLNGKDLSLCFTTEMVTDCVLG 195 K+ YPV+ +VCKK EYI++Q +AP GLN KDL LC+TTE+++DCVLG Sbjct: 28310 SINALPPQKVKAAYPVSLRVCKKFVEYIRRQSLMAPAQGLNAKDLCLCYTTEVISDCVLG 28489 Query: 196 LGAQSFTDNPTPVMAKMRNLFRQDLPFLINTIAMALFPPLRRIIRLRFLSKTIVEFFVRF 255 + AQSFTDNPTP++ + +F Q F+ T+ L+PP+ + + +K + FF Sbjct: 28490 ISAQSFTDNPTPMVGMTKRVFEQSFGFIFYTVVANLWPPITKFYSVSLFAKDVAAFFYDL 28669 Query: 256 METPLEERQKHISAGANINRVDMLGYIIQLSPKRNMDSLKITACTMSFLLDGDDTPPSLL 315 M+ ++ R++ +A R D L Y++QL K+ +++ ++T+ TM+FL DG +T +L Sbjct: 28670 MQKCIQVRRESPAAQ---QRDDFLNYMLQLQEKKGLNAAELTSHTMTFLTDGFETTAQVL 28840 Query: 316 SNTLLLLGRNPQGHQRLREELSEHLCDQGFIDFDKLVDLPYLNACVH 362 ++TLL L RNP+ +LREE+ + F+++ +LP+ AC+H Sbjct: 28841 THTLLFLARNPKEQMKLREEIG-----TAELTFEQISELPFTEACIH 28966 Score = 187 bits (470), Expect(3) = 2e-76 Identities = 98/254 (38%), Positives = 160/254 (62%), Gaps = 19/254 (7%) Frame = +2 Query: 109 VVEKSDYIFANNIFSMTGDAWKERRSDITPGLTISRI-------------------KSVY 149 V +K+D I NN F +TGD WKERRS+I P L+ +R+ K+VY Sbjct: 29663 VNKKTDMILGNNPFVLTGDEWKERRSEIMPALSPNRVRNHINHLSNQNSIFEFRKVKAVY 29842 Query: 150 PVTNQVCKKMTEYIKKQIRIAPKDGLNGKDLSLCFTTEMVTDCVLGLGAQSFTDNPTPVM 209 PV+ VCKK EYI++Q ++A +GL+ DLSLC+TTE+V+DC LG+ AQSFTD PTP++ Sbjct: 29843 PVSQSVCKKFVEYIRRQQQMATSEGLDAMDLSLCYTTEVVSDCGLGVSAQSFTDTPTPLL 30022 Query: 210 AKMRNLFRQDLPFLINTIAMALFPPLRRIIRLRFLSKTIVEFFVRFMETPLEERQKHISA 269 ++ +F F+ ++ L+ +R+ + F +K FF+ + + R + Sbjct: 30023 KMIKRVFNTSFEFIFYSVVTNLWQKVRKFYSVPFFNKETEVFFLDIIRRCITLRLEK--- 30193 Query: 270 GANINRVDMLGYIIQLSPKRNMDSLKITACTMSFLLDGDDTPPSLLSNTLLLLGRNPQGH 329 R D L Y++QL K+ + + I TM+F+LDG +T +L++ +L+LGRNP+ Sbjct: 30194 -PEQQRDDFLNYMLQLQEKKGLHTDNILINTMTFILDGFETTALVLAHIMLMLGRNPEEQ 30370 Query: 330 QRLREELSEHLCDQGFIDFDKLVDLPYLNACVH 362 ++R+E+ + FD++ +LP+L+AC++ Sbjct: 30371 DKVRKEIG-----SADLTFDQMSELPHLDACIY 30454 Score = 85.8 bits (209), Expect(3) = 3e-79 Identities 
= 41/89 (46%), Positives = 59/89 (66%), Gaps = 2/89 (2%) Frame = +1 Query: 363 ESIRIFLTAVSS-KLCTESIELSNRNGPNFTVEKGTVVLVPITCFMYDDDHFPNANEYNP 421 E++RIF +++ K+ TE EL+N+NG + + G VV++P+ +D ++ + P Sbjct: 29026 ETLRIFSPVLAARKVVTEPCELTNKNGVSVKLRPGDVVIIPVNALHHDPQYYEEPQSFKP 29205 Query: 422 ERFLKPDS-IKKYRDQGLFLGFGDGPRICIG 451 ERFL + KKYRDQGLF GFGDGPRIC G Sbjct: 29206 ERFLNINGGAKKYRDQGLFFGFGDGPRICPG 29298 Score = 76.1 bits (184), Expect(3) = 2e-76 Identities = 37/94 (39%), Positives = 60/94 (63%), Gaps = 3/94 (3%) Frame = +3 Query: 358 NACVHESIRIFLTAVSS-KLCTESIELSNRNGPNFTVEKGTVVLVPITCFMYDDDHFPNA 416 N + E++R+F V++ KL TE E +N+NG ++ G VV +P+ +D ++ + Sbjct: 30510 NFIILETLRLFSPQVAARKLVTEPFEFANKNGRTVHLKPGDVVTIPVKALHHDPQYYEDP 30689 Query: 417 NEYNPERFLKPD--SIKKYRDQGLFLGFGDGPRICIG 451 + PERFL+ + +K YRD+G++L FGDGPR C G Sbjct: 30690 LTFKPERFLESNGGGMKSYRDRGVYLAFGDGPRHCPG 30800 Score = 68.3 bits (164), Expect = 6e-11 Identities = 50/198 (25%), Positives = 88/198 (44%), Gaps = 1/198 (0%) Frame = +1 Query: 275 RVDMLGYIIQLSPKRNMDSLKITACTMSFLLDGDDTPPSLLSNTLLLLGRNPQGHQRLRE 334 R ML ++ +D I +F+ +G DT + L TLL+L + ++ E Sbjct: 77458 RYAMLDTLLAAEADGQIDHQGICDEVNTFMFEGYDTTSTCLIFTLLMLALHEDVQKKCYE 77637 Query: 335 ELSEHLCDQGFIDFDKLVDLPYLNACVHESIRIFLTA-VSSKLCTESIELSNRNGPNFTV 393 E+ D I + +L Y+ + ES+R+F + + C E ++ P Sbjct: 77638 EIKYLPDDSDDISVFQFNELVYMECVIKESLRLFPSVPFIGRQCVEETVVNGMVMP---- 77805 Query: 394 EKGTVVLVPITCFMYDDDHFPNANEYNPERFLKPDSIKKYRDQGLFLGFGDGPRICIGMR 453 K T + + + M D HF N + + P+RF +++ R F+ F G R CIG + Sbjct: 77806 -KDTQISIHLYEIMRDARHFSNPDLFQPDRFFPENTVN--RHPFAFVPFSAGQRNCIGQK 77976 Query: 454 FGLAQAKAALVEILVNFDV 472 F + + K L ++ NF + Sbjct: 77977 FAILEIKVLLAAVIRNFKI 78033 Score = 66.0 bits (158), Expect(3) = 2e-76 Identities = 31/52 (59%), Positives = 39/52 (74%) Frame = +2 Query: 450 IGMRFGLAQAKAALVEILVNFDVSVNARTRKDNLYDPKNLLSTLEGGIWLDF 501 +GMRF L Q KAALVEIL NF++ VN +TR DN D ++TL+GGI+LDF Sbjct: 30857 LGMRFALTQLKAALVEILRNFEIKVNPKTRSDNQIDDTFFMATLKGGIYLDFKDL* 31012 Score = 59.3 bits (141), 
Expect(2) = 3e-17 Identities = 24/57 (42%), Positives = 38/57 (66%) Frame = -3 Query: 5 LILLGLVVFLGYKFLIWNYDYWRKRKVPGPKPALFTGNYPHLFTGKQHPVYAVNEIY 61 L ++ ++ L Y FL WN+ YW+KR +P K F G++P +FT K++ VY ++EIY Sbjct: 119374 LFVIAAILALIYVFLTWNFSYWKKRGIPTAKSWPFVGSFPSVFTQKRNVVYDIDEIY 119204 Score = 53.5 bits (126), Expect = 2e-06 Identities = 23/62 (37%), Positives = 38/62 (61%) Frame = -3 Query: 5 LILLGLVVFLGYKFLIWNYDYWRKRKVPGPKPALFTGNYPHLFTGKQHPVYAVNEIYRKY 64 L+L+ ++ L Y FL WN++YWRKR + F G++P +FT K++ Y +++IY Sbjct: 114511 LVLVLTLLVLVYVFLTWNFNYWRKRGIKTAPTWPFVGSFPSIFTRKRNIAYDIDDIYEYV 114332 Query: 65 KN 66 N Sbjct: 114331 YN 114326 Score = 51.9 bits (122), Expect(3) = 3e-79 Identities = 24/36 (66%), Positives = 28/36 (77%) Frame = +3 Query: 450 IGMRFGLAQAKAALVEILVNFDVSVNARTRKDNLYD 485 +GMRF L Q KAALVEI+ NFD+ VN +TRKDN D Sbjct: 29355 VGMRFSLTQIKAALVEIVRNFDIKVNPKTRKDNEIDDTYFM 29462 29401 aaatcgtgcg aaacttcgac atcaaggtta atcccaaaac tcgcaaggat aatgaaattg 29461 atgataccta ctttatgnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn Score = 50.4 bits (118), Expect(2) = 3e-17 Identities = 18/46 (39%), Positives = 35/46 (75%) Frame = -1 Query: 55 YAVNEIYRKYKNDYDAVGIYISRMPQLLIVNPDLAHRVFVSNFKNF 100 Y ++ + R+YKN VG++ +R+PQL++ P+ AH+++VS+F++F Sbjct: 119169 Y*LHHLIRQYKNTDSIVGVFQTRIPQLMVTTPEYAHKIYVSDFRSF 119032 AI062825 75% to Cyp28a2 QGLFFGFGDGPRICPGMRFSLTQIKAALVEIVRNFDIKVNPKTRKDNEID DTYFMPALKGGVWLDFVERN* Cyp28a3 U91565 Drosophila nigrospiracula MRFALIQIKAAVVEVITKFNVRVNPKTRKDNEYEPTAFITSLKGGIWLDFESRP Cyp28a4 U91565 Drosophila hydei MRFAMTQIKGALVEVLTKFNVRVNPKTRTDNEYEPTRFITTLKGGIWLDFEPRQ Cyp28a5 DS00180_1 : contig 1 (85065 bases) of 1 for P1 d29 (AC001660) AI387905 AI387915 AI514076 MVLITLTLVSLVVGLLYAVLVWNYDYWRKRGVPGPKPKLLCGNYPNMFTMKRH AIYDLDDI YRQYKNKYDAVGIFGSRSPQLLVINPALARRVFVSNFKNFHDNEIAKNIDEKTDFI FANNP FSLTGEKWKTRRADVTPGLTMGRQIKTVYPVTNKVCQKLTEWVEKQLRLGSKDG IDAKQMS LCFTTEMVTDCVLGLGAESFSDKPTPIMSKINDLFNQPWTFVLFFILTSSFPSLSHLI KLR FVPVDVERFFVDLMGSAVETRRAQLAAGKQFERSDFLDYILQLGEKRNLDNRQLLA YSMTF 
LLDGFETTATVLAHILLNLGRNKEAQNLLREEIRSHLQDGTIAFEKLSDLPYLDACV QETI RLFPPGFMSNKLCTESIEIPNKEGPNFVVEKGTTVVVPHYCFMLDEEFFPNPQSFQ PERFL EPDAAKTFRERGVFMGFGDGPRVCIGMRFATVQIKAAIVELISKFNVKINDKTRKD NDYEP GQIITGLRGGIWLDLEKL* AC014191 Drosophila melanogaster, *** SEQUENCING IN PROGRESS ***, in ordered 1-519 10002-8254 36% to 28a5 50% to part of AC009355 38% to whole AC008327 34% to 28a5 36% to 28A1 and 28A2 10002 MFGSLLLGIATLLGAIYAFLVSNFGHWRRRGVTEPRALPLFGSFPNMIWPRQHFTMDMRDIY 9811 RHYRNTHSYVGCYLLRAPKLLVLEPRLVYEIYVSAFSHFENN 9631 9630 DASKMVDIAKDRLVALNPFVLEGEEWRHQRAVFSTLLTNGRIRTTHAIMQRVCLDLCQFI 9451 9450 AIKSAGGKDLDCID 9350 LGLRFTGESLFDCVLGIQARTFTDNPLPVVRQNHEMSAENRGLAIAGAVHGLFPNLPRWL 9171 9170 RPKVFPRSHDRFYGQMISEALRLRRSKHQERNDFINHLLEMQRELDLSEEDMASHAMT 8997 8996 FMFDGLDTTSNSIAHCLLLLGRNPDCQRRLYEELQLVNPGGY 8817 8816 LPDLDALIDLPYLSACFN 8763 *ESLRIYPAGGWASKTCTKEYELRGS 8629 8628 HHSEPLKLRPGDHVMVPIYALHNDPDLYPEPDVFRPERFLDGGLKNCKQQG 8476 8475 IFLGFGNGPRQCVGMRLGLAMAKAALAAIVQRFEVVVSPRTLNGTELDPLIFVGVHKGGI 8296 8295 WLQFVPRKNVTTK* 8254 AI403094 AI403665 AA698711 AA697564 AC008324 AI062825 AC008327 50% to 28A2 MCPISTALFVIAAILALIYVFLTWNFSYWKKRGIPTAKSWPFVGSFPSVF TQKRNVVYDIDEIYEQYKNTDSIVGVFQTRIPQLMVTTPEYAHKIYVSDF RSFHDNEMAKFTDSKTDPILANNPFVLTGEAWKERRAEVTPGLSANRVKA AYPVSLRVCKKFVEYIRRQSLMAPAQGLNAKDLCLCYTTEVISDCVLGIS AQSFTDNPTPMVGMTKRVFEQSFGFIFYTVVANLWPPITKFYSVSLFAKDVAAFFYDL MQKCIQVRRESPAAQQRDDFLNYMLQLQEKKGLNAAELTSHTMTFLTDGFETTAQVL THTLLFLARNPKEQMKLREEIGTAELTFEQISELPFTEACIH ETLRIFSPVLAARKVVTEPCELTNKNGVSVKLRPGDVVIIPVNALHHDPQYYEEPQSFKP ERFLNINGGAKKYRDQGLFFGFGDGPRICPGMRFSLTQIKAALVEIVRNFDIKVNPKTRK DNEIDDTYFMPALKGGVWLDFVERN* 1-69 114532 MCPVTTFLVLVLTLLVLVYVFLTWNFNYWRKRGIKTAPTWPFVGSFPSIFTRKRNIAYDI 114353 114352 DDIYEYVYN 114326 missing 28 amino acids 70-97 98-114 29560 TDFRSFHNNEWRNFVST 29613 115-355 29666 KTDMILGNNPFVLTGDEWKERRSEIMPALSPNRVKAVYP 29845 29846 VSQSVCKKFVEYIRRQQQMATSEGLDAMDLSLCYTTEVVSDCGLGVSAQSFTDTPTPLLK 30025 30026 MIKRVFNTSFEFIFYSVVTNLWQKVRKFYSVPFFNKETEVFFLDIIRRCITLRLEKP-EQ 30202 30203 
QRDDFLNYMLQLQEKKGLHTDNILINTMTFILDGFETTALVLAHIMLMLGRNPEEQDKVR 30382 30383 KEIGSADLTFDQMSELPHLDAC 30454 356-448 30519 ILETLRLFSPQVAARKLVTEPFEFANKNGRTVHLKPGDVVTIPVKALHHDPQYYEDPLTF 30698 30699 KPERFLESNGGGMKSYRDRGVYLAFGDGPRHCPG 30800 30860 MRFALTQLKAALVEILRNFEIKVNPKTRSDNQIDDTFFMATLKGGIYLDFKDL* 31021 CYP4AA1 AC004516 68725-68267 AC004426 AC005556 comp(849-1502) P1s DS08658 (D196) and DS00960 MHLRLLSPPQLERTTNLELCSILILLVISLSIYTFYATLNTYLRSVLLSLRLTGPPSLPF L GNCMLVTDKDCK*YGSLVRIWVLLFPFFAVLEPEDLQVILSSKKHTNKVFFYRLM HNFLGD GLITSSGSKWSNHRRLIQPAFHHNLLEKFIDTFVDASQSLYENLDAEAVGTEINIAK YVNN CVLDILN*EAVLGVPIKKRGQDVAMMEDSPFRQGKIMMPARFTQPWLLLDGIYHW TKMAND ELNQKKRLNDFTRKMIQRRRQIQNNNNGNSERKCLLDHMIEISESNRDFTEEDIVN EACTF MLAGQDSVGAAVAFTLFLLTQNPECQDRCVLELATIFEDSNRAPTMTDLHEMRYM EMCIKE ALRLYPSVPLIARKLGEEVRLAKHTLPAGSNVFICPYATHRLAHIYPDPEKFQPERF SPEN SENRHPYAFLPFSAGPRYCIGNRFAIMEIKTIVSRLLRSYQLLPVTGKTTIAATFRITL RA SGGLWVRLKERDHPLIAH* L49408 DS02740, AI405120 MFYTVIWIFCATLLAILFGGVRKPKRFPPGPAWYPIVGSALQVSQLRCRLGMFCKV IDVFA RQYVNPYGFYGLKIGKDKVVIAYTNDAISEMMTNEDIDGRPDGIFYRLRTFNSRLG VLLTD GEMWVEQRRFILRHLKNFGFARSGMMDIVHNEATCLLQDLKDKVLKSGGKQTRI EMHDLTS VYVLNTLWCMLSGRRYEPGSPEITQLLETFFELFKNIDMVGALFSHFPLLRFIAPNF SGYN GFVESHRSLYTFMSKEIELHRLTYKNYDEPRDLMDSYLRAQDEGNDEKGMFSDQS LLAICL DMFLAGSETTNKSLGFCFMHLVLQPEIQERAFQEIKEVVGLERIPEWSRDRTKLPY CEAIT LEAVRMFMLHTFGIPHRAVCDTRLSGYEIPKDTMVIACFRGMLINPVDFPDPESFN PDRYL FDGHLKLPEAFNPFGFGRHRCMGDLLGRQNLFMFTTTVLQNFKMVAIPGQVPEEV PLEGAT AAVKPYDIMLVAREQ* AC003055|AC003055 Drosophila melanogaster (P1 DS06332) This is the end of a partial gene found on AC009909 ESLRLSSLIPQYTKVCTLPTVIRLSESKSLDVEVGMTIMIPNYQFHHDKQYFPEPEAF KPE RFDNGAYQELMRKGIFLPFSDGPRICMGVPLAMLTLKSALVHILSNFQVVRGRDRL IPKGD SGFGVVLQGDVNLEYRRFFR* second gene complete 5801-7437 MFTLVGLCLTIVHVAFAVVYFYLTWYHKYWDKRGVVTAEPLTILGSYPGILINKSR SLILDVQDVYSKYKDKYRTVGTFITRQPQLLVLDPALAHEILVDKFSHFRDTITSSFVGHNP DDKYVAGSPFFSAGDKWKRLRSENVGGLTPSRLKMAYSIWEQSGRKLVEYIERARREQG DIIETRDLAYRFTANAMADFIWGIDAGSLSGKVGEIGDFQKTSTDWSAHAFSSMIRFNKTL 
VAIFVRKLFSMRFFTKATDEFFLRLTQDAVNLRQGGSGEGRTDYLSHLIQLQQRGNSIHDSV GHALTVHLDGFETSGAVLYNAVSYQLSEHHEEQEKLRSEILEALASEGQISYDQINNLPYLD QCFNESLRLTTPIGFFMRICTKPTQINLGDDKTLDLEPGVTVMVPAYQYHHDNDIYPEASE FRPDRFENGAASVLTKRGCFLPFGDGPRICLGMRVGQLSVKTAIVHILSNYQVEQMKKV PLGADSGMGIFLNGDVELKYTKLQK AC002444 (P1 DS02782 gene runs off end of cosmid) AC009909 MYILASLALILLHLLVLPIYLYLTWHHKYWRKRGLVTARPLTLLGTYPGLLTRKSN LVFDVQKIY 77703 SKYKGKHRAVGVFVTRQPQILVLDPELAHEVLVSNFRCYKDSLQSSYLRHAKWD KYARLNPFWASGQSWRRLRTDAQAGISGSRLRQAYNIWEQGGQMLTEYMTQQVA EKNNILETRDV 78113 CFRYTAHVMADFIWGIDAGTLTRPMEQPNKVQEMASKWTSYAFYMLTLFMATIVA PCSR 78351 LLLRFRFYPKETDEFFSNLTKESIELRLKAGDSTRTDYLSHLLQLRDQKQATHDDLV GHA 78531 LTVMLDGYDTSGTALLHALYYVLAENPAVQQKLRVEILSCMA 78711 SEKSLDFEKLSSLQYLEQ 78765 AC009909 Drosophila melanogaster chromosome 2 clone BACR30P05 (D1011) RPCI-98 30.P.5 map 22F-23A strain y; cn bw sp, WORKING DRAFT SEQUENCE, 68 unordered pieces Length = 106422 1-65 45210 MYILASLALILLHLLVLPIYLYLTWHHKYWRKRGLVTARPLTLLGTYPGLLTRKSNLVFD 45031 45030 VQKIY 45016 66-184 44962 SKYKGKHRAVGVFVTRQPQILVLDPELAHEVLVSNFRCYKDSLQSSYLRHAKWDKYARLN 44783 44782 PFWASGQSWRRLRTDAQAGISGSRLRQAYNIWEQGGQMLTEYMTQQVAEKNNILETRDV 44606 185-363 44544 CFRYTAHVMADFIWGIDAGTLTRPMEQPNKVQEMASKWTSYAFYMLTLFMATIVAPCSR 44368 44367 LLLRFRFYPKETDEFFSNLTKESIELRLKAGDSTRTDYLSHLLQLRDQKQATHDDLVGHA 44188 44187 LTVMLDGYDTSGTALLHALYYVLAENPAVQQKLRVEILSCMA 44008 44007 SEKSLDFEKLSSLQYLEQCFN 43945 last part is from AC003055 Drosophila melanogaster (P1 DS06332) ESLRLSSLIPQYTKVCTLPTVIRLSESKSLDVEVGMTIMIPNYQFHHDKQYFPEPEAF KPE RFDNGAYQELMRKGIFLPFSDGPRICMGVPLAMLTLKSALVHILSNFQVVRGRDRL IPKGD SGFGVVLQGDVNLEYRRFFR* This gene from AC009909, AC002444, AC003055 MYILASLALILLHLLVLPIYLYLTWHHKYWRKRGLVTARPLTLLGTYPGLLTRKSNLVFD VQKIYSKYKGKHRAVGVFVTRQPQILVLDPELAHEVLVSNFRCYKDSLQSSYLRHAKWDK YARLNPFWASGQSWRRLRTDAQAGISGSRLRQAYNIWEQGGQMLTEYMTQQVAEKNNILE TRDVCFRYTAHVMADFIWGIDAGTLTRPMEQPNKVQEMASKWTSYAFYMLTLFMATIVAP CSRLLLRFRFYPKETDEFFSNLTKESIELRLKAGDSTRTDYLSHLLQLRDQKQATHDDLV 
GHALTVMLDGYDTSGTALLHALYYVLAENPAVQQKLRVEILSCMASEKSLDFEKLSSLQY LEQCFNESLRLSSLIPQYTKVCTLPTVIRLSESKSLDVEVGMTIMIPNYQFHHDKQYFPE PEAFKPERFDNGAYQELMRKGIFLPFSDGPRICMGVPLAMLTLKSALVHILSNFQVVRGR DRLIPKGDSGFGVVLQGDVNLEYRRFFR* 2nd = AC003055 42101- This fragment is before the main segment of another P450 below it so it is a different gene. 41631 ERLAMLTLKSALVHILSNFQVVRGRDRLIPKGDSGFGVVLQGDVNLEYRRFFR* 41792 Query: 1 MYILASLALILLHLLVLPIYLYLTWHHKYWRKRGLVTARPLTLLGTYPGLLTRKS-NLVF 59 M+ L L L ++H+ +Y YLTW+HKYW KRG+VTA PLT+LG+YPG+L KS +L+ Sbjct: 42101 MFTLVGLCLTIVHVAFAVVYFYLTWYHKYWDKRGVVTAEPLTILGSYPGILINKSRSLIL 42280 Query: 60 DVQKIYS---KYKGKHRAVGVFVTRQPQILVLD----------------PELAHEVLVSN 100 DVQ +Y G ++ T+ Q V D P LAHE+LV+ Sbjct: 42281 DVQDVYK*ALPLIGDCLLQKIWSTQ*IQGQVQDSRNLHYATARSSWFWVPALAHEILVNK 42460 Query: 101 FRCYKDSLQSSYLRHAKWDKYARLNPFWASGQSWRRL---------RTDAQAGISG 147 F ++D++ SS++ H KY +P Q WR++ RTDAQ+ +G Sbjct: 42461 FSPFRDTITSSFVGHNPDAKYVAGSPLL---QRWRQMEATALRKCWRTDAQSTENG 42619 Query: 66 SKYKGKHRAVGVFVTRQPQILVLD-PELAHEVLVSNFRCYKDSLQSSYLRHAKWDKYARL 124 SKYK K+R VG F+TRQP P L ++ + ++ + R Sbjct: 42354 SKYKDKYRTVGTFITRQPAAPGFGFPPLPMRFW*TSSVPFGTPSPAALWATIRMQSTWRG 42533 Query: 125 NPFWASGQSWRRLRTDAQAGISGSRLRQAYNIWEQGGQMLTEYMTQQVAEKNNILETRDV 184 +PF+++G W+RLR++ G++ SRL+ AY+IWEQ G+ L EY+ + E+ +I+ETRDV Sbjct: 42534 HPFFSAGDKWKRLRSENVGGLTPSRLKMAYSIWEQSGRKLVEYIERARREQGDIIETRDV 42713 Query: 185 -----------------CFRYTAHVMADFIWGIDAGTLTRPMEQPNKVQEMASKWTSYAF 227 +R+TA+ MADFIWGIDAG+L+ + + Q+ ++ W+++AF Sbjct: 42714 RT*TETHFHQANITLQLAYRFTANAMADFIWGIDAGSLSGKVGEIGDFQKTSTDWSAHAF 42893 Query: 228 YMLTLFMATIVAPCSRLLLRFRFYPKETDEFFSNLTKESIELRLKAGDSTRTDYLSHLLQ 287 + F T+VA R L RF+ K TDEFF LT++++ LR RTDYLSHL+Q Sbjct: 42894 SSMIRFNKTLVAIFVRKLFSMRFFTKATDEFFLRLTQDAVNLRQGGSGEGRTDYLSHLIQ 43073 Query: 288 LRDQKQATHDDLVGHALTVMLDGYDTSGTALLHALY-----------------------Y 324 L+ + + HD VGHALTV LDG++TSG L H LY Y Sbjct: 43074 LQQRGNSIHDS-VGHALTVHLDGFETSGAVLYHMLYSVSKFHNFHILLHI*SIENNAVSY 43250 Query: 325 
VLAENPAVQQKLRVEILSCMASEKSLDFEKLSSLQYLEQ 363 L+E+ Q+KLR EIL +ASE + ++++++L YL+Q Sbjct: 43251 QLSEHHEEQEKLRSEILEALASEGQISYDQINNLPYLDQCFN 43484 ESLRLTTPIGFFMRICTKPTQINLGDDKTLDLEPGVTVMVPAYQYHHDNDIYPEASEFRPDRFENGAASV LTKRGCFLPFGDGPRICLG 43750 AC007571 Drosophila melanogaster chromosome 3 clone BACR48A14 (D551) comp(64150-64449) missing some internal sequence, exons not exact MLASIILSGWLLLAWLYFLWS RRRYYKVAWQLRGPIGWPLIGMGLQMMNPESKSWTGF* YMDGLSRQFKAPFISWMGTSCFLYINDPHSVEQILNSTHCTNKGDFYR FMSSAIGDGLFTSSSPRWHKHRRLINPAFGRQILSNFLPIFNAEAEVLLQ KLELEGVQHGKRLEIYQILKKIVLEAAC* QTTMGKKMNFQHDGSLCIFKAYNGLVQMLRHLIMVNNIC missing about 95 amino acids in the middle. Sequence similarity is too low to identify exons from genomic sequence WLITFAFSLTEVCVKRMLSPWLYPDLIYRRSGLFRLQQKVVGILFGFIEQVSYWSDQSCTH The sequence above probably contains an exon since it shares SPWL with AC007752 in the same region of a related gene. Exon boundaries are not clear. One other exon is probably not identified here. *LISQTFETTSTALYFTILCLAMHPCYQ EKLHKELVTELPPSGDINLEQLQRLEYTEMVINEAMRLFAPVPMVLRSAD QDIQLKRGDGEFLIPRGTQIGIDIYNMQRDERVWGPLSRTYNPDAHFGLD SPQRHAFAFVPFTKGLRMCIGYRYAQMLMKLLLARIFRSYRISTEARLEE LLVKGNISLKLKDYPLCRVERR* Three reading frames of the undefined region note there are 58-60 amino acids from the W of the C-heliX to the ETAM exon ETTMGKKMNFQHDGSLCIFKAYNGLVQMLRHLIMVNNICL*LDRSVCKTNAIAVALS*LNISSFRTFPPAAKGRRHTVWVYRTGKL LVRSKLHPLKMVSSIFRQLLEPIVSVVAANSNPDQQRSEMEMRGKSKAIFIEQVREHVERGQLSWQDVRDEANVTIAAVGFKHPVK ILYHDSFHRLSRPHRRHCTLPSSAWQCIPATRRSCTRNWS RDNDGQENELPARWIFMHI*SL*WVSSNAQAFNNG**HLPLA*PKCV*NECYRRGSILT*YIVVPDFSACSKRSSAYCLGL*NR*A TGPIKVAPTKNGELHLPTASRTHSLGSGRQFQSGSAAIRNGDEGQVQSHFH*AGEGACGARSAELAGCEG*GECDHCSGRFQTSSE DIVP*LISQTFETTSTALYFTILCLAMHPCYQEKLHKELVT RQRWARK*TSSTMDLYAYLKLIMG*FKCSGI**WLITFAFSLTEVCVKRMLSPWLYPDLIYRRSGLFRLQQKVVGILFGFIEQVSY WSDQSCTH*KW*APSSDSFSNP*SR*WPPIPIRISSDPKWR*GASPKPFSLSR*GSMWSAVS*VGRM*GMRRM*PLQR*VSNIQ*R YCTMTHFTDFRDHIDGIVLYHPLPGNASLLPGEAAQGTGH AC007752 best match to AC007571 can be used 
against one another to find exons 121000 region 3 frames ETTMGSDVKDEESFRSNSLLGRYQW*XXVLTWNRNASIH**HPGNHDRYVLLSVAQ*PILPAAGWKGIALLPGQN*DSSVYSEGEH LQLLPPLIHLRIS*PL*IIERKLAEDEMGALPSIQSNDKNLFLNLVTDLMRRGVFTLKNVEDESNIIVFGAFETTANAVYYTLMLL AMFPEYQERAFEEIK KPPWAAM*RMKRASGAIPCWGDTNGEF*XXLGIGMHLSINSILETMTDMCFSPWLNSRFCRQLAGKESHYYQAKTEIRQFIRKVST SNYSLRLFTYESLNPFR*LRESWPRTKWEHYLPSSPMIKTYF*IWSRI**DEECLP*RMLRMSRISSFLELLRPRLMQCTTP*CCW RCFRNTRRGPLRK*R RNHHGQRCKG*RELQEQFLAGAIPMVSFDLESECIYPLIASWKP*XXQICASLRGSIADFAGSWLERNRTTTRPKLRFVSLFGR*A PPTTPSAYSLTNLLTPLDN*EKVGRGRNGSTTFHPVQ**KPISKSGHGFDETRSVYPEEC*G*VEYHRFWSF*DHG*CSVLHPDVA GDVSGIPGEGL*GNK AL068269 AA141181 might be same as AC005130 302-502 comp(5-630) 38% with 6a2 very similar to AL062712 AAQAXFLXLAGFDTSSSTYHFLRCTSWPRTPXFRTPXNGXSGCLAV *PRSATELRHFTGLXYLRQVVDEVLRXYPPTXFLDRCCNSRTGYDL SPWNGGSPFKLRAGTXVYISVLGIHRDAQYWPNPESFDPERFSAEQ RQQHHPMTYLPFGAGPRXCIGXXLGQLEIKVGLLHILNXFRVEXCE RTLPEMRFDPKAFVLTAHNGTYLRFVKNXL* AC005130 DS01560, complete sequence AL062712 (BACR006M08), AL075733 (BACR037C16), AL059237 (BACR025F06) AC005811 DS05527 heme region 51% with 6a2 MAY BE SAME GENE AS AL068269 439-506 21-224 Probably new family 3 pieces 2520 FGDQFRELYERKEAAGRAIVGINVLHSHALLLRDPALIRRILVEDFPEF 2374 1859 CFLLLAGFDTSSSFALYELAKNPTIEHRLQAELRVDLQSSHNHQLSYDTLTGLVYLRQVLED 1674 1450 LPFGAGPRGCIGTLLGQLGIKVGLLHTLKHFRVELCERTLPEMRFDP* 1322 *CIHKVTLENRCICGQTIFTITWLS LFSFFCAALAVGSVVLLPLIALLAVWLWQRRHFRIWRRLGVPYLPAAPVL GNVLNVETAASVISFESYTSVRKLLGAPL* *FGDQFRELYERKEAAGRAIVGINVLHSHALLLRDPALIRR ILVEDFPEFSSSFKSTDAIRDTMGSGNLLFTKYKTWWETHKIFAVRTSGK* *YVFADAPYRNCGRHNPSARLLRSPGCFLLLAGFDTSSSFALYELA KNPTIEHRLQAELRVDLQSSHNHQLSYDTLTGLVYLRQVLEDDPQFHDEI SYPPRELQRIASV* *SDTCLAVYFIIFKIRNPKPRIL HQIDLTRATLTASLHALPFGAGPRGCIGTLLGQLGIKVGLLHTLKHFRVE LCERTLPEMRFDP* AC012831 comp(42975-44564) similar to AC014742 MIAVFSLIAAALAVGSLVLLPVVLRGGCLLVVTIVWLWQILHFWHWRRLGVPFVPAAPFVGNVWNLLRGACCFGDQFRELYESKEA AGRAFVGIDVLHNHALLLRDPALIKRIMVEDFAQFSSRFETTDPTCDTMGSQNLFFSKYETWRETHKIFAPFFAAGKVRNMYGLLE 
NIGQKLEEHMEQKLSGRDSMELEVKQLCALFTTDIIASLAFGIEAHSLQNPEAEFRRMCIEVNDPRPKRLLHLFTMFFFPRLSHRV GTHLYSEEYERFMRKSMDYVLSQRAESGENRHDLIDIFLQLKRTEPAESIIHRPDFFAAQAAFLLLAGFDTSSSTITFALYELAKN TTIQDRLRTELRAALQSSQDRQLSCDTVTGLVYLRQVVDEVLRLYPPTAFLDRCCNSRTGYDLSPWNGGSPFKLRAGTPVYISVLG IHRDAQYWPNPEVFDPERFSAEQRQQHHPMTYLPFGAGPRGCIGTLLGQLEIKVGLLHILNHFRVEVCERTLPEMRFDPKAFVLTA HNGTYLRFVKNSL* AC014742 Drosophila melanogaster, *** SEQUENCING IN PROGRESS ***, in ordered 1040-2504 suspect frame shift at about amino acid 9 since N-terminal not found in other frame This gene is missing K-helix to PERF region and has two in frame stop codons This is probably a pseudogene. MVIAFFIFL*CAALAVGSVVLLPLIALLAVWLWQRRHFRIWRRLGVPYLPAAPVLG 1204 NVLNVETAACCFGDQFRELYERKEAAGRAIVGINVLHSHALLLRDPALIRRILVEDFPEFSSSFKST 1398 DAIRDTMGSGNLLFTKYKTWWETHKIFAIRLGGRRIRSLLYGLLERI*QNLEAHMAQK 1557 LNGAESVELEVKQLCALFTTDIFAKFALQSLQNPEAEFRPMCIEVNDPKPKRLSHHLFTGF 1710 TPPI-YRVRTHLYSEEYERFMRKSMNYVLAQRAEN*EKRYDLIDMFLQMHRT 1841 ETAEGIIHRPDFYVAQAAFLLLAGFDTSSSFALYELAKNPTIEHRLQAELRVDLQSSHNHQLSYDTLTGLVYLRQVLED 2077 LPFGAGPRGCIGTLLGQLGIKVGLLHTLKHFRVELCERTLPEMRFDP*ASVLTAHNGTFLRFVRNSL* 2504 AL070820.1|CNS00FNQ Drosophila melanogaster genome survey sequence TET3 end 181-429 70-848 40% with 28A1 LSLCYTTEVVSDCGLGXSAXSFTDTPTPLLKXIKRVFNTSFEFIFYSVVTNLWQKV RKFY SVPFFNKETEVFFLDIIRRCITLRLEKPKQQRDDFLNYILQLQEKKGLHTDNILIN TMTFILDGFETTALVLAHIMLMLGRNPESK LTFDQMSELPHLDACI*LETLRLFSPQVAARKLVTEPFEFANKNGRTVHLKPGDVV TIPVKAL 794 HHDPQYYEDPLTFKPERF 848 AL062352 AL054245 AI108091 AI113367 AI064259 AI064268 36% with 6a2 FRPPGXITXXPXMTXTXYXLAKNEXLQDRLRQEIVDFFGDEDHISY ERIQEMPYLSQXVNETLRKYPIVGYIEREAIHPRTIPQHGVPHGMS IYMSTVAVHRDPQYWPDPEKYDPERFNSSNRDNLNMDAYMPFGVGP RNCIGMRLGLLQSKLGLVHILRNHRFHTCDKTIKKIEWAPTSPETF CTRRIISRFEAITGPAN* AC014186 AA803931 AA821188 AA803220 AI404794 MCCLQSTTLDRKKNASWNRSAIVGSDTSDAPEHCPL RCGALLAVLLAWQQRKCWRLIWQLNGWRGVIQQPVLWLLLCINLHPNS ILEKVSQYRVHFQRPLRVLVGTRVLLYIDDPAGMECVLNAPECLDKTFLQDGFFVRRGLLHAR 
GQKWKLRRKQLNPAFSHNIVASFFDVFNSVGNQMVEQFQTQTNLHGQAVKFTAAEDLLSRAVLE* DTSMGAQLDTQSVDHSPIIQAFHLSSKLLFKRMINPLLSSDWIFQRTQLWRDLDEQLQ VIHSQMESVIEKRAKELLDMGEPAGRAHNLLDTLLLAKFEGQSLSRREI RDEINTFVFXGVDTTTAAMSFVLYALAKFPETQTRLRKELQDVALDETTDLDALNGLPYLEALIKE VLRLYTIVPTTGRQTTQSTEIGGRTYCAGVTLWINMYGLAHDKEYYPDPYAFKPERWLPE DGAVAPPAFSYIPFSGGPHVCIGRRYSLLLMKLLTARLVREFQMELSPEQAPLRLEAQM VLKAQQGINVSFLKQ* AC014186 Drosophila melanogaster, *** SEQUENCING IN PROGRESS ***, in ordered pieces Length = 13825 Score = 269 bits (681), Expect = 5e-72 Identities = 155/195 (79%), Positives = 162/195 (82%), Gaps = 53/195 (27%) Frame = -3 Query: 1 MCCLQSTTLDRKKNASWNRSAIVGSDTSDAPEHCPL------------RCGALLAVLLAW 48 MCCLQSTTLDRKKNASWNRSAIVGSDTSDAPEHCPL +LA LA Sbjct: 968 MCCLQSTTLDRKKNASWNRSAIVGSDTSDAPEHCPLGMRRPSGGAPRLAAAKVLATHLAA 789 Query: 49 Q-----------------QRKCWRLIWQLNGWRGVIQQPVLWLLLCINLHPNSILEKVSQ 91 + Q +L + RG+ P ++ ILEKVSQ Sbjct: 788 ERMARSDPAAGAMVTTLHQSASEQLSVESKAIRGISSYPDAFIA--------GILEKVSQ 633 Query: 92 YRVHFQRPLRVLVGTRVLLYIDDPAGMECVLNAPECLDKTFLQDGFFVRRGLLHAR---- 147 YRVHFQRPL VLVGTRVLLYIDDPAGMECVLNAPECLDKTFLQDGFFVRRGLLHAR Sbjct: 632 YRVHFQRPLAVLVGTRVLLYIDDPAGMECVLNAPECLDKTFLQDGFFVRRGLLHARGKYL 453 Query: 148 ------------------GQKWKLRR--LNPAFSHNIVASFFDVFNSVGNQMVEQFQTQT 187 GQKWKLRR LNPAFSHNIVASFFDVFNSVGNQMVEQFQTQT Sbjct: 452 NS*LSKLCN*LYLPANF*GQKWKLRRKQLNPAFSHNIVASFFDVFNSVGNQMVEQFQTQT 273 Query: 188 NLHGQAVK 195 NLHGQAVK Sbjct: 272 NLHGQAVK 249 PSNHQTEPLKTTLIASVCAAYKAPH*IEKKTRRGIARPLLVATLVMHLNIALWACGALLAVLLAWQQRKCWRLIWQLNGWRGVIQQ PVLWLLLCINLHPNS*V*SRRPLEE*VLIRMHSLQAYWRRYPNIGCTSSDRWPFSLAPESCSISMIPPAWSAFSMRPNAWTRPFCR MASLFDAVCSMPEVNT*IADYPNYAISFIYLQIFEGKNGNCAASN*IPPSVTISWPVSSMSSIRWAIRWWNSSKRRRIFMDRRLSL RLPRICLVELYSRYPVVSSTTQLVNNKHLKSKTS*LTCLNLLKCQQSIN*LQSTFKWNFINKLLIAFTYLRGTDTFPPFPC AIKSSN*ATKNHVDCKCMCCLQSTTLDRKKNASWNRSAIVGSDTSDAPEHCPLGMRRPSGGAPRLAAAKVLATHLAAERMARSDPA AGAMVTTLHQSASEQLSVESKAIRGISSYPDAFIAGILEKVSQYRVHFQRPLAVLVGTRVLLYIDDPAGMECVLNAPECLDKTFLQ 
DGFFVRRGLLHARGKYLNS*LSKLCN*LYLPANF*GQKWKLRRKQLNPAFSHNIVASFFDVFNSVGNQMVEQFQTQTNLHGQAVKF TAAEDLLSRAVLEVSCCK*HNSTS***TFKI*N*LINLLKFIEMPAID*LIAINIQMEFYKQITNCIYLPARDGHVPSVSM SHQIIKLSH*KPR*LQVYVLPTKHHIRSKKKRVVESLGHCW*RH**CT*TLPFGHAAPFWRCSSPGSSESAGDSSGS*TDGAE*SS SRCYGYYSASICIRTVECRVEGH*RNKFLSGCIHCRHTGEGIPISGALPATAGRSRWHPSPALYR*SRRHGVRSQCARMPGQDLSA GWLLCSTRSAPCQR*IPK*LIIQIMQLALFTCKFLRAKMEIAPQATESRLQSQYRGQFLRCLQFGGQSDGGTVPNADESSWTGG*V YGCRGFA*SSCTRGILL*VAQLN*LIINI*NLKLVN*LA*IY*NASNRLINCNQHSNGIL*TNY*LHLLTCEGRTRSLRFHA blast with 4d2 etam onward gb|AC014186.1|AC014186 Drosophila melanogaster, *** SEQUENCING IN PROGRESS ***, in ordered pieces Length = 13825 Score = 123 bits (306), Expect = 8e-28 Identities = 70/185 (37%), Positives = 114/185 (60%), Gaps = 5/185 (2%) Frame = +2 Query: 130 EGHDTTTSAISFCLYEISRHPEVQQRLQQEIRDVLGEDRKSPVTLRDLGELKFMENVIKE 189 +G DTTT+A+SF LY +++ PE Q RL++E++DV ++ L L L ++E +IKE Sbjct: 3335 QGVDTTTAAMSFVLYALAKFPETQTRLRKELQDVALDETTD---LDALNGLPYLEALIKE 3505 Query: 190 SLRLHPPVPMIGRWFAEDVEIRGKHIPAGTNFTMGIFVLLRDPEYFESPDEFRPERFDAD 249 LRL+ VP GR + EI G+ AG + ++ L D EY+ P F+PER+ + Sbjct: 3506 VLRLYTIVPTTGRQTTQSTEIGGRTYCAGVTLWINMYGLAHDKEYYPDPYAFKPERWLPE 3685 Query: 250 VPQIHP--YAYIPFSAGPRNCIGQKFAMLEMKSTVSKLLRHFELLPLGPEP---RHSMNI 304 + P ++YIPFS GP CIG+++++L MK ++L+R F+ + L PE R + Sbjct: 3686 DGAVAPPAFSYIPFSGGPHVCIGRRYSLLLMKLLTARLVREFQ-MELSPEQAPLRLEAQM 3862 Query: 305 VLRSANGVHL 314 VL++ G+++ Sbjct: 3863 VLKAQQGINV 3892 Score = 65.2 bits (156), Expect = 3e-10 Identities = 42/141 (29%), Positives = 74/141 (51%) Frame = +3 Query: 1 ETAMGTKINAQKNPNLPYVQAVNDVTNILIKRFIHAWQRVDWIFRLTQPTEAKRQDKAIK 60 +T+MG +++ Q + P +QA + + +L KR I+ DWIF+ TQ + D+ ++ Sbjct: 2934 DTSMGAQLDTQSVDHSPIIQAFHLSSKLLFKRMINPLLSSDWIFQRTQLW--RDLDEQLQ 3107 Query: 61 VMHDFTENIIRERRETLVNNSKETTPEEEVNFLGQKRRMALLDVLLQSTIDGAPLSDEDI 120 V+H E++I +R + L++ + R LLD LL + +G LS +I Sbjct: 3108 VIHSQMESVIEKRAKELLDMGEPAG-----------RAHNLLDTLLLAKFEGQSLSRREI 3254 Query: 121 REEVDTFMFEGHDTTTSAISF 141 R+E++TF+F T+ SF 
Sbjct: 3255 RDEINTFVFAVGFPVTTNYSF 3317 4d2 for comparison VVGVLLLVAFATLLLWDFLWRRRGNGILPGPRPLPFLGNLLMYR GLDPEQIMDFVKKNQRKYGRLYRVWILHQLAVFSTDPRDIEFVLSSQQHITKNNLYKL LNCWLGDGLLMSTGRKWHGRRKIITPTFHFKILEQFVEIFDQQSAVMVEQLQSRADGM TPINIFPVICLTALDIIAETAMGTKINAQKNPNLPYVQAVNDVTNILIKRFIHAWQRV DWIFRLTQPTEAKRQDKAIKVMHDFTENIIRERRETLVNNSKETTPEEEVNFLGQKRR MALLDVLLQSTIDGAPLSDEDIREEVDTFMFEGHDTTTSAISFCLYEISRHPEVQQRL QQEIRDVLGEDRKSPVTLRDLGELKFMENVIKESLRLHPPVPMIGRWFAEDVEIRGKH IPAGTNFTMGIFVLLRDPEYFESPDEFRPERFDADVPQIHPYAYIPFSAGPRNCIGQK FAMLEMKSTVSKLLRHFELLPLGPEPRHSMNIVLRSANGVHLGLKPRA AI404831 GH24669 52% to 6d2 N-term AC008197, AC009846, AA141002 C-term MFSLILLAVTLLTLAWFYLKRHYEYWERRGFPFEKHSGIPFGCLDSVWRQEKSMGLAIYDVYV KSKERVLGIYLLFRPAVLIRDADLARRVLAQDFASFHDRGVYVDEERDPLSANIF SLRGQSWRSMRHMLSPCFTSGKLKSMFSTSEDIGDKMVAHLQKELPEEGFKEVDIKKVMQ NYAIDIIASTIFGLDVNSFENPDNKFRKLVSLARANNRFNAMFG VSNGCSCRIAQFLFRIGFKNPVGLAMLQIVKETVEYREKHGIVRKDLLQLLIQLRNTGKIDENDEKSFSIQKTPDD IKTISLEAITAQAFIFYIAGQETTGSTAAFTIYELAQYPELLKRLQDEVDETLAKNDGKITYDSLNKMEFLDLCVQETIRKYPGLP ILNRECTQDYTVPDTNHVIPKGTPVVISLYGIHHDAEYFPDPETYDPERFSEESRNYNPTAFMPFGEGPRICIAQRMGRINSKLAI IKILQNFNVEVMSRSEIEFENSGIALIPKHGVRVRLSKRVPKLS* 6d1 for comparison MLLLLLLIVVTTLYIFAKLHYTKWERLGFESDKATIPLGSMAKV FHKERPFGLVLSDIYDKCHEKVVGIYLFFKPALLVRDAELARQILTTDFNSFHDRGLY VDEKNDPMSANLFVMEGQSWRTLRMKLAPSFSSGKLKGMFETVDDVAAKLLNHLNERL KDGQSHVLEIKSILTTYAVDINGSVIFGLEIDSFTHPDNEFRVLSDRLFNPKKSTMLE RFRNLSTFMCPPLAKLLSRLGAKDPITYRLRDIVKRTIEFREETGVVRKDLLQLFIQL RNTGKISDDNDKLWHDVESTAENLKAMSIDMIASNSFLFYIAGSETTAATTSFTIYEL AMYPEILKKAQSEVDECLQRHGLKPQGRLTYEAIQDMKYLDLCVMETTRKYPGLPFLN RKCTQDFQVPDTKLTIPKETGIIISLLGIHRDPQYFPQPEDYRPERFADESKDYNPAA YMPFGEGPRHCIAQRMGVINSKVALAKILANFNIQPMPRQEVEFKFHSAPVLVPVNGL NVGLSKRW gb|AC009846.2|AC009846 Drosophila melanogaster chromosome 3 clone BACR23F10 (D1097) RPCI-98 23.F.10 map 94C-94D strain y; cn bw sp, WORKING DRAFT SEQUENCE, 79 unordered pieces Length = 110926 Score = 425 bits (1080), Expect = e-119 Identities = 207/207 (100%), Positives = 207/207 (100%) Frame = 
+2 Query: 1 MFSLILLAVTLLTLAWFYLKRHYEYWERRGFPFEKHSGIPFGCLDSVWRQEKSMGLAIYD 60 MFSLILLAVTLLTLAWFYLKRHYEYWERRGFPFEKHSGIPFGCLDSVWRQEKSMGLAIYD Sbjct: 28877 MFSLILLAVTLLTLAWFYLKRHYEYWERRGFPFEKHSGIPFGCLDSVWRQEKSMGLAIYD 29056 Query: 61 VYVKSKERVLGIYLLFRPAVLIRDADLARRVLAQDFASFHDRGVYVDEERDPLSANIFSL 120 VYVKSKERVLGIYLLFRPAVLIRDADLARRVLAQDFASFHDRGVYVDEERDPLSANIFSL Sbjct: 29057 VYVKSKERVLGIYLLFRPAVLIRDADLARRVLAQDFASFHDRGVYVDEERDPLSANIFSL 29236 Query: 121 RGQSWRSMRHMLSPCFTSGKLKSMFSTSEDIGDKMVAHLQKELPEEGFKEVDIKKVMQNY 180 RGQSWRSMRHMLSPCFTSGKLKSMFSTSEDIGDKMVAHLQKELPEEGFKEVDIKKVMQNY Sbjct: 29237 RGQSWRSMRHMLSPCFTSGKLKSMFSTSEDIGDKMVAHLQKELPEEGFKEVDIKKVMQNY 29416 Query: 181 AIDIIASTIFGLDVNSFENPDNKFRKL 207 AIDIIASTIFGLDVNSFENPDNKFRKL Sbjct: 29417 AIDIIASTIFGLDVNSFENPDNKFRKL 29497 INVAVII*NSSL*RIKRTLLPKRTNNRRISAHIIGGSPV**SAIY*LSQIKGGFKKKKRCTTSTMFSLILLAVTLLTLAWFYLKRH YEYWERRGFPFEKHSGIPFGCLDSVWRQEKSMGLAIYDVYVKSKERVLGIYLLFRPAVLIRDADLARRVLAQDFASFHDRGVYVDE ERDPLSANIFSLRGQSWRSMRHMLSPCFTSGKLKSMFSTSEDIGDKMVAHLQKELPEEGFKEVDIKKVMQNYAIDIIASTIFGLDV NSFENPDNKFRKLVSLARANNRFNAMFGMMIFLVPS*VISESEFCYGVSNGCSCRIAQFLFRIGFKNPVGLAMLQIVKETVEYREK HGIVRKDLLQLLIQLRNTGKIDENDEKSFSIQKTPDGKS*YKLNSNQDRLSGLTFSGHIKTISLEAITAQAFIFYIAGQETTGSTA AFTIYELAQYPELLKRLQDEVDETLAKND YQCSCDHLEFFAVKN*ENVTAKENK*STNFSAYNWRLSSLIVSYILTFTDKGRI*KKEKVHYFDYVFADFIGRNAIDFGVVLSEAP L*VLGATRISI*KTLRDSIRLLGQCVAAGEEHGLGHLRCVCEE*RARLGHLFALPSGCFDQRRRSGSPCSGPGFRQFPRSRRLR** GTGSPVGQYLLASRSELAIDEAHVVAMFHIRKVEEHVQHIRGYW*QDGGPSAKGAAGGGLQGGGHKESDAKLCH*HYSLDYLWLGC K*LRKS**QVP*TGVVGQSQ**IQCHVRHDDLPGALVSNK*E*VLLWCV*WMFL*DSTVPIQNRF*KSSWPGHVADCQGNR*VSRE TWNSAQGSVAVAHSAEEYRQDRRE*REVL*HSEDTRW*VIIQVKF*SR*IEWFNVFRSY*NHILGGHHRSGFYILHRWSGDHRIDC SLYHLRAGSVSRATEAPAG*SGRDTGQKR LSM*L*SFRILRCKELRERYCQREQIIDEFQRI*LEALQFDSQLYIDFHR*RADLKKRKGALLRLCFR*FYWP*RY*LWRGSI*SA TMSTGSDADFHLKNTPGFHSVAWTVCGGRRRAWAWPSTMCM*RVKSASWAFICSSVRLF*SETPIWLAVFWPRISPVSTIAAFTLM RNGIPCRPISSRFAVRAGDR*GTCCRHVSHPES*RACSAHPRILVTRWWPICKRSCRRRASRRWT*RK*CKTMPLTL*PRLSLAWM 
*IASKILITSSVNWCRWPEPIIDSMPCSA**SSWCPRK**VRVSFAMVCLMDVLVG*HSSYSE*VLKIQLAWPCCRLSRKPLSIAR NME*CARICCSCSFS*GIPAR*TRMTRSPLAFRRHPMVSHNTS*ILIKID*VV*RFQVILKPYPWRPSPLRLLYSTSLVRRPPDRL QPLPSTSWLSIQSY*SACRMKWTRHWPKT AC008197 Drosophila melanogaster chromosome 3 clone BACR02L12 (D753) RPCI-98 02.L.12 map 94B-94C strain y; cn bw sp, WORKING DRAFT SEQUENCE, 113 unordered pieces Length = 125235 Score = 425 bits (1080), Expect = e-119 Identities = 207/207 (100%), Positives = 207/207 (100%) Frame = +2 67979 Sbjct: 67979 MFSLILLAVTLLTLAWFYLKRHYEYWERRGFPFEKHSGIPFGCLDSVWRQEKSMGLAIYD 68158 Sbjct: 68159 VYVKSKERVLGIYLLFRPAVLIRDADLARRVLAQDFASFHDRGVYVDEERDPLSANIFSL 68338 Sbjct: 68339 RGQSWRSMRHMLSPCFTSGKLKSMFSTSEDIGDKMVAHLQKELPEEGFKEVDIKKVMQNY 68518 Sbjct: 68519 AIDIIASTIFGLDVNSFENPDNKFRKL 68599 VISESEFCYGVSNGCSCRIAQFLFRIGFKNPVGLAMLQIVKETVEYREKHGIVRKDLLQLLIQLRNTGKIDENDEKSFSIQKTPDD IKTISLEAITAQAFIFYIAGQETTGSTAAFTIYELAQYPELLKRLQDEVDETLAKNDGKITYDSLNKMEFLDLCVQETIRKYPGLP ILNRECTQDYTVPDTNHVIPKGTPVVISLYGIHHDAEYFPDPETYDPERFSEESRNYNPTAFMPFGEGPRI SIRQIYE*VKFSIIVLP*R*TLLVFRSPQDVDVYTGALSEPPLDGAIFGPLLSCMVSDQFLRLKLGDSHWYERKMGPQKFTKGESP WKSYFLIKNRVYPDS*AQLAEIYKTSLAAIICRNSDGITRVREHVMQRLRDGGNPHVDCQDLEGFHFNFEPWSEKQQPQDLHSAGI SRGSTSVRVMSKANHQAHNVTLHIDKGI*LETRFNFD*EGFNIKYYKKHILLNRFLLF*NRVAMLKLVN*TAAHHKSTISQRILRL *INF*VFCI**LLAYNIRALFCISPLI*SCHQDYQCSFLGGFFFPQIFFF*KGKKFFLFGCKELRERYCQREQIIDEFQRI*VEAL QFDSQLYIDFRRERVDLKKRKGALLRLCFR*FYWP*RY*LWRGSI*SATMSTGSDADFHLKNTPGFHSVAWTVCGGRRRAWAWPST MCM*RVKSASWAFICSSVRLF*SETPIWLAVFWPRISPVSTIAAFTLMRNGIPCRPISSRFAVRAGDR*GTCCRHVSHPES*RACS AHPRILVTRWWPICKRSCRRRASRRWT*RK*CKTMPLTL*PRLSLAWM*IASKILITSSVNWCRWPEPIIDSMPCSA**SSWCPRK **VRVSFAMVCLMDVLVG*HSSYSE*VLKIQLAWPCCRLSRKPLSIARNME*CARICCSCSFS*GIPAR*TRMTRSPLAFRRHPMV SHNTS*ILIKID*VV*RFQVILKPYPWRPSPLRLLYSTSLVRRPPDRLQPLPSTSWLSIQSY*SACRMKWTRHWPKTTERSPTIP* TRWNFWTCVCRKPFANIRVFPF*IVNAPRTTQYPTPTTSFRRERQL*SPCTAFTMMQSTSRIPRPMIPNASRRRVAIIIPLHSCRL ARVPGF LHPADL*VSKVLDYSIALKVNTFGF*ISAGCGRLYGCT**AATGWSYFWSVAQLHGFRPVPTPQTR*LPLVRTENGPTEVHQR*ES 
LEVLFSH*ESCLS*FISSIGGDLQDQSGRHHLSQLRWNHPSEGACYAETARWR*SPCGLPGPGGIPFQFRTLVREATAPGSS*CRH KQGIHFGASNVQGKSSSPQRNPTY**RNLVRNKIQF*LGRI*Y*IL*KTYTSKPVPSILK*SCYAKVS*LNSSPS*INNKSTNSKI INQLLSILYIIIVSL*HKSFVLHFTINIKLSSRLSM*LFGGVFFPPNFFFLKREKIFSFRL*RIKRTLLPKRTNNRRISAHISGGS PV**SVIY*LSQRKGGFKKKKRCTTSTMFSLILLAVTLLTLAWFYLKRHYEYWERRGFPFEKHSGIPFGCLDSVWRQEKSMGLAIY DVYVKSKERVLGIYLLFRPAVLIRDADLARRVLAQDFASFHDRGVYVDEERDPLSANIFSLRGQSWRSMRHMLSPCFTSGKLKSMF STSEDIGDKMVAHLQKELPEEGFKEVDIKKVMQNYAIDIIASTIFGLDVNSFENPDNKFRKLVSLARANNRFNAMFGMMIFLVPS* VISESEFCYGVSNGCSCRIAQFLFRIGFKNPVGLAMLQIVKETVEYREKHGIVRKDLLQLLIQLRNTGKIDENDEKSFSIQKTPDG KS*YKLNSNQDRLSGLTFSGHIKTISLEAITAQAFIFYIAGQETTGSTAAFTIYELAQYPELLKRLQDEVDETLAKNDGKITYDSL NKMEFLDLCVQETIRKYPGLPILNRECTQDYTVPDTNHVIPKGTPVVISLYGIHHDAEYFPDPETYDPERFSEESRNYNPTAFMPF GEGPRI TPSGRFMSK*SSRL*YCLKGKHFWFLDLRRMWTSIRVHLVSRHWMELFLVRCSAAWFPTSSYASN*VTPIGTNGKWAHRSSPKVRV LGSPIFSLRIVFILIHKLNWRRSTRPVWPPSFVATQMESPE*GSMLCRDCAMEVIPMWIARTWRDSISISNPGPRSNSPRIFIVPA *AGDPLRCE*CPRQIIKPTT*PYILIKESS*KQDSILIRKDLILNIIKNIYF*TGSFYFKIELLC*S*LTKQQPIINQQ*VNEF*D YKSTFKYFVYNNC*LIT*ELCFAFHH*YKAVIKTINVAFWGGFFSPKFFFFKKGKNFFFSAVKN*ENVTAKENK*STNFSAYKWRL SSLIVSYILTFAEKGWI*KKEKVHYFDYVFADFIGRNAIDFGVVLSEAPL*VLGATRISI*KTLRDSIRLLGQCVAAGEEHGLGHL RCVCEE*RARLGHLFALPSGCFDQRRRSGSPCSGPGFRQFPRSRRLR**GTGSPVGQYLLASRSELAIDEAHVVAMFHIRKVEEHV QHIRGYW*QDGGPSAKGAAGGGLQGGGHKESDAKLCH*HYSLDYLWLGCK*LRKS**QVP*TGVVGQSQ**IQCHVRHDDLPGALV SNK*E*VLLWCV*WMFL*DSTVPIQNRF*KSSWPGHVADCQGNR*VSRETWNSAQGSVAVAHSAEEYRQDRRE*REVL*HSEDTRW *VIIQVKF*SR*IEWFNVFRSY*NHILGGHHRSGFYILHRWSGDHRIDCSLYHLRAGSVSRATEAPAG*SGRDTGQKRRKDHLRFP EQDGIFGPVCAGNHSQISGSSHSES*MHPGLHSTRHQPRHSEGNASCDLPVRHSP*CRVLPGSRDL*SRTLLGGESQL*SHCIHAV WRGSQDL gb|AA141002|AA141002 CK01076.5prime CK Drosophila melanogaster embryo BlueScript Drosophila melanogaster cDNA clone CK01076 5prime Length = 704 Score = 119 bits (295), Expect = 2e-27 Identities = 63/103 (61%), Positives = 67/103 (64%), Gaps = 1/103 (0%) Frame = -1 Query: 73 
KNDGKITYDSLNKMEFLDLCV-QETIRKYPGLPILNRECTQDYTVPDTNHVIPKGTPVVI 131 K GK T + CV ++ K PG PI VIPKGTPVVI Sbjct: 704 KXTGKXT*XFPEQDGIFGTCVCRKPFAKXPGXPIXEWGMHPG*HSTGHQPVIPKGTPVVI 525 Query: 132 SLYGIHHDAEYFPDPETYDPERFSEESRNYNPTAFMPFGEGPRI 175 SLYGIHHDAEYFPDPETYDPER SEE+RNYNPTAFMPFGEGPRI Sbjct: 524 SLYGIHHDAEYFPDPETYDPERXSEENRNYNPTAFMPFGEGPRI 393 AI064680 N-TERM TO C-HELIX AL061315 AI404867 AI406053 AI1516839 MSALIFLCAILIGFVIYSLISSARRPKNFPPGPRFVPWLGNTLQFRKEASAVGGQHILFERWA KDFRSDLVGLKLGREYVVVALGHEMVKEVQLQEVFEGRPDNFFLRLRTMGTRKGITCTDG QLWYEHRHFAMKQMRNVGYGRSQMEHHIELEAEELLGQLERTEEQPIEPVTWLAQ SVLNVLWCLIAGKR AC009382 Drosophila melanogaster chromosome 3L/76B4 clone RPCI98-1B21, WORKING Query: 1 MSALIFLCAILIGFVIYSLISSARRPKNFPPGPRF 35 MSALIFLCAILIGFVIYSLISSARRPKNFPPG F Sbjct: 27001 MSALIFLCAILIGFVIYSLISSARRPKNFPPGEFF 26897 Score = 37.9 bits (86), Expect = 0.029 Identities = 18/25 (72%), Positives = 20/25 (80%), Gaps = 2/25 (8%) Frame = +2 Query: 28 NFPP--GPRFVPWLGNTLQFRKEASAV 52 NF P GPRFVPWLGNTLQ R S++ Sbjct: 26396 NFIPKTGPRFVPWLGNTLQVRLHKSSL 26476 AL078186 MLTSVFYVLFAIAITIILISYVFLLLKCKQKAFVVIGLLYQEKKY QCFDQAPGPHPWPIIGNINLLGRFQYNPFYGFGTLTKKYGDIYSLSLGHT RCIVVNNVDLIKEVLNKNGKYFGGRPDFFRYHKLFGGDRNNCKFIXXLRF AC014810 in ordered pieces 27% to Cyp18 new family = AL065712 MLAALIYTILAILLSVLATSYICIIYGVKRRVLQPVKTKNSTEINHNAYQKYTQAPGPRP WPIIGNLHLLDRYRDSPFAGFTALAQQYGDIYSLTFGHTRCLVVNNLELIREVLNQNGKV MSGRPDFIRYHKLFGGERSNSLALCDWSQLQQKRRNLARRHCSPREFSCFYM KMSQIVARKWSTGIRELGNQLVP GEPINIKPLILKACANMFSQYMCSLRFDYDDVDFQQIVQYFDEIFWEINQGHPLDFLPWL YPFYQRHLNKIINWSSTIRGFIMERIIRHRELSVDLDEPDRDFTD ALLKSLLEDKDVSRNTIIFMLEDFIGGHSAVGNLVMLVLAYIAKNVDIGRRIQEEIDAII EEENRSINLLDMNAMPYTMATIFEVLRYSSSPIVPHVATEDTVISGYGVTKGTIVF INNYVLNTSEKFWVNPKEFNPLRFLEPSKEQSPKHF LPFSIGKRTCIGQNLVRGFGFLVVVNVMQRYNISSHNPSTIKISPESLALPADCFPLVLTPREKIGPL* AC008307 AC015141 AC007725 chromosome 3 clone BACR03D22 (D709) RPCI-98 324-511 45427-44854 45427 TAFSSQWALFALSKEPRLQQRLAKERATNDSRLMHGLIKESLRLY 45293 45292 PVAPFIGRYLPQDAQLGGHFIXXX 45230 45165 
TMVLLSLNTAGRDPSHFEQPERVLPERWCIGETEQVHKSHGSLPFAIGQRSCIGRRVALK 44986 44985 QLHSLLGRCAAQFEMSCLNEMPVDSVLRMVTVPDRTLRLALRPR 44854 AC015396 Drosophila melanogaster, Cyp302a1 dib1 gene 1-183 32005-32502 184-493 32643-33702 36% to 12A3 mito? = AL099182 GSS MLTKLLKISCTSRQCTF AKPYQAIPGPRGPFGMGNLYNYLPGIGSYSWLRLHQAGQDKYEKYGAIVRETIVPGQDI 32181 VWLYDPKDIALLLNERDCPQRRSHLALAQYRKSRPDVYKTTGLLPTNGPEWWRIRAQ 32352 VQKELSAPKSVRNFVRQVDGVTKEFIRFLQESRNGGAIDMLPKLTRLNLE 32502 32643 VTCLLTFGARLQSFTA QEQDPRSRSTRLMDAAETTNSCILPTDQGLQLWRFLETPSFRKLSQAQSYMESVALELVE 32870 ENVRNGSVGSSLISAYVKNPELDRSDVVGTAADLLLAGIDTTSYASAFLLYHIA 33032 RNPEVQQKLHEEARRVLPSAKDELSMDALRTDITYTRAVLKESLRLNPIAVGVGR 33197 GRGQDLNQDAIFSGYFVPKG TTVVTQNMVACRLEQHFQDPLRFQPDRWLQHRSALNPYLVLPFGHG 33441 MRACIARRLAEQNMHI 33489 LLLRLLREYELIWSGSDDEMGVKTLLINKPDAPVLIDLRLRRE* 33705
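Several entries above list "three reading frames" of an undefined genomic region (for example AC007571 and AC007752), and one entry (AC014742) is flagged as a probable pseudogene because of in-frame stop codons. The following is a minimal sketch of how such forward-frame translations could be produced and checked, assuming the standard genetic code and the same notation as above ('*' for a stop, 'X' for a codon containing an ambiguous base). The function names translate, three_frames and internal_stops are hypothetical and are not the tools used to build this page. Regions marked comp(...) would need to be reverse-complemented first, which is not shown here.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate dna from offset frame (0, 1 or 2).

    '*' marks a stop codon; 'X' marks any codon not in the table
    (for example one containing N). A trailing partial codon is dropped.
    """
    seq = dna.upper()[frame:]
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, usable, 3))


def three_frames(dna):
    """Return the three forward-frame translations as a list of strings."""
    return [translate(dna, f) for f in range(3)]


def internal_stops(protein):
    """Count '*' characters other than a single terminal stop."""
    body = protein[:-1] if protein.endswith("*") else protein
    return body.count("*")


if __name__ == "__main__":
    # Illustrative sequence only; it is not taken from any clone listed above.
    demo = "ATGGAAACCGCGATGTAA"
    for offset, protein in enumerate(three_frames(demo)):
        print(offset, protein, internal_stops(protein))
```

Note that internal_stops is only meaningful on raw translations: in the curated sequences on this page an internal asterisk marks an exon boundary, not a stop codon.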