P. sojae P450s
P450s found in the genome BLAST server at JGI
D. Nelson
July 27, 2004
A tree of 65 Stramenopile or Chromista sequences
>CYP5014A1
sca20 47% to scaf_20e 361222 to 363018 minus strand, C-term long
MLAFLPTLAASDSRQNVVTS
ALIALLLGASTYATLSHIERSRAKHQRHKEGLPVPRPSTTLPIMGNTLDFVKNNDVFHDC
VSSLVQEFNGEPFLLSAPGRPDILVVSTPEAFEGVTKRQFDTFVKGD*LHEMFYDLLGNA
LTNSDGDVWQFQRKIFAKLFSARALRESMTSTIQKHGRTMHTLFENGAASGASFDLFRLL
SRFAMESFAEIGFGIQMGSLAIGEDHPFEKAFDIAEEATAKRFSVPAWFWKLQRLLSVGS
EGQLQRAIQVIDSTVLKFIYESIAGRARDEKRTGGAQNIVSLALDSCDLEGEADPQLLRS
IAIAAIIAGRDTTSETLSWFFYTLSQHPEVERNIRTEMLERIPRLVLETGYFPAMDEVQS
LTYLEAAIKETLRLYPPASFNIKHCSADIFLSDGTFIPEGTTIGLPSYAMGRMTSTWGPD
CNEYKPERFLDPDTGKLLSVSPFQFPAFFAGLRICVGMNLAMLEMKIVLTGLLSRFLSRL
QTGFDI
VASTFQLVHDLGLRCPPVLQVRRVGGVHVNLHAPQFFHNRSVKHPSANAFVVKE
TLSGFSRSPSLITHVNTFNSAAPSELVFVFLQTLALPV*
>CYP5014B1
sca11 56039 to 56596 plus strand C-term frag., 73% to scaf_20d (ortholog)
56039
SKIRHELASKLPEL
VNGSISSPSMAQVNELVYLEAVVKEAMRLNPAVPSNIREALEDVVLCDGTVVKAGEAVSW
SSYSMGRMPHVWGPDAKQFKPERWIDATTGKLMAVSPLKFPLFNAGPRVCLGTKLAMMEI
KITTASVLSKYNLTAVPGHQVTYRLSLSLAMKDGFKVNVRKATAASFDGVA*
56596
>CYP5014C1
sca45a 79% to scaf_20c 111379 to 113001 minus strand, no introns
MLSVSALKLETPLHHALAVTSFLLLPLVIQLSRRIGSSSAETPEAFKERADSEPERREAGR
PPWTLPVLHNTLGFLLAGNNLHEWITRTCERFEGNPFTVKVLGLPRMLVVSTPEAFEDVL
KYQFMNFPKGPQYSENMKDLLGDGLFAADGVKWAHQRDIAHGLFRTKELRECMVKAITRH
TMALHDVLKQICARNRSVDLYKLLSCFSTEAFADISFGLKMDCLRANKELPFQAAFDRAQ
RLTALRFVRPRWFWKMQRRLGLGAEDQLQLDIKEIDATVLSIVQRVLAQRAMAPEDKDSN
MLSLYLDAIARSSGTDEQLYDPVHLRDVVVNFLVAGRDTTAQALSWFFFCVSQNPRVESK
LRREIYKKLPELMTAESCVPTLEQVNKLVYLEAVIKETLRLYPSMPIAPKYAVRDTVLSD
GTFVAAGSMVCLPLYAMGRMPHAWGPDAAEFKPERWVDPVTKKITSVSAFKFVAFNGGPR
MCLGSSLAGLELKLVAAALLSRFHIYVENPEDVGFGFSLTLPVKGPMNARLARVSASFG*
>CYP5014D1
sca45c 80% to scaf_20e, 139685 to 141217 minus strand
MTDKLSSSVAVAALSGLVVLPLA*RLLHVDKDKSQLST
RKVVRPATTLPVLGNTLDVIKNLPIRCDWLTSLCQDAQGEPVLLQSLGTPDTTLLSTPQAF
EDVFKNQFDNFPKGPKKSEYLCELLREGIFAVDNEKWYRQRKTASNLFTMRALRDSMTST
IQRHLVVLDRIFNRAAETDDTLDLFRLLNRFTMEAFTEIGFGVHMNWLDSDKEHPFQTAF
DQSQQLLVLRFVRPSWFWKAQRMMGVGAEGQLQRELHVIHSTIFDIVAQNLQNRAKGEND
KAGMDIVSLFLDDLNRSGDADESCFDPTYLRDIVVNFIIAGRDTTAQALSWFFYCLSHNP
QVETKIRKELRAKLPRLFSGDCSPSMDEVSELTYVEAALRETLRLYPSVPIVNKEAVHDT
VLSDGTFIAAGTVAALPMYALGRMTHFWGPDAAEFKPERWIDAQAGKLISALAFKFVAFN
AGPRLCLGKNLAMLEMKLIVASLLSKYRVELERPEDVTYAISKDLLVGESP*
>CYP5014D2
sca45b 78% to scaf_20b 115947 to 117587 minus strand transc. 136349
117587
MLSVSSLRNKLPFNPVKLGIGTLVFASVVVLLAKPPYEPTKKEDPSKKSH
117437
RKIHRPEATLPVLENTLTVIEAARAGDIHDRTLLSCRESNAEPVLVRSIGVPDQLIVCTPEAFEDVLKLEFSN
FPKGSYQCENLRDLLGDGIFAVDGEQWVHQRKTASNLFTMRALRDSMAFVIQRHAVVLYD
ILRQTSESNETLDLFKLLNRFTIEAFTEIGFGVHMGCLDSEEEHPFQKAFDHAQRALLLR
FVRPGWFWELQKWLGVGAEGQLKNDIEVINKTVLDIVEKALAKRSSIGSGIEIDGSASQG
KDIVSLFLGDADSDTQQLDPMFLRNIVVNFLIAGRDTTAQTLSWFFLNLAKNPDVETAIR
NEIAKKLPNIEGSEVNVSHATMQDVSQLVYLEAALKETLRLHPPVPMIPKYVVEDTTLSD
GTFVKAGSLIVLATYVMARLPQVWGPDAEEFKPERWIDPSTGKLIVVSAYKFASFNAGPR
MCLGMNLAMLEMKLVVAGLLSKFHVEVLNPEDVTYDLSLTLPLKGALNVKVSQAALPSNP
DFA*
115947
>CYP5014D3
sca45d 136352 74% to scaf_20a 145254 to 146840 minus strand no introns
MLPILQLVEKSPVAGLALTGLLVLPLVITLHSRHKKSEEGIGKIHRPASTLPFLGNTWDL
VIHGVRGDMHDFMVQIGKQFNAEPVLLQALGIPLNLILYTPEGFEDVLKTQFSNFGKGPF
MRENLRDLMGDGIFAVDGEQWVHQRKTASNLFTMRALRDSMTVVIQRHAVVLYDILRRAS
ESKETLDLFKLLNRFTIEAFTEIGFGVHMGCLDSEEEHPFATAFDRAQRALRFRFTRPGW
FWKTQRWLGLGVEGQLQRDIQVIDKTVLEIVEKALARRSSRVENPEKKAGGDIVSLFLDS
AGSSNEKQFDPKYLRDIVVNFLIAGRDTTAQALSWFFFNISKNPRVEAAIRNELAQRLPK
VKAEAATPSMQDVSQLVYLEAALKETLRLHPSVPVEPKQTLKDTTLSDGTFVPAGSAIAL
ANYAMGRMPQVWGPDAEEFKPERWIDPSTWKLIAVSAYKFASFNAGPRMCLGMNLAMLEM
KLVVAGLLSKFHIEVLNPENVTYDVSLTLPVKGALTVKVSQIAEPAGA*
>CYP5014E2
sca118 plus strand 58% to scaf_36p missing N-term to C-helix
174064
SNLITTRALREYMAPVIQEKTLLLQSILADKSETKEPFDMYKLMRQFTLDTF
AEIGFGCHLEILTSGKEHPFEVAFDEANRISSERFTKPTWLWKFQRFLNIGNERRLREAI
SVMNEFSVDLIMEAMEQMKNSKPDEADVESPAHKNIMAILLSKKEAVTPTQVRDIVLTSL
EARRNTTSDTLAWFFHSLSHHPQVERKLRAEIRSKLPKFGEIHIYVPSYEAVQDLPYLEA
TLREALRLHPTGPSIPYHYQRDTVLQDETFISAGTDVFLHLYSAGRLTSAWGSDAASFNS
QRFNDLTTGEVLPSKYSPFSSGPRVCIGRNLALLEMKIAIAAVVGRFRLCEEPSLTQRRP
AFQNFQISELFIPVCIDLTRRRSTS*
175197
>CYP5014E3P
Sca116a minus strand fragment N-term (probably all part of same pseudogene)
whole combined sequence 60% to scaf_36p
114619
MLSPLQLSGGSTVALGLLACGLAVAGVVSYTCTWSSKSESSKAGRVPYL 114473
114472
PSWIPLLGNTVELARNVDRHHEWVAEHSLQRDGKPFALRLPGKNDTLFLS 114323
114322
RPEHFEEVVKTQSSNFSKGDI 114260 frameshift
114261
LREIFDDFLSEDILIIHGERWRFHRKILASLFTPRALREYMTRIVREDVR 114112
114111
RLQSVLQ 114091
Sca116a pseudogene plus strand middle fragment
113385
SETQESFDLSKLLLQFTIDTXXXX 113444
113449
GFGHKLETLTSDGVHPFEAAFDDANRISSQRNTVPPCVWKLQRCLNVGSE 113598
113599
RRLREAIDEMNGLLLVLISSAYG 113667 frameshift and deletion
XXXXXXXXXXXXXXXXXXXXXXX
113667
SRPRITATEVRDISLAGL
EVGRNTTADAMMWFFHALSQNPQVQKKLHAEILAKLPKLGESESYIPSHEDIQKMPYLEA
TILELLRLHPAVPGIPYHC
113957 frameshift
113959
VETVFADNTFIPASTDIILSLYSAGRLTSVWAIVGRFRLIEEPL 114090
note:
N-term is inverted right after this point in the sequence
>CYP5014F1
sca116d 170709 to 172295 plus strand, no introns 89% to scaf_36a
MLQSMFSKSPVVPGLVTAALLVTLYWTKSAKGSAKLKGDKVKNAVILPGTLPVVGNAVELAANA
ARMHDWLADQFAATNGEAFIVRLPGKDDMMFIAKPEHLEAVLKTQFDVFPKSEYIHDVFY
DMLGDGIVVTNGETWKRQRNVVVGLFSARALREHMTPLVQKYTVQLGDILADAAATNTPV
DVFDLLHRYTFDVFGEIGFGAKMGSMDGAFQPFAEAMDEAQFLAGKRFKQPMWYWKLRRW
LNVGDEKKLKENVRVIDEHLMGIIADAIERRRHRVEEMKAGRPAALADKDIVSIVLDSME
ASGQPVNPVEVRNIAVASIIAGRDTTADCMGWLFHLLSENPRVEAKLRDEVLAKIPQLAT
DKSYVPSVEDINKVPYLEACIRELLRLYPPGPLITTHCIKDTVFPDGTFVPANTDIGIAL
FSAGRLTSVWGEDALEYKPERFIDSESGEIIPMTATKFCAFSAGPRICVGQNLAFVETKI
VIASIVGRFHMIPEPGQNVAYTQGISLGMMDPLMMRLEAVNASA*
>CYP5014F2
sca116c 167571 to 169189 plus strand 86% to scaf_36b
MLQSFFEDKLSYLHPVVPGLVAAAVAVAIYCTTDTAEPPTLEGEDGKAVAKRVRYLPSKIPVLGNAID
167774 frameshift
167774
LLSNSERMHDWIADQIVPFDGEPFTLRLPGKSDMMFIAKPEH
IEEVLKTQFENFPKSQHIHDVFFDLLGDGIVTTNGETWKRQRRVLVNLFSARALREHMTP
ISQKYVVQLRKIFEDAVASKEPMDAFGLMHRYTLDVFAVIGFGTEMKLLEGRYQPFAEAI
EESQYIVSARFKQPDAQWKLMRWLNIGSEKKLRHAIQVIDEHVMGIISGAIQRRQERDQA
IKAGEAAKPADRDIVSIILDSMESNNQVVDPVEVRNIATAALIAGRDTTADALGWLFHVL
SQNPSVEAKLRSELLTHMPRLTTDPEYVPTAEELNQVPYLEATIRELLRLLPAGPVIATH
CVRDTVFPDGTFVPKNTDIGLAFYTTGRLTSVWGEDALEFKPERFLDADTGEVVKVSSSK
FCAFSAGPRICVGRNLAFLEMKIVIANILSRFHLVPEPGQQPTYTQGITLGMQTPLMMRV
EAVTAHAAA*
>CYP5014G1
sca39 84% to scaf_86 455071 to 456241 plus strand missing N-term
SSRALREHMAPVIQKHVRVLQRVLTDVAAAKMPIDMFNYSGRFTLDAFGEIAFGFNMSTLTLQ
RD*HPFERAFVDAQHIAASRLVVPTWYWKLKRSLNVGSERRLREALTTVDQFVMDVISKT
VDKRNAPISDAEDKVHTRGRDIVSLILANETVDGTPVDPILVRNVVLMALIAGRDTAADA
LAWLFHLLTLNPRVEEKLRAYLLASLPKLGSDFDYVPDMQEVQSLPYLEATINEALRLYS
PVGLAQKLCVRDTVFPDGTFVPKGSNIALVYHAMARMPGVWGPDAAAFNPERFIDPQTGE
LIKVSSGKFSAFNTGPRVCVGRKLAMMEMKMVVACVVSRFRFDEVPGQDVAC
456135 frameshift
456137
GGGLTIGMKNPLMMRVQQLAPKEDGDEVVVGVAA* 456241
>CYP5014H1
sca79b 355924 to 357594 plus strand 87% to scaf_6m
MLRKWLTKHRALSPLGPAGLALLAGA
AVVAAYVATRSSGDAVSVLDCKEEKTKEKWENTPQDKPKVVPYLPSKVPWIGNMLQLAGN
AHRFHSWMAEQCIAHNGVFKLHLPGQSDMLVTAVPEHYEHVVKTQFEHFSKGHQQYDMFV
DLMGHSVLIIEGERWKYHRRLLVRLFSARALRDHMTPVIQRHTLLLQNVFLKAAVAKKPV
DVYMFMHRFTFKAFAEMVFNNSLDSIDSEHEHPFEQAFDEAQSIVAGRLQQPVWFWKLMR
WLNVGLERKLREDVALIDEFIMEIISTAIEARRQRQEDLKAGRPVKDADKDIVSIVLECM
EQDGDMVSPTDVRNIAVAALGAGRDTSADAMSWLLHTLTQNPHVEDKLRAELLENLPKLA
TSPSYVPSMDEVHGLVYLEATIRELLRLQTPVPFTLRECIHDTVFSDGTFVPKGTNVGMC
HFGAARRPEVWGPDAAEFNPERFIDQETGKLVQTPMAKFNAFSGGQRMCVGKALAMLEMK
LVIATLVGRFHFREVPGQNVQYAMGITIGMRNSLMMHIEPVRTGASAAAA*
>CYP5014J1
sca116b 163736 to 165337 plus strand, one stop codon, 48% to scaf_6p
MPPPSLLKGGGLSSPALLGLLVATAAAMLFAAKPSRGKPLPANVTPVPFLPSTPL
LGNTLELAANAARLQDWVADRSRECDGQPFVVQLLGKRNLVYLSRPEHFEQVLKLQSSNF
NKGLAIHDIYSDFMGESILLVNGDRWKYHRRVLVNLFSARALRDFMTPIIQKNILVLMDI
LARARERNEALDIHKLMNKFTFETFAKIGFGQKLGNLVSPEDHPFERAFDEAHHITGHRM
TTPTWLWKLKRWLNVGSERKLRECVEVMDSLVMGIISDAIAKRQQRGQEEEAGEHDHEKD
IVSIILE*MHADGRPVEPSEVRSIALLSLIAGRDTTANAVSWILHMLHEHPRVEEKLRAE
LYEKLPKLATSRDYMPSLEELQDLPYLEAVINENLRLLPIFPYTSRQCIRDTVFPDGTFI
QAGEVLGLPHYVMARLTSVWGENAAEFVPERFLDAKSGEVLDLPVATSSAFGAGPRICVG
RRLASMEMKLLLACIVGRYHLVELPGQTVRYKLALSLTMKDPLMVNVQHVNQALAKSA*
>CYP5014K1
sca79a 346171 to 347793 minus strand 86% to scaf_6p no introns
MLSLSELRTHPLVVGFVAVAAAVTLYSVVANAGSDALDDEDDEGKTDRKAK
PIPYLPGGHPVLGHTLLMARNLDRFQDWLVETSVARGGAPFVLRQPGKNDWLFSARPEDF
EQILKVHFDTFIKGPQVRELLDDFMGENIVIINGHRWKFQRKALVNLFTARALKEHMTPV
VQKCALALQRVFAKAAESGDVLDVHHIMGRFTLETFAEIEFGSQLGLLEKGEENAFETAI
DDANHISLERFAVPMWVWKLKRWLNVGSERRLKEDMAVISSFVMSCISDAIERRKQRLEA
AARGEPVGPVAKDIVSILLDSEDATGEPVLPKDVFNISLAGVLAGKDTTGDATSWLMHLL
HENPRVENKLRAELLAKVPKLAEDESYVPPMEELDAITYLEATIRESLRLKPPAPCVTQH
CTQDTVFPDGTFVPKGMDTTLLYHASALLPSVWGPDAAEFNPERFLDDNGKLLVLPPLKF
IAFSAGPRKCVGRKLAMIEMKVVTACLVSRFHLVEVTGQDIRGTMGISLGMKNGMKVSVQ
ATPGVAKRA*
>CYP5014_un1
sca38 minus strand pseudogene fragment, I-helix to EXXR region 49% to scaf_36a
IVLAILDCVEVTS*HVNPGEVCRLNANYADRDITVDCMDWLFHLLSKNP
RIDAELFAEALVKLT*LMAYKRYVSSMKRLNKVPYLESCFREPPQLPGPLITTSCVIGTG
IPGGAFVPANKEIGTDLFSLTRGSG
>CYP5014_un2
sca205 11789 to 13031 plus strand, 44% to scaf_20e probable pseudogene
11789
MLAFLPTLAASDSRQNVVTSALIA
LLLGASTYATLSHIERSRAKHQRHKEGLPVPRPSTTLPIMGNTLDFVKNNDVFHDCVSSL
VQEFNGEPFLLSAPGRPDILVVSTPEAFEGVAKRQFDTFVKGE*LHEMFYDLLDNALTNN
DGEVWQFQRKIFAKLFSVRALRESITSTIQKHDRTLHTLFENAAVSGESFDLFRLLSRFA
MESFAKIGFGIQMGSLAIGEDHPFEKAFDITEEATAKRFSE
12523
(deletion)
12529
TQHMLERIPRLALETGYFPTMDEVQSLTYIEAAIKETLRLYPPASTAARTPSSPTV 12696 frameshift
12696
SADTFLSDGTFVPEGTTIGLPSYAMGRMASN
CNEYKPERFLDPGTGKLLSVSAFKFPAFFAGPRICVGMNLAMLEMKIVLTGRVTVQPGQE
VTYVRSLALLMNPFMVKIEKVSPSVVPIA*
13031
>CYP5014_un3
sca265 plus strand C-term pseudogene fragment 84% to sca205
9688
VAVQPGQEVTYVRSLALPMKNPFMVKIEKVPPSLIPIA* 9804
>CYP5015A1
sca42 90% to scaf_63 376713 to 378335 minus strand, no introns
MGGSSASSSSSLALWVALP
TAAAAAVLAYLLIPDERQRAIRRLPAPASTLPVLGNTLDMMSLEQPRLHDWIAEQCKAFG
GRTWRLQVVGAPPLVVVSSVEGFEDVLKTQFEVFDKGDRMNTIFRDIAGGGIVAVDGPQW
VAQRKMLSRLFTMRAFRDTISQCVHDYTLVLGRMLGDAARTGVPIDFADVMHRFSFDVFT
DIAFGLQGNSLEGGEHTQFMEAMGKIVHNIEMRFHSPDWLWKLKRALKLGSEKELAQEVA
ILDKMVFTIINKNMERKFNPDAAAAEWPPRPQRSTKDVVSLFLDAHDEQKAAGEDGGDTP
LDANFLRDIAVVVLLAGKDTTAWSMSWLIIMLNRNPKVETKLRQELREKLPKLFSDPSYV
PTMDDVEGLVYLEAVLRENLRLNPLVPLNAKEANRDTTLVDGTFVKKGTRVYIPSYTLGR
MKSVWGRDASKFKPERWLMQDPWTGEQTIRPVSAFQFVSFHAGPRT
CLGMRFAMLEMKTVLAYMLSKYHFTTRENPKSYTYDVASLLQVKGPLICKVQRAG*
>CYP5015B1
sca73a 218927 to 220480 plus strand, no introns, 81% to scaf_27d, transcript 139397 = 3 P450s fused
MSNMVLPFAIAASLVAAAVAYFTSPTEQDRAVCELPTPRSTLPVLKNTLDLTIRQRARIY
DWILEQCREHGGRPWRVRVLGRPPAVILSSPEAMEGVLKTQFDVFVKGSAVAEISHDLLG
EGIFTVDGSKWRHQRKAASHFFSMNMIKHAMEHVVRDHSALLAVKLRAAADNGETLNIKR
VFDFFTMDIFTKIGFGVELKGLETGGNCDFMEAFERASRRIMARFQQPMCVWKLARWLNV
GAERQMAEDMKLINGVVYDVIHRSLEGNDKRSSCSGRKDLVSLFLEKASVEYAADDHTEM
TPTMLRDMSMVFIFAGRDSTSLTMTWFIIEMNRHPEVLANVRRELADKLPKLGMDDTETP
SVEDIDQLVYLEAAIRECIRLNPVAPAMQRTAAQDTTLYNGTVIKAGTRVILPHYAMGHL
ETVWGPDAEEFKPERWIDADTGKLLHVSPFRFTAFLAGPRMCLGMRFALAEMKITLATIL
SKFDLQTVENPDGFTYIPSVTLQVKGPVDVAITRAHA*
note:
sca73d 225097 to 225615 is identical to aa 348-519 sca73b
note:
sca73e 225763 to 227318 plus strand is identical to sca73c
note:
sca73f 235636 to 237189 is identical to sca73a
note:
sca73g 237303 to 237752 is identical to aa 1-150 sca73b
>CYP5015C1
sca73b 220594 to 222153 plus strand, 81% to scaf_27c
MKLEAITALISPASVAASCVALLLVYVATPSAHDRAVKHLP
TPEGDIPVLRSTLEIVRAQKSGKFHDWALAYCRKFQGRPWCLRILGKTPSVVVCCPEAFE
DIQKTQFDAFDKSPFVSAAMYDVLGHGIFAVSGPLWQHQRKTASHLFTTQMLQYAMEVVV
PEKGEALVKRLDEISKANQVVNMKRLLDLYTMDVFAKVGFDVDLHGVESDQNAELLDAFD
RMSVRMLERIQQPVWYWKLLRWLNVGPEKQLAEDIKMTDDLIYSVMSRSIEEKTKGS
RKD
LISLFIEKSAVEYTKGVHTKKDLKLMRDFVISFLAAGRETTATTMSWVILMLNRYPKVLD
QVRQELKAKLPGLASGETRAPTLENIQQLVYLEAVIKETLRLFPVVAITGRSATRDVRLY
EGTVIKADTRVVMPHYAMGRMETVWGPDANEFKPERWIDPATGKVNVVSPFKFSVFLGGP
RVCLGMKFAMAEVKISLAKLLSQFDFKTVKDPFDFTYRSSITLQIKGPLDVVVSRLKA*
>CYP5015C2P
sca73c pseudogene, 54% to scaf_27c, 222301 to 223856 plus strand
MVVSYDELWVFGIVCVALLLGYLVTPSAQTRAVLHLPKPPGYLPVLIRVQH
SGRFHNWALSTCRKYEGKPWCMHVLGKAPTVFVCTPEAFEDVEKIQYEAFGRNP
LFVEATTDVLGQGVFAISGPLWHHQRKTASRLISTQMIQHNMDVVVPDKCKELMKRLDAA
ASEENPIDRVVSLKWRLDLFTMDVFCKVGFGIDMHKMETEKIIAMLEALQRSSARIVGRI
LEPSWFWKLRRDLNIGAERQFTKDMECVNDMICGFIAPSIEEKAQRDQVEAKEDEKDSRM
DLISLYLDQDAADNGKDAPFDPKKQRDFLVSFLAAGQDTTSTSMSWFVVMINRYPKVLDKNS
223341 frameshift
223343
GKMPDLASGKQTVPSLEDTQQLVYLEAAIRETL
RLFPVAPISG*TATRNVTLSNGVFLVKGTSVHIPHYTIGRMKTVWGPDAEEFKPERWIDQ
VTERITPVSPFKFSAFYGGPHACLGMKFA
223708 frameshift
223710
MSEIKITLAALLSRFNLRTSRDPFAYTYRMALSLRIDGGLDVAVSHLE*
>CYP5015D1
sca73am 216869 to 218524 minus strand, 139396, 83% to scaf_27a
MWGPLELALNLSLTSWGVLVCSLLLGWHFLSSRKQARALSKFTRPASTLPVLGNTLDLMF
KHRHDIHDWMLDECRRCEGRPWVLAAVGRPTTVVLSDVDAFEDVLHRKFDSFGKCSAWLV
SDVFGDGIFAADGVSWIHQRKTASHLFSLHMMRESMEQVVREQATVLCETLRAHCTDNQT
STSPQRGVPVNLKYTMDWYATNVFTRVGFGVDLDSLSSQEHNEFFCAFTRLPIGIHRRIQ
QPGWLWRLKRALDLGDEKQLKLDMARVDGVIYQVISQSMESKSDTAPVESKRLPDLISLF
LAKETNEYRDREAKQDNGAVATCRVETTPKLIRDMAFNFTAAGRGTTSQSLQWFIIMMNR
FPGVERKIREELQAKLPQLFEEDSTPPSMNDVQQLVYLEAAIKESLRLNPVAPLIGRTAT
QDVVFSDGMFIPSGTRVIIPTFAVARLQSIWGEDAAEFKPERWI
DPHTGKLRVISLYKFL
VFLAGPRSCLGAKLAMLELKVALATVLSKFHLRVLRDPFEIGYDASISLPVKGDVLAIVE
AAKVGNSAGAA*
note:
sca73b 233842 to 235233 is an exact duplicate of aa 1-464 of sca73a
>CYP5015E4
sca96a minus strand 88% to scaf_41b 141298 prediction too long
263342
MKSVSELFGDRSDVAVTAAAA
VTVGLGLSLLLHSTKKSKMSDTRKLPPMPKTTLPILKNILDAGGNAERFHDWLNEQSIEF
DNRPWMLSIPGRPATIVLSSPEMFEDVLKTQDDVFLRGPSGQYISFDLFGNGMVITDGDL
WFYHRKTASHLFSMQMMKDVMEATVREKLAVFLDVLGVYHQRGQQFSAKQELSHFTMDVI
AKIAFSIELNTLKDSPDREDDHEFLKAFNKACVAFGVRIQSPMWLWRLKRYLNVGWEKVF
KENNTIIQNFINDVIVQSMNKKAEYSAKGEKMVARDLITLFMESNLRHSEDIHIADDDAT
IMRDMVMSFAFAGKDSTADNMCWFIVNMNRYPEVLKKIREEMKEKLPGLLTGEIRVPTQE
QLRDLVYLEAVMKENMRLHPSTAFIMREAMDNTTLVDGTFVEKGQTLMISSYCNARNKRT
WGDDCLEFKPERMIDPETGKLRVLSPYVFSGFGAGQHVCIGQKFAMMEIKTTLATLYSKF
DIKTVEDPWEITYEFSLTMPVKGGLSVEVTPLTPLKRASSACK*
261708
>CYP5015E5P
sca96c plus strand N-terminal (inverted from seq sca96b) 88% to scaf_41a
270399
RFAVAVSLGLSLLLHSTKKSKKSDARKLPPMPKTTLPILKNILDDGGNAERF 270554
270555
HDWLNEQSIEFGNRPWMFSIPGRPATIVLSSPEMFEDVLVTQDDVFLRGP 270704
270705
SGQYISFDLFGNGMVITDGDLWFYHRKTASHLFSMQMT 270818
>CYP5015E5P
sca96b 270398 to 269250 minus strand, 141300 prediction adds incorrect N-term, 88% to scaf_41a, missing N-term inverted on sca96c
270398
RDVMEATVHDKL
GVFLDVLDIYHKRGKPFSIKQELSHFTMDAIAKIGFGLDMDTLKNSPDREEDHEFLEAFN
KGSVPFGVRIQSPLWLWELKKYLNVGWEKVLMDNTKIMHQFINKVILDSMNKKAELAAKG
EKMEARDLVTLLMESKLRQTEDMHIEDDDATIMRDMVMTFVFAGKDSTAHSMGWFIVNMN
RYPDVLKKIREEMKEKLPGLLTGEIRVPMQEQIKDLVYLEAVVKENIRLHPSTGFIVRET
MQDTTLVDGTFVEKGQTLMVSSYCNARNKKTWGDDCLEFKPERMIDPETGKLRVLSPYVF
SGFGSGQHVCIGQKFAQMEIKMAMATLFSKFDIKTVEDPWKLTYEFSLTIPVKGPLDVEV
TPLTPLTPPK*
269250
>CYP5015E6
sca80 349103 to 347501 minus strand 140005 64% to scaf_41a
349103
MKIVTQLPTDKRDAAVAAAAVVTLGLLVSYLSRPKDKGNKPKRKMAHVPKSTLPLLGNML
DMSTNMPRFHDWISE*CAEFDNEPWTLQIPGKEPWIVLSSAELFEDV
348783 frameshift
348781
LKTQADNFLRGPVSHHQAYDVFGNGLSISDGDAWFYQRKTASH
LFSMQIMKTVMEDSVREKLDVFLDVLGKYAARGKPFGIKKWLSHFTMDVFSKIGFGVELD
TLKNTFDQEGDHEFLEAFNVASVAFGVRIQTPTWLWELKKFLNVGWEKIIMDNCKKFHDF
IDSFVLKAMVERGQNKVARDLISLFLDSSIDTSELQIEEDEATIMRDMVTTFIFAGKDSS
AHSLGWFIVNMNRYPEILRKIREEIKEKLPGLLTGEIQVPTAAQLQELVYLEAVIRENIR
LHPSTGFIMRQATEATTLVDGTFVDKEVSVLLPSYANARNPRTWGEDASEFKPERFIDAD
TGKIRNFSPFVFSSFGSGPHICLGMKLALMEVKLTLATLLSKFDFKTVEDPWQMTYDFSL
TIPVKRPMEVEVTPLVTPYADSA*
347501
>CYP5015E7
sca91 286085 to 287701 plus strand, no introns 77% to scaf_41a
MKSVSELFGDRNDVAVTAAAAVAVSLGLSLLL
HSTKKSKKPEGVRLPPMPKTTLPILKSLFDAGGNVARFHDWLNEQSIEFDHRPWMYSIPG
RPVTIVLTSPDTIEDALSTQNDVFLRGPVGQYMSEDIFGNGMIIADGDPWYYHRKTSSHL
FSMQMMKDVMEATVREKLEVFLDVLDIYHKRGQSFSAKQELLHFTMDVIAKIGFGLELDT
LKDGPHRDEDHEFQEAFDQAAVAYAVRVQSPLWLWEIKRYFNIGWEKVFRDNTTILHNFI
DEVITQSMKKKAELAAKGEKMVARDLITLFMESTLRENQDMHIEDDDATIMRDMVMTMMF
AGRDSTAHSMCWFIVHMNRYPEILEKIRDEMKEKLPGLLTGEIKVPTQEQLRELVYLEAV
MKENIRLIPSTGFIAREAMRDTTLVDGTFVGKGQTIMVSSYCNARNADNWGEDASEFKPE
RMIDPKTGKLRVLSPFVFSPFGSGQHACMGQKFAMMQMKLTLATLYSKYDIKTVEDPWKL
TYEFSLTIPVKGPLDIEVTPLSPLMA*
>CYP5015F1
sca117 128109 to 129785 minus strand no introns, 89% to scaf_1
MNPAGLPHQQLQLQHSTSSCHSPIFSQQLKPSQPTNQLTGTSMWSSASH
DGAQQSVLLAFGALTALYASWKILSMPVPLPDPGMEDLFRPASTLPILGNTLDVLLFNRY
RMSDWINDQTDASEGKPWILQLLFQPPWVVLSMPSDLDDVFRDQFDVFEKGGTLGDISFD
VLGNGLLNVSGDKWKQQRRAASHLFSTQSIRDVMEPVIREKTLQLRDVLAQCADREQTVS
MKSLLGKFTSDVFTRIGFGVELNQLGGDVLVDDMHPLDIALHAVQNRFQTPMWMWKLTRF
LNVGAERRLRENMKIVNDMVRGIMVRSIGDKTPGDGKKNLLTLLMKDDVDADPRELQDTA
VNFFIAGKDTTSFSLSWLIVMMNRYPRVLQKIREEIASVLPGLLTGEMSAPTLEDTQKLV
YLDAAVKESVRLWSVSTYRCTTRDTTLTSGAFIEKGTVVVVSKYAAARRKNVWGDDAAEY
RPERWFDEKTGEPKSITPPQFITFSTGPRKCIGMRLAMLEMKTVMAVLFSRFDIETVEDS
FKITYDFSFVLPVKGPLAVRIRDRTAPSV*
>CYP5015F2
sca19 215894 to 214356 minus strand, no introns 132088 84% to scaf_21m
MIDSQALSPVLATAFALLLVCWKLLSKPRPHSNGQELFRPASTLPFLGNTLDVLWFQRHR
LHDWMTEQSLASGGKPWLLTGIGQRPKVVLTSPAAYEDVFKTQFDVFVRGPGETVLEVLG
QGIFNVDGDKWRHQRRVTSHLFSMHMLKDCMKSVVREKTVQLREVLATCAERGQTVSMKS
LLNKFTADTFTRIGFGVDLNGLADPVDVDTSQPLDTALGVVQTRLQSPVWLWKPRRFFNV
GSERVMRENMQQVQDTVQKIMAKSLADKEHQANGEEATTSSKHKDLMSLMLQSGDFTDPR
EVRDICVNFYAAGKDTTAFSLSWFIVMMNRHPRVLCKVREELRRVAPELFTGELDTPTLG
HLQQLTYLEAALKESLRLNSLAVYRLANRDTTLSDGTFVPKDARAVFSMYASARQPSVWG
SDAADYNPGRWIDEETGKLSSFKFVTFSAGPRQCIGMRLAMMEMMTVLSVVFSRFDLETV
VDPLDITYDFSLVLPVKGSLAVRVHSLSAHMA*
>CYP5015G1
sca24a 615142 to 613559 minus strand no introns 133194 57% to scaf_11a
MWTLSQHATFDKAA
ATVALVTAAYVGWNVVSAVVARRAVNRVLADQGVYEPPSLPVLGHTLDLMHNKDRFHDWF
AEQCLAAGGRPWVLRIIGRPPTLVLTSPQEIEDVFKTQVDIFEKGLDIREIGHDFFGDGI
VGVDGEKWQKQRRTASHLFSVGMLRDVMDAVVMEKT
LQLRDVLAECARVNRPVSMKSLLA
KLSSDVFTKIGFGVDLNGLGGDVDDDMEHPFIKAVETYGSVFQSRLQSPMWLWRLKKRLG
VGEEGELRKARVIVHDLVMEIMKKSMASKNSATGSKQQ
KDLITLFMKTMDSSADVMEVRD
AVMNFFLAGRDTTSFSMSWMIVNMNRYPRVLEKIRAEINANLPELLTGEIQAPSMADLQK
LPYLEAAMRESLRLYMATVHRAPNRSTTLSGGLHVPFGTHVIVPTYAMGRMPTVWGEDAA
EYRPERWIGEDGRVLKVSPFKFFSFLAGPHQCLGMRFALLEMQTVMAVLLSRFDIKTVEN
PFEITYDYSLVIPVKGPLMANIHDRSTSVAASS*
>CYP5015G2
sca24b 622007 to 620433 minus strand no introns 89% to scaf_11a
622007
MWGISQHHERQAVLAAGTLSGLYLGYKLLVAVYKELKITRALDAQGLHRPKSTLPILGNTLDV
MYFQKDRLQDWMAEQSQVSDGKPWVLSIIGRPQTLILTSPEACEDVFKAQFDNFGRGDEL
VDLQHDIFGEGVAGVDGEKWLKQRRIASHLFSMKMLRDVMDEVICEKSLKLRDVLAQCAK
EGCVVPMKSLLGKFSSDVFTKIGFGVDLHGLDGDINSEMDHPFIEAVDGYAEVFGARLQS
PMWYWKLKRFLNIGDERMLKRCIKVATELLNEVMLKSMASKTAEDWNTK
TDLLTLFVDTT
GKTDSSDLRDAMMDFFLAGKETTSFSLAWVIVNLNRHPRVLAKLRAEIREKLPGLMTGEL
EVPTMEDLAKVPYIEAVLKESLRLYMTGVHRTPMRSTTLREGTFVPYGSYVVMSVYAAAR
VKKVWGEDAAEYNPDRWIDEETGKIKFVNPFQFITFGGGPHQCIGMRFALLEMQTVIAVL
FSRFDIKTVEDPFKITYDYSVTLPIKGPLECTVHEATAPAY*
620433
>CYP5015G3
sca24c 624127 to 622555 minus strand, 86% to scaf_11b, 2 frameshifts
624127
MWGLAKHQVSEREAALAVSALGALYVSYKLLSAMYKSGSMARAFDAQGLYRPKSTL
PILGNTLDVMFYQKERLWDWMAEQSIL 623879 frameshift
623879
QEGKPWVLSIVGRPDALVVTSPEACEDVFKTQFDNFGRGTELRDVIYDIFGDGIAGVDGE
EWQKQRRVASHLFSMKMLRDVMDEVIIEKVTKLKDVLAECAKQGKVVPMKSLFGKFTSEV
FTKIGFGVDLRSLESDPCSDSNNAFIRAVDVYAEVFGARVQSPAWFWKLKRFLSIDDEGRLKQSAKVA
623322 frameshift
623322
GGTQQVLAKSLEVRRQDSSDAKR
TDLLTLFVEANTSIDPKAVHDTLMSFLLASK
DTSSFSLSWVLINLNRYPAVLAKLRDEIRANLPGLMTGEIKVPTMEDLQKLPYLEAVAKE
SLRLHMTASNRMANTATTLSDGTFVPEGCAVMIPMYASARVKSVWGEDAAEYKPERWIDA
ATGKVTPVSPFKFVTFGAGPRQCLGMRFALLQIQTTMAVLFSHFDLKTTEDPFDLTYDFA
ITLPVKGPLNVTVREITPAAY*
622555
>CYP5015G4
sca24d minus strand, 58% to scaf_11b, 133197 (short prediction)
632042
MWTAQNHATTSSAVLLTAATLGSLYAGWKVATALYSQRVLDAALTKQKLHSPDSTLPVLGNTLDLLFFQRERLW
DWVTEQSAISGGKPWVLRIIDRPTSLVVTSPETLEDIFKTQFETFERGADMRELFYAFVG
DGIVGADGEQWVKHRRTASLMFTTRTLREVVDAVAKEKSLQLRDVLSECAKQGRVVSMKS
LLTKFSGDAFTKIGFGVDLNSLGGNVESAMDHPFMEAVEVYAEVLCTRLLSPTWLWKLKRFLNVGDERALKHANKI
VHDLTYEVMRESME
KKTHGEGMALQQ 631155 ()
625161 KDLLSLFMQSGDTV
DVQVVRDSVMNF
LLAGHDTTSFSLSWVVINLNRYPDVLAKLRTEFRERLPGLMTGEIDVT
TYEDLQNLPYLEAVVKESLRLYVTAVNRVANQSTTLSDGTFVPLGCGIMVALYAAARMKN
VWGEDADEYKPERWIDPKTGKVKNVSSFKFISFIAGPRQCIGMRFALLQMRVAIAVMFSR
FDLKTVEDPFKLTYDIAFTLPVKGPLNVSVHELA*
624475
>CYP5015G5
sca24 plus strand missing N-term 193 aa
571848
SVIGTFSGDTFTKIAFSVD 571904
571905
LNGLADAENDHP
571941
FNEAVDVMAEMLGSRLLSPTWVWKLKRFLNIGDEHKLKQACAIVHELTHR 572090
572091
VMSKSM 572108
572182
LLLAGKDTTDFSLAWILVNLNRYPDVLTKLRKEINEKLPGLVTGEIDIPT 572331
572332
MDDLKDMPNLEAVVKESLRLHAIA 572403
572404
TTRVPNKSVTLSDGTFVPAGCTARMTSVWGEDASVYKPERWI 572529
IDAETGKVKMVSPFKFGTLIS 572589
572589
GSRQCVGMRFALLEMRIATAVLFSRFDLKTVDDPFDVTY 572705
GSLWRSRARLMSPCTLFRLRSLRRGGED*
>CYP5015G6
sca43 plus strand 55% to scaf_11b
184231
MLASLYDVVFSVASLAALYAAWRVGSRVYSQRIIDVALANQKL
HSPPSTIPLLGNTLDALFLQKTRFWDWIAEQSELSGGKPWVLRLVGRPTTLVCTSPEALE
DIFKTHFDTFERGADLRDLLYDFFGDGIVGADGENWQKQRRAAS
184671 frameshift
184673
TTRALRDAMATVVKEK
ALHLRDALAKCAKEGRTVDMKSLLEKFSGDTFTKIAFGVDLNGMESDHPFNKAVDVMSET
LDSRLLSPTWLWKMKRFLNVGDERKLKEACAIVHELTHQVMTESMQQQQKKKNKNKSDVL
TLLLDSSGDLDVAVVRDAVMNFLLAGKDSTIFSLSWILVNLNRHPEVLRNEINEKLPGLV
SGEMDAPTMDDLKDLTYVEAVVKESLRLHGIATTRVPKKSIILSDGTFAPAGCAVMMPAY
ASARLTSVWGEDASDYKPERWIDP
185512 frameshift
185512
RVKPVSPFKFGTFIAGPRQCVGMRFALLEMRLVTAV
LFSRFDLKTVKDPFEISYEYAFTLPIKGPLLDVTVRAVSAA*
>CYP5015G7P
sca46 minus strand 50460 to 51598 pseudogene with large deletion, 48% to scaf_11a
MWSYAQHASSQHSAVLAGTAMVVALLGWKALSHALRTPEQKAADES
FRRIHRPASTLPLVGNTLDAMFFQTERFVDWMADQSALAGGKPWLMSIVG
51311 frameshift and deletion
51311
QREGKENMKIIKAIVNDVMAQSLARKNARPEHDTEDQEAEKLELKEPTTETKYLLSFFLESGLTDAQQL*D
MAVNFFFAGKDTSSFVLSWFIVMMNRHPQVLRTIRDEIRDQLPELLSGEIDAPSMEQLGR
LPYLEAAMRENLRLNTSMTTRSPNQDTTLSCGTFIPKNSVVCVCHYASARLKSTWGDDAA
EYKPERWLDPDTGKLRQFSPYQFVTFLAGPRQCIGMRFAMLELRMVASVLCSRFNIKTVE
DPFSLTYELSTVFPVKGRLMCTVESSGLAVSS*
>CYP5016A1
sca131 17930 to 19567 minus strand (N-term seems short) 73% to scaf_92, no introns
MMPSNLLNSRGLVAIPPGPLLGSTLALVVFWVMRHG
RERRLSWTSTPLPPRGYRTLPTASASQSPVRVGMMTDQPTEGSPAFYEWVCAMTSRFRGK
PWLLHLPGRPDVLVVSSPTSFEDIQRTFALQFEKVDNDAEGLAHDAHGGAIAFIYTGQVR
PSVNMQRQLASSVLSSAALRQQASVLVKHHLQSLLRILDDTASSGDPLDVTRLMRSFAME
VFTELDFGLQLGALRSRRGECSDLEQAVDEVQRRVAERLKRPAVAWKLERLLDVGSEAAL
SRSVDVVSRITLGAVDTKRKRRRGGSPCDSPIAGARVDMLDLLLSQKCSSKSSKDPEFLA
EFVLGLVVAARDSMAHALSSCLQCLARHPEEQEKLARELKEAEEEDRDLQSVVYLEAVVK
EALRLYPAKPFIRRRARIDTVLSDGTFVAAGAKVAMDLYSMARRENVWGQNSAQFRPQRW
IDSTNGKLRPTSNYKFNAFLGGPRACLGADMAMTEMKTVLAKVIGRVHLDAVEPRMAEKD
KTKAWDAACDAAVRVLVRRRGPSPPGQYS*
>CYP5017A2
sca169a 37279 to 38733 plus strand 78% to scaf_14, yellow = poor match
37229
MKMPWHLFFDGAIY 37320 (0)
37384
ITDPKDVQHILSTNFDNYVKPQGFLDAFQEIFENSFFAVNHHAQAPDGGAGW
RLQRKVAAKVFTTANFRIFTEQVFARHAEETLLAAQAEATESRAREDPDERFCCDMQEIS
AKYTLNSIFDVAFGLPLSEIEGTENFAEHMGFVNTHCAQRLFVKQYYKLLRWVMPSEREL
RRRTREIRAVADTVLLRRLQESQEKINARSDLLSLFIRKARELAFESTKDQKVDAASLLG
PKTLRSILLTFVFAGRDTTAECLTYSFYAIARHPRVQKQIVEELESTKENTGSTHATFTF
DQVKEMKYLEAVVYEAVRLYPALPFNVKNAVKDDYLPDGTFVPAGVDVVYSPWFMGRNGE
LWGNDPLEFRPERWLEMPKRPSAYEFPAFQAGPRVCLGMGMAVLEAKLFLATTLSRFHVA
IAPGEKQERGYVLKSGLFMDGGLPLQMTPRPQSAASA*
38733
>CYP5017A3
sca169b 51409 to 52752 minus strand 79% to sca169a 74% to scaf_14
52874
MKMPWHLFFDGAIY 52833 (0)
52752
ITDPKDVQHILSTNFNNYVKPQGFLDAFQEAFDNSLFI
LNHNAEAPDGGAGWRLQRKVTLKVFTTANFRIYTEKIFARHAEETMVNAQAEAVKVRDSQ
SSNESFCCDMQAVSARYTFNSIFDVAFGLPLSEIEGADEFAEQINFVNEHCAQRLFVKQY
YKMLSWVMPSERELRRCTRGIRAVADNILLRRLKEPGEKISARSDLLSLFIRKARELAAE
GTKEQGADAAALLGPNTLRSIILTFVFGGRDTIAECITYSFYAIAKHPQVQQRIVEELES
IKTSGGLKVTAFTFDEVNSMKYLDAVVYEALRLYPAVPYNVKSAVKDDYLPDGTFVPAGV
DVVYCPWYMGRNSALWGDNPLEFRPERWLEMSKRPSAYEFPVFQAGPRICPGMNMAILET
KFFLATTLSRFHVAIAPGEKQERGYVLKMALFMDGGLPLQMTPRAQFTS*
51409