9 Monosiga brevicollis P450s
D. Nelson Feb. 22, 2008
>fgenesh2_pg.scaffold_6000057|Monbr1
missing N-term and C-term poor match, CDS in 11 exons
MGKECNVPGSTGVPLLSDQTLAFWREPLEFCRKRTEKHGSVFQTRLLNHGTIVVCDYDTAREVLALPEEAASASEAYDDL
FHDVFGEESILLTTHNREHIRLKSCLLTWLAPSNIEALDPLLTNVAART
(0)
KASSDEPLDLYQNLKVVCLHATLMIMLGLE
(0)
QSVIYEQHLERLLRDHW (2)
HGITGLLSTFNLGLGSKSTVATAQEARDILIRLIQ
(0)
DQLVRVREEPDNAALTFLRQFDK (0)
ALDEYDNAFKADQLLLLVSALVPKALAASLTLA
(0)
MAAGTQHACVAANGELDPRRWHQVLRETQRCYPSMFAVRRHVKK
(0)
NEGVDVGRFHIPLGYHIMVIIPLANADPAMYEDPQ
(0)
HRLIFHDWAQRGTRAPQSFNPDRWAENVPRPLTFGHGTH (1)
ACPGRTLTDALLLKLCAQLHHSFEVSQLALEY
(0)
LELETPLRHASLELKWWPVPRPATPLLGRLIARQERDPSSTPAT
QARTDKGRLPTVGTTSGPGSRAASQAASPRLPRRSQHVTEI*
See estExt_fgenesh1_pg.C_60048 [Monbr1:36486] alternative model
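The exon-split records in this file interleave peptide fragments with parenthesized digits. Assuming those digits mark the intron phase at each exon boundary (a reading of the notation, not something the file states), a minimal Python sketch for joining the sequence lines of one record into a continuous peptide might look like this. The function name `join_exons` and the rule for skipping bare genomic coordinates are illustrative assumptions.

```python
import re

PHASE = re.compile(r"\(([012])\)")   # assumed intron-phase marker: (0), (1) or (2)
RESIDUES = re.compile(r"^[A-Z*]+$")  # one-letter amino acid codes, '*' = stop


def join_exons(lines):
    """Join exon-split peptide lines into one string.

    Pass only the sequence lines of a record, not the '>' header or the
    free-text notes (words such as 'CDS' would otherwise look like residues).
    Returns (peptide, phases), where phases lists the markers in order.
    """
    peptide, phases = [], []
    for line in lines:
        phases.extend(int(p) for p in PHASE.findall(line))
        # Remove the phase markers, then keep only tokens made of residue
        # letters; this skips bare coordinates such as 494705.
        for token in PHASE.sub(" ", line).split():
            if RESIDUES.match(token):
                peptide.append(token)
    return "".join(peptide), phases


# Three exon fragments taken from the first record in this file.
record = [
    "QSVIYEQHLERLLRDHW (2)",
    "HGITGLLSTFNLGLGSKSTVATAQEARDILIRLIQ",
    "(0)",
    "DQLVRVREEPDNAALTFLRQFDK (0)",
]
seq, phases = join_exons(record)
print(seq)
print(phases)
```

Keeping the phases in a separate list leaves the exon structure recoverable after the peptide has been flattened for a BLAST search or alignment.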
>fgenesh1_pg.scaffold_21000118|Monbr1
no introns, may be too short at N-term
MWLFITVVLAFVGLAVWLRERARMRFPGGSRDILLHMTGWLQGLRTEVIQGDVARAHHVLVSKGSDKGRVIEENMMLPVW
TDGCSIESCNGRVWRTVRHSYDHFVRQLPPIEDLETYCLQHTPSITADAGVLDSPGIARTVLAAFGQHLFADACTPQHLD
VIEACSWDIRAHVAMRTNGHDDLKRACNRTIINLLEQSAFESPPDLPGWQHPLAFSAVMQPFFISPSVNFVDIVIALDNL
CDLRAKSVHWRKQGYGRAKMDIVAHLVESIRIRHPFPMLERLHGDTQYFIALDSFFAPGHPQAAFNPTQWHAEGMYQDPL
HSLIFGAGPRKCPGRTIAWASLPALLYRWLELEAAGHLTWRPADRHRFSGRDLDNAPTPMAESLHAAGRIVSCLANLAFE
TRRPEYTDAPPATPLATLRSRRPNE*
>estExt_fgenesh1_pg.C_170105|Monbr1
top part is not P450, has WD domain, probably a G protein beta homolog (deleted)
CDS in 20 exons for whole seq
last 8 exons are P450 seq. 36% to CYP4T9 Xenopus, 35% to zebrafish CYP4T,
34% to CYP4A11 human, 34% to 4V5 fugu, 36% to 4T danio, CYP4 clan
MLEYATTAIWTALAIILVRWIVKTVISIRAIEKLPGPKFKFPLG
(2)
TLYAHPTPVEHVKLIKRLSAENPTSGFRFWLGPVQAVC
(0)
VLNHPQAVRAILEDEPPKAPMLYRYIQPWLGEHSLLTLEGERWKNMRRLLTPAFHLHILHHYAP
(0)
VIVDASSIIIEKFLNHAKTKPEEDYDVFSDYALLTLDVICRAAFSHEGDPQRNPEDKYAVSIGQ
(0)
IAESIIARAINPKMMFDGVYYRSKEGKEFQELLDYVHDHADN
(0)
LIDKRAKEIEGLTLDDIGRTRPGGRTLLDFLDILLTTQDENGQRLSKEAIRHQCDTFLF
AGHDTTSSCLSWLSYLLSVNPEAQEKCRKEIFDAFGDEAPTYEDVQKKI
PYLTCCIKEALRMYPPIPGVARKLTKPVNVGSTVLQPGTT
(1)
AAVGILALHYNPTLWEEPTKFKPERFETGVKHDSYSFLPFSIGRRNCIGQ
NLALNEIRLAMCQILRKVVILPSAEKDYEPQPMSQIVLRSENGVRMRFK
(0)
AYEE*
>estExt_fgenesh2_pg.C_280122|Monbr1
27% to CYP4F22 human, CYP4 clan
second part from 275-500 is 36% to Helicosporidium sp. CX129156 CX128716.1 CYP711 like fragment
44% to estExt_fgenesh1_pg.C_280109|Monbr1
MLEYILYAVGGFVALCIGLVMYSLVPNPMGYFRMRRVFNTK
(2)
LQEYLKVPLWRPMDGSFHEMQT
(0)
DLIGFFKRQLDYGNL
(2)
FGVIPLWYVFDPNLILTKPEDMKQVLFGDDLQYYRDNTCFYVMHMVLGK
(0)
GIINVGGMEWRNQHRILYKAFAPDNLMYFRPAFAARARKMVDTFKAHAQSGEPIDLLKTMNEITLGVMIDTAFGNTLS
(2)
HEEQSEMRHHLMYVIKQTTNFVHQVPLLRYIFADHTQLKRRLGEMHSLVETSLIRRREGKSFGEVCEVKR
(2)
HMIDLIIEANHSESEDGYRMSDEVMRDNMISLMAAGTETTATAMTWTLYFLDKYPE
(0)
VYRKVREENMNIDLEHLAEPGDLTKIVPYLTQVIQESMRMCSPLGN
IPGRRPFKDMQVGDLVVPAHVPMLTFAHQIHHNPQIWDAPE
(0)
EFRPERFAKDGEASRDRLRFQPFGTGRRYCLGKYMAMAEMQ
(0)
VVLSHMVRDLRFEYAGTAEGITPAFRPPTIQPRDGMPMHIKLA*
>estExt_fgenesh1_pg.C_280109|Monbr1
Seq revised, still missing N-term
30% to CYP208A1 Streptomyces globisporus, bacterial-like seq, CDS in 7 exons
44% to estExt_fgenesh2_pg.C_280122|Monbr1
MWMAVALVVVAGVVLVPLLLLYPFLPDLRQWHRVRQVYNAR
(2)
LRTYLKVQPWSPLAGSFTALMQ
(0)
RYHGTMQHLMDLNKG (2)
494705 FGAFVLWFAYEPNVVLTRPEDIKQLLTDNDLNYTRDNSSFALFNRFIGQ (0)
SIINANGEEWRRQHRILYKAFSPDKLVGFRSTFANRGERLAHSLLELSQA
(1)
EGSVKLGHWLGKMTLSVIIETAFGNTLR
(2)
PDEQDLMAQEFIYMTNEFTNFAHQ (0)
IPVLRHVLTDTQRLETGFERLYGLVDQAVARRRSGEQDDGQIKLIDLI
LEANGEEDDRSRLDDAAMRDNLLLLLAAGTETTATTLGWLLYELAVNPK (0) 493503
493381
ELAKLRAENAALDLEALEQPGDLSKLVPQLTNAIHEALRLHEPLGGFPSRRPLHTTQ (0) 493211 (GC bound?)
493013
IGDLVVSPGTPVLSMMSAVHRNPEYWAEPE (0) 492924
492833 VFRPARFAPGGEVEQNPFQYFPFGKGRRYCLGKYFAIAELQVVVSHLLRRLDMEYLGDRA
TMRVVYKPPVLHASDDLPM 492597
RFFARRTSRRGSKLVEAV*
See fgenesh2_pg.scaffold_28000122 [Monbr1:28763] for an alternative model
CYP51 clan
>CYP51A1 estExt_fgenesh1_pg.C_250046|Monbr1 51% to CYP51A1 danio
54% to DC515864 cDNA Library, Monosiga ovata
MDKLPAAVVPYAEAAQEALVSLHETLGRPSTTTYLATSAVALGIWKYIRGNYLRPAKAPPKVPSQVPWLGCIFAFGQSPI
EFMIDCYKKYGPVYSFVMFGTEVTYLLGSEASSRFWSTHNDVLNAEDLYANITVPVFGEGVAYAVEHKIFSEQKQMAKEG
LTIDRFKAYTSMIEKETNGFIERWGQTGTIDFFDNMARMIIYTATRCLHGNETREDFDEDVAKLYHALDGGFTPQAWFFP
PWLPLPSFRRRDRAHRELKERFYKIIDRRRQKAEEGTQTDLMHTFMTTPYKNVEDGRHLTTDEVSGMMIALLMAGQHTSS
TVSSWLTCFITTTPGLEEKLYQEQVELFKRRPGPLSYEHINEMPLLWACIRETLRLRPPIMSIMRRAREDYKVTVNGVEY
VIPKGSQVCVSPTVNGRLEDEWEDPNTFNPYRFLKEEDGKLVVTEGEQITKG
GKFKWVPFGAGRHRCIGFGFAQVQIRCI
MSTILRKYKLEMVSGKLPPINYTTMIHTPTEPIVRYTRR*
See fgenesh1_pg.scaffold_25000047 [Monbr1:10963] variant model
See fgenesh2_pg.scaffold_25000059 [Monbr1:28307] long model,
same as estExt_fgenesh2_pg.C_250059 [Monbr1:33827]
Plant-like
>CYP704 fgenesh1_pg.scaffold_8000052|Monbr1 27% to 4F2 human, 34% to 704B1 Arab.,
37% to 94D2, 44% to 704B2 delete Cyan, 42% to CYP704F1 Physcomitrella
37% to CYP704E1 Physcomitrella
32% to CYP745 seq
MLIPLLLIAAIAGLLHIWQKRLESPNAHMAAGCVPLLGHSLLVQKHLSKILEWFWANSKAANFKTWQLK
IIGQAPYVCVLDPVVVKHVLQDNFDNYIK
(0)
GRLFRDRFTELLGRGIFNADGPEWSYQRKTAAHLFKRRELSGFMTE
(2)
VFSDHGRLVCQKLDEASRTGTVVDLQ
(0)
ELFYRYTLESIGKIAFGVNLGCFENDRVEFAVNFDTAQRIIMERVLDPAW
(2)
EIRRWFNFIHPDEIELRRCVKKLDGIAH
(0)
GIIQDRRKIGDLSDREDLLSRFMAVKDEQGKPLDDERLRDVVMSFVIAGRDTTANCLSWVFYELHQHPEVFAK
(2)
LKKEVDTVLDGAEPTHDLVHS
(1)
GMPYLHAVVKETLRLHPSVPK
(0)
DGKVAVKDDVLPDGTVIKAGTIVIYLPWVMGRMES
(2)
LWEDATRFNPERWLNQTTEPSHFQYTAFNAGPRLCLGMHMAYIEAKLLVAMLVQRFDFEVK
PNQEFTYTVTLTMPLKNGLLVTPTKRA*
See estExt_fgenesh2_pg.C_80053 [Monbr1:32130] same as estExt_fgenesh1_pg.C_80051 [Monbr1:36798]
fgenesh2_pg.scaffold_8000054 [Monbr1:24820] different model
>CYP745 estExt_fgenesh2_pg.C_170049|Monbr1 28% to CYP4F8 human,
41% to 745A1 Volvox, 42% to Chlamydomonas CYP745A1,
36% to CYP5160B1 Ectocarpus
41% to CT887000 Phaeodactylum tricornutum EST
47% to CYP745B1 e_gwEuk.8.160.1|Ost9901_3 Ostreococcus lucimarinus (marine microalga),
probable food for Monosiga brevicollis, 45% to O. tauri 745B1
41% to DC507813 cDNA Library, Monosiga ovata
MAAAAANQAMDLAHQGLDWAFHRVVLQAATILPPWLLRHVPANWQALTPAKLAVVTPAAFIVARIVMHQ
(0)
LHQLRIKFALRNVQRAPQWLPIVGHTWALLIGTPWDVFHSWFETTGADLLKANVMGENSLLVYKP
RHLRQIMNSKLHNYPKDVDFAFKTFMDILGSGLVSSNGALWKKQRTLLSHALRIDILEETM (0)
PVAKRAIDRLSEKLEAIRGTGEYIEIAEE
(0)
FRVLTLQVIGELILSLSPEESSRVFPDLYLPIMEEANRRVWEPYRAYIPTP
(1)
GWFHYNRTLHELNNYLCNLIRKRWADRQAAVAAGTNEDDKDILEVIMADIDPA
TWGEGTVLQLRDEIKTFIMAGHETSAAMMTWACYELHRHPEVREKFIQEAQ
(2)
AVFGTGIAADAEGADKFTKTPLPANEQLKGLQYTMNVLK (0)
ETLRFYSLVPVVARVTVEDDVLDGHVVPAGTRILISLRSAHDNPETWKDPMTYRPERFDEPF
DLYAFMPFIQGPRNCLGQHLALLEARIVMALLMLRFKLTPRDESCGERHPSIVPVCPKNGMWVRVD*
See fgenesh1_pg.scaffold_17000049 [Monbr1:9776] different model
Included here for comparison are two ESTs from the same CYP745 gene
in the colonial choanoflagellate Proterospongia sp.
Proterospongia ESTs from TBestDB
http://amoebidia.bcm.umontreal.ca/pepdb/searches/welcome.php
found 2 ESTs in 1303 ESTs = CYP745
>PRL00000480
N-term of CYP745 like gene
47% to DC507813 Full length cDNA Library, Monosiga ovata Dec 18 2007
41% to CYP745 estExt_fgenesh2_pg.C_170049|Monbr1
45% to CYP745B1 e_gw1.08.00.85.1|Ostta4
53% to CYP745B1 e_gwEuk.8.160.1|Ost9901_3 Ostreococcus lucimarinus
yellow is probable untranslated region
PRVR LRGDSKQEKGRRTMSG
MAASLSGLSLQNAGGKAKEAWTGLVDTLRDGFTHRPVVTA
LKCAGALVAIKVAVDTTRYVAEQWSIGSALRSLPRATGSLPFLGHALRLNVESPWDVMET
WIKSFNYNVMALDFFGKTGVVISDLERVRRVFNSKQRNYDKDLELSYSSFLDLLGNGLVT
SGGALWYKQRTLLGHALRVEILEETAPVAKRAADRLCKRL
>PRL00000755
C-terminal of CYP745 like gene
55% to CYP745 of estExt_fgenesh2_pg.C_170049|Monbr1
47% to CYP745B1 e_gw1.08.00.85.1|Ostta4
46% to CYP745B1 e_gwEuk.8.160.1|Ost9901_3 Ostreococcus lucimarinus
PRVR vector seq (same as seq above)
TSAAMMTWLTYELTQNPDKREKFLKNASAVLGTGKGKGKTPAEQFDNFTLPDRKEI
NKLTYILNSLKETLRYYTLVPVVTREAVEEDDLCGVRVPAGCKVFIHIKAVHNNPEVWEK
PRTFMPERFEKEHDPCAFLPFIVGPRNCLGQHLALLEARIVMALMMLRFDFEPAQSNVGE
KHGRTVPVCPKHGMWLN
These two ESTs are probably from the same gene. Missing 142 aa in middle region.
Newest M. brevicollis seq found may be in the CYP7 clan
>fgenesh1_pg.scaffold_11000095 [Monbr1:8471] MODEL IS FUSED WITH DOWNSTREAM GENE
like CYP39 in CYP7 clan. Note: this translation requires a GT-AT boundary at PPD/RWER
Expect = 8e-22, 21% identity
394781
MWILVALVVFCAVAAQLSVLLQQKSISPDIPMIGGALPWIG (2)
CGLRFIKNPRQFFEDLRVKHGDTFGIYMFGCRMLCLFDQKGVDQLYRMRDVDASFFEATKGLLSLKLPPE
(0)
LLQDSSLKKFHQALKPKLMPHYIR
(2)
YAHDVVTDHVARTVRAGEQTVSLFPYVKSLVHKI
(1)
GLACWLHPAANSPERFKHIVQAYEQLDPEQ
(0)
GFANGAEFLKTMLSGKRAERRAIDALQTAVAELCQSLGT
(0)
NLVSMYEAQPEDAMPSQRHRAIATNLFHFMLASQANMYAGMAWTLIHLLT
MADQQHLRLVRAEVLHAQQAHGEDFLRTQASLDSLAFLDACIVETLRVVQQSITLRKVMRPCTLQMDSGS
AVLPPPWYLLTLLSVTNMDPATIDPAVAKSSANSPPD
(2)
RWERVTLARPSPNPLTSTFGHGYHACPGRTFALNMSKIVLAQHILAFDL
VPQFERATVPVTSVGALARVEHDCPLRLIPQV*
392554
Fusion sequence at C-terminal has been deleted here.
See fgenesh2_pg.scaffold_11000089 [Monbr1:25689] alternative model