8 Monosiga ovata P450s
D. Nelson Feb. 20, 2008
Assembled from ESTs
>CYP51A1 DC515864 Full length cDNA Library, Monosiga ovata Dec 18 2007
DC505021 DC481719 DC499362 DC515864 DC455628 DC461341
DC454243 DC488261 DC503517 DC463918 DC518896 DC496720
DC481661 DC475708 DC470881 DC457948 DC508866 DC473892
DC461250 DC476196 DC482539 DC489615 DC514865 DC490097
DC454294 DC469866 DC502657 DC513272 DC505282 DC458836
DC477187 DC482903 DC502887 DC495059 (some errors) DC451008.1
EC165400 02-JUN-2006 cDNA Library, Monosiga ovata Dec 18 2007
54% to CYP51A1 Monosiga brevicollis
MOC-079L11 3 prime, DC505020 = 5 prime
MQHIGSLTLADIGTRITEYVSTANRTHLLAGGVAALVTLNWIKK
TYFRSSKLPPHAGSNFPFFGSMVSFGQHPVKFLERCYKEAGPVFTFTMLGSEVTYLAGGE
VTDDFWSSKNDDLAAEDLYANLTVPVFGKGVAYDVPHPVFSEQKGITKKGLTQQRFAKYT
AIIEKETLAYIQRWGESGTCDLFKDLSELIIFTATHCLHGEELRSTFDESVAALYCDLDK
GFTTAAWFLPNWLPLPSFRVRDRAHRELIRRFTHAIRDRKAKG
DVPGHDDMLETFMTATYEKVNDGRAFTESE
TAGMLLALLLAGQHTSSTVSSWLGFYMAQNPALQQDLFEEQQRVMGSQSGPLSLEAINSMPHLWAA
IRETLRLRPPLLTLMRNCRRPMEVKVGEKTYVIPKGNQVCVSPALQGVLEDLWDEPEKFD
MNRFLKKDASGTEVVTDGTQVAKGGKLKWVPFGAGRHRCIGFDFAQVQIRAIWSVILRNY
EITMTDVPEIDFTTLLQLPVHTKVHYRRRPTAAKA*
>DC486766 Full length cDNA Library, Monosiga ovata Dec 18 2007
MOC-053F16 3 prime, DC486765 = 5 prime
like Helicosporidium sp. CX129156 CX128716.1 CYP711 like fragment
~50% to DC460778
MSLTGIAAPLLPWQRSLVLAIATPILSIAVLLVSILYPSWRSPLRRALPSP
PAPSLLLGHLPAMGKTGHSTLFAWAKKYGGAFFVRLGCWPAVIITDVELIKSICITQFKD
FHDRSQAFRPNPHVRHLLWAQGAYWKACRNAVSPAFSRSNIAGFGAQMNQSARGLVDRVG
ALAGTGQSVDIMRVVGAMTMEVIGGTALGVDLSASNREVSEKIIAAADLLFGSNVAGSTI
TSFIRFLVPALLPLWLRVPALTA
(34 aa gap)
GFLQLMLAAKDPESGLQLTDKEVIEQCRLFMLAGFETTANTLTMAIYLLARNPDAEARMV
AEIDELFTGTEVDYDSVQRFKYVDCVLQETLRMYPPGSNLLREATSDTVLGDLEVPKGTT
IIMPMYTVHRDPALFPEPESFRPDRFLEGSALAPTDKYANLPFGAGPHMCIGNRFALAEA
AVTLIHLYKNFTLRLAAPLPDPLPLRQSITLSPAVPIPVRFDRR*
>DC460778, DC458147, DC494197, DC496896, DC457970 cDNA Library, Monosiga ovata Dec 18 2007
MOC-015K20 3 prime, DC460777 = 5 prime
~53% to DC486766
MAVFSKLGISLTSIPALPRPSWAALAAAAPVVLTASAVAYMVLP
GLLSPLRGQMPSPPTVSMLLGHLPAFTKRSHLVLLQWTKLLGKVFFVRLGAWPVVVITDI
ELIKAVNISQFKDFPDRAPCFVARQDIRHLLWARGAYWKACRNAISPAFSRNNLTGFAQQ
MNESAHALAARLGRAADAGEVINMMDVVGNMTLQVIAGTAFGVRLGPEAAGQTAAFVAAA
KELFGSSVQGANLHGILRFIFPFLADIYTYIPPT
(small gap at I-helix)
YETTANALTF
STYLLARNPAAEARLLAELADVGVPDYISYDDSQKYKYVECVVQEALRLYPAGNAIVRQA
VCSTTLGPYEIPKDTCIVTPLYTLHRDPDLFPMPEEFCPDRFLDGHPLAPSDKYAHLPFG
LGPHMCIGYRFALAEAVIALVRIYKDFTLRLDAAVPDPLPLQQGITLTADVPELFHVERRAP*
>DC501006 DC513559 DC511491 Monosiga ovata Dec 18 2007
MOC-073O13 5'
DC513560 = 3 prime
DC511492.1 DC513560.1 36% to estExt_fgenesh2_pg.C_280122 Monosiga ovata
~45% to DC499869
MDILLTIFLSILGMLLFGLLCILAFCLPKPYEYL
LSRKIINTKTGKPLSNNNIKLPFGDLLSCLKDVVGAERSRLKYIPENAIGWWALNNYDVQ
LLNADWLRELLTMPDVKFNRSLESFKVLHRLLGTSLLGNQGEAWARMHKILYKAFQPSAL
ASYVPFFISQTKEIANNIDASIATNTSYDMNLAFNDLTLAVMVDAGFGKAVSPADRKIIL
HAFRYLFMETQNPLHDIPILSKLPFPSNLECERQFRALHETADRI
(very small gap)
MHASGEYQEAHDGKYVVDMLLDAHEEEGKLSDIELRDNVVMLLVAGSETTGTTLTWVLHY
LTVYPDIKARVLAEIDTLDISWEHFNVTNFDTGMPYLTMVVSETLRDTPSIYGIPERIFT
EDGTVGDLRLPKGSRVGVSQYCMHHNQNYWSEPDKFDPERFAPEASAARHRFAYLPFGLG
RRQCMGKFFALNEIRVVIALLLKHYEFTYDASVGPVEVTWRPPTLMPKRGLPMFAKRR*
Note: a short EST matched this sequence in the N-terminal region.
It may be from a related P450 (from M. ovata TBestDB)
>MNL00001922_unclassified
N-term Genbank EC163715.1
Query: 63  CLKDVVGAERSRLKYIPENAIGWWALNNYDVQLLNADWLRELLTMPDV 110
           C++DV G + SRLK   E+ + W  L  Y  +LLN DWLRELL M DV
Sbjct: 1   CVEDVGGGDWSRLKQSCESVLDWRGLERYYCELLNDDWLRELLRMDDV 144
52% to DC501006 short sequence like N-term region in gray above
CVEDVGGGDWSRLKQSCESVLDWRGLERYYCELLNDDWLRELLRMDDV
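The 52% figure can be rechecked directly from the two aligned segments. The following is only a minimal sketch, assuming the alignment is ungapped (both segments are 48 residues) and that the Sbjct line is the translated EST; the function name percent_identity is illustrative and not part of any tool used to build this file.

```python
# Sketch: recompute percent identity for the ungapped 48-residue alignment.
# Assumption: both strings are copied verbatim from the Query/Sbjct lines.
query = "CLKDVVGAERSRLKYIPENAIGWWALNNYDVQLLNADWLRELLTMPDV"  # DC501006, residues 63-110
sbjct = "CVEDVGGGDWSRLKQSCESVLDWRGLERYYCELLNDDWLRELLRMDDV"  # MNL00001922 / EC163715.1


def percent_identity(a, b):
    """Return (identical positions, aligned length, percent identity)."""
    if len(a) != len(b):
        raise ValueError("ungapped alignment requires equal-length segments")
    # Count columns where both sequences carry the same residue.
    identical = sum(1 for x, y in zip(a, b) if x == y)
    return identical, len(a), 100.0 * identical / len(a)


identical, length, pct = percent_identity(query, sbjct)
print(f"{identical}/{length} identical = {pct:.0f}%")  # 25/48 identical = 52%
```

Only exact matches are counted here; the "+" positions in the alignment midline (conservative substitutions) do not contribute to the identity figure.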
>DC454381 Full length cDNA Library, Monosiga ovata Dec 18 2007
60% to DC499869.1, 46% to DC501006
MAVHDGHGVGSILALLGLGVLGVAAVGAACLVSFLLPSPAELRRRKALHNRITGNAL
PTSGFHLNPLSDLLDELSDPFGTSTVRRDMAKGAQSGAYGLWAFGQYYAMLSNADWLREM
LSMPDKAFRRSFQPFRTFSRLLDSSLIAAEGETWARQHRIMYKAFQPSALAGYVPLYARR
AHMLADRLAAVAAAGGREDMLLAFNDLALGIMVDAGFGAAVSADDFKILFDCFRYIMCEM
QNVVHDLPIIGALPLPTNVRLDHQFKALHAAA
>DC499869 Full length cDNA Library, Monosiga ovata Dec 18 2007
61% to DC454381
DC499870 = 3 prime
DC499870.1 35% to estExt_fgenesh2_pg.C_280122 Monosiga ovata
48% to DC511492.1 DC513560.1
~45% to DC501006
MVVWSSPSWPATPASALRLAGLAVLGSVALCTACAISFCLPSLRSYIISRRVRVR
STGALLPENQPGIPMKTILNEINDPIGENKRRLAAASGIGALGLWLLDSYSVLLANTDWI
RELLSHPDKIYGRTFMPFKTMDRIIGGSLIGSEGDDWARQHRIMYKAFQPSALAGYVPLY
ARRAHMLADRLAAVAAAGGREDMLLAFNDLALGIMVDAGFGAAVSADDYAIIFDGFRTAM
QETTNIIHDLPIIGSLPLPTNIRLENQFKRL
(gap)
GISAVAPDVLTRVRAEVDAVVGSVEGMDSSKLDGVLPYLTKVVNESLRLNPPVAGIVGRI
IKQDMTLGDITLPAGAGVGASTIMTHYNPANWDRPDEFDPERFGEEAVSKPPRFGYIPFG
QGRRQSLGRVFALNEIRVVLAALLKQFPFEYDDAGGPVPLQVPPPSLVPPAGMPMTVRARL
>DC514820 DC464533 Full length cDNA Library, Monosiga ovata Dec 18 2007
DC464532 = 5 prime end
Best full-length matches are to aromatase (CYP19), in the low 20% range
MLALASLLPTGLRSLYRKMLAMTELDPVDVQEDVLAHTVPQ
VAHQLLEKRDVAVVKLGPRRVVFTCNPAIADHVFKEKQDCYIQRLCEDEGVRELGMYQQG
VIWNNSAQWSSISKDVFHRAINPHALKEASRLARARAQLVFQSALQGQAAGGVDVLKCCR
LVTLHVTLQLFFGISTEELSAQVSAESVIADVCNYFKAWEYFLLRPASSTPPDEKTQAHR
DAINRLRTTVRIIIECARAKLAAAHADHS
(gap)
DGPPAPVATAPSASASASASAPPRASTFTVTRLLADLRRRRFEHMQQAVDTMPDDPEDRE
AWCAWLAAAHDAAVEASAKSDLLKALLHESLRFKPVGPVVIRQAVADDILPASASPHTGT
PMRKGDGIVIALDLMHRRADLFERPNVFDPGRFLQSTEDTDMIKFSQPSRFAPFGAGRKS
CVGKDLGMAEILQVCA
AILTAMAFDLGDNEPLADLETRWDIANQPTRPIALRACRPSLIL
CGPSSSGKTTLRKQLQAEGGWSSHASIE
>CYP710D1 Monosiga ovata (choanoflagellate, single-celled sister group to animals)
GenEMBL CO435081, EC169877.1 EC166517.1 ESTs partial sequence
53% to XM_813889.1 Trypanosoma cruzi 710C1, 43% to 710A1
46% to 710A7, 56% to 710B1, 49% to 710A14
Note: sequence ortholog not found in Monosiga brevicollis at JGI
DC485544 = 5 prime end
DC485545 DC517956 Full length cDNA Library, Monosiga ovata Dec 18 2007
MAQELLDAIAAWRPVATWQSALYATAALVSGYALYEQIRFRQWKQGMEGPA
LAVPLIGSIVEMVKNPYAFWENQRLRNPCGISWNSICGQFMLFSTQTDVTKKIFMNNGED
SFRLFLHPSGWKI
LGEHNIAFKHGPSHKALRKSFLNLFTRKALGVYLGIQERLVREHLATWKLVDEPTEFRLK
VRDLNLLTSQTVFIGPYLRSDEERVTFCNNYLLMTEGFLSFPIAFPGSGLWKAINARHAI
VDKLVAAARESKARMAAGADPECLLDFWSQRILEEIAEAKPGDEPAEHWSDWEMGNTMMD
FLFASQDASTASLTWTAAFMSERPDVLAKVQAEQKALRPNDEPLTYDMVEQLVYTRAVIK
EILRFRPPAVMVPAIAMVDFPLTD
TCVAPKGSL
VVPSIWAACMQGFPHPEVFDPDRMGPERQEDVQYRDNFLTFGVGPHMCVGREYAINH
LVAFLSLLSTTCSWTRIYTPESHTIKYLPTIYPGDCLIHLKPLQASA*