Red algae Galdieria sulphuraria
May 12, 2006 D. Nelson
http://genomics.msu.edu/galdieria/sequence_data.html
data from the Galdieria Genome Project
Michigan State University Galdieria Database http://genomics.msu.edu/galdieria
Barbier G, Oesterhelt C, Larson MD, Halgren RG, Wilkerson C, Garavito RM,
Benning C, Weber APM (2005) Genome Analysis. Comparative Genomics of Two
Closely Related Unicellular Thermo-Acidophilic Red Algae, Galdieria
sulphuraria and Cyanidioschyzon merolae, Reveals the Molecular Basis of the
Metabolic Flexibility of Galdieria and Significant Differences in
Carbohydrate Metabolism of Both Algae. Plant Physiol 137: 460-474
Weber, A.P.M., Oesterhelt, C., Gross, W., Bräutigam, A., Imboden, L.A.,
Krassovskaya, I., Linka, N., Truchina, J., Schneidereit, J., Voll, L.M.,
Zimmermann, M., Riekhof, W.R., Yu, B., Garavito, M.R., Benning, C. (2004).
EST-analysis of the thermo-acidophilic red microalga Galdieria sulphuraria
reveals potential for lipid A biosynthesis and unveils the pathway of carbon
export from rhodoplasts. Plant Mol. Biol. 55: 17-32.
>contig_454_Oct13_2005 39% to 3A5, no introns
130976
MASPNEIARWFLQTRTKLHSYLFSYCFMFLAPRIALVSPEAAKH
VMVKNVRNYVKPPMVRQGLSNLLGNKGILLAEGDDHARQRRIILPA 130707
130706
FHFDALVHLGPIFRAQGQQVVQRWLNRPEEAIDVHLDMTQVTMNVIALAAFG 130551
130550
YDPNTDSGQELYRAYRDIFTQRPPSRMLAMLFSLLPSWLLQSMPLSRLLRRQQSNVR 130380
130379
LVKKKVTEIVQKRREEYEALLVKDSNAMGKSTTNRDLLDMLVAARDPELEKKSSHLP 130209
130208
YLTDEEITSQALTFMAAGQVTTAVLLSWTLFELSIHPSAQEKLRQELQTMETTLSTQDIT 130029
130028
EMVQHLDKLEYLDVVLHESLRLHPPVLFITRQAVQDDEILGFPISQGAIVNIPIVALHRD 129849
129848
PEQWGPDAESFRPERFLSSDKNNVVIQRHAMAWLPFLYGTRACTGQRFAMLEAKTILFEL 129669
129668
LTKVSVRLQPGCEVKGYGMVSVPRDVRLQVVDLHKE* 129558
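The coordinate columns in these records appear to follow a simple convention: the number before a peptide line is the contig position of its first nucleotide, the number after it is the position of its last nucleotide, and a trailing * counts as the stop codon. Descending numbers would then mean the gene lies on the minus strand. This convention is inferred from the numbers themselves, not stated by the source. A minimal Python sketch of the consistency check, using two exon lines of contig_454 above (the function names are hypothetical):

```python
def span_nt(start: int, end: int) -> int:
    """Nucleotides covered by an inclusive coordinate pair, on either strand."""
    return abs(start - end) + 1


def span_matches(start: int, end: int, peptide: str) -> bool:
    """True if the span holds exactly three nucleotides per residue.

    Assumption: a trailing '*' stands for the stop codon and is counted.
    """
    return span_nt(start, end) == 3 * len(peptide)


# Second and last peptide lines of contig_454, copied from the record above.
seg = "FHFDALVHLGPIFRAQGQQVVQRWLNRPEEAIDVHLDMTQVTMNVIALAAFG"
last = "LTKVSVRLQPGCEVKGYGMVSVPRDVRLQVVDLHKE*"

print(span_matches(130706, 130551, seg))
print(span_matches(129668, 129558, last))
```

A line that fails this check may simply have lost a coordinate in formatting, so a mismatch is a prompt to look at the record rather than proof of an annotation error.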
>CYP710B like seq from genomic DNA, 85% to EST-based seq. EST = A4_15B03
contig_966_Oct13_2005 29466-30995
29466
MDIVSFNSLSSGNLIILLVVITMICYFILEQLHYFWWKRSSKLPGPSFTLPFLGSIIEMV 29645
29646
KNPYQFWEKQRLLDPQGVSANFLVGRITLFVTDSALVRAILNNNSARTFLLALHPSARLI 29825
29826
LGKNNIAFMHGQEHKELRKSFLSLFTRKALGVYLTLQETSIKSHLQKWIQLSKENDEMEM 30005
30006
SFLCRDLNLETSQYVFAGPYIGEQRDQFCHWYITVTKAFISAPVFLPGTNLWKAYFARKK 30185
30186
IVALLENAVIQSKKYIGNGGTPRCLLDFWTQRVLEEMEEATQQDKEMPSYSNNRKMAETL 30365
30366
MDFLFASQDASTASLTWTLALMSDYPDVLKKVQEEQKRLRPNNEPLSFELVESMTYTRQV 30545
30546
VKEILRYRPPAVMVPQNAMGSVPLTENVTVPKGSFVMPSIWSSCMQGFPDAYKFDP (0) 30713
30771
DRMSPERQEDIKYRQNFLTFGIGPHVCVGREYAINNLIAFLALIS 30905
30906
TECKFQRYRTKKSDDIIYLPTIYPGDCLMKFV* 31004
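Where a peptide line ends with a number in parentheses, such as (0) before 30713 above, the next start coordinate is not adjacent to the previous end. The parenthesized number presumably records the phase of an intron at that boundary; that reading is an assumption, not something the source states. Under it, the intron length implied by a gap can be sketched as follows (the function name is hypothetical):

```python
def intron_length(prev_end: int, next_start: int) -> int:
    """Nucleotides lying strictly between two exon boundary coordinates.

    Works for ascending (plus strand) and descending (minus strand) numbering.
    Adjacent coordinates, i.e. no gap, give 0.
    """
    return abs(next_start - prev_end) - 1


# Gap after the "(0) 30713" line of contig_966: next exon starts at 30771.
print(intron_length(30713, 30771))  # 57
```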
>CYP710B1 related seq, ESTs HET_11H09, HET_31E01
MQLTEFDSFNKFLSGN
LVFLGVSIALVCYLLFEQLRYFWWKRSSKLRGPSFTLPFLGSIIEMVKNPYEFWEKQRLL
DPQGVSANFSLGRITLFVTDSALVRAIL
NNNSAKTFLLALHPSARLILGKNNIAFMHGPE
HKELRKSFLALFTRKALGVYLTLQETTIKSHLQRWIELCKEKSPLEMSFFCRDLNLE
TSQYVFAGPYIGERRDEFCSWYITVTKAFISAPVFFPGTNLWKAYFARKKIVALLENAVI
QSKRYMADGGSPRCLLDFWTQRVLEEVEEAAQQGRSVSYANNRKMAETMMDFLFASQ
DASTASLTWTLALMADHPDILKRVQEEQKRLRPNNEPLSFELVENMTFTRQVVKEILRYRPPA
>contig_981_Oct13_2005
EST = A4_11A04 30% to 4F12
75266
MLQLWIVLVTFSCLFLYVFILPKWRNRHIPGPRPSLLLGNVSELSRQGGTAPLVFERFRKQYGDVFQIWSFYRQ
75044
IVVISHPDDIKYIIVTKNFPKAEEFNLSLSPLAGRGLLTVGKSQHQERRRAISKHFNE 74871
74870
DFLRQLHRHMRVELMILLSKLQQVTERKESIDFDKEATSYTLDVMCRTGFGCTANTQED 74694
74693
ASHPISRAVNVSLREMYHNLVAYPIRNCFGLYSSPALKNATGVIREFASQVIEARR 74526
74525
TESEEDKTRRPLDLLDIFLKMDNLSDQNIIAEIATFLVAGHDTTSHT 74385
74384
MSWLIYEVCQHPEIEQKIQQEVDTIWGDRQDWMLSFEEIGQLEYLNKVWKETLRKHPVAA 74205
74204
TGTLRRLDTDVTLPSCGMLLRKNTAILVPIYLVHRNPEFWPDPETFEPERFTRENTMKRH 74025
74024
PFAFQAFSNGPRNCIGQFFATHEALTTLSSLYHFFTFRLACRAEDVKPYHAMTMKP SVGKVSEDAKGV
73820
SEYVKLPVWVTPRNTMAHLREE* 73752
>contig_989_Oct13_2005 41% to contig 981
47117
MMMSCLAVSLLQLSNLSQDWSRVFKLFILAALFWTVFKFLKYVYPYWRFRNI
PGPPPKWPVGNIFELLRKPGQEHRILLQYA 46872
46871
KQYGPTFQLWYLNRRTIIVANPEDAKFVLATRNYPKSPIFCRCFSPLGHGLLTLSQEEH 46695
46694
PVQRKAISQRFNEEFLQSLHHHLTAELEVFMAQMDALCDTERVVDLDALISALTLDVIAR 46515
46514
TAFGVSFTAQTSQHHPMPHAVLTLLDELVNNMIFYPYRFWLSHITQKRLNEAINVIRKFC 46335
46334
NMVIDLRLQESREEKSNRVRDLLDIFLESDETRDNVIAHVATFMLAGHD (1) 46188
46137
STSHTLSFCMYEIAQNRDIERKLQEESDRFIVAQDRIVPFDQVGHLDYTRMVWNEALRTH 45958
45957
PAAANTSVRCADRDDVLPGSGIPITKGTGLMVSSYLIHHLPQYWENPDHFIPERHTKEAV 45778
45777
RQRSPYYFLPFSRGSRNCIGQFVANHEALTILSTIYKRYEIRLAVGAQEVEEYFRVTMKP 45598
HCRFYVQGKKDPSLDAHLGLPVKIYSRKCYS* 45502
>CYP51
contig_1016_Oct13_2005
188929
MLSQDSIALSTLTSSLEAYCWALVYILSTILFFGILWRITGSFFLSKLG
IAREVKGQQLPPTYKEGLPLVGNLIAFAKGPLNVVQRGYQSCGDIFTFK
(0) 189228
189255
VFHKHITFLVGPKAHEIFFQGTDDE
LDQNEVYAFSVPIFGKGVVYDAPLEKRLQQLRIMSAALRPARMYGYVDQMVLEAVQFFRK
WGDQGQVDILESLSDL
IILTASRCLMGREVREQLFEKVSKLYHDLDQGMQPISVFAPYLPISAHRKRDKAREEMVQ
LFRTVIQNRRRRNVKEDDMLQTFMDASYRDGSRPSEYEVA
GLLIALLFAGQHTSSITGSWTGMLLLRNKDVFERVKKEQDTIIEEHGDELNYDVLSKM
NLLHLCIKETLRMYPPLILLMRKVLKPKFYKEYVIPENDIVMVSPAASGRLENVFKNPNA
WDPDRFGPNREEDKKAPFSFIGFGGGRHGCMGEQFAYLQIKTIWTV
LVRSFDLEPIGDLSQPDY (0) 190415
190469
NAMVVGPRPPCLLKYRKKKDSFLDRVSLYA* 190555
>contig_1041_Oct13_2005
gene model c1041_g24.t1 looks like two genes run together:
dihydrolipoamide acetyltransferase (E2) subunit of PDC and a P450
c1041_g24.t1 class=CDS position=contig_1041:66689..63211 (- strand)
N-terminal not clear; MDHK might be the start, or one more exon could be upstream.
64863
MSLMKRVMFLGGQVARWLVNGGLLSLVFVDLAFSRWSLEDLWCTPPSQSFNCIGQGR
(1) 64693 (alt N-term)
64600
DHSMDHKSKQVIFVYIIFDCNLPSLPGPSPWPIVGNCIPLSSN
LYQTLYQYVEQPISLYFIASTPFVVVTDEAAVR
(2) 64373
64319
KVLGSGMYQKPKYFGYRSSTIRYSVEMNQKLILTNEQMRQQQADSSRKA 64173
64172
LKVMIDSKVSDIIDGMIEAAEAVVHAVDGREQVENIRRKVIELNLNVLFGYKNDKDV 64002
64001
GSLSHIIFEAGKEFILRTVNPFRIGWRWMANFRFFQ (1) 63894
63833
YVFSLITIGRRVCQHMDSQPATWVHGWVGKVGKIGKLGKVVGLIMASSQTVPTTCLWLLFLLSK () 63642
63641
YPQVVEKIREETSRVLHSTKKQSMEEFTVDDLNELAYVDCVVKECLRLYPPFPLLQREPE (0) 63462
63405
MDDILENVKIPARTPVYIVPWLLHHHPKYWKQPEDFIPDRFMYNASHGDAPSDFVYIPFGRGNK (2) 63214
63161
MCAGYHLALLELKILTIYVCQYYDWKCSFPQGKEPVSKKYPIETITHSSCNRFFFIMQLLSIGNVS*
>contig_1062_Oct13_2005 30% to 39A1
26036
MWIGLLLFFVLVFTLYLVRQNTCTGNKANLSYSPVCKGLPLLGSALEFGK
NPLKFLQECRKQYGDVFTVLLPGRRMTFIFAPTQE 25782
25781
LRKIFFNGSPNLISFTAGVEPLTCRIFGISKKGFSMAHRSLLTTLRSELGAKHIPQLAHR 25602
25601
LINRYLFTFRTVWGKEDEKEASNLLTETLSDASLRVIFGDEFANASPSLFKDFVDF 25434
25433
DEWFELAATPLLPHFLLRPFVKSRRKLLDTISQNWKYTKNAPIHKLTE 25290
25289
AYGNDGNVPSLLLSALWATWSNVSPTSFWTLTHILADEKAKVKVLAEVEKSCPLLLSSKT 25110
25109
ELSLEWIFSNLPFTAYCVSETLRLYASVVDIRKVVENLEFREFIIRKGDYLCISPAVSHR 24930
24929
ETTLFPQSEDFIPDRFQKQGTHPNAVFDKDLLTFGGGFYKCPGQSFAMVEIVLLIALVFY 24750
24749
LYDIQLVDRVPKMKESQSVGIKKPSCSCRIHYLWKRRLAGMEEI* 24615