Four P450 fragments from Tetrahymena
D. Nelson
4/5/2004
These sequences resemble CYP4V, CYP3A and CYP46, so the 3 and 4 clans probably
had a common ancestor, and these seqs derive from that same common ancestor.
>Tetrahymena
P450 seq 1a N-term (partial 291 aa) BM400694.1
MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIK
KYEDVDYFVSHQCDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQQYYEKDTFYIGNV
LRCAPQGIAFVEGEQWKKARKMFSQAFHFEYLTSLAPLIEQIASKVFNQAMESSEILANY
DPLVYSQKITGQVVIATFFGEQVNEKKFRGMDLVSALTHMLNLLGEQSMSLQ
YFLFGADF
FKLRLTQSQRYVDDIIEEFRSFLTDLIEGKHQKLSQKLKEYGKIVSLPFSLESLHLRNNA
>Tetrahymena
P450 seq 1b BM399152 N-terminal probably same gene as seq 1a
MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLL
GELVEIKEAIKKYEDVDYFVSHQCDENPDLRLYVVNLGSKIKLRLVDPDLMRDXFLNQQY
YEKDTFYIGNVLRCAPLRYS
(fs)
HLQRAKQWKKARTMFS
(fs)
QAFHFEYLTSLAPLIEQIAFKSL
(fs)
NQAMESSEILASYDPLVYSQKITGQGVIATLFGEQVNEKKX
(fs)
RGMD (fs)
LVSALTHMLNLL
>Tetrahymena
P450 seq 2 N-term (partial 214 aa) BM396441.1
MVSYFALAGLAIVLYILYVFIINPYLQYRK
YLKWGKGSFYPFVGVFYGAGLRVKQYKDVDHHLKHMYDDGSDPKIYVENNATGAIIKISD
PEYIKEFVQLENKAYQKTTLLIDNIIRLVGQGIIFSEGPQWKKNRNVLSGVFHFEQLSKR
VPSIEKITKEVYKRYIDSGNVKNVDVIELFQEITAEVVSKPSSVIFQRLILPWHEFAGSS
SYLI
>Tetrahymena
P450 seq 3a C-term (partial 293 aa) BM399816 with frameshift (fs)
KTVGKRATRGGAYNSVGKQISTPFYFLFRTNFFKWGIRESDRELNKQIKEFRQMIGDIIN
ERIKEEEELEKRGEQTTKEDLVYYLKKNNLRGVLSLDEIISEFMTFYVAGMDTTGHLCGM
AIWFLTQHPEIKKKLQEELDANTDYSQNGLLKLPYLNGVIQETQRLYGPAGQLFNRVALR
DHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHS
(fs)
YTFIPFNAGPRN
CIGQHLALVEARIMMYYFMKTFDFESDHNFEMVLNKASQQTSRQLRIILAQEI
>Tetrahymena
P450 seq 3b BM399815 BEGINNING = SEQ 3
THIS IS PROBABLY A POOR QUALITY VERSION OF SEQ 3
THE TWO ARE 85% IDENTICAL AT NUC. LEVEL, 5 PRIME ENDS NEARLY IDENTICAL
TREATED AS SAME GENE IN TIGR GENE INDEX FOR TETRAHYMENA
AYNSVGKQISTPFYFLFRTNFFKWGIRESDRKLNKQIKEFRQMIGDIINER
IKEEEELEKRGEQTTKEDLVYYLKKNNLLGVLSLDEIINGFLNF (FS)
YLAGLVPLGHLGGCGLWVPIYKHPRIQKRNSEKNLD
ANSDYSQKWSPLKLSLLNGSYLRKLQRLLWTPLG*LFNP
VAPQEPHAQGLPIQKGIIVRPXXX
SVHRLLNI*RPHSFSLKKVYKICHSELYPFSGPERLA
PRLNKLMYIFGDWLKNNEGQSLNRLKISLTFP
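The 85% figure quoted for seq 3a and seq 3b refers to a nucleotide alignment that is not reproduced in these notes. As a rough sketch of how such a figure is computed from two already-aligned strings (the function name `percent_identity` is hypothetical, and the example uses a short peptide stretch shared by seq 3a and 3b rather than the nucleotide data):

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Count identical columns in two pre-aligned, equal-length strings.

    Gap columns count toward the alignment length but never as matches,
    which is how BLAST reports "Identities = x/y".
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    a, b = aligned_a.upper(), aligned_b.upper()
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    return matches, len(a), 100.0 * matches / len(a)

# Stretch from seq 3a (RESDRELNK...) versus seq 3b (RESDRKLNK...)
matches, length, pct = percent_identity("IRESDRELNKQIKEF", "IRESDRKLNKQIKEF")
print(f"{matches}/{length} ({pct:.0f}%)")
```

The sequences must be aligned beforehand; this helper only scores columns and does not perform the alignment itself.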
>Tetrahymena
P450 seq 4 C-term CF653700 (168 AA)
ICLWVLAQHPELQQKIRAEIDSVIQTFDDLKHEXLNKLEYFNAFFKESLRVYPTAPQVIP
RVSARDHMVDDFPIPKGAFVSNLTIQYNEQKFPLLCKDIDTFNPDRFLDKNIIQDHFSFI
PYSAGPRNCIGQHLALIEAKIMIAYILKNYVVLPNEEHQKVRFNHLFL
>Tetrahymena
P450 seq 5 BM400871 and BM400870 C-helix to I-helix
GELTRWRRSKRNFLS (fs)
LFHFNALKNRVLSSRRLPRSSWATLPSDGKTPITIIEELQNITSE
VVIQTFFGENLKGMTVNGLQPSVEISKIIGDGFSYKANSFAYFLKLMVFGQEKASRVLNT
TFEKNFLKRVENYNQFIEGIVDKRLSELEKLTDTSKVDENFLNLYLLEYIKQQKALKENP
KIYADYEIIPKREIVHQFTTFFFAGMDTTANQTGICL
Note: GELTRWRRSKRN at the beginning of this sequence is found at the beginning of many Tetrahymena EST sequences.
VecScreen did not identify it as vector, but it is probably not part of the protein sequence.
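One way to handle the recurring GELTRWRRSKRN leader when collecting EST translations is to flag and trim it before comparing sequences. The sketch below is only an illustration; the helper name `strip_leader` is hypothetical, and whether the leader should be removed at all is the judgment call described in the note above.

```python
LEADER = "GELTRWRRSKRN"  # recurring EST leader described in the note

def strip_leader(protein, leader=LEADER):
    """Return (sequence, trimmed_flag); trim only an exact leading match."""
    if protein.startswith(leader):
        return protein[len(leader):], True
    return protein, False

# First segment of seq 5, before the (fs) mark
print(strip_leader("GELTRWRRSKRNFLS"))  # ('FLS', True)
```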
&&&&&&&&&&
>Tetrahymena
P450 seq 1 (partial 291 aa N-term)
MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIK
KYEDVDYFVSHQCDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQQYYEKDTFYIGNV
LRCAPQGIAFVEGEQWKKARKMFSQAFHFEYLTSLAPLIEQIASKVFNQAMESSEILANY
DPLVYSQKITGQVVIATFFGEQVNEKKFRGMDLVSALTHMLNLLGEQSMSLQYFLFGADF
FKLRLTQSQRYVDDIIEEFRSFLTDLIEGKHQKLSQKLKEYGKIVSLPFSLESLHLRNNA
>5009-0-77-F07.t.1
full open reading frame, no stops
ATATAGAATATCAAATGCTATAATATAAA
ATGTTTATAGAAATATTAATTATTATTCTTT
TTTTTGCTGCTTTAAGGTTAGTAATTATCCCTTATTTCAAGTTCCTTAAGTACAAAAAAT
ATGGGGATGGCAGATTTGTTCCATTATTAGGAGAACTAGTAGAGATAAAGGAAGCTATTA
AAAAGTACGAGGATGTAGACTATTTCGTAAGTCACTAATGCGATGAAAATCCAGACTTAA
GATTATATGTTGTCAATCTTGGCTCTAAGATAAAGCTTAGATTAGTAGATCCCGACTTAA
TGAGAGATTTCTTTTTAAATCAGTAGTATTATGAAAAAGATACTTTTTACATTGGCAACG
TCCTAAGATGCGCACCTTAAGGTATAGCATTTGTAGAGGGCGAGCAATGGAAAAAAGCCA
GAAAAATGTTTTCTCAAGCTTTCCATTTTGAATACCTCACTTCTCTAGCTCCATTAATAG
AATAAATAGCTTCAAAAGTCTTTAACTAAGCTATGGAAAGTAGCGAAATCCTTGCCAATT
ATGATCCCCTTGTATATTCATAAAAGATAACAGGATAAGTGGTTATTGCTACCTTTTTTG
GAGAACAAGTAAATGAAAAAAAGTTTAGAGGTATGGATTTAGTCTCAGCTTTGACACATA
TGTTAAATCTACTTGGAGAGTAATCAATGAGCCTCTAATACTTTTTGTTTGGAGCAGACT
TTTTTAAGTTAAGACTAACTCAATCCTAAAGATATGTCGATGATATCATTGAAGAATTCC
GTTCTTTCCTGACAGATCTGATTGAGGGAAAGCATTAGAAGCTTTCACAAAAATTGAAAG
AATACGGAAAAATAGTGTCCCTTCCATTTAGCTTAGAAAGTTTGCACCTTAGAAATAACG
CAAA
There is only one stop codon, TGA.
TAA and TAG = Q
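The codon reassignment noted here (TGA as the only stop, TAA and TAG read as Q) can be sketched as a small translation routine. This is a minimal illustration and not the tool used to produce these notes; the table is the standard code with TAA and TAG changed to Q, and the name `translate_ciliate` is made up for the example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order, with TAA and TAG changed from stop
# to Q (ciliate nuclear code); TGA remains the only stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYYQQCC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate_ciliate(dna, frame=0):
    """Translate dna from a 0-based frame; unrecognized codons become X."""
    seq = dna.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

# Start of the 5009-0-77-F07.t.1 open reading frame
print(translate_ciliate("ATGTTTATAGAAATATTAATTATT"))  # MFIEILII
print(translate_ciliate("TAATAGTGA"))                 # QQ*
```

With the standard code the same reading frame would show stops at every TAA and TAG, which is why the BLAST alignments in these notes show `*` where the curated protein sequences show Q.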
>gi|18200747|gb|BM400694.1|BM400694
5009-0-77-F07.t.1 Chilcoat/Turkewitz cDNA (large fraction)
Tetrahymena thermophila cDNA.
Length = 934
Score = 559 bits (1441), Expect = e-161
Identities = 291/291 (100%), Positives = 291/291 (100%)
Frame = +3
Query: 1   MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV 60
           MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV
Sbjct: 60  MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV 239
Query: 61  SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQ*YYEKDTFYIGNVLRCAP*GIA 120
           SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQ*YYEKDTFYIGNVLRCAP*GIA
Sbjct: 240 SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQ*YYEKDTFYIGNVLRCAP*GIA 419
Query: 121 FVEGEQWKKARKMFSQAFHFEYLTSLAPLIE*IASKVFN*AMESSEILANYDPLVYS*KI 180
           FVEGEQWKKARKMFSQAFHFEYLTSLAPLIE*IASKVFN*AMESSEILANYDPLVYS*KI
Sbjct: 420 FVEGEQWKKARKMFSQAFHFEYLTSLAPLIE*IASKVFN*AMESSEILANYDPLVYS*KI 599
Query: 181 TG*VVIATFFGEQVNEKKFRGMDLVSALTHMLNLLGE*SMSL*YFLFGADFFKLRLTQS* 240
           TG*VVIATFFGEQVNEKKFRGMDLVSALTHMLNLLGE*SMSL*YFLFGADFFKLRLTQS*
Sbjct: 600 TG*VVIATFFGEQVNEKKFRGMDLVSALTHMLNLLGE*SMSL*YFLFGADFFKLRLTQS* 779
Query: 241 RYVDDIIEEFRSFLTDLIEGKH*KLSQKLKEYGKIVSLPFSLESLHLRNNA 291
           RYVDDIIEEFRSFLTDLIEGKH*KLSQKLKEYGKIVSLPFSLESLHLRNNA
Sbjct: 780 RYVDDIIEEFRSFLTDLIEGKH*KLSQKLKEYGKIVSLPFSLESLHLRNNA 932
>gi|18199205|gb|BM399152.1|BM399152
5009-0-54-B02.t.1 Chilcoat/Turkewitz cDNA (large fraction)
Tetrahymena thermophila cDNA.
Length = 824
Score = 267 bits (683), Expect = 2e-73
Identities = 139/157 (88%), Positives =
141/157 (89%), Gaps = 1/157 (0%)
Frame = +2
Query:
1
MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV 60
MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV
Sbjct:
62
MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV 241
Query:
61
SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQ*YYEKDTFYIGNVLRCAP*GIA 120
SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRD FLNQ*YYEKDTFYIGNVLRCAP +
Sbjct: 242
SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRDXFLNQ*YYEKDTFYIGNVLRCAPLRYS 421
Query: 121
FVEG-EQWKKARKMFSQAFHFEYLTSLAPLIE*IASK 156
G KK + F QAFHFEYLTSLAPLIE*IA K
Sbjct: 422
ICRGRSNGKKPEQCFPQAFHFEYLTSLAPLIE*IAFK 532
Score = 86.3 bits (212), Expect = 9e-19
Identities = 58/89 (65%), Positives =
63/89 (70%), Gaps = 6/89 (6%)
Frame = +3
Query: 116
P*GIAFVEGEQWKKARKMFSQAFHFEYLTSLAPLIE*I------ASKVFN*AMESSEILA 169
P*GIAFVEGE +K SQ F L+ L +
*+
SKVFN*AMESSEILA
Sbjct: 408
P*GIAFVEGEAMEK-----SQNNVFLKLSILNTSLL*LH**NK*LSKVFN*AMESSEILA 572
Query: 170
NYDPLVYS*KITG*VVIATFFGEQVNEKK 198
+YDPLVYS*KITG* VIAT FGEQVNEKK
Sbjct: 573
SYDPLVYS*KITG*GVIATLFGEQVNEKK 659
Score = 30.0 bits (66), Expect = 0.078
Identities = 32/96 (33%), Positives =
45/96 (46%), Gaps = 5/96 (5%)
Frame = +1
Query: 125
EQWKKARKMFSQAF-----HFEYLTSLAPLIE*IASKVFN*AMESSEILANYDPLVYS*K 179
+QWKKAR MFS +F
HF + + +K++ A
++ Y + *+
Sbjct: 436
KQWKKARTMFSSSFPF*IPHFSSSINRINSFQKSLTKLWKVAKSWPVMIPLY--IHKR*Q 609
Query: 180
ITG*VVIATFFGEQVNEKKFRGMDLVSALTHMLNLL 215
G ++ E KK LVSALTHMLNLL
Sbjct: 610
DKGLLLPCL---ENK*MKKS*RYGLVSALTHMLNLL 708
&&&&&&&&
>5009-0-20-F06.t.1
AACATACGAGCCACGCGGGGCGGCCGCTCTAAAGAAAAAATTTTGTATTAAAAATTAAAA
AAAACATAAATAGATATTTAGACAAGTTTCTA
ATGGTAAGCTACTTTGCTTTAGCAGGTC
TAGCAATAGTCCTATACATTTTGTATGTATTTATTATCAATCCTTACTTGTAGTACAGAA
AATACTTGAAGTGGGGTAAAGGTTCTTTCTACCCTTTCGTTGGTGTTTTCTATGGTGCTG
GCTTACGTGTTAAGCAATACAAAGATGTTGATCATCACTTGAAGCATATGTATGATGACG
GATCAGACCCTAAAATTTATGTTGAAAACAATGCCACAGGTGCCATCATCAAGATTTCTG
ACCCTGAATATATTAAGGAGTTTGTCTAACTTGAAAACAAGGCTTATCAAAAGACTACTC
TCTTAATTGACAATATCATCAGACTCGTAGGTTAGGGAATCATCTTCTCTGAAGGCCCCC
AATGGAAGAAAAACAGAAATGTACTTTCTGGTGTCTTCCACTTCGAACAACTCAGCAAAC
GTGTCCCATCAATAGAAAAAATTACTAAGGAAGTTTATAAGCGTTATATTGATTCAGGCA
ATGTTAAAAACGTTGATGTCATCGAATTATTTTAAGAAATCACTGCTGAAGTCGTATCTA
AACCTTCTTCAGTAATATTTCAAAGATTAATCCTTCCTTGGCATGAGTTTGCTGGTAGCT
CTTCATACCTCATTA
>Tetrahymena
P450 Seq 2 N-term BM396441.1
MVSYFALAGLAIVLYILYVFIINPYLQYRK
YLKWGKGSFYPFVGVFYGAGLRVKQYKDVDHHLKHMYDDGSDPKIYVENNATGAIIKISD
PEYIKEFVQLENKAYQKTTLLIDNIIRLVGQGIIFSEGPQWKKNRNVLSGVFHFEQLSKR
VPSIEKITKEVYKRYIDSGNVKNVDVIELFQEITAEVVSKPSSVIFQRLILPWHEFAGSS
SYLI
Compared to each other
Query: 40
LAIVLYILYV-FIINPYL*YRKYLKWGKGSFYPFVGVFYGAGLRVKQYKDVDHHLKHMYD 98
L I+L+ + +I PY + KY K+G G F P +G +K+Y+DVD+
+ H D
Sbjct: 6
LIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFVSH*CD 65
Query: 99
DGSDPKIYVENNATGAIIKISDPEYIKEFV*LENKAYQKTTLLIDNIIRLVG*GIIFSEG 158
+ D ++YV N +
+++ DP+ +++F L Y+K T I N++R *GI F EG
Sbjct: 66
ENPDLRLYVVNLGSKIKLRLVDPDLMRDFF-LNQ*YYEKDTFYIGNVLRCAP*GIAFVEG 124
Query: 159 PQWKKNRNVLSGVFHFEQLSKRVPSIEKITKEVYKRYIDSGNV-KNVDVIELF*EITAEV
217
QWKK R +
S FHFE L+ P IE I +V+ ++S +
N D + *+IT V
Sbjct: 125 EQWKKARKMFSQAFHFEYLTSLAPLIE*IASKVFN*AMESSEILANYDPLVYS*KITG*V
184
Query: 218 V 218
V
Sbjct: 185 V 185
>gi|18196479|gb|BM396441.1|BM396441
5009-0-20-F06.t.1 Chilcoat/Turkewitz cDNA (large fraction)
Tetrahymena thermophila cDNA.
Length = 735
Score = 426 bits (1095), Expect = e-121
Identities = 214/214 (100%), Positives = 214/214 (100%)
Frame = +3
Query: 1   MVSYFALAGLAIVLYILYVFIINPYL*YRKYLKWGKGSFYPFVGVFYGAGLRVKQYKDVD 60
           MVSYFALAGLAIVLYILYVFIINPYL*YRKYLKWGKGSFYPFVGVFYGAGLRVKQYKDVD
Sbjct: 93  MVSYFALAGLAIVLYILYVFIINPYL*YRKYLKWGKGSFYPFVGVFYGAGLRVKQYKDVD 272
Query: 61  HHLKHMYDDGSDPKIYVENNATGAIIKISDPEYIKEFV*LENKAYQKTTLLIDNIIRLVG 120
           HHLKHMYDDGSDPKIYVENNATGAIIKISDPEYIKEFV*LENKAYQKTTLLIDNIIRLVG
Sbjct: 273 HHLKHMYDDGSDPKIYVENNATGAIIKISDPEYIKEFV*LENKAYQKTTLLIDNIIRLVG 452
Query: 121 *GIIFSEGPQWKKNRNVLSGVFHFEQLSKRVPSIEKITKEVYKRYIDSGNVKNVDVIELF 180
           *GIIFSEGPQWKKNRNVLSGVFHFEQLSKRVPSIEKITKEVYKRYIDSGNVKNVDVIELF
Sbjct: 453 *GIIFSEGPQWKKNRNVLSGVFHFEQLSKRVPSIEKITKEVYKRYIDSGNVKNVDVIELF 632
Query: 181 *EITAEVVSKPSSVIFQRLILPWHEFAGSSSYLI 214
           *EITAEVVSKPSSVIFQRLILPWHEFAGSSSYLI
Sbjct: 633 *EITAEVVSKPSSVIFQRLILPWHEFAGSSSYLI 734
&&&&&&&&&&&
BM399816
5009-0-62-A08.t.2 Chilcoat/Turkewitz cDNA (large fraction)
Tetrahymena thermophila cDNA, mRNA sequence.
        1 aagacggtgg gaaagcgagc cacgcggggc ggcgcctaca actctgttgg taagtaaatt
       61 agcactccct tttacttctt attccgtacc aatttcttca aatggggcat cagagaatct
      121 gacagggagt tgaacaagta gataaaagaa ttccgtcaaa tgattggtga catcatcaac
      181 gagcgtatca aagaagaaga agagttagaa aagcgtggtg aataaactac caaggaagat
      241 cttgtttatt atcttaaaaa gaataacctc cgtggagtcc tctccctcga tgaaattatt
      301 agtgaattca tgactttcta cgttgctggt atggatacaa ctggtcatct ttgcggtatg
      361 gccatatggt tccttactta acaccccgaa attaaaaaga aactctaaga agaacttgat
      421 gctaacactg actactctca aaatggtctc cttaagcttc cttaccttaa tggagttatc
      481 taagaaactc aacgtctcta tggacccgct ggttaattat tcaatcgtgt cgctcttaga
      541 gaccacatgc ttaaggacat tcctatcaag aagggaacta ttgttaagcc ctctccctgc
      601 tctgttcaca gacatcctaa atatttcgaa gaccctcatt ccttcaagcc tgaaagatgg
      661 tttaacaaaa aatactgtca ctccttacac ttttatcccc ttcaatgctg gtcccagaaa
      721 ctgcattggc taacatcttg ccttagtaga agctagaatt atgatgtatt atttcatgaa
      781 gacttttgat tttgaaagcg atcataattt tgaaatggtt ctcaataagg cttcttaata
      841 aaccagtaga taactcagaa taatcttagc tcaagaaatc
>Tetrahymena
P450 seq 3 C-term (293 aa) BM399816
KTVGKRATRGGAYNSVGKQISTPFYFLFRTNFFKWGIRESDRELNKQIKEFRQMIGDIIN
ERIKEEEELEKRGEQ
TTKEDLVYYLKKNNLRGVLSLDEIISEFMTFYVAGMDTTGHLCGM
AIWFLTQHPEIKKKLQEELDANTDYSQNGLLKLPYLNGVIQETQRLYGPAGQ LFNRVALR
DHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHS
(fs)
YTFIPFNAGPRN
CIGQHLALVEARIMMYYFMKTFDFESDHNFEMVLNKASQQTSRQLRIILAQEI
>EMBOSS_001_3
DGGKASHAGRRLQLCW*VN*HSLLLLIPYQFLQMGHQRI*QGVEQVDKRIPSNDW*HHQR
AYQRRRRVRKAW*INYQGRSCLLS*KE*PPWSPLPR*NY**IHDFLRCWYGYNWSSLRYG
HMVPYLTPRN*KETLRRT*C*H*LLSKWSP*ASLP*WSYLRNSTSLWTRWLIIQSCRS*R
PHA*GHSYQEGNYC*ALSLLCSQTS*IFRRPSFLQA*KMV*QKILSLLTLLSPSMLVPET
ALANILP**KLEL*CIIS*RLLILKAIIILKWFSIRLLNKPVDNSE*S*LKK
>BM399816
Seq with frameshift like 3fam and 4V
KTVGKRATRGGAYNSVGK*ISTPFYFLFRTNFFKWGIRESDRELNK*IKEFRQMIGDII
NERIKEEEELEKRGE*TTKEDLVYYLKKNNLRGVLS
LDEIISEFMTFYV
AGMDTTGHLCGM
AIWFLT*HPEIKKKL*EELDANTDYSQNGLLKLPYLNGVI*ETQRLYGPAG*LFNRVALR
DHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHS
YTFIPFNAGPRNCIG*HLALVEARIMMYYFMKTFDFESDHNFEMVLNKAS**TSR*LRIILAQEI
Note that the N-term of this frag overlaps weakly with BM400694.1, so a complete seq
could be assembled, but it would be a hybrid.
>gi|18200747|gb|BM400694.1|BM400694
5009-0-77-F07.t.1 Chilcoat/Turkewitz cDNA (large fraction)
Tetrahymena thermophila cDNA.
Length = 934
Score = 33.5 bits (75), Expect = 0.001
Identities = 18/46 (39%), Positives =
29/46 (63%)
Frame = +3
Query:
14
NSVGK*ISTPFYFLFRTNFFKWGIRESDRELNK*IKEFRQMIGDII 59 (ETAM exon?)
N
+G+* + YFLF +FFK + +S R ++ I+EFR + D+I
Sbjct: 696
NLLGE*SMSL*YFLFGADFFKLRLTQS*RYVDDIIEEFRSFLTDLI 833
>gi|18199869|gb|BM399816.1|BM399816
5009-0-62-A08.t.2 Chilcoat/Turkewitz cDNA (large fraction)
Tetrahymena thermophila cDNA.
Length = 880
Score = 467 bits (1201), Expect = e-133
Identities = 230/233 (98%), Positives =
230/233 (98%)
Frame = +1
Query:
1
KTVGKRATRGGAYNSVGK*ISTPFYFLFRTNFFKWGIRESDRELNK*IKEFRQMIGDIIN 60
KTVGKRATRGGAYNSVGK*ISTPFYFLFRTNFFKWGIRESDRELNK*IKEFRQMIGDIIN
Sbjct:
1
KTVGKRATRGGAYNSVGK*ISTPFYFLFRTNFFKWGIRESDRELNK*IKEFRQMIGDIIN 180
Query:
61
ERIKEEEELEKRGE*TTKEDLVYYLKKNNLRGVLSLDEIISEFMTFYVAGMDTTGHLCGM 120
ERIKEEEELEKRGE*TTKEDLVYYLKKNNLRGVLSLDEIISEFMTFYVAGMDTTGHLCGM
Sbjct: 181
ERIKEEEELEKRGE*TTKEDLVYYLKKNNLRGVLSLDEIISEFMTFYVAGMDTTGHLCGM 360
Query: 121
AIWFLT*HPEIKKKL*EELDANTDYSQNGLLKLPYLNGVI*ETQRLYGPAG*LFNRVALR 180
AIWFLT*HPEIKKKL*EELDANTDYSQNGLLKLPYLNGVI*ETQRLYGPAG*LFNRVALR
Sbjct: 361
AIWFLT*HPEIKKKL*EELDANTDYSQNGLLKLPYLNGVI*ETQRLYGPAG*LFNRVALR 540
Query: 181
DHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHSYTFIP 233
DHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHS F P
Sbjct: 541
DHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHSLHFYP 699
Score = 129 bits (324), Expect = 1e-31
Identities = 83/164 (50%), Positives =
91/164 (55%)
Frame = +2
Query: 130
EIKKKL*EELDANTDYSQNGLLKLPYLNGVI*ETQRLYGPAG*LFNRVALRDHMLKDIPI 189
E+
KKL +D +YS LL+
L + + L P LF + L +
Sbjct: 473
ELSKKLNVSMDPLVNYSIVSLLETTCLRTFLSRRELLLSPLPALFTDILNISKTLIPSSL 652
Query: 190
KKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHSYTFIPFNAGPRNCIG*HLALV 249
K
G
K
YTFIPFNAGPRNCIG*HLALV
Sbjct: 653
KDG----------------------------LTKNTVTPYTFIPFNAGPRNCIG*HLALV 748
Query: 250
EARIMMYYFMKTFDFESDHNFEMVLNKAS**TSR*LRIILAQEI 293
EARIMMYYFMKTFDFESDHNFEMVLNKAS**TSR*LRIILAQEI
Sbjct: 749
EARIMMYYFMKTFDFESDHNFEMVLNKAS**TSR*LRIILAQEI 880
>gi|18199868|gb|BM399815.1|BM399815
5009-0-62-A08.t.1 Chilcoat/Turkewitz cDNA (large fraction)
Tetrahymena thermophila cDNA.
Length = 817
Score = 213 bits (542), Expect = 5e-57
Identities = 139/240 (57%), Positives =
165/240 (68%), Gaps = 8/240 (3%)
Frame = +2
Query:
8
TRGGAYNSVGK*ISTPFYFLFRTNFFKWGIRESDRELNK*IKEFRQMIGDIINERIKEEE 67
T
AYNSVGK*ISTPFYFLFRTNFFKWGIRESDR+LNK*IKEFRQMIGDIINERIKEEE
Sbjct:
17
TGAAAYNSVGK*ISTPFYFLFRTNFFKWGIRESDRKLNK*IKEFRQMIGDIINERIKEEE 196
Query:
68
ELEKRGE*TTKEDLVYYLKKNNLRGVLSLDEIISEFMTFYVAGMDTTGHLCGMAIWFLT* 127
ELEKRGE*TTKEDLVYYLKKNNL GVLSLDEII+ F+ F + G +TG +W L
Sbjct: 197
ELEKRGE*TTKEDLVYYLKKNNLLGVLSLDEIINGFLNFLLGGFGSTG--SSWRMWPLGP 370
Query: 128
H----PEIKKKL*EELDANTDYS-QNGL-LKLPYLNGVI*ETQRL-YGPAG*LFN-RVAL 179
+ +K+KL*EEL T + ++GL L
YL GVI*E + YGP ++ R L
Sbjct: 371
YI*TPANLKEKL*EELGLLTVTTLKSGLPLSFLYLMGVI*ENSNVSYGPRWVNYSIRSLL 550
Query: 180
RDHMLKDIPIKKGTIVKPSPCSVHRHPKYFEDPHSFKPERWFNKKYCHSYTFIPFNAGPR 239
++HMLKD ++ ++ S SVHR PHSF ++ + K CHS +
PF+ R
Sbjct: 551
KNHMLKDFLFRRELLLGLS-WSVHRLLNI*R-PHSFSLKKVY--KICHSELY-PFSGPER 715
>gi|37509509|gb|CF653700.1|CF653700
EST00033 Suppression Subtractive Hybridization Libraries of Tetrahymena thermophila
Exposed in Dichlorodiphenyltrichloroethane (DDT) Tetrahymena thermophila cDNA clone DDT-236.
Length = 505
Score = 106 bits (264), Expect = 9e-25
Identities = 59/150 (39%), Positives =
88/150 (58%), Gaps = 7/150 (4%)
Frame = +1
Query: 120
MAIWFLT*HPEIKKKL*EELDANT----DYSQNGLLKLPYLNGVI*ETQRLYGPAG*LFN 175
+ +W L
*HPE+++K+ E+D+ D L KL Y N E+ R+Y A
+
Sbjct:
1
ICLWVLA*HPELQQKIRAEIDSVI*TFDDLKHEXLNKLEYFNAFFKESLRVYPTAPQVIP 180
Query: 176
RVALRDHMLKDIPIKKGTIVKPSPCSVH--RHPKYFEDPHSFKPERWFNKKYCHS-YTFI 232
RV+
RDHM+ D PI KG V + + P +D +F
P+R+ +K
++FI
Sbjct: 181
RVSARDHMVDDFPIPKGAFVSNLTI*YNE*KFPLLCKDIDTFNPDRFLDKNIIQDHFSFI 360
Query: 233
PFNAGPRNCIG*HLALVEARIMMYYFMKTF 262
P++AGPRNCIG*HLAL+EA+IM+ Y +K +
Sbjct: 361
PYSAGPRNCIG*HLALIEAKIMIAYILKNY 450
>Tetrahymena
seq 4 C-term CF653700
1 ICLWVLAQHPELQQKIRAEIDSVIQ
TFDDLKHEXLNKLEYFNAFFKESLRVYPTAPQVIP 180
181
RVSARDHMVDDFPIPKGAFVSNLTIQYNEQ KFPLLCKDIDTFNPDRFLDKNIIQDHFSFI 360
361
PYSAGPRNCIGQHLALIEAKIMIAYILKNYVVLPNEEHQKVRFNHLFL
ATTTGTCTTTGGGTTTTGGCTTAACACCCTGAACTCCAACAAAAGATCAGAGCTGAAATT
GATTCTGTTATTTAGACCTTTGACGATCTTAAGCACGAAGNTTTGAATAAACTTGAATAC
TTCAATGCTTTCTTCAAAGAATCCCTCAGAGTTTATCCCACTGCCCCTCAAGTCATTCCT
AGAGTCTCTGCTCGTGACCACATGGTCGATGACTTCCCTATTCCCAAGGGTGCCTTCGTA
TCAAACTTAACCATTTAATATAATGAATAAAAGTTCCCACTCCTCTGCAAAGATATTGAT
ACCTTCAATCCCGATAGATTCTTAGACAAGAACATCATTCAAGATCATTTCAGTTTTATC
CCTTACTCAGCAGGTCCCCGTAATTGCATCGGCTAGCACTTGGCCTTAATCGAAGCTAAG
ATCATGATTGCTTACATCCTCAAAAATTACGTAGTTTTACCCAACGAAGAACACTAGAAA
GTTAGATTTAACCACTTATTCTTGT
>EMBOSS_001_1
PYSAGPRNCIG*HLALIEAKIMIAYILKNY VVLPNEEHQKVRFNHLFL
>EMBOSS_001_2
LTQQVPVIASASTWP*SKLRS*LLTSSKIT*FYPTKNTRKLDLTTYSC
>EMBOSS_001_3
LLSRSP*LHRLALGLNRS*DHDCLHPQKLRSFTQRRTLES*I*PLI
LOCUS       BM399815     817 bp    mRNA    linear   EST 17-JAN-2002
DEFINITION  5009-0-62-A08.t.1 Chilcoat/Turkewitz cDNA (large fraction)
            Tetrahymena thermophila cDNA, mRNA sequence.
ACCESSION   BM399815
VERSION     BM399815.1  GI:18199868
KEYWORDS    EST.
SOURCE      Tetrahymena thermophila
  ORGANISM  Tetrahymena thermophila
            Eukaryota; Alveolata; Ciliophora; Oligohymenophorea;
            Hymenostomatida; Tetrahymenina; Tetrahymena.
REFERENCE   1  (bases 1 to 817)
  AUTHORS   Turkewitz,A.P., Karrer,K.M., Jahn,C., Orias,E., Kirk,K.E.,
            Frankel,J. and Klobutcher,L.
  TITLE     EST from Tetrahymena thermophila, strain CU428.1, growing cells
  JOURNAL   Unpublished (2002)
COMMENT     Contact: Turkewitz AP
            Molecular Genetics and Cell Biology
            University of Chicago
            920 E. 58th Street, Chicago, IL 60637, USA
            Tel: 773 702 4374
            Fax: 773 702 3172
            Email: apturkew@midway.uchicago.edu
            Seq primer: T3.
FEATURES             Location/Qualifiers
     source          1..817
                     /organism="Tetrahymena thermophila"
                     /mol_type="mRNA"
                     /strain="CU428.1"
                     /db_xref="taxon:5911"
                     /clone_lib="Chilcoat/Turkewitz cDNA (large fraction)"
                     /note="Vector: BlueScript2 SK+; Details on library
                     preparation can be found in Chilcoat and Turkewitz (2001)
                     Proc. Natl. Acad. Sci USA, 98: 8709-8713."
ORIGIN
        1 aacaaagctg gagctcacgg gggcggccgc ctacaactct gttggtaagt aaattagcac
       61 tcccttttac ttcttattcc gtaccaattt cttcaaatgg ggcatcagag aatctgacag
      121 aaagttgaac aagtagataa aagaattccg tcaaatgatt ggtgacatca tcaacgagcg
      181 tatcaaagaa gaagaagagt tagaaaagcg tggtgaataa actaccaagg aagatcttgt
      241 ttattatctt aaaaagaata acctccttgg agtcctctcc ctcgatgaaa ttattaatgg
      301 attcctgaat ttcctacttg gcgggtttgg ttccactggg tcatcttggc ggatgtggcc
      361 tttgggtccc tatatataaa cacccgcgaa tttaaaagag aaactctgag aagaacttgg
      421 attgctaaca gtgactactc tcaaaagtgg tctcccctta agctttcttt acttaatggg
      481 agttatctaa gaaaactcca acgtctctta tggaccccgc tgggttaatt attcaatccg
      541 gtcgctcctt aagaaccaca tgcttaagga cttcctattc agaagggaat tattgttagg
      601 cctctcctgg tctgttcaca gacttctaaa tatttgaaga cctcattcct tcagcctgaa
      661 gaaggtttac aaaatctgtc actccgaact ttatcccttt agcggtccgg aacggttggc
      721 accacggctt aataagctga tgtatatctt cggagactgg ttgaagaaca atgaaggtca
      781 aagtttaaac cggctcaaaa tttctttaac ttttccg
>EMBOSS_001_1
NKAGAHGGGRLQLCW*VN*HSLLLLIPYQFLQMGHQRI*QKVEQVDKRIPSNDW*HHQRA
YQRRRRVRKAW*INYQGRSCLLS*KE*PPWSPLPR*NY*WIPEFPTWRVWFHWVILADVA
FGSLYINTR EFKRETLRRTWIANSDYSQKWSPLKLSLLNGSYLRKLQRLLWTPLG*LFNP
VAPQEPHAQGLPIQKGIIVRP LLVCSQTSKYLKTSFLQPE EGLQNLSLRTLSL*RSGTVG
TTA**ADVYLRRLVEEQ*RSKFKPAQNFFNFSX
>EMBOSS_001_2
TKLELTGAA AYNSVGKQISTPFYFLFRTNFFKWGIRESDRKLNKQIKEFRQMIGDIINER
IKEEEELEKRGEQTTKEDLVYYLKKNNLLGVLSLDEIINGFLNF LLGGFGSTGSSWRMWP
LGPYIQTPANLKEKL*EELGLLTVTTLKSGLPLSFLYLMGVI*ENSNVSYGPRWVNYSIR
SLLKNHMLKDFLFRRELLLGLSW SVHRLLNI*RPHSFSLKKVYKICHSELYPFSGPERLA
PRLNKLMYIFGDWLKNNEGQSLNRLKISLTFP
>EMBOSS_001_3
QSWSSRGRPPTTLLVSKLALPFTSYSVPISSNGASENLTES*TSR*KNSVK*LVTSSTSV
SKKKKS*KSVVNKLPRKILFIILKRITSLESSPSMKLLMDS
*IS YLAGLVPLGHLGGCGL
WVPIYKHPRIQKRNSEKNLD C*Q*LLSKVVSP*AFFT*WELSKKTPTSLMDPAGLIIQSG
RSLRTTCLRTSYSEGNYC*ASPGLFTDF*IFEDLIPSA*RRFTKSVTPNFIPLAVRNGWH
HGLIS*CISSETG*RTMKVKV*TGSKFL*LF
This seq frameshifts around
TKLELTGAA AYNSVGKQISTPFYFLFRTNFFKWGIRESDRKLNKQIKEFRQMIGDIINER
IKEEEELEKRGEQTTKEDLVYYLKKNNLLGVLSLDEIINGFLNF (FS)
YLAGLVPLGHLGGCGLWVPIYKHPRIQKRNSEKNLD
ANSDYSQKWSPLKLSLLNGSYLRKLQRLLWTPLG*LFNP
VAPQEPHAQGLPIQKGIIVRPXXX
SVHRLLNI*RPHSFSLKKVYKICHSELYPFSGPERLA
PRLNKLMYIFGDWLKNNEGQSLNRLKISLTFP
&&&&&&&&&&&
>gi|18199205|gb|BM399152.1|BM399152
5009-0-54-B02.t.1 Chilcoat/Turkewitz cDNA (large fraction)
Tetrahymena thermophila cDNA.
Length = 824
Score = 262 bits (670), Expect = 6e-69
Identities = 136/157 (86%), Positives =
138/157 (87%), Gaps = 1/157 (0%)
Frame = +2
Query:
1 MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV
60
MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV
Sbjct:
62
MFIEILIIILFFAALRLVIIPYFKFLKYKKYGDGRFVPLLGELVEIKEAIKKYEDVDYFV 241
Query:
61
SHQCDENPDLRLYVVNLGSKIKLRLVDPDLMRDFFLNQQYYEKDTFYIGNVLRCAPQGIA 120
SH
CDENPDLRLYVVNLGSKIKLRLVDPDLMRD FLNQ YYEKDTFYIGNVLRCAP +
Sbjct: 242
SH*CDENPDLRLYVVNLGSKIKLRLVDPDLMRDXFLNQ*YYEKDTFYIGNVLRCAPLRYS 421
Query: 121
FVEG-EQWKKARKMFSQAFHFEYLTSLAPLIEQIASK 156
G KK + F QAFHFEYLTSLAPLIE IA K
Sbjct: 422
ICRGRSNGKKPEQCFPQAFHFEYLTSLAPLIE*IAFK 532
Score = 77.4 bits (189), Expect = 3e-13
Identities = 52/85 (61%), Positives =
57/85 (67%), Gaps = 2/85 (2%)
Frame = +3
Query: 116
PQGIAFVEGEQWKKARKMFSQAFHFEYLTSLAPLI--EQIASKVFNQAMESSEILANYDP 173
P
GIAFVEGE +K++
TSL L + SKVFN AMESSEILA+YDP
Sbjct: 408
P*GIAFVEGEAMEKSQNNVFLKLSI-LNTSLL*LH**NK*LSKVFN*AMESSEILASYDP 584
Query: 174
LVYSQKITGQVVIATFFGEQVNEKK 198
LVYS
KITG VIAT FGEQVNEKK
Sbjct: 585
LVYS*KITG*GVIATLFGEQVNEKK 659
Score = 37.4 bits (85), Expect = 0.39
Identities = 35/107 (32%), Positives =
52/107 (48%), Gaps = 5/107 (4%)
Frame = +1
Query: 125
EQWKKARKMFSQAF-----HFEYLTSLAPLIEQIASKVFNQAMESSEILANYDPLVYSQK 179
+QWKKAR
MFS +F HF + ++
+K++ A ++ PL ++
Sbjct: 436
KQWKKARTMFSSSFPF*IPHFSSSINRINSFQKSLTKLWKVAKSWPVMI----PLYIHKR 603
Query: 180
ITGQVVIATFFGEQVNEKKFRGMDLVSALTHMLNLLGEQSMSLQYFL 226
+ ++
+ +K R LVSALTHMLNLL E
MSL L
Sbjct: 604
*QDKGLLLPCLENK*MKKS*R-YGLVSALTHMLNLL-ESIMSLYILL 738
>gi|18196479|gb|BM396441.1|BM396441
5009-0-20-F06.t.1 Chilcoat/Turkewitz cDNA (large fraction)
Tetrahymena thermophila cDNA.