Chlamydomonas reinhardtii cytochrome P450s

 

D. Nelson, Sept. 2, 2004

Under revision May 11, 2006

 

39 named genes, 2 named pseudogenes,

+ one bacterial contaminant

families = 51, 55, 97, 710, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746,

747, 748, 767, 768, 769, 770, 771 (5 old families, 16 new families)

 

51 is in the 51 clan (sterol 14 alpha demethylase)

55 is of fungal origin. (nitrite/nitrate reductase, soluble enzyme)

710 is in the 61 clan (C-22 sterol desaturase in fungi [CYP61] and plants [CYP710])

737, 738, 739, 740 are in the CYP85 clan

97 is in the CYP97 clan (carotenoid hydroxylases of epsilon and beta rings)

743, 744 are in the CYP711 clan (CYP711A1 produces a carotenoid hormone in Arabidopsis)

745 may be a new plant clan, CYP97A like

CYP747 is hard to place. 38% to CYP97A6 in C-term half

741 and 742 sometimes cluster with 97 but not always.

741, 742, 748, 767, 768, 769 cluster together and have best hits to CYP4 clan members

746 may be of bacterial origin, best hit is to CYP252A1 Streptomyces peucetius

top 26 hits all bacterial

CYP746 and CYP770 may be the Chlamydomonas precursors of the CYP72 clan

There is a CYP746 in moss

 

Chlamydomonas P450 tree

 

A link to the 2003 Chlamydomonas P450 page

 

P450s sorted by gene model number using the JGI annotation

 

* indicates more than one gene model for a single gene.

 

C_60077      CYP742A1

C_130004     CYP739A1

C_130006     CYP739A2

C_130009     CYP739A4

C_130009     CYP739A5

C_130012     CYP739A6

C_130125     CYP739A3

C_140094     CYP-un1Chlre pseudogene 1, family not identified, half of gene

C_180013     CYP743A1

             CYP744A4 between C_239009 and C_239004 not annotated

C_250032     CYP746A1, 39% to Streptomyces peucetius CYP252A1

C_310063     CYP97A6

C_340039     unnamed C-term P450 fragment PKG to heme

C_410095     CYP97B6

C_420091     CYP743A2

C_470024     CYP737A1

C_570052     CYP738A1

C_680007     CYP51G1

C_900050     CYP747A1, 41% to CYP743B2 C-term

*C_940015    CYP744A1

C_940016     CYP744A1, N-term = C_940015

C_940017     CYP744A2

C_940020     CYP744B1

C_940044     CYP744A3

C_980035     CYP743B3

C_980053     CYP741A1

*C_980058    CYP741A1 N-term

C_1040015    CYP97A5

C_1080041    CYP740A1

C_1130014    CYP743C1

C_1340038    CYP97C3 70% to 97C2

C_1370013    CYP744C1

C_1530020    unnamed C-term P450 fragment PKG to end

C_1540014    CYP710B1

C_1730009    CYP744A5P pseudogene 81% to 744A3

C_1820019    CYP748A1 about 40% to C-term half of 741A1

C_1860018    CYP745A1

C_2580005    CYP55B1, 43% to CYP55A6

C_4150003    unnamed CYP97 like C-term P450 fragment

*C_4260002   CYP97A5

*C_5270001   CYP739A6

C_7970001    unnamed C-term P450 fragment

C_8600001    CYP743B2 falls in a seq gap of scaffold 98

C_8600002    CYP743B3 same as C_980035

*C_8650001   CYP744B1

*C_9610001   CYP743C1

C_10690001   unnamed C-term P450 fragment

*C_22500001  CYP739A5

*C_28140001  CYP746A1 = C_250032 C-helix exon duplication

C_32340001   CYP743B1 falls in a seq gap of scaffold 98
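The "*" convention in the table above can be checked mechanically. The following is a minimal sketch, assuming each row has the form "optional *, gene model ID, CYP name, free-text note"; the function name group_models and the regular expression are illustrative and not part of the JGI annotation. It groups gene models by CYP name so that genes covered by more than one model stand out.

```python
import re
from collections import defaultdict

# A few rows copied from the table above; a leading "*" marks a redundant gene model.
ROWS = """\
C_130012     CYP739A6
*C_940015    CYP744A1
C_940016     CYP744A1, N-term = C_940015
C_1040015    CYP97A5
*C_4260002   CYP97A5
*C_5270001   CYP739A6
"""

# Assumed row layout: optional star, model ID, then a standard CYP name.
ROW_RE = re.compile(r"^(\*?)(C_\d+)\s+(CYP\d+[A-Z]+\d+P?)\b")

def group_models(text):
    """Map each CYP name to a list of (gene_model, starred) pairs."""
    groups = defaultdict(list)
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if m:  # rows without a standard CYP name (fragments) are skipped
            star, model, name = m.groups()
            groups[name].append((model, bool(star)))
    return dict(groups)

groups = group_models(ROWS)
for name, models in sorted(groups.items()):
    if len(models) > 1:
        print(name, [model for model, _ in models])
```

Rows for unnamed fragments and for CYP-un1Chlre do not match the pattern and are ignored by this sketch.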

 

P450s sorted by CYP name (version 2 assembly)

 

CYP51G1      C_680007    

CYP55B1      C_2580005 43% to CYP55A6

CYP97A5     *C_4260002  

CYP97A5      C_1040015   

CYP97A6      C_310063    

CYP97B6      C_410095    

CYP97C3      C_1340038 70% to 97C2

CYP710B1     C_1540014   

CYP737A1     C_470024    

CYP738A1     C_570052    

CYP739A1     C_130004    

CYP739A2     C_130006

CYP739A3     C_130125    

CYP739A4     C_130009

CYP739A5    *C_22500001 

CYP739A5     C_130009    

CYP739A6    *C_5270001  

CYP739A6     C_130012     

CYP740A1     C_1080041   

CYP741A1    *C_980058 N-term

CYP741A1     C_980053    

CYP742A1     C_60077     

CYP743A1     C_180013    

CYP743A2     C_420091    

CYP743B1     C_32340001  

CYP743B2     C_8600001   

CYP743B3     C_8600002 same as C_980035

CYP743C1    *C_9610001  

CYP743C1     C_1130014   

CYP744A1    *C_940015   

CYP744A1     C_940016 N-term = C_940015

CYP744A2     C_940017    

CYP744A3     C_940044    

CYP744A4     between C_239009 and C_239004 not annotated

CYP744A5P    C_1730009 pseudogene 81% to 744A3

CYP744B1    *C_8650001  

CYP744B1     C_940020    

CYP744C1     C_1370013   

CYP745A1     C_1860018   

CYP746A1    *C_28140001 = C_250032 C-helix exon duplication

CYP746A1     C_250032, 39% to Streptomyces peucetius CYP252A1

CYP747A1     C_900050 41% to CYP743B2 C-term

CYP748A1     C_1820019 about 40% to C-term half of 741A1

C_140094     CYP-un1Chlre pseudogene 1, family not identified, half of gene

C_340039     unnamed C-term P450 fragment PKG to heme

C_1530020    unnamed C-term P450 fragment PKG to end

C_4150003    unnamed CYP97 like C-term P450 fragment

C_7970001    unnamed C-term P450 fragment

C_10690001   unnamed C-term P450 fragment
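The CYP names in these tables follow the usual pattern of family number, subfamily letter, member number and an optional trailing P for pseudogenes. A minimal sketch of a parser is given here, assuming that pattern holds for every named entry; parse_cyp is a hypothetical helper, and the optional lowercase suffix (as in CYP739A4a in the version 3 tables) is accepted without assuming what it denotes.

```python
import re
from collections import Counter

# CYP + family digits + subfamily letters + member digits + optional "P" + optional a/b suffix
NAME_RE = re.compile(r"^CYP(\d+)([A-Z]+)(\d+)(P?)([a-z]?)$")

def parse_cyp(name):
    """Split a CYP name into its parts; return None if it does not fit the pattern."""
    m = NAME_RE.match(name)
    if m is None:
        return None
    family, subfamily, member, pseudo, part = m.groups()
    return {
        "family": int(family),
        "subfamily": subfamily,
        "member": int(member),
        "pseudogene": pseudo == "P",
        "part": part or None,  # lowercase suffix, meaning not assumed here
    }

names = ["CYP51G1", "CYP97A5", "CYP97A6", "CYP744A3", "CYP744A5P", "CYP739A4a"]
parsed = [parse_cyp(n) for n in names]

# Count entries per family for the sample names above.
per_family = Counter(p["family"] for p in parsed if p is not None)
print(sorted(per_family.items()))
```

Run over a full name column, the same counter gives a quick tally of members per family.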

 

P450s sorted by CYP name (version 3 assembly)

 

CYP51G1      scaffold_7:2481399-2484780  Protein ID: 126254

CYP55B1      scaffold_52:370660-375180   Protein ID: 121742

CYP97A5      scaffold_55:373287-377786   Protein ID:  39257

CYP97A6      scaffold_42:732596-737181   Protein ID: 121076

CYP97B6      scaffold_1:2256360-2261776  Protein ID: 116601

CYP97C3      scaffold_64:422589-430105   Protein ID: 122396

CYP710B1     scaffold_66:390953-394690   Protein ID: 132687

CYP737A1     scaffold_41:635800-640648   Protein ID: 151890

CYP738A1     scaffold_6:2860971-2864314  Protein ID: 167934

CYP739A1     scaffold_8:1064933-1068008  Protein ID: 140983

CYP739A2     scaffold_8:1078648-1085528  Protein ID: 140985

CYP739A3     scaffold_8:1105803-1109510  Protein ID: 140993

CYP739A4a    scaffold_8:1131245-1134169  Protein ID: 165902

CYP739A4b    scaffold_8:1135368-1135969  Protein ID: 165903

CYP739A5a    scaffold_8:1125087-1127174  Protein ID: 165900

CYP739A5b    scaffold_8:1128094-1130653  Protein ID: 186291

CYP739A6     scaffold_8:1145820-1150791  Protein ID: 186292

CYP740A1     scaffold_68:172336-177730   Protein ID: 153850

CYP741A1a    scaffold_71:380138-383878   Protein ID: 179637

CYP741A1b    scaffold_846:3828-5043      Protein ID: 181363

CYP742A1     scaffold_37:480604-486602   Protein ID: 151489

CYP743A1     scaffold_1:5611907-5617553  Protein ID: 116541

CYP743A2a    scaffold_16:609616-615492   Protein ID: 189550

CYP743A2b    scaffold_16:609616-615492   Protein ID: 116043

CYP743B1     scaffold_71:125260-130065   Protein ID: 122749

CYP743B2     scaffold_71:130374-138996   Partial seq not annotated

CYP743B3     scaffold_71:139305-143478   Protein ID: 122730

CYP743C1     scaffold_17:1489349-1496178 Protein ID: 147793

CYP744A1a    scaffold_23:958703-961028   Protein ID: 148983

CYP744A1b    scaffold_23:962118-963228+  Protein ID: 118452

CYP744A2     scaffold_23:969108-971162   Protein ID: 118526

CYP744A3     scaffold_23:976166-982342   Protein ID: 118465

CYP744A4a    scaffold_23:1143890-1147747 Protein ID:  95157

CYP744A4b    scaffold_23:1141463-1143101 Protein ID: 103666

CYP744A5P    scaffold_21:6347-7649       Protein ID: 148389

CYP744B1     scaffold_23:1014183-1020804 Protein ID: 118428

CYP744C1     scaffold_39:932071-938361   Protein ID: 177201

CYP745A1     scaffold_74:79791-84023     Protein ID: 154128

CYP746A1     scaffold_1:3570907-3575049  Protein ID: 116510

CYP747A1     scaffold_96:178714-184286   Protein ID: 108849

CYP748A1     scaffold_9:2353835-2358515  Protein ID: 114278

CYP767A1     scaffold_9:1625885-1634209  Protein ID: 169101

CYP768A1a    scaffold_23:1470852-1473965 Protein ID: 149040

CYP768A1b    scaffold_23:1476142-1477663 Protein ID: 149041

C_140094     scaffold_48:305112-303028   Partial seq not annotated

C_4150003    scaffold_21:297178-306479   Protein ID: 191092

C_7970001    scaffold_15:453166-458216   Protein ID: 170931

C_10690001   scaffold_24:545063-551204   Protein ID: 173996

Bacterial    scaffold_661:7589-8149      Protein ID: 109783
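The version 3 rows can be turned into structured records for sorting or for computing gene spans. This is a rough sketch, assuming every row carries a "scaffold_N:start-end" location and, where annotated, a "Protein ID: N" field; Location and parse_row are illustrative names, not JGI terms.

```python
import re
from typing import NamedTuple, Optional, Tuple

class Location(NamedTuple):
    scaffold: int
    start: int
    end: int

    @property
    def span(self) -> int:
        # abs() because at least one row lists the larger coordinate first
        return abs(self.end - self.start) + 1

LOC_RE = re.compile(r"scaffold_(\d+):\s*(\d+)-(\d+)")
ID_RE = re.compile(r"Protein ID:\s*(\d+)")

def parse_row(line: str) -> Optional[Tuple[str, Location, Optional[int]]]:
    """Return (name, location, protein_id) or None if the line has no location."""
    loc = LOC_RE.search(line)
    if loc is None:
        return None
    name = line.split()[0]
    pid = ID_RE.search(line)
    return name, Location(*map(int, loc.groups())), int(pid.group(1)) if pid else None

ROWS = [
    "CYP51G1      scaffold_7:2481399-2484780  Protein ID: 126254",
    "CYP97A5      scaffold_55:373287-377786   Protein ID:  39257",
    "C_140094     scaffold_48:305112-303028   Partial seq not annotated",
]

records = [r for r in map(parse_row, ROWS) if r is not None]
# Sort numerically by scaffold, then by the lower coordinate.
records.sort(key=lambda r: (r[1].scaffold, min(r[1].start, r[1].end)))
for name, loc, pid in records:
    print(name, loc.scaffold, loc.span, pid)
```

Rows marked "Partial seq not annotated" come back with a protein ID of None.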

 

 

P450s sorted by scaffold location (version 3 assembly)

 

CYP97B6      scaffold_1:2256360-2261776  Protein ID: 116601

CYP746A1     scaffold_1:3570907-3575049  Protein ID: 116510

CYP743A1     scaffold_1:5611907-5617553  Protein ID: 116541

CYP738A1     scaffold_6:2860971-2864314  Protein ID: 167934

CYP51G1      scaffold_7:2481399-2484780  Protein ID: 126254

CYP739A1     scaffold_8:1064933-1068008  Protein ID: 140983

CYP739A2     scaffold_8:1078648-1085528  Protein ID: 140985

CYP739A3     scaffold_8:1105803-1109510  Protein ID: 140993

CYP739A5a    scaffold_8:1125087-1127174  Protein ID: 165900

CYP739A5b    scaffold_8:1128094-1130653  Protein ID: 186291

CYP739A4a    scaffold_8:1131245-1134169  Protein ID: 165902

CYP739A4b    scaffold_8:1135368-1135969  Protein ID: 165903

CYP739A6     scaffold_8:1145820-1150791  Protein ID: 186292

CYP767A1     scaffold_9:1625885-1634209  Protein ID: 169101

CYP748A1     scaffold_9:2353835-2358515  Protein ID: 114278

C_7970001    scaffold_15:453166-458216   Protein ID: 170931

CYP743A2a    scaffold_16:609616-615492   Protein ID: 189550

CYP743A2b    scaffold_16:609616-615492   Protein ID: 116043

CYP743C1     scaffold_17:1489349-1496178 Protein ID: 147793

CYP744A5P    scaffold_21:6347-7649       Protein ID: 148389

C_4150003    scaffold_21:297178-306479   Protein ID: 191092

CYP744A1a    scaffold_23:958703-961028   Protein ID: 148983

CYP744A1b    scaffold_23:962118-963228+  Protein ID: 118452

CYP744A2     scaffold_23:969108-971162   Protein ID: 118526

CYP744A3     scaffold_23:976166-982342   Protein ID: 118465

CYP744B1     scaffold_23:1014183-1020804 Protein ID: 118428

CYP744A4a    scaffold_23:1143890-1147747 Protein ID:  95157

CYP744A4b    scaffold_23:1141463-1143101 Protein ID: 103666

CYP768A1a    scaffold_23:1470852-1473965 Protein ID: 149040

CYP768A1b    scaffold_23:1476142-1477663 Protein ID: 149041

C_10690001   scaffold_24:545063-551204   Protein ID: 173996

CYP742A1     scaffold_37:480604-486602   Protein ID: 151489

CYP744C1     scaffold_39:932071-938361   Protein ID: 177201

CYP737A1     scaffold_41:635800-640648   Protein ID: 151890

CYP97A6      scaffold_42:732596-737181   Protein ID: 121076

C_140094     scaffold_48:305112-303028   Partial seq not annotated

CYP55B1      scaffold_52:370660-375180   Protein ID: 121742

CYP97A5      scaffold_55:373287-377786   Protein ID:  39257

CYP97C3      scaffold_64:422589-430105   Protein ID: 122396

CYP710B1     scaffold_66:390953-394690   Protein ID: 132687

CYP740A1     scaffold_68:172336-177730   Protein ID: 153850

CYP743B1     scaffold_71:125260-130065   Protein ID: 122749

CYP743B2     scaffold_71:130374-138996   Partial seq not annotated

CYP743B3     scaffold_71:139305-143478   Protein ID: 122730

CYP741A1a    scaffold_71:380138-383878   Protein ID: 179637

CYP741A1b    scaffold_846:3828-5043      Protein ID: 181363

CYP745A1     scaffold_74:79791-84023     Protein ID: 154128

CYP747A1     scaffold_96:178714-184286   Protein ID: 108849

Bacterial    scaffold_661:7589-8149      Protein ID: 109783

 

P450 sequences

 

Note: the P450 sequences have many apparent insertions of poly Ala, poly Gly,

poly Ser and mixtures of these.  These are found in some ESTs, so they are

real.  It is not clear why these sequences are inserted or what they do to the

structure of these P450s.
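One simple way to locate such insertions is to scan for runs made up only of Ala, Gly and Ser. The sketch below assumes a minimum run length of 6 residues, which is an arbitrary threshold chosen for illustration; low_complexity_runs is a hypothetical helper. The example sequence is one exon of CYP97A6 from the version 3 listing further down this page.

```python
import re

def low_complexity_runs(seq, residues="AGS", min_len=6):
    """Return (start, end, run) for each stretch of the given residues.

    Positions are 1-based and inclusive. min_len is an assumed cutoff.
    """
    pattern = re.compile("[%s]{%d,}" % (re.escape(residues), min_len))
    return [(m.start() + 1, m.end(), m.group()) for m in pattern.finditer(seq)]

# One exon of CYP97A6 as listed on this page (version 3 data).
exon = "FVSSQVALFGAATAHGLPQLEAAAAAAAAAAGDSRGGGA"
for start, end, run in low_complexity_runs(exon):
    print(start, end, run)
```

Lowering min_len or changing the residue set makes the scan more or less permissive.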

 

 

>CYP51G1 C_680007 10 EXONS 56% TO 51G1 Arab 

EST SUPPORT BI717817 BU649818 BI726293 BM001590 AV642299

 

60124 MDLPPELAVLADKVLSLSPVVLVALGSAVLILALAVGRVLFNLLPSKRPPVWEGLPFIGGLLKFTG 59927

59843 GPWKLLENGYAKFGECFTVPVAHRRVTFLIGPEVSPHFFKAGDDEMSQSE 59694

59394 VYDFNIPTFGRGVVFDVEQKVRTEQFRMFTEALTKNRLKSYVPHFNKEAE 59245

59108 EYFAKWGETGVVDFKDEFSKLITLTAARTLL 59016

58765 GREVREQLFDEVADLLHGLDEGMVPLSVFFPYAPIPVHFKRDR (2) 58637

58412 CRKDLAAIFAKIIRARRESGRREEDVLQQFIDAR 58311

58119 YQNVNGGRALTEEEITGLLIAVLFAGQHTSSITTSWTGIFMAANK 57985

57667 EHYNKAAEEQQDIIRKFGNELSFETLSEMEVLHRNITEALRMHPPLLLVMRYAKKPFSVTTSTGKSYVIPK 57455

57191 GDVVAASPNFSHMLPQCFNNPKAYDPDRFAPPREEQNKPYAFIGFGAGRHACIGQNFAYLQ (0) 57009

56877 IKSIWSVLLRNFEFELLDPVPEADYESMVIGPKPCRVRYTRRKL* 56743

 

newest data: version 3 checked April 24, 2006

Name: estExt_gwp_1H.C_70049

Protein ID: 126254

Location: Chlre3/scaffold_7:2481399-2484780

100% match

 

2481399 MDLPPELAVLADKVLSLSPVVLVALGSAVLILALAVGRVLFNLLPSKRPPVWEGLPFIGGLLKFTG (0) 2481596

2481680 GPWKLLENGYAKFGECFTVPVAHRRVTFLIGPEVSPHFFKAGDDEMSQSE (0) 2481829

2482129 VYDFNIPTFGRGVVFDVEQKVRTEQFRMFTEALTKNRLKSYVPHFNKEAE (0) 2482278

2482415 EYFAKWGETGVVDFKDEFSKLITLTAARTLL (1) 2482507

2482758 GREVREQLFDEVADLLHGLDEGMVPLSVFFPYAPIPVHFKRDR (2) 2482886

2483111 CRKDLAAIFAKIIRARRESGRREEDVLQQFIDAR (2) 2483212

2483404 YQNVNGGRALTEEEITGLLIAVLFAGQHTSSITTSWTGIFMAANK (0) 2483538

2483856 EHYNKAAEEQQDIIRKFGNELSFETLSEMEVLHRNITEALRMHPPLLLVMRYAKKPFSVTTSTGKSYVIPK (0) 2484068

2484332 GDVVAASPNFSHMLPQCFNNPKAYDPDRFAPPREEQNKPYAFIGFGAGRHACIGQNFAYLQ (0) 2484514

2484646 IKSIWSVLLRNFEFELLDPVPEADYESMVIGPKPCRVRYTRRKL* 2484780
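The exon lines above can be sanity-checked against their flanking coordinates. This is a minimal sketch, assuming the flanking numbers are nucleotide positions, that each printed residue (including the "*" stop) accounts for three nucleotides, and that a number in parentheses is the intron phase; check_exon is a hypothetical helper. Codons split across a phase 1 or phase 2 boundary could legitimately shift a span by a nucleotide or two, so a nonzero delta is a flag to look at, not proof of an error.

```python
import re

# start coordinate, residues, optional "(phase)", end coordinate
EXON_RE = re.compile(r"^\s*(\d+)\s+([A-Z*]+)\s*(?:\((\d)\))?\s*(\d+)")

def check_exon(line):
    """Compare the coordinate span of an exon line with 3 x residue count."""
    m = EXON_RE.match(line)
    if m is None:
        return None
    start, seq, phase, end = m.groups()
    nt = abs(int(end) - int(start)) + 1  # works for either strand direction
    return {
        "residues": len(seq),
        "nt": nt,
        "phase": int(phase) if phase is not None else None,
        "delta": nt - 3 * len(seq),  # 0 when span and residue count agree
    }

LINES = [
    "2482415 EYFAKWGETGVVDFKDEFSKLITLTAARTLL (1) 2482507",
    "58412 CRKDLAAIFAKIIRARRESGRREEDVLQQFIDAR 58311",
    "2484646 IKSIWSVLLRNFEFELLDPVPEADYESMVIGPKPCRVRYTRRKL* 2484780",
]
for line in LINES:
    print(check_exon(line))
```

The second example line is from the older assembly, where coordinates run in the opposite direction; the absolute difference handles both cases.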

 

>CYP55B1 C_2580005 (possible CYP55 fungal origin), 42% to 105T1

      MAPQHD (1)

47793 FPFSRPKGVEPPAEYKELRSKCPVAPGRLFDGSKIWLISRHKELKEVLQDGRFSK 47629 (0)

47243 VRTLPGFPELSPGGKAAAQSGNAATFVDMDPPEHTKYRY 47127 (0)

      missing about 20aa here ? seq gap

      AKADKLVDAMIARGGPLDLNEAFSMPLPFR 46168 (0) (same intron loc. as 55A6)

45913 VIYDFIGIPEADFAYLSANVAVRSSGSSNAKDAAAAADDLVKYMDNL 45773 (0)

45601 VAEKERNPTGKDLISELVTKQ 45539 (0)

45264 LRPGHMTREQLVQTAFLMLVAGNATVATQINLGVISLLQHPDQ 45136 (0)

44693 LAAMKADPARLVPAATEEICRFHTGSSYALRRLAVADVQVDGQ 44565 (0)

44256 LVKKGEGIIALNQSANRDESVFPDPDRFDIHRQSNPQQ 44143 (0)

43755 VGFGYGTHVCVAEWLARAEIQVAIGTLFRRLPNLRLAVPESQIQYSDPARDVGLAALPVTW* 43573

 

newest data: version 3 checked April 24, 2006

Name: e_gwW.52.47.1

Protein ID: 121742

Location: Chlre3/scaffold_52:370660-375180

Note gene model is too long at SMPLPFRVGGW, shorten by 4 amino acids

First exon is still my best guess, not in gene model e_gwW.52.47.1

51% to CYP55A5v1 Aspergillus oryzae

48% to CYP55A3 Cylindrocarpon tonkinense

42% to 105T1 Burkholderia fungorum (bacteria)

370660 MAPQH (1) 370674

370738 DFPFSRPKGVEPPAEYKELRSKCPVAPGRLFDGSKIWLISRHKELKEVLQDGRFSK (0) 370905

371291 VRTLPGFPELSPGGKAAAQSGNAATFVDMDPPEHTKYR (2) 371404

371628 GMVWPYLTPEAVEQLRPSIQ (0) 371677

372474 AKADKLVDAMIARGGPLDLNEAFSMPLPFR (0) 372563

372818 VIYDFIGIPEADFAYLSANVAVRSSGSSNAKDAAAAADDLVKYMDNL (0) 372958

373130 VAEKERNPTGKDLISELVTKQ (0) 373192

373467 LRPGHMTREQLVQTAFLMLVAGNATVATQINLGVISLLQHPDQ (0) 373595

374038 LAAMKADPARLVPAATEEICRFHTGSSYALRRLAVADVQVDGQ (0) 374166

374475 LVKKGEGIIALNQSANRDESVFPDPDRFDIHRQSNPQQ (0) 374588

374995 VGFGYGTHVCVAEWLARAEIQVAIGTLFRRLPNLRLAVPESQIQYSDPARDVGLAALPVTW* 375180

 

>CYP97A5 15 EXONS 60% TO 97A3 FIRST EXON PREDICTED BY GENSCAN

C_4260002 C_1040015

no mRNA or homology evidence for exon 1

note: CYP97A6 has homology to exon 2, but no upstream match for 5000bp

EST support = cyan BM003139 BI725954 BE441929 BI719213 CF555158

Gray resembles a cycad EST

13351 MPPDVSGNMLSFSTSISGCRF (1)

373428 GRSAARFLADLGRQWRAEASKRMPE (0) 373502

12913 ARGDIREIVGQPVFVPLYKLFLVYGKIFRLSFGPKSFVIISDPAYAKQILLTNADKYSKGLLSEILDFVMGT 12698 (0)

12532 GLIPADGEIWKARRRAVVPALHRK 12461

12332 YVMSMVDMFGDCAAHGASATLDKYAASG 12249

11994 TSLDMENFFSRLGLDIIGKAVFNYDFDSLAHDDPVIQ 11884

11707 AVYTLLREAEHRSTAPIAYWNIPGIQFV 11624

11493 VPRQKRCQEALVLVNECLDGLIDKCKKLV 11407

11269 EEEDAVFGEEFLSERDPSILHFLLASGDEISSKQ (0) 11168

11003 LRDDLMTMLIAGHETTAAVLTWTLYLLSQHPEAAAAIRKE (0) 10884

10681 VDELLGDRKPGVEDLRALK (0) 10625

10448 MTTRVINEAMRLYPQPPVLIRRALQ 10374

10118 DDHFDQFTVPAGSDLFISVWNLHRSPKLWDEPDKFKPER 10002

 9580 FGPLDSPIPNEVTENFAYLPFGGGRRKCIGDQ 9485

 9358 FALFEAVVALAMLMRRYEFNLDESKGTVGMTT 9263

 9124 GATIHTTNGLNMFVRRRDPLTVPPTSSSVAETVSTGYAFACG

      PAVMPVASAEVVAAPATAAGGGCPFHTAAGAAVPAATMSLRPTGPPSA* 8852

 

newest data: version 3 checked April 28, 2006

Name:   gwH.55.10.1

Protein ID:    39257

Location:       Chlre3/scaffold_55:373287-377786

This model differs from seq below at ends

100% match from ARGDIRE to DPLTVP

EST support = cyan BM003139 BI725954 BE441929 BI719213 CF555158

Gray resembles a cycad EST

 

scaffold_55 16 exons

373287 MPPDVSGNMLSFSTSISGCRF (1) 373349

373428 GRSAARFLADLGRQWRAEASKRMPE (0) 373502

373725 ARGDIREIVGQPVFVPLYKLFLVYGKIFRLSFGPKSFVIISDPAYAKQIL 373874

373875 LTNADKYSKGLLSEILDFVMGT (0) 373940

374106 GLIPADGEIWKARRRAVVPALHRK (2) 374177

374306 YVMSMVDMFGDCAAHGASATLDKYAAS (1) 374386

374641 GTSLDMENFFSRLGLDIIGKAVFNYDFDSLAHDDPVIQ (0) 374754

374931 AVYTLLREAEHRSTAPIAYWNIPGIQFV (0) 375014

375145 VPRQKRCQEALVLVNECLDGLIDKCKKL (0) 375228

375366 VEEEDAVFGEEFLSERDPSILHFLLASGDEISSKQ (0) 375470

375635 LRDDLMTMLIAGHETTAAVLTWTLYLLSQHPEAAAAIRKE (0) 375754

375957 VDELLGDRKPGVEDLRALK (0) 376013

376190 MTTRVINEAMRLYPQPPVLIRRALQ (0) 376264

376520 DDHFDQFTVPAGSDLFISVWNLHRSPKLWDEPDKFKPER (2) 376636

377058 FGPLDSPIPNEVTENFAYLPFGGGRRKCIGDQ (0) 377153

377280 FALFEAVVALAMLMRRYEFNLDESKGTVGMTT (1) 377375

377514 GATIHTTNGLNMFVRRRDPLTVPPTSSSVAETVSTGYAFACGPAVMPVAS 377663

377664 AEVVAAPATAAGGGCPFHTAAGAAVPAATMSLRPTGPPSA* 377786

 

>CB092428.1 hf05f08.g1 Cycad Leaf Library (NYBG) Cycas rumphii cDNA clone

hf05f08, mRNA sequence.

Length=609

 

This seq supports the second and third exons above.

 

Query  40   GRSAARFLADLGRQWRAEASKRMPEVRLELRPCDGGGRASCPVLGKSTYTARGDIREIVG  99

            GR+ A+ +A   ++WRA  + +MPE                         ARG++R + G

Sbjct  383  GRALAKSIAVAEQKWRAHNASKMPE-------------------------ARGNVRAVAG  487

 

Query  100  QPVFVPLYKLFLVYGKIFRLSFGPKSFVIISDPAYAKQIL  139

            QP FVPLY LFL YG +FRL+FGPKSFVI+SDPA AK IL

Sbjct  488  QPFFVPLYNLFLTYGGVFRLTFGPKSFVIVSDPAIAKHIL  607

 

VVQCAGQAGIRPGFEARAIAWPRCVFVSAKTRGFRLNKRVSNDFLGRQLTIKSFSNRQRG

GKIRAATVSSLNEGGGGNEPAVERVERLTEEDRAELSVRIAAGEFTAEPVTLNLLKIRLF

LIKFGAP GRALAKSIAVAEQKWRAHNASKMPEARGNVRAVAGQPFFVPLYNLFLTYGGVF

RLTFGPKSFVIVSDPAIAKHIL

 

volvox matches

>ABSY36486.y1  CHROMAT_FILE: ABSY36486.y1 PHD_FILE:     [top]

           ABSY36486.y1.phd.1 CHEM: term DYE: ET TIME: Fri Sep  5

 

Query: 22  GRSAARFLADLGRQWRAEASKRMPE 46

           GR  ARFLADLGR+WR+EA+KRMPE

Sbjct: 240 GRPVARFLADLGRRWRSEAAKRMPE 314

 

>ABSY25604.b1  CHROMAT_FILE: ABSY25604.b1 PHD_FILE:     [top]

           ABSY25604.b1.phd.1 CHEM: term DYE: big TIME: Tue Sep 16

           11:06:39 2003

          Length = 1069

 

Query: 46  EARGDIREIVGQPVFVPLYKLFLVYGKIFRLSFGPKSFVIISDPAYAKQILLTNADKYSK 105

           +ARGDIREIVGQPVFVPLYKLFLVYGKIFRLSFGPKSFVIISDPAYAKQILLTNADKYSK

Sbjct: 340 QARGDIREIVGQPVFVPLYKLFLVYGKIFRLSFGPKSFVIISDPAYAKQILLTNADKYSK 519

 

Query: 106 GLLSEILDFVMGT 118

           GLLSEILDFVMGT

Sbjct: 520 GLLSEILDFVMGT 558
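Percent identity figures like those quoted on this page can be recomputed from an aligned Query/Sbjct pair. The sketch below assumes identity is defined as identical columns divided by alignment length, with gap columns counted as mismatches; percent_identity is an illustrative helper, not a BLAST function. The example pair is the first Volvox match shown above.

```python
def percent_identity(query, subject):
    """Percent of alignment columns where both rows carry the same residue."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have equal length")
    if not query:
        raise ValueError("empty alignment")
    # A column counts as a match only if the residues agree and neither is a gap.
    matches = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    return 100.0 * matches / len(query)

query = "GRSAARFLADLGRQWRAEASKRMPE"
sbjct = "GRPVARFLADLGRRWRSEAAKRMPE"
print(round(percent_identity(query, sbjct), 1))
```

For this 25-column alignment the function counts 20 identical positions, which agrees with the number of letters on the BLAST midline.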

 

>CYP97A6 C_310063 missing exon 1

(0)  VRVPLNNVGKVPIFQLLYELYSS (2)

(2)  HGGVFRMRLGPKSFLVLSDPGAVRQVLVGAVDKYS (2)

9247 KGILAEILEFVMGN (0) 9306

 seq gap missing 2 exons

 

 9705 XSVDMESFFSRLSLDIIGKSVFDYDFDSLRHDDPVIQ 9812

10081 AVYSVLRESTVRSTAPFP 10128 (1)

10371 YWKLPGISLLVPRLRESDAALAIVNDTLDRLIARCKSM 10487 (0)

      LEAEGSIPMPASPSSPSSSTATSSSAPSSPSAPLEESSA

10853 PTVLHFLLGSGEALNSRQLRDDLMTLLIAGHETTAAV 10963

11275 LTWALHLLVAHPEVMKRVRDE 11337

11605 VDWVLGDRLPGSDDLPLLRYTTRVVNEALRLYPQPPVLIRRAMQ 11736

11956 DDVLPGGHVVAAGTDLFISVWNLHHSPQLWERPEAFDPDR 12075

12251 FGPLDSPPPTEFSTDFRFLPFGGGRRKCVGDMFAIAECVVALAVVLRRYDFAPDTSFGPVGFKS 12442

12584 GATINTSNGLHMLISRRDLT 12643

12644 GVPPPAPRAPAAAAGAAAGSCPHAAAAAATAAAAAAVGCPHAAAAATSGAPAGVTP 12811

 

newest data: version 3 checked April 27, 2006

Name:e_gwW.42.59.1

Protein ID:121076

Location:Chlre3/scaffold_42:732596-737181

100% to e_gwW.42.59.1 from VRVPL to MLISRR

scaffold_42

cannot identify exon 1

732596 VRVPLNNVGKVPIFQLLYELYSS (2) 732667

733002 HGGVFRMRLGPKSFLVLSDPGAVRQVLVGAVDKYS (2) 733106

733345 KGILAEILEFVMGN (0) 733386

733631 GLLAADGEHWIARRRVVAPALQRK (2) 733702

733949 FVSSQVALFGAATAHGLPQLEAAAAAAAAAAGDSRGGGA 734065

734066 ASVDMESFFSRLSLDIIGKSVFDYDFDSLRHDDPVIQ (0) 734176

734445 AVYSVLRESTVRSTAPFP (1) 734498

734738 YWKLPGISLLVPRLRESDAALAIVNDTLDRLIARCKSMVGRCCGGGGGGGGG (0) 734893

       SSAPTVLHFLLGSGEALNSRQLRDDLMTLLIAGHETTAA (0) 735324

735636 ALTWALHLLVAHPEVMKRVRDE (0) 735701

735969 VDWVLGDRLPGSDDLPLLRYTTRVVNEALRLYPQPPVLIRRAMQ (0) 736100

736320 DDVLPGGHVVAAGTDLFISVWNLHHSPQLWERPEAFDPDR (2) 736439

736615 FGPLDSPPPTEFSTDFRFLPFGGGRRKCVGDMFAIAECVVALA

       VVLRRYDFAPDTSFGPVGFKS (1) 736806

736948 GATINTSNGLHMLISRRDLTGGVPPPAPRAPAAAAGAAAGSCPHAAAAAATAAAAAA

       VGCPHAAAAATSGAPAGVTPQ* 737181

 

54% to DY932408.1 plains sunflower Helianthus petiolaris

MAASLTTLQFPSPYLNTPTTKFKLKSPSTSFPKSYGVSRSCGIKCSYSNGRKPD

SGEEKSGKKVEMTPEEKRRAELSARIASGAFTVEQPSLGSLLVSGLAKLGVPSNILEPVS

NLINSGGNYPKIPEAKGAISAIRSEAFFIP