BLACK COTTONWOOD P450S ACCESSION NUMBERS: 566 complete models shown in yellow

Last modified Mar. 7, 2005  David R. Nelson

Made some minor name revisions on 8/24/2006, mostly changes of v1, v2 to P1, P2, etc.

 

Complete list of P450s found by BLAST searches; yellow entries are finished, assembled sequences.

Cyan sequences have a model that has been retrieved and placed in a sequence file of cottonwood P450s.

Many of the models are short or otherwise wrong.

566 total cottonwood sequences, 566 assembled, all P450 models (524) retrieved. Some P450 sequences have no models at JGI. Two adjacent models are sometimes from a single gene. Some JGI models fuse two different genes into a hybrid gene.

One bacterial contamination = scaffold_2810. A CYP39 contamination = scaffold_11900.

 

61 sequences were found in the JGI annotation but not in my first 5-6 BLAST searches.

53 sequences in my BLAST output had no P450 models in JGI.

 

Searched with 85A, 90A, 71B2, 97A3 and 72A7.

 

Sequences in the 53 families shown in blue have been assigned names.

 

All 566 sequences are named.

 

Plant P450s sort into 62 families in 10 clans.

Clans are clusters of related families. 

Six clans have only one family: CYP51 (2), CYP74 (9), CYP97 (3), CYP710 (1), CYP711 (3) and CYP727 (2) (20 sequences in this group). All have been assembled and named.

 

All members of these six single-family clans have been assembled.

 

Parentheses show how many sequences are in the collection for each family.

These numbers include pseudogenes and possible duplicate sequences, so the count of full-length functional sequences is lower.

All sequences have been assigned to one of the 62 plant families or to CYP749, the only new plant family in cottonwood (CYP72 clan).

 

The CYP71 clan has 26 families; some are specific to other taxa (CYP99, CYP719, CYP723, CYP726). That leaves 22 families: CYP71 (66), CYP73 (4), 75 (3), 76 (17), 77 (4), 78 (14), 79 (6), 80 (8), 81 (50), 82 (36), 83 (21), 84 (4), 89 (19), 92 (13), 93 (9), 98 (6), 701 (1), 703 (2), 705 (8), 706 (5), 712 (7), 736 (25). These are now all assembled (328 sequences in this clan).

 

The CYP72 clan has 8 families. All are present in cottonwood: CYP72 (21), 709 (2), 714 (7), 715 (3), 721 (13), 734 (2), 735 (2), 749 (18). All are assembled (68 sequences in this clan).

 

The CYP85 clan has 16 families. CYP702, CYP708 and CYP725 seem to be absent in Populus. That leaves 13 families: CYP85 (4), 87 (20), 88 (4), 90 (9), 707 (8), 716 (27), 718 (2), 720 (1), 722 (2), 724 (2), 728 (17), 729 (2), 733 (1). These have all been assembled and named (99 sequences in this clan).

 

The CYP86 clan has 7 families. CYP730, CYP731 and CYP732 have been found only in rice so far. There are 4 families in cottonwood: CYP86 (9), CYP94 (16), CYP96 (11) and CYP704 (15). All members of the CYP86 clan have been assembled and named (51 sequences in this clan).

 

All 566 sequences are now assembled.
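The per-family counts given in the clan summaries can be cross-checked against the stated clan totals and the overall total of 566. The following is a minimal Python sketch of that tally. The counts are transcribed by hand from the text above; the dictionary layout and the variable names are illustrative assumptions and are not part of the original data files.

```python
# Per-family sequence counts, transcribed from the clan summaries above.
# The grouping key "single-family clans" is a label chosen for this sketch.
clans = {
    "single-family clans": {
        "CYP51": 2, "CYP74": 9, "CYP97": 3,
        "CYP710": 1, "CYP711": 3, "CYP727": 2,
    },
    "CYP71": {
        "CYP71": 66, "CYP73": 4, "CYP75": 3, "CYP76": 17, "CYP77": 4,
        "CYP78": 14, "CYP79": 6, "CYP80": 8, "CYP81": 50, "CYP82": 36,
        "CYP83": 21, "CYP84": 4, "CYP89": 19, "CYP92": 13, "CYP93": 9,
        "CYP98": 6, "CYP701": 1, "CYP703": 2, "CYP705": 8, "CYP706": 5,
        "CYP712": 7, "CYP736": 25,
    },
    "CYP72": {
        "CYP72": 21, "CYP709": 2, "CYP714": 7, "CYP715": 3,
        "CYP721": 13, "CYP734": 2, "CYP735": 2, "CYP749": 18,
    },
    "CYP85": {
        "CYP85": 4, "CYP87": 20, "CYP88": 4, "CYP90": 9, "CYP707": 8,
        "CYP716": 27, "CYP718": 2, "CYP720": 1, "CYP722": 2, "CYP724": 2,
        "CYP728": 17, "CYP729": 2, "CYP733": 1,
    },
    "CYP86": {"CYP86": 9, "CYP94": 16, "CYP96": 11, "CYP704": 15},
}

# Clan totals as stated in the text.
stated_totals = {
    "single-family clans": 20, "CYP71": 328, "CYP72": 68,
    "CYP85": 99, "CYP86": 51,
}

clan_sums = {clan: sum(fams.values()) for clan, fams in clans.items()}
family_count = sum(len(fams) for fams in clans.values())
grand_total = sum(clan_sums.values())

for clan, total in clan_sums.items():
    status = "ok" if total == stated_totals[clan] else "MISMATCH"
    print(f"{clan}: {total} (stated {stated_totals[clan]}) {status}")
print(f"families: {family_count}, sequences: {grand_total}")
```

With the counts as transcribed here, each clan sum agrees with its stated total, and the tally gives 53 families and 566 sequences.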

 

Scaffold_2810 is bacterial contamination from another JGI genome, Ralstonia metallidurans (100% match).

 

Scaffold_11900 is a CYP39 contamination, most like chicken CYP39.

 

Most CYP71 clan members have only one intron in the middle region, making them fairly easy to assemble. The CYP701 family is an exception, and it should belong in another clan.
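Each entry in the coordinate listing that follows uses the same whitespace-separated layout: linkage group or scaffold, strand in parentheses, a start-end coordinate pair, the CYP name, and free-text notes. The following is a minimal Python sketch of a parser for that layout. The `Entry` type, the `parse_entry` and `span_length` functions, and the regular expression are hypothetical helpers inferred from the visible lines; they are not a tool that was used to build the list. The length calculation assumes the coordinates are inclusive.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    scaffold: str   # e.g. "LG_I" or a scaffold name
    strand: str     # "+" or "-"
    start: int      # first coordinate as listed
    end: int        # second coordinate as listed
    name: str       # CYP name, including suffixes such as "v1" or "-se1[1]"
    notes: str      # free-text remainder of the line (may be empty)


# Layout: scaffold, (strand), start-end, CYP name, optional notes.
ENTRY_RE = re.compile(
    r"^(?P<scaffold>\S+)\s+\((?P<strand>[+-])\)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)\s+"
    r"(?P<name>CYP\S+)\s*(?P<notes>.*)$"
)


def parse_entry(line: str) -> Optional[Entry]:
    """Parse one listing line; return None if it does not fit the layout."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    return Entry(
        scaffold=m["scaffold"],
        strand=m["strand"],
        start=int(m["start"]),
        end=int(m["end"]),
        name=m["name"],
        notes=m["notes"].strip(),
    )


def span_length(entry: Entry) -> int:
    """Genomic span in bp, assuming inclusive coordinates (an assumption)."""
    return abs(entry.end - entry.start) + 1


example = parse_entry("LG_I          (-)   174753-172748   CYP716D1 46% to 716A1")
print(example)
print(span_length(example))
```

Most minus-strand entries in the listing give the larger coordinate first, so the sketch keeps the two coordinates in listed order and uses the absolute difference for the span.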

 

 

LG_I          (-)   174753-172748   CYP716D1 46% to 716A1

LG_I          (-)   178465-176158   CYP716D2v1 47% to 716A1

LG_I          (-)   185470-183274   CYP716D3v1 47% to 716A1

LG_I          (-)   183202-183113   CYP716D3-de1bv1 partial exon 1 pseudogene -de1b

LG_I          (-)  1663727-1659176  CYP727B1 44% to CYP727A1, 85% to 727B2

LG_I          (+)  2171265-2172908  CYP80E1 39% to 76C4

LG_I          (-)  3924905-3921817  CYP78D3v1 51% to 78A5

LG_I          (-)  4925909-4924104  CYP51G1

LG_I          (+)  6124306-6126694  CYP78A22 68% to 78A9

LG_I          (-)  8142817-8140384  CYP87A7 76% to 87A2 ortholog?

LG_I          (+)  8155946-8158046  CYP80E2 41% to 76G1

LG_I          (-)  8580345-8578719  CYP76F3 47% to 76C4

LG_I          (-) 16937584-16936049 CYP86A18 78% to 86A1 Arab. 90% to 86A19

LG_I          (+) 17454906-17459858 CYP707A9 69% to 707A2

LG_I          (-) 19620275-19616562 CYP87D1P N-term and C-term only (seq gap)

LG_I          (-) 19635300-19632986 CYP87D2v1 45% to 87A2

LG_I          (-) 19665613-19663169 CYP87D3 97% to scaf_9597

LG_I          (-) 19679729-19677282 CYP87D4 44% to 87A2

LG_I          (-) 19683439-19680773 CYP87D5 50% to 87A2

LG_I          (-) 19703740-19702073 CYP705B6 44% to 705A2

LG_I          (+) 19811612-19812590 CYP71B44P like pseudogene 51% to 71B36

LG_I          (+) 19972937-19975122 CYP75A11 80% to 75A8 87% to CYP75A12

LG_I          (-) 20250132-20251678 CYP94F1  46% to 94D22

LG_I          (+) 20278749-20279837 CYP87D6P 74% to LG_IX   (+)  1537124

LG_I          (-) 22500323-22497486 CYP724B8 55% to 724B1, 52% to 724B7

LG_I          (-) 23341018-23339071 CYP82J1 50% to 82C2

LG_I          (-) 23611288-23610335 CYP82J1P

LG_I          (-) 23945857-23944313 CYP94B7  61% to 94B2

LG_I          (-) 24001619-24000402 CYP89A25 53% to 89A5

LG_I          (-) 29491328-29490406 CYP71D36Pv1 pseudogene 61% to CYP71D26

LG_I          (-) 29497891-29497622 CYP71D35Pv1 pseudogene no model exists

LG_I          (-) 29516258-29514652 CYP71D34 93% to CYP71D28v1 52% to 71D10

LG_I          (-) 29524379-29524047 CYP71D33P  pseudogene

LG_I          (-) 29530020-29529344 CYP71D32P pseudogene

LG_I          (-) 29532948-29532496 CYP71D31P pseudogene

LG_I          (-) 29540735-29539563 CYP71D30P pseudogene

LG_I          (-) 29575643-29574039 CYP71D29 54% to 71D11 93% to 71D34

LG_I          (-) 29606262-29604787 CYP71D28v1 55% to 71D11

LG_II         (+)   611387-613607   CYP87D7P pseudogene 47% to 87A2

LG_II         (-)  1594296-1594132  CYP83F1-se1[1] pseudogene N-term fragment

LG_II         (-)  1597238-1596704  CYP83F1-se2[1] pseudogene no gene model exists

LG_II         (-)  1617938-1616292  CYP83F1 53% to 83A2, 97% to 83F4, 72% to 83F2

LG_II         (+)  1630135-1630206  CYP83F2-se1[1]  N-term fragment

LG_II         (+)  1650845-1651086  CYP83F2-se2[2] pseudogene no gene model exists

LG_II         (+)  1655600-1657684  CYP83F2 48% to 83A2, 96% to 83F3

LG_II         (+)  1667099-1667266  CYP83F2-se3[2] pseudogene 100% to LG_II 1702772

LG_II         (+)  1671977-1672045  CYP83F2-se4[1] N-term fragment

LG_II         (+)  1685541-1685612  CYP83F2-se5[1] N-term fragment

LG_II         (+)  1688396-1689904  CYP83F3v3 95% to 83F3v1

LG_II         (+)  1697711-1697881  CYP83F3-se1[2] pseudogene no gene model exists

LG_II         (+)  1702691-1702858  CYP83F3-se2[2] pseudogene no gene model exists

LG_II         (+)  1707953-1708189  CYP83F3v1-de1b detritus exon 1

LG_II         (+)  1708840-1710924  CYP83F3v1 48% to 83A2, 96% to 83F2, 63% to 83F5

LG_II         (+)  2131180-2134259  CYP722A1 61% to 722A1 46% to 722B1 in rice

LG_II         (+)  2695301-2696854  CYP94A11 67% to 94A4, 85% to 94A9v1

LG_II         (+)  2705673-2707202  CYP94A12 67% to 94A4, 83% to 94A9v1

LG_II         (-)  4100255-4098382  CYP718A1 67% to 718A1 Arab. = ortholog

LG_II         (+)  4123900-4125604  CYP78A20 67% to 78A7

LG_II         (+)  4750944-4753397  CYP733A1 70% to rice 733A1 = ortholog

LG_II         (-)  9145798-9143865  CYP81B5 50% to 81D8

LG_II         (-)  9151207-9149369  CYP81B4 49% to 81D2

LG_II         (-)  9154259-9152268  CYP81B3v1 47% to 81D8

LG_II         (+)  9173256-9175652  CYP81C5 51% to 81K1

LG_II         (+)  9525988-9529091  CYP707A14 69% to 707A4

LG_II         (-)  9774605-9770969  CYP701A11 68% to 701A3

LG_II         (-)  9856083-9854503  CYP74A1

LG_II         (-) 10070853-10066188 CYP721A3 53% to 721A1 no correct model exists

LG_II         (-) 10420336-10419995 CYP714E3P no correct model exists 69% to 714E2

LG_II         (+) 11243028-11244627 CYP76T1 51% to 76C4

LG_II         (+) 11302436-11304039 CYP76T2 50% to 76C4

LG_II         (+) 11339047-11340633 CYP76T3 52% to 76C4

LG_II         (+) 11361815-11362516 CYP76T4 partial seq

LG_II         (+) 11370278-11371869 CYP76T5 52% to 76C4

LG_II         (-) 13208335-13206633 CYP703A4 78% to 703A2

LG_II         (+) 13511138-13512986 CYP78A24 72% to 78A9

LG_II         (-) 16041237-16040849 CYP79D10P 78% to 79D8

LG_II         (-) 16609905-16609795 CYP82C-se3[2] pseudogene no gene model exists

LG_II         (+) 24351775-24351969 CYP721A-se1[4] no gene model exists

LG_II         (+) 24360275-24360469 CYP721A-se2[4] C-term, no model exists

LG_II         (+) 24440445-24442978 CYP721A2v1 100% to scaffold_1746

LG_III        (+)  3591921-3593661  CYP92A21 3 aa diffs to LG_III (+) 3610917

LG_III        (+)  3593032-3593661  CYP92A20P duplicate of exon 2

LG_III        (+)  3610917-3612657  CYP92A20 63% to 92A9

LG_III        (+)  3641419-3641829  CYP92A19P

LG_III        (+)  3649066-3651041  CYP92A19 71% to 92A9

LG_III        (+) 10170571-10172100 CYP89A26 48% to 89A5 no introns

LG_III        (+) 11000576-11004081 CYP706D1

LG_III        (+) 11353007-11354677 CYP76F4 47% to 76C4

LG_III        (-) 11661882-11664294 CYP87A8 76% to 87A2

LG_III        (-) 12109591-12107981 CYP86A21 74% to 86A7 Arab. 76% to 86A20

LG_III        (-) 13439255-13437042 CYP78A23 68% to 78A9

LG_III        (+) 14476000-14477787 CYP51G5

LG_III        (+) 15447061-15449577 CYP78D2 50% to 78A5

LG_III        (+) 17762390-17766956 CYP727B2 43% to CYP727A1, 85% to 727B1

LG_IV         (+)   521696-523425   CYP728D7v1 74% to 728D4

LG_IV         (+)   526815-528578   CYP728D8 54% to 728B3

LG_IV         (-)   589294-587777   CYP77B3 68% to 77B1 no introns 71% to CYP77B4

LG_IV         (+)   605695-607218   CYP77B4 66% to 77B1 no introns 71% to CYP77B3

LG_IV         (+)  3871814-3873941  CYP79D8 56% to 79D1, 73% to 79D5

LG_IV         (+)  5434809-5437753  CYP735A6 69% to 735A1, 91% to 735A5

LG_IV         (+)  7961227-7961940  CYP81S10P

LG_IV         (+)  8826205-8827922  CYP82L1 55% to 82G1

LG_IV         (-) 10881282-10879750 CYP710A12 68% to 710A1

LG_IV         (+) 13600264-13602937 CYP85A1  68% to CYP85A1 Arab., 90% to 85A3

LG_IV         (+) 13604641-13606601 CYP85A1P duplicate of exons 2-8 100% identical

LG_IV         (+) 13987032-13988462 CYP74C6v1 79% to CYP74C8

LG_IV         (+) 13991961-13993406 CYP74C7 78% to 74C6v1

LG_IV         (+) 14183519-14183770 CYP87D-se2[6:7] exons 6,7 identical to scaff_12583

LG_IV         (+) 15006407-15008324 CYP84A13P pseudogene 65% to 84A1

LG_IV         (-) 16503207-16503109 CYP87D11 44% to 87A2

LG_V          (+)   371877-373430   CYP96F1 46% to 96A10

LG_V          (+)   388448-390004   CYP96F2 45% to 96A10

LG_V          (-)   490310-489234   CYP87D13P  pseudogene partial

LG_V          (+)  1583967-1585490  CYP89A18 63% to 89A5

LG_V          (-)  1989840-1988143  CYP78A18 70% to 78A7

LG_V          (+)  2585685-2588232  CYP86B4 72% to 86B1 92% to 86B5

LG_V          (-)  2790530-2789037  CYP96G1