BLACK COTTONWOOD P450S ACCESSION NUMBERS 566 models complete shown in yellow
Last modified Mar. 7, 2005 David R. Nelson
Made some minor name revisions on 8/24/2006 mostly v1, v2 changes to P1, P2 etc.
Complete list of P450s found by blast searches, yellow are finished assembled seqs.
cyan seqs have a model that is retrieved and in a sequence file of cottonwood P450s.
Many of the models are short or otherwise wrong.
566 total cottonwood sequences, 566 assembled, all P450 models (524) retrieved. Some P450 sequences have no models at JGI. Two adjacent models are sometimes from a single gene. Some JGI models fuse two different genes into a hybrid gene.
One bacterial contamination = scaf_2810. A CYP39 contamination = scaffold_11900.
61 sequences were found in the JGI annotation but not in my first 5-6 blast searches.
53 sequences in my blast output had no P450 models in JGI.
Searched with 85A, 90A, 71B2, 97A3, 72A7.
Sequences in the 53 families shown in Blue have been assigned names.
All 566 sequences are named.
Plant P450s sort into 62 families in 10 clans.
Clans are clusters of related families.
Six clans have only one family: CYP51 (2), CYP74 (9), CYP97 (3), CYP710 (1), CYP711 (3), CYP727 (2)
(20 sequences in this group). All have been assembled and named.
All members of these 6 one-family clans have been assembled.
Parentheses show how many sequences are in the collection for each family.
These numbers include pseudogenes and possible duplicate sequences, so the count of full-length functional sequences is less.
All sequences have been shown as belonging to one of the 62 plant families, or CYP749, the only new plant family in cottonwood (CYP72 clan).
The CYP71 clan has 26 families; some are specific to other taxa, like CYP99, CYP719, CYP723, CYP726. That leaves 22 families.
CYP71 (66), CYP73 (4), 75 (3), 76 (17), 77 (4), 78 (14), 79 (6), 80 (8), 81 (50), 82 (36), 83 (21), 84 (4), 89 (19), 92 (13), 93 (9), 98 (6), 701 (1), 703 (2), 705 (8), 706 (5), 712 (7), 736 (25). These are now all assembled. (328 sequences in this clan)
The CYP72 clan has 8 families. All are present in cottonwood. CYP72 (21), 709 (2), 714 (7), 715 (3), 721 (13), 734 (2), 735 (2), 749 (18). All are assembled. (68 sequences in this clan)
The CYP85 clan has 16 families. CYP702, 708 and CYP725 seem to be absent in Populus.
That leaves 13 families. CYP85 (4), 87 (20), 88 (4), 90 (9), 707 (8), 716 (27), 718 (2), 720 (1), 722 (2), 724 (2), 728 (17), 729 (2), 733 (1). These have all been assembled and named. (99 sequences in this clan)
The CYP86 clan has 7 families. CYP730, CYP731 and CYP732 are only found in rice so far.
There are 4 families in cottonwood: CYP86 (9), CYP94 (16), CYP96 (11) and CYP704 (15). All members of the CYP86 clan have been assembled and named. (51 sequences in this clan)
All 566 sequences are now assembled.
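
The per-family counts given in the clan summaries can be cross-checked against the stated clan totals and the overall figure of 566. The following is a minimal sketch of that arithmetic in Python; the dictionary layout and the names CLAN_COUNTS, STATED_TOTALS and clan_totals are illustrative assumptions, and only the numbers are taken from the text.

```python
# Per-family sequence counts transcribed from the clan summaries in this list.
# The grouping key "single-family clans" is a label chosen for this sketch.
CLAN_COUNTS = {
    "single-family clans": {
        "CYP51": 2, "CYP74": 9, "CYP97": 3,
        "CYP710": 1, "CYP711": 3, "CYP727": 2,
    },
    "CYP71": {
        "CYP71": 66, "CYP73": 4, "CYP75": 3, "CYP76": 17, "CYP77": 4,
        "CYP78": 14, "CYP79": 6, "CYP80": 8, "CYP81": 50, "CYP82": 36,
        "CYP83": 21, "CYP84": 4, "CYP89": 19, "CYP92": 13, "CYP93": 9,
        "CYP98": 6, "CYP701": 1, "CYP703": 2, "CYP705": 8, "CYP706": 5,
        "CYP712": 7, "CYP736": 25,
    },
    "CYP72": {
        "CYP72": 21, "CYP709": 2, "CYP714": 7, "CYP715": 3,
        "CYP721": 13, "CYP734": 2, "CYP735": 2, "CYP749": 18,
    },
    "CYP85": {
        "CYP85": 4, "CYP87": 20, "CYP88": 4, "CYP90": 9, "CYP707": 8,
        "CYP716": 27, "CYP718": 2, "CYP720": 1, "CYP722": 2, "CYP724": 2,
        "CYP728": 17, "CYP729": 2, "CYP733": 1,
    },
    "CYP86": {"CYP86": 9, "CYP94": 16, "CYP96": 11, "CYP704": 15},
}

# Clan totals as stated in the text.
STATED_TOTALS = {
    "single-family clans": 20,
    "CYP71": 328,
    "CYP72": 68,
    "CYP85": 99,
    "CYP86": 51,
}


def clan_totals(counts):
    """Sum the per-family counts within each clan."""
    return {clan: sum(families.values()) for clan, families in counts.items()}


totals = clan_totals(CLAN_COUNTS)
grand_total = sum(totals.values())
family_count = sum(len(families) for families in CLAN_COUNTS.values())

for clan, total in totals.items():
    status = "ok" if total == STATED_TOTALS[clan] else "MISMATCH"
    print(f"{clan}: {total} (stated {STATED_TOTALS[clan]}) {status}")
print(f"all sequences: {grand_total}")
print(f"families present in cottonwood: {family_count}")
```

Under these counts each clan sum agrees with its stated total, the grand total is 566, and the number of families present is 53, matching the figures quoted above.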
Scaffold_2810 is a bacterial contamination from another JGI genome, Ralstonia metallidurans (100% match).
Scaffold_11900 is a CYP39 contamination, most like chicken CYP39.
Most CYP71 clan members have only one intron in the middle region, making them fairly easy to assemble. The CYP701 family is an exception, and it should belong in another clan.
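
The records that follow give, in order, a linkage group, a strand in parentheses, a coordinate range, a CYP name, and free-text notes (percent identities, pseudogene remarks, model status). A minimal Python sketch for reading one such record is shown here. It assumes each record sits on a single line and begins with an LG_ linkage group written in Roman numerals; the names P450Record, RECORD_RE and parse_record are hypothetical, and the length calculation assumes the two coordinates are inclusive, which the list itself does not state.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern for one record, e.g. "LG_I (-) 174753-172748 CYP716D1 46% to 716A1".
# Assumption: linkage groups are written LG_ plus Roman numerals.
RECORD_RE = re.compile(
    r"^(?P<group>LG_[IVX]+)\s+"
    r"\((?P<strand>[+-])\)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)\s+"
    r"(?P<name>CYP\S+)"
    r"\s*(?P<note>.*)$"
)


@dataclass
class P450Record:
    linkage_group: str
    strand: str
    start: int   # first coordinate as listed (larger than end on the minus strand)
    end: int     # second coordinate as listed
    name: str
    note: str

    @property
    def length(self) -> int:
        """Span in bases, assuming both coordinates are inclusive."""
        return abs(self.end - self.start) + 1


def parse_record(line: str) -> Optional[P450Record]:
    """Parse one listing line; return None if it does not fit the pattern."""
    match = RECORD_RE.match(line.strip())
    if match is None:
        return None
    return P450Record(
        linkage_group=match.group("group"),
        strand=match.group("strand"),
        start=int(match.group("start")),
        end=int(match.group("end")),
        name=match.group("name"),
        note=match.group("note").strip(),
    )


record = parse_record("LG_I (-) 174753-172748 CYP716D1 46% to 716A1")
print(record.name, record.strand, record.length, record.note)
```

Lines that do not match (for example, ones still wrapped across several lines or ones keyed by scaffold rather than linkage group) return None rather than raising, so they can be collected and inspected by hand.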
LG_I (-) 174753-172748 CYP716D1 46% to 716A1
LG_I (-) 178465-176158 CYP716D2v1 47% to 716A1
LG_I (-) 185470-183274 CYP716D3v1 47% to 716A1
LG_I (-) 183202-183113 CYP716D3-de1bv1 partial exon 1 pseudogene -de1b
LG_I (-) 1663727-1659176 CYP727B1 44% TO CYP727A1 85% to 727B2
LG_I (+) 2171265-2172908 CYP80E1 39% to 76C4
LG_I (-) 3924905-3921817 CYP78D3v1 51% to 78A5
LG_I (-) 4925909-4924104 CYP51G1
LG_I (+) 6124306-6126694 CYP78A22 68% to 78A9
LG_I (-) 8142817-8140384 CYP87A7 76% to 87A2 ortholog?
LG_I (+) 8155946-8158046 CYP80E2 41% to 76G1
LG_I (-) 8580345-8578719 CYP76F3 47% to 76C4
LG_I (-) 16937584-16936049 CYP86A18 78% to 86A1 Arab. 90% to 86A19
LG_I (+) 17454906-17459858 CYP707A9 69% to 707A2
LG_I (-) 19620275-19616562 CYP87D1P N-term and C-term only (seq gap)
LG_I (-) 19635300-19632986 CYP87D2v1 45% to 87A2
LG_I (-) 19665613-19663169 CYP87D3 97% to scaf_9597
LG_I (-) 19679729-19677282 CYP87D4 44% to 87A2
LG_I (-) 19683439-19680773 CYP87D5 50% to 87A2
LG_I (-) 19703740-19702073 CYP705B6 44% to 705A2
LG_I (+) 19811612-19812590 CYP71B44P like pseudogene 51% to 71B36
LG_I (+) 19972937-19975122 CYP75A11 80% to 75A8 87% to CYP75A12
LG_I (-) 20250132-20251678 CYP94F1 46% to 94D22
LG_I (+) 20278749-20279837 CYP87D6P 74% to LG_IX (+) 1537124
LG_I (-) 22500323-22497486 CYP724B8 55% to 724B1, 52% to 724B7
LG_I (-) 23341018-23339071 CYP82J1 50% to 82C2
LG_I (-) 23611288-23610335 CYP82J1P
LG_I (-) 23945857-23944313 CYP94B7 61% to 94B2
LG_I (-) 24001619-24000402 CYP89A25 53% to 89A5
LG_I (-) 29491328-29490406 CYP71D36Pv1 PSEUDOGENE 61% TO CYP71D26
LG_I (-) 29497891-29497622 CYP71D35Pv1 pseudogene no model exists
LG_I (-) 29516258-29514652 CYP71D34 93% to CYP71D28v1 52% to 71D10
LG_I (-) 29524379-29524047 CYP71D33P pseudogene
LG_I (-) 29530020-29529344 CYP71D32P pseudogene
LG_I (-) 29532948-29532496 CYP71D31P pseudogene
LG_I (-) 29540735-29539563 CYP71D30P pseudogene
LG_I (-) 29575643-29574039 CYP71D29 54% to 71D11 93% to 71D34
LG_I (-) 29606262-29604787 CYP71D28v1 55% to 71D11
LG_II (+) 611387-613607 CYP87D7P pseudogene 47% to 87A2
LG_II (-) 1594296-1594132 CYP83F1-se1[1] pseudogene N-term fragment
LG_II (-) 1597238-1596704 CYP83F1-se2[1] pseudogene no gene model exists
LG_II (-) 1617938-1616292 CYP83F1 53% to 83A2, 97% to 83F4, 72% to 83F2
LG_II (+) 1630135-1630206 CYP83F2-se1[1] N-term fragment
LG_II (+) 1650845-1651086 CYP83F2-se2[2] pseudogene no gene model exists
LG_II (+) 1655600-1657684 CYP83F2 48% to 83A2, 96% to 83F3
LG_II (+) 1667099-1667266 CYP83F2-se3[2] pseudogene 100% to LG_II 1702772
LG_II (+) 1671977-1672045 CYP83F2-se4[1] N-term fragment
LG_II (+) 1685541-1685612 CYP83F2-se5[1] N-term fragment
LG_II (+) 1688396-1689904 CYP83F3v3 95% to 83F3v1
LG_II (+) 1697711-1697881 CYP83F3-se1[2] pseudogene no gene model exists
LG_II (+) 1702691-1702858 CYP83F3-se2[2] pseudogene no gene model exists
LG_II (+) 1707953-1708189 CYP83F3v1-de1b detritus exon 1
LG_II (+) 1708840-1710924 CYP83F3v1 48% to 83A2, 96% to 83F2, 63% to 83F5
LG_II (+) 2131180-2134259 CYP722A1 61% to 722A1 46% to 722B1 in rice
LG_II (+) 2695301-2696854 CYP94A11 67% to 94A4, 85% to 94A9v1
LG_II (+) 2705673-2707202 CYP94A12 67% to 94A4, 83% to 94A9v1
LG_II (-) 4100255-4098382 CYP718A1 67% to 718A1 Arab. = ortholog
LG_II (+) 4123900-4125604 CYP78A20 67% to 78A7
LG_II (+) 4750944-4753397 CYP733A1 70% to rice 733A1 = ortholog
LG_II (-) 9145798-9143865 CYP81B5 50% to 81D8
LG_II (-) 9151207-9149369 CYP81B4 49% to 81D2
LG_II (-) 9154259-9152268 CYP81B3v1 47% to 81D8
LG_II (+) 9173256-9175652 CYP81C5 51% to 81K1
LG_II (+) 9525988-9529091 CYP707A14 69% to 707A4
LG_II (-) 9774605-9770969 CYP701A11 68% to 701A3
LG_II (-) 9856083-9854503 CYP74A1
LG_II (-) 10070853-10066188 CYP721A3 53% to 721A1 no correct model exists
LG_II (-) 10420336-10419995 CYP714E3P no correct model exists 69% to 714E2
LG_II (+) 11243028-11244627 CYP76T1 51% to 76C4
LG_II (+) 11302436-11304039 CYP76T2 50% to 76C4
LG_II (+) 11339047-11340633 CYP76T3 52% to 76C4
LG_II (+) 11361815-11362516 CYP76T4 partial seq
LG_II (+) 11370278-11371869 CYP76T5 52% to 76C4
LG_II (-) 13208335-13206633 CYP703A4 78% to 703A2
LG_II (+) 13511138-13512986 CYP78A24 72% to 78A9
LG_II (-) 16041237-16040849 CYP79D10P 78% to 79D8
LG_II (-) 16609905-16609795 CYP82C-se3[2] pseudogene no gene model exists
LG_II (+) 24351775-24351969 CYP721A-se1[4] no gene model exists
LG_II (+) 24360275-24360469 CYP721A-se2[4] C-term, no model exists
LG_II (+) 24440445-24442978 CYP721A2v1 100% to scaffold_1746
LG_III (+) 3591921-3593661 CYP92A21 3 aa diffs to LG_III (+) 3610917
LG_III (+) 3593032-3593661 CYP92A20P duplicate of exon 2
LG_III (+) 3610917-3612657 CYP92A20 63% to 92A9
LG_III (+) 3641419-3641829 CYP92A19P
LG_III (+) 3649066-3651041 CYP92A19 71% to 92A9
LG_III (+) 10170571-10172100 CYP89A26 48% to 89A5 no introns
LG_III (+) 11000576-11004081 CYP706D1
LG_III (+) 11353007-11354677 CYP76F4 47% to 76C4
LG_III (-) 11661882-11664294 CYP87A8 76% to 87A2
LG_III (-) 12109591-12107981 CYP86A21 74% to 86A7 Arab. 76% to 86A20
LG_III (-) 13439255-13437042 CYP78A23 68% to 78A9
LG_III (+) 14476000-14477787 CYP51G5
LG_III (+) 15447061-15449577 CYP78D2 50% to 78A5
LG_III (+) 17762390-17766956 CYP727B2 43% TO CYP727A1 85% to 727B1
LG_IV (+) 521696-523425 CYP728D7v1 74% to 728D4
LG_IV (+) 526815-528578 CYP728D8 54% to 728B3
LG_IV (-) 589294-587777 CYP77B3 68% to 77B1 no introns 71% to CYP77B4
LG_IV (+) 605695-607218 CYP77B4 66% to 77B1 no introns 71% to CYP77B3
LG_IV (+) 3871814-3873941 CYP79D8 56% to 79D1, 73% to 79D5
LG_IV (+) 5434809-5437753 CYP735A6 69% to 735A1, 91% to 735A5
LG_IV (+) 7961227-7961940 CYP81S10P
LG_IV (+) 8826205-8827922 CYP82L1 55% to 82G1
LG_IV (-) 10881282-10879750 CYP710A12 68% to 710A1
LG_IV (+) 13600264-13602937 CYP85A1 68% to CYP85A1 Arab., 90% to 85A3
LG_IV (+) 13604641-13606601 CYP85A1P duplicate of exons 2-8 100% identical
LG_IV (+) 13987032-13988462 CYP74C6v1 79% to CYP74C8
LG_IV (+) 13991961-13993406 CYP74C7 78% to 74C6v1
LG_IV (+) 14183519-14183770 CYP87D-se2[6:7] exons 6,7 identical to scaff_12583
LG_IV (+) 15006407-15008324 CYP84A13P pseudogene 65% to 84A1
LG_IV (-) 16503207-16503109 CYP87D11 44% to 87A2
LG_V (+) 371877-373430 CYP96F1 46% to 96A10
LG_V (+) 388448-390004 CYP96F2 45% to 96A10
LG_V (-) 490310-489234 CYP87D13P pseudogene partial
LG_V (+) 1583967-1585490 CYP89A18 63% to 89A5
LG_V (-) 1989840-1988143 CYP78A18 70% to 78A7
LG_V (+) 2585685-2588232 CYP86B4 72% to 86B1 92% to 86B5
LG_V (-) 2790530-2789037 CYP96G1