Vitis vinifera cytochrome P450s
This file includes 222 sequences found in GenPept by searching for Vitis[orgn] AND P450 on Sept. 20, 2007. These accessions start with CAN.
Note: on Oct. 4, the same search found 642 accessions. These include 416 sequences from the other grape genome project, with accessions starting with CAO. Click here for a link to those 416 sequences.
These automated assemblies have not been checked against known P450s for errors in assembly, gene fusions, etc.
262 accessions from the grape genome project in the WGS section have been mined for P450s, and these have been assembled and sorted into family groups (see the bottom of the file for a complete list of the CAAP accessions).
591 sequences are present below, but some are duplicates. Gene sequences are being clustered into bins of identical or presumed identical genes, indicated with a # followed by a number. Pseudogenes are being labeled in a similar way, with an @ sign followed by a number (in progress).
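The bin labels can be picked out mechanically. The following is a minimal sketch, not part of the original file: it assumes a label sits on its own line and applies only to the first ">" header that follows it, and the function name label_bins is made up for illustration.

```python
import re

# A bin label is a line holding only "#" or "@" followed by a number.
BIN_LABEL = re.compile(r"^([#@])(\d+)$")

def label_bins(lines):
    """Pair each bin label with the name on the first header after it.

    Assumption: a label applies only to the next ">" header line.
    """
    bins = {}
    pending = None
    for raw in lines:
        line = raw.strip()
        match = BIN_LABEL.match(line)
        if match:
            kind = "gene" if match.group(1) == "#" else "pseudogene"
            pending = (kind, int(match.group(2)))
        elif line.startswith(">") and pending is not None:
            bins[pending] = line[1:].split()[0]
            pending = None
    return bins

# Header lines taken from this file; sequence lines are omitted.
sample = [
    "#1",
    ">CYP51G6 CAAP02000072.1",
    "#2",
    ">CYP51G CAAP02000381.1 = AM475390.2",
    "@1",
    ">CYP51G7P pseudogene CAAP02006913.1",
    ">CYP71AH1 old 71A11 tobacco",
]
bins = label_bins(sample)
for (kind, number), name in sorted(bins.items()):
    print(kind, number, name)
```

Entries that carry no label of their own, such as CYP71AH1 in the sample, are left out of the result rather than being assigned to the preceding bin.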
Oct. 4, 2007, revised Nov. 14, 2007
For an older file of P450s from grape, see http://drnelson.utmem.edu/vitis.old.htm
P450 sequences in CYP family order
Table of 49 P450 families present
CYP83-like sequences are merged here with CYP71AT.
(missing: 91 (merged with CYP81), 95 (part of the CYP72 family), 99 (grass specific), 702 (Brassicales only), 705 (part of CYP712), 708 (Brassicales only), 713 (merged with CYP71A), 717 (merged with CYP81), 719 (Ranunculales), 723 (grass specific), 725 (Taxus only, overlaps 716), 726 (part of CYP71, Euphorbia), 729, 730 (protist contaminant), 731 (protist contaminant), 732 (protist contaminant))
The only missing families that appear lost in Vitis are CYP729 and CYP749.
CYP51   2 genes, 1 pseudogene
CYP71   24 genes, 28 pseudogenes
CYP72   22 genes, 21 pseudogenes, 43 sequences
CYP73   3 genes, 0 pseudogenes
CYP74   7 genes, 0 pseudogenes
CYP75   11 genes, 8 pseudogenes, 52 sequences
CYP76   24 genes, 23 pseudogenes, 47 sequences
CYP77   2 genes, 0 pseudogenes
CYP78   7 genes, 0 pseudogenes
CYP79   9 genes, 13 pseudogenes, 4 alleles/duplicates, 26 sequences
CYP80   6 genes, 0 pseudogenes
CYP81   21 genes, 14 pseudogenes, 35 sequences
CYP82   34 genes, 37 pseudogenes, 18 alleles/duplicates, 89 sequences
CYP84   3 genes, 0 pseudogenes
CYP85   2 genes, 1 pseudogene
CYP86   6 genes, 1 pseudogene
CYP87   7 genes, 0 pseudogenes
CYP88   2 genes, 0 pseudogenes
CYP89   14 genes, 11 pseudogenes, 25 sequences
CYP90   4 genes, 1 pseudogene
CYP92   6 genes, 1 pseudogene
CYP93   4 genes, 0 pseudogenes
CYP94   9 genes, 0 pseudogenes
CYP96   5 genes, 2 pseudogenes
CYP97   3 genes, 0 pseudogenes
CYP98   1 gene, 0 pseudogenes
CYP701  1 gene, 0 pseudogenes
CYP703  1 gene, 0 pseudogenes
CYP704  6 genes, 0 pseudogenes
CYP706  9 genes, 7 pseudogenes
CYP707  5 genes, 0 pseudogenes
CYP709  1 gene, 0 pseudogenes
CYP710  1 gene, 1 pseudogene
CYP711  1 gene, 1 pseudogene
CYP712  2 genes, 2 pseudogenes
CYP714  6 genes, 11 pseudogenes, 1 allele, 18 sequences
CYP715  1 gene, 0 pseudogenes
CYP716  15 genes, 7 pseudogenes, 22 sequences
CYP718  0 genes, 1 pseudogene
CYP720  1 gene, 0 pseudogenes
CYP721  5 genes, 3 pseudogenes
CYP722  1 gene, 0 pseudogenes
CYP724  2 genes, 0 pseudogenes
CYP727  1 gene, 0 pseudogenes
CYP728  6 genes, 2 pseudogenes
CYP733  1 gene, 0 pseudogenes
CYP734  2 genes, 0 pseudogenes
CYP735  1 gene, 0 pseudogenes
CYP736  8 genes, 4 pseudogenes
Totals
315 + 201 + 23 = 539
315 named genes, 201 named pseudogenes, 23 alleles/duplicates = 539 named sequences.
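The totals can be re-derived from the per-family counts as a quick check. This is a small sketch only: the gene and pseudogene numbers are transcribed by hand from the family table, the allele/duplicate counts are those listed for CYP79, CYP82 and CYP714, and the variable names are illustrative.

```python
# "family genes pseudogenes" triples transcribed from the family table.
TABLE = """
51 2 1, 71 24 28, 72 22 21, 73 3 0, 74 7 0, 75 11 8, 76 24 23,
77 2 0, 78 7 0, 79 9 13, 80 6 0, 81 21 14, 82 34 37, 84 3 0,
85 2 1, 86 6 1, 87 7 0, 88 2 0, 89 14 11, 90 4 1, 92 6 1,
93 4 0, 94 9 0, 96 5 2, 97 3 0, 98 1 0, 701 1 0, 703 1 0,
704 6 0, 706 9 7, 707 5 0, 709 1 0, 710 1 1, 711 1 1, 712 2 2,
714 6 11, 715 1 0, 716 15 7, 718 0 1, 720 1 0, 721 5 3, 722 1 0,
724 2 0, 727 1 0, 728 6 2, 733 1 0, 734 2 0, 735 1 0, 736 8 4
"""

# Alleles/duplicates are listed for three families only.
ALLELES = {79: 4, 82: 18, 714: 1}

# Turn the text into (family, genes, pseudogenes) integer tuples.
rows = [
    tuple(int(field) for field in item.split())
    for item in TABLE.replace("\n", ",").split(",")
    if item.strip()
]

families = len(rows)
genes = sum(row[1] for row in rows)
pseudogenes = sum(row[2] for row in rows)
alleles = sum(ALLELES.values())
total = genes + pseudogenes + alleles

print(families, genes, pseudogenes, alleles, total)  # prints: 49 315 201 23 539
```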
#1
>CYP51G6 CAAP02000072.1 81% to 51G1 Arab.
190429 MDVDNKFFNVALLIVATVVVAKLISALLIPKSRKRLPPTVKAFPVIGGLLRFLKGPVV 190256
190255
MLREEYPKLGSVFTLNLLNKNITFFIGPEVSAHFFKAPEADLSQQEVYQFNVPTFGPGVV 190076
190075 FDVDYSVRQEQFRFFTESLRVTKLKGYVDQMVTETE
(0) 189968
188248
DYFSKWGDSGEVDLKYELEHLIILTASRCLLGQEVRDKLFADVSALFHDLDNGMLPISV 188072
188071
IFPYLPIPAHRRRDQARTKLAHIFANIIASRRETGKSENDMLQCFMDSKYKDGRQTTEAE 187892
187891
VTGLLIAALFAGQHTSSITSTWTGAYLFRHKEFLSAVLDEQKNLMKKHGNKVDHDILSEM 187712
187711
DVLYRCIKEALRLHPPLIMLLRSSHSDFSVTTKDGKEYDIPKGHIVATSPAFANRLPHIY 187532
187531 KDPERYDPDRFAVGREEDKVAGAFSYISFGGGRHGCLGEPFAYLQIKAIWSHLLRNFEFE 187352
187351 LISPFPEIDWNAMVVGVKGKVMVRYKRRVLPVD* 187250
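Entries like the one above list translated segments flanked by nucleotide coordinates. The sketch below shows one way to read them and to check a segment's length against its coordinates. It rests on assumptions not stated in this file: that a parenthesised digit such as (0) marks an intron boundary and its phase, that a stop codon (*) counts as one residue, and that a segment spans exactly three nucleotides per residue (which need not hold next to a boundary of non-zero phase). The function name parse_segments is hypothetical.

```python
import re

RESIDUES = re.compile(r"^[A-Z*]+$")   # amino acid letters and stop (*)
PHASE = re.compile(r"^\((\d)\)$")     # e.g. "(0)", assumed to be an intron phase

def parse_segments(entry_text):
    """Split the body of one entry into coordinate-flanked segments."""
    segments = []
    start, residues, phase = None, "", None
    for token in entry_text.split():
        phase_match = PHASE.match(token)
        if token.isdigit():
            if not residues:
                start = int(token)      # coordinate before the residues
                continue
            end = int(token)            # coordinate after the residues
            span = None if start is None else abs(end - start) + 1
            segments.append({
                "start": start,
                "end": end,
                "strand": None if start is None else ("-" if start > end else "+"),
                "phase": phase,
                "residues": residues,
                # True when the span equals three nucleotides per residue.
                "consistent": None if span is None else span == 3 * len(residues),
            })
            start, residues, phase = None, "", None
        elif phase_match:
            phase = int(phase_match.group(1))
        elif RESIDUES.match(token):
            residues += token
    return segments

# Two segments copied from the CYP51G6 entry above.
sample = """
190075 FDVDYSVRQEQFRFFTESLRVTKLKGYVDQMVTETE
(0) 189968
187351 LISPFPEIDWNAMVVGVKGKVMVRYKRRVLPVD* 187250
"""
segments = parse_segments(sample)
for seg in segments:
    print(seg["start"], seg["end"], seg["strand"], seg["phase"],
          len(seg["residues"]), seg["consistent"])
```

A segment reported as not consistent is only a prompt to recheck the hand-entered coordinates, not proof of an error.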
#2
>CYP51G CAAP02000381.1 = AM475390.2, 81% to 51G1 Arab. 90% to CAAP02000072.1
97293
MDVDNKFFNAAFLLVATLVVAKLISALIIPRSKKRLPPTIKAFPLIGGLIRFLKGP 97460
97461
VVMLREEYPKLGSVFTLKLLNKNISFFIGPDVSAHFFKAPESDLSQQEVYRFNVPIFGPG 97640
97641
VVFDVDYSVRQEQFRFFTEALRVTKLKGYVDQMVMEAE (0) 97754
104754
DYFSKWGDCGEVDLKYELEHLIILTASRCLLGQEIRNKLFADVSALFHDLDNGMLPISV 104930
104931 IFPYLPIPAHRRRDQARKKLAEIFANIIASRKETGKSENDMLQCFIASKYKDGRPTTESE 105110
105111
VTGLLIAALFAGQHTSSITSTWTGAYLLRHKEYLSAVQDEQRSLMKKYGSKVDHDILSEM 105290
105291
DVLYRCIKEALRLHPPLIMLLRSSHTDFSVTTRDGKEYDIPKGHIVATSPAFANRLPHIY 105470
105471
KDPDRYDPDRFAVGREEDKAAGAFSYISFGGGRHGCLGEPFAYLQIKAIWSHLLRNFELE 105650
105651 LISPFPEVDWNAMVVGVKGKVMVRYKRRELPVN* 105752
>CYP51G1 AM475390.2 Vitis vinifera (Pinot noir grape) = CAAP02000381.1
9521
MDVXXKFFNAXFLLVATLLVAKLISALIIPRSKKRLPPTIKAFPLIGGLIRFLKGPVVML 9342
9341 REEYPKLGSVFTLKLLNKNISFFVGPDVSAHFFKAPESDLSQQEVYRFNVPIFGPGVVFD 9162
9161 VDYSVRQEQFRFFTEALRVTKLKGYVDQMVMEAE (0) 9060
3862
DYFSKWGDCGEVDLKYELEHLIILTASRCLLGQEIRNKLFADVSALFHDLDNGMLPISV 3686
3685
IFPYLPIPAHRRRDQARKKLAEIFANIIASRKETGKSENDMLQCFIDSKYKDGRPTTESE 3506
3505 VTGLLIAALFAGQHTSSITSTWTGAYLLRHKEYLSAVQDEQRSLMKKYGSKVDHDILSEM 3326
3325
DVLYRCIKEALRLHPPLIMLLRSSHTDFSVTTRDGKEYDIPKGHIVATSPAFANRLPHIY 3146
3145
KDPDRYDPDRFAVGREEDKAAGAFSYISFGGGRHGCLGEPFAYLQIKAIWSHLLRNFELE 2966
2965 LISPFPEVDWNAMVVGVKGKVMVRYKRREL 2876
@1
>CYP51G7P pseudogene CAAP02006913.1
792
SHIFIGGGRNRCLGQHFAYLQVKAMWSHLL*NFEL*PISPFSKINWNAMVVGV 950
>CYP71AH1 old 71A11 tobacco
MKFLLVVASLFLFVFLILSATKRKSKAKKLPPGPRKLPVIGNLLQIGKLPHRSLQKLSNEYGDFIFLQLGSVPTVV
VFSAGIAREIFRTQDLVFSGRPALYAGKRFSYNCCNVSFAPYGNYWREARKILVLELLSTKRVQSFEAIRDEEVSS
LVQIICSSLSSPVNISTLALSLANNVVCRVAFGKGSDEGGNDYGERKFHEILFETQELLGEFNVADYFPGMAWINK
INGLDERLEKNFRELDKFYDKIIEDHLNSSSWMKQRDDEDVIDVLLRIQKDPNQEIPLKDDHIKGLLADIFIAGTD
TSSTTIEWAMSELIKNPRVLRKAQEEVREVAKGKQKVQESDLCKLEYLKLVIKETLRLHPPAPLLVPRVTTASCKI
MEYEIPADTRVLINSTAIGTDPKYWENPLTFLPERFLDKEIDYRGKNFELLPFGAGRRGCPGINFSIPLVELALAN
LLFHYNWSLPEGMLPKDVDMEEALGITMHKKSPLCLVASHYNLL
>CYP71AH2 tobacco
MNFLVVLASLFLFVFLMRISKAKKLPPGPRKLPIIGNLHQIGKL
PHRSLQKLSNEYGDFIFLQLGSVPTVVVSSADIAREIFRTHDLVFSGRPALYAARKLS
YNCYNVSFAPYGNYWREARKILVLELLSTKRVQSFEAIRDEEVSSLVQIICSSLSSPV
NISTLALSLANNVVCRVAFGKGSAEGGNDYEDRKFNEILYETQELLGEFNVADYFPRM
AWINKINGFDERLENNFRELDKFYDKVIEDHLNSCSWMKQRDDEDVIDVLLRIQKDPS
QEIPLKDDHIKGLLADIFIAGTDTSSTTIEWAMSELIKNPRVLRKAQEEVREVSKGKQ
KVQESDLCKLDYLKLVIKETFRLHPPVPLLVPRVTTASCKIMEYEIPVNTRVFINATA
NGTNPKYWENPLTFLPERFLDKEIDYRGKNFELLPFGAGRRGCPGINFSIPLVELALA
NLLFHYNWSLPEGMLAKDVDMEEALGITMHKKSPLCLVASHYTC
>71A9/CYP71AH3 Glycine max
MISFTVFVFLTLLFTLSLVKQLRKPTAEKRRLLPPGPRKLPFIG
NLHQLGTLPHQSLQYLSNKHGPLMFLQLGSIPTLVVSSAEMAREIFKNHDSVFSGRPS
LYAANRLGYGSTVSFAPYGEYWREMRKIMILELLSPKRVQSFEAVRFEEVKLLLQTIA
LSHGPVNLSELTLSLTNNIVCRIALGKRNRSGADDANKVSEMLKETQAMLGGFFPVDF
FPRLGWLNKFSGLENRLEKIFREMDNFYDQVIKEHIADNSSERSGAEHEDVVDVLLRV
QKDPNQAIAITDDQIKGVLVDIFVAGTDTASATIIWIMSELIRNPKAMKRAQEEVRDL
VTGKEMVEEIDLSKLLYIKSVVKEVLRLHPPAPLLVPREITENCTIKGFEIPAKTRVL
VNAKSIAMDPCCWENPNEFLPERFLVSPIDFKGQHFEMLPFGVGRRGCPGVNFAMPVV
ELALANLLFRFDWELPLGLGIQDLDMEEAIGITIHKKAHLWLKATPFCE
#9
>CYP71AH4 CAAP02005003.1a, 53% to 71B.d, 64% to 71A9, 62% to 71AH2 Nicotiana tabacum DQ350356.1
Note: 71A9 is 58% to 71AH2, so it is probably misnamed and should be CYP71AH3.
17504
MGISSFQASHSMVSQSLLLLLLVIFSALLLFLLSTKQKRKSVASRRLPPGPKKLPLIGNLHQLGSLPH 17301
17300 VGLQRLSNEYGPLMYLKLGSVPTLVVSSADMAREIFREHDLVFSSRPAPYAGKKLSYGCN 17121
17120
DVVFAPYGEYWREVRKIVILELLSEKRVQSFQELREEEVTLMLDVITHSSGPVYLSELT 16944
16943
FFLSNNVICRVAFGKKFDGGGDDGTGRFPDILQETQNLLGGFCIADFFPWMGWFNKLNG 16767
16766
LDARLEKNFLELDKIYDKVIEEHLDPERPEPEHEDLVDVLIRVQKDPKRAVDLSIEKIKGVLLT (0)
16575
16475
DMFIAGTDTSSASLVWTMAELIRNPSVMRKAQEEVRSAVRGKYQVEESDLSQLIYLKLVVKE 16310
16309
SLRLHPPAPLLVPRKTNEDCTIRGYEVPANTQVFVNGKSIATDPNYWENPNEFQPE 16142
16141
RFLDSAIDFRGQNFELLPFGAGRRGCPAVNFAVLLIELALANLLHRFDWELADGMRREDL 15962
15961 DMEEAIGITVHKKNPLYLLATPAN* 15887
@7
>CYP71AH5P CAAP02005003.1b, pseudogene 70% to CAAP02005003.1a
26375 NVAFTSFGEY*KEVRNIVILEVLSAKRVHSFQ
25611
HGWMQAIKLMFDVIAHSSGPVNSIELRVFLSNNVIC*VAFGTKFDGGGDNGTRRFPEIL 25435
25434
QETQNLLGGFCIADFFPWMGWFDKLNAWLGCQVDKNFMELNRIYDKGIEMHLDPERPEPE 25255
25254 HEDLVDVLI*VQKDLRQVVSLSNEKIKGVLT (0)
25074
VHCSD*YPFSLAGMDNAEMIRNRSVMRKAQEKVRSTVRGKYQVEESDLSQLIYLKLVVKE 24895
24894 SLRLHLPAPSLVPRKTTKNCTI 24829
24815
FPQIHVFVNGNLISIDSNYWENPNEFQPERFVDSSIDFRGQSFEFLPFGASMRGCPGANF 24636
24635 AVLLIEVALTNILHRLTGNFLMG 24567
>CYP71AH6 Gossypium raimondii 58% to CAAP02005003.1a, 53% to 71A9/71AH3
CO072855.1, CO095493.1, CO072856.1
MDFQFILTLSFIAFTLMVFKYKARTRRLPPGPWKLPIIGNLHQLGDSSHKSIQRLSQ
QYGPMMFLQLGAVPTLVISSADAAMAIFKGPGGGYDLAFSGRPTNLYVAKKLSYEYNGIT
FAPYGELWREMRKIAVAELLSSKRVQSFRTIREEEVAAMLNHIDIASSSSAPVNLKKLSL
LLANHVVCRVTFGKKYGGGGDGGTNRFDRVLHEVQHLVGEFVVSDYFPWMWWVNKLNGMETRVEKNFEELDKLY
DEVIADHVAPTRTKANHEDIVDVLLRLQKDARQLITLNNQQIKGVLTDMFIAGTDTTAS
SLVWTFTELIRNPPSMEKVKYEVRKVGNGRDKIEESDIPKLHCLHSVIKETLRLHPPAPL
LVPRETTEDCVVGDYEIPAKTRVIINAKSIGTDPKYWENPHDFQPDRFMKSSVDFKGQHL
EFLPFGVGRRGCPGMSFAIMLLQLMVANFLYRFDWELPEGMSVEDVDMEEELGITVFKKT
PLCLVPIRVV*
#10
>CYP71AP5 CAAP02001743.1a, 43% to 71B2, 51% to 71B.c, 53% to 71A1, 78% to 71AP4
15554 MALLQWLKEGFLPSFLFAGIILVAVLKFLQKGMLRKRKFNLPPSPRKLPIIGNLHQLGNMPHIS 15363
15362
LHRLAQKFGPIIFLQLGEVPTVVVSSARVAKEVMKTHDLALSSRPQIFSAKHLFYDCTDI 15183
15182
VFSPYSAYWRHLRKICILELLSAKRVQSFSFVREEEVARMVHRIAESYPCPTNLTKILGL 15003
15002
YANDVLCRVAFGRDFSAGGEYDRHGFQTMLEEYQVLLGGFSVGDFFPSMEFIHSLTGMK 14826
14825
SRLQNTFRRFDHFFDEVVKEHLDPERKKEEHKDLVDVLLHVKEEGATEMPLTMDNVKAIIL (0) 14646
14513 DMFAAGTDTTFITLDW 14466
14465
GMTELIMNPKVMERAQAEVRSIVGERRVVTESDLPQLHYMKAVIKEIFRLHPPAPVLVPR 14286
14285
ESMEDVTIDGYNIPAKTRFFVNAWAIGRDPESWRNPESFEPQRFMGSTIDFKGQDF 14118
14117
ELIPFGAGRRSCPAITFGAATVELALAQLLHSFDWELPPGIQAQDLDMTEVFGITMHR 13944
13943 IANLIVLAKPRFP* 13902
#7
>CYP71AS3.a CAAP02000057.1 Vitis vinifera 6 genes in a cluster 62% to CYP71AS1
177875
MELYSPSMWLHLLLLLLPLMFLIKRKIELTGQKKPLPPGPTKLPIIG 177735
177734
NLHQLGALPHYSLWQLSKKYGSIMLLQLGVPTVVVSSAEAAREFLKTHDIDCCSRPPLV 177558
177557
GLGKFSYNHRDISFAPYGDYWREVRKICVLEVFSTKRVQSFQFIREEEVALLIDSIVQSS 177378
177377
SSGSPIDLTERLMSLTANIICRIAFGKSFQASEFGDGRFQEVVHEAMALLGGLTAADFFP 177198
177197 YVGRIVDRLTGLHGRLERSFHEMDGFYQQVIEDHLNPGRVKEEHEDIIDVLLRIEREQSE 177018
177017 SSALQFTKDNAKAIVM (0) 176970
176205
DLFLAGVDTGAITVSWAMTELARNPRIMKKAQAEVRNSIGNKGKVTEGDVDQ 176050
176049
LHYLKMVVKETLRLHPPAPLLLPRETMSHFEINGYHFYPKTQVHVNVWAIGRDPNLWKNP 175870
175869
EEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNG 175690
175689 MKETDISMEEAAGLTVRKKFALNLVPILHHC* 175594
@5
>CYP71AS3-de1b CAAP02000057.1 54% to CYP71B.a
178567
RLTRLYGWLERRTSYELDGFY*QVIGLHDLKDVKEDFIDVLLQTERD 178427
#6
>CYP71AS4.b CAAP02000057.1 Vitis vinifera 6 genes in a cluster
170360
MALYSPSMWLHLLLLLLPLMYLIKRRIELKGQKKPLPPGPTKLPIIG 170220
170219
NLHQLGTLPHYSWWQLSKKYGPIILLQLGVPTVVVSSAEAAREFLKTHDIDCCSRPPLV 170043
170042
GLGKFSYNHQDIGFAPYGDYWREVRKICVHEVFSTKRLQSFQFIREEEVALLIDSIAESS 169863
169862
SSGSPIDLTERLMSLTANIICRIAFGKSFQVSEFGDGRFQEVVREAMALLGGFTAADFFP 169683
169682
YVGRIVDRLTGLHGRLERSFLEMDGFYQRVIEDHLNPGRVKEEHEDIIDVLLKIQRERSE 169503
169502 SGAVQFTKDSAKAILM (0) 169455
169009
DLFLAGVDTGAITLTWAMTELARNPRIMKKAQVEVRNSIGNKGKVTEGDVDQ 168854
168853
LHYLKMVVKETLRLHPPVPLLLPRETMSHFEINGYHIYPKTQVQVNVWAIGRDPNLWKNP 168674
168673
EEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMVIATVELALANLLYRFNWNLPNG 168494
168493 MREADINMEEAAGLTVRKKFALNLVPILHHC* 168398
#8
>CYP71AS4v2 CAN60733.1| 73% to CAN83446.1 62% to 71AS1 55% to 71B34
96% to 71B.d and 71B.e, possible allele of 71B.b, since CAN83446.1 = 71B.e
MELYSPSIWLCLLLLLLPLMYLIKRRIELKGQKKPLPPGPTKLPIIGNLHQLGALPHYSWWQLSKKYGPI
MLLQLGVPTVVVSSVEAAREFLKTHDIDCCSRPPLVGLGKFSYNHRDIGFAPYGDYWREVRKICVLEVFS
TKRVQSFQFIREEEVTLLIDSIAQSSSSGSPIDLTERLMSLTANIICRIAFGKSFQVSEFGDGRFQEVVH
EAMALLGGFTAADFFPYVGRIVDRLTGHHGRLERSFLEMDGFYERVIEDHLNPGRVKEEHEDIIDVLLKI
ERERSESGAVQFTKDSAKAILMDLFLAGVDTGAITLTWAMTELARNPRIMKKAQVEVRNSIGNKGKVTEG
DVDQLHYLKMVVKETLRLHPPAPLLVPRETMSHFEINGYHIYPKTQVXVNVWAIGRDPNLWKNPEEFLPE
RFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNGMREADINMEEAAGLTV
RKKFALNLVPILHHC
#5
>CYP71AS5.c CAAP02000057.1 Vitis vinifera 6 genes in a cluster
157794
MALYSPSIWLHLLLLLLPLMFLIKRKIELKGQKKPLPPGPTKLPIIG 157654
157653 NLHQLGALPHYSLWQLSKKYGSIMLLQLGVPTVVVSSAEAAREFLKTHDIDCCSRPPLV 157477
157476
GLGKFSYNHRDISFAPYGDYWREVRKICVLEVFSTKRVQSFQFIREEEVALLIDSIVQSS 157297
157296
SSGSPIDLTERLMSLTANIICRIAFGKSFQASEFGDGRFQEVVHEAMALLGGLTASDFFP 157117
157116 YVGRIVDRLTGLHGRLERSFHEMDGFYQQVIEDHLNPGRVKEEHEDIIDVLLRIEREQSE 156937
156936 SSALQFTKDNAKAILM (0) 156889
156035
DLFLAGVDTGAITVAWAMTELARNPGIMKKAQAEVRSSIGNKGKVTESDVDQ 155880
155879
LHYLKVVVKETLRLHPPAPLLLPRETMSHFEINGYHIYPKTQVHVNVWAIGRDPNLWKNP 155700
155699 EEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNG 155520
155519 IREADISMEEAAGLTVRKKFALNLVPILHHC* 155424
@4
>CYP71AS5-de1b CAAP02000057.1 65% to CYP71B.c
159495 VKEEHENFIDVLLQTERDRT 159436
#4
>CYP71AS6v1.d CAAP02000057.1 Vitis vinifera 6 genes in a cluster
152305
MALYSPSIWLHLLLLLLPLMFLIKRKIELKGQKKPLPPGPTKLPIIGNLHQLGALPHYSL 152126
152125 WQLSKKYGSIMLLQLGVPT 152069
152068
VVVSSAEAAREFLKTHDIDCCSRPPLVGPGKFSYNHRDIGFAPYGDYWREVRKICVLEVF 151889
151888
STKRVQSFQFIREEEVTLLIDSIAQSSSSGSPIDLTERLMSLTANIICRIAFGKSFQVSE 151709
151708
FGDGRFQEVVHEAVALLGGFTAADFFPYVGRIVDRLTGLHGRLERSFLEMDGFYERVIED 151529
151528
HLNPGRVKEEHEDIIDVLLKIERERSESGAVQFTKDSAKAIIM (0) 151400
150931
DLFLAGVDTGAITLTWAMTELARNPRIMKKAQVEVRSSIGKKGKVTKGDVDQLHYLKMVV 150752
150751 KETLRLHPPVPLLVPRETMSHFEINGYHIYPKTQVHVNVWAIGRDPNLWKNPEEFLPERF 150572
150571
MDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNGMREADISM 150392
150391 EEAAGLAVRKKFALNLVPILHHC* 150320
>CYP71AS6v2 gi|147855782|emb|CAN83446.1a 2 genes 55% to CYP71B
97% to 71B.d, missing some seq after LPII
This may be an allele of 71B.d since it is upstream of 71B.e
MALYSPSXWLHLLLLLLPLMYLIKRXIELKGQKKPLPPGPTKLPII
VSSAEAAREFLKTHDIDCCSRPPL
VGXGKFSYNHRDIGFAPYGDYWREVRKICVLEVFSTKRVQSFQFIREEEVTLLIDSIAQSSSSGSPIDLT
ERLMSLTANIICRIAFGKSFQASEFGDGRFQEVVHEAMALLGGFTAADFFPYVGRIVDRLTGLHGRLERS
FLEMDGFYQRVIEDHLNPGRVKEEHEDIIDVLLKIERERSESGAVQFTKDSAKAILMDLFLAGVDTGAIT
LTWAMTELARNPRIMKKAQVEVRNSIGNKGKVTE
GDVDQLHYLKMVVKETLRLHPPAPLLVPRETMSHFE
INGYHIYPKTQVHVNVWAIGRDPNLWKNPEEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIAT
VELALANLLYRFNWNLPNGMREADINMEEAAG
#3
>CYP71AS7v1.e CAAP02000057.1 Vitis vinifera 6 genes in a cluster, 58% to CYP71AS1
135067
MAPYSPDLWLPLVLLFLSLLFLLKKILELKEQKGPPGPPKLPIIG 134933
134932
NLHQLGALIHQSLWQLSKKYGPVMLLHLGFVPTLVVSSAEAAKKVLKDHDISCCSRPPLI 134753
134752 SIGRLSYNYLDISFAPYGPYWREIRKICVLQLFSTNRVQSFQVIREAEVALLIDSLAQSS 134573
134572
SSASPVDLTDKIMSLTANMICRIAFGRSFEGSEFGKGRFQEVVHEATAMMSSFFAADFFP 134393
134392
YVGRIVDRLTGIHERLEKSFHELDCFYQQVIEEHLNPGRMKEEHEDIIDVLLNIEKEQDE 134213
134212 SSAFKLTKDHVKAILM (0) 134165
134087 DLFLAGVDTGAITVVWAMTELARKPGVRKKVQDEVRSHIRERGKVRESDIEQ 133932
133931
FHYLKMVVKETLRLHPPVPLLLPKETMSTIEISGYQIYPKTQVYVNVWAIGRDPNLWNNP 133752
133751
EEFFPERFIDNSVDFKGQHFEFLPFGAGRRVCPAMNMAIAMVELTLANLLYHFNWKLPHG 133572
133571 MKEGDINMEEAPGLSVHKKIALSLVPIKYP* 133479
>CYP71AS7v2 CAN83446.1b 2 genes
Contains some intron seq. This seq is orthologous to CYP71AS7v1
KKILELKEQKGPPGPPKLPIIGNLHQLGALIHQSLWQLSKKH
GPVMLLHLGFVPTLVVSSAEAAKKVLKDHDISCCSRPPLISIGRLSYNYLDISFAPYGPYWREIRKICVL
QLFSTNRVQSFQVIREAEVALLIDSLAQSSSSASPVDLTDKIMSLTANMICRIAFGRSFEGSEFGKGRFQ
EVVHEATAMMSSFFAADFFPYVGRIVDRLTGIHERLEKSFHELDCFYQQVIEEHLNPGRMKEEHEDIIDV
LLNIEKEQDESSAFKLTKDHVKAILMAYFFEQDLFLAGVDTGAITVVWAMTELARKPGVRKK
Missing some seq here
EKFRESDI
EQFHYLKMVVKETLRLHPPVPLLLPKETMSTIEISGYQIYPKTQVYVNVWAIGRDPNLWNNPEEFFPERF
IDNSVDFKGQHFEFLPFGAGRRVCPAMNMAIAMVELTLANLLYHFNWKLPHGMKEGDINMEEAPGLSVHK
KIALSLVPIKYP
@3
>CYP71AS8P.f CAAP02000057.1 Vitis vinifera 6 genes in a cluster
pseudogene 74% to .c, missing first exon
126787 NLLLAGVNTSASTVVWAMAELARNPIVMKKAQAEVRSVIGN 126660
126659
KGKVTESDLDQLLYFKLVVKETFRLHPPSPLLLPRETMSHFQMNGYHIHPKTRVHVNV*A 126480
126479
IGRDPNVWKNPKEFFPESFIDNSIDFKGQHFELLPFGAGRRVCPAINMGIAMLELTFANL 126300
126299
LYHFNWKLPHGMKEEDINMEEGAGITSPKKFALILRPTQYP* 126174
@2
>CYP71AS9P.fg CAAP02000057.1 pseudogene 60% to CYP71B.d
122651
FLAGAKQECPTMV*EMAELARNPRTMKKTQAEVRSCAGKQGKVLGT
122506 DLDQLNYLKMMTMKEMLRLYPSV
TILPTETMQHFNIN
VYPKTQFLQLDVLAIGKDP 122327
122326 NIWEN
122309 PEEFSLERF 122283
@6
>CYP71AS10P CAAP02000950.1 pseudogene (+) strand, 49% to CYP71AS5.c CAAP02000057.1
8049
VIKKATVVLASFSREDFFQFGGWIIDKFIGVHA*REKSFHIFDQFYQKVIDDHLDLNRP 8225
8226
KPEHEDIVDVLLGL*KDQTNV 8288
8601
NLFLGII*ATTITIVWALTELAKNPRVMKVAQAEIKSCLGYKLMVEESDLDRFQYLKIVF 8780
8781
K 8783
8780
QTLRLHPPLVMLTPWETVAHCKIGGYDVYPKTRIHINVWVIGKDPRVWDNLEEFNPERF 8956
8957
MNSDIDFRGQHFALVPLGAGRRLCLGMNIATTIMELTLANLLYSFD*RLPSGMKMEEIST 9136
9137
EEGFGSPGHKNEPLYLIP 9190
>CYP71AS11P AM481172 missing part of exon 1 CAN66328.1
this part 66% to CYP71AS7v1
11384
MATYSPFLWLPLLLLLPSLFFLIKRTVDQ*RVQREQLPPGLPIIGNLHQLGQLPHQS 11214
11213
LWQLFHKYG 11187
11185
TVIVLHLGFVPTLVVSSAEAARVVLKTRD 11099
(gap) this part 73% to 71AS7v1
10117
NLLLAGVNTSASTVV*AMAELARNPRVMKKAQAEVRSVMGNKGKVTESDLDQLLYLKLVV
9938
9937 KEIFRLHPPGPLLLPRETMSHFQMNGYHIHPKTRVHVNV*AIGREPNVWKNPEEFFPLRF
9758
9757
IDNSIDFKGQHFELLPFGAGRRVCPAINMGIAMVELTFANLLYHFNWKLPHGMKEEDINM 9578
9577 EEGAGITSPKKFAFILRP 9524
>CYP71AS12P second pseudogene on AM481172 56% to CYP71AS6v1
8682
VMLLQLGSVPTVVVSSA*ATKEVKT 8608
7225
FLAGAKQECPTMV*EMAELARNPRIMKKTQAEVRSCAGKQGKVLGT 7088
7080
DLDQLNYLKMMTMKEMLRLYP
7018
FSHTILPTETMQHFNIN 6968
6966
SSSVYPKTQFLQLDVLAIGKDP 6901
6900
NIWENTQKNF 6871
6883
PEEFSLERF 6857
>CYP71AT3 CAAP02000328.1a, 92% to CAN64422.1 (CAO61025.1)
46439 MTLLLFVILAFPLFLLFLYRKHRKNGGLLPPGPPGLPFIGNLHQMDNSAPHRYLWQLS 46612
46613
KQYGPLMSLRLGFVPTIVVSSAKIAKEVMKTQDLEFASRPSLIGQQRLSYNGLDLAFSPY 46792
46793
NDYWREMRKICVLHLFTLKRVKSYTSIREYEVSQMIEKISKLASASKLINLSEALMFLTS 46972
46973 TIICRVAFGKRYEGEGCERSRFHGLLNDAQAMLGSFFFSDHFPLMGWLDKLTGLTARLEK 47152
47153
TFREMDLFYQEIIEEHLKPDRKKQELEDITDVLIGLRKDNDFAIDITWDHIKGVLM (0) 47320
47389
NIFLGGTDTGAATVTWAMTALMKNPRVMKKAQEEVRNTFGKKGFIGEDDVEKLPYLKA 47562
47563
VVKETMRLLPSVPLLVPRETLQKCSLDGYEIPPKTLVFVNAWAIGRDPEAWENPEEFMPE 47742
47743
RFLGSSVDFRGQHYKLIPFGAGRRVCPGLHIGVVTVELTLANLLHSFDWEMPAGMNEEDI 47922
47923 DLDTIPGIAMHKKNALCLVAKKYN* 47997
>CYP71AT4 CAAP02000328.1b, 96% to CAN64422.1 but 5.8 kb upstream, different gene
76820
MTVLLFVILAFPLLLLFLHRKHRKNGGLLHLPPGPPGLPVIGNLHQMDNSAPHRYLWQLS 76999
77000
KQYGPLMSLRLGFIPTIVVSSARIAKEVMKTHDLKFASRPSLIGPRRLSYNCLDLAFSPY 77179
77180
NDYWREMRKICVLHLFTLKRVQSYTPIREYEVSQMIEKISKLASASKLINLSETVMFLTI 77359
77360
TIICRVSFGKRYEDEGCETSRFHGLLNDAQAMLGSFFFSDHFPLMGWLDKLTGLTARLEK 77539
77540 TLRDMDLFYQEIIEDHLKPDRKKQEQEDITDVLIELQKDNSFAIDITWDHIKGVLM
(0) 77707
77780
NIFVGGTDAGTATVIWAMTALMKNPRVMKKAQEEVRNTFG 77899
77900
KKGFIGEDDVEKLPYLKAVVKETMRLLPAAPLLLPRETLQKCSIDGYEIPPKTLVFVNAW 78079
78080
AIGRDPEAWENPEEFIPERFLGSSVDFRGQNYKLIPFGAGRRVCPAIHIGAVTVELTLAN 78259
78260
LLYSFDWEMPAGMNKEDIDFDVIPGLTMHKKNALCLMAKKYN* 78388
>CYP71AT5P gi|147832399|emb|CAN64422.1| 48% to CYP83A2/83B1
CAAP02000328.1c 84167-85735 100% match, 96% to CYP71AT4
MTVLLFVILAFPLLLLFLHRKHRKNGGLLHLPPGPPGLPFIGNLHQMDNSARHRYLWQLSKQYGSLMSLR
LGFIPTIVVSSARIAKEVMKTHDLEFASRPSLIGPQRLSYNCLDLAFSPYNDYWREMRKICVLHLFTLKR
VQSYTPIREYEVSQMIEKISKLASASKLINLSETLMFLTSTIICRVAFGKRYEDEGFERSRFHGLLNDAQ
AMLGSFFFSDHFPLIGWLDKLTGLTARLEKTFRDMDLFYQEIIEDHLKPDRKKQEQEDITDVLIGLQKDN
SFAIDITWDHIKGVLM
(0)
NIFVGGTDTGAATVIWAMTALMKNPRVMKKAQEEVRNTFGKKGFIGEDDVEKLP
YLKAVVKETMRLLPAVPLLIPRETLQKCSIDGYEIPPKTLVFVNAWAIGRDPEAWENPEEFIPERFLGSS
VDFRGQNYKLIPFGAGRRVCPGIHIGAVTVELTLANLLYSFDWEMPAGMNKEDIDFDVIPGLTMHKKNAL
CLMAKKYN*
>CYP71AT6P CAAP02000328.1d, pseudogene 100% to CAN64424.1 in overlaps, 69% to CAN64422.1
94192
ILLALPLILLEIRETMEECFFRPPGPPGLPFIGNLLHLDKSAPHRYLWQLSEKYGAL 94362
94363
MFLRLGFVPTLVVSSARMAEEVMKTHDLEFSSRPSLLGQQKLS*NGLDLAFAPYTNYWRE 94542
94543 MKKICTLHLFNSKRAQSFRSIREDEVSRMIEKISKFASASKLVNLSETLHFLTSTIICRI 94722
94723 AFSKRYEDEGWERSRFHTLLSEAQAIMGASFFKDYFPFMGWVDKLTGLTARLQKILRELD 94902
94903 LFYQEIIDHLNPERTKYEQEDIADILIG
RINDSSFAIDITQDHIKAVVM
95017 NIFVGGTDTIAAILVWAMTALMKDPIVMKKAQEEIRNIG 95259
95260 GKKGFRDEDDIEKLPYLKALTKETMKLHPPIPLIPRATPENCSVNGCEVPPKTLVFVNA 95436
95437 WAIGRDPESRENPHEFNPERFLGTFIDFKGQHYGLMAFRAGRRGCPGIYLRTVIIQLALG 95616
95617 NLLYSFDWEMPNGMTKEDIDTDVKHGVTM 95703
>CYP71AT6P CAN64424.1 44% to 83A2/B1, pseudogene
MKKICTLHLFNSKRAQSFRSIREDEVSRMIEKISKFASASKLVNLSETLHFLTSTIICRIAFSKRYEDEG
WERSRFHTLLSEAQAIMGASFFKDYFPFMGWVDKLTGLTARLQKILRELDLFYQEIIDHLNPERTKYEQE
DIADILI
GGTDTIAAILVWAMTALMKDPIVMKKAQEEIRNIGGKKGFRDEDDIEKLPYLKALTKETMKLH
PPIPLIPRATPENCSVNGCEVPPKTLVFVNAWAIGRDPESRENPHEFNPERFLGTFIDFKGQHYGLMAFR
AGRRGCPGIYLRTVIIQLALGNLLYSFDWEMPNGMTKEDIDTD
GHFTGQLGQLAGNILGGFRQLRFSGVSITMWKLKRWKLRVHETQKNI
>CYP71AT7 CAAP02000328.1e, 84% to 104360
99326
MMILLLILLALPLFLLFLLRNRRRTPLPPGPPGLPLIGNLLQLDKSAPHIYLWRLS 99493
99494
KQYGPLMILRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSLLGLRKLSYNGLDVAFSPY 99673
99674 NDYWREMRKICVLHLFNSKRAQSFRPIREDEVLEMIKKISQFASASKLTNLSEILISLTS 99853
99854
TIICRVAFSKRYDDEGYERSRFQKLVGEGQAVVGGFYFSDYFPLMGWVDKLTGMIALADK 100033
100034
NFKEFDLFYQEIIDEHLDPNRPEPEKEDITDVLLKLQKNRLFTIDLTFDHIKAVLM (0) 100201
100333 NIFLAGTDTSAATLVWAMTMLMKNPRTMTKAQEELRNLIGKKGFVDEDDLQKLPYLKAIV 100512
100513
KETMRLHPASPLLVPRETLEKCVIDGYEIPPKTLVYVNAWAIGRDPESWENPEEFMPERF 100692
100693
LGTSIDFKGQDYQLIPFGGGRRICPGLNLGAAMVELTLANLLYSFDWEMPAGMNKEDIDI 100872
100873 DVKPGITMHKKNALCLLARIPMH* 100944
>CYP71AT8 CAAP02000328.1f, 71% to CAN64422.1
104360
MMILLLILLALPLFLLFLLRNQRRAPLPPGPPGLPFIGNLLQLDKSAPHLYLWRLS 104527
104528
KQYGPLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSLLGQQKLFYNGLGLTFTPY 104707
104708
NDYWREMRKICVLHLFNSKRVQSFRYIREDEVLEMIKKISKFASASKLTNLSEILIPLTS 104887
104888
TIICRVAFGKRYDDEGCERSRFHELLGGIQTMAIAFFFSDYFPLMSWVDKLTGMISRLEK 105067
105068
VSEELDLFCQKIIDEHLDPNKPMPEQEDITDILLRLQKDRSFTVDLTWDHIKAILM (0) 105235
105143 DIFIAGTDTSAATLVWAMTELMKNP 105427
105428
IVMKKAQEEFRNSIGKKGFVDEDDLQMLCYLKALVKETMRLHPAAPLLVPRETREKCVID 105607
105608
GYEIAPKTLVFVNAWAIGRDPEFWENPEEFMPERFLGSSIDFKGQDYQFIPFGGGRRACP 105787
105788
GSLLGVVMVELTLANLLYSFDWEMPAGMNKEDIDTDVKPGITVHKKNALCLLARSHT* 105961
>CYP71AT8 AM489206.2a 58% to 71AT1 tomato
1212
MMILLLILLALPLFLLFLLRNQRRAPLPPGPPGLPFIGNLLQLDKSAPHLYLWR 1373
1374
LSKQYGPLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSLLGQQKLFYNGLGLTFT 1553
1554
PYNDYWREMRKICVLHLFNSKRVQSFRYIREDEVLEMIKKISKFASASKLTNLSEILIPL 1733
1734
TSTIICRVAFGKRYDDEGCERSRFHELLGGIQTMAIAFFFSDYFPLMSWVDKLTGMISR 1910
1911
LEKVSEELDLFCQKIIDEHLDPNKPMPEQEDITDILLRLQKDRSFTVDLTWDHIKAILM (0) 2087
2206
DIFIAGTDTSAATLVWAMTELMKNPIVMKKAQEEFRNSIGKKGFVDEDDLQMLCYLKA 2378
2379
LVKETMRLHPAAPLLVPRETREKCVIDGYEIAPKTLVFVNAWAIGRDPEFWENPEEFMPE 2558
2559
RFLGSSIDFKGQDYQFIPFGGGRRACPGSLLGVVMVELTLANLLYSFDWEMPAGMNKEDI 2738
2739
DTDVKPGITVHKKNALCLLARSH 2807
>CYP71AT9 CAAP02000328.1g, 73% to CAN64422.1 (CAO61031.1)
on contig CU459218.1 chr18 scaffold_1
111012
MILHLILLALPLFLLFLVRNHRNNGRTPLPPGPPGLPFIGNLLQISKTAPHLYLWQLSKQ 111191
111192
YGSLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSMLGLKKLTYNGLSLSVAPSND 111371
111372
YWREMRKVCALHLFNSKRVQSFRHIREDEVLETVKKISKFASASKLTNLSEILILLTSTI 111551
111552
ICRVAFGKRYDDEGCERSRFHELLGGVQTMSMAFFFSDHFPLMGWVDKLTGMIARLEKIF 111731
111732 EELDLFCQEIIDEHLDPNRSKLEQEDITDVLLRLQKDRSSTVDLTWDHIKAMFV
(0) 111929
111831 DIFVAGTDTSAATVVWAMTELMKNPIVMKKAQE 112091
112092
ELRNLIGKKGFVDEDDLQKLSYLKALVKETMRLHPAAPLLVPRETLEKCVIDGYEIAPKT 112271
112272
LVFVNAWAIGRDPEFWENPEEFMPERFLGSSIDFKGQDYQLIPFGGGRRVCPGLLLGAVM 112451
112452
VELTLANLLYSFDWEMPAGMNKEDIDTDVKPGITMHKKNALCLLARSHI* 112601
>CYP71AT9 AM489206.2b 57% to 71AT1 tomato, 88% to AM489206.2a
same as partial seq CAN71113.1
7716
MILHLILLALPLFLLFLVRNHRNNGRTPLPPGPPGLPFIGNLLQISKTAPHLYLWQLSKQY 7898
7899
GSLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSMLGLKKLTYNGLSLSVAPSNDY 8078
8079
WREMRKVCALHLFNSKRVQSFRHIREDEVLETVKKISKFASASKLTNLSEILILLTSTII 8258
8259
CRVAFGKRYDDEGCERSRFHELLGGVQTMSMAFFFSDHFPLMGWVDKLTGMIARLEKIF 8435
8436
EELDLFCQEIIDEHLDPNRSKLEQEDITDVLLRLQKDRSSTVDLTWDHIKAMFV (0) 8597
8697
DIFVAGTDTSAATVVWAMTELMKNPIVMKKAQEELRNLIGKKGFVDEDDLQXLSYLKA 8870
8871
LVKETMRLHPAAPLLVPRETLEKCVIDGYEIAPKTLVFVNAWAIGRDPEFWENPEEFMPE
9050
9051
RFLGSSIDFKGQDYQLIPFGGGRRVCPGLLLGAVMVELTLANLLYSFDWEMPAGMNKEDI 9230
9231 DTDVKPGITMHKKNALCLLARSHI* 9305
>CYP71AT10Pv1 CAAP02000328.1h, pseudogene, 70% to CAN64422.1, 96% to CAN71114.1
94% to CYP71AT5P
116430
LHLPPGPGLPFIGNLYQMDNSTPHVYLWQLSKQYGPILSLGLGLVPTLVDSLAKMAKEL 116606
116607 LKAHDLEFSSRSSSLGQQSVT 116669
YNGLDLD
117280 FAPYDGYWREMRKICVLHPFSSKRVQSFRSIREDEVSRIIEKISKSASAAKLTDLSETVM 117459
117460
LLTSNIICRTAFGKRYEDKGYDRSRFHGLLNDAQAMMGSFFFTDHFPSMGWVDKLTDLIA 117639
117641
RPEKNFKELDLFYQEVIDEHLDPKRPKQEQEDIAVVLLRLQRERLFSVDLTWDHIKAVLM 117820
117972
DVFVAGTDPGAATLVWAMAEVTKNPGGKKKAQEELRTVFGRKGFVDEDDLHKLPYLKA 118145
118146
LVKETLRVHPPAPLLLTKETLENCTIDAYDIPPKTLVFVNAWAIGRDPEAWENPEEILPE 118325
118326
RFLSSSVDFKGQDYELISFSVGRRGCPGIHLGVVTVELALANLLYSFD*EMPAGMNKENI 118505
118506 DMDMKPGLTLDKRNALCLQARQYNLAS* 118589
>CYP71AT10Pv2 AM489206.2c pseudogene 70% to AM489206.2a 56% to 71AT1
13061
PPGPGLPFIGNLYQMDNSAPHVYLWQLSKQYGPILSLGLGLV 13186
14708
GVTXTLVVSSARMAKEVLKAHDLEFSSRSSSLGQQRLSYNGLDLAFAPYDGYWREMRKICVLHPF 14902
14903
SSKRVQSFRSIREVEVSRMIEKFSKSASAAKLTDLSETVMLLTSNIICRTAFGKRYEDK 15079
15080
GYDRSRFHGLLNDAQAMMGSFFFTDHFPSMGWVDKLTXL
LLRPEKNFKELDLFY*EIIDEHLDPKRPKQEQEDIXVV 15311
15312
LLRLQRERLFLVDLTWDHIKAVPM (0)
15535
DVFVAGTDPGAATLVWAMAEVTKNPGGKKKAQEELRTVFGRKGFVDEDDLHKLPYLKA 15708
15709
LVKETLRVHPPAPLLLXKETLENCTIDGYDIPPKTLVFVNAWAIGRDPEAWENPEEILPE 15888
15889
RFLSSSVDFKGQDYELISFSVGRRGCPGIHLGVVTVELALANLLYSFD*EMPAGMNKENI 16068
16069
DMDMKPGLTLDKRNALCLXARQY 16137
>CYP71AT10Pv2 CAN71114.1 50% to 83A2/B1
same as AM489206.2c, only 9 aa diffs with CYP71AT10Pv1 (97% identical)
MAKEVLKAHDLEFSSRSSSLGQQRLSYNGLDLAFAPYDGYWREMRK
ICVLHPFSSKRVQSFRSIREVEVSRMIEKFSKSASAAKLTDLSETVMLLTSNIICRTAFGKRYEDKGYDR
SRFHGLLNDAQAMMGSFFFTDHFPSMGWVDKLT
DVFVAGTDPGAATLVWAMAEVT
KNPGGKKKAQEELRTVFGRKGFVDEDDLHKLPYLKALVKETLRVHPPAPLLLXKETLENCTIDGYDIPPK
TLVFVNAWAIGRDPEAWENPEEILPERFLSSSVDFKGQDYELISFSVGRRGCPGIHLGVVTVELALANLL
YSFD
>CYP71AT11P CAAP02000504.1 pseudogene 76% to CAAP02000328.1e 50% to 71B37
104371
MLLLLVFLMVLPLFLLWKHRVNGGKLLPPGPPGLPLIGSLHQL 104499
116804
SL*SLTDTYGISLNNMDPLMFLHLGFEPILVVSSPRTAEVMKTHDPEFSSRPSLLVIT 116977
125485
ALQKLSYNGLDLAFASYGAYWREIRKICV 125571
125582
DIVDILLKLHKDRLFTVDLSWNHIKAVLM (0) 125668
126490
AGTDTVAATMVWTMTALMKNPRVMKKAQKEVRTLVGEKCFVDEDDIQKLTYMKALVKESMR 126672
126673
LYPAAPLLIPRETLQKCNIDGY*IPTKTLVFVNAWAIGRDPESWENPEEFMPERFLGTCI 126852
126853
DFKGQDYKLIPFGAGRRIWPGMNLGAVTVELALANLLYSFDWEMPAGMKMEDIDTDAKPG 127032
127033
LTMTKKNDLYLVARNYI* 127086
>CYP71AU3 CAAP02005726.1 85% to CAAP02001743.1b, 54% to 71A26
11548
MGSFLDLLYKENASFFLLFLPFF
11479
VFIYFLIKWLYPTTPAVTTKRLPPSPPKLPIIGNLHQLGLLPHRSLWALAQRHGPIMLLH 11300
11299
FGKVPVVIVSAADAAREIMKTNDVIFLNRPKSSIFAKLLYDYKDVSMAPYGEYWRQMRSI 11120
11119
CVLHLLSNRRVQSFRGVREEETALLMEKISSSSSSSTPIDLSKMFLSLTNDLICRVALGR 10940
10939
KYSGDETGRKYRELLKEFVGLLGGFDVADYIPWLSWVNFINGLDAKVEKVAKEFDRFLDE 10760
10759
VVKEHVERRKRGVDEEVKDFVDVLLGIQEDNVTGVAITGVCIKALTL (0) 10619
10189
DMFAAGSDTTYTVLEWAMTELLRHPQVMRQLQNEVRGIAQGKLLITEDDLDKMQYLKAVI 10010
10009
KETLRLHPPVPLLLPRESTRGAKIMGYDIEVGTQVITNAWAIGRDPLLWDEAEEFRPERF 9830
9829
LNSSIDFTGKDFELIPFGAGRRGCPGTLFAAMAIEVALANLVHQFDWEVGGGGRREDLDM 9650
9649 TECTGLTIHRKVPLLAVATPWPR* 9578
>CYP71AU4 gi|147767047|emb|CAN67678.1| 46% to 71T4
CAAP02004888.1 13222-15147 1 aa diff
MLLLDPLSFSLFPFFFFIVLLVRWLFSTPPTTHKTLPPSPPRLPVLGNMHQLGIYPYRSLLCLARCYGPL
MLLQLGRVRTLVVSSPDAAQEIMKTHDLIFANRPKMSLGKRLLYDYKDVSVAPYGEYWRQMRSICVLHLL
SNKRVQSFNTVRREEISLLIQKIEEFSSLSTSMDLSGMFMRLTNDVICRVAFGRKYSGDERGKKFRRLLG
EFVELLGGFNVGDYIPWLAWVEYVNGWSAKVERVAKEFDEFLDGVVEEHLDGGTGSIAKGDNEKDFVDVL
LEIQRDGTLGFSMDRDSIKALILDIFAGGTDTTYTVLEWAMTELLRHPKAMKELQNEVRGITRGKEHITE
DDLEKMHYLKAVIKETLRLHPPIPLLVPRESSQDVNIMGYHIPAGTMVIINAWAMGRDPMSWDEPEEFRP
ERFLNTNIDFKGHDFELIPFGAGRRGCPGISFAMATNELVLANLVNKFDWALPDGARAEDLDMTECTGLT
IHRKFPLLAVSTPCF*
>CYP71AU5 CAAP02003357.1 92% to CAAP02005726.1 53% to 71A26
38067
MGSFLGLLYKENDS
38025
FFLLLLPFFIFTHFLIKWLYPTTPAVTTKKLLPSPPKLPIIGNLHQLGSLPHRSLWALAQ 37846
37845
RHGPLMLLHFGRVPVVIVSAVDAAREIMKTNDAIFSNRPKSNISAKLLYDYKDVSTAPYG 37666
37665
EYWRQMRSICVLHLLSTRRVQSFRGVREEETALLMEKISSSSSSSIPIDLSQMFLSLTND 37486
37485
LICRVALGRKYSGDENGRKYRELLKEFGALLGCFNVGDYIPWL 37357
37357 SWVNFINGLDAKVEKVAKEFDRFLDEVVKEHVERRKRGVDEEVKDFVDVLLGIQEDN
37187
37186
VTGVAITGVCIKALTL 37139
36747
DMFAAGSDTTYTVLEWAMTELLRHPQVMRQLQNEVRGIAQGKLLITEDDLDKMQYLKAVI 36568
36567
KETLRLYPPIPLLVPRESTRDAKIMGYDIAARTQVITNVWAIGRDPLLWDEAEEFRPER 36391
36390
FLNSSIDFRGQDFELIPFGSGRRGCPGTLFAAMAIEVVLANLVHRFDWEVGGGGRREDLD 36211
36210
MTECTGLTIHRKVPLLAVATPWPR* 36136
@9
>CYP71AU6P CAAP02001743.1b, pseudogene, 58% to CAN67678.1
35127
IFIYFLIKWLYPTTSTVTTKRLPHFPLKLPIIGNLFQLGSLSHRSL*VLAQRHGSLMLLH 34948
34947 FGRVPVVIVSIANTAREIMKTNDVIFSNRSKSNISAKLLYDYKDVSTTPYKEYWRQMRSI 34768
34767
CVLHFLSTRRVLSFRGVQEEETTLMMEKISSSASSTPIDLSQMFQSLTNDLICRVSL*RK 34588
34587
YSGDETGRKYRELLKKFVGLLGGFNVGDYIPWLSWVNFINGLETKVEKVSKVFDRFLD 34414
34413 EVVK*HVERRKRCGVDEEMKDFVDVLL 34333 XXXXXXXXXXXXXXXXXXXXXX
33813
DMFAARSDSTYTVLEWAMTKLLRHPQVMRQL*NEARGIAQGKLLITEDDLGKMQYLMAVIK 33631
33631
ETLRLHPLIPLLILRESTRGAKIMGYDIEAGTRVITNAWPIGGDPLLWDEAEEFWPERF 33455
33454
LNSSIDFTGKDFELISFGAGQRGCPGTLFAKMAIELVLANLVHHFDWEVAGGGRREDLDM 33275
33274
TECIGLTIHIKVLLLAVATP 33215
#11
>CYP71BC1 gi|147861230|emb|CAN80448.1| = AM435124.2
CAAP02002092.1 15806-13388 (-) strand 1 aa diff.
MTMKISENMLLLFSQSSANQWLLALGILSFPILYLFLLQRWKKKGIEGAARLPPSPPKLPIIGNLHQLGK
LPHRSLSKLSQEFGPVLLLQLGRIPTLLISSADMAKEVLKTHDIDCCSRAPSQGPKRLSYNFLDMCFSPY
SDYWRAMRKVFVLELLSAKRAHSLWHAWEVEVSHLISSLSEASPNPVDLHEKIFSLMDGILNMFAFGKNY
GGKQFKNEKFQDVLVEAMKMLDSFSAEDFFPSVGWIIDALTGLRARHNKCFRNLDNYFQMVVDEHLDPTR
PKPEHEDLVDVLLGLSKDENFAFHLTNDHIKAILL
(0)
NTFIGGTDTGAVTMVWAMSELMANPRVMKKVQAEV
RSCVGSKPKVDRDDLAKLKYLKMVVKETFRMHPAAPLLIPHRTRQHCQINANGCTYDIFPQTTILVNAFA
IGRDPNSWKNPDEFYPERFEDSDIDFKGQHFELLPFGAGRRICPAIAMAVSTVEFTLANLLYCFDWEMPM
GMKTQDMDMEEMGGITTHRKTPLCLVPIKYGCVE*
@10
>CYP71BC1-de2b CAAP02002092.1 C-term pseudogene, 67% to 71BC1
16741 KAQHTDMEEVGGITISR 16691
16647 PLCFVPIKYGWV 16612
@11
>CYP71BC3-de1b CAAP02002092.1 N-term pseudogene 80% to 71BC2
19486
VVLYSVICFFLVQKWGNRVVVERATTPPSPSKLAIIGNLHQLS*WSYRSLWTLSQKYGSI 19307
19306 MFLQLGSV 19283
#12
>CYP71BC3 gi|147781883|emb|CAN72169.1| 62% to CYP71BC1
CAAP02002092.1 28083-26374 (-) strand 100% match
MAMEIAEAVMEVFSPSSVTDWLFTLSVVLLSVLCFFLVQKWGNRAVLERATTPPSPPKLPIIGNLHQLSKLH
HRSLWTLAQKHGSIMFLQLGSIPTIVISSADMAEQVLRTRDNCCCSRPSSPGSKLLSYNFLDLAFAPYSD
HWKEMRKLFNANLLSPKRAESLWHAREVEVGRLISSISQDSPVPVDVTQKVFHLADGILGAFAFGKSYEG
KQFRNQKFYDVLVEAMRVLEAFSAEDFFPTGGWIIDAMSGLRAKRKNCFQNLDGYFQMVIDDHLDPTRPK
PEQEDLVDVFIRLLEDPKGPFQFTNDHIKAMLM
(0)
NTFLGGTDTTAITLDWTMSELMANPRVMNKLQAEVRS
CIGSKPRVERDDLNNLKYLKMVIKEALRKHTPIPLLIPRETMDYFKIHDKSSSREYDIYPGTRILVNAWG
IGRDPKIWKDPDVFYPERFEDCEIEFYGKHFELLPFGGGKRICPGANMGVITAEFTLANLVYCFDWELPC
GMKIEDLGLEEELGGITAGRKKPLCLVARRCGCSCTEPM*
@12
>CYP71BC3-de1c CAAP02002092.1 N-term pseudogene, 78% to CYP71BC2
29938
KLATIGNLHQLSKWSYRSLWTLSQKYGSIMFLQLGSV 29828
@13
>CYP71BC3-de1d CAAP02002092.1 N-term pseudogene 79% to CYP71BC2
31559
VVLFSVICFFLVQKWGNRVVVERATTPPSPSKLAIIGNLHQLS*WSYRSLWTLSHKYGSI 31380
31379 MFLQLGSV 31356
@14
>CYP71BC3-de2b CAAP02002092.1 C-term pseudogene 95% to 71BC2
39030
KMVIKEAMRKHTPIPLLIPRETMDYFKIHDKSSSREYDIYRETRILVNAWGIGRDPKSWK 38851
38850
DPDVFYPERFEDCEIEFYGKHFELLPFGGGKRICPGANMGVITAEFTLANLVCCFDWELP 38671
38670 CGMKIEDLGLEEELGGITASRKTPLCLVARRCGC 38569
>CYP71BE1 CAAP02002803.1, 46% to CAAP02001743.1a, 42% to 71B37,
6 aa diffs to CYP71BE1 AM445470.2
28146 MEFPSSFLFPFLLFLFILFKVSKKSKPQISIPKRPPGPWKLPLIGNLHQLVGSLPHHSLRDL 28331
28332
AKKYGPLMHLQLGQVSMLVVSSPEIAKEVMKTHDINFAQRPHLLATRIVSYDSTDVAFSP 28511
28512
YGDYWRQLRKICVVELLSAKRVKSFQVIRKEEVSKLIRIINSSSRFPINLRDRISAFTYS 28691
28692
AISRAALGKECKDHDPLTAAFGESTKLASGFCLADLYPSVKWIPLVSGVRHKLEK 28856
28857
VQQRIDGILQIVVDEHRERMKTTTGKLEEEKDLVDVLLKLQQDGDLELPLTDDNIKAVIL (0) 29036
DIFGGGGDTVSTAVEWTMAEMMKNPEVMKKAQAE
29216
29217
VRRVFDGKGNVDEAGIDELKFLKAVISETLRLHPPFPLLLPRECREKCKINGYEVPVKTR 29396
29397 VVINAWAIGRYPDCWSEAERFYPERFLDSSIDYKGADFGFIPFGSGRRICPGILFGIPVI 29576
29577
ELPLAQLLFHFDWKLPNGMRPEDLDMTEVHGLAVRKKHNLHLIPIPYSPLTVG* 29738
>CYP71BE4P CAAP02000100.1e pseudogene
97018
LIGNMHQLISYLPHHALRDLAKKHGPLMDLQLGEVSTIIVSSPETAKGVIKTQII 96854
96853
ISQRPHV*KFWI*ELFTAKPVQFFQSIREEEVSGLVRSIS 96734
96733
LNIRSPINLAKE 96698
96499
SGTMVHRVMSEMLKNPQIMKKAQAEVRQTFETKGEVDDIGIHELKILKLVVKETPRLHPP 96320
96319
APLLLPRECGERFEISGCDDIPLNPMSLLLHGQLEEMEALNST*QLQPREIFLKSLVDYK 96140
96139
GTNFDFIPFG 96110
>CYP71BE5 CAAP02000100.1d 61% to CAAP02002803.1
81714
MELQFSFFPILCT
81675
FLLFIYLLKRLGKPSRTNHPAPKLPPGPWKLPIIGNMHQLVGSLPHRSLRSLAKKHGPLM 81496
81495
HLQLGEVSAIVVSSREMAKEVMKTHDIIFSQRPCILAASIVSYDCTDIAFAPYGGYWRQI 81316
81315
RKISVLELLSAKRVQSFRSVREEEVLNLVRSVSLQEGVLINLTKSIFSLTFSIISRTA 81142
81141
FGKKCKDQEAFSVTLDKFADSAGGFTIADVFPSIKLLHVVSGMRRKLEKVHKKLD 80977
80976
RILGNIINEHKARSAAKETCEAEVDDDLVDVLLKVQKQGDLEFPLTMDNIKAVLL 80812
80544
DLFVAGTETSSTAVEWAMAEMLKNPRVMAKAQAEVRDIFSRKGNADET 80401
80400
VVRELKFLKLVIKETLRLHPPVPLLIPRESRERCAINGYEIPVKTRVIINAWAIARDPKY 80221
80220
WTDAESFNPERFLDSSIDYQGTNFEYIPFGAGRRMCPGILFGMANVELALAQLLYHFDWK 80041
80040
LPNGARHEELDMTEGFRTSTKRKQDLYLIPITYRPLPVE* 79921
>CYP71BE6 CAAP02000100.1c 61% to CAAP02002803.1
60542
MELQFSFFPILCT
60513
FLLFIYLLKRLGKPSRTTHPAPNLPPGPWKLPIIGNMHQLVGSLPHHSLRNLAKKHGPLM 60334
60333
HLQLGEVSAIVVSSREMAKEVMKTHDIIFSQRPCILAASIVSYDCTDIAFAPYGDYWRQI 60154
60153
RKISILELLSAKRVQSFRSVREEEVLNLVRSISSQEGVSINLTESIFSLTFSIISRAA 59980
59979
FGKKCKDQEAFSVTLEKFAGSGGGFTIADVFPSIKLLHVVSGIRHKLEKIHKKLD 59815
59814
TILENIINEHKARSEASEISEAEVDEDLVDVLLKVQKQGDLEFPLTTDNIKAILL 59650
59202
DLFIAGSETSSTAVEWAMAEMLKNPGVMAKAQAEVRDIFSRKGNADETMIHELKFLK 59032
59031
LVIKETLRLHPPVPLLIPRESRESCEINGYEIPVKTRVIINAWAVARDPEHWNDAESFNP 58852
58851
ERFLDSSIDYQGTNFEYIPFGAGRRMCPGILFGMANVEIALAQLLYYFDWKLPNGTQHEE 58672
58671
LDMTEDFRTSLRRKLNLHLIPITYRPLPVE* 58579
>CYP71BE6-de1b CAAP02000100.1c-de1b pseudogene N-term
63932
ISILCTFLLFIYLLKRLGKPYRTNGPARKLPAGPWKLPIIGNMHQLFGSLPHHSLRNLAK 63753
63752
QHGTLMHLQPGEASTIVVS*REMEK 63678
>CYP71BE7 CAAP02000100.1b 61% to CAAP02002803.1
47703
MELHFPSFH
47676
ILSAFILFLVVVLRTQKRSKTGSLTPNLPPGPWKLPLVGNIHQLVGSLPHHALRDLAKKY 47497
47496
GPLMHLQLGEVSTIVVSSSEIAKEVMKSHDIIFAQRPHILATRIMSYNSTNIAFAPYGDY 47317
47316
WRHLRKICMSELLSANRVQSFQSIRNEEESNLVRSISLNTGSPINLTEKTFASICAIT 47143
47142
TRAAFGKKCKYQETFISVLLETIKLAGGFNVGDIFPSFKSLHLISGMRPKLEKLH 46978
46977
QEADKILENIIHEHKARGGTTKIDKDGPDEDLVDVLLKFHEDHGDHAFSLTTDNIKA 46807
46806
VLL (0) 46798
46624
DIFGAGSEPSSTTIDFAMSEMMRNPRIMRKAQEEVRRIFDRKEEIDEMGIQELKFLKLVI 46445
46444
KETLRLHPPLPLLLPRECREKCEIDGHEIPVKSKIIVNAWAIGRDPKHWTEPESFNPERF 46265
46264
LDSSIDYKGTNFEYIPFGAGRRICPGILFGLASVELLLAKLLYHFDWKLPNGMKQQDLDM 46085
46084
TEVFGLAVRRKEDLYLIPTAYYPLSHE* 46001
>CYP71BE8P CAAP02000100.1a frameshift and stop, possible pseudogene
44% to CYP71B33, 64% to CAAP02002803.1
MEIHLPSSYAFFAFLLSMFIVFKIGKVQIQNL
31068
PAKLPPGPWKLPLIGNMHQLVGSLPHHTLKRLASKYGPFMHLELGEVSALVVSSPEIARE 30889
30888
VMKTHDTIFAQRPPLLSSTIINYNATSISFSPYGDYWRQLRKICTIELLSAKRVKSFQSI 30709
30708
RE*EVSKLIWSISLNAGSPINLSEKIFSLTYGITSRSAFGKKFRGQDAFVSAIL 30547
30546
EAVELSAGFCVADMYPSLKWLHYISGMKPKLEKVHQKIDRILNNIIDDHRKRKTTTKAG 30370
30369
QPETQEDLVDVLLNLQEHGDLGIPLTDGNVKAVLL (0) 30265
29794
DIFSGGGETSSTAVVWAMAEMLKSPIVMEKAQAEVRRVFDGKR 29666
29665
DINETGIHELKYLNSVVKETLRLHPSVPLLLPRECRERCVINGYEIPENTKVIINAWAIA 29486
29485
QDPDHWFEPNKFFPERFLDSSIDFKGTDFKYIPFGAGRRMCPGILFAIPNVELPLANLLY 29306
29305
HFDWKLPDGMKHEDLDMTEEFGLTIRRKEDLNLIPIPYDPFLVL* 29171
#13
>CYP71BE9Pv1 CAAP02000216.1a pseudogene CYP71BE like, 78% to CAAP02001833.1a
7514
MDFLFSSILFAFLLFLYMLYKMGERSKASISTKKLPPGPWKLPLL 7648
7648
GNMHQLVGSLPHQSLSRLSKQYGPLMSLQLCEVYALTISSPEMAKQV 7788
7789
MKTHDINFAHRPPLLASNVLSYDSTDILYPPYGDYWRQLRNICVVELLTSKRVKSFQLVR 7968
7969
EAELSNLITAVVSCSRLPFNRNENLSSYTFSIISRAAFGEKFEDQDAFISVTKEMAELYS 8148
8149
GFCVADMYPSVKWLDLISGMRYKLDKVFQR 8238
8241
DRILQNIVDEHRDKL*PQAGKLQGEEDLVDVLLKLQQHGDLEFPLTDNNIKGVIL 8405
13594
NIFSGGGKTTFTSVD*
13642
AMSEMLKNPRVMEKAQAEVRRVFDGKGNVDETGLDG 13749
13749
IKIF*AVVKETLRLHTPFPLLLPRECREMCWIDGYEIPEKTRIIVNAWAIG*DSVYWVEA 13928
13929
ERFYPERFLDSSIDYKGTDFGYIPFGAGRRICPGIPFAMPYIELPLAHLLYHFDWKLP 14102
14103
KGIKAEDLDMTEAFCLAVCRKQDLHLIPIPYNPLHAQ* 14216
>CYP71BE9Pv2 Pinot noir (a highly heterozygous grape genome) CAN66039.1
top part is a retrotransposon seq like AAP46207 putative retrotransposon protein Oryza sativa
97% (6 aa diffs) to CAAP02000216.1a probable ortholog to the pseudogene
from AM472203.2 exon 2 only
MNEEMKALQIDLPIGKIPVGCRWVFTIKYKVDGTVEWLRKSLYGLKQSPRAWFGRFTSFMKSIGYKQSNS
YHTLFLKHNKEQIIALIVCVDDMIVIGNDYEEMKTLQEHLAHDFEMKDLDKLKYFLGIEVSRSKKAYALS
VVCQFMHSPSKEHMNVVIHILRYLKSSPGKGILFTKGDNLDINGYTDADWAGSIQDRCSTSWYFTFKVVA
RSNAEAEYKGMAKAICELLWIRNLVKDLHIKQVSPMKLYCDNKAACDIAHNPVQHDRTKYVEVGRHFIKE
KLESKLIEVPHVRSQDQLADVLTKAMSNQ
2182
NIFSGGGKTTSTSVD*AMSEMLKNPRVMEKAQAEVRRVFDGKGNVDETGLDGLKFFK
2352
2352 AVVKETLRLHTPFPLLLPRECREMCWIDGYEIPEKTRIIVNAWAIG*DSVYWVEA
2516
ERFYPERFLDSSIDYKCTDFGYVP
FGAGRRICPGIPFAMPYIELPLAHLLYHFDWKLPKGIKAEDLDMTEAFCLAVCRKQDLHLIPIPYNPLHAQ* 2804
>CYP71BE10v1 CAAP02000216.1b 79% to CAAP02001833.1a
51405
MEFSSSSLLFAFLLFLYMLYKIGKRSKANISTQKLPPGPWKLPLIGNVHQLVGSLPHRS 51581
51582
LTLLAKKYGPLMRLQLGEVSTLIVSSPEMAKQVMKTHDTNFAQRPILLATRILSYDCSGV 51761
51762
AFAPYGDYWRQLRKICVVELLTAKRVKSFQSVREEEISNLITMVTSCSRLQINFTEKISS 51941
51942
LTFSIIARAAFGKKSEDQDAFLSVMKELVETASGFCVADMYPSVKWLDLISGMRYKIDKV 52121
52122
FRMTDRILQNIVDEHREKLKTQSGKLEGEADLVDVLLKLQQNDDLQFPLTDNNIKAVIL (0) 52298
52520
DIFGGAGESTSTSVEWAMSEMLKAPIVIEKAQAEVRSVFDGKGHVDETAIDELKFLKAVV 52699
52700
NETLRLHPPFPLLLPRECREMCKINGYEIPEKTRIIVNAWAIGRDSDYWVEAERFYPERF 52879
52880
LDSSIDYKGTDFGYIPFGAGRRICPGILFAMPGIELPLANLLYHFDWKLPNGMKAEDLDM 53059
53060
TEAFGLAVRRKQDLHLIPIPYNPSHAD* 53143
>CYP71BE10v2 Pinot noir (a highly heterozygous grape genome)
CAN81963.1 (partial translation of intact gene)
Overall 98% to 71BE10 probable ortholog
from AM487125.2 first exon 97% (7 aa diffs) to 71BE10,
second exon 1 aa diff to 71BE10
12930
MEFFSSSLLFAFLLFLYMLYKIAKRSKDNISTQKLPPGPWKLPLIGNVHQLVGSLPHRSL 12751
12750
TXLAKKYGPLMRLQLGEVSTLIVSSPEMAKQVMKTHDTNFAQRPILLATRILSYDCSGVA 12571
12570
FAPYGDYWRQLRKICVVELLTAKRVKSFQSVREEEISNLITMVTSCSRLQINFTEKISSL 12391
12390
TFSIIARAAFGKKSEDQDAFLSVMKELVEXASGFCVADMYPSVKWLDLISGMRYKIDKVF 12211
12210
RMTDRILQNIVDEHREKLKTQSGKLEGEADLVDVLLKLQQNGDLQFALTDNNIKAVIL (0) 12037
11816
DIFGGAGESTSTSVEWAMSEMLKAPIVMEKAQAEVRSVFDGKGHVDETAIDELKFLKAVV 11637
11636
NETLRLHPPFPLLLPRECREMCKINGYEIPEKTRIIVNAWAIGRDSDYWVEAERFYPERF
11457
11456
LDSSIDYKGTDFGYIPFGAGRRICPGILFAMPGIELPLANLLYHFDWKLPNGMKAEDLDM 11277
11276 TEAFGLAVRRKQDLHLIPIPYNPSHAD* 11193
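Notes such as "1 aa diff" or "97% (7 aa diffs)" can be rechecked with a simple position-by-position comparison. The sketch below is only an illustration, not the method used to build this file: the function names aa_diffs and percent_identity are hypothetical, and it assumes the two segments are already aligned without gaps (same length). The two strings are the first lines of exon 2 from CYP71BE10v1 and CYP71BE10v2 above.

```python
def aa_diffs(seq_a, seq_b):
    """Return (position, residue_a, residue_b) for each mismatch (1-based).

    Assumes an ungapped comparison: both strings must have the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length (ungapped comparison)")
    return [(pos, a, b)
            for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
            if a != b]


def percent_identity(seq_a, seq_b):
    """Percent of positions that are identical between two equal-length strings."""
    diffs = aa_diffs(seq_a, seq_b)
    return 100.0 * (len(seq_a) - len(diffs)) / len(seq_a)


# First line of exon 2, copied from the two records above.
v1 = "DIFGGAGESTSTSVEWAMSEMLKAPIVIEKAQAEVRSVFDGKGHVDETAIDELKFLKAVV"
v2 = "DIFGGAGESTSTSVEWAMSEMLKAPIVMEKAQAEVRSVFDGKGHVDETAIDELKFLKAVV"

print(aa_diffs(v1, v2))  # [(28, 'I', 'M')]
print(round(percent_identity(v1, v2), 1))
```

Sequences of unequal length (for example, those with the deletions noted elsewhere in this file) would need a real alignment step first, which this sketch does not attempt.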
>CYP71BE11-de1b CAAP02000216.1c pseudogene N-term
66338
WKLPLIVNMHGLV 66376
>CYP71BE11 CAAP02000216.1c pseudogene 85% to CAAP02001833.1a
66818
LLFLYMLYKIGKRSKGNISAQKLPLEPWKLPLIGNMHQLIDGSLPHRSLSRLTKQYESLM 66997
66998
SLQLGEVSTLIISSPEMAKQVMKTHDINFAQR 67093
72159
STLLATNILSYHSIDIDFPPYGDYGRHLQKICVVELLTS*RFKSFQLVGEDELSNL 72326
72327
IT 72332
72334
TLTSCSRLPINLTDKLSSCTFAIIAGAAFGEKCKDQDAFILVLKETLELLFGLCVTNM 72507
72508
YPSVKWLDLISGMRYKIEKVFQRTDRILQNIVDEHRDKMQTEAGKLQGEENIVDVLLKIQ 72687
72688
QHGDHEFPLTDNNIKSXXX 72735
74253
DIFAGGGETTSISVKWAISEMLKNX 74324
74320
RMMEKAQAEVRRVFDGQGNADEELKFLKGVVKETLRLHPPLPLLIPRECREMCEINRYEI 74499
74500
PKKTLIIINAWAIGRDSNYWVEAERFYPDRFLDSSIDYKGTDFGYIPFGAGRRMYHGILF 74679
74680
SLPIIELSLAHLLYHFDWKLPNGMKA*DLDMTEALGLVVRRKQDLHLIPILDNPLHAQ* 74856
>CYP71BE12 CAAP02000216.1d one frameshift 83% to CAAP02001833.1a
86311
MDFQFSSILFAFLLFLYMLYKMGERSKASISTQKLPPGPWKLPLIGNMHQLVGSLPHQS 86487
86488
LSRLAKQYGPLMSLQLGEVSTLIISSPDMAKQVMKTHDINFAQRPPLLASKILSYDSMDI 86667
86668
VFSPYGDYWRQLRKICVVELLTAKRVKSFQLVREEELSNLITAIVSCSRPINLTENIFS 86844
86845
STFSIIARAAIGEKFEGQDAFLSVMKEIVELFSGFCVADMYPSVKWLDLISGMRYKLDKV 87024
87025
FQRTDRMLQNIVDQHREKLKTQAGKLQGEGDLVDVLLELQQHGDLEFPLTDNNIKAVIL (0) 87201
87442
DIFSGGGETTSTSLDWAMSEMLENPRVMEKAQAEVRRVFDGKGNVDE 87582
87583
TGLDELKFLKAVVKETLRLHPPLPLLVPRECREMCEINGYEIPKKTSIIVNAWAIGRDSD 87762
87763
YWVEAERFYPERFLDSSIDYKGTDFGYIPFGAGRRMCPGILFSMPSIELSLAHLX 87924
87927
HFDWKLPNEMKAEDLDMTEAFGLAVRRKQDLLLIPIPHNQSHAQ* 88061
>CYP71BE13-de2b CAAP02001833.1a
pseudogene 94% to CAN81963.1
10995
GHVDENAIDELKFLKAVVKETLRLHPPFPILLPRECREMRKINGYRIPEKTRIIVNAWA 11171
11172
IG*DSDYWVEAERFYPERFLDSSIDYKGADFGYIPFGAGRRICPGILFAMPNIELPLAYL 11351
11352
LYHFDWKLPNGMKAEDLDMTEAFGLAVRRKQDLHLIPIPYKP 11477
>CYP71BE13 CAAP02001833.1b 69% to 71BE1
15131
MDVLFSSILFASLLFLYMLYKIGKRWRGNISSQKLPPGPWKLPLIGNMHQLIDGSLPHHSLSRLA 15325
15326
KQYGPLMSLQLGEISTLIISSPEMAKQILKTHDINFAQRASFLATNTVSYHSTDIVFSPY 15505
15506
GDYWRQLRKICVVELLTSKRVKSFQLIREEELSNLITTLASCSRLPINLTDKLSSCTFAI 15685
15686
IARAAFGEKCKEQDAFISVLKETLELVSGPCVADMYPSVKWLDLISGMRHKIEKVFKRTD 15865
15866
RILQNIVDEHREKMKTEAGKLQGEEDLVDVLLKLQQHGDLEFPLTDNNIKAVIL (0) 16027
16320
DIFAGGGETTSISVEWAMSEMLKNPRVMDKAQAEVRRVFDGKGNADEELKFLKVVV 16487
16488
KETLRLHPPFPLLIPRECREMCEINGYEIPKKTLIIVNAWAIGRDSDHWVEAERFYPERF 16667
16668
LDSSIDYKGTDFGYIPFGAGRRMCPGILFSLPIIELSLAHLLYNFDWKLPNGMKADDLDM 16847
16848
TEALGIAVRRKQDLHLIPIPYNPSHVQ* 16931
>CYP71BE14P CAAP02008751.1 CYP71BE pseudogene 64% to 71BE1
6182
IQLTVSTLVVSSPEIAKEFMKTDDVSFAQRPNILVTSIVSYGSTNIGFAPYSDYWRQVR 6006
6005
KLCATELLSAKRVKSFQLIREEEVSNVIKRIASHSGSTINLSEEISSVTLPL 5850
5850
IARAAFGKICKDQDSFIGAVTEMAELATGFCAADVFPSVK*VDQVTGIRSKLEKLHERVD 5671
5670
RILQNIVKEHKESMTTKRGKLEAEDLVDTFLKIQEDGDLKFPLTENNVKAVIL (0) 5512
DMFSG
5255
AGETSSTVGEWAMTELIRHPRVMEKAQ 5175
5178
TRVRREFAGKGTVEESGIHELKFIKAVVKETLRLHPPAPLLLPRECRERC 5029
5028
EINGYEIPVKTRVIDNA*AIGRDPDSWTEPERFNPERFLDSWLDYKGTDFEFIPFGAGRR 4849
4848
MCPDMSFAIPSVELSLANFIYHFDWKLPTGIKPEDLDMTEIISLSVRRKQNLHLIPIPYN 4669
4668
PFPAE* 4651
>CYP71BE15P CAAP02007291.1 pseudogene 78% to 71BE1
6388
MEFSSSSVLFPFLLFLFMLFRIGKRSKPNISTPKLPPGPWKLPLIGNLHQLVGSLPHHSL 6567
6568
KDLAEKYGPLMHLQLGQVS 6624
6627
ASPQIAKEVMKTHDLNFAQRPHLLVTRIVTYDSTDIAFAPYGDYWRQLRKICVIELLSAK 6806
6807
RVRSFQLIRKEEVSNLIRFIDSCSRFPIDLREKISSFTFAVISKAALGKEFKEQDSLESV 6986
6987
LEEGTKLASGFCLADVYPSVKWIHLISGMRHKLEKLHGRIDG 7112
7111
EHRERMEKRTGELEAEEDFIDVLLKLQQDGDLELPLTDDNIKAVIL 7248
7688
GHATASTAVEWAMSEMMKNPRVMEQAQAEVRRVFDGKGDVDETGIDELKFLKAVVSETLR 7867
7868
LHPPFPLLLPRECREKCKINGYEVPVKTRMTINAWAIGRDPDYWTEAERFYPERFLDSSV 8047
8048
DYKGADFGFIPFDAGRRMCPGILFAIPSIELPLAHLLFHFDWELPNGMRHEDLDMTEVHG 8227
8228
LSAKRKHSLHLIPIPYNS*PVG* 8296
>CYP71BE16P CAAP02000648.1 pseudogene 66% to CYP71BE13
1591
QVHQRLDRILQNIIDEHKESKTTTETGKQEANEDLVDILLKLQKHGNFGFPLIDNNIKAIIL (0) 1776
2908
NIFGGGGETSSIAIEWAM*KMM 2973
2975 KNPRVMEKA*AKVRQIFGGKKTLR*WMKQV*DTL
3077
KTKVIINAWAIGRDPYYQTKAKRFHPE*FLDSPIDYKGNNFEYIPFGAGKRICPGILFAIPNIELPL 3277
3278
ANMLDHFDWELLYGMKKDDIDMTESFGLKVRRKQDLCLILIPHNPLHVE* 3427
>CYP71BG1 Solanum tuberosum
DR034423.1, BQ514535.1, BM114062.1, BQ119583.2, BQ506191.2, CK717210.1
67% to CYP71BG2P
MEASILQLLLLLSLTSCTILFYKIRRWRRPPSPPSLPIIGHLHLLTDMPHHTFFHLSQKLG
PIIHLQLGQIPTLIISSPRLAELILKTNDHIFCSRPQIIAAQYLSFGCSDITFSPYGPYWRQARKICVTE
LLSSKRVNSFQFIRNEEINRMIQLISSHFDSELSSELDLSQVFFALANDILCRVAFGKRF
IDDRLKDKDLVSVLTETQALLAGFCLGDFFPDWEWVNWLSGMKKRLMNNLKDLGEVCDEI
IDEHLMKKRDDDQNGDGSEDFVDVLLRVQKRDDLQVPITDDNLKALIL
(0)
DMFVAGTDTSAATLEWTMTELARHPSVMKKAQDEVREIAAN
KGKVEEFDLQHLHYMKAVIKETMRLHPPVPLLVPRESIEKCTLDDYEIPAKTRVLINTY
AIGRDPEYWNNPLDYNPERFMEKDIDFRGQDFRFLPFGGGRRGCPGYALGLATIELSLAR
LLYHFDWKLPTGVEAQDVNLSEIFGLATRKRVALKLVPTINKLYLLSD*
>CYP71BG2 tomato breaker fruit Solanum lycopersicum
BM411522.1, BM412569.1, BP881630.1, ES895470.1, DB685010
DU947425.1 (GSS)
MEASILQLLLLLSLTSCTILFYKIRGRWRRRPPSPPSLPIIGHLHLLNQMPHHTFFNLSQ
KLGKIIYLQLGQIPTLIISSPRLAELILKTNDHIFCSRPQIIAAQYLSFGCSDITFSPYG
PYWRQARKICVTELLSSKRVHSFEFIRDEEINRMIELISSRSQSEVDLSQVFFGLA
NDILCRVAFGKRFIDDKLKDKDLVSVLTETQALLAGFCFGDFFPDFEWVNWLSGMKKRLM
NNLKDLREVCDEIIKEHLMKNRDDDGSEDFVDVLLKVQKRDDLQVPITDDNLKALI
LDMFVAGTDTSAATLEWTMTELAR
HPSVMKKAQNEVRKIVANRGKVEEFDLQHLHYMKAVIKETMRLHPPVPLLVPRESIEKCS
IDGYEVPAKTRVLINTYAIGRDPEYWNNPLDYNPERFMEKDIDLRG
QDFRFLPFGGGRRGCPGYALGLATIELSLARLL
YRFDWKLPSGVEAQDMDLSEIFGLATRKKVALKLVPTITKLYPTF*
>CYP71BG3 CA993587.1 Gossypium hirsutum CO128388.1 Gossypium raimondii
DRKWLNSRSQSLTPPSPPS
LPIIGHLHLLTDMPHHTFTILAQKLGPIIYLQLGQVPTVIVSSPRLARLILKTHDHVFSN
RPQLVSAQYLSFNCSDVTFSPYGPYWRQARKICVTELLSSKRVNSFQLIRDEEVSRLL
TTLSAHPGSEVNVSELFLSLANDILCRVAFGRRFTERVGSSNHLAAVLRETQELFAGM
SVGDFFPEWEWVHSVSGYKRRLMKNLNELRRVCDEVIQEHLQRGETGIKEDFVDVLLR
VQKQDNLEVPITDDNLKALVLDMFVAG
TDTSAATLEWTMTELVKHPEIMK
QAQEEVRAVARRTGKAIDETHLQHLHFTKSIIKEAMRLHPTVPLLVPRESMDECIIDGYK
IPPKTRLLINTYAIGRDPNSWDNPLQFNPNRFQDSNIDLKDQDFRFLPFGGGRRGCPGYG
FGLATVEIALARLLFHFDWELPYGIHTDDVDVDEIFGLASRKRTPLILVPTVNEGL*
>CYP71BG4 DY280303.1, DY276238.1, Citrus clementina,
Citrus reticulata x Citrus temple EX448715.1 (C-term)
Hybrid: sequences from two different citrus species are combined here
to try to achieve a full-length seq. Still missing the C-term.
63% to 71BG1, 64% to 71BG3, 63% to 71BG2
MMDSFTPQVLLPLFVVSIITLLYWKLLS
RSRSQPATAANTPPSPPNKYPIIGHLHLLTDMPHHTFAALADKLGPIFHLQLGQVPTVVI
SSSELAKLVLKTHDHVFASRPQLIADQYISFGCSDVTFASYGPYWRQVRKICVTELLSSK
RVGSFQAVRDEEVKRLLTSVKSQCGSVTDMSKLFFTLANDILCLAAFGMRYVNEEGKKSN
NLASVFTESQELLSGFCIGDFFPEW
GWLSSLSGFTRRLRKNTQDLTVAIDEIISEHLFRKQATDDSGSSLMDGDGD
FIDVLLRVQQRDDLEVPITDDNLKALVLDMFMA
GTDTTAATMEWTMTELARHPRVMKKAQEEVRRVASGGGEVNESHIQQLRYMKAVIKETMR
LHPTVPLLVPRESMEKCVLEGYEIPAKTRILINSYAIGRDPKSWENPLEYIPERFDENNI
DFKDQDFRLLPFGGGRRGCPGYSFGLATVETALARLLYHFDWALPPGV
@8
>CYP71BG5P CAAP02000323.1 pseudogene 48% to CAAP02001743.1a, 52% to 71P1 rice
like potato and cotton ESTs, 67% to 71BG1, 65% to 71BG2, 65% to 71BG3
7816
LPPSPPPLPIIGHLHLLTDMPHHSLSDLALKLGPIIHPRLGQVATVVVSSARLAALVLKT 7995
7996
HDHVFASRPPLTAAQYLSFGCSDVTFSPHGTYWRQARKICVTELLSPKRVTYFQFIRNEE 8175
8176
THPPPHHLPSSLSALSGSETDMSQLFFTLANNLCRVAFGKRFMDDSEGEKKHMVDVLTE 8352
8353
TQALFAGFCIGDFFPDWKWLNSITGLNRRLRKNLEELIAVCNEIIEEHVNEKKERED 8523
8524 FVGVLLRVQKRKDLEVAITDDNLKALVL (0)
8607
9110 DMFVAGTDT 9136 XXXXXXXXXX
9142 ELARHPHVMKKAQQEVRNIASGEGKVEETHLHQLHYKKAVIK*TMRLHPPVPLLVPRQSM
9321
9322 ENCILDGYEIPAKIQLLINTYAIGCVPQSWE
9414
10537
NPLDYNPKRFVDGDVDFKGQDSGFLPFGGGRRGCPSYSFGLATVEIALARLLYHFDWELP 10716
10717
HGVEADDMDLNEIFGLATRKNSGLILVPRY 10806
CYP72 family (22 genes) [21 pseudogenes]
CYP72A subfamily (18 genes) [20 pseudogenes]
>CYP72A85 CAAP02000598.1a 65% to CAAP02002795.1
68% to CAAP02000473.1
GSVIVT00009392001 on Genoscope browser
chrUn_random from 57641820 to 57646375 (4556bp) on strand +
no P450 neighbors
35540
MEIAYDSVLIFCAFALLSLAWRAFYLVWLRPRRLERCLRRQGLMGNSYRPLHGDAKKVSIMLKEA 35734
35735
NSRPINLSDDIVPRVIPFLYKTIQQY (1) 35812
37936
GKNSFTWVGPIPRVNIMKPELIREVFLEAGRFQKQKPNPLANFLLTGL 38079
38080
VSYEGEKWAKHRKLLNPAFHVEKLK (0) 38154
38642
LMSPAFHLSCRQMISKMEEMVSPEGSCELDVWPFLKNLTADALSRTAFGSSYEEGRRLFQL 38824
38825
LQEQTYLTMEVFQSVYIPGW (2) 38884
39056
YLPTKRNKRMKKIDKEMNTLLNDIITKRDKAMKDGKTANEDLLGILMESNSKEIQEGGN 39232
39233
SKNAGISMQEVIEECKLFYLAGQETTSNLLLWTMVLLSKHPNWQTLAREEVFQVFGKNKP 39412
39413
EFAGLSRLKV (0) 39442
39670
VTMIFYEVLRLYPPGATLNRAVYEDINLGELYLPSGVEIVLPTILVHHDPEIWGDDVKEF 39849
39850
KPERFSEGVMKATKGQVSYFPFGWGPRICIGQNFAMAEAKMALAMILQCFTFELSPSYTH 40029
40030
APTSVLTLQPQYGAHLILHKI* 40095
$$$$
>CYP72A86P CAAP02000598.1b pseudogene exons 4 and 5,
84% to CAAP02002686.1
117092
FFPTKTNKRMKQISKEVH 117039
117038
ALLGGIINKREKAMEAGETANSDLLGILMESNFREIQEHQNNTKIGMSAKDVIDECKLFY 116859
116858
LAGQETTSVLLLWTMVLLSQHPDWQARAREEVLQVFGNNKPENDGLNHLKI (0) 116706
116303
VTMIFHEVLRLYPPVTVLTRMVSKDTQVGDMYFPAGVQVSLPTILVHHDHEIWGDDAKEF 116124
116123
NPERFAEGVSKATKNQVSFLPFGWGPRVCIGQNFAMMEAKIALAMILQRFSFELSPSYAH 115944
115943
APYSLITIQPQYGAHLILRGL* 115878
>CYP72A86P CAAP02000983.1 pseudogene 84% to CAAP02002686.1
3 aa diffs to CAAP02000598.1b
chrUn_random
from 3711617 to 3699111 on strand -
GSVIVT00000151001 first two lines
GSVIVT00000150001 C-term part
16119
TGDVISRTAFGSSYEEGRRIFQLQKEQTYLAIKVAMSVYIPGWR 15988
4832 FFPTKTNKRMKQISKEVHALLGGIINKREK 4743
AMEAGETANSDLLGILMESNFREIQEHQN 4655
4654 NTKIGMSAKDVIDECKLFYLAGQETTSVLLLWTMVLLSQHTDWQARAREEVLQVFGNNKP
4475
4474 ENDGLNHLKI 4445
4035
VTMIFHEVLRLYPPVTVLTRMVSKDTQVGDMYFPAGVQVSLPTILVHHDHEIWGDDAKEF 3856
3855
NPERFAEGVSKATKNQVSFLPFGWGPRVCIGQNFAMMEAKTALAMILQRFSFELSPSYAH 3676
3675 APFSLITIQPQYGAHLILRGL 3613
$$$$
>CYP72A86P-ie5b CAAP02000598.1b-ie5b pseudogene
internal exon 5 fragment
chrUn_random
from 3699666 to 3699734 on strand -
116504
IMMIFHEVLKLLYPLYT*HHAMH 116436
$$$$
>CYP72A87 CAAP02000355.1a one stop codon, possible pseudogene 93% to CAN67740.1
2 aa diffs to CAN71061.1 exons 4 and 5
GSVIVP00005878001
Genoscope browser version stops at PERFS*
chrUn_random
from 35354787 to 35357214 (2422bp) on strand +
35357215 to 35357439 continues to the true end
52926
MEMKQLNLVALSFAFITILIYAWRILNWMWLRPKRLERCLKQQGLAGNSYRLLHGDFKEM 53105
53106
FMMIKEASSRPISISDDIVQRITPFHYHSIKKY (1) 53204
53464
GKSSFIWMGPKPRVNIMEPELIRDVLSMHTVFRKSRVHAFVKLLVSGLPFLDGEKWA 53635
53636
KHRKIINPAFRLEKLK (0) 53683
53868
NMLPAFHLSCSDMISKWEGKLSTEGSCELDVWPYLQNLTSDAISRTAFGSNYEEGRMIFE 54047
54048
LQREQAQLLVQFSDSAYIPGWW (2) 54113
54473
FLPTKSNKRMKQIRKEVNALLWGIIDKRGKAMKAGETLNDDLLGILLESNFKEIQEHEN 54649
54650
DKNVGMCIKDVIEECKIFYFAGQETTSALLLWTMVLLSKHPNLQARAREEVLHVFGNNKP 54829
54830
EGDGLNHLKI (0) 54859
55153
VMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIW 55311
55312
GEDAREFNPERFS*GVLKATKSPVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFS 55491
55492
LSPSYSHAPCSLVTLKPQHGAHLILHGI* 55578
>CYP72A87 gi|147816916|emb|CAN71061.1| 60% to 72A15
2 aa diffs to CAAP02000355.1a
MXXEALNRGVM
FLPTKSNKRMKQIRKEVNALLWGIIDKRGKAMKAGETLNDDLLGILLESNFKEIQEHEN
DKNVGMCIKDVIEECKIFYFAGQETTSALLLWTMVLLSKHPNWQARAREEVLHVFGNNKPEGDGLNHLKI
VMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIWGEDAREFNPERFSQGVL
KATKSPVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFTLSPSYSHAPCSLVTLKPQHGAHLILHG
I
$$$$
>CYP72A87-de1b CAAP02000355.1b pseudogene 97% to CAN67740.1
3 aa diffs to CAN67740.1
chrUn_random from 35362457 to 35362735 on strand +
60596
MEMKQLNLVALSFAFITILIYAWRVLNWMWLRPKRLERCLKQQGLAGNSYRLLYGDFKEM 60775
60776
SMMIKEATSRPISISDDIVQRVAPFHYHSIKKY (1) 60874
>CYP72A88 CAAP02000355.1c 94% to CAN67740.1, 97% to CAAP02000355.1d
GSVIVP00005881001 in Genoscope browser
chrUn_random from 35380636 to 35383685 (3050bp) on strand +
78889
MEMKQLNLVALSFAFITILIYAWRILNWMWLRPKRLERCLKQQGLAGNSYRLLHGDLKEM 79068
79069
FMMIKEASSRPISISDDIVQRIAPFQYHSIKKY (1) 79167
79982
GKSSFIWMGPKPRVNIMEPELIRDVLSMHTVFRKPRVHALVKLLVSGLLFLDGEKWA 80152
80153
KHRKIINPAFRLEKLK (0) 80200
80450
NMLPAFHLSCSDMISKWEGKLSTEGSCELDVWPYLQNLTGDAISRTAFGSNYEEGRMIFE 80629
80630
LQREQAQLLVQFSESAFIPGWR (2) 80695
80839
FLPTKSNKRMKQIRKEVNALLWGIIDKRGKAMKAGETLNDDLLGILLESNFKEIQEHEN 81015
81016
DKNVGMSIKDVIEECKLFYFAGQETTSVLLLWTMVLLSKHPNWQARAREEVLHVFGNNKP 81195
81196
EGDGLNHLKI (0) 81225
81519
VMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIWGEDAREF 81698
81699
NPERFSQGVLKATKSPVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFSLSPSYSH 81878
81879
APCSLVTLKPQYGAHLILHGI 81941
>CYP72A89 CAAP02000355.1d 95% to CAN67740.1
GSVIVT00005885001 in Genoscope browser
chrUn_random from 35415527 to 35417550 (2024bp) on strand +
112703
MEMKQLNLVALSFTFITILIYAWRILNWMWLRPKRLERCLKQQGLAGNSYRLLHGDFKEM 112882
112883
FMMIKEATSRPISISDDIVQRIAPFHYHSIKKY (1) 112981
113632
GKSSFIWMGPKPRVNIMEPELIRDVLSMHTVFRKPRVHALVKLLVSGLLFLDGEKWA 113802
113804
KHRKIINPAFRLEKVK 113851
114101
NMLPAFHLSCSDMISKWEGKLSTEGSCELDVWPYLQNLTGDAISRTAFGSNYEEGRMIFE 114280
114281
LQREQAQLLVQFSESAFIPGWR 114346
114705
FLPTKSNKRMKQNRKEVNELLWGIIDKREKAMKAGETLNDDLLGILLESNFKEIQEHGN 114881
114882
DKNVGMSIKDVIDECKIFYFAGQETTSVLLLWTMILLSKHPNWQARAREEVLHVFGNNKP 115061
115062
EGDGLNHLKI 115091
115384
VMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIW 115542
115543
GEDAREFNPERFSQGALKATKSLVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFS 115722
115723
LSPSYSHAPCSLVTLKPQYGAHLILHGI 115806
>CYP72A90 gi|147795635|emb|CAN67740.1| 55% to 72A15
95% to CAAP02000355.1d
no exact match in Genoscope
MEMKQLNLVALSFAFITILIYAWRVLNWMWLRPKRLERCLKQQGLAGNSYRLLYGDFKEMSMMIKEATSR
PISFSDDILQRVAPFHYHSIKKYGKSSFIWMGLKPRVNIMEPELIRDVLSMHTVFRKPRVHALGKQPASG
LFFLEGEKWAKHRKIINPAFRLEKLKNMLPAFHLSCSDMISKWEXKLSTXGSCEXDVWPYLQNLTGDAIS
RTAFGSNYEEGRMIFELQREQAQLLVQFSQSACIPGWRFLPTKSNKRMKQNRKEVNELLWGIIDKREKAM
KAGETLNDDLLGILLESNFKEIQEHGNDKNVGMSIKDVIDECKIFYFAGQETTSVLLLWTMVLLSKHPNW
QARAREEVLHVFGNNKPEGDGLNHLKIVMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPT
ILVHHDHEIWGEDAREFNPERFSQGALKATKSLVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFS
LSPSYSHAPCSLVTLKPQYGAHLILHGI
>CYP72A91P gi|147791559|emb|CAN72865.1| AM476150.2
52% to 72A15 95% to CAAP02000355.1c CYP72A88
has no exact match in Genoscope
MEMKQLNLVALSFAFITILIYAWRILNWMWLRPKRLERCLKQQGLAGNSYRLLHGDFKEMFMMIKEATSR
PISISDDIVQRIAPFHYHSIKKYGKSSFIWMGPKPRVNIMEPELIRDVLSMHTVFRKPRVHALVKLLVSG
LLFLDGEKWAKHRKIINPAFRLEKVK
NMLPAFHLSCSDMISKWD
(deletion of 7 aa)
SCELDVWPYLQNLTGDAISRTAFGSNYEKGRMIFE
LQREQAQLLVQFSESAFIPGWRFXPTKSNKRMKQIRKEVNALLWGIIDKRGKAMKAGETLNDDLLGILLE
SNFKEIQEHENDKNVGMSIKDVIEECKLFYFAGXETTSALLLWTMVLLSKHPNWQARAREEILHVFGNNK
PEGDGLNHLKIVMMILHEVLRLYPPVPFLARSVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIWGEDARE
FNPERFSQGVLKAMKSPVSFFPFGWGSQSCIGQNFAILEAKMVLAMILQRFSFSLSPSYSHAPSSLVTLI
PQYGAHLXLHGI
>CYP72A92 CAAP02000149.1a 90% to CAAP02002795.1
GSVIVP00005888001 in Genoscope browser
chrUn_random from 35487285 to 35497057 (9773bp) on strand -
33193
MKLSSVAISFAFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMLRM 33014
33013
ISEANSRSISLSDDIVQRVLPFHCHSIKKY (1) 32921
31022
GKNYFIWMGPKPVVNIMDPELIRDVFLKYNAFRKPPPHPLGKLLATGLVTLEGEQ 30858
30857
WTKRRKIINPAFHLEKLK (0) 30804
30164
HMVPAFQLSCSDMVNKWEKKLSKDGSCELDIWPDLENLAGDAISRTAFGSSYEEG 30000
29999
RRIFQLQKEQAHLAVKVFRSVYIPGWR (2) 29919
29675
FVPTKTNKRMRQISNEVHALLKGIIERREKAMKVGETANDDLLSLLMESNFREMQEHDE 29499
29498
RKNVGMSIKDVIEECKLFYFAGQETTSDLLLWTMVLLSKHSNWQARAREEVLQVFGNKKP 29319
29318
DGDGLNHLKI (0) 29289
24060
VTIIFHEVLRLYPPVSMLIRTVVADSQVGGWYFPDGALITLPILLIHHDHEIWGEDAKEF 23881
23880
NPERFSEGVSKATKGQFAFYPFGYGPRVCIGQNFAMMEAKMALAMILQRFSFELSPSYAH 23701
23700
APSNIITIQPQYGAYLILHGL* 23635
$$$$
>CYP72A93 CAAP02000149.1b 86% to CAAP02002795.1
GSVIVP00005893001 in Genoscope browser
chrUn_random from 35547921 to 35552540 (4620bp) on strand -
88676
MELISVAISFAFITLLIYAWRLLNWVWLRPKKLERCLRQQGITGNSYRLLHGDVREMLRM 88497
88496
ISEANSRPISLSDEIVQRVLPFHYHSLKKY (1) 88407
86832
GKNYFIWMGPKPVVNIMDPELIRDVFLRYNAFHKPAPHPLGKLLATGLVTLEGEQ 86668
86667
WTKHRKIINPAFHLEKLK (0) 86614
85979
HMVPAFQLSCGDMVNKWEKKLSKDGSCELDIWPDLENLTGDAISRTAFGSSYEEG 85815
85814
RRIFQLQKEQAHLAVKVFRSVYIPGWR (2) 85734
85493
FVPTKTNKRIRQIRNELHALLKGIIEKREKAMLVGETANDDLLSLLMESNFREMQEHDE 85317
85316
RKNVGMSIDDVIEECKLFYFAGQETTSDLLLWTMILLSKHSNWQARAREEILQVFGNKKP 85137
85136
DGNGLNHLKI (0) 85107
84572
VTMIFHEVLRLYPPVSMLIRTVFVDSQVGRWYFPVGSHVALPILLIHHDHEIWGEDAKEF 84393
84392
NPERFSEGVSKATKGGQFAFFPFGYGPRACIGQNFAMMEAKMALAMILQRFSFELSPSYA 84213
84212
HAPFNVITVQPQYGAHLILHGL* 84144
>CYP72A93 gi|147833897|emb|CAN66491.1| AM486124.1
62% to 72A15
4 aa diffs to CAAP02000149.1b
11981
MELISVAISFAFITLLIYAWRLLNWVWLRPKKLERCLRQQGITGNSYRLLHGDVREMLRM 11802
11801
ISEANSRPISLSDEIVQRVLPFHYHSLKKY () 11712
10111
GKNYFIWMGPKPVVNIMDPELIRDVFLRYNAFHKPAPHPLGKLLATGLVTLEGEQ 9947
9946
WTKHRKIINPAFHLEKLK 9893
9253
HMVPAFQLSCSDMVNKWEKKLSKDGSCELDIWPDLENLTGDAISRTAFGSSYEEG 9089
9088
RRIFQLQKEQAHLAVKVFRSVYIPGWR ()
8767
FVPTKTNKRIRQIRNELHALLKGIIEKREKAMXVGETANDXLLSLLMESNFREMQEHDE 8591
8590
RKNVGMSVXDVIEECKLFYFAGQETTSDLLLWTMVLLSKHSNWQARAREEILQVFGNKKP 8411
8410
DGNGLNHLKI (0) 8381
7846
VTMIFHEVLRLYPPVSMLIRTVFPDSQVGRWYFPVGSHVALPILLIHHDHEIWGEDAKEF 7667
7666
NPERFSEGVTKATKGGQFAFFPFGYGPRACIGQNFAMMEAKMALAMILQRFSFELSPSYA 7487
7486
HAPFNVITVQPQYGAHLILHGL* 7418
$$$$
>CYP72A94P CAAP02000149.1c pseudogene exon 4 only, 79% to CAAP02001786.1
chrUn_random from 35615448 to 35615834 on strand -
152090
FFPTKTNKRMKQISKEVHALLRGIINKREKAMEAGETANSGLLGILMESNFKEIHEHQN 151914
151913
NMKIGMSAKDVIDECKLFYLAGQETISVLLLWTMVLPSQHSDWQARAREEV*QVFGNNK 151737
151736
RQNDGLNHLKI (0)
$$$$
>CYP72A95 CAAP02000149.1d frameshift in exon 5, possible pseudogene
same as CAN72247.1, 94% to CAAP02002686.1 another pseudogene
GSVIVT00005897001 in Genoscope browser
chrUn_random from 35636714 to 35642885 (6172bp) on strand -
not correctly assembled, only contains C_term from MMEAK to end
184279
MKHSSVAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMSM 184100
184099
MLKEAYSRPISLSDDIAPRVLPFHCHFIKKY (1) 184007
183058
GKNFFAWFGPNPMVNIMEPELIRDILLKSNVFQKPPPHPLGKLLVSGLVTLEGERWA 182888
182887
KRRKNINPAFHLEKLK (0) 182840
182021
NMLPAFHLSCSDMVTKWKMLSVGGSCELDVWPYLENLTGDVISRTAFGSSYEEGR 181857
181856
RIFQLQKEQTHLAIQVTMSVYIPGWR (2) 181779
180119
FLPTKTNRRMKQISKEVYALLRGIVNKREKAMKAGETANSDLLGILMESNFREIQEHQN 179943
179942
NKKIGMSVRDVIEECKLFYLAGQETTSVLLVWTMVLLSEHPNWQARAREEVLQVFGNKKP 179763
179762
EADGLNHLKI (0) 179733
179322
VTMIFHEVLRLYPPIAMLARAVYKDTQVGDMCFPAGVQVRP 179200
179203 PTILVHHDHEIWGDDAKEFNPERFAEGVLKATKNQVSFFPFGWGPRVCIGQNFAMMEAKI
179024
179023
ALAMILQHFSFELSPSYAHAPFNILTMQPQYGAHLILRGLQC* 178895
>CYP72A95 gi|147815271|emb|CAN72247.1|
50% to 72A10, = CAAP02000149.1d, cyan part too long
MKYQKVQIXWSSRAGSTLRHLPRCEGCELSLEALKKSLKLE
MKHSSVAISFGFLTVLISCLWRLLNWVWL
RPKRLERCLREQGLAGNSYRLLHGDFKEMSMMLKEAYSRPISLSDDIAPRVLPFHCHFIKKYGKNFFAWF
GPNPMVNIMEPELIRDILLKSNVFQKPPPHPLGKLLVSGLVTLEGERWAKRRKNINPAFHLEKLKNMLPA
FHLSCSDMVTKWKMLSVGGSCELDVWPYLENLTGDVISRTAFGSSYEEGRRIFQLQKEQTHLAIQVTMSV
YIPGWR
$$$$
>CYP72A96 CAAP02000473.1 97% to CAAP02000149.1d, bad boundary at RKNFF
GSVIVP00000189001 on Genoscope browser (missing N-term)
chrUn_random from 4489657 to 4495024 on strand -
52293
MKHSSVAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMS 52117
52116
MMLKEAYSRPISLSDEIAPRVLPFHCHFIKKY (1) 52021
51446
RKNFFAWFGPNPMVNIMEPELIRDVLLKSNVFQKPPPHPLGKLLVSGLVTLEGERWA 51276
51275
KRRKIINPAFHLEKLK (0)
50085
NMLPAFHLSCSDMVTKWKMLSVGGSCELDVWPYLENLTGDVISRTAFGSSYEEGR 49921
49920
RIFQLQKEQTHLAIQVTMSVYIPGWR (2) 49843
48155
FLPTKTNRRMKQISKEVYALLRGIINKREKAMKAGETANSDLLGILMESNFREIQEHQN 47979
47978
NKKIRMSVKDVIEECKLFYLAGQETTSVLLVWTMVLLSEHPNWQARAREEVLQVFGNKKP 47799
47798
EAAGLNHLKI (0) 47769
47357
VTMIFHEVLRLYPPVAMLARAVYKDTQVGDMCFPAGVQVVLPTILVHHDHEIWGDDAKEF 47178
47177
NPERFAEGVLKATKNQVSFFPFGWGPRVCIGQNFAMMEAKIALAMILQHFSFELSPSYAH 46998
46997
APFSILTMQPQYGAHLILRGLQC* 46926
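The start and end coordinates printed around each translated line can be cross-checked against the residue count, since each residue (and, in the lines above, the terminal stop "*") spans three nucleotides. The sketch below is an illustrative check under that assumption; the function name expected_end is hypothetical. It only applies to lines made of whole codons, not to segments interrupted by a frameshift or a split codon at an exon boundary.

```python
def expected_end(start, residues, strand):
    """Return the coordinate of the last nucleotide of a translated segment.

    `residues` is the amino acid string as printed; '*' (stop) is assumed
    to occupy one codon like any other residue.
    """
    span = 3 * len(residues)
    if strand == "+":
        return start + span - 1
    if strand == "-":
        return start - span + 1
    raise ValueError("strand must be '+' or '-'")


# Two minus-strand segments from the CYP72A96 record above.
seg_a = "VTMIFHEVLRLYPPVAMLARAVYKDTQVGDMCFPAGVQVVLPTILVHHDHEIWGDDAKEF"
seg_b = "APFSILTMQPQYGAHLILRGLQC*"

print(expected_end(47357, seg_a, "-"))  # 47178
print(expected_end(46997, seg_b, "-"))  # 46926
```

A mismatch between the computed and printed end coordinate suggests either a transcription error in the coordinate or a frameshift inside the segment.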
>CYP72A97P CAAP02002686.1 pseudogene, 76% to CAAP02004668.1 CYP72A
94% to CAAP02000149.1d
GSVIVP00000152001 in Genoscope browser not assembled correctly
Only exons 6 and 7 correct in this model (VFGN-HRAV)
chrUn_random 3749666 to 3760267 on strand -
13105
MKHSSVAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMSM 12926
12925
MLKEAYSRPISLSDDTTPRVLPFHFHFIKKY 12833
11910
GKNSFAWFGPNPMVNIMEPELIRDVLLKSNVFQKPPPHPLGKLLVSGLVTLEGERWA 11740
11739
KRRKIINPAFHLEKLK 11692
10658
NMLPAFHLSCSDMVTKWKMLSVGGSCELDVWPYLENLTGDVISRTAFGSSYE 10503
10502
EGRRIF*LQKEQTHFASQ 10449
5472
VTMSVYIPGWR 5440
3730
FYPQRRNRRMKQISKEVYALLRGIVSNREKAMKAGETASSDLLGILMESNFREIQEHQNN 3551
3550
KKIGMSVKDVIEECKLFSLDGQETTSVLLVWTMVLLSEHPNWQACAREEVLQ 3395
3395
VFGNKKPEADGLNHLKI 3345
2933
VTMIFHEVLRLYPLVAMLHRAV 2868
2866
YKDTQVGDMCFPVGVQVVLPTILVHHDHEIWGDDAKEFNPKRFAEAVLKATKNQVSFFPF 2687
2686
GWGPRVCIGQNFAMMEAKIALAMILQHFSFELSPSYAHAPFSILTMQPQYGAHLILRGLQC* 2501
>CYP72A97P-ie5b CAAP02002686.1a-ie5b see
CAAP02000598.1b-ie5b
3152
IMMIFHEVLKL 3120
>CYP72A98 gi|147777099|emb|CAN63404.1| AM456876.2
49% to 72A14
86% to CAAP02002686.1, 88% to CAAP02000149.1d
no exact match in Genoscope
MKHSSVAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMSMMLKEAYSRPI
SLSDDIAPRVLPFHCHFIKKY
()
GKNSFAWFGPNPMVNIMEPGLIRDVLLKSNVFQKPPPHPLGKLLVSGLV
TLEGERWAKRRKIINPAFHLEKLK
()
NMLPAFQLSCSDMVTKWKKLSVGGSCELDVWPXXXXXXXX
VISRTAFGSSYEEGRRIFQLQKELTHLASQ
VTMSVYIPGXR
()
FLSTKMNRRMKXISKEVYALLRGIINKREKAMKAGKXANSEXLLGILMESNFREI
QEHQNNKKIGMSAKDXIEECKLFYLAGQETTSVLLLWTMFLLSEHPNWQACAREEVLQVFGKK
KPEADGLNHLKI
VTMIFHEVLRLY
PLVAMLNRAVYKDTQVGDMYFPARVQVALPTILVHHDHEIWGDNAKGFDPERFAEGILKATKTSSA
(Deletion)
CIGQNFAMMEAKIALAMILQHFSFELSPSYAHAPFNILTMQPQYGVHLILRGLQC
$$$$
>CYP72A99 CAAP02004338.1a runs off the end 84% to CAN72247.1
100% to CAO16049.1 end is 98% to CAAP02000983.1
1697
MKLSSVAISFGFLTVLISCVWRLLNWVWLRPKRLERCLREQGLAGNSYRLLQGDSKEMSR 1518
1517
MMKEAYSRPISLSDDIVQRVLPFHCHFIKKY (1) 1425
172
GKNFFTWVGPSPRVNIMEPELMRDVLLKSNIFQKTPSHPLVKLLVSGLVALEGEQWA 2
>CYP72A99 gi|157327641|emb|CAO16049.1| unnamed protein product [Vitis vinifera]
4 aa diffs to CAAP02000983.1 from TDGE to end
GSVIVP00009398001 in Genoscope
chrUn_random from 57722038 to 57738734 on strand -
MKLSSVAISFGFLTVLISCVWRLLNWVWLRPKRLERCLREQGLAGNSYRLLQGDSKEMSRMMKEAYSRPI
SLSDDIVQRVLPFHCHFIKKYGKNFFTWVGPSPRVNIMEPELMRDVLLKSNIFQKTPSHPLVKLLVSGLV
ALEGEQWAKRRKIINPAFHPEKLKNMLSAFHLSCSDMVNKWKKLSVEGSCELDVWPYLENL
TGDVISRTA
FGSSYEEGIRIFQLQKEQTYLAIKVAMSVYIPGWRFFPTKTNKRMKQISKEVHALLGGIINKREKAMEAG
ETANSDLLGILMESNFREIQEHQNNTKIGMSAKDVIDECKLFYLAGQETTSVLLLWTMVLLSQHPDWQAR
AREEVLQVFGNNKPENDGLNHLKIVTMIFHEVLRLYPPVTVLTRMVSKDTQVGDMYFPAGVQVSLPTILV
HHDHEIWGDDAKEFNPERFAEGVSKATKNQVSFLPFGWGPRVCIGQNFAMMEAKIALAMILQRFSFELSP
SYAHAPYSLITIQPQYGAHLILRGL
$$$$
>CYP72A100P CAAP02004338.1b 90% to CAAP02004439.1
100% to CAO16050.1 = CU459449.1
32686
SLPQAT
MIFHKVLRLYPLVAMLPRVVYKDTQVGDMCFPAGVQVLLSTILVHHDHEILGDD 32507
32506
AKEFNPERFAEGVLKATKNQVSFFPFGWGPRVCIGQNFAMMEAKIALAMIL*HFSFELSP 32327
32326
SYTHASFSILTMQPQYGAHLILRGLQC* 32243
>CYP72A100P gi|157327642|emb|CAO16050.1| unnamed protein product [Vitis vinifera]
beginning = CAAP02001786.1
92% to CAAP02000473.1
GSVIVP00009400001 in Genoscope browser not correctly assembled
57770331 to 57770582 on strand - NY* to CKLF
57769280 to 57770326 on strand - YLAG to end
NY*HAHRFLPTKMNRRMKQISKEVYALLRGIINKREKAMKAGKTANSDLLGILMESNFREIQEHQNNKKIGMS
VKDVIEECKLF
YLAGQKTTSVLLVWTMALLSEHPNWQAHAREEVLQVFGNKKWEVDGLNHLKIAT
MIFHKVLRLYPLVAMLPRV
VYKDTQVGDMCFPAGVQVLLSTILVHHDHEILGDDAKEFNPERFAEGVLKATKNQVSFFPFGWGPRVCIG
QNFAMMEAKIALAMIL*HFSFELSPSYTHASFSILTMQPQYGAHLILRGLQC*
>CYP72A101PX = CYP72A100P
CAAP02001786.1 pseudogene 89% to CAAP02004439.1 CYP72A
57770174 to 57770582 on strand -
note CYP72A100P may be identical to 72A101P (merge)
426
NY*HAHRFLPTKMNRRMKQISKEVYALLRGIINKREKAMKAGKTANSDLLGILMESNFRE 247
246
IQEHQNNKKIGMSVKDVIEECKLFY 172
170
YLAGQKTTSVLLVWTMALLSEHPNWQAHAREEVLQVFGNKKWEVDGLNHLK 18
$$$$
>CYP72A102P CAAP02004439.1 pseudogene 83% to CAAP02000101.1
91% to CAAP02000473.1
GSVIVP00000178001 in Genoscope browser not correctly assembled
4296186 to 4297437 on strand -
2539
NY*HAHRFLPTKTNRKMKQISKEVYALLRGIVNKREKAMKVGETTNSDLLGMLMESNFRE 2360
2359
IQEHQNNKKIRISVKDVIEECKLFYLAGQKTTSVLLVWTMVLLSEHPN*QARAREEVLQV 2180
2179
FGNKKWEADGLNHLKI (0) 2135
1719
VTMIFHEVLRLYPPIAMLPRVVYKDTQVGDMCFPTGLQVVLPTILVHHDHEIWGDD 1552
1551
AKEFNPKRFVEGVLKVTKNQVSFFPFGWGPRVCIGQNFAMMEAKIALAMIL*HFSFELSP 1372
1371
SYTHASFNILTM*PQYGAHLILHGLQC* 1288
$$$$
>CYP72A103 CAAP02002795.1 87% to CAAP02004668.1
90% to CAAP02000149.1a
25135
MKLSSVAISFAFITLLIYAWRLLNSVWLKPKKIERYLRQ
25018
QGLIGNSYRLLHGDFREMSRMIDEANSRPISLSDDIVQRVLPFHYHSIKKY (1) 24866
24228
GKNCFIWMGPKPVVNIMEPELIRDVLLKHNAFQKPPVHPLGKLLATGVIALEGEQ 24064
24063
WTKRRKIINPAFHLEKLK (0) 24010
23631
HMVPAFQLSCSEMVNKWEKKLSKDGSCELDIWPDLENLAGDVISRTAFGSSYEE 23470
23469
GRRIFQLQKEQAHLAVQVSQSIYIPGWR (2) 23386
23080
FVPTKTNKRMRQISNEVNALLKGIIERREKAMKVGETANDDLLGLLMESNYKEMQEHGE 22904
22903
RKNVGMSNKDVIEECKLFYFAGQETTSVLLLWTMVLLSKHSNWQARAREEVLQVFGNKKP 22724
22723
DGDGLNHLKI (0) 22694
22147
VTMIFHEVLRLYPPASMLIRSVYADTEVGGMYLPDGVQVSLPILLLHHDHEIWGDDAKDF 21968
21967
NPERFSEGVSKATKGQFAFFPFGYGPRVCIGQNFAMMEAKMALAMILQRFSFELSPSYAH 21788
21787
APISVITIQPQYGAHLILHGL* 21722
>CYP72A103 gi|157356442|emb|CAO62605.1| unnamed protein product [Vitis vinifera]
identical to CAAP02002795.1
GSVIVP00000202001 in Genoscope browser
4708885 to 4712295 on strand -
MKLSSVAISFAFITLLIYAWRLLNSVWLKPKKIERYLRQQGLIGNSYRLLHGDFREMSRMIDEANSRPIS
LSDDIVQRVLPFHYHSIKKYGKNCFIWMGPKPVVNIMEPELIRDVLLKHNAFQKPPVHPLGKLLATGVIA
LEGEQWTKRRKIINPAFHLEKLKHMVPAFQLSCSEMVNKWEKKLSKDGSCELDIWPDLENLAGDVISRTA
FGSSYEEGRRIFQLQKEQAHLAVQVSQSIYIPGWRFVPTKTNKRMRQISNEVNALLKGIIERREKAMKVG
ETANDDLLGLLMESNYKEMQEHGERKNVGMSNKDVIEECKLFYFAGQETTSVLLLWTMVLLSKHSNWQAR
AREEVLQVFGNKKPDGDGLNHLKIVTMIFHEVLRLYPPASMLIRSVYADTEVGGMYLPDGVQVSLPILLL
HHDHEIWGDDAKDFNPERFSEGVSKATKGQFAFFPFGYGPRVCIGQNFAMMEAKMALAMILQRFSFELSP
SYAHAPISVITIQPQYGAHLILHGL
$$$$
>CYP72A104P gi|147798934|emb|CAN63796.1| AM469525.2 56% to 72A7
pseudogene
5 aa diffs plus some errors to CAAP02002795.1 (same with CAO62605.1)
no exact match in Genoscope, may be the same seq as CYP72A103
RFVPTXTNKRMRQISNEVNALLKGIIERREKxxEVGExxTSTANXXLLGLLMESNYKEMQEHDERKNVGMS
NKDVIXECKLFYFAGQETTSVLLLWTMVLLSKHSXWQARAREEVLQVFGNKKPDGDGLXHLKI
(0)
14303
VTMIFHEVLRLYPPASMIXX 14250
14251
SVYXDTEVGG
MYLPDGVXVSLPILLVHHDHEIWGDDAKDFNPERFSEGVSKATKGQFAFFPFGYGPRVCIGQNFAMMEAK
MALAMIVQRFSFELSPSYAHAPFSVITIQPQYGAHLILHGL
$$$$
>CYP72A105 CAAP02002402.1a 91% to CAAP02004668.1
no exact match in Genoscope
9868
MKLSSVAVSFAFITLLIFAWRLLNWVWLRPKKLERCLRKQGLTGNSYRLLHGDFREMSRM 10047
10048
NNEANSGPISFSDDIVKRVLPFFNHSIQKY (1) 10137
11178
GKNSFTWLGPKPVVNIMEPELIRDVLLKHNVFQKPPPHPLGKLLATGVVALEGEQW 11345
11346
TKRRKIINPAFHLEKLK (0) 11396
11706
HMVSAFQLSCSDMVNKWEKKLSMDDSCELDIWPYLQILTGDVISRTAFGSSYEEGRRIFQ 11885
11886
LQKEQAHLVAQVTQSVYVPGWR (2) 11951
12540
FFPTKINRRMRQIRNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYREMQENDE 12716
12717
RKNVGMSIKDVIEECKLFYFAGQETTSVLLLWTMVLLSKHSNWQARAREEVLQVFGNKKP 12896
12897
DGDGLNHLKI (0) 12926
13502
VTMIFHEVLRLYPPASMLIRTVFADSQVGGLYLSDGVLIALPILLIHHNHEIWGEDAKEF 13681
13682
NPGRFSEGVSKAAKTQVSF 13738
13737
FFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFDLSPSYAHAPSS 13874
13875
LLMQPQHGAHLILHGL* 13925
>CYP72A105 gi|147810740|emb|CAN67452.1|
64% to 72A15
3 aa diffs to CAAP02002402.1a
MVLLSKHSNWQARAREEVLQVFGNKKPDGDGLNHLKIVTMIFHEVLRLYPPASMLIRTVFADSQVGGLYL
PDGVLIXLPILLIHHNHEIWGEDAKEFNPGRFSEGVSKAAKTQVSFFPFGYGPRICVGQNFAMMEAKMAL
AMILQRFSFDLSPSYAHAPXSLLTMQPQHGAHLILHGL
$$$$
>CYP72A106P CAAP02002402.1b pseudogene
GSVIVP00011018001 on Genoscope Browser not assembled correctly
69155641 to 69158451 on strand +
48350
ISVAISFAFITLLIYAWRLLNWVWLRPKKLERCLRQQGITGNSYRLLHGDVREMLRMISE 48529
48530
ANSRPISLSDEIVQRVLPFHYHSLKKYGIAGFL 48628
49855
SRFVPTKTNKRMRQISNEVNALLKGSIERREKAMKVGEMREHDERKNVG 50001
50002
MSNKDVIKECKLFYFAGQETTSVLLLWTMVPLSKHSNWQGRAREEVLQVFGNKKPDGDG 50178
50179
LNHLK 50193
50799
VYADTEVGGMYLPDGVQVSLPILLVHHDHEIWGDDAKDFNPERFSEGVSKATKGQFAFFP 50978
50979
FGYGPRVCIGQNFAMMEAK 51035
50717
MIKLFSILQVTMIFHEVLRLYPPASMICLC
MALAMIVQRFS
51067
51068
FELSPSYAHAPFSVITIQPQYGAHLILHGL 51157
$$$$
>CYP72A107 CAAP02002484.1 96% to CAAP02004668.1 CYP72A
no exact match in Genoscope
10902
MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYSCLYGDFKEMSRM 11081
11082
INEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 11171
12213
GKNSFTWLGPKPVVNIMEPELIRDVFLKHNAFQKVPPHPLGKLLATGVVALEGEQW 12380
12381
TKRRKIINPAFHLEKLK (0) 12431
12655
HMVSAFQLSCSDMVNKWEKKLSLDGSCELDVWPYLENLAGDVISRTAFG 12801
12802
SSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWR (2) 12900
13371
FFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYRDMQENDE 13547
13548
RKNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQTRAREEVLRVFGNKKP 13727
13728
DGDGLNHLKI (0) 13757
14373
VTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYLPDGVQIALPILLLHHDHEIWGEDAKEF 14552
14553
NPGRFSEGVSKAAKTQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSYAH 14732
14733
APISLITMQPQYGAHLILHGL* 14798
>CYP72A107 gi|147791938|emb|CAN72443.1| gi|147791939|emb|CAN72444.1| AM462621.1
65% to 72A15
100% to CAAP02002484.1
Note CAN72443.1 is the N-terminal of the same gene
same as CAN68126.1 and 1 aa diff to CAAP02004668.1
adjacent to CAN72443.1
MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYSCLYGDFKEMSRMINE
ANSRPISFSDDIVQRVLPFHDHSIQKY
GKNSFTWLGPKPVVNIMEPELIRDVFLKHNAFQKVPPHPLGKLLATGVVALEGEQW
TKRRKIINPAFHLEKLK
()
HMVSAFQLSCSDMVNKWEKKLSLDGSCELDVWPYLENLAGDVISRTAFG
SSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWR
()
FFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYRDMQENDE
RKNVGMSIKDVIEECKLFYLAGQETTSVLLLWT
MVLLSKHSNWQTRAREEVLRVFGNKKPDGDGLNHLKI
(0)
VTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYL
PDGVQIALPILLLHHDHEIWGEDAKEFNPGRFSEGVSKAAKXQVSFFPFGYGPRICVGQNFAMMEAKMAL
AMILQRFSFELSPSYAHAPISLJTXXPQYGAHLILHGL
>CYP72A107 gi|147781059|emb|CAN68126.1|
AM465661.2 partial seq
66% to 72A15
1 aa diff to CAAP02004668.1
3 aa diffs to CAAP02002484.1
508
FFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYRDMQENDE 684
RKNVGMSIKDVIEECKLFYLAGQETTSVLLLWT
MVLLSKHSNWQTRAREEVLRVFGNKKPDGDGLNHLKI
(0)
VTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYL
PDGVQIALPILLLHHDHEIWGEDAKEFNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMAL
AMILQRFSFELSPSYAHAPISLLTTHPQYGAHLILHGL
$$$$
>CYP72A108 CAAP02004668.1 72% to CAN67740.1 CYP72A
96% to CAAP02002484.1
GSVIVP00011014001 on Genoscope Browser
69064443 to 69068470 on strand +
7761
MKLSSVAISFAFIVLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMS 7934
7935
EMINEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 8030
9054
GKNSFTWFGPKPVVYIMEPELIRDVLLKHNVFQKPPPHPLSKLLATGVVAL 9206
9207
EGEQWTKRRKIINPAFHLEKLK (0) 9272
9534
HMVSAFQLSCSDMVNKWEKKLSLDGSCELDVWPYLENLAGDVISRTAF 9677
9678
GSSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWR (2) 9779
10366
FFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYRDMQENDER 10545
10546
KNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQTHAREEVLRVFGNKKPD 10725
10726
GDGLNHLKI (0) 10752
11363
VTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYLPDGVQIALPILLLHHDHEIWGEDAKE 11539
11540
FNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSYA 11719
11720
HAPISLLTTHPQYGAHLILHGL* 11788
>CYP72A108 gi|147858656|emb|CAN80407.1|
40% to 72A10
2 aa diffs to CAAP02004668.1 and CAO21263.1
MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSXMINEANSRPIS
FSDDIVQRVLPFHDHSIQKYGEQWTKRRKIINPAFHXEKLKHMVSAFQLSCSDMVNKWEKXLSLDGSCEL
DVWPYLENLAGDVISRTAFGSSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWR
>CYP72A108 gi|157328551|emb|CAO21263.1| unnamed protein product [Vitis vinifera]
100% to CAAP02004668.1
MKLSSVAISFAFIVLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSEMINEANSRPIS
FSDDIVQRVLPFHDHSIQKYGKNSFTWFGPKPVVYIMEPELIRDVLLKHNVFQKPPPHPLSKLLATGVVA
LEGEQWTKRRKIINPAFHLEKLKHMVSAFQLSCSDMVNKWEKKLSLDGSCELDVWPYLENLAGDVISRTA
FGSSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWRFFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVG
ETANHDLLGLLMESNYRDMQENDERKNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQTH
AREEVLRVFGNKKPDGDGLNHLKIVTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYLPDGVQIALPILLL
HHDHEIWGEDAKEFNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSP
SYAHAPISLLTTHPQYGAHLILHGL
$$$$
>CYP72A109 CAAP02001850.1 6 aa diffs to CAAP02003454.1
exact match to 53909800 to 53913784 + strand
GSVIVP00009051001 in Genoscope browser
163
MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSRM 342
343 INEANSRPISFSDDIVQRVLPFHDHSIQKY (1)
432
1262
GKNNFIWLGPKPVVNIMQPELIRDVLLKHNAFQKPPPHPLGKLLASGISSLDGEQW 1428
1429
TKRRKIINPAFHLEKLK (0) 1479
1885
HMVSAFQLSCSDMVNKWEKQLSLDGSCELDIWPYLQNLTGDVISRTAFGSSYEEG 2049
2050
RRIFQLQKEQALLTVQVTRSVYVPGWR (2) 2130
2730
FFPTKTNRRMRQISSEVDALLKGIIEKREKAMQAGETANDDLLGLLMESNYREMQENDE 2906
2907
RKNVGMSIKDVIEECKLFYLAGQETTSALLLWTMVLLSKHSNWQARAREEVLRVFGNKKP 3086
3087
DGDGLNHLKI (0) 3116
3722
VTMIFHEVLRLYPPAPMLTRAVFADSQVGGLYLPDGVQIALPILLIHHDDKIWGEDAK 3895
3896
EFNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSY 4075
4076
AHAPISLLTIQPQHGAHLILHGL* 4147
>CYP72A110 CAAP02001422.1 6 aa diffs to CAAP02001850.1
no exact match in Genoscope
11379
MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSRM 11200
11199
INEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 11110
10292
GKNNFIWLGPKPVVNIMQPELIRDVLLKHNAFQKPPRHPLGKLLASGVASLEGEQW 10125
10124
TKRRKIINPAFHLEKLK 10074
9745 HMVSAFQLSCSDMVNKWEKQLSLDGSCELDIWPYLQNLTGDVISRTAFGSSYEEG
9582
9581 RRIFQLQKEQALLTVQVTRSVYVPGWR 9501
8901
FFPTKTNRRMRQISSEVDALLKGIIEKREKAMQAGETANDDLLGLLMESNYREMQENDE 8725
8724
RKNVGMSIKDVIEECKLFYLAGQETTSALLLWTMVLLSKHSNWQARAREEVLRVFGNKKP 8545
8544 DGDGLNHLKI 8495
7909 VTMIFHEVLRLYPPAPMLTRAVFADSQVGGLYLPDGVQIALPILLIHHDDKIWGDDAKEF
7730
7729
NPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSYAH 7550
7549 APISLLTMQPQHGAHLILHGL* 7484
>CYP72A111P CAAP02001422.1 pseudogene 96% to CAAP02003454.1
GSVIVP00009781001 on Genoscope Browser not assembled correctly
Chr19_random 699832 to 701684 on strand -
32669
GKNNFIWLGPKPVVNIMQPELIRDVLLKHNAFQKAPRHPLRKLLASGIASLEGEQW 32502
32501
TKRRKIINPAFHLEKLK 32451
32049
HMVSAFQLSCSDMVNKWEKQLSLDGSCELDIWPYLQNLTGDVISRTAFGSSYEEG 31885
31884
RRIFQLQKEQALLAVQVTRSVYVPGWR 31804
31203
FFPTKTNRRMRQISSEVNALLKGIIEKREKAMQAGETANDDLLGLLMESNYREM 31042
31041
QENDERKNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNRQACAREEVLRLF 30862
30861
GNKKPDGDGLNHLKI 30717
>CYP72A112P CAAP02001422.1 pseudogene fragment 52% to CAAP02002795.1
Chr19_random 732617 to 733044 on strand +
63362
VIS*TTFGSSYEEGRRILLLQEELA*LTIRIF 63457
63664
KGNKRIKKADKEIQELLRGIIDQREKAMKVCETVNDDLLSIL 63789
>CYP72A113 CAAP02003454.1 91% to CAN67740.1
GSVIVP00000208001 in Genoscope browser
4821416 to 4825397 on strand +
21735
MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMS 21908
21909
RMINEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 22004
22833
GKNNFIWLGPKPVVNIMQPELIRDVLLKHNAFQKPPPHPLGKLLASGISSL 22985
22986
DGEQWTKRRKIINPAFHLEKLK (0) 23051
23456
HMVSAFQLSCSDMVNKWEKQLSLDGSCELDIWPYLQNLTGDVISRTAFGSSYEEGRRIF 23632
23633
QLQKEQALLAVQVTRSVYVPGWR (2) 23701
24301
FFPTKTNRRMRQISSEVDALLKGIIEKREKAMQAGETANDDLLGLLMESNYREMQENDER 24480
24481
KNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQACAREEVLRVFGNKKPD 24660
24661
GDDLNHLKI (0) 24687
25294
VTMIFHEVLRLYPPVPMLTRAVFADSQVGGLYLPDGVQIALPILLIHHDDKIWGEDAKE 25470
25471
FNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSYA 25650
25651
HAPISLLTMQPQHGAHLILHGL* 25719
>CYP72A114P CAAP02000680.1 pseudogene missing exons 2 and 3
7 aa diffs to CAAP02003454.1
GSVIVP00000210001 in Genoscope browser not assembled correctly
chrUn_random 4873006 to 4875944 on strand +
1243
MKLSSVAISFAFIVLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSRM 1422
1423
INEANSRPMSFSDDIVQRVLPFHDHSIQKY (1) 1512
2769
FFPTKTNRRMRQISSEVNALLKGIIEKREKAMKAGETANDDLLGLLMESNYREMQENDE 2945
2946
RKNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQACAREEVLRVFGNKKP 3125
3126
DGDDLNHLKI (0) 3155
3759
VTMIFHEVLRLYPPAPMLTRAVFADSQVGGLYLPDGVQIALPILLIHHDDKIWGDDAK 3932
3933
EFNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSY 4112
4113
AHAPISLTTMQPQHGAHLILHGL* 4184
>CYP72A115P CAAP02000101.1 N-term exon may be a pseudogene or the rest of the gene
may run off the end of the contig, 1 aa diff to CAN63404.1
chrUn_random 3956335 to 3956628 on strand -
6161
MKHSSIAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMSM 5982
5981
MLKEAYSRPISLSDDIAPRYELLLFIIVLKFADLFKLW 5868
>CYP72A116P CAAP02000101.1 N-term exon pseudogene, 93% to CAN63404.1
chrUn_random 3957647 to 3957925 on strand -
7458
MKHSSVAISFGFLTVLISYLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLLHGDFKEMS 7279
7278
MMLKEAYSRPINLSDDIALCVLPFHCRFIKKYG 7180
>CYP72A117P CAAP02000101.1, pseudogene, missing exons 2,3, 83% to CAN63404.1
about 90% to CAAP02000149.1d
GSVIVP00000171001 in Genoscope browser not assembled correctly
chrUn_random 4136790 to 4143738 on strand -
193271
MKHNSVAISFGFLTVFISCLWMLLNWVWLRPKRLERCLREQGLAENSYSLLHGDFKEMSM 193092
193091
ILKEAYSRPISLSDDIAPRVLPFRCHFIKKY 192999
187549
FLPTKTNRKMKQISKKAYALLRGIINKREKTMKADKTGNSDLLVILMESNFR* 187391
187390
IQEHKNNKKIGMSVKEVIEECKIFYLAGQETTSVFLVWTMVLLSENPNWQARAREEVLQV 187211
187210
FGNKKLEANGLNHLKI (0) 187163
186751
VTMIFHEVLRLYPPVAMLTRAVYKDTQVGDMYFPAGVQVALPTILVHHDHEIWGDD 186584
186583
VKEFNPERLAEGISKAKKNQVSFFPFGWGPQACIGQNFAMMEAKIALAMILQHFLFELSP 186404
186403
SYAHAPFNILTMQLQYGGHLILHGLQC 186323
>CYP72A118P gi|147818466|emb|CAN71976.1| 51% to 72A10 probable pseudogene
80% to CAAP02000149.1d
chrUn_random 57735623 to 57737209 on strand -
first line does not match
MEAXELGVXETEEXREMPLESKAWLGIRIGYYKGDSKEMSRMMKEAYSRPISLSDDIVQRVLPFHCHFIKKY
1865
GKNFFTWVGPSPRVNIMEPELMRDVLLKSNIFQKTPSHPLVKLLVSGLVALEGEQWAKRRKIINPAFHPEKLK 1649
1598
RKIINPVFHPE 1566 (small duplication)
1271
NMLSAFHLSCSDMVNKWKKLSVEGSCELDVWPYLENLTGDVISRTAFGSSYEEGIRIFQLQKEQT
YLAIKVAMSVYIPGWR 1029
$$$$
>CYP72A119P gi|147779725|emb|CAN67214.1| AM437669.2
45% to 72A8 C-helix
CAAP02016581.1, 74% to CAN71976.1 pseudogene
876 KLGKNSFTWVSPNPRVNIM
KPNIFQKTPSHPLVKXLVSGLVAQEGEQWAKRRKIINPVFHPEKLK
1151
RKIINPAFHPE 1183 (small duplication)
1443
NMLPTIHLSCS 1475
1475
LSVEGSRESDVWPYLENLTWDVIARTAFGSSYEEGRKIFQLQKEQTYLAINVATWVNIPGWTYA 1666
AM437669.2
4 aa diffs to CAAP02003489.1, 68% to CAAP02000473.1
chrUn_random 69166373 to 69166585 on strand -
from GKNS to FHPEK 1 aa diff
15215
GKNSFTWVGPN 15247
15249
PRVNIMKPELMRDVLLK
15298
RPNIFQKTPSHPLVKXLVSGLVAQEGEQWAKRRKIINPVFHPEKLK 15435
15486
RKIINPAFHPE 15518
MLPTIHLSCS
LSVEGSRESDVWPYLE
15860
NLTWDVIARTAFGSSYEEGRKIFQLQKEQTYLAINVATWVNIPGWTY 16000
>CYP72A119P CAAP02003489.1 pseudogene CYP72A 4 aa diffs to CAN67214.1
39277
GKNSFTWVGPNPRVNIMKPELMRDVLLKPNIFQKTPSHPLVKLLVSGLVAQEGEQWAKRR 39098
39097
KIINPVFHPEKLK 39059
38658
NLTWDMIARTAFGSSYEEGRKIFQLQKE*TYLAINVATSVNIPGWTY 38518
$$$$
CYP72D SUBFAMILY (4 genes) [1 pseudogene]
>CYP72D3 gi|147795107|emb|CAN60851.1| 43% to 72A15
yellow region too long
CAAP02006515.1a 1-3060 runs off end, missing exon 1
87% to CAAP02007230.1
GSVIVP00009515001 on Genoscope browser
chrUn_random 60546398 to 60549431 on strand + missing exon 1
MAYSFAILTMYTLSRVVYSIWWRPKSLEKQLRRQGIRGTRYKLLFGDAKAMKQSFMEARSKPMALNHSIV
PRVVPFYHEIAQKY
GKVSVSWNFTTPRVLIVEPELMRLILTSKNGHFQRLPGNPLGYLLSRGLSYLQGEK
WAKRRKLLTPAFHFEKLK
SYHRTVRGHHSRNGPDSAESYGAICLFGSWGVTSLGFELKTKVFLVALL
GMVPAFSVSCRKLIERWKNLVAPQGTYELDMMHEFQNLTGDVISQVAFGSNYEEGKKVFELQKEQAVLVMEAF
RTFYIPGFRFVPIGKNKKRYYIDSEIKAILKKIILKRKQTMKPGDLGNDDLLGLLLQCQEQTDSEMTIED
VIEECKLFYFAGQETTANWLTWTILLLSMHPNWQEKAREEVLQLCGKKMPDIEAINRLKIVSMILHEVLR
LYPPVTQQFRHTCERINIAGMCIPAGVNLVLPTLLLHHSPEYWGDDVEEFKPERFSEGVSKASKGDQIAF
YPFGWGHRICLGQGFAMIEAKMALAMILQHFWFELSPTYTHAPHTVITLQPQHGAPIILHEI
>CYP72D3 CAAP02015403.1 = CAN60851.1 4 aa diffs, runs off end
1507 MAYSFAILTMYTLSRVV
1456
YSVWWRPKSLEKQLRRQGIRGTRYKLLFGDAKAMKQSFMEARSKPMALNHSIVPRVL 1286
1285 PFYHEIAQKY 1253
1178 GKVSVSWNFTTPRVLIVEPELMRLILTSKNGHFQRLPGNPLGYLLTRGLSYLQGEKWAK 1002
1001 RRKLLTPAFHFEKLK 957
232
GMVPAFSVSCRKLIERWKNLVAPQGTYELDMMPEFQ 125
124
NLTGDVISQVAFGSNYEEGKKVFELQKEQAVLVMEAFRTFY 2
>CYP72D4 gi|147795108|emb|CAN60852.1| AM443849.2 43% to 72A14
CAAP02006515.1b 5359-7817 adjacent to CAN60851, 87% to CAN60851
97% to CAAP02007230.1, 100% to CAO16149.1
GSVIVP00009516001 in Genoscope Browser
chrUn_random 60551733 to 60554191 on strand +
MAYSFAILTVYTLLRVVYSIWWRPKSLEKQLRRQGIRGTHYKLLFGDAKAMKQSFVEARSKPMALNHSIV
PRVTPFYHEMAQKYGKVSVSWHFTTPRVLIVEPELMRMILXYKNGHLXRLPGNPLGYHLSRGLLSLEGEK
WAKRRKLLSPAFHLEKLK
GMMPAFSTSCHXLIERWKNLVGPQGTYELDVMPEFQ
NLTGDVISRTAFGSSYEEGRRVFELQKEQIVLVMEDFRNFYIPGFRFVPTRK
NKRRYYMDSEIKAMIKKIILKKKQTLKNGDPGNDDLLGLLLQCQEQTDSEMTIEDVVEECKLFYFVGQET
TANWLTWTILLLSMHPNWQEKARAEVLQICGKKMPDIEAISNLKIVSMILHEVLRLYPPVIMQFRHTRER
INIAGMYIPAGVDLVLPTVLLHHSPEYWGDDVEEFKPERFSEGVSKASKGDQTAFYPFGWGHRICLGQGL
AMIEAKMALAMILQHFWFELSPAYTHAPYRIITLQPQYGAPIILHQI
>CYP72D5 CAAP02007230.1 87% to CAN60851.1, 100% to CAO41622.1
GSVIVP00013480001 in Genoscope Browser
chrUn_random 89958971 to 89961429 on strand +
1439
MAYSFAILTVYTLLRVVYSIWWRPKSLEKQLRRQGIRGTHYKLLFGDAKAMKQSFMEARS 1618
1619
KPMALNHSIVPRVIPFYHEMAQKY 1690
1768
GKVSVSWHFTTPRVLIVEPELMRMILKYKNGHLHRLPGNPLGYHLSRGLLSLE 1926
1927
GEKWAKRRKLLSPAFHLEKLK 2019
2699
GMMPAFSTSCHDLIERWKNLVGPQGTYELDVMPEFQNLTGDVISRTAFGSSYEEGRRVFE 2878
2879
LQKEQIVLVMEDFRNFYIPGFR 2944
3035
FVPTRKNKRRYYMDSEIKAMIKKIILKKKQTLKNGDPGNDDLLGLLLQCQEQTDSEMTI 3211
3212
DDVVEECKLFYFVGQETTANWLTWTTLLLSMHPNWQEKARAEVLQICGKKMPDIEAISNL 3391
3392
KI 3397
3469
VSMILHEVLRLYPPVIMQFRHTGERINIAGMCIPAGVDLVLPTALLHHSPEYWGDDVEEF 3648
3649
KPERFSEGVSKASKGDQTAFYPFGWGHRICLGQGLAMIEAKMALAMILQHFWFELSPTYT 3828
3829
HAPHRIITLQPQYGAPIILHQI* 3897
>CYP72D6 CAAP02003169.1 48% to 72A15, 75% to CAAP02007230.1, 76% to CAN60851.1
GSVIVP00032271001 in Genoscope Browser
Chr4 1480668 to 1483369 on strand +
19453
MAFSFAILVVYGLLRAVYTIWWRPKSLEKQLRQQGIRGTRYKPMYGDMKALKLSFQEAQS 19632
19633
KPMTLNHSIVPRVIPFFHQMFQNY 19704
20367
GKISMSWIFTRPRVMIVDPELIRMILADKNGQFQKPPLNPLVDLLTLGLSTLE 20525
20526
GEQWAKRRKLITPAFHVEKL 20585
20884
GMVPAFSMSCCNLIERWKNWVGPQGTYELDVMPEFQNVTGDVISRAAFGSSYEEGKKVFE 21063
21064
LQKEQAVLVIEASRAIYLPGFR 21129
21219
FVPTVKNRRRYHIDNEIKAMLRSMIDRKKQAMKNGDSGYNDDLLGLLLQLTEEID 21383
21384
NEMRIEDLIEECKLFYFAGQETTANLLTWTMILLSMNPKWQDKAREEVLQICGKKIPDLE 21563
21564
AIKHLKI 21584
21726
VSMILHEVLRLYPSVVNLLRYTHKRTDVAGLSIPAGVELYLPTILLHHSPEYWGDDVEEF 21905
21906
KPERFSEGVSKASKGDQIAFYPFGWGPRICLGQSFAMIEAKMALAMILQNFWFELSPTYT 22085
22086
HAPYTVITLQPQYGAPIILHQI* 22155
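The numbers flanking each sequence segment in entries like the one above appear to be nucleotide coordinates on the contig, so an in-frame segment should span three nucleotides per residue. The following is a minimal sketch of that sanity check, not part of the original annotation; the function name `span_matches` is hypothetical, and segments that break mid-codon at an exon boundary or contain a frameshift would not be expected to pass.

```python
def span_matches(start: int, end: int, residues: str) -> bool:
    """Return True if the coordinate span covers exactly three nucleotides per residue.

    Works for either strand: on the (-) strand the start is larger than the end.
    """
    return abs(end - start) + 1 == 3 * len(residues)


# First segment of the CYP72D6 entry above: coordinates 19453 to 19632, 60 residues.
seg = "MAFSFAILVVYGLLRAVYTIWWRPKSLEKQLRQQGIRGTRYKPMYGDMKALKLSFQEAQS"
ok = span_matches(19453, 19632, seg)
```

A check like this is one way to spot mistyped coordinates, since a single wrong digit usually breaks the three-to-one ratio.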
>CYP72D6 gi|147773778|emb|CAN65255.1| 53% to 72A8
1 aa diff to CAAP02003169.1
missing first two exons
GMVPAFSMSCCNLIERWKNWVGPQGTYELDVMPEFQNVTGDVISRAAFGSSYEEGKKVFELQKE
QAVLVIEASRAIYLPGFRFVPTVKNRRRYHIDNEIKAMLRSMIDRKKQAMKNGDSGYNDDLLGLLLQLTE
EIDNEMRIEDLIEECKLFYFAGQETTANLLTWTMILLSMNPKWQDKAREEVLQICGKKIPDLEAIKHLKI
VSMILHEVLRLYPSVVNLLRYTHKRTDVAGLSIPAGVELYLPTILLHHSPEYWGDDVEEFKPERFSEGVS
KASKGDQIAFYPFGWGPRICLGQSFAMIEAKMALAMILQNFWFELSPTYTHAPYTVITLQPQYGAPIILH
QI
>CYP72D7P CAAP02000210.1 pseudogene, one stop codon, insert in C-helix 59% to CAAP02007230.1
GSVIVP00016465001 in Genoscope Browser (missing C-helix region)
chr11 1013195 to 1015601 on strand -
131775
MAVSMFSCLLISSLVLLLYGVLRVSYSIWWKPKWLEKRLRQQGIRGTPYKLVMGDMKEYI 131596
131595
RLITEAWSKPMNLNHHIVSRVDPFTQNNMQQY
131412
GKVSLFWAGTTPRLIVMDPGMIKEVLSNKQGHFQKPYISPLILTLARGLTALEGEVWAK 131236
131317
RRIINPAFHLEKLK 131192
131015
VMIPAFTTSCSMLIERWKELASLQETCEVDIWPELQNLTRDVISRAALGSSFEEGRQ 130845
130844
IFELQKEHITLTLEAMQTLYIPGFR 130770
130633
FIPTKKNQRRKYLQKRTTSMFRDLIQRKKDAIRTGQAEGDNLLGLLLLSSSQNNLPEN 130460
130459
VMSTKDNAITLEEVIEECKQFYLAGHETTSSWLTWTVTVLAMHPNWQEKAREEVMQICGK 130280
130279
KEPDSEALSHLKI 130241
129794
VSMILYEVLRLYPPVIAVYQHAYKETKIGTISLPAGVDLTLPTLLIHHDPELWGDDAEEF 129615
129614
KPERFAEGVSKASKDQLAFFPFGWGPRTCIGQNFAMIEAKVALAMILQHFSFELSPSYTH 129435
129434
APHTVMTLQPQHGAQLKFYQL* 129369
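The CYP72D subfamily tally of 4 genes and 1 pseudogene can be reproduced from the entry names alone. This is a rough sketch, assuming that a trailing `P` on the name marks a pseudogene and that repeated names are alternate accessions of the same gene; the function name `tally` is hypothetical.

```python
def tally(names):
    """Count distinct gene and pseudogene names.

    A trailing 'P' on the name is taken to mark a pseudogene (an assumption
    based on the naming used in this file).
    """
    unique = set(names)
    pseudo = {n for n in unique if n.endswith("P")}
    return len(unique - pseudo), len(pseudo)


# Entry names as they appear in the CYP72D headers above.
names = ["CYP72D3", "CYP72D3", "CYP72D4", "CYP72D5",
         "CYP72D6", "CYP72D6", "CYP72D7P"]
genes, pseudogenes = tally(names)
```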
CYP73 family
>CYP73A78 gi|147821469|emb|CAN70035.1| = AM455281.2
CAAP02004907.1 7360-5632 (-) strand, 1 aa diff
MAHLLNKPVFFSTLLTIILLSSTRLLASYLSISPPLIASFLPLAPLILYLFYSIAKRSASLPPGPLSIPL
FGNWLQVGNDLNHQLLASMAQKYGPVFLLKLGSKNLAVVSDPELASQVLHTQGVEFGSRPRNVVFDIFTG
NGQDMVFTVYGDHWRKMRRIMTLPFFTNKVVHHYSEMWEEEMELVVDDLRNKESVKSEGLVIRKRLQLML
YNIMYRMMFDSKFESQEDPLFIQATRFNSERSRLAQSFDYNYGDFIPFLRPFLRGYLNKCRELQSRRLAF
FNNYFVEKRREIMAANGEKHKIRCAIDHIIDAQLKGEISEANVLYIVENINVAAIETTLWSMEWAIAELV
NHPHVQCKIRDEITTILQGDAVTESNLHQLPYLQATVKETLRLHAPIPLLVPHMNLEEAKLGGYTIPKES
KVVVNAWWLANNPSWWKNPEEFRPERFLEEESGTDAVAGGKVDFRFLPFGVGRRSCPGIILALPILALVI
AKLVMNFEMRPPIGVEKIDVSEKGGQFSLHIANHSTVALTPIAA
>CYP73A81 CAAP02000489.1 94% to 73A78
132479
MAHLLNKPLFFTLVTIILLSSTRLLASYLPISPNIARFLPLAPLILYLFYSISKRSAS 132652
132653
LPPGPLSIPIFGNWLQVGNDLNHQLLASMAQKYGPVFLLKLGSKNLTVVSDPELASQVLH 132832
132833
TQGVEFGSRPRNVVFDIFTGNGQDMVFTVYGDHWRKMRRIMTLPFFTNKVVHQYSEMWEE 133012
133013 EMDLVVDDLRNKESVKTEGLVIRKRLQLMLYNIMYRMMFDAKFESQEDPLFIQATRFNSE 133192
133193
RSRLAQSFDYNYGDFIPLLRPFLRGYLNKCRELQSSRLAFFNNYYVEKRR 133342
133464
EIMAANGEKHKIRCAIDHIIDAQHKGEISEENVLYIVENINVAAIETTLWSMEWAIAEL 133640
133641
VNHPHVQSKIRDEITTVLQGGAVTESNLHQLPYLQATVKETLRLHSPIPLLVPHMNLEEA 133820
133821
KLGGYTIPKESKVVVNAWWLANNPEWWKNPEEFRPERFLQEESATDAVAGGKADFRFLPF 134000
134001
GVGRRSCPGIILALPILALVIGKMVMNFEMRPPIGVEKIDVSEKGGQFSLHIANHSTVAF 134180
134181 TPITA* 134198
>CYP73A82 gi|147775009|emb|CAN77208.1| 63% to CYP73A78
MDLILIEKALLAVFCAIILAITISKLLGKKLKLPPGPLPVPVFGNWLQVGDDLNHLNLSDLAKKFGDIFM
LRMGQRNLVVVSSPDLAKDVLHTQGVEFGSRTRNVVFDIFTGKGQDMVFTVYGEHWRKMRRIMTVPFFTN
KVVQQYRVGWEDEAARVVEDVKKNPEASTNGIVLRRRLQLMMYNNMYRIMFDRRFDSEEDPLFVKLKALN
GERSRLAQSFEYNYGDFIPILRPFLRGYLKICKEVKERRLQLFKDHFLEERKKLASTKSTDHNSLKCAVD
HILDAQQKGEINEDNVLYIVENINVAAIETTLWSIEWGIAELVNHPHIQKKLRDELNTVLGPGVQVTEPD
IQKLPYLQAVIKETLRLRMAIPLLVPHMNLNDAKLGGYDIPAESKILVNAWWLANDSSKWKKPEEFRPER
FLEEESKVEANGNDFRYLPFGVGRRSCPGIILALPILGITIGRLVQNFELLPPPGQAKLDTTGKGGQFSL
HILKHSTIVARPIEA
>CYP73A82 CAAP02000415.1 = CAN77208.1
86000
MDLILIEKALLAVFCAIILAITISKLLGKKLKLPPGPLPVPVFGNWLQVGDDL 86158
86159
NHLNLSDLAKKFGDIFMLRMGQRNLVVVSSPDLAKDVLHTQGVEFGSRTRNVVFDIFTGK 86338
86339
GQDMVFTVYGEHWRKMRRIMTVPFFTNKVVQQYRVGWEDEAARVVEDVKKNPEASTNGIV 86518
86519 LRRRLQLMMYNNMYRIMFDRRFDSEEDPLFVKLKALNGERSRLAQSFEYNYGDFIPILRP 86698
86699 FLRGYLKICKEVKERRLQLFKDHFLEERK 86785
86992
KLASTKSTDHNSLKCAVDHILDAQQKGEINEDNVLYIVENINVA 87120
88683
AIETTLWSIEWGIAELVNHPHIQKKLRDELNTVLGPGVQVTEPDIQKLPYLQAV 88844
88845 IKETLRLRMAIPLLVPHMNLNDAKLGGYDIPAESKILVNAWWLANDSSKWKKPEEFRPER 89024
89025
FLEEESKVEANGNDFRYLPFGVGRRSCPGIILALPILGITIGRLVQNFELLPPPGQA 89195
89196 KLDTTGKGGQFSLHILKHSTIVARPIEA 89279
>CYP74A13 CAAP02000041.1a CYP74A
54% to 74A4 (CAO47688.1)
in contig CU459225.1 chr3 scaffold_8
234521
MSSLSSSSSSSRSELPLLKIPGDYGLPFFGPIRDRFDYFYNQGQDEFFKTRMQKYHSTVFRAN 234709
234710
MPPGPFISSDSKVVVLLDAVSFPVLFDSSKVEKRNVLDGTFMPSTDLTGGYRVLAFLDPS 234889
234890
EPKHDLLKRFSFSLLASRHRDFIPVFRSGLPDLFTTIEDDVSSKGKANFNNIADGMYFNF 235069
235070
VFRLICGKDPSDAKIRSEGPNIFSKWLFLQLSPLMTLGLSMLPNFIEDLLLHTFPLPPFL 235249
235250
VKSDYNKLYKAFYESASSVLDEGERMGINRDEACHNLVFLAGFSTFGGMKVLFPPLIKWV 235429
235430
GLAGEKLHRELADEIRTVVKAEGGVTFAALDKMALTKSVVYEALRIGPPVPFQYGKARE 235606
235607
DMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFENPEDFVAHRFMGEGEKLLKYVYWSN 235783
235784
GRETDNPTAENKQCSGKDLVVLISKLMLVEIFLRYDTFEVESGTMVLGSAVLFKSLTKSS 235963
235964
YT* 235972
>CYP74A14 CAAP02000041.1b CYP74A
54% to 74A4 (CAO47689.1)
in contig CU459225.1 chr3 scaffold_8
244102
MSSSSSSSSSSRPELPLRKIPGDYGLPFFGPIRNRFDYFYNQG 244230
244231
QDEFFKTRMQKYHSTVFRANMPPGPFISSDSKVVVLLDTVSFPVLFDSSKVEKRNVFVGT 244410
244411
FMPSTDLTGGYRVLPYLDPSEPKHDLLKRFSFSLLASRHRDFIPVFRSGLPDLFSTIEDD 244590
244591
VSRKGKANFNDIADDMYFNFVFRLICGKDPSDAKIRSEGPNIFLKWLFLQLSPLLTLGLS 244770
244771
ILPNFIDDLLLHTFPFPPFLVKSDYNKLYKAFYESASSVLDEGERMGIKRDEACHNLVFL 244950
244951
AGFNSFGGMKVFFPALIKWVGLAGEKLHRELADEIRTVIKAEGGVTFAALDKMALTKSM 245127
245128
VYEALRIEPPVPFQYGKAREDMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFENPEEFV 245307
245308
AHRFMGEGEKLLKYVYWSNGRETDNPTAENKQCSGKDLVVLISRLMLVEIFLRYDTFEV 245484
245485
ESGTMLLGSSLLFKSLTKTSYT* 245553
>CYP74A15 CAAP02000041.1c CYP74A
56% to 74A5 (CAO47690.1 fused)
in contig CU459225.1 chr3 scaffold_8 upstream of CAAP02006275.1a
252843
MSSSSSSLPLNFDNSSSSSKLPLRSIPGDCGSPFFGPIKDRFDYFYNEGRDQFFRTRMQKY 253025
253026
QSTVFRANMPPGPSMASNPNVVVLLDAISFPILFDTSRIEKRNVLDGTYMPSTAFTGGYR 253205
253206
VCAYLDPSEPNHALLKRLFMSSLAARHHNFISVFRSCLTELFITLEDDASRKGKADFNGI 253385
253386
SDNMSFNFVFKLFCDKHPSETKLGSNGPNLVTKWLFLQLAPLITLGLSMLPNVVEDLLLH 253565
253566
TFPLPSLFVKSDYKNLYHAFYASASSILDEAESMGIKRDEACHNLVFLAGFNAYGGMKTL 253745
253746
FPALIKWVGLAGEKLHGQLADEIRSIVKAEGGVTFAALDKMALTKSVVYEALRIEPPVP 253922
253923
FQYGKAKEDMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFDNPEEFVAHRFMGDGEKM 254099
254100
LEYVYWSNGRESDDPTVENKQCPGKDLVVLLSRVMMVEFFLRYDTFNIECGTLLLGSSVT 254279
254280
FKSLTKQPTFDHKSITHVS* 254339
>CYP74A16 CAAP02006275.1a CYP74A, 96% to CAAP02000041.1c (CAO47690.1 fused)
in contig CU459225.1 chr3 scaffold_8
5348
MSSSSSSLPLNFVNSSSSSKLPLRSIPGDCGSPFFGPIKDRFDYFYNEGRDQFFRTRMQK 5527
5528
YQSTVFRANMPPGPFMAFNPNVVVLLDAISFPILFDTSRIEKRNVLDGTYMPSTAFTGGY 5707
5708
RVCAYLDPSEPNHALLKRFFTSSLAARHHNFIPVFRSCLTELFTTLEDDVSRKGKADFNG 5887
5888
ISDNMSFNFVFKLFCDKHPSETKLGSNGPNLVTKWLFLQLAPLITLGLSMLPNVVEDLLL 6067
6068
HTFPLPSLFVKSDYKKLYHAFYASASSLLDEAESMGIKRDEACHNLVFLAGFNAYGGMKT 6247
6248
LFPALIKWVGLAGGKLHRQLADEIRSIVKAEGGVTFAALDKMALTKSVVYEALRIEPPVP 6427
6428
FQYGKAKEDMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFDNPEEFVAHRFMGDGEKLL 6607
6608
EYVYWSNGRESDDPTVENKQCPGKDLVVLLSRVMLVEFFLHYDTFDIECGTLLLGSSVTF 6787
6788
KSLTKQPTFDHKSIKHVS* 6844
>CYP74A17 CAAP02006275.1b CYP74A, 84% to CAAP02000041.1c (CAO47691.1)
in contig CU459225.1 chr3 scaffold_8
11732
MSSSSDKNDLNSSSSLSKLPLRKIPGDYGLPFFGAIKDRLDYFYKQGREEFFNARMHK 11905
11906
YQSTVFRANMPPGPFMASNPNVIVLLDSISFPILFDTSKVEKRNVLDGTYMPSTAFTGGY 12085
12086
RVCAYLDPSETNHALLKRLFMSALAARHHNFIPLFRSSLSELFTSLEDDISSKGEADFND 12265
12266
ISDNMSFNFVFRLFCDKYPSETALGSQGPSIVTKWLFFQLAPLITLGLSLLPNFVEDLLL 12445
12446
HTFPLPSIFVKSDYKKLYRAFYASASSILDEAESMGIKRDEACHNLVFLAGFNAYGGMKA 12625
12626
LFPSLIKWVGSAGEKLHRELADEIRTVVKAEGGVSFAALEKMSLTKSVVYEALRIDPPVP 12805
12806
FQYGKAKEDMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFDNPEEFMGNRFMGEGERLL 12985
12986
KYVYWSNGRESGNPTVENKQCAGKDLVLLLSRVMLVEFFLRYDTFDIESGTLLLGSSVTF 13165
13166
KSITKATDS* 13195
>CYP74A1 CAAP02000063.1 (CAO61246.1) in contig CU459218.1 chr18 scaffold_1
61% to 74A1 Arab. 70% to 74A1 tomato
next closest match to the tomato 74A1 is CAAP02000041.1b 60%
so this is considered the ortholog of CYP74A1. Note it is distant from the
other CYP74 gene cluster on chr 3
149291
MASPSLTFPSLQLQFPTHTKSSKPSNHKLIVRPIFASVSEKPSVPVSQSQVTPPGPIRKI 149470
149471
PGDYGLPFIGPIKDRLDYFYNQGREEFFRSRAQKHQSTVFRSNMPPGPFISSNSKVIVLL 149650
149651
DGKSFPVLFDVSKVEKKDVFTGTFMPSTEFTGGFRVLSYLDPSEPDHTKLKRLLFFLLQS 149830
149831
SRDRVIPEFHSCFSELSETLESELAAKGKASFADPNDQASFNFLARALYGTKPADTKLGT 150010
150011
DGPGLITTWVVFQLSPILTLGLPKFIEEPLIHTFPLPAFLAKSSYQKLYDFFYDASTHVL 150190
150191
DEGEKMGISREEACHNLLFATCFNSFGGMKIIFPTILKWVGRGGVKLHTQLAQEIRSVVK 150370
150371
SNGGKVTMASMEQMPLMKSTVYEAFRIEPPVALQYGKAKQDLVIESHDSVFEVKEGEMLF 150550
150551
GYQPFATKDPKIFERSEEFVPDRFVGEGEKLLKHVLWSNGPETENPTLGNKQCAGKDFV 150727
150728
VLAARLFVVELFLRYDSFDIEVGTSLLGSAINLTSLKRASF* 150853
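The ortholog note on the CYP74A1 entry above rests on a best-hit comparison: the grape sequence with the highest identity to tomato 74A1 is taken as the ortholog, and the gap to the next closest match indicates how clear that call is. The sketch below only illustrates that reasoning with the two percentages quoted in the note; the function name `best_hit` is hypothetical and no claim is made that this is how the assignment was actually computed.

```python
def best_hit(identities):
    """Return (name, margin): the top-scoring candidate and its lead over the runner-up."""
    ranked = sorted(identities.items(), key=lambda kv: kv[1], reverse=True)
    top_name, top_pct = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    return top_name, top_pct - runner_up


# Percent identities to tomato CYP74A1, as quoted in the note above.
hits = {"CAAP02000063.1": 70.0, "CAAP02000041.1b": 60.0}
name, margin = best_hit(hits)
```

Here the top hit leads by ten percentage points, which is the margin the note relies on.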
>CYP74B13 AM441513 PLN 18-MAY-2007 Vitis vinifera (Pinot noir grape)
11751
MLSSTVMSVSPGVPTPSSLTPPSPPSSSPVRAIPGSYGWPVLGPIADRLDYFW 11593
11592
FQGPETFFRKRIDKYKSTVFRTNVPPSFPFFVDVNPNVIAVLDCKSFSFLFDMDVVEKKN 11413
11412
VLVGDFMPSVKYTGDIRVCAYLDTAETQHAR 11320
10198
VKGFAMDILKRSSSIWASEVVASL 10127
10125
DTMWDTIDAGVAKSNSASYIKPLQRFIFHFLTKCLVGADPAVSPEIAESGYVMLDKWVFL 9946
9945
QLLPTISVNFLQPLEEIFLHSFAYPFFLVKGDYRKLYEFVEQHGQAVLQRGETEFNLSKE 9766
9765
ETIHNLLFVLGFNAFGGFTIFFPSLLSALSGKPELQAKLREEVRSKIKPGTNLTFESVK 9589
9588
DLELVHSVVYETLRLNPPVPLQYARARKDFQLSSHDSVFEIKKGDLLCGFQKVAMTDPKI 9409
9408
FDDPETFVPDRFTKEKGRELLNYLFWSNGPQTGSPSDRNKQCAAKDYVTMTAVLFVTHMF 9229
9228 QRYDSVTASGSSITAVEKAN* 9166
>CYP74B13 CAAP02000110.1 (CAO24035.1) in contig CU459253.1 chr12 scaffold_36
no heme Cys, 56% to 74B2 also w/o Cys, 3 aa diffs to AM441513
146202
MLSSTVMSVSPGVPTPSSLTPPSPPSSSPVRAIPGSYGWPVLGPIADRLDYFWFQGPETFFRKR 146011
146010
IDKYKSTVFRTNVPPSFPFFVGVNPNVIAVLDCKSFSFLFDMDVVEKKNVLVGDFMPSVK 145831
145830
YTGDIRVCAYLDTAETQHAR (0) 145771
144656
VKSFAMDILKRSSSIWASEVVASLDTMWDTIDAGVAKSNSASYIKPLQRFIFHFLTKCL 144480
144479
VGADPAVSPEIAESGYVMLDKWVFLQLLPTISVNFLQPLEEIFLHSFAYPFFLVKGDYR 144303
144302
KLYDFVEQHGQAVLQRGETEFNLSKEETIHNLLFVLGFNAFGGFTIFFPSLLSALSGKPE 144123
144122
LQAKLREEVRSKIKPGTNLTFESVKDLELVHSVVYETLRLNPPVPLQYARARKDFQLSS 143946
143945
HDSVFEIKKGDLLCGFQKVAMTDPKIFDDPETFVPDRFTKEKGRELLNYLFWSNGPQTGS 143766
143765
PSDRNKQCAAKDYVTMTAVLFVTHMFQRYDSVTASGSSITAVEKAN* 143625
AM # and CAN # are from Velasco et al. heterozygous Pinot Noir grapevine variety
CAO # and CAAP # are from Jaillon et al. PN40024 highly homozygous French-Italian Public Consortium
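The accession-prefix convention in the note above can be expressed as a small lookup. This is a minimal sketch under the assumption that the four prefixes are used exactly as stated; the names `SOURCES` and `source_of` are hypothetical, and accessions with other prefixes (for example the ABC entries from other cultivars) simply come back as unknown.

```python
# Prefix-to-project mapping taken from the note above.
SOURCES = {
    "CAAP": "Jaillon et al. (PN40024)",
    "CAO": "Jaillon et al. (PN40024)",
    "CAN": "Velasco et al. (Pinot Noir)",
    "AM": "Velasco et al. (Pinot Noir)",
}


def source_of(accession: str) -> str:
    """Return the genome project implied by an accession's prefix, or 'unknown'."""
    # Try longer prefixes first so a four-letter prefix is never shadowed.
    for prefix in sorted(SOURCES, key=len, reverse=True):
        if accession.startswith(prefix):
            return SOURCES[prefix]
    return "unknown"
```

For example, `source_of("CAN63796.1")` would report the Velasco et al. project and `source_of("CAAP02002795.1")` the Jaillon et al. project.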
Note: CYP75s and CYP79s are interleaved
CYP75 family
CYP75A subfamily (9 genes) [5 pseudogenes] 2 alleles, 15 orthologs from other strains
31 sequences
>CYP75A28 gi|83715792|emb|CAI54277.1| AJ880356 Shiraz mRNA
flavonoid-3,5'-hydroxylase
78% to CYP75A8 Catharanthus roseus
MAIDTSLLLEFAAATLLFFITRFFIRSLLPKPSRKLPPGPKGWPLLGALPLLGNMPHVALAKMAKRYGPV
MFLKMGTNSMVVASTPEAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML
GGKALEDWSQVRAVELGHMLRAMLELCQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM
VVELMTSAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDWLLTKMMEEHTASAHERKGNPDFLDVIMAN
QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLP
KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFL
SGRNTKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDEVEINMDEAFGLALQKA
VSLSAMVTPRLHQSAYAV*
>CYP75A28 gi|85679310|gb|ABC72066.1| flavonoid 3',5'-hydroxylase 99%
only 3 aa diffs, DQ351701 cv. Sangiovese Berries genomic
RFFIRSLLLKPSRKLPPGPKGWPLLGALPLLGNMPHVALAKMAKRYGPVMFLKMGTNSMVVASTPEAARA
FLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHMLGGKALEDWSQVRAVELGHMLR
AMLELCQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDMVVELMTSAGYFNIGDFIPSIA
WLDIQGIQRGMKHLHRKFDWLLTKMMEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNL
FTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPL
NLPRVSTQACEVNGYYIPKNTGLSVNIWAIGRDPDVWESPEEFRPERFLSGRNTKIDPRGNDFELIPFGA
GRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV
>CYP75A28 gi|147862221|emb|CAN82592.1| AM436340.2c Pinot Noir genomic
99% only 4 aa diffs
MAIDTSLLLEFAAATLLFFITRFFIRSLLPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV
MFLKMGTNSMVVASTPEAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML
GGKALEDWSQVRAVELGHMLRAMLELCQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM
VVELMTSAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDWLLTKMMEEHTASAHERKGNPDFLDVIMAN
QENSTGEKLTITNIKALLL
NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDXVIGRSRRLVESDLP
KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWESPEEFRPERFL
SGRNEKIDPRGNDFELIPFGAGRRILRWH
>CYP75A28-de2b gi|157028306|emb|CAAP02012536.1| PN40024, contig_12536
Length=4320
Middle exon on (-) strand near end of contig = CAN82592.1| AM436340.2c
This is a pseudogene fragment that is missing the end of the exon
4021 NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAIC 3842
3841
KESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIW 3722
There is another exon 1 on (-) strand at 1-382
= CAN82592.1| AM436340.2c CYP75A28
>CYP75A28 gi|83944624|gb|ABC48916.1| DQ298201.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase 98% to CYP75A28
4 aa diffs
this seq is called VvF3’5’H-1a in Castellarin et al. BMC Genomics 2006
IGQVILSRRVFETKGSESNEFKDMVVELMTSAGYFNIGDFIPSIAWLDIQGIQRGMEHLHRKFDWLLTKM
MEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI
LKRAHEEMDKVIGRSRWLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSV
NIWAIGRDPDVWESPEEFRPERFLSGRNE
>CYP75A28 gi|83944626|gb|ABC48917.1| DQ298202.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase 98% to CYP75A28
4 aa diffs
this seq is called VvF3’5’H-1b in Castellarin et al. BMC Genomics 2006
IGQVILSRRVFETKGSESNEFKDMVVELMTSAGYFNIGDFIPSIAWLDIQEIQRGMEHLHRKFDWLLTKM
MEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI
LKRAHEEMDKVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSV
NIWAIGRDPDVWESPEEFRPERFLSGRNE
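Notes such as "4 aa diffs" compare one sequence against another position by position. The following is a minimal sketch of such a count, assuming the two sequences are already aligned at equal length with no gaps and that `X` marks an unknown residue to be skipped; the function name `aa_diffs` is hypothetical and real comparisons across indels would need a proper alignment first.

```python
def aa_diffs(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length, ungapped sequences.

    Positions where either residue is X (unknown) are not counted.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be the same length (no gap handling)")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and "X" not in (x, y)
    )


# Matching fragments from the VvF3'5'H-1a and VvF3'5'H-1b entries above.
frag_1a = "WLDIQGIQRGMEHLHRKF"
frag_1b = "WLDIQEIQRGMEHLHRKF"
n = aa_diffs(frag_1a, frag_1b)
```

The two fragments differ at a single position (G versus E), so `n` is 1 for this stretch.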
>CYP75A28 gi|83944628|gb|ABC48918.1| DQ298203.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase (1 aa diff)
this seq is called VvF3’5’H-1c in Castellarin et al. BMC Genomics 2006
IGQVILSRRVFETRGSESNEFKDMVVELMTSAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDWLLTKM
MEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI
LKRAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSV
NIWAIGRDPDVWESPEEFRPERFLSGRNT
>CYP75A32P CAAP02004900.1b pseudogene
96% to 75A28 CAI54277.1 AJ880356.1 Shiraz mRNA
2 frameshifts
CAO16882.1 + CAO16883.1 CU459242.1
19834
MAIDTSLLLEFAAATLLFFITRFFIRSILPKPSRKLPPGPKGWPLLGALPLVGNMPHVALAKMAK 19640
19639
RYGPVMFLKMGTNSMVVASTPEAARAFLKTLDINFSSRPPNAGATLLAYHAQD 19481
19480
MVFADYGARWKLLRKLSNLHMLGG 19409
19409
KALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGS 19230
19229
ESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRMIEEHT 19053
19052
ASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLL (0) 18939
18547
NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAIC 18368
18367
KESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRL 18263
18261
SVNIWAIGRDPDVWESPEEFRPERFLSGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVL 18079
18078
VEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV* 17917
>CYP75A33 CAAP02004900.1a 5 aa diffs to CAN60359.1, Pinot Noir genomic
N-term corrected
100% to CAO16880.1
CU459242.1
10277
MAIDTSLLPELAAATLLFFITRFFIRSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVAL 10098
10097 AKMAKRYGPVMFLKMGTNSMVVASTP
10019
EAARAFLKTLDINFSNRPPNAGATHLAYDAQDMVFADYGARWKLLRKLSNLHMLGGKALE 9840
9839
DWSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVFETKGSESNE 9660
9659
FKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHE 9480
9479 RKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLL
9381
8991
NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLQKLPYLQAIC 8812
8811
KESFRKHPSTPLNLPRVSNEACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFSPERF 8632
8631
LSGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINM 8452
8451 DEAFGLALQKAVSLSAMVTPRLHQSAYAV 8365
>CYP75A34 gi|147861244|emb|CAN81079.1| AM457118.1 Pinot Noir genomic
91% to CYP75A28
4 aa diffs to CAO16875.1
CU459242.1
MAIDTSLLPELAAATLLFFITRFFICSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV
MFLKMGTSSMVVASTPEAARAFLKTLDINFSNRPPNAGATHLAYGAQDMVFADYGPRWKLLRKLSNLHML
GGKALEDSSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSIANMIGQVILSRRVFETKGSESNEFKDM
VVELMTCAGYFNIGDFIPSIAWMDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVMGH
QGNSTGEKLTLTNIKALLQNLFAAGTDTSASIIEWSLAEMLKNPSILKRAQEEMDHVIGRNRRLVESDLP
KLPYLQAICKESLRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWENPEEFRPERFL
SGRNAKIDPRGNDFELIPFGAGRRICAGARMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA
VSLSAMVTPRLHQSAYAV
>CYP75A35 gi|147777347|emb|CAN62887.1| AM437324.2 Pinot Noir genomic
72% to CYP75A28
CAAP02001548.1 61270-63341 (+) strand, N-term corrected
100% to CAO16871.1 CU459242.1
MAIDTSFFIVSAAATLLFLIVHSFIHFLVS
RRSRKLPPGPKGWPLLGVLPLLKEMPHVALAKMAKKYGPVMLLKMGTSNMVVASNPEAAQAFLKTHE
ANFLNREPGAATSHLVYGCQDMVFTEYGQRWKLLRRLSTLHLLGGKAVEGSSEVRAAELGRVLQTMLEFS
QRGQPVVVPELLTIVMVNIISQTVLSRRLFQSKESKTNSFKEMIVESMVWAGQFNIGDFIPFIAWMDIQG
ILRQMKRVHKKFDKFLTELIEEHQASADERKGKPDFLDIIMANQEDGPPEDRITLTNIKAVLVNLFVAGT
DTSSSTIEWALAEMLKKPSIFQRAHEEMDQVIGRSRRLEESDLPKLPYLRAICKESFRLHPSTPLNLPRV
ASEACEVNGYYIPKNTRVQVNIWAIGRDPDVWENPEDFAPERFLSEKHANIDPRGNDFELIPFGSGRRIC
SGNKMAVIAIEYILATLVHSFDWKLPDGVELNMDEGFGLTLQKAVPLLAMVTPRLELSAYAA
>CYP75A36 CAAP02004490.1 21113-19182 PN40024
MAIDTSLLLKLAAAILLFFITRFFIRSLLPKPSRKLPP
GPRGWPLLGALPLLGNMPHVALAKMAKRYGPVMFLKMGTNCMVVASTPEAAQAFLKTLDI
NVSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRKLSNLHMLGGKALEDWSQVRTVELGH
MLRAMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVFETKGSESNEFKDMVVELMTTA
GYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVM
GHQGNSTGEKLTLTNIKALLL
(0)
NLFTAGTDTSSSVIEWSLAEMLKN
PSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSAQACEV
NGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFLSGRNAKIDPRGNDFELIPFGAGR
RICAGTRMGIVLVEYILGSLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQ
SAYAV*
>CYP75A36 gi|86156244|gb|ABC86840.1| DQ356236.1 Sangiovese genomic
flavonoid 3',5'-hydroxylase 94% to 75A28
CAAP02004490.1 21113-19182, 3 aa diffs, N-term corrected
CAO23870.1 translated from CAAP02004490.1
21113 MAIDTSLLLKLAAAILLFFIT
RFFIRSLLPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPVMFLKMGTNCMVVASTPEAARA
FLKTLDINVSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRKLSNLHMLGGKALEDWSQVRTVELGHMLR
AMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIA
WLDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVMGYQGNSTGEKLTLTNIKALLLNL
FTAGTDTSSSVIEWSLAEMLKNPSILKRVHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPL
NLPRVSAQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFLSGRNAKIDPRGNDFELIPFGA
GRRICAGTRMGIVLVEYILGSLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV*
>CYP75A36 gi|83944630|gb|ABC48919.1| DQ298204.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase 94% to CYP75A28
2 aa diffs to CYP75A36
this seq is called VvF3’5’H-2a in Castellarin et al. BMC Genomics 2006
IGQVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRM
IEEHTASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI
LKRAHEEMDKVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSV
NIWAIGRDPDVWESPEEFRPERFLSGRNE
>CYP75A36 gi|83944632|gb|ABC48920.1| DQ298205.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase 94% to CYP75A28
this seq is called VvF3’5’H-2b in Castellarin et al. BMC Genomics 2006
2 aa diffs to CYP75A36
IGQVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRM
TEEHAASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI
LKRAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSAQACEVNGYYIPKNTRLSV
NIWAIGRDPDVWESPEEFRPERFLSGRNE
>CYP75A37P gi|147794774|emb|CAN60359.1| AM429113.2 Pinot Noir genomic
93% to CYP75A28 wrong N-term
CAO23867
pseudogene
MVQFKSCGTLGQRMRSIHLHPTILHGWGTPNLSVLEVWMVLELFFPNFQLLLVSRSPAMQGLPGEAPGRP
LRWLNGRKKYVYNNNNWRVDVVC
EAARAFLKTLDINFSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRK
LSNLHMLGVKALEDWSRVRTVELGHMLRAMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVFETKGSE
SNEFKDMVVELMTSAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDF
LDVVMGHQGNSTGEKLTLTNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRR
LVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSNEACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEE
FSPERFLSGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAF
GLALQKAVSLSAMVTPRLHQSAYAV
>CYP75A37P CAAP02002140.1c translation = CAO23867 PN40024
CAN60359.1 pseudogene, Pinot Noir genomic
The N-term is missing and is not found in the next 15 kb.
These genes are usually intact in the first exon, not split.
48449 AARAFLKTLDINFSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRKLSNL 48300
48299 HMLGVKALEDWSRVRTVELGHMLRAMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVF 48120
48119 ETKGSESNEFKDMVVELMTSAGYLNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRM 47943
47942 IEEHTASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLL (0) 47814
47492 NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAIC 47313
47312 KESFRKHPSTPLNLPRVSNEACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFSPERF 47133
47132 LSGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGV 46965
46964 EINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV 46866
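
The nucleotide coordinates that flank each translated segment in records like this one can be sanity-checked: an inclusive span should cover three nucleotides per residue shown on the line (a `*` stop counts as one codon). The following is a minimal Python sketch of that check. It is not part of the original annotation workflow, and the function name `span_matches` is hypothetical.

```python
def span_matches(start: int, end: int, peptide: str) -> bool:
    """Return True if the inclusive nucleotide span equals 3 nt per residue.

    Works for either strand because only the absolute distance is used.
    A '*' (stop codon) in the peptide string is counted as one codon.
    """
    residues = peptide.replace(" ", "")
    span_nt = abs(start - end) + 1
    return span_nt == 3 * len(residues)


# Two segments copied from the CAAP02002140.1c record (minus strand).
segments = [
    (48449, 48300, "AARAFLKTLDINFSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRKLSNL"),
    (47942, 47814, "IEEHTASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLL"),
]

for start, end, peptide in segments:
    print(start, end, span_matches(start, end, peptide))
```

A mismatch usually points to a dropped residue, a mistyped coordinate, or a frameshift inside the segment.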
>CYP75A38v1 gi|157331175|emb|CAO63558.1| same seq as CAAP02005443.1 PN40024
next gene is CYP79A29P CAO63559
3 aa diffs to CAN82588.1 AM436340.2a Pinot Noir genomic
7 aa diffs to BAE47007.1, 5 aa diffs to DQ786631.1, 12 aa diffs to CAI54277.1,
9 aa diffs to ABC86841, 11 aa diffs to ABC72066.1,
2 aa diffs to ABC48916 and ABC48917 (partials)
MAIDTSLLLELAAATLLFFITRFFIRSLLLKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV
MFLKMGTNSMVVASTPGAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML
GGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM
VVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMEHLHRKFDWLLTKMMEEHTASAHERKGNPDFLDVIMAN
QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDKVIGRSRRLVESDLP
KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWESPEEFRPERFL
SGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA
VSLSAMVTPRLHQSAYAV
>CYP75A38v2 gi|157332081|emb|CAO68617.1| CU460585.1 PN40024
runs off the end
2 aa diffs to DQ356237.1 Sangiovese genomic
3 aa diffs to DQ786631.1 Cabernet Sauvignon mRNA
STPGAARAFLKTLDINFSNRPPNAGASLLAYHAQD
MVFADYGARWKLLRKLSNLHMLGGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIG
QVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIQHGMKHLHRKFDRLLTKMME
EHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILK
RAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNI
WAIGRDPDVWESPEEFRPERFLSGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFD
WKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV
>CYP75A38v3 gi|147862217|emb|CAN82588.1| AM436340.2a Pinot Noir genomic
97% to 75A28, 11 aa diffs
3 aa diffs to CAAP02005443.1a 8380-10309
MAIDTSLLLELAAATLLFFITRFFIRSLLLKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV
MFLKMGTNSMVVASTPGAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML
GGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM
VVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMAN
QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDKVIGRSRRLVESDLP
KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFL
SGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA
VSLSAMVTPRLHQSAYAV
>CYP75A38v4 gi|111144659|gb|ABH06585.1| translated from DQ786631.1 Cabernet Sauvignon mRNA
flavonoid 3'5' hydroxylase
2 aa diffs to CAN82588.1 Pinot Noir genomic
4 aa diffs to CAAP02007407.1 PN40024
11 aa diffs 97% to 75A28 CAI54277 Shiraz mRNA
MAIDTSLLLELAAATLLFFITRFFIRSLLLKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV
MFLKMGTNSMVVASTPGAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML
GGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFRDM
VVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMAN
QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLP
KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFL
SGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA
VSLSAMVTPRLHQSAYAV
>CYP75A38v5 gi|78183426|dbj|BAE47007.1| AB213606.1 Cabernet Sauvignon genomic
flavonoid 3',5'-hydroxylase 98%
4 aa diffs to CYP75A38v4 and CYP75A38v3
7 aa diffs to 75A28, EST = EE066764
MAIDTSLLLEFAAATLLFFITRFFIRSLLLKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV
MFLKMGTNSMVVASTPGAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML
GGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM
VVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMAN
QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLP
KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFL
SGRNTKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDEVEINMDEAFGLALQKA
VSLSAMVTPRLHQSAYAV
>CYP75A38v6 gi|86156246|gb|ABC86841.1| DQ356237.1 Sangiovese genomic
flavonoid 3',5'-hydroxylase
2 aa diffs to CAAP02008469.1 translation = CAO68617 PN40024
runs off end
8 aa diffs to 75A28
Note: CYP79A29P is nearby.
CAO68617 may be allelic with CAO63558, which also has CYP79A29P nearby.
If not, it is a nearly identical duplication.
RFFIRSLLPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPVMFLKMGTNSMVVASTPEAARA
FLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHMLGGKALEDWSQVRAVELGHMLR
AMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIA
WLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNL
FTAGTDTSSSVIEWSLAEMLKNPSILKRVHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPL
NLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWESPEEFRPERFLSGRNEKIDPRGNDFELIPFGA
GRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV
>CYP75A39Pv1 gi|157332520|emb|CAO70765.1| CU460864.1 same seq as CAAP02008469.1
100% to CYP75A38v4
P450 gene does not continue upstream
ESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPD
FLDVIMANQENSTGEKLTITNIKALLL
NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDL
PKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPNVWESPEEFRPERF
LSGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQK
AVSLSAMVTPRLHQSAYAV
>CYP75A39Pv2 gi|157023020|emb|CAAP02017822.1| PN40024, contig_17822 Length=1454
Identical to CAN82588.1 and AB213606.1 and DQ356237.1
Does not extend, pseudogene fragment
609 ESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTA 430
429 SAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLL 319
>CYP75A40P gi|147819898|emb|CAN60738.1| AM440112.2 Pinot Noir genomic
9 aa diffs to CYP75A38v3, end of the gene is missing
probably pseudogene
94% to CYP75A28
MAIDTSLLLELAAATLLFFITRFFIRSLLPKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV
MFLKMGTNGMVVASTPGAARAFLKTLDINFSNRPLNAGATLLAYRSQDMVFADYGARWKLLRKLSNLHML
GGKALEDWSQVRAVELGHMLRAMLELSQRTEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM
VVELITTAGYFNIGDFIPSIAWLDIQGIQHGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMAN
QEKSTGEKLTITNIKALLLVGTIWHRNLWYNIHVIQHAILYDHCSEYGILQIGIRAFVG
>CYP75A41 gi|157333816|emb|CAO18026.1| PN40024
7 aa diffs to CYP75A34
MAIDTSLLPELAAATLLFFITRFFIRSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV
MFLKMGTNSMVVASTPEAARAFLKTLDINFSNRPPNAGATHLAYGAQDMVFADYGPRWKLLRKLSNLHML
GGKALEDSSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSIANIIGQVILSRRVFETKGSESNEFKDM
VVELMTCAGYFNIGDFIPSIAWMDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVMGH
QENTTGEKLTLSNIKALLQNLFAAGTDTSASIIEWSLAEMLKNPSILKRAQEEMDHVIGRNRRLVESDLP
KLPYLQAICKESLRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWENPEEFRPERFL
SGRNAKIDPRGNDFELIPFGAGRRICAGARMGIVLVEYILGTLVHSFDWKIPDGVEINMDEAFGLALQKA
VSLSAMVTPRLHQSAYAV
>CYP75A41 gi|147862169|emb|CAN82604.1| AM436584.2 Pinot Noir genomic
86% to CYP75A28
CAAP02012125.1 2184-4508, 1 aa diff
CAAP02007036.1 12858-10533 (-) strand, 1 aa diff, no seq gap
MAIDTSLLPELAAATLLFFITRFFIRSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV
MFLKMGTNSMVVASTPEAARAFLKTLDINFSNRPPNAGATHLAYGAQDMVFADYGPRWKLLRKLSNLHML
GGKALEDSSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSIANIIGQVILSRRVFETKGSESNEFKDM
VVELMTCAGYFNIGDFIPSIAWMDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDV
(small seq gap)
NLFAAGTDTSASIIEWSLAEMLKNPSILKRAQEEMDHVIGRNRRLVESDLPKLPYLQAICKESLRKHPSTPLNL
PRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWENPEEFRPERFLSGRNAKIDPRGNDFELIPFGAGR
RICAGARMGIVLVEYXLGTLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV
>CYP75A42 gi|147802021|emb|CAN61852.1| Pinot Noir genomic
91% to CYP75A28
7 aa diffs to 75A41
MAIDTSLLPELAAATLLFFITRFFIRSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV
MFLKMGTXSMVVASTPEAARAFLKTLDINFSNRPPNAGATHLAYGAQDMVFADYGPRWKLLRKLSNLHML
GGKALEDSSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSIANIIGQVILSRRVFETKGSESNEFKDM
VVELMTCAGYFNIGDFIPSIAWMDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVMGH
QENTTGEKLTLSNIKALLQNLFAAGTDTSASIIEWSLAEMLKNPSILKRAXEEMDXVIGRXRRLVESDLP
KLPYLQAICKESXRKHPSTPLNLPRVSNEACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFSPERFL
SGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA
VSLSAMVTPRLHQSAYAV
>CYP75A43 gi|147852187|emb|CAN80142.1| Pinot Noir genomic
70% to CYP75A28, 73% to CYP75A41
MSRSSRRLPPGPRGWPVVGCLPLLGAMPHVALAQLAQKYGAIMYLKLGTCDVVVASKPDSARAFLKTLDL
NFSNRPPNAGATHIAYEAQDFVFADIGPRWNLLRKLTSLHMLGAKSFKDWGAIRGAEIGHMIQAMCELSR
RGEPVVVPEMVSCALANIIGQKSLSRRVFETQGSESNDFKEMVVELMRLAGLFNVGDFIPSIAWMDLQGX
EGKMKLLHNKFDALLTRMIEEHSATAHERLGNPDILDVVMAEQEYSCGVKLSMVNIKALLLNLFIAGTDT
SSGTIEWALAEILKNPTMLKRAHAEMDRVIGKNRLLQESDVPKLPXLEAICKETFRKHPSVPLNIPRVSA
NACEVDGYYIPEDTRLFVNVWAIGRDPAVWENPLEFKPERFLSEKNARISPWGNDFELLPFGAGRRMCAG
IRMGIEVVTYALGTLVHSFDWKLPKGDELNMDEAFGLVLQKAVPLSAMVTPRLHPSAYKAQV
>CYP75A43 CAAP02001252.1 PN40024 genomic
2 aa diffs to CAN80142.1
35518 MVQIDEL
35539 LFTALVFLVTNFFVKRITSMSRSSRRLPPGPRGWPVVGCLPLLGAMPHVALAQLAQKYGA 35718
35719 IMYLKLGTCDVVVASKPDSARAFLKTLDLNFSNRPPNAGATHIAYEAQDFVFADIGPRWN 35898
35899 LLRKLTSLHMLGAKSFKDWGAIRGAEIGHMIQAMCELSRRGEPVVVPEMVSCALANIIGQ 36078
36079 KSLSRRVFETQGSESNDFKEMVVELMRLAGLFNVGDFIPSIAWMDLQGTEGKMKLLHN 36252
36253 KFDALLTRMIEEHSATAHERLGNPDILDVVMAEQEYSGGVKLSMVNIKALLL (0) 36408
NLFIAGTDTSSGTIEWALAEILKNPTMLKRAH 36603
36604 AEMDRVIGKNRLLQESDVPKLPYLEAICKETFRKHPSVPLNIPRVSANACEVDGYYIPED 36783
36784 TRLFVNVWAIGRDPEVWENPLEFKPERFLSEKNARISPWGNDFELLPFGAGRRMCAGIRM 36963
36964 GIEVVTYALGTLVHSFDWKLPKGDELNMDEAFGLVLQKAVPLSAMVTPRLHPSAYKAQV* 37143
CYP75B subfamily (2 genes) [3 pseudogenes] 4 orthologs 2 alleles
21 sequences
>CYP75B32v1 gi|83715794|emb|CAI54278.1| AJ880357.1 Shiraz mRNA
flavonoid-3'-hydroxylase
same as AJ880357
CAAP02002732.1 7596-5384 (-) strand, 1 aa diff
MNPLALIFCTALFCVLLYHFLTRRSVRLPPGLKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV
DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF
RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV
LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD
GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRSRLVTDLDLPQLT
YVQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG
ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR
AAPLMVHPLPRLSPQVFGK
>CYP75B32v1 gi|157342333|emb|CAO64446.1| CU459229.1 PN40024
complement(join(4340871..4341512,4341761..4342213,4342649..4343083))
2 aa diffs to 75B32v1 (CAI54278.1), 6 aa diffs to BAE47006
6 aa diffs to 75B32v3 BAE47005.1
6 aa diffs to DQ786632.2, 5 aa diffs to 75B32v2 CAN68303.1
MNPLALIFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV
DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF
RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV
LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD
GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRSRLVTDLDLPQLT
YVQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG
ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR
AAPLMVHPRPRLSPQVFGK
>CYP75B32v2 gi|147833535|emb|CAN68303.1| Pinot Noir genomic
99%, 5 aa diffs to CYP75B32v1,
2 aa diffs to CAO64446
MNPLALIFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV
DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF
RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV
LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD
GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRXRLVTDLDLPQLT
YXQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEKPLEFRPSRFLPGG
ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR
AAPLMVHPLPRLSPQVFGK
>CYP75B32v3 gi|78183422|dbj|BAE47005.1| AB213604.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase 98%
6 aa diffs to CYP75B32v1
MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV
DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF
RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV
LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD
GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPQLT
YLQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEKPLEFRPSRFLPGG
ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR
AAPLMVHPLPRLSPQVFGK
>CYP75B32v4 gi|111144661|gb|ABH06586.1| translated from DQ786632.2 Cabernet Sauvignon mRNA
flavonoid 3' hydroxylase 99%
8 aa diffs
MNPLALFFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV
DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF
RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV
LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD
GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPQLT
YLQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEKPLEFRPSRFLPGG
ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR
APPLMVHPRPRLSPQVFGK
>CYP75B32v4 gi|78183424|dbj|BAE47006.1| AB213605.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase 98% to CYP75B32
100% to DQ786632.2, 100% to AB213605.1, 4 aa diffs to AB213604
3 aa diffs to CAN68303, 8 aa diffs to 75B32v1
MNPLALFFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV
DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF
RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV
LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD
GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPQLT
YLQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEKPLEFRPSRFLPGG
ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR
APPLMVHPRPRLSPQVFGK
$$$$
>CYP75B38 gi|83944614|gb|ABC48911.1| DQ298196.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase 97% to CYP75B32
this seq is called VvF3’H-1a in Castellarin et al. BMC Genomics 2006
100% to CAN75347
VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLH
ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIKALLLNLFTAGTDTSSSTV
EWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLTYLQAIVKETFRLHPSTPLSLPRMAAESCEI
NGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLG
LRMVHLLTATLVHAFNWELPEGQVAEKLNMDEA
>CYP75B38 gi|83944616|gb|ABC48912.1| DQ298197.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase 97% to CYP75B32
this seq is called VvF3’H-1b in Castellarin et al. BMC Genomics 2006
1 aa diff to CAN75347
VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLH
ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIKALLLNLFTAGTDTSSSTV
EWAIAELIRHPEMMAQAQQEPDAVVGRGRLVTDLDLPKLTYLQAIVKETFRLHPSTPLSLPRMAAESCEI
NGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLG
LRMVHLLTATLVHAFNWELPEGQVAEKLNMDEA
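
Annotations such as "1 aa diff to CAN75347" can be reproduced by a position-by-position comparison of two sequences that are already aligned to the same length. The following is a minimal Python sketch under that assumption (no gap handling, no alignment step); the function names `aa_diffs` and `percent_identity` are hypothetical and not taken from the source. The two fragments are copied from this VvF3’H-1b record and from the CAN75347 record.

```python
def aa_diffs(a: str, b: str) -> list[tuple[int, str, str]]:
    """List (1-based position, residue in a, residue in b) for each mismatch.

    Assumes the two sequences are pre-aligned and of equal length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions that carry the same residue."""
    return 100.0 * (len(a) - len(aa_diffs(a, b))) / len(a)


# Fragment of ABC48912.1 (VvF3'H-1b) and the matching fragment of CAN75347.1.
abc48912 = "EWAIAELIRHPEMMAQAQQEPDAVVGRGR"
can75347 = "EWAIAELIRHPEMMAQAQQELDAVVGRGR"

print(aa_diffs(abc48912, can75347))
print(round(percent_identity(abc48912, can75347), 1))
```

Sequences that contain `X` (unresolved residues) would be counted as mismatches by this sketch; whether to skip them is a judgment call left to the curator.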
>CYP75B38 gi|83944618|gb|ABC48913.1| DQ298198.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase 97% to CYP75B32
this seq is called VvF3’H-1c in Castellarin et al. BMC Genomics 2006
1 aa diff to CAN75347
VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLH
ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIKALLLNLFTAGTDTSSSTV
EWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLTYLQAIVKETFRLHPSTPLSLPRMAAESCEI
NGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLG
LRMVHLLTATLVHAFNWELPEGQVAEKRNMDEA
>CYP75B38 gi|83944620|gb|ABC48914.1| DQ298199.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase 97% to CYP75B32
this seq is called VvF3’H-1d in Castellarin et al. BMC Genomics 2006
100% to CAO64444.1
VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLH
ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIKALLLNLFTAGTDTSSSTV
EWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLTYLQAIVKETFRLHPSTPLSLPRMAAESCEI
NGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLG
LRMVHLLTATLVHAFNWELPEGQVAEKLNMDEA
>CYP75B38 gi|157342331|emb|CAO64444.1| CU459229.1 PN40024
100% to CAN75347.1
1 aa diff to AB213603.1
complement(join(4317456..4318097,4318193..4318645,4319126..4319560))
MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV
DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF
RHIRQEEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV
LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD
GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLT
YLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG
ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR
AAPLMVHPRPRLSPQVFGK
>CYP75B38-de3b CU459229.1 1206 bp upstream of CAO64444
Same as CAAP02002916.1-de3b C-term fragment
4320766 GQVAEKLNMDKAYGLALQ*AAPLMVHPQPRLSPQGFG 4320656
>CYP75B38-de3c CU459229.1 same as CAAP02002916.1-de3c C-term fragment
4309781 GLTLQRAAPLMVHPQPRLSPQGFG 4309707
>CYP75B38 gi|147801850|emb|CAN75347.1| Pinot Noir genomic
97% to CYP75B32
MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV
DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF
RHIRQEEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV
LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD
GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLT
YLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG
ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR
AAPLMVHPRPRLSPQVFGK
>CYP75B38 CAAP02002916.1 100% to CAN75347.1, Pinot Noir genomic
97% to CYP75B32, 1 aa diff to AB213603
45444 MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYG 45265
45264 PLMHLRMGFVDVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRW 45085
45084 RMLRKICSVHLFSGQALDDFRHIRQ 45010
44529 EEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEM 44353
44352 VVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSER 44173
44172 HVDLLSTLISLKDNADGEGGKLTDVEIKALLL 44077
43981 NLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLTYLQAIV 43802
43801 KETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRF 43622
43621 LPGGERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAE 43442
43441 KLNMDEAYGLTLQRAAPLMVHPRPRLSPQVFGK* 43340
>CYP75B38-de3b CAAP02002916.1-de3b C-term fragment
46650 GQVAEKLNMDKAYGLALQ*AAPLMVHPQPRLSPQGFGK* 46534
>CYP75B38-de3c CAAP02002916.1-de3c C-term fragment
35664 GLTLQRAAPLMVHPQPRLSPQGFG 35593
>CYP75B38 gi|78183420|dbj|BAE47004.1| AB213603.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase 96% to CYP75B32
1 aa diff to CAO64444.1
MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV
DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF
RHIRQEEVLALMRALAREGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV
LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD
GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLT
YLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG
ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR
AAPLMVHPRPRLSPQVFGK
$$$$
>CYP75B39P gi|83944622|gb|ABC48915.1| DQ298200.1 Cabernet Sauvignon genomic
truncated flavonoid 3'-hydroxylase, pseudogene
only 2 aa diffs to BAE47003.1
this seq is called VvF3’H-2 in Castellarin et al. BMC Genomics 2006
VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGIASKMKKLH
ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIEALLL
(0)
NLFTAGTDTSSSTV
EWAIAELIRHPEMMAQA
& (23 bp deletion)
GRGRLVTDLDLPQLTYLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIP
KNATLLVNVWAIARDPEVWEKPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAG
MSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDE
>CYP75B39P gi|78183418|dbj|BAE47003.1| AB213602.1 Cabernet Sauvignon genomic
flavonoid 3'-hydroxylase pseudogene
96% to CYP75B32
same deletion as DQ298200.1, 2 aa diffs
MNPLALSFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV
DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF
RHIRQ
()
EEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV
LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD
GEGGKLTDVEIKALLL
(0) 1396
NLFTAGTDTSSSTVEWAIAELIRHPEMMAQA
1584 & (23 bp deletion)
1586 GRGRLVTDLDLPQLTYLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVN 1762
1763 VWAIARDPEVWEKPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMV 1942
1943 HLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQRAAPLMVHPRPRLSPQVFGK 2107
>CYP75B39P gi|147825152|emb|CAN62275.1| AM488740.1 Pinot Noir genomic
96% to CYP75B32
1 aa diff to CAN75347
plus deletion same as in AB213602 and DQ298200
MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV
DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF
RHIRQEEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV
LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD
GEGGKLTDVEIKALLL
(0)
NLFTAGTDTSSSTVEWAIAELIRHPEMMAQA
& (23 bp deletion)
PGRGRLVTDLDLPQLTYLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKN
ATLLVNVWAIARDPEVWEKPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLT
ATLVHAFNWELPEGQVAEKLNMDEAYGLTLQRAAPLMVHPRPRLSPQVFGK
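
The 23 bp deletion marked with `&` in the CYP75B39P records is not a multiple of three, so it would be expected to shift the reading frame of the downstream coding sequence, which is consistent with the pseudogene call. A minimal Python sketch of that arithmetic follows; the function names are hypothetical, and the check assumes the indel lies inside a coding exon.

```python
def frame_offset(indel_bp: int) -> int:
    """Number of nucleotides by which an indel displaces the reading frame."""
    return indel_bp % 3


def disrupts_frame(indel_bp: int) -> bool:
    """An indel keeps the frame only when its length is a multiple of 3."""
    return frame_offset(indel_bp) != 0


# The deletion annotated in AB213602.1, DQ298200.1 and AM488740.1.
print(disrupts_frame(23), frame_offset(23))  # 23 = 7 * 3 + 2
```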
CYP76 family (24 genes) [23 pseudogenes]
CYP76A subfamily (5 genes) [3 pseudogenes]
>CYP76A10 CAAP02005158.1b 66% to CAN77399.1
23434 MEWTTNFLVWLIIPFLSALLLLLHRLKSGFNKHLPPGPPGWPIFGNIFDLGTLPHQKLAG 23255
23254 LRDTYGDVVWLNLGYIGTMVVQSSKAAAELFKNHDLSFSDRSIHETMRVHQYNESSLSLA 23075
23074 PYGPYWRSLRRLVTVDMLTMKRINETVPIRRKCVDDLLLWIEEEARGMDGTATGLELGRF 22895
22894 FFLATFNMIGNLMLSRDLLDPQSRKGSEFFTAMRISMESSGHTNFADFFPWLKWLDPQGL 22715
22714 KKRMEVDLGKSIEIASGFVKERMRQGRAEESKRKDFLDVLLEFQGDGKDEATKISEKGIN 22535
22534 IFIT (0) 22523
22430 EMFMAASETTSSTMEWAMTELLRSPESMTKVKAELGRVIGEKRKLEESDLDDLPYLH 22260
22259 AVVKETLRLHPAAPFLVPRRAVEDTKFMGYHIPKGTQVFVNVWAIGREAETWDDALCFKP 22080
22079 ERFVDSNMDYKGQNFEFIPFGAGRRICVGIPLAYRVLHFVLGSLLHHFDWQLERNVTPE 21903
21902 TMDMKERRGIVICKFHPLKAVPKIKPIST* 21813
>CYP76A12 gi|147791648|emb|CAN77399.1| AM476034.2 (871-3002)
44% to 76G1
CAAP02005158.1a 16011-13907 (-) strand 100% match
MVDWASNILLWCIILVIPVLFLLLHRRRSGSVRLPPGPPGWPVFGNMFDLGAMPHETLAGLRHKYGDVVW
LNLGAIKTTVVQSSKAAAELFKNQDLCFSDRTITETMRAQGYHESSLALAPYGPHWRSLRRLMTMEMLVT
KRINETAGVRRKCVDDMLSWIEEEARGVGGEGRGIQVAHFVFLASFNMLGNLMLSCDLLHPGSKEGSEFF
EVMVRVMEWPGHPNSADFFPWLRWMDPQGLRKKAERDLGIAMKIASGFVQERIKRGPAAEDHKKDFLDVL
LDFQGSGKNEPPQISDKDLNIIILEIFMAGSETTSSTVEWALTELLRHPECMAKVKAELGRVVGASGKLE
ERHIDDLQYLQAVVKETFRLHPPIPFLVPRKAVRDTNFMGYHIPKNTQLFVNVWAIGREAELWEEPSSFK
PERFLDLNHIDYKGQHFZLIPFGAGRRMCAGVPLAHRMVHLVLGSLVYHFDWQLDSSITLETMDMRENLA
MVMRKLEPLKALPKKVSL
>CYP76A13 gi|147791649|emb|CAN77400.1| AM476034.2 (10417-12389)
45% to 76G1
CAAP02005373.1b 19219-17303 (-) strand, 2 aa diffs
Missing N-term seq
Nearly identical to adjacent gene CAN77399.1, 3 aa diffs
MXDWASNILLWCIILVIPVLFLLLXXRRSGSVRLPPGPPG
WPVFGNMFDLGA
MPHETLAGLRHKYGDVVWLNLGAIKTTVVQSSKAAAELFKNQDLCFSDRTITETMRAQGYHESSLALAPY
GPHWRSLRRLMTMEMLVTKRINETAGVRRKCVDDMLSWIEEEARGVGGEGRGIQVAHFVFLASFNMLGNL
MLSCDLLHPGSKEGSEFFEVMVRVMEWSGHPNFADFFPWLRWMDPQGLRKKAERDLGIAMKIASGFVQER
IKRGPAAEDHKKDFLDVLLDFQGSGKNEPPQISDKDLNIIILEIFMAGSETTSSTVEWALTELLRHPECM
AKVKAELGRVVGANGKLEERHIDDLQYLQAVVKETFRLHPPIPFLVPRKAVRDTNFMGYHIPKNTQLFVN
VWAIGREAELWEEPSSFKPERFLDLNHIDYKGQHFELIPFGAGRRMCAGVPLAHRMVHLVLGSLVYHFDW
QLDSSITLETMDMRENLAMVMRKLEPLKALPKKVSL*
>CYP76A14P CAAP02005373.1b pseudogene 64% to CAN77399.1
5191 MERASNFLLYLIVISSSAMSFMLCRRKSGFNRLPPRPIGWPILSNMLDLGTMLHQTLAGLRHK 5003
5002 YGDVVWLRLGAIKTMVILSSKAAGELFKNHDLSFADRSIGETMRVHEYNEGSLALVPYGP 4823
3499 LTTDMFTVRRINETANVRRKCVDDMLLWIEKEALGVNGEASSVHVAEAVFLSNMLGNL 3326
3325 MLSRDVLDLRSEEGSEFFTIMSNLTEWSGHPNLADFFPWLGWLNL*GLRK 3176
2669 KSQQRDLGKAMEMASGFVNERMKKQRTEGTKRKDFLDVLLEFEGNGRDEPAKTSDRDVNI 2490
2489 FIL (0) 2481
2336 EIFMAGSETSSSIVEWVMTELLRNPKSMSKVKDELARVVGADRNVEESDIDELQYLQ 2166
2165 AVVKETLRLHPPIPFLIPRSAIQDTSFMGYHIPKDTQVLVNAWAIGRDPGS*EDPSSFKP 1986
1985 ERFLDSKKIDYKGQNFE 1935
752 LIPFGAGRRICAGIPLAHRVLHLVLGTLLHHFDWQLEGNVTPETMDMKEKWGLVMLESQP 573
572 LKAVPKKLT* 543
>CYP76A15 gi|147774514|emb|CAN76783.1| 42% to 76G1
CAAP02013124.1 = CAN76783.1
CAAP02000672.1b 113501-112101 added missing parts
113663 MELSTASIVFWSCFFSAALLLFLRLIKFTKGSTKSTPPGPQGWPIFGNIFDLGT
1403 LPHQTLYRLRPQHGPVLWLQLGAINTMVVQSAKAAAELFKNHDLSFSDRNVPFTLTAHNY 1224
1223 DQGSMALGKYGPYWRMIRKVCASELLVNKRINEMGSLRRKCVDDMIRWIEEDAAKSGAEG 1044
1043 RAGEVELPHFLFCMAFNLIGNITLSRDVVDIKSKDGHEFFQAMNGVVEWAGRPNIADFFP 864
863 LLKRLDPLGMMRNMVRDMGQALNLIARFVKERDEERQSGMVREKRDFLDVLLECRDDEKE 684
683 GPHEMSDNKVKIIVL (0) 639
413 EMFFAGSDTTSSTLEWAMTELLRRPESMRKAQEELDRVVGPHGKVEESDIDQLLYLQ 243
242 AVVKETLRLHPPIPLLLPRNALQDTNFMGYFVPKNTQVFVNAWAIGRDPDAWKEPLSFKP 63
62 DRFLGSNLDYKGQNFEFIPF 3
GSGRRICIGISLANKLLPLALASLLHCFDWELGGGVTPET 111981
111980 MDMNERVGITVRKLIPLKPIPKRRTV* 111900
>CYP76A15-de2b CAAP02000672.1b-de2b pseudogene, C-term
111539 KEERFIDSDKQ*GDGFVLMASLAGIPSTLAHKVMHLVLLGLLLHRFDWDLEWDIFPK 111369
>CYP76A16 gi|147774515|emb|CAN76784.1| 49% to 76G6, 44% to 76G1
CAAP02000672.1a 105702-104085 (-) strand 100% match
MSSLLWWSAFFSAALLVLLRRIKPRKGSTKLRPPGPQGWPILGNIFDLGTMPHQTLYRLRSQYGPVLWLQ
LGAINTVVIQSAKVAAELFKNHDLPFSDRKVPCALTALNYNQGSMAMSNYGTYWRTLRKVCSSELLVIKR
INEMAPLRHKCVDRMIQWIEDDATMARVQGGSGEVEVSHLVFCVAFNLIANLMLSRDFFDMKPKEGNEFY
BAMNKIMELAGKPNTADFFPFLKWLDPQGIKRNMVRELGRAMDIIAGFVKERVEERQTGIEKEKRDFLDV
LLEYRRDGKEGSEKLSERNMNIIILEMFFGGTETTSSTIEWAMTELLRKPKSMRKVKEELDRVVGPDRKV
EESDIDELLYLQAVVKETLRLHPALPLLIPRNALQDTNFMGYFIPQNTQVFVNAWSIGRDPEAWHKPLSF
KPRRFLGSDIDYKGQNFELIPFGSGRRMCIGMPFAHKVVPFVLASLLHCFDWELGSNLTPETIDMNERVG
LTLRKLVPLKAIPRKRIVRDR
>CYP76A17P CAAP02000672.1c pseudogene 93% to CAAP02005373.1b
117421 DLRSKEGSEFFTIMSNLTEWSGHPNLSDFFPWLGWLDLQGLRKNMERDLGKAMEMASGF 117245
117244 VNERMKKQRTEGTKRKDFLDVLLEFEGNGKDEPAKISDRDVIIFIL (0) 117107
117000 EIFLAGSETSSSIVEWAMTELLRNPKSMSEVKDELARVVGADRNVEESDIDELQYLQAVV 116821
116820 KETLRLHPPIPFLILRSAIQDTSFMGYHIPKDTQVLVNARAIGRDPGSWEDPSSFKPERF 116641
116640 LDSKKIEYKGQNFELIPFGAGRRICAGIPLAHRVLHLVLGTLLHHFDWQLKGNVTPETMD 116461
116460 MKEKWGLVMRKSQPLKAVPKKLT* 116389
CYP76F subfamily (5 genes) [2 pseudogenes]
>CYP76F2 gi|7406712|emb|CAB85635.1| putative ripening-related P-450 enzyme
same as AJ237995
MELLSCLLCFLAAWTSIYIMFSARRGRKHAAHKLPPGPVPLPIIGSLLNLGNRPHESLANLAKTYGPIMT
LKLGYVTTIVISSAPMAKEVLQKQDLSFCNRSIPDAIRAAKHNQLSMAWIPVSTTWRALRRTCNSHLFTS
QKLDSNTHLRHQKVQELLANVEQSCQAGGPVDIGQEAFRTSLNLLSNTIFSVDLVDPISETAQEFKELVR
GVMEEAGKPNLVYYFPVLRQIDPQGIRRRLTIYFGRMIEIFDRMIKQRLQLRKIQGSIASSDVLDVLLNI
SEDNSNEIERSHMEHLLLDLFAAGTDTTSSTLEWAMAELLHNPETLLKARMELLQTIGQDKQVKESDISR
LPYLQAVVKETFRLHPAVPFLLPRRVEGDADIDGFAVPKNAQVLVNAWAIGRDPNTWENPNSFVPERFLG
LDMDVKGQNFELIPFGAGRRICPGLPLAIRMVHLMLASLIHSYDWKLEDGVTPENMNMEERYGISLQKAQ
PLQALPVRV
>CYP76F10P CAAP02001054.1 75% to 76F2
34678 MDLMSYLLCLLVAWTSIYIVVSARRSKSGAGKLPPGPVPFPIIGNLLNLGNKPH 34517
34516 ESLANLAKIYGPVMSLKLGCVTTVVITSATMAKEVLQKKDQSFCNRTIPDALRALNHNQI 34337
34336 SMVWLPVSTKWRTLRKICNSHIFTNQKLDSSNYLRHQKVQDLLANVEQSCQAGDVVDIGQ 34157
34156 EAFRTTLNLLSNTTFSVDLVEPSSDTVQEFKELVRHMMEEAAKPNLADYFPVVRKIDPQG 33977
33976 IRRRMAIHFGKMIKVLDKKVKQRLRSRQVQGWMASSDVLDTLLNISEDSNNFLDITHIDH 33797
33796 LLL (0)
32324 DLFVAGTDTTANTLEWAMAELLHNPETLLRVQAELRQTIGKDKLVKESDIARLPYLQA 32151
32150 VVKETFRLHPAVPFLLPRKVEVDTEMCGFIVPKDAQVLVNVWAIGRDPNLWENPNLFMPE 31971
31970 RFLGSDMDVRGQNFELIPFGAGRRICPGLLLGIRMVQLMLASLIHSNDWKLEDGLTPENM 31791
31790 NMEEKFGFTLQKAQPLRVLPIHV 31722
>CYP76F10P gi|147816105|emb|CAN66326.1| AM481161.2
60% to 76C1
2 aa diffs to 76F10P
4841 MDLMSYLLCLLVAWTSIYIXVSARRSKSGAGKLPPGPVPFPIIGNLLNLGNKPHESLANL 4662
4661 AKIYGPVMSLKLGCVTTVVITSATMAKEVLQKKDQSFCNRTIPDALRALNHNQISMVWLP 4482
4481 VSTKWRTLRKICNSHIFTNQKLDSSNYLRHQKVQDLLANVEQSCQAGDVVDIGQEAFRTT 4302
4301 LNLLSNTTFSVDLVEPSSDTVQEFKELVRHMMXEAAKPNLADYFPVVRKIDPQGIRRRMA 4122
4121 IHFGKMIKVLDKKVKQRLRSRQVQGWMASSDVLDTLLNISEDSNNFLDITHIDHLLL 3951
2471 DLFVAGTDTTSNTLEWAMAELLHNPETLLRVQAELRQTIGKDKLVKESDIARLPYLQA 2298
2297 VVKETFRLHPAVPFLLPRKVEVDTEMCGFIVPKDAQVLVNVWAIGRDPNLWENPNLFMPE 2118
2117 RFLGSDMDVRGQNFELIPFGAGRRICPGLLLGIRMVQLMLASLIHSYDWKLEDGLTPENM 1938
1937 NMEEKFGFTLQKAQPLRVLPIHV 1869
>CYP76F11 CAAP02001054.1 pseudogene 71% to 76F2
10316 MDLFSCLLCLLVAWASIYIVVSARRRKSGAGKLPPGPVPFPIIGNLLNLGNKPH 10155
10154 ESLANLAKIHGPVMTLELGCVTTVVITSATMAKEVLQKKDQSFCNRTIPDALRAHNHNQL 9975
9974 SVVWLPASTKWRTLRK 9927
NSHIFTSQKLDSNAHL (this seq is inverted)
9885 NCQAGDVVDIGLEAFRTTLNLLSNTIFSVDLVEPSSDTVQEFKELVRNMMEEAAKPN 9715
(gap)
9713 MAIHFGNMIEVFDKMVKQRLRSRQVQGWMASSDVLHILLTISEDSNNVLDITNIDHLLL 9537
9389 DLFAAGTDTTTNTLEWAMA 9333
9333 KLLHKPETLRRVQVELLQTIGKDKLVKESDIAQLPYLQAVVKETFRLHPAVPLLLPRKAD 9154
9153 VDTDICGFIVPKDAQVLVNVWAIGRDPNLWENPNSFMPERFLGSDMDVRGQNFELIPFGA 8974
8973 GRRICPGIRMIHLMLASLLHSYDWKLEDGVTPENMNMEEKFGVTLQNAQPLRALPT 8806
8805 LV* 8797
>CYP76F11 gi|147802689|emb|CAN72997.1| AM480526.2 60% to 76C1
3 aa diffs to 76F11
11345 MDLFSCLLCLLVAWASIYIVVSARRRKSGAGKLPPGPVPFPIIGNLLNLGNKPHESLANL 11166
11165 AKIHGPVMTLELGCVTTVVITSATMAKEVLQKKDQSFCNRTIPDALRAHNHNQLSVVWLP 10986
10985 ASTKWRTLRK 10956
10911 NSHIXTSQKLDSNAHL 10958 (this seq is inverted)
10914 NCQAGDVVDIGLEAFRTTLNLLSNTIFSVDLVEPSSDTVQEFXELVRNMMEEAAKPN 10744
(gap)
10742 MAIHFGNMIEVFDKMVKQRLRSRQVQGWMASSDVLHILLTISEDSNNVLDITNIDHLLL 10566
10415 DLFAAGTDTTTNTLEWAMA 10359
10359 KLLHKPETLRRVQVELLQTIGKDKLVKESDIAQLPYLQAVVKETFRLHPAVPLLLPRKAE 10180
10179 VDTDICGFIVPKDAQVLVNVWAIGRDPNLWENPNSFMPERFLGSDMDVRGQNFELIPF 10006
XAGRRICPGIRMIHLMLASLLHSYDWKLEDGVTPENMNMEEKFGVTLQKAQPLRALPTLV 9826
>CYP76F12 CAAP02002347.1d = CAN79423.1 86% to CYP76F2
CAN79423.1 has 2 aa diffs and small deletion
24772 MEMLSCLLCFLVAWTSIYIMFSVRRGSQHTAYKLPPGPVPLPIIGNLLNLGNRPHESLAE 24951
24952 LAKTYGPIMTLKLGYVTTIVISSAPMAKEVLQKQDLSFCNRFVPDAIRATNHNQLSMAWM 25131
25132 PVSTTWRVLRKICNSHLFTTQKLDSNTHLRHHKVQELLAKVEESRQAGDAVYIGREAFRT 25311
25312 SLNLLSNTIFSVDLVDPISETVLEFQELVRCIIEEIERPNLVDYFPVLRKIDPQGIRRRL 25491
25492 TIYFGKMIGIFDRMIKQRLQLRKMQGSIATSDVLDTLLNISEDNSNEIERNHMEHLLL (0) 25677
DLFVAGTDTTSSTLEWAMAE 25851
25852 LLHNPEKLLKARVELLQTIGKDKQVKESDITRLPFLQAVVKETFRLHPVVPFLIPHRVEE 26031
26032 DTDIDGLTVPKNAQVLVNAWAIGRDPNIWENPNSFVPERFLELDMDVKGQNFELIPFGAG 26211
26212 RRICPGLPLATRMVHLMLASLIHSCDWKLEDGMTPENMNMEDRFGITLQKAQPLKAIPIRV* 26397
>CYP76F13P CAAP02002347.1c pseudogene missing N-term and insertion in exon 2
93% to 76F2 CAB85635.1
11894 KVQELLANVEQRCQAGGPVDIGREAFRTSLNLLSNAIFSVDLVDPISETAQEFKELVRGV 11715
11714 MEEAGKPNLVDYFPVLRQIDPQGIRRGLTIYFGRMIEIFDRMIKRRLRLRKMQGSIASSD 11535
11534 VLDILLNISEDNSNEIERSHMEHLLL (0)
DLFVAGTDTTSSTLEWAM 11250
10487 AELLYNPEKLLKARMELLQTIGQDKQVKESDITRLPYVQAVVKETFRLHPAVPFLL 10320
10319 PRRVEEDTDIQGFTVPKNAQVLVNAWAIGRDPNTWENPNSFVPERFLGLDMDVKGQNFEL 10140
10139 IPFGAGRRICPGLPLAIRMVHLMLASLIHSYDWKLEDGVTPENMNMEESFGLSLQKAQPL 9960
9959 QALPVRV 9939
>CYP76F14 CAAP02002347.1b 98% to 76F2 CAB85635.1 7 aa diffs
6638