Vitis vinifera cytochrome P450s

 

 

This file includes 222 sequences found in GenPept by searching for

Vitis[orgn] AND P450 on Sept. 20, 2007.  These start with CAN.

Note: on Oct. 4, the same search found 642 accessions.  These include

416 sequences from the other grape genome project starting with CAO.


 

These automated assemblies have not been checked against known

P450s for errors in assembly, gene fusions, etc.

 

262 accessions from the grape genome project in the WGS section have been

mined for P450s and they have been assembled and sorted into family groups. 

(see bottom of file for a complete list of the CAAP accessions)

 

591 sequences are present below, but some are duplicates. Gene sequences are being clustered into identical or presumed identical gene bins indicated with a # sign

followed by a number.  Pseudogenes are being labeled in a similar way with an

@ sign followed by a number.  (in progress)
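As a rough illustration of the bin labels described above, the following Python sketch pairs each `#n` (gene bin) or `@n` (pseudogene bin) marker with the first `>` header that follows it. The function name `bins_from_lines` is hypothetical, and the rule that a marker applies only to the next header is an assumption made for this sketch; the file itself does not state how far a marker's scope extends.

```python
import re

# A bin marker is a line holding only '#' or '@' followed by a number.
BIN_MARKER = re.compile(r"^([#@])(\d+)\s*$")

def bins_from_lines(lines):
    """Map each bin marker (e.g. '#1', '@1') to the sequence names it labels.

    Assumption: a marker labels only the first '>' header after it.
    """
    bins = {}
    pending = None
    for raw in lines:
        line = raw.strip()
        m = BIN_MARKER.match(line)
        if m:
            pending = m.group(1) + m.group(2)
        elif line.startswith(">") and pending is not None:
            # The sequence name is the first token after '>'.
            name = line[1:].split()[0]
            bins.setdefault(pending, []).append(name)
            pending = None
    return bins

sample = [
    "#1",
    "",
    ">CYP51G6 CAAP02000072.1 81% to 51G1 Arab.",
    "",
    "@1",
    "",
    ">CYP51G7P pseudogene CAAP02006913.1",
]
print(bins_from_lines(sample))
```

Keys beginning with `#` are gene bins and keys beginning with `@` are pseudogene bins, so the two kinds can be counted separately.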

Oct. 4, 2007, revised Nov. 14, 2007

 

For an older file of P450s from grape see http://drnelson.utmem.edu/vitis.old.htm

 

P450 sequences in CYP family order

 

 

Table of 49 P450 families present

CYP83-like sequences here merged with CYP71AT

(missing 91 (merged with CYP81), 95 (part of the CYP72 family), 99 (grass specific),

702 (Brassicales only), 705 (part of CYP712), 708 (Brassicales only), 713 (merged with CYP71A),

717 (merged with CYP81), 719 (Ranunculales), 723 (grass specific), 725 (Taxus only overlaps 716),

726 (part of CYP71, Euphorbia), 729, 730 (protist contaminant),

731 (protist contaminant), 732 (protist contaminant))

 

The only missing families that appear lost in Vitis are CYP729 and CYP749

 

CYP51   2 genes,  1 pseudogene

CYP71  24 genes, 28 pseudogenes   52 sequences

CYP72  22 genes, 21 pseudogenes   43 sequences

CYP73   3 genes,  0 pseudogenes

CYP74   7 genes,  0 pseudogenes

CYP75  11 genes,  8 pseudogenes

CYP76  24 genes, 23 pseudogenes   47 sequences

CYP77   2 genes,  0 pseudogenes

CYP78   7 genes,  0 pseudogenes

CYP79   9 genes, 13 pseudogenes    4 alleles/duplicates 26 sequences

CYP80   6 genes,  0 pseudogenes

CYP81  21 genes, 14 pseudogenes   35 sequences

CYP82  34 genes, 37 pseudogenes   18 alleles/duplicates 89 sequences

CYP84   3 genes,  0 pseudogenes

CYP85   2 genes,  1 pseudogene

CYP86   6 genes,  1 pseudogene

CYP87   7 genes,  0 pseudogenes

CYP88   2 genes,  0 pseudogenes

CYP89  14 genes, 11 pseudogenes   25 sequences

CYP90   4 genes,  1 pseudogene

CYP92   6 genes,  1 pseudogene

CYP93   4 genes,  0 pseudogenes

CYP94   9 genes,  0 pseudogenes

CYP96   5 genes,  2 pseudogenes

CYP97   3 genes,  0 pseudogenes

CYP98   1 gene,   0 pseudogenes

CYP701  1 gene,   0 pseudogenes

CYP703  1 gene,   0 pseudogenes

CYP704  6 genes,  0 pseudogenes

CYP706  9 genes,  7 pseudogenes

CYP707  5 genes,  0 pseudogenes

CYP709  1 gene,   0 pseudogenes

CYP710  1 gene,   1 pseudogene

CYP711  1 gene,   1 pseudogene

CYP712  2 genes,  2 pseudogenes

CYP714  6 genes, 11 pseudogenes    1 allele 18 sequences

CYP715  1 gene,   0 pseudogenes

CYP716 15 genes,  7 pseudogenes   22 sequences

CYP718  0 genes,  1 pseudogene

CYP720  1 gene,   0 pseudogenes

CYP721  5 genes,  3 pseudogenes

CYP722  1 gene,   0 pseudogenes

CYP724  2 genes,  0 pseudogenes

CYP727  1 gene,   0 pseudogenes

CYP728  6 genes,  2 pseudogenes

CYP733  1 gene,   0 pseudogenes

CYP734  2 genes,  0 pseudogenes

CYP735  1 gene,   0 pseudogenes

CYP736  8 genes,  4 pseudogenes

 

Totals 315   +  201        +           23  =   539

315 named genes, 201 named pseudogenes, 23 alleles/duplicates = 539 named sequences.
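The totals can be rechecked mechanically. The sketch below parses a condensed copy of the family table (per-row sequence subtotals omitted) and sums the gene, pseudogene and allele/duplicate columns. The names `TABLE`, `ROW` and `tally` are illustrative only.

```python
import re

# Condensed copy of the family table; only the counts that enter the totals are kept.
TABLE = """
CYP51 2 genes, 1 pseudogene
CYP71 24 genes, 28 pseudogenes
CYP72 22 genes, 21 pseudogenes
CYP73 3 genes, 0 pseudogenes
CYP74 7 genes, 0 pseudogenes
CYP75 11 genes, 8 pseudogenes
CYP76 24 genes, 23 pseudogenes
CYP77 2 genes, 0 pseudogenes
CYP78 7 genes, 0 pseudogenes
CYP79 9 genes, 13 pseudogenes 4 alleles/duplicates
CYP80 6 genes, 0 pseudogenes
CYP81 21 genes, 14 pseudogenes
CYP82 34 genes, 37 pseudogenes 18 alleles/duplicates
CYP84 3 genes, 0 pseudogenes
CYP85 2 genes, 1 pseudogene
CYP86 6 genes, 1 pseudogene
CYP87 7 genes, 0 pseudogenes
CYP88 2 genes, 0 pseudogenes
CYP89 14 genes, 11 pseudogenes
CYP90 4 genes, 1 pseudogene
CYP92 6 genes, 1 pseudogene
CYP93 4 genes, 0 pseudogenes
CYP94 9 genes, 0 pseudogenes
CYP96 5 genes, 2 pseudogenes
CYP97 3 genes, 0 pseudogenes
CYP98 1 gene, 0 pseudogenes
CYP701 1 gene, 0 pseudogenes
CYP703 1 gene, 0 pseudogenes
CYP704 6 genes, 0 pseudogenes
CYP706 9 genes, 7 pseudogenes
CYP707 5 genes, 0 pseudogenes
CYP709 1 gene, 0 pseudogenes
CYP710 1 gene, 1 pseudogene
CYP711 1 gene, 1 pseudogene
CYP712 2 genes, 2 pseudogenes
CYP714 6 genes, 11 pseudogenes 1 allele
CYP715 1 gene, 0 pseudogenes
CYP716 15 genes, 7 pseudogenes
CYP718 0 genes, 1 pseudogene
CYP720 1 gene, 0 pseudogenes
CYP721 5 genes, 3 pseudogenes
CYP722 1 gene, 0 pseudogenes
CYP724 2 genes, 0 pseudogenes
CYP727 1 gene, 0 pseudogenes
CYP728 6 genes, 2 pseudogenes
CYP733 1 gene, 0 pseudogenes
CYP734 2 genes, 0 pseudogenes
CYP735 1 gene, 0 pseudogenes
CYP736 8 genes, 4 pseudogenes
"""

# family, gene count, pseudogene count, optional allele/duplicate count
ROW = re.compile(
    r"^(CYP\d+)\s+(\d+) genes?,\s+(\d+) pseudogenes?"
    r"(?:\s+(\d+) alleles?(?:/duplicates)?)?"
)

def tally(table_text):
    """Return (families, genes, pseudogenes, alleles) summed over all rows."""
    families = genes = pseudo = alleles = 0
    for line in table_text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue
        families += 1
        genes += int(m.group(2))
        pseudo += int(m.group(3))
        alleles += int(m.group(4) or 0)
    return families, genes, pseudo, alleles

families, genes, pseudo, alleles = tally(TABLE)
print(families, genes, pseudo, alleles, genes + pseudo + alleles)
# prints: 49 315 201 23 539
```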

 

#1

>CYP51G6 CAAP02000072.1 81% to 51G1 Arab.

190429  MDVDNKFFNVALLIVATVVVAKLISALLIPKSRKRLPPTVKAFPVIGGLLRFLKGPVV  190256

190255  MLREEYPKLGSVFTLNLLNKNITFFIGPEVSAHFFKAPEADLSQQEVYQFNVPTFGPGVV  190076

190075  FDVDYSVRQEQFRFFTESLRVTKLKGYVDQMVTETE (0) 189968

188248  DYFSKWGDSGEVDLKYELEHLIILTASRCLLGQEVRDKLFADVSALFHDLDNGMLPISV  188072

188071  IFPYLPIPAHRRRDQARTKLAHIFANIIASRRETGKSENDMLQCFMDSKYKDGRQTTEAE  187892

187891  VTGLLIAALFAGQHTSSITSTWTGAYLFRHKEFLSAVLDEQKNLMKKHGNKVDHDILSEM  187712

187711  DVLYRCIKEALRLHPPLIMLLRSSHSDFSVTTKDGKEYDIPKGHIVATSPAFANRLPHIY  187532

187531  KDPERYDPDRFAVGREEDKVAGAFSYISFGGGRHGCLGEPFAYLQIKAIWSHLLRNFEFE  187352

187351  LISPFPEIDWNAMVVGVKGKVMVRYKRRVLPVD*  187250
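In coordinate-annotated entries like the one above, each line gives a start coordinate, a stretch of translated peptide, and an end coordinate on the contig. Because each residue (and a terminal `*`) corresponds to three nucleotides, the coordinate span should equal three times the peptide length. The Python sketch below checks that for a single line. The function name `check_segment` is hypothetical. The regular expression skips the parenthesised digit, such as `(0)`, that appears at exon ends; it is treated here as an intron-boundary marker, which is an assumption about this file's notation.

```python
import re

# start coordinate, peptide (letters or '*'), optional "(digit)" marker, end coordinate
SEGMENT = re.compile(r"^\s*(\d+)\s+([A-Z*]+)\s*(?:\(\d\))?\s*(\d+)\s*$")

def check_segment(line):
    """Return (residues, nucleotides, consistent) for one coordinate line.

    Returns None when the line is not in 'start PEPTIDE end' form.
    """
    m = SEGMENT.match(line)
    if not m:
        return None
    start, peptide, end = int(m.group(1)), m.group(2), int(m.group(3))
    # abs() covers both (+) strand (ascending) and (-) strand (descending) lines.
    span = abs(end - start) + 1
    return len(peptide), span, span == 3 * len(peptide)

print(check_segment(
    "190429  MDVDNKFFNVALLIVATVVVAKLISALLIPKSRKRLPPTVKAFPVIGGLLRFLKGPVV  190256"
))
# prints: (58, 174, True)
```

A line for which the last element is `False` may contain a coordinate typo or a frameshift and is worth inspecting by hand.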

 

#2

>CYP51G CAAP02000381.1 = AM475390.2, 81% to 51G1 Arab. 90% to CAAP02000072.1

97293   MDVDNKFFNAAFLLVATLVVAKLISALIIPRSKKRLPPTIKAFPLIGGLIRFLKGP  97460

97461   VVMLREEYPKLGSVFTLKLLNKNISFFIGPDVSAHFFKAPESDLSQQEVYRFNVPIFGPG  97640

97641   VVFDVDYSVRQEQFRFFTEALRVTKLKGYVDQMVMEAE (0) 97754

104754  DYFSKWGDCGEVDLKYELEHLIILTASRCLLGQEIRNKLFADVSALFHDLDNGMLPISV  104930

104931  IFPYLPIPAHRRRDQARKKLAEIFANIIASRKETGKSENDMLQCFIASKYKDGRPTTESE  105110

105111  VTGLLIAALFAGQHTSSITSTWTGAYLLRHKEYLSAVQDEQRSLMKKYGSKVDHDILSEM  105290

105291  DVLYRCIKEALRLHPPLIMLLRSSHTDFSVTTRDGKEYDIPKGHIVATSPAFANRLPHIY  105470

105471  KDPDRYDPDRFAVGREEDKAAGAFSYISFGGGRHGCLGEPFAYLQIKAIWSHLLRNFELE  105650

105651  LISPFPEVDWNAMVVGVKGKVMVRYKRRELPVN*  105752

 

>CYP51G1 AM475390.2 Vitis vinifera (Pinot noir grape) = CAAP02000381.1

9521  MDVXXKFFNAXFLLVATLLVAKLISALIIPRSKKRLPPTIKAFPLIGGLIRFLKGPVVML  9342

9341  REEYPKLGSVFTLKLLNKNISFFVGPDVSAHFFKAPESDLSQQEVYRFNVPIFGPGVVFD  9162

9161  VDYSVRQEQFRFFTEALRVTKLKGYVDQMVMEAE   (0) 9060

3862  DYFSKWGDCGEVDLKYELEHLIILTASRCLLGQEIRNKLFADVSALFHDLDNGMLPISV  3686

3685  IFPYLPIPAHRRRDQARKKLAEIFANIIASRKETGKSENDMLQCFIDSKYKDGRPTTESE  3506

3505  VTGLLIAALFAGQHTSSITSTWTGAYLLRHKEYLSAVQDEQRSLMKKYGSKVDHDILSEM  3326

3325  DVLYRCIKEALRLHPPLIMLLRSSHTDFSVTTRDGKEYDIPKGHIVATSPAFANRLPHIY  3146

3145  KDPDRYDPDRFAVGREEDKAAGAFSYISFGGGRHGCLGEPFAYLQIKAIWSHLLRNFELE  2966

2965  LISPFPEVDWNAMVVGVKGKVMVRYKRREL  2876

 

@1

>CYP51G7P pseudogene CAAP02006913.1

792 SHIFIGGGRNRCLGQHFAYLQVKAMWSHLL*NFEL*PISPFSKINWNAMVVGV 950
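In translated sequence, `*` denotes a stop codon, so a `*` inside a peptide (rather than at its end) is the kind of defect that marks entries such as the one above as pseudogenes. A minimal Python sketch for locating such in-frame stops follows; the function name `internal_stops` is hypothetical.

```python
def internal_stops(peptide):
    """Return 1-based positions of '*' that are not the final character."""
    last = len(peptide) - 1
    return [i + 1 for i, aa in enumerate(peptide) if aa == "*" and i != last]

print(internal_stops("SHIFIGGGRNRCLGQHFAYLQVKAMWSHLL*NFEL*PISPFSKINWNAMVVGV"))
# prints: [31, 36]
```

A terminal `*`, as in intact genes, is not reported.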

 

>CYP71AH1 old 71A11 tobacco

MKFLLVVASLFLFVFLILSATKRKSKAKKLPPGPRKLPVIGNLLQIGKLPHRSLQKLSNEYGDFIFLQLGSVPTVV

VFSAGIAREIFRTQDLVFSGRPALYAGKRFSYNCCNVSFAPYGNYWREARKILVLELLSTKRVQSFEAIRDEEVSS

LVQIICSSLSSPVNISTLALSLANNVVCRVAFGKGSDEGGNDYGERKFHEILFETQELLGEFNVADYFPGMAWINK

INGLDERLEKNFRELDKFYDKIIEDHLNSSSWMKQRDDEDVIDVLLRIQKDPNQEIPLKDDHIKGLLADIFIAGTD

TSSTTIEWAMSELIKNPRVLRKAQEEVREVAKGKQKVQESDLCKLEYLKLVIKETLRLHPPAPLLVPRVTTASCKI

MEYEIPADTRVLINSTAIGTDPKYWENPLTFLPERFLDKEIDYRGKNFELLPFGAGRRGCPGINFSIPLVELALAN

LLFHYNWSLPEGMLPKDVDMEEALGITMHKKSPLCLVASHYNLL

 

>CYP71AH2 tobacco

MNFLVVLASLFLFVFLMRISKAKKLPPGPRKLPIIGNLHQIGKL

PHRSLQKLSNEYGDFIFLQLGSVPTVVVSSADIAREIFRTHDLVFSGRPALYAARKLS

YNCYNVSFAPYGNYWREARKILVLELLSTKRVQSFEAIRDEEVSSLVQIICSSLSSPV

NISTLALSLANNVVCRVAFGKGSAEGGNDYEDRKFNEILYETQELLGEFNVADYFPRM

AWINKINGFDERLENNFRELDKFYDKVIEDHLNSCSWMKQRDDEDVIDVLLRIQKDPS

QEIPLKDDHIKGLLADIFIAGTDTSSTTIEWAMSELIKNPRVLRKAQEEVREVSKGKQ

KVQESDLCKLDYLKLVIKETFRLHPPVPLLVPRVTTASCKIMEYEIPVNTRVFINATA

NGTNPKYWENPLTFLPERFLDKEIDYRGKNFELLPFGAGRRGCPGINFSIPLVELALA

NLLFHYNWSLPEGMLAKDVDMEEALGITMHKKSPLCLVASHYTC

 

>71A9/CYP71AH3 Glycine max

MISFTVFVFLTLLFTLSLVKQLRKPTAEKRRLLPPGPRKLPFIG

NLHQLGTLPHQSLQYLSNKHGPLMFLQLGSIPTLVVSSAEMAREIFKNHDSVFSGRPS

LYAANRLGYGSTVSFAPYGEYWREMRKIMILELLSPKRVQSFEAVRFEEVKLLLQTIA

LSHGPVNLSELTLSLTNNIVCRIALGKRNRSGADDANKVSEMLKETQAMLGGFFPVDF

FPRLGWLNKFSGLENRLEKIFREMDNFYDQVIKEHIADNSSERSGAEHEDVVDVLLRV

QKDPNQAIAITDDQIKGVLVDIFVAGTDTASATIIWIMSELIRNPKAMKRAQEEVRDL

VTGKEMVEEIDLSKLLYIKSVVKEVLRLHPPAPLLVPREITENCTIKGFEIPAKTRVL

VNAKSIAMDPCCWENPNEFLPERFLVSPIDFKGQHFEMLPFGVGRRGCPGVNFAMPVV

ELALANLLFRFDWELPLGLGIQDLDMEEAIGITIHKKAHLWLKATPFCE

 

#9

>CYP71AH4 CAAP02005003.1a, 53% to 71B.d, 64% to 71A9, 62% to 71AH2 Nicotiana tabacum DQ350356.1

Note: 71A9 is 58% to 71AH2, so it is probably misnamed and should be CYP71AH3.

17504  MGISSFQASHSMVSQSLLLLLLVIFSALLLFLLSTKQKRKSVASRRLPPGPKKLPLIGNLHQLGSLPH  17301

17300  VGLQRLSNEYGPLMYLKLGSVPTLVVSSADMAREIFREHDLVFSSRPAPYAGKKLSYGCN  17121

17120  DVVFAPYGEYWREVRKIVILELLSEKRVQSFQELREEEVTLMLDVITHSSGPVYLSELT  16944

16943  FFLSNNVICRVAFGKKFDGGGDDGTGRFPDILQETQNLLGGFCIADFFPWMGWFNKLNG  16767

16766  LDARLEKNFLELDKIYDKVIEEHLDPERPEPEHEDLVDVLIRVQKDPKRAVDLSIEKIKGVLLT (0) 16575

16475  DMFIAGTDTSSASLVWTMAELIRNPSVMRKAQEEVRSAVRGKYQVEESDLSQLIYLKLVVKE  16310

16309  SLRLHPPAPLLVPRKTNEDCTIRGYEVPANTQVFVNGKSIATDPNYWENPNEFQPE  16142

16141  RFLDSAIDFRGQNFELLPFGAGRRGCPAVNFAVLLIELALANLLHRFDWELADGMRREDL  15962

15961  DMEEAIGITVHKKNPLYLLATPAN*  15887
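Header lines in this file carry percent-identity annotations of the form "N% to NAME". As a sketch, the Python snippet below pulls those pairs out of a header so they can be compared across entries. The function name `identities` is hypothetical, and the pattern assumes the target name contains no spaces or commas.

```python
import re

# "<number>% to <name>", where the name runs up to the next space or comma
IDENTITY = re.compile(r"(\d+)% to ([^\s,]+)")

def identities(header):
    """Return a list of (percent, target) pairs found in a header line."""
    return [(int(pct), target) for pct, target in IDENTITY.findall(header)]

header = (">CYP71AH4 CAAP02005003.1a, 53% to 71B.d, 64% to 71A9, "
          "62% to 71AH2 Nicotiana tabacum DQ350356.1")
print(identities(header))
# prints: [(53, '71B.d'), (64, '71A9'), (62, '71AH2')]
```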

 

@7

>CYP71AH5P CAAP02005003.1b, pseudogene 70% to CAAP02005003.1a

26375  NVAFTSFGEY*KEVRNIVILEVLSAKRVHSFQ

25611  HGWMQAIKLMFDVIAHSSGPVNSIELRVFLSNNVIC*VAFGTKFDGGGDNGTRRFPEIL  25435

25434  QETQNLLGGFCIADFFPWMGWFDKLNAWLGCQVDKNFMELNRIYDKGIEMHLDPERPEPE  25255

25254  HEDLVDVLI*VQKDLRQVVSLSNEKIKGVLT (0)

25074  VHCSD*YPFSLAGMDNAEMIRNRSVMRKAQEKVRSTVRGKYQVEESDLSQLIYLKLVVKE  24895

24894  SLRLHLPAPSLVPRKTTKNCTI  24829

24815  FPQIHVFVNGNLISIDSNYWENPNEFQPERFVDSSIDFRGQSFEFLPFGASMRGCPGANF  24636

24635  AVLLIEVALTNILHRLTGNFLMG  24567

 

>CYP71AH6 Gossypium raimondii 58% to CAAP02005003.1a, 53% to 71A9/71AH3

CO072855.1 CO095493.1, CO072856.1

MDFQFILTLSFIAFTLMVFKYKARTRRLPPGPWKLPIIGNLHQLGDSSHKSIQRLSQ

QYGPMMFLQLGAVPTLVISSADAAMAIFKGPGGGYDLAFSGRPTNLYVAKKLSYEYNGIT

FAPYGELWREMRKIAVAELLSSKRVQSFRTIREEEVAAMLNHIDIASSSSAPVNLKKLSL

LLANHVVCRVTFGKKYGGGGDGGTNRFDRVLHEVQHLVGEFVVSDYFPWMWWVNKLNGMETRVEKNFEELDKLY

DEVIADHVAPTRTKANHEDIVDVLLRLQKDARQLITLNNQQIKGVLTDMFIAGTDTTAS

SLVWTFTELIRNPPSMEKVKYEVRKVGNGRDKIEESDIPKLHCLHSVIKETLRLHPPAPL

LVPRETTEDCVVGDYEIPAKTRVIINAKSIGTDPKYWENPHDFQPDRFMKSSVDFKGQHL

EFLPFGVGRRGCPGMSFAIMLLQLMVANFLYRFDWELPEGMSVEDVDMEEELGITVFKKT

PLCLVPIRVV*

 

#10

>CYP71AP5 CAAP02001743.1a, 43% to 71B2, 51% to 71B.c 53% to 71A1, 78% to 71AP4

15554  MALLQWLKEGFLPSFLFAGIILVAVLKFLQKGMLRKRKFNLPPSPRKLPIIGNLHQLGNMPHIS  15363

15362  LHRLAQKFGPIIFLQLGEVPTVVVSSARVAKEVMKTHDLALSSRPQIFSAKHLFYDCTDI  15183

15182  VFSPYSAYWRHLRKICILELLSAKRVQSFSFVREEEVARMVHRIAESYPCPTNLTKILGL  15003

15002  YANDVLCRVAFGRDFSAGGEYDRHGFQTMLEEYQVLLGGFSVGDFFPSMEFIHSLTGMK  14826

14825  SRLQNTFRRFDHFFDEVVKEHLDPERKKEEHKDLVDVLLHVKEEGATEMPLTMDNVKAIIL (0)  14646

14513  DMFAAGTDTTFITLDW  14466

14465  GMTELIMNPKVMERAQAEVRSIVGERRVVTESDLPQLHYMKAVIKEIFRLHPPAPVLVPR  14286

14285  ESMEDVTIDGYNIPAKTRFFVNAWAIGRDPESWRNPESFEPQRFMGSTIDFKGQDF  14118

14117  ELIPFGAGRRSCPAITFGAATVELALAQLLHSFDWELPPGIQAQDLDMTEVFGITMHR  13944

13943  IANLIVLAKPRFP* 13902

 

#7

>CYP71AS3.a CAAP02000057.1 Vitis vinifera 6 genes in a cluster 62% to CYP71AS1

177875  MELYSPSMWLHLLLLLLPLMFLIKRKIELTGQKKPLPPGPTKLPIIG 177735

177734  NLHQLGALPHYSLWQLSKKYGSIMLLQLGVPTVVVSSAEAAREFLKTHDIDCCSRPPLV  177558

177557  GLGKFSYNHRDISFAPYGDYWREVRKICVLEVFSTKRVQSFQFIREEEVALLIDSIVQSS  177378

177377  SSGSPIDLTERLMSLTANIICRIAFGKSFQASEFGDGRFQEVVHEAMALLGGLTAADFFP  177198

177197  YVGRIVDRLTGLHGRLERSFHEMDGFYQQVIEDHLNPGRVKEEHEDIIDVLLRIEREQSE  177018

177017  SSALQFTKDNAKAIVM (0) 176970

176205  DLFLAGVDTGAITVSWAMTELARNPRIMKKAQAEVRNSIGNKGKVTEGDVDQ  176050

176049  LHYLKMVVKETLRLHPPAPLLLPRETMSHFEINGYHFYPKTQVHVNVWAIGRDPNLWKNP  175870

175869  EEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNG  175690

175689  MKETDISMEEAAGLTVRKKFALNLVPILHHC*  175594

 

@5

>CYP71AS3-de1b CAAP02000057.1 54% to CYP71B.a

178567  RLTRLYGWLERRTSYELDGFY*QVIGLHDLKDVKEDFIDVLLQTERD  178427

 

#6

>CYP71AS4.b CAAP02000057.1 Vitis vinifera 6 genes in a cluster

170360  MALYSPSMWLHLLLLLLPLMYLIKRRIELKGQKKPLPPGPTKLPIIG 170220

170219  NLHQLGTLPHYSWWQLSKKYGPIILLQLGVPTVVVSSAEAAREFLKTHDIDCCSRPPLV  170043

170042  GLGKFSYNHQDIGFAPYGDYWREVRKICVHEVFSTKRLQSFQFIREEEVALLIDSIAESS  169863

169862  SSGSPIDLTERLMSLTANIICRIAFGKSFQVSEFGDGRFQEVVREAMALLGGFTAADFFP  169683

169682  YVGRIVDRLTGLHGRLERSFLEMDGFYQRVIEDHLNPGRVKEEHEDIIDVLLKIQRERSE  169503

169502  SGAVQFTKDSAKAILM  (0) 169455

169009  DLFLAGVDTGAITLTWAMTELARNPRIMKKAQVEVRNSIGNKGKVTEGDVDQ  168854

168853  LHYLKMVVKETLRLHPPVPLLLPRETMSHFEINGYHIYPKTQVQVNVWAIGRDPNLWKNP  168674

168673  EEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMVIATVELALANLLYRFNWNLPNG  168494

168493  MREADINMEEAAGLTVRKKFALNLVPILHHC*  168398

 

#8

>CYP71AS4v2 CAN60733.1| 73% to CAN83446.1 62% to 71AS1 55% to 71B34

96% to 71B.d and 71B.e, possible allele of 71B.b, since CAN83446.1 = 71B.e

MELYSPSIWLCLLLLLLPLMYLIKRRIELKGQKKPLPPGPTKLPIIGNLHQLGALPHYSWWQLSKKYGPI

MLLQLGVPTVVVSSVEAAREFLKTHDIDCCSRPPLVGLGKFSYNHRDIGFAPYGDYWREVRKICVLEVFS

TKRVQSFQFIREEEVTLLIDSIAQSSSSGSPIDLTERLMSLTANIICRIAFGKSFQVSEFGDGRFQEVVH

EAMALLGGFTAADFFPYVGRIVDRLTGHHGRLERSFLEMDGFYERVIEDHLNPGRVKEEHEDIIDVLLKI

ERERSESGAVQFTKDSAKAILMDLFLAGVDTGAITLTWAMTELARNPRIMKKAQVEVRNSIGNKGKVTEG

DVDQLHYLKMVVKETLRLHPPAPLLVPRETMSHFEINGYHIYPKTQVXVNVWAIGRDPNLWKNPEEFLPE

RFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNGMREADINMEEAAGLTV

RKKFALNLVPILHHC

 

#5

>CYP71AS5.c CAAP02000057.1 Vitis vinifera 6 genes in a cluster

157794  MALYSPSIWLHLLLLLLPLMFLIKRKIELKGQKKPLPPGPTKLPIIG 157654

157653  NLHQLGALPHYSLWQLSKKYGSIMLLQLGVPTVVVSSAEAAREFLKTHDIDCCSRPPLV  157477

157476  GLGKFSYNHRDISFAPYGDYWREVRKICVLEVFSTKRVQSFQFIREEEVALLIDSIVQSS  157297

157296  SSGSPIDLTERLMSLTANIICRIAFGKSFQASEFGDGRFQEVVHEAMALLGGLTASDFFP  157117

157116  YVGRIVDRLTGLHGRLERSFHEMDGFYQQVIEDHLNPGRVKEEHEDIIDVLLRIEREQSE  156937

156936  SSALQFTKDNAKAILM  (0) 156889

156035  DLFLAGVDTGAITVAWAMTELARNPGIMKKAQAEVRSSIGNKGKVTESDVDQ  155880

155879  LHYLKVVVKETLRLHPPAPLLLPRETMSHFEINGYHIYPKTQVHVNVWAIGRDPNLWKNP  155700

155699  EEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNG  155520

155519  IREADISMEEAAGLTVRKKFALNLVPILHHC*  155424

 

@4

>CYP71AS5-de1b CAAP02000057.1 65% to CYP71B.c

159495  VKEEHENFIDVLLQTERDRT  159436

 

#4

>CYP71AS6v1 .d CAAP02000057.1 Vitis vinifera 6 genes in a cluster

152305  MALYSPSIWLHLLLLLLPLMFLIKRKIELKGQKKPLPPGPTKLPIIGNLHQLGALPHYSL  152126

152125  WQLSKKYGSIMLLQLGVPT 152069

152068  VVVSSAEAAREFLKTHDIDCCSRPPLVGPGKFSYNHRDIGFAPYGDYWREVRKICVLEVF  151889

151888  STKRVQSFQFIREEEVTLLIDSIAQSSSSGSPIDLTERLMSLTANIICRIAFGKSFQVSE  151709

151708  FGDGRFQEVVHEAVALLGGFTAADFFPYVGRIVDRLTGLHGRLERSFLEMDGFYERVIED  151529

151528  HLNPGRVKEEHEDIIDVLLKIERERSESGAVQFTKDSAKAIIM (0) 151400

150931  DLFLAGVDTGAITLTWAMTELARNPRIMKKAQVEVRSSIGKKGKVTKGDVDQLHYLKMVV  150752

150751  KETLRLHPPVPLLVPRETMSHFEINGYHIYPKTQVHVNVWAIGRDPNLWKNPEEFLPERF  150572

150571  MDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNGMREADISM  150392

150391  EEAAGLAVRKKFALNLVPILHHC* 150320

 

>CYP71AS6v2 gi|147855782|emb|CAN83446.1a 2 genes 55% to CYP71B

97% to 71B.d missing some seq after LPII

This may be an allele of 71B.d since it is upstream of 71B.e

MALYSPSXWLHLLLLLLPLMYLIKRXIELKGQKKPLPPGPTKLPII

 

VSSAEAAREFLKTHDIDCCSRPPL

VGXGKFSYNHRDIGFAPYGDYWREVRKICVLEVFSTKRVQSFQFIREEEVTLLIDSIAQSSSSGSPIDLT

ERLMSLTANIICRIAFGKSFQASEFGDGRFQEVVHEAMALLGGFTAADFFPYVGRIVDRLTGLHGRLERS

FLEMDGFYQRVIEDHLNPGRVKEEHEDIIDVLLKIERERSESGAVQFTKDSAKAILMDLFLAGVDTGAIT

LTWAMTELARNPRIMKKAQVEVRNSIGNKGKVTEGDVDQLHYLKMVVKETLRLHPPAPLLVPRETMSHFE

INGYHIYPKTQVHVNVWAIGRDPNLWKNPEEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIAT

VELALANLLYRFNWNLPNGMREADINMEEAAG

 

#3

>CYP71AS7v1.e CAAP02000057.1 Vitis vinifera 6 genes in a cluster, 58% to CYP71AS1

135067  MAPYSPDLWLPLVLLFLSLLFLLKKILELKEQKGPPGPPKLPIIG 134933

134932  NLHQLGALIHQSLWQLSKKYGPVMLLHLGFVPTLVVSSAEAAKKVLKDHDISCCSRPPLI  134753

134752  SIGRLSYNYLDISFAPYGPYWREIRKICVLQLFSTNRVQSFQVIREAEVALLIDSLAQSS  134573

134572  SSASPVDLTDKIMSLTANMICRIAFGRSFEGSEFGKGRFQEVVHEATAMMSSFFAADFFP  134393

134392  YVGRIVDRLTGIHERLEKSFHELDCFYQQVIEEHLNPGRMKEEHEDIIDVLLNIEKEQDE  134213

134212  SSAFKLTKDHVKAILM (0) 134165

134087  DLFLAGVDTGAITVVWAMTELARKPGVRKKVQDEVRSHIRERGKVRESDIEQ  133932

133931  FHYLKMVVKETLRLHPPVPLLLPKETMSTIEISGYQIYPKTQVYVNVWAIGRDPNLWNNP  133752

133751  EEFFPERFIDNSVDFKGQHFEFLPFGAGRRVCPAMNMAIAMVELTLANLLYHFNWKLPHG  133572

133571  MKEGDINMEEAPGLSVHKKIALSLVPIKYP*  133479

 

>CYP71AS7v2

CAN83446.1b 2 genes

Contains some intron seq. This seq is the ortholog of CYP71AS7v1

KKILELKEQKGPPGPPKLPIIGNLHQLGALIHQSLWQLSKKH

GPVMLLHLGFVPTLVVSSAEAAKKVLKDHDISCCSRPPLISIGRLSYNYLDISFAPYGPYWREIRKICVL

QLFSTNRVQSFQVIREAEVALLIDSLAQSSSSASPVDLTDKIMSLTANMICRIAFGRSFEGSEFGKGRFQ

EVVHEATAMMSSFFAADFFPYVGRIVDRLTGIHERLEKSFHELDCFYQQVIEEHLNPGRMKEEHEDIIDV

LLNIEKEQDESSAFKLTKDHVKAILMAYFFEQDLFLAGVDTGAITVVWAMTELARKPGVRKK

Missing some seq here

EKFRESDI

EQFHYLKMVVKETLRLHPPVPLLLPKETMSTIEISGYQIYPKTQVYVNVWAIGRDPNLWNNPEEFFPERF

IDNSVDFKGQHFEFLPFGAGRRVCPAMNMAIAMVELTLANLLYHFNWKLPHGMKEGDINMEEAPGLSVHK

KIALSLVPIKYP

 

@3

>CYP71AS8P.f CAAP02000057.1 Vitis vinifera 6 genes in a cluster

pseudogene 74% to .c, missing first exon

126787  NLLLAGVNTSASTVVWAMAELARNPIVMKKAQAEVRSVIGN  126660

126659  KGKVTESDLDQLLYFKLVVKETFRLHPPSPLLLPRETMSHFQMNGYHIHPKTRVHVNV*A  126480

126479  IGRDPNVWKNPKEFFPESFIDNSIDFKGQHFELLPFGAGRRVCPAINMGIAMLELTFANL  126300

126299  LYHFNWKLPHGMKEEDINMEEGAGITSPKKFALILRPTQYP*  126174

 

@2

>CYP71AS9P.fg CAAP02000057.1 pseudogene 60% to CYP71B.d

122651  FLAGAKQECPTMV*EMAELARNPRTMKKTQAEVRSCAGKQGKVLGT

122506  DLDQLNYLKMMTMKEMLRLYPSV

        TILPTETMQHFNIN

        VYPKTQFLQLDVLAIGKDP  122327

122326  NIWEN

122309  PEEFSLERF  122283

 

@6

>CYP71AS10P CAAP02000950.1  pseudogene (+) strand, 49% to CYP71AS5.c   CAAP02000057.1

8049 VIKKATVVLASFSREDFFQFGGWIIDKFIGVHA*REKSFHIFDQFYQKVIDDHLDLNRP 8225

8226 KPEHEDIVDVLLGL*KDQTNV 8288

8601 NLFLGII*ATTITIVWALTELAKNPRVMKVAQAEIKSCLGYKLMVEESDLDRFQYLKIVF 8780

8781 K 8783

8780 QTLRLHPPLVMLTPWETVAHCKIGGYDVYPKTRIHINVWVIGKDPRVWDNLEEFNPERF 8956

8957 MNSDIDFRGQHFALVPLGAGRRLCLGMNIATTIMELTLANLLYSFD*RLPSGMKMEEIST 9136

9137 EEGFGSPGHKNEPLYLIP 9190

 

>CYP71AS11P AM481172 missing part of exon 1

CAN66328.1

this part 66% to CYP71AS7v1

11384 MATYSPFLWLPLLLLLPSLFFLIKRTVDQ*RVQREQLPPGLPIIGNLHQLGQLPHQS 11214

11213 LWQLFHKYG 11187

11185 TVIVLHLGFVPTLVVSSAEAARVVLKTRD 11099

(gap)  this part 73% to 71AS7v1

10117 NLLLAGVNTSASTVV*AMAELARNPRVMKKAQAEVRSVMGNKGKVTESDLDQLLYLKLVV 9938

 9937 KEIFRLHPPGPLLLPRETMSHFQMNGYHIHPKTRVHVNV*AIGREPNVWKNPEEFFPLRF 9758

 9757 IDNSIDFKGQHFELLPFGAGRRVCPAINMGIAMVELTFANLLYHFNWKLPHGMKEEDINM 9578

 9577 EEGAGITSPKKFAFILRP 9524

 

>CYP71AS12P second pseudogene on AM481172 56% to CYP71AS6v1

8682 VMLLQLGSVPTVVVSSA*ATKEVKT 8608

7225 FLAGAKQECPTMV*EMAELARNPRIMKKTQAEVRSCAGKQGKVLGT 7088

7080 DLDQLNYLKMMTMKEMLRLYP

7018 FSHTILPTETMQHFNIN 6968

6966 SSSVYPKTQFLQLDVLAIGKDP 6901

6900 NIWENTQKNF 6871

6883 PEEFSLERF 6857

 

>CYP71AT3 CAAP02000328.1a, 92% to CAN64422.1

(CAO61025.1)

46439  MTLLLFVILAFPLFLLFLYRKHRKNGGLLPPGPPGLPFIGNLHQMDNSAPHRYLWQLS  46612

46613  KQYGPLMSLRLGFVPTIVVSSAKIAKEVMKTQDLEFASRPSLIGQQRLSYNGLDLAFSPY  46792

46793  NDYWREMRKICVLHLFTLKRVKSYTSIREYEVSQMIEKISKLASASKLINLSEALMFLTS  46972

46973  TIICRVAFGKRYEGEGCERSRFHGLLNDAQAMLGSFFFSDHFPLMGWLDKLTGLTARLEK  47152

47153  TFREMDLFYQEIIEEHLKPDRKKQELEDITDVLIGLRKDNDFAIDITWDHIKGVLM (0)  47320

47389  NIFLGGTDTGAATVTWAMTALMKNPRVMKKAQEEVRNTFGKKGFIGEDDVEKLPYLKA  47562

47563  VVKETMRLLPSVPLLVPRETLQKCSLDGYEIPPKTLVFVNAWAIGRDPEAWENPEEFMPE  47742

47743  RFLGSSVDFRGQHYKLIPFGAGRRVCPGLHIGVVTVELTLANLLHSFDWEMPAGMNEEDI  47922

47923  DLDTIPGIAMHKKNALCLVAKKYN*  47997

 

>CYP71AT4 CAAP02000328.1b, 96% to CAN64422.1 but 5.8 kb upstream, different gene

76820  MTVLLFVILAFPLLLLFLHRKHRKNGGLLHLPPGPPGLPVIGNLHQMDNSAPHRYLWQLS  76999

77000  KQYGPLMSLRLGFIPTIVVSSARIAKEVMKTHDLKFASRPSLIGPRRLSYNCLDLAFSPY  77179

77180  NDYWREMRKICVLHLFTLKRVQSYTPIREYEVSQMIEKISKLASASKLINLSETVMFLTI  77359

77360  TIICRVSFGKRYEDEGCETSRFHGLLNDAQAMLGSFFFSDHFPLMGWLDKLTGLTARLEK  77539

77540  TLRDMDLFYQEIIEDHLKPDRKKQEQEDITDVLIELQKDNSFAIDITWDHIKGVLM (0)  77707

77780  NIFVGGTDAGTATVIWAMTALMKNPRVMKKAQEEVRNTFG  77899

77900  KKGFIGEDDVEKLPYLKAVVKETMRLLPAAPLLLPRETLQKCSIDGYEIPPKTLVFVNAW  78079

78080  AIGRDPEAWENPEEFIPERFLGSSVDFRGQNYKLIPFGAGRRVCPAIHIGAVTVELTLAN  78259

78260  LLYSFDWEMPAGMNKEDIDFDVIPGLTMHKKNALCLMAKKYN*  78388

 

>CYP71AT5P gi|147832399|emb|CAN64422.1| 48% to CYP83A2/83B1

CAAP02000328.1c 84167-85735 100% match, 96% to CYP71AT4

MTVLLFVILAFPLLLLFLHRKHRKNGGLLHLPPGPPGLPFIGNLHQMDNSARHRYLWQLSKQYGSLMSLR

LGFIPTIVVSSARIAKEVMKTHDLEFASRPSLIGPQRLSYNCLDLAFSPYNDYWREMRKICVLHLFTLKR

VQSYTPIREYEVSQMIEKISKLASASKLINLSETLMFLTSTIICRVAFGKRYEDEGFERSRFHGLLNDAQ

AMLGSFFFSDHFPLIGWLDKLTGLTARLEKTFRDMDLFYQEIIEDHLKPDRKKQEQEDITDVLIGLQKDN

SFAIDITWDHIKGVLM (0)

NIFVGGTDTGAATVIWAMTALMKNPRVMKKAQEEVRNTFGKKGFIGEDDVEKLP

YLKAVVKETMRLLPAVPLLIPRETLQKCSIDGYEIPPKTLVFVNAWAIGRDPEAWENPEEFIPERFLGSS

VDFRGQNYKLIPFGAGRRVCPGIHIGAVTVELTLANLLYSFDWEMPAGMNKEDIDFDVIPGLTMHKKNAL

CLMAKKYN*

 

>CYP71AT6P CAAP02000328.1d, pseudogene 100% to CAN64424.1 in overlaps, 69% to CAN64422.1

94192  ILLALPLILLEIRETMEECFFRPPGPPGLPFIGNLLHLDKSAPHRYLWQLSEKYGAL  94362

94363  MFLRLGFVPTLVVSSARMAEEVMKTHDLEFSSRPSLLGQQKLS*NGLDLAFAPYTNYWRE  94542

94543  MKKICTLHLFNSKRAQSFRSIREDEVSRMIEKISKFASASKLVNLSETLHFLTSTIICRI  94722

94723  AFSKRYEDEGWERSRFHTLLSEAQAIMGASFFKDYFPFMGWVDKLTGLTARLQKILRELD  94902

94903  LFYQEIIDHLNPERTKYEQEDIADILIG

       RINDSSFAIDITQDHIKAVVM

95017  NIFVGGTDTIAAILVWAMTALMKDPIVMKKAQEEIRNIG  95259

95260  GKKGFRDEDDIEKLPYLKALTKETMKLHPPIPLIPRATPENCSVNGCEVPPKTLVFVNA  95436

95437  WAIGRDPESRENPHEFNPERFLGTFIDFKGQHYGLMAFRAGRRGCPGIYLRTVIIQLALG  95616

95617  NLLYSFDWEMPNGMTKEDIDTDVKHGVTM  95703

 

>CYP71AT6P CAN64424.1 44% to 83A2/B1, pseudogene

MKKICTLHLFNSKRAQSFRSIREDEVSRMIEKISKFASASKLVNLSETLHFLTSTIICRIAFSKRYEDEG

WERSRFHTLLSEAQAIMGASFFKDYFPFMGWVDKLTGLTARLQKILRELDLFYQEIIDHLNPERTKYEQE

DIADILI

 

GGTDTIAAILVWAMTALMKDPIVMKKAQEEIRNIGGKKGFRDEDDIEKLPYLKALTKETMKLH

PPIPLIPRATPENCSVNGCEVPPKTLVFVNAWAIGRDPESRENPHEFNPERFLGTFIDFKGQHYGLMAFR

AGRRGCPGIYLRTVIIQLALGNLLYSFDWEMPNGMTKEDIDTD

GHFTGQLGQLAGNILGGFRQLRFSGVSITMWKLKRWKLRVHETQKNI

 

>CYP71AT7 CAAP02000328.1e, 84% to 104360

99326   MMILLLILLALPLFLLFLLRNRRRTPLPPGPPGLPLIGNLLQLDKSAPHIYLWRLS  99493

99494   KQYGPLMILRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSLLGLRKLSYNGLDVAFSPY  99673

99674   NDYWREMRKICVLHLFNSKRAQSFRPIREDEVLEMIKKISQFASASKLTNLSEILISLTS  99853

99854   TIICRVAFSKRYDDEGYERSRFQKLVGEGQAVVGGFYFSDYFPLMGWVDKLTGMIALADK  100033

100034  NFKEFDLFYQEIIDEHLDPNRPEPEKEDITDVLLKLQKNRLFTIDLTFDHIKAVLM (0)  100201

100333  NIFLAGTDTSAATLVWAMTMLMKNPRTMTKAQEELRNLIGKKGFVDEDDLQKLPYLKAIV  100512

100513  KETMRLHPASPLLVPRETLEKCVIDGYEIPPKTLVYVNAWAIGRDPESWENPEEFMPERF  100692

100693  LGTSIDFKGQDYQLIPFGGGRRICPGLNLGAAMVELTLANLLYSFDWEMPAGMNKEDIDI  100872

100873  DVKPGITMHKKNALCLLARIPMH*  100944

 

>CYP71AT8 CAAP02000328.1f, 71% to CAN64422.1

104360  MMILLLILLALPLFLLFLLRNQRRAPLPPGPPGLPFIGNLLQLDKSAPHLYLWRLS  104527

104528  KQYGPLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSLLGQQKLFYNGLGLTFTPY  104707

104708  NDYWREMRKICVLHLFNSKRVQSFRYIREDEVLEMIKKISKFASASKLTNLSEILIPLTS  104887

104888  TIICRVAFGKRYDDEGCERSRFHELLGGIQTMAIAFFFSDYFPLMSWVDKLTGMISRLEK  105067

105068  VSEELDLFCQKIIDEHLDPNKPMPEQEDITDILLRLQKDRSFTVDLTWDHIKAILM (0) 105235

105143  DIFIAGTDTSAATLVWAMTELMKNP  105427

105428  IVMKKAQEEFRNSIGKKGFVDEDDLQMLCYLKALVKETMRLHPAAPLLVPRETREKCVID  105607

105608  GYEIAPKTLVFVNAWAIGRDPEFWENPEEFMPERFLGSSIDFKGQDYQFIPFGGGRRACP  105787

105788  GSLLGVVMVELTLANLLYSFDWEMPAGMNKEDIDTDVKPGITVHKKNALCLLARSHT*  105961

 

>CYP71AT8 AM489206.2a 58% to 71AT1 tomato

1212 MMILLLILLALPLFLLFLLRNQRRAPLPPGPPGLPFIGNLLQLDKSAPHLYLWR 1373

1374 LSKQYGPLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSLLGQQKLFYNGLGLTFT 1553

1554 PYNDYWREMRKICVLHLFNSKRVQSFRYIREDEVLEMIKKISKFASASKLTNLSEILIPL 1733

1734 TSTIICRVAFGKRYDDEGCERSRFHELLGGIQTMAIAFFFSDYFPLMSWVDKLTGMISR 1910

1911 LEKVSEELDLFCQKIIDEHLDPNKPMPEQEDITDILLRLQKDRSFTVDLTWDHIKAILM (0) 2087

2206 DIFIAGTDTSAATLVWAMTELMKNPIVMKKAQEEFRNSIGKKGFVDEDDLQMLCYLKA 2378

2379 LVKETMRLHPAAPLLVPRETREKCVIDGYEIAPKTLVFVNAWAIGRDPEFWENPEEFMPE 2558

2559 RFLGSSIDFKGQDYQFIPFGGGRRACPGSLLGVVMVELTLANLLYSFDWEMPAGMNKEDI 2738

2739 DTDVKPGITVHKKNALCLLARSH 2807

 

>CYP71AT9 CAAP02000328.1g, 73% to CAN64422.1

(CAO61031.1) on contig CU459218.1 chr18 scaffold_1

111012  MILHLILLALPLFLLFLVRNHRNNGRTPLPPGPPGLPFIGNLLQISKTAPHLYLWQLSKQ  111191

111192  YGSLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSMLGLKKLTYNGLSLSVAPSND  111371

111372  YWREMRKVCALHLFNSKRVQSFRHIREDEVLETVKKISKFASASKLTNLSEILILLTSTI  111551

111552  ICRVAFGKRYDDEGCERSRFHELLGGVQTMSMAFFFSDHFPLMGWVDKLTGMIARLEKIF  111731

111732  EELDLFCQEIIDEHLDPNRSKLEQEDITDVLLRLQKDRSSTVDLTWDHIKAMFV (0) 111929

111831  DIFVAGTDTSAATVVWAMTELMKNPIVMKKAQE  112091

112092  ELRNLIGKKGFVDEDDLQKLSYLKALVKETMRLHPAAPLLVPRETLEKCVIDGYEIAPKT  112271

112272  LVFVNAWAIGRDPEFWENPEEFMPERFLGSSIDFKGQDYQLIPFGGGRRVCPGLLLGAVM  112451

112452  VELTLANLLYSFDWEMPAGMNKEDIDTDVKPGITMHKKNALCLLARSHI*  112601

 

>CYP71AT9 AM489206.2b 57% to 71AT1 tomato, 88% to AM489206.2a

same as partial seq CAN71113.1

7716 MILHLILLALPLFLLFLVRNHRNNGRTPLPPGPPGLPFIGNLLQISKTAPHLYLWQLSKQY 7898

7899 GSLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSMLGLKKLTYNGLSLSVAPSNDY 8078

8079 WREMRKVCALHLFNSKRVQSFRHIREDEVLETVKKISKFASASKLTNLSEILILLTSTII 8258

8259 CRVAFGKRYDDEGCERSRFHELLGGVQTMSMAFFFSDHFPLMGWVDKLTGMIARLEKIF 8435

8436 EELDLFCQEIIDEHLDPNRSKLEQEDITDVLLRLQKDRSSTVDLTWDHIKAMFV (0) 8597

8697 DIFVAGTDTSAATVVWAMTELMKNPIVMKKAQEELRNLIGKKGFVDEDDLQXLSYLKA 8870

8871 LVKETMRLHPAAPLLVPRETLEKCVIDGYEIAPKTLVFVNAWAIGRDPEFWENPEEFMPE 9050

9051 RFLGSSIDFKGQDYQLIPFGGGRRVCPGLLLGAVMVELTLANLLYSFDWEMPAGMNKEDI 9230

9231 DTDVKPGITMHKKNALCLLARSHI* 9305

 

>CYP71AT10Pv1 CAAP02000328.1h, pseudogene, 70% to CAN64422.1, 96% to CAN71114.1

94% to CYP71AT5P

116430  LHLPPGPGLPFIGNLYQMDNSTPHVYLWQLSKQYGPILSLGLGLVPTLVDSLAKMAKEL  116606

116607  LKAHDLEFSSRSSSLGQQSVT  116669

        YNGLDLD

117280  FAPYDGYWREMRKICVLHPFSSKRVQSFRSIREDEVSRIIEKISKSASAAKLTDLSETVM  117459

117460  LLTSNIICRTAFGKRYEDKGYDRSRFHGLLNDAQAMMGSFFFTDHFPSMGWVDKLTDLIA  117639

117641  RPEKNFKELDLFYQEVIDEHLDPKRPKQEQEDIAVVLLRLQRERLFSVDLTWDHIKAVLM  117820

117972  DVFVAGTDPGAATLVWAMAEVTKNPGGKKKAQEELRTVFGRKGFVDEDDLHKLPYLKA  118145

118146  LVKETLRVHPPAPLLLTKETLENCTIDAYDIPPKTLVFVNAWAIGRDPEAWENPEEILPE  118325

118326  RFLSSSVDFKGQDYELISFSVGRRGCPGIHLGVVTVELALANLLYSFD*EMPAGMNKENI  118505

118506  DMDMKPGLTLDKRNALCLQARQYNLAS*  118589

 

>CYP71AT10Pv2 AM489206.2c pseudogene 70% to AM489206.2a 56% to 71AT1

13061 PPGPGLPFIGNLYQMDNSAPHVYLWQLSKQYGPILSLGLGLV 13186

14708 GVTXTLVVSSARMAKEVLKAHDLEFSSRSSSLGQQRLSYNGLDLAFAPYDGYWREMRKICVLHPF 14902

14903 SSKRVQSFRSIREVEVSRMIEKFSKSASAAKLTDLSETVMLLTSNIICRTAFGKRYEDK 15079

15080 GYDRSRFHGLLNDAQAMMGSFFFTDHFPSMGWVDKLTXL

      LLRPEKNFKELDLFY*EIIDEHLDPKRPKQEQEDIXVV 15311

15312 LLRLQRERLFLVDLTWDHIKAVPM (0)

15535 DVFVAGTDPGAATLVWAMAEVTKNPGGKKKAQEELRTVFGRKGFVDEDDLHKLPYLKA 15708

15709 LVKETLRVHPPAPLLLXKETLENCTIDGYDIPPKTLVFVNAWAIGRDPEAWENPEEILPE 15888

15889 RFLSSSVDFKGQDYELISFSVGRRGCPGIHLGVVTVELALANLLYSFD*EMPAGMNKENI 16068

16069 DMDMKPGLTLDKRNALCLXARQY 16137

 

>CYP71AT10Pv2 CAN71114.1 50% to 83A2/B1

same as AM489206.2c only 9 aa diffs with CYP71AT10Pv1 (97% identical)

MAKEVLKAHDLEFSSRSSSLGQQRLSYNGLDLAFAPYDGYWREMRK

ICVLHPFSSKRVQSFRSIREVEVSRMIEKFSKSASAAKLTDLSETVMLLTSNIICRTAFGKRYEDKGYDR

SRFHGLLNDAQAMMGSFFFTDHFPSMGWVDKLT

 

DVFVAGTDPGAATLVWAMAEVT

KNPGGKKKAQEELRTVFGRKGFVDEDDLHKLPYLKALVKETLRVHPPAPLLLXKETLENCTIDGYDIPPK

TLVFVNAWAIGRDPEAWENPEEILPERFLSSSVDFKGQDYELISFSVGRRGCPGIHLGVVTVELALANLL

YSFD

 

>CYP71AT11P CAAP02000504.1  pseudogene 76% to CAAP02000328.1e 50% to 71B37

104371 MLLLLVFLMVLPLFLLWKHRVNGGKLLPPGPPGLPLIGSLHQL 104499

116804 SL*SLTDTYGISLNNMDPLMFLHLGFEPILVVSSPRTAEVMKTHDPEFSSRPSLLVIT 116977

125485 ALQKLSYNGLDLAFASYGAYWREIRKICV 125571

125582 DIVDILLKLHKDRLFTVDLSWNHIKAVLM (0) 125668

126490 AGTDTVAATMVWTMTALMKNPRVMKKAQKEVRTLVGEKCFVDEDDIQKLTYMKALVKESMR 126672

126673 LYPAAPLLIPRETLQKCNIDGY*IPTKTLVFVNAWAIGRDPESWENPEEFMPERFLGTCI 126852

126853 DFKGQDYKLIPFGAGRRIWPGMNLGAVTVELALANLLYSFDWEMPAGMKMEDIDTDAKPG 127032

127033 LTMTKKNDLYLVARNYI* 127086

 

>CYP71AU3 CAAP02005726.1 85% to CAAP02001743.1b, 54% to 71A26

11548 MGSFLDLLYKENASFFLLFLPFF

11479 VFIYFLIKWLYPTTPAVTTKRLPPSPPKLPIIGNLHQLGLLPHRSLWALAQRHGPIMLLH 11300

11299 FGKVPVVIVSAADAAREIMKTNDVIFLNRPKSSIFAKLLYDYKDVSMAPYGEYWRQMRSI 11120

11119 CVLHLLSNRRVQSFRGVREEETALLMEKISSSSSSSTPIDLSKMFLSLTNDLICRVALGR 10940

10939 KYSGDETGRKYRELLKEFVGLLGGFDVADYIPWLSWVNFINGLDAKVEKVAKEFDRFLDE 10760

10759 VVKEHVERRKRGVDEEVKDFVDVLLGIQEDNVTGVAITGVCIKALTL (0) 10619

10189 DMFAAGSDTTYTVLEWAMTELLRHPQVMRQLQNEVRGIAQGKLLITEDDLDKMQYLKAVI 10010

10009 KETLRLHPPVPLLLPRESTRGAKIMGYDIEVGTQVITNAWAIGRDPLLWDEAEEFRPERF 9830

 9829 LNSSIDFTGKDFELIPFGAGRRGCPGTLFAAMAIEVALANLVHQFDWEVGGGGRREDLDM 9650

 9649 TECTGLTIHRKVPLLAVATPWPR* 9578

 

>CYP71AU4 gi|147767047|emb|CAN67678.1| 46% to 71T4

CAAP02004888.1 13222-15147 1 aa diff

MLLLDPLSFSLFPFFFFIVLLVRWLFSTPPTTHKTLPPSPPRLPVLGNMHQLGIYPYRSLLCLARCYGPL

MLLQLGRVRTLVVSSPDAAQEIMKTHDLIFANRPKMSLGKRLLYDYKDVSVAPYGEYWRQMRSICVLHLL

SNKRVQSFNTVRREEISLLIQKIEEFSSLSTSMDLSGMFMRLTNDVICRVAFGRKYSGDERGKKFRRLLG

EFVELLGGFNVGDYIPWLAWVEYVNGWSAKVERVAKEFDEFLDGVVEEHLDGGTGSIAKGDNEKDFVDVL

LEIQRDGTLGFSMDRDSIKALILDIFAGGTDTTYTVLEWAMTELLRHPKAMKELQNEVRGITRGKEHITE

DDLEKMHYLKAVIKETLRLHPPIPLLVPRESSQDVNIMGYHIPAGTMVIINAWAMGRDPMSWDEPEEFRP

ERFLNTNIDFKGHDFELIPFGAGRRGCPGISFAMATNELVLANLVNKFDWALPDGARAEDLDMTECTGLT

IHRKFPLLAVSTPCF*

 

>CYP71AU5 CAAP02003357.1  92% to CAAP02005726.1 53% to 71A26

38067 MGSFLGLLYKENDS

38025 FFLLLLPFFIFTHFLIKWLYPTTPAVTTKKLLPSPPKLPIIGNLHQLGSLPHRSLWALAQ 37846

37845 RHGPLMLLHFGRVPVVIVSAVDAAREIMKTNDAIFSNRPKSNISAKLLYDYKDVSTAPYG 37666

37665 EYWRQMRSICVLHLLSTRRVQSFRGVREEETALLMEKISSSSSSSIPIDLSQMFLSLTND 37486

37485 LICRVALGRKYSGDENGRKYRELLKEFGALLGCFNVGDYIPWL 37357

37357 SWVNFINGLDAKVEKVAKEFDRFLDEVVKEHVERRKRGVDEEVKDFVDVLLGIQEDN 37187

37186 VTGVAITGVCIKALTL 37139

36747 DMFAAGSDTTYTVLEWAMTELLRHPQVMRQLQNEVRGIAQGKLLITEDDLDKMQYLKAVI 36568

36567 KETLRLYPPIPLLVPRESTRDAKIMGYDIAARTQVITNVWAIGRDPLLWDEAEEFRPER 36391

36390 FLNSSIDFRGQDFELIPFGSGRRGCPGTLFAAMAIEVVLANLVHRFDWEVGGGGRREDLD 36211

36210 MTECTGLTIHRKVPLLAVATPWPR* 36136

 

@9

>CYP71AU6P CAAP02001743.1b, pseudogene, 58% to CAN67678.1

35127  IFIYFLIKWLYPTTSTVTTKRLPHFPLKLPIIGNLFQLGSLSHRSL*VLAQRHGSLMLLH  34948

34947  FGRVPVVIVSIANTAREIMKTNDVIFSNRSKSNISAKLLYDYKDVSTTPYKEYWRQMRSI  34768

34767  CVLHFLSTRRVLSFRGVQEEETTLMMEKISSSASSTPIDLSQMFQSLTNDLICRVSL*RK  34588

34587  YSGDETGRKYRELLKKFVGLLGGFNVGDYIPWLSWVNFINGLETKVEKVSKVFDRFLD  34414

34413  EVVK*HVERRKRCGVDEEMKDFVDVLL  34333 XXXXXXXXXXXXXXXXXXXXXX

33813  DMFAARSDSTYTVLEWAMTKLLRHPQVMRQL*NEARGIAQGKLLITEDDLGKMQYLMAVIK  33631

33631  ETLRLHPLIPLLILRESTRGAKIMGYDIEAGTRVITNAWPIGGDPLLWDEAEEFWPERF  33455

33454  LNSSIDFTGKDFELISFGAGQRGCPGTLFAKMAIELVLANLVHHFDWEVAGGGRREDLDM  33275

33274  TECIGLTIHIKVLLLAVATP  33215

 

#11

>CYP71BC1 gi|147861230|emb|CAN80448.1| = AM435124.2

CAAP02002092.1 15806-13388 (-) strand 1 aa diff.

MTMKISENMLLLFSQSSANQWLLALGILSFPILYLFLLQRWKKKGIEGAARLPPSPPKLPIIGNLHQLGK

LPHRSLSKLSQEFGPVLLLQLGRIPTLLISSADMAKEVLKTHDIDCCSRAPSQGPKRLSYNFLDMCFSPY

SDYWRAMRKVFVLELLSAKRAHSLWHAWEVEVSHLISSLSEASPNPVDLHEKIFSLMDGILNMFAFGKNY

GGKQFKNEKFQDVLVEAMKMLDSFSAEDFFPSVGWIIDALTGLRARHNKCFRNLDNYFQMVVDEHLDPTR

PKPEHEDLVDVLLGLSKDENFAFHLTNDHIKAILL (0)

NTFIGGTDTGAVTMVWAMSELMANPRVMKKVQAEV

RSCVGSKPKVDRDDLAKLKYLKMVVKETFRMHPAAPLLIPHRTRQHCQINANGCTYDIFPQTTILVNAFA

IGRDPNSWKNPDEFYPERFEDSDIDFKGQHFELLPFGAGRRICPAIAMAVSTVEFTLANLLYCFDWEMPM

GMKTQDMDMEEMGGITTHRKTPLCLVPIKYGCVE*

 

@10

>CYP71BC1-de2b CAAP02002092.1 C-term pseudogene, 67% to 71BC1

16741  KAQHTDMEEVGGITISR  16691

16647  PLCFVPIKYGWV  16612

 

@11

>CYP71BC3-de1b CAAP02002092.1 N-term pseudogene 80% to 71BC2

19486  VVLYSVICFFLVQKWGNRVVVERATTPPSPSKLAIIGNLHQLS*WSYRSLWTLSQKYGSI  19307

19306  MFLQLGSV  19283

 

#12

>CYP71BC3 gi|147781883|emb|CAN72169.1| 62% to CYP71BC1

CAAP02002092.1 28083-26374 (-) strand 100% match

MAMEIAEAVMEVFSPSSVTDWLFTLSVVLLSVLCFFLVQKWGNRAVLERATTPPSPPKLPIIGNLHQLSKLH

HRSLWTLAQKHGSIMFLQLGSIPTIVISSADMAEQVLRTRDNCCCSRPSSPGSKLLSYNFLDLAFAPYSD

HWKEMRKLFNANLLSPKRAESLWHAREVEVGRLISSISQDSPVPVDVTQKVFHLADGILGAFAFGKSYEG

KQFRNQKFYDVLVEAMRVLEAFSAEDFFPTGGWIIDAMSGLRAKRKNCFQNLDGYFQMVIDDHLDPTRPK

PEQEDLVDVFIRLLEDPKGPFQFTNDHIKAMLM (0)

NTFLGGTDTTAITLDWTMSELMANPRVMNKLQAEVRS

CIGSKPRVERDDLNNLKYLKMVIKEALRKHTPIPLLIPRETMDYFKIHDKSSSREYDIYPGTRILVNAWG

IGRDPKIWKDPDVFYPERFEDCEIEFYGKHFELLPFGGGKRICPGANMGVITAEFTLANLVYCFDWELPC

GMKIEDLGLEEELGGITAGRKKPLCLVARRCGCSCTEPM*

 

@12

>CYP71BC3-de1c CAAP02002092.1 N-term pseudogene, 78% to CYP71BC2

29938  KLATIGNLHQLSKWSYRSLWTLSQKYGSIMFLQLGSV  29828

 

@13

>CYP71BC3-de1d CAAP02002092.1 N-term pseudogene 79% to CYP71BC2

31559  VVLFSVICFFLVQKWGNRVVVERATTPPSPSKLAIIGNLHQLS*WSYRSLWTLSHKYGSI  31380

31379  MFLQLGSV  31356

 

@14

>CYP71BC3-de2b CAAP02002092.1 C-term pseudogene 95% to 71BC2

39030  KMVIKEAMRKHTPIPLLIPRETMDYFKIHDKSSSREYDIYRETRILVNAWGIGRDPKSWK  38851

38850  DPDVFYPERFEDCEIEFYGKHFELLPFGGGKRICPGANMGVITAEFTLANLVCCFDWELP  38671

38670  CGMKIEDLGLEEELGGITASRKTPLCLVARRCGC  38569

 

>CYP71BE1 CAAP02002803.1, 46% to CAAP02001743.1a, 42% to 71B37,

6 aa diffs to CYP71BE1 AM445470.2

28146  MEFPSSFLFPFLLFLFILFKVSKKSKPQISIPKRPPGPWKLPLIGNLHQLVGSLPHHSLRDL  28331

28332  AKKYGPLMHLQLGQVSMLVVSSPEIAKEVMKTHDINFAQRPHLLATRIVSYDSTDVAFSP  28511

28512  YGDYWRQLRKICVVELLSAKRVKSFQVIRKEEVSKLIRIINSSSRFPINLRDRISAFTYS  28691

28692  AISRAALGKECKDHDPLTAAFGESTKLASGFCLADLYPSVKWIPLVSGVRHKLEK  28856

28857  VQQRIDGILQIVVDEHRERMKTTTGKLEEEKDLVDVLLKLQQDGDLELPLTDDNIKAVIL (0) 29036

       DIFGGGGDTVSTAVEWTMAEMMKNPEVMKKAQAE  29216

29217  VRRVFDGKGNVDEAGIDELKFLKAVISETLRLHPPFPLLLPRECREKCKINGYEVPVKTR  29396

29397  VVINAWAIGRYPDCWSEAERFYPERFLDSSIDYKGADFGFIPFGSGRRICPGILFGIPVI  29576

29577  ELPLAQLLFHFDWKLPNGMRPEDLDMTEVHGLAVRKKHNLHLIPIPYSPLTVG*  29738

 

>CYP71BE4P CAAP02000100.1e pseudogene

97018 LIGNMHQLISYLPHHALRDLAKKHGPLMDLQLGEVSTIIVSSPETAKGVIKTQII 96854

96853 ISQRPHV*KFWI*ELFTAKPVQFFQSIREEEVSGLVRSIS 96734

96733 LNIRSPINLAKE 96698

96499 SGTMVHRVMSEMLKNPQIMKKAQAEVRQTFETKGEVDDIGIHELKILKLVVKETPRLHPP 96320

96319 APLLLPRECGERFEISGCDDIPLNPMSLLLHGQLEEMEALNST*QLQPREIFLKSLVDYK 96140

96139 GTNFDFIPFG 96110

 

>CYP71BE5 CAAP02000100.1d 61% to CAAP02002803.1

81714 MELQFSFFPILCT

81675 FLLFIYLLKRLGKPSRTNHPAPKLPPGPWKLPIIGNMHQLVGSLPHRSLRSLAKKHGPLM 81496

81495 HLQLGEVSAIVVSSREMAKEVMKTHDIIFSQRPCILAASIVSYDCTDIAFAPYGGYWRQI 81316

81315 RKISVLELLSAKRVQSFRSVREEEVLNLVRSVSLQEGVLINLTKSIFSLTFSIISRTA 81142

81141 FGKKCKDQEAFSVTLDKFADSAGGFTIADVFPSIKLLHVVSGMRRKLEKVHKKLD 80977

80976 RILGNIINEHKARSAAKETCEAEVDDDLVDVLLKVQKQGDLEFPLTMDNIKAVLL 80812

80544 DLFVAGTETSSTAVEWAMAEMLKNPRVMAKAQAEVRDIFSRKGNADET 80401

80400 VVRELKFLKLVIKETLRLHPPVPLLIPRESRERCAINGYEIPVKTRVIINAWAIARDPKY 80221

80220 WTDAESFNPERFLDSSIDYQGTNFEYIPFGAGRRMCPGILFGMANVELALAQLLYHFDWK 80041

80040 LPNGARHEELDMTEGFRTSTKRKQDLYLIPITYRPLPVE* 79921

 

>CYP71BE6 CAAP02000100.1c 61% to CAAP02002803.1

60542 MELQFSFFPILCT

60513 FLLFIYLLKRLGKPSRTTHPAPNLPPGPWKLPIIGNMHQLVGSLPHHSLRNLAKKHGPLM 60334

60333 HLQLGEVSAIVVSSREMAKEVMKTHDIIFSQRPCILAASIVSYDCTDIAFAPYGDYWRQI 60154

60153 RKISILELLSAKRVQSFRSVREEEVLNLVRSISSQEGVSINLTESIFSLTFSIISRAA 59980

59979 FGKKCKDQEAFSVTLEKFAGSGGGFTIADVFPSIKLLHVVSGIRHKLEKIHKKLD 59815

59814 TILENIINEHKARSEASEISEAEVDEDLVDVLLKVQKQGDLEFPLTTDNIKAILL 59650

59202 DLFIAGSETSSTAVEWAMAEMLKNPGVMAKAQAEVRDIFSRKGNADETMIHELKFLK 59032

59031 LVIKETLRLHPPVPLLIPRESRESCEINGYEIPVKTRVIINAWAVARDPEHWNDAESFNP 58852

58851 ERFLDSSIDYQGTNFEYIPFGAGRRMCPGILFGMANVEIALAQLLYYFDWKLPNGTQHEE 58672

58671 LDMTEDFRTSLRRKLNLHLIPITYRPLPVE* 58579

 

>CYP71BE6-de1b CAAP02000100.1c-de1b pseudogene N-term

63932 ISILCTFLLFIYLLKRLGKPYRTNGPARKLPAGPWKLPIIGNMHQLFGSLPHHSLRNLAK 63753

63752 QHGTLMHLQPGEASTIVVS*REMEK 63678

 

>CYP71BE7 CAAP02000100.1b 61% to CAAP02002803.1

47703 MELHFPSFH

47676 ILSAFILFLVVVLRTQKRSKTGSLTPNLPPGPWKLPLVGNIHQLVGSLPHHALRDLAKKY 47497

47496 GPLMHLQLGEVSTIVVSSSEIAKEVMKSHDIIFAQRPHILATRIMSYNSTNIAFAPYGDY 47317

47316 WRHLRKICMSELLSANRVQSFQSIRNEEESNLVRSISLNTGSPINLTEKTFASICAIT 47143

47142 TRAAFGKKCKYQETFISVLLETIKLAGGFNVGDIFPSFKSLHLISGMRPKLEKLH 46978

46977 QEADKILENIIHEHKARGGTTKIDKDGPDEDLVDVLLKFHEDHGDHAFSLTTDNIKA 46807

46806 VLL (0) 46798

46624 DIFGAGSEPSSTTIDFAMSEMMRNPRIMRKAQEEVRRIFDRKEEIDEMGIQELKFLKLVI 46445

46444 KETLRLHPPLPLLLPRECREKCEIDGHEIPVKSKIIVNAWAIGRDPKHWTEPESFNPERF 46265

46264 LDSSIDYKGTNFEYIPFGAGRRICPGILFGLASVELLLAKLLYHFDWKLPNGMKQQDLDM 46085

46084 TEVFGLAVRRKEDLYLIPTAYYPLSHE* 46001

 

>CYP71BE8P CAAP02000100.1a frameshift and stop, possible pseudogene

44% to CYP71B33, 64% to CAAP02002803.1

      MEIHLPSSYAFFAFLLSMFIVFKIGKVQIQNL

31068 PAKLPPGPWKLPLIGNMHQLVGSLPHHTLKRLASKYGPFMHLELGEVSALVVSSPEIARE 30889

30888 VMKTHDTIFAQRPPLLSSTIINYNATSISFSPYGDYWRQLRKICTIELLSAKRVKSFQSI 30709

30708 RE*EVSKLIWSISLNAGSPINLSEKIFSLTYGITSRSAFGKKFRGQDAFVSAIL 30547

30546 EAVELSAGFCVADMYPSLKWLHYISGMKPKLEKVHQKIDRILNNIIDDHRKRKTTTKAG 30370

30369 QPETQEDLVDVLLNLQEHGDLGIPLTDGNVKAVLL (0) 30265

29794 DIFSGGGETSSTAVVWAMAEMLKSPIVMEKAQAEVRRVFDGKR 29666

29665 DINETGIHELKYLNSVVKETLRLHPSVPLLLPRECRERCVINGYEIPENTKVIINAWAIA 29486

29485 QDPDHWFEPNKFFPERFLDSSIDFKGTDFKYIPFGAGRRMCPGILFAIPNVELPLANLLY 29306

29305 HFDWKLPDGMKHEDLDMTEEFGLTIRRKEDLNLIPIPYDPFLVL* 29171

 

#13

>CYP71BE9Pv1 CAAP02000216.1a  pseudogene CYP71BE like, 78% to CAAP02001833.1a

7514 MDFLFSSILFAFLLFLYMLYKMGERSKASISTKKLPPGPWKLPLL 7648

7648 GNMHQLVGSLPHQSLSRLSKQYGPLMSLQLCEVYALTISSPEMAKQV 7788

7789 MKTHDINFAHRPPLLASNVLSYDSTDILYPPYGDYWRQLRNICVVELLTSKRVKSFQLVR 7968

7969 EAELSNLITAVVSCSRLPFNRNENLSSYTFSIISRAAFGEKFEDQDAFISVTKEMAELYS 8148

8149 GFCVADMYPSVKWLDLISGMRYKLDKVFQR 8238

8241 DRILQNIVDEHRDKL*PQAGKLQGEEDLVDVLLKLQQHGDLEFPLTDNNIKGVIL 8405

13594 NIFSGGGKTTFTSVD*

13642 AMSEMLKNPRVMEKAQAEVRRVFDGKGNVDETGLDG 13749

13749 IKIF*AVVKETLRLHTPFPLLLPRECREMCWIDGYEIPEKTRIIVNAWAIG*DSVYWVEA 13928

13929 ERFYPERFLDSSIDYKGTDFGYIPFGAGRRICPGIPFAMPYIELPLAHLLYHFDWKLP 14102

14103 KGIKAEDLDMTEAFCLAVCRKQDLHLIPIPYNPLHAQ* 14216

 

>CYP71BE9Pv2 Pinot noir (a highly heterozygous grape genome)

CAN66039.1

top part is a retrotransposon seq like AAP46207  putative retrotransposon protein Oryza sativa

97% (6 aa diffs) to CAAP02000216.1a probable ortholog to the pseudogene

from AM472203.2 exon 2 only

MNEEMKALQIDLPIGKIPVGCRWVFTIKYKVDGTVEWLRKSLYGLKQSPRAWFGRFTSFMKSIGYKQSNS

YHTLFLKHNKEQIIALIVCVDDMIVIGNDYEEMKTLQEHLAHDFEMKDLDKLKYFLGIEVSRSKKAYALS

VVCQFMHSPSKEHMNVVIHILRYLKSSPGKGILFTKGDNLDINGYTDADWAGSIQDRCSTSWYFTFKVVA

RSNAEAEYKGMAKAICELLWIRNLVKDLHIKQVSPMKLYCDNKAACDIAHNPVQHDRTKYVEVGRHFIKE

KLESKLIEVPHVRSQDQLADVLTKAMSNQ

2182 NIFSGGGKTTSTSVD*AMSEMLKNPRVMEKAQAEVRRVFDGKGNVDETGLDGLKFFK 2352

2352 AVVKETLRLHTPFPLLLPRECREMCWIDGYEIPEKTRIIVNAWAIG*DSVYWVEA 2516

ERFYPERFLDSSIDYKCTDFGYVP

FGAGRRICPGIPFAMPYIELPLAHLLYHFDWKLPKGIKAEDLDMTEAFCLAVCRKQDLHLIPIPYNPLHAQ* 2804

 

>CYP71BE10v1 CAAP02000216.1b 79% to CAAP02001833.1a

51405 MEFSSSSLLFAFLLFLYMLYKIGKRSKANISTQKLPPGPWKLPLIGNVHQLVGSLPHRS 51581

51582 LTLLAKKYGPLMRLQLGEVSTLIVSSPEMAKQVMKTHDTNFAQRPILLATRILSYDCSGV 51761

51762 AFAPYGDYWRQLRKICVVELLTAKRVKSFQSVREEEISNLITMVTSCSRLQINFTEKISS 51941

51942 LTFSIIARAAFGKKSEDQDAFLSVMKELVETASGFCVADMYPSVKWLDLISGMRYKIDKV 52121

52122 FRMTDRILQNIVDEHREKLKTQSGKLEGEADLVDVLLKLQQNDDLQFPLTDNNIKAVIL (0) 52298

52520 DIFGGAGESTSTSVEWAMSEMLKAPIVIEKAQAEVRSVFDGKGHVDETAIDELKFLKAVV 52699

52700 NETLRLHPPFPLLLPRECREMCKINGYEIPEKTRIIVNAWAIGRDSDYWVEAERFYPERF 52879

52880 LDSSIDYKGTDFGYIPFGAGRRICPGILFAMPGIELPLANLLYHFDWKLPNGMKAEDLDM 53059

53060 TEAFGLAVRRKQDLHLIPIPYNPSHAD* 53143

 

>CYP71BE10v2 Pinot noir (a highly heterozygous grape genome)

CAN81963.1 (partial translation of intact gene)

Overall 98% to 71BE10 probable ortholog

from AM487125.2 first exon 97% (7 aa diffs) to 71BE10,

second exon 1 aa diff to 71BE10

12930 MEFFSSSLLFAFLLFLYMLYKIAKRSKDNISTQKLPPGPWKLPLIGNVHQLVGSLPHRSL 12751

12750 TXLAKKYGPLMRLQLGEVSTLIVSSPEMAKQVMKTHDTNFAQRPILLATRILSYDCSGVA 12571

12570 FAPYGDYWRQLRKICVVELLTAKRVKSFQSVREEEISNLITMVTSCSRLQINFTEKISSL 12391

12390 TFSIIARAAFGKKSEDQDAFLSVMKELVEXASGFCVADMYPSVKWLDLISGMRYKIDKVF 12211

12210 RMTDRILQNIVDEHREKLKTQSGKLEGEADLVDVLLKLQQNGDLQFALTDNNIKAVIL (0) 12037

11816 DIFGGAGESTSTSVEWAMSEMLKAPIVMEKAQAEVRSVFDGKGHVDETAIDELKFLKAVV 11637

11636 NETLRLHPPFPLLLPRECREMCKINGYEIPEKTRIIVNAWAIGRDSDYWVEAERFYPERF 11457

11456 LDSSIDYKGTDFGYIPFGAGRRICPGILFAMPGIELPLANLLYHFDWKLPNGMKAEDLDM 11277

11276 TEAFGLAVRRKQDLHLIPIPYNPSHAD* 11193

 

>CYP71BE11-de1b CAAP02000216.1c pseudogene N-term

66338 WKLPLIVNMHGLV 66376

 

>CYP71BE11 CAAP02000216.1c pseudogene 85% to CAAP02001833.1a

66818 LLFLYMLYKIGKRSKGNISAQKLPLEPWKLPLIGNMHQLIDGSLPHRSLSRLTKQYESLM 66997

66998 SLQLGEVSTLIISSPEMAKQVMKTHDINFAQR 67093

72159 STLLATNILSYHSIDIDFPPYGDYGRHLQKICVVELLTS*RFKSFQLVGEDELSNL 72326

72327 IT 72332

72334 TLTSCSRLPINLTDKLSSCTFAIIAGAAFGEKCKDQDAFILVLKETLELLFGLCVTNM 72507

72508 YPSVKWLDLISGMRYKIEKVFQRTDRILQNIVDEHRDKMQTEAGKLQGEENIVDVLLKIQ 72687

72688 QHGDHEFPLTDNNIKSXXX 72735

74253 DIFAGGGETTSISVKWAISEMLKNX 74324

74320 RMMEKAQAEVRRVFDGQGNADEELKFLKGVVKETLRLHPPLPLLIPRECREMCEINRYEI 74499

74500 PKKTLIIINAWAIGRDSNYWVEAERFYPDRFLDSSIDYKGTDFGYIPFGAGRRMYHGILF 74679

74680 SLPIIELSLAHLLYHFDWKLPNGMKA*DLDMTEALGLVVRRKQDLHLIPILDNPLHAQ* 74856

 

>CYP71BE12 CAAP02000216.1d one frameshift 83% to CAAP02001833.1a

86311 MDFQFSSILFAFLLFLYMLYKMGERSKASISTQKLPPGPWKLPLIGNMHQLVGSLPHQS 86487

86488 LSRLAKQYGPLMSLQLGEVSTLIISSPDMAKQVMKTHDINFAQRPPLLASKILSYDSMDI 86667

86668 VFSPYGDYWRQLRKICVVELLTAKRVKSFQLVREEELSNLITAIVSCSRPINLTENIFS 86844

86845 STFSIIARAAIGEKFEGQDAFLSVMKEIVELFSGFCVADMYPSVKWLDLISGMRYKLDKV 87024

87025 FQRTDRMLQNIVDQHREKLKTQAGKLQGEGDLVDVLLELQQHGDLEFPLTDNNIKAVIL (0) 87201

87442 DIFSGGGETTSTSLDWAMSEMLENPRVMEKAQAEVRRVFDGKGNVDE 87582

87583 TGLDELKFLKAVVKETLRLHPPLPLLVPRECREMCEINGYEIPKKTSIIVNAWAIGRDSD 87762

87763 YWVEAERFYPERFLDSSIDYKGTDFGYIPFGAGRRMCPGILFSMPSIELSLAHLX 87924

87927 HFDWKLPNEMKAEDLDMTEAFGLAVRRKQDLLLIPIPHNQSHAQ* 88061

 

>CYP71BE13-de2b CAAP02001833.1a pseudogene 94% to CAN81963.1

10995 GHVDENAIDELKFLKAVVKETLRLHPPFPILLPRECREMRKINGYRIPEKTRIIVNAWA 11171

11172 IG*DSDYWVEAERFYPERFLDSSIDYKGADFGYIPFGAGRRICPGILFAMPNIELPLAYL 11351

11352 LYHFDWKLPNGMKAEDLDMTEAFGLAVRRKQDLHLIPIPYKP 11477

 

>CYP71BE13 CAAP02001833.1b 69% to 71BE1

15131 MDVLFSSILFASLLFLYMLYKIGKRWRGNISSQKLPPGPWKLPLIGNMHQLIDGSLPHHSLSRLA 15325

15326 KQYGPLMSLQLGEISTLIISSPEMAKQILKTHDINFAQRASFLATNTVSYHSTDIVFSPY 15505

15506 GDYWRQLRKICVVELLTSKRVKSFQLIREEELSNLITTLASCSRLPINLTDKLSSCTFAI 15685

15686 IARAAFGEKCKEQDAFISVLKETLELVSGPCVADMYPSVKWLDLISGMRHKIEKVFKRTD 15865

15866 RILQNIVDEHREKMKTEAGKLQGEEDLVDVLLKLQQHGDLEFPLTDNNIKAVIL (0) 16027

16320 DIFAGGGETTSISVEWAMSEMLKNPRVMDKAQAEVRRVFDGKGNADEELKFLKVVV 16487

16488 KETLRLHPPFPLLIPRECREMCEINGYEIPKKTLIIVNAWAIGRDSDHWVEAERFYPERF 16667

16668 LDSSIDYKGTDFGYIPFGAGRRMCPGILFSLPIIELSLAHLLYNFDWKLPNGMKADDLDM 16847

16848 TEALGIAVRRKQDLHLIPIPYNPSHVQ* 16931

 

>CYP71BE14P CAAP02008751.1 CYP71BE pseudogene 64% to 71BE1

6182 IQLTVSTLVVSSPEIAKEFMKTDDVSFAQRPNILVTSIVSYGSTNIGFAPYSDYWRQVR 6006

6005 KLCATELLSAKRVKSFQLIREEEVSNVIKRIASHSGSTINLSEEISSVTLPL 5850

5850 IARAAFGKICKDQDSFIGAVTEMAELATGFCAADVFPSVK*VDQVTGIRSKLEKLHERVD 5671

5670 RILQNIVKEHKESMTTKRGKLEAEDLVDTFLKIQEDGDLKFPLTENNVKAVIL (0) 5512

     DMFSG

5255 AGETSSTVGEWAMTELIRHPRVMEKAQ 5175

5178 TRVRREFAGKGTVEESGIHELKFIKAVVKETLRLHPPAPLLLPRECRERC 5029

5028 EINGYEIPVKTRVIDNA*AIGRDPDSWTEPERFNPERFLDSWLDYKGTDFEFIPFGAGRR 4849

4848 MCPDMSFAIPSVELSLANFIYHFDWKLPTGIKPEDLDMTEIISLSVRRKQNLHLIPIPYN 4669

4668 PFPAE* 4651

 

>CYP71BE15P CAAP02007291.1 pseudogene 78% to 71BE1

6388 MEFSSSSVLFPFLLFLFMLFRIGKRSKPNISTPKLPPGPWKLPLIGNLHQLVGSLPHHSL 6567

6568 KDLAEKYGPLMHLQLGQVS 6624

6627 ASPQIAKEVMKTHDLNFAQRPHLLVTRIVTYDSTDIAFAPYGDYWRQLRKICVIELLSAK 6806

6807 RVRSFQLIRKEEVSNLIRFIDSCSRFPIDLREKISSFTFAVISKAALGKEFKEQDSLESV 6986

6987 LEEGTKLASGFCLADVYPSVKWIHLISGMRHKLEKLHGRIDG 7112

7111 EHRERMEKRTGELEAEEDFIDVLLKLQQDGDLELPLTDDNIKAVIL 7248

7688 GHATASTAVEWAMSEMMKNPRVMEQAQAEVRRVFDGKGDVDETGIDELKFLKAVVSETLR 7867

7868 LHPPFPLLLPRECREKCKINGYEVPVKTRMTINAWAIGRDPDYWTEAERFYPERFLDSSV 8047

8048 DYKGADFGFIPFDAGRRMCPGILFAIPSIELPLAHLLFHFDWELPNGMRHEDLDMTEVHG 8227

8228 LSAKRKHSLHLIPIPYNS*PVG* 8296

 

>CYP71BE16P CAAP02000648.1 pseudogene 66% to CYP71BE13

1591 QVHQRLDRILQNIIDEHKESKTTTETGKQEANEDLVDILLKLQKHGNFGFPLIDNNIKAIIL (0) 1776

2908 NIFGGGGETSSIAIEWAM*KMM 2973

2975 KNPRVMEKA*AKVRQIFGGKKTLR*WMKQV*DTL

3077 KTKVIINAWAIGRDPYYQTKAKRFHPE*FLDSPIDYKGNNFEYIPFGAGKRICPGILFAIPNIELPL 3277

3278 ANMLDHFDWELLYGMKKDDIDMTESFGLKVRRKQDLCLILIPHNPLHVE* 3427

 

>CYP71BG1 Solanum tuberosum

DR034423.1, BQ514535.1, BM114062.1, BQ119583.2, BQ506191.2, CK717210.1

67% to CYP71BG2P

MEASILQLLLLLSLTSCTILFYKIRRWRRPPSPPSLPIIGHLHLLTDMPHHTFFHLSQKLG

PIIHLQLGQIPTLIISSPRLAELILKTNDHIFCSRPQIIAAQYLSFGCSDITFSPYGPYWRQARKICVTE

LLSSKRVNSFQFIRNEEINRMIQLISSHFDSELSSELDLSQVFFALANDILCRVAFGKRF

IDDRLKDKDLVSVLTETQALLAGFCLGDFFPDWEWVNWLSGMKKRLMNNLKDLGEVCDEI

IDEHLMKKRDDDQNGDGSEDFVDVLLRVQKRDDLQVPITDDNLKALIL (0)

DMFVAGTDTSAATLEWTMTELARHPSVMKKAQDEVREIAAN

KGKVEEFDLQHLHYMKAVIKETMRLHPPVPLLVPRESIEKCTLDDYEIPAKTRVLINTY

AIGRDPEYWNNPLDYNPERFMEKDIDFRGQDFRFLPFGGGRRGCPGYALGLATIELSLAR

LLYHFDWKLPTGVEAQDVNLSEIFGLATRKRVALKLVPTINKLYLLSD*

 

>CYP71BG2 tomato breaker fruit Solanum lycopersicum

BM411522.1, BM412569.1, BP881630.1, ES895470.1, DB685010

DU947425.1 (GSS)

MEASILQLLLLLSLTSCTILFYKIRGRWRRRPPSPPSLPIIGHLHLLNQMPHHTFFNLSQ

KLGKIIYLQLGQIPTLIISSPRLAELILKTNDHIFCSRPQIIAAQYLSFGCSDITFSPYG

PYWRQARKICVTELLSSKRVHSFEFIRDEEINRMIELISSRSQSEVDLSQVFFGLA

NDILCRVAFGKRFIDDKLKDKDLVSVLTETQALLAGFCFGDFFPDFEWVNWLSGMKKRLM

NNLKDLREVCDEIIKEHLMKNRDDDGSEDFVDVLLKVQKRDDLQVPITDDNLKALI

LDMFVAGTDTSAATLEWTMTELAR

HPSVMKKAQNEVRKIVANRGKVEEFDLQHLHYMKAVIKETMRLHPPVPLLVPRESIEKCS

IDGYEVPAKTRVLINTYAIGRDPEYWNNPLDYNPERFMEKDIDLRG

QDFRFLPFGGGRRGCPGYALGLATIELSLARLL

YRFDWKLPSGVEAQDMDLSEIFGLATRKKVALKLVPTITKLYPTF*

 

>CYP71BG3 CA993587.1 Gossypium hirsutum CO128388.1 Gossypium raimondii

DRKWLNSRSQSLTPPSPPS

LPIIGHLHLLTDMPHHTFTILAQKLGPIIYLQLGQVPTVIVSSPRLARLILKTHDHVFSN

RPQLVSAQYLSFNCSDVTFSPYGPYWRQARKICVTELLSSKRVNSFQLIRDEEVSRLL

TTLSAHPGSEVNVSELFLSLANDILCRVAFGRRFTERVGSSNHLAAVLRETQELFAGM

SVGDFFPEWEWVHSVSGYKRRLMKNLNELRRVCDEVIQEHLQRGETGIKEDFVDVLLR

VQKQDNLEVPITDDNLKALVLDMFVAG TDTSAATLEWTMTELVKHPEIMK

QAQEEVRAVARRTGKAIDETHLQHLHFTKSIIKEAMRLHPTVPLLVPRESMDECIIDGYK

IPPKTRLLINTYAIGRDPNSWDNPLQFNPNRFQDSNIDLKDQDFRFLPFGGGRRGCPGYG

FGLATVEIALARLLFHFDWELPYGIHTDDVDVDEIFGLASRKRTPLILVPTVNEGL*

 

>CYP71BG4 DY280303.1, DY276238.1, Citrus clementina,

Citrus reticulata x Citrus temple EX448715.1 (C-term)

Hybrid: two different species of citrus are combined here

to try to achieve a full seq. The C-term is still missing.

63% to 71BG1, 64% to 71BG3, 63% to 71BG2

MMDSFTPQVLLPLFVVSIITLLYWKLLS

RSRSQPATAANTPPSPPNKYPIIGHLHLLTDMPHHTFAALADKLGPIFHLQLGQVPTVVI

SSSELAKLVLKTHDHVFASRPQLIADQYISFGCSDVTFASYGPYWRQVRKICVTELLSSK

RVGSFQAVRDEEVKRLLTSVKSQCGSVTDMSKLFFTLANDILCLAAFGMRYVNEEGKKSN

NLASVFTESQELLSGFCIGDFFPEW

GWLSSLSGFTRRLRKNTQDLTVAIDEIISEHLFRKQATDDSGSSLMDGDGD

FIDVLLRVQQRDDLEVPITDDNLKALVLDMFMA

GTDTTAATMEWTMTELARHPRVMKKAQEEVRRVASGGGEVNESHIQQLRYMKAVIKETMR

LHPTVPLLVPRESMEKCVLEGYEIPAKTRILINSYAIGRDPKSWENPLEYIPERFDENNI

DFKDQDFRLLPFGGGRRGCPGYSFGLATVETALARLLYHFDWALPPGV

 

@8

>CYP71BG5P CAAP02000323.1  pseudogene 48% to CAAP02001743.1a, 52% to 71P1 rice

like potato and cotton ESTs, 67% to 71BG1, 65% to 71BG2, 65% to 71BG3

 7816 LPPSPPPLPIIGHLHLLTDMPHHSLSDLALKLGPIIHPRLGQVATVVVSSARLAALVLKT 7995

 7996 HDHVFASRPPLTAAQYLSFGCSDVTFSPHGTYWRQARKICVTELLSPKRVTYFQFIRNEE 8175

 8176 THPPPHHLPSSLSALSGSETDMSQLFFTLANNLCRVAFGKRFMDDSEGEKKHMVDVLTE 8352

 8353 TQALFAGFCIGDFFPDWKWLNSITGLNRRLRKNLEELIAVCNEIIEEHVNEKKERED 8523

 8524 FVGVLLRVQKRKDLEVAITDDNLKALVL (0) 8607

 9110 DMFVAGTDT 9136 XXXXXXXXXX

 9142 ELARHPHVMKKAQQEVRNIASGEGKVEETHLHQLHYKKAVIK*TMRLHPPVPLLVPRQSM 9321

 9322 ENCILDGYEIPAKIQLLINTYAIGCVPQSWE 9414

10537 NPLDYNPKRFVDGDVDFKGQDSGFLPFGGGRRGCPSYSFGLATVEIALARLLYHFDWELP 10716

10717 HGVEADDMDLNEIFGLATRKNSGLILVPRY 10806

 

CYP72 family (22 genes) [21 pseudogenes]

 

CYP72A subfamily (18 genes) [20 pseudogenes]

 

>CYP72A85 CAAP02000598.1a 65% to CAAP02002795.1

68% to CAAP02000473.1

GSVIVT00009392001 on Genoscope browser

chrUn_random from 57641820 to 57646375 (4556bp) on strand +

no P450 neighbors

35540 MEIAYDSVLIFCAFALLSLAWRAFYLVWLRPRRLERCLRRQGLMGNSYRPLHGDAKKVSIMLKEA 35734

35735 NSRPINLSDDIVPRVIPFLYKTIQQY (1) 35812

37936 GKNSFTWVGPIPRVNIMKPELIREVFLEAGRFQKQKPNPLANFLLTGL 38079

38080 VSYEGEKWAKHRKLLNPAFHVEKLK (0) 38154

38642 LMSPAFHLSCRQMISKMEEMVSPEGSCELDVWPFLKNLTADALSRTAFGSSYEEGRRLFQL 38824

38825 LQEQTYLTMEVFQSVYIPGW (2) 38884

39056 YLPTKRNKRMKKIDKEMNTLLNDIITKRDKAMKDGKTANEDLLGILMESNSKEIQEGGN 39232

39233 SKNAGISMQEVIEECKLFYLAGQETTSNLLLWTMVLLSKHPNWQTLAREEVFQVFGKNKP 39412

39413 EFAGLSRLKV (0) 39442

39670 VTMIFYEVLRLYPPGATLNRAVYEDINLGELYLPSGVEIVLPTILVHHDPEIWGDDVKEF 39849

39850 KPERFSEGVMKATKGQVSYFPFGWGPRICIGQNFAMAEAKMALAMILQCFTFELSPSYTH 40029

40030 APTSVLTLQPQYGAHLILHKI* 40095

 

$$$$

 

>CYP72A86P CAAP02000598.1b pseudogene exons 4 and 5, 84% to CAAP02002686.1

117092 FFPTKTNKRMKQISKEVH 117039

117038 ALLGGIINKREKAMEAGETANSDLLGILMESNFREIQEHQNNTKIGMSAKDVIDECKLFY 116859

116858 LAGQETTSVLLLWTMVLLSQHPDWQARAREEVLQVFGNNKPENDGLNHLKI (0) 116706

116303 VTMIFHEVLRLYPPVTVLTRMVSKDTQVGDMYFPAGVQVSLPTILVHHDHEIWGDDAKEF 116124

116123 NPERFAEGVSKATKNQVSFLPFGWGPRVCIGQNFAMMEAKIALAMILQRFSFELSPSYAH 115944

115943 APYSLITIQPQYGAHLILRGL* 115878

 

>CYP72A86P CAAP02000983.1 pseudogene 84% to CAAP02002686.1

3 aa diffs to CAAP02000598.1b

chrUn_random from 3711617 to 3699111 on strand -

GSVIVT00000151001 first two lines

GSVIVT00000150001 C-term part

16119 TGDVISRTAFGSSYEEGRRIFQLQKEQTYLAIKVAMSVYIPGWR 15988

 4832 FFPTKTNKRMKQISKEVHALLGGIINKREK 4743

      AMEAGETANSDLLGILMESNFREIQEHQN 4655

 4654 NTKIGMSAKDVIDECKLFYLAGQETTSVLLLWTMVLLSQHTDWQARAREEVLQVFGNNKP 4475

 4474 ENDGLNHLKI 4445

 4035 VTMIFHEVLRLYPPVTVLTRMVSKDTQVGDMYFPAGVQVSLPTILVHHDHEIWGDDAKEF 3856

 3855 NPERFAEGVSKATKNQVSFLPFGWGPRVCIGQNFAMMEAKTALAMILQRFSFELSPSYAH 3676

 3675 APFSLITIQPQYGAHLILRGL 3613

 

$$$$

 

>CYP72A86P-ie5b CAAP02000598.1b-ie5b pseudogene internal exon 5 fragment

chrUn_random from 3699666 to 3699734 on strand -

116504 IMMIFHEVLKLLYPLYT*HHAMH 116436

 

$$$$

 

>CYP72A87 CAAP02000355.1a one stop codon, possible pseudogene, 93% to CAN67740.1

2 aa diffs to CAN71061.1 exon4 and 5

GSVIVP00005878001 Genoscope browser version stops at PERFS*

chrUn_random from 35354787 to 35357214 (2422bp) on strand +

35357215 to 35357439 continues to the true end

52926 MEMKQLNLVALSFAFITILIYAWRILNWMWLRPKRLERCLKQQGLAGNSYRLLHGDFKEM 53105

53106 FMMIKEASSRPISISDDIVQRITPFHYHSIKKY (1) 53204

53464 GKSSFIWMGPKPRVNIMEPELIRDVLSMHTVFRKSRVHAFVKLLVSGLPFLDGEKWA 53635

53636 KHRKIINPAFRLEKLK (0) 53683

53868 NMLPAFHLSCSDMISKWEGKLSTEGSCELDVWPYLQNLTSDAISRTAFGSNYEEGRMIFE 54047

54048 LQREQAQLLVQFSDSAYIPGWW (2) 54113

54473 FLPTKSNKRMKQIRKEVNALLWGIIDKRGKAMKAGETLNDDLLGILLESNFKEIQEHEN 54649

54650 DKNVGMCIKDVIEECKIFYFAGQETTSALLLWTMVLLSKHPNLQARAREEVLHVFGNNKP 54829

54830 EGDGLNHLKI (0) 54859

55153 VMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIW 55311

55312 GEDAREFNPERFS*GVLKATKSPVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFS 55491

55492 LSPSYSHAPCSLVTLKPQHGAHLILHGI* 55578

 

>CYP72A87 gi|147816916|emb|CAN71061.1| 60% to 72A15

2 aa diffs to CAAP02000355.1a

MXXEALNRGVM

FLPTKSNKRMKQIRKEVNALLWGIIDKRGKAMKAGETLNDDLLGILLESNFKEIQEHEN

DKNVGMCIKDVIEECKIFYFAGQETTSALLLWTMVLLSKHPNWQARAREEVLHVFGNNKPEGDGLNHLKI

VMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIWGEDAREFNPERFSQGVL

KATKSPVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFTLSPSYSHAPCSLVTLKPQHGAHLILHG

I

 

$$$$

 

>CYP72A87-de1b CAAP02000355.1b pseudogene 97% to CAN67740.1

3 aa diffs to CAN67740.1

chrUn_random from 35362457 to 35362735 on strand +

60596 MEMKQLNLVALSFAFITILIYAWRVLNWMWLRPKRLERCLKQQGLAGNSYRLLYGDFKEM 60775

60776 SMMIKEATSRPISISDDIVQRVAPFHYHSIKKY (1) 60874

 

>CYP72A88 CAAP02000355.1c 94% to CAN67740.1, 97% to CAAP02000355.1d

GSVIVP00005881001 in Genoscope browser

chrUn_random from 35380636 to 35383685 (3050bp) on strand +

78889 MEMKQLNLVALSFAFITILIYAWRILNWMWLRPKRLERCLKQQGLAGNSYRLLHGDLKEM 79068

79069 FMMIKEASSRPISISDDIVQRIAPFQYHSIKKY (1) 79167

79982 GKSSFIWMGPKPRVNIMEPELIRDVLSMHTVFRKPRVHALVKLLVSGLLFLDGEKWA 80152

80153 KHRKIINPAFRLEKLK (0) 80200

80450 NMLPAFHLSCSDMISKWEGKLSTEGSCELDVWPYLQNLTGDAISRTAFGSNYEEGRMIFE 80629

80630 LQREQAQLLVQFSESAFIPGWR (2) 80695

80839 FLPTKSNKRMKQIRKEVNALLWGIIDKRGKAMKAGETLNDDLLGILLESNFKEIQEHEN 81015

81016 DKNVGMSIKDVIEECKLFYFAGQETTSVLLLWTMVLLSKHPNWQARAREEVLHVFGNNKP 81195

81196 EGDGLNHLKI (0) 81225

81519 VMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIWGEDAREF 81698

81699 NPERFSQGVLKATKSPVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFSLSPSYSH 81878

81879 APCSLVTLKPQYGAHLILHGI 81941

 

>CYP72A89 CAAP02000355.1d 95% to CAN67740.1

GSVIVT00005885001 in Genoscope browser

chrUn_random from 35415527 to 35417550 (2024bp) on strand +

112703 MEMKQLNLVALSFTFITILIYAWRILNWMWLRPKRLERCLKQQGLAGNSYRLLHGDFKEM 112882

112883 FMMIKEATSRPISISDDIVQRIAPFHYHSIKKY (1) 112981

113632 GKSSFIWMGPKPRVNIMEPELIRDVLSMHTVFRKPRVHALVKLLVSGLLFLDGEKWA 113802

113804 KHRKIINPAFRLEKVK 113851

114101 NMLPAFHLSCSDMISKWEGKLSTEGSCELDVWPYLQNLTGDAISRTAFGSNYEEGRMIFE 114280

114281 LQREQAQLLVQFSESAFIPGWR 114346

114705 FLPTKSNKRMKQNRKEVNELLWGIIDKREKAMKAGETLNDDLLGILLESNFKEIQEHGN 114881

114882 DKNVGMSIKDVIDECKIFYFAGQETTSVLLLWTMILLSKHPNWQARAREEVLHVFGNNKP 115061

115062 EGDGLNHLKI 115091

115384 VMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIW 115542

115543 GEDAREFNPERFSQGALKATKSLVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFS 115722

115723 LSPSYSHAPCSLVTLKPQYGAHLILHGI 115806

 

>CYP72A90 gi|147795635|emb|CAN67740.1| 55% to 72A15

95% to CAAP02000355.1d

no exact match in Genoscope

MEMKQLNLVALSFAFITILIYAWRVLNWMWLRPKRLERCLKQQGLAGNSYRLLYGDFKEMSMMIKEATSR

PISFSDDILQRVAPFHYHSIKKYGKSSFIWMGLKPRVNIMEPELIRDVLSMHTVFRKPRVHALGKQPASG

LFFLEGEKWAKHRKIINPAFRLEKLKNMLPAFHLSCSDMISKWEXKLSTXGSCEXDVWPYLQNLTGDAIS

RTAFGSNYEEGRMIFELQREQAQLLVQFSQSACIPGWRFLPTKSNKRMKQNRKEVNELLWGIIDKREKAM

KAGETLNDDLLGILLESNFKEIQEHGNDKNVGMSIKDVIDECKIFYFAGQETTSVLLLWTMVLLSKHPNW

QARAREEVLHVFGNNKPEGDGLNHLKIVMMILHEVLRLYPPVPLLARTVYEDIQVGDMYLPAGVDVSLPT

ILVHHDHEIWGEDAREFNPERFSQGALKATKSLVSFFPFGWGSRLCIGQNFAILEAKMVLAMILQRFSFS

LSPSYSHAPCSLVTLKPQYGAHLILHGI

 

>CYP72A91P gi|147791559|emb|CAN72865.1| AM476150.2

52% to 72A15 95% to CAAP02000355.1c CYP72A88

has no exact match in Genoscope

MEMKQLNLVALSFAFITILIYAWRILNWMWLRPKRLERCLKQQGLAGNSYRLLHGDFKEMFMMIKEATSR

PISISDDIVQRIAPFHYHSIKKYGKSSFIWMGPKPRVNIMEPELIRDVLSMHTVFRKPRVHALVKLLVSG

LLFLDGEKWAKHRKIINPAFRLEKVK

NMLPAFHLSCSDMISKWD

(deletion of 7 aa)

SCELDVWPYLQNLTGDAISRTAFGSNYEKGRMIFE

LQREQAQLLVQFSESAFIPGWRFXPTKSNKRMKQIRKEVNALLWGIIDKRGKAMKAGETLNDDLLGILLE

SNFKEIQEHENDKNVGMSIKDVIEECKLFYFAGXETTSALLLWTMVLLSKHPNWQARAREEILHVFGNNK

PEGDGLNHLKIVMMILHEVLRLYPPVPFLARSVYEDIQVGDMYLPAGVDVSLPTILVHHDHEIWGEDARE

FNPERFSQGVLKAMKSPVSFFPFGWGSQSCIGQNFAILEAKMVLAMILQRFSFSLSPSYSHAPSSLVTLI

PQYGAHLXLHGI

 

>CYP72A92 CAAP02000149.1a 90% to CAAP02002795.1

GSVIVP00005888001 in Genoscope browser

chrUn_random from 35487285 to 35497057 (9773bp) on strand -

33193 MKLSSVAISFAFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMLRM 33014

33013 ISEANSRSISLSDDIVQRVLPFHCHSIKKY (1) 32921

31022 GKNYFIWMGPKPVVNIMDPELIRDVFLKYNAFRKPPPHPLGKLLATGLVTLEGEQ 30858

30857 WTKRRKIINPAFHLEKLK (0) 30804

30164 HMVPAFQLSCSDMVNKWEKKLSKDGSCELDIWPDLENLAGDAISRTAFGSSYEEG 30000

29999 RRIFQLQKEQAHLAVKVFRSVYIPGWR (2) 29919

29675 FVPTKTNKRMRQISNEVHALLKGIIERREKAMKVGETANDDLLSLLMESNFREMQEHDE 29499

29498 RKNVGMSIKDVIEECKLFYFAGQETTSDLLLWTMVLLSKHSNWQARAREEVLQVFGNKKP 29319

29318 DGDGLNHLKI (0) 29289

24060 VTIIFHEVLRLYPPVSMLIRTVVADSQVGGWYFPDGALITLPILLIHHDHEIWGEDAKEF 23881

23880 NPERFSEGVSKATKGQFAFYPFGYGPRVCIGQNFAMMEAKMALAMILQRFSFELSPSYAH 23701

23700 APSNIITIQPQYGAYLILHGL* 23635

 

$$$$

 

>CYP72A93 CAAP02000149.1b 86% to CAAP02002795.1

GSVIVP00005893001 in Genoscope browser

chrUn_random from 35547921 to 35552540 (4620bp) on strand -

88676 MELISVAISFAFITLLIYAWRLLNWVWLRPKKLERCLRQQGITGNSYRLLHGDVREMLRM 88497

88496 ISEANSRPISLSDEIVQRVLPFHYHSLKKY (1) 88407

86832 GKNYFIWMGPKPVVNIMDPELIRDVFLRYNAFHKPAPHPLGKLLATGLVTLEGEQ 86668

86667 WTKHRKIINPAFHLEKLK (0) 86614

85979 HMVPAFQLSCGDMVNKWEKKLSKDGSCELDIWPDLENLTGDAISRTAFGSSYEEG 85815

85814 RRIFQLQKEQAHLAVKVFRSVYIPGWR (2) 85734

85493 FVPTKTNKRIRQIRNELHALLKGIIEKREKAMLVGETANDDLLSLLMESNFREMQEHDE 85317

85316 RKNVGMSIDDVIEECKLFYFAGQETTSDLLLWTMILLSKHSNWQARAREEILQVFGNKKP 85137

85136 DGNGLNHLKI (0) 85107

84572 VTMIFHEVLRLYPPVSMLIRTVFVDSQVGRWYFPVGSHVALPILLIHHDHEIWGEDAKEF 84393

84392 NPERFSEGVSKATKGGQFAFFPFGYGPRACIGQNFAMMEAKMALAMILQRFSFELSPSYA 84213

84212 HAPFNVITVQPQYGAHLILHGL* 84144

 

>CYP72A93 gi|147833897|emb|CAN66491.1| AM486124.1

62% to 72A15

4 aa diffs to CAAP02000149.1b

11981 MELISVAISFAFITLLIYAWRLLNWVWLRPKKLERCLRQQGITGNSYRLLHGDVREMLRM 11802

11801 ISEANSRPISLSDEIVQRVLPFHYHSLKKY () 11712

10111 GKNYFIWMGPKPVVNIMDPELIRDVFLRYNAFHKPAPHPLGKLLATGLVTLEGEQ 9947

9946 WTKHRKIINPAFHLEKLK 9893

9253 HMVPAFQLSCSDMVNKWEKKLSKDGSCELDIWPDLENLTGDAISRTAFGSSYEEG 9089

9088 RRIFQLQKEQAHLAVKVFRSVYIPGWR ()

8767 FVPTKTNKRIRQIRNELHALLKGIIEKREKAMXVGETANDXLLSLLMESNFREMQEHDE 8591

8590 RKNVGMSVXDVIEECKLFYFAGQETTSDLLLWTMVLLSKHSNWQARAREEILQVFGNKKP 8411

8410 DGNGLNHLKI (0) 8381

7846 VTMIFHEVLRLYPPVSMLIRTVFPDSQVGRWYFPVGSHVALPILLIHHDHEIWGEDAKEF 7667

7666 NPERFSEGVTKATKGGQFAFFPFGYGPRACIGQNFAMMEAKMALAMILQRFSFELSPSYA 7487

7486 HAPFNVITVQPQYGAHLILHGL* 7418

 

$$$$

 

>CYP72A94P CAAP02000149.1c pseudogene exon 4 only, 79% to CAAP02001786.1

chrUn_random from 35615448 to 35615834 on strand -

152090 FFPTKTNKRMKQISKEVHALLRGIINKREKAMEAGETANSGLLGILMESNFKEIHEHQN 151914

151913 NMKIGMSAKDVIDECKLFYLAGQETISVLLLWTMVLPSQHSDWQARAREEV*QVFGNNK 151737

151736 RQNDGLNHLKI (0)

 

$$$$

 

>CYP72A95 CAAP02000149.1d frameshift in exon 5, possible pseudogene

same as CAN72247.1, 94% to CAAP02002686.1 another pseudogene

GSVIVT00005897001 in Genoscope browser

chrUn_random from 35636714 to 35642885 (6172bp) on strand -

not correctly assembled, only contains C-term from MMEAK to end

184279 MKHSSVAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMSM 184100

184099 MLKEAYSRPISLSDDIAPRVLPFHCHFIKKY (1) 184007

183058 GKNFFAWFGPNPMVNIMEPELIRDILLKSNVFQKPPPHPLGKLLVSGLVTLEGERWA 182888

182887 KRRKNINPAFHLEKLK (0) 182840

182021 NMLPAFHLSCSDMVTKWKMLSVGGSCELDVWPYLENLTGDVISRTAFGSSYEEGR 181857

181856 RIFQLQKEQTHLAIQVTMSVYIPGWR (2) 181779

180119 FLPTKTNRRMKQISKEVYALLRGIVNKREKAMKAGETANSDLLGILMESNFREIQEHQN 179943

179942 NKKIGMSVRDVIEECKLFYLAGQETTSVLLVWTMVLLSEHPNWQARAREEVLQVFGNKKP 179763

179762 EADGLNHLKI (0) 179733

179322 VTMIFHEVLRLYPPIAMLARAVYKDTQVGDMCFPAGVQVRP 179200

179203 PTILVHHDHEIWGDDAKEFNPERFAEGVLKATKNQVSFFPFGWGPRVCIGQNFAMMEAKI 179024

179023 ALAMILQHFSFELSPSYAHAPFNILTMQPQYGAHLILRGLQC* 178895

 

>CYP72A95 gi|147815271|emb|CAN72247.1| 50% to 72A10, = CAAP02000149.1d, cyan part too long

MKYQKVQIXWSSRAGSTLRHLPRCEGCELSLEALKKSLKLE

MKHSSVAISFGFLTVLISCLWRLLNWVWL

RPKRLERCLREQGLAGNSYRLLHGDFKEMSMMLKEAYSRPISLSDDIAPRVLPFHCHFIKKYGKNFFAWF

GPNPMVNIMEPELIRDILLKSNVFQKPPPHPLGKLLVSGLVTLEGERWAKRRKNINPAFHLEKLKNMLPA

FHLSCSDMVTKWKMLSVGGSCELDVWPYLENLTGDVISRTAFGSSYEEGRRIFQLQKEQTHLAIQVTMSV

YIPGWR

 

$$$$

 

>CYP72A96 CAAP02000473.1 97% to CAAP02000149.1d, bad boundary at RKNFF

GSVIVP00000189001 on Genoscope browser (missing N-term)

chrUn_random from 4489657 to 4495024 on strand -

52293 MKHSSVAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMS 52117

52116 MMLKEAYSRPISLSDEIAPRVLPFHCHFIKKY (1) 52021

51446 RKNFFAWFGPNPMVNIMEPELIRDVLLKSNVFQKPPPHPLGKLLVSGLVTLEGERWA 51276

51275 KRRKIINPAFHLEKLK (0)

50085 NMLPAFHLSCSDMVTKWKMLSVGGSCELDVWPYLENLTGDVISRTAFGSSYEEGR 49921

49920 RIFQLQKEQTHLAIQVTMSVYIPGWR (2) 49843

48155 FLPTKTNRRMKQISKEVYALLRGIINKREKAMKAGETANSDLLGILMESNFREIQEHQN 47979

47978 NKKIRMSVKDVIEECKLFYLAGQETTSVLLVWTMVLLSEHPNWQARAREEVLQVFGNKKP 47799

47798 EAAGLNHLKI (0) 47769

47357 VTMIFHEVLRLYPPVAMLARAVYKDTQVGDMCFPAGVQVVLPTILVHHDHEIWGDDAKEF 47178

47177 NPERFAEGVLKATKNQVSFFPFGWGPRVCIGQNFAMMEAKIALAMILQHFSFELSPSYAH 46998

46997 APFSILTMQPQYGAHLILRGLQC* 46926

 

>CYP72A97P CAAP02002686.1 pseudogene, 76% to CAAP02004668.1 CYP72A

94% to CAAP02000149.1d

GSVIVP00000152001 in Genoscope browser not assembled correctly

Only exon6 and 7 correct in this model (VFGN-HRAV)

chrUn_random 3749666 to 3760267 on strand -

13105 MKHSSVAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMSM 12926

12925 MLKEAYSRPISLSDDTTPRVLPFHFHFIKKY 12833

11910 GKNSFAWFGPNPMVNIMEPELIRDVLLKSNVFQKPPPHPLGKLLVSGLVTLEGERWA 11740

11739 KRRKIINPAFHLEKLK 11692

10658 NMLPAFHLSCSDMVTKWKMLSVGGSCELDVWPYLENLTGDVISRTAFGSSYE 10503

10502 EGRRIF*LQKEQTHFASQ 10449

5472 VTMSVYIPGWR 5440

3730 FYPQRRNRRMKQISKEVYALLRGIVSNREKAMKAGETASSDLLGILMESNFREIQEHQNN 3551

3550 KKIGMSVKDVIEECKLFSLDGQETTSVLLVWTMVLLSEHPNWQACAREEVLQ 3395

3395 VFGNKKPEADGLNHLKI 3345

2933 VTMIFHEVLRLYPLVAMLHRAV 2868

2866 YKDTQVGDMCFPVGVQVVLPTILVHHDHEIWGDDAKEFNPKRFAEAVLKATKNQVSFFPF 2687

2686 GWGPRVCIGQNFAMMEAKIALAMILQHFSFELSPSYAHAPFSILTMQPQYGAHLILRGLQC* 2501

 

>CYP72A97P-ie5b CAAP02002686.1a-ie5b see CAAP02000598.1b-ie5b

3152 IMMIFHEVLKL 3120

 

>CYP72A98 gi|147777099|emb|CAN63404.1| AM456876.2

49% to 72A14

86% to CAAP02002686.1, 88% to CAAP02000149.1d

no exact match in Genoscope

MKHSSVAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMSMMLKEAYSRPI

SLSDDIAPRVLPFHCHFIKKY ()

GKNSFAWFGPNPMVNIMEPGLIRDVLLKSNVFQKPPPHPLGKLLVSGLV

TLEGERWAKRRKIINPAFHLEKLK ()

NMLPAFQLSCSDMVTKWKKLSVGGSCELDVWPXXXXXXXX

VISRTAFGSSYEEGRRIFQLQKELTHLASQ

VTMSVYIPGXR ()

FLSTKMNRRMKXISKEVYALLRGIINKREKAMKAGKXANSEXLLGILMESNFREI

QEHQNNKKIGMSAKDXIEECKLFYLAGQETTSVLLLWTMFLLSEHPNWQACAREEVLQVFGKK

KPEADGLNHLKI

VTMIFHEVLRLY

PLVAMLNRAVYKDTQVGDMYFPARVQVALPTILVHHDHEIWGDNAKGFDPERFAEGILKATKTSSA

(Deletion)

CIGQNFAMMEAKIALAMILQHFSFELSPSYAHAPFNILTMQPQYGVHLILRGLQC

 

$$$$

 

>CYP72A99 CAAP02004338.1a runs off the end, 84% to CAN72247.1

100% to CAO16049.1 end is 98% to CAAP02000983.1

1697 MKLSSVAISFGFLTVLISCVWRLLNWVWLRPKRLERCLREQGLAGNSYRLLQGDSKEMSR 1518

1517 MMKEAYSRPISLSDDIVQRVLPFHCHFIKKY (1) 1425

172  GKNFFTWVGPSPRVNIMEPELMRDVLLKSNIFQKTPSHPLVKLLVSGLVALEGEQWA 2

 

>CYP72A99 gi|157327641|emb|CAO16049.1| unnamed protein product [Vitis vinifera]

4 aa diffs to CAAP02000983.1 from TDGE to end

GSVIVP00009398001 in Genoscope

chrUn_random from 57722038 to 57738734 on strand -

MKLSSVAISFGFLTVLISCVWRLLNWVWLRPKRLERCLREQGLAGNSYRLLQGDSKEMSRMMKEAYSRPI

SLSDDIVQRVLPFHCHFIKKYGKNFFTWVGPSPRVNIMEPELMRDVLLKSNIFQKTPSHPLVKLLVSGLV

ALEGEQWAKRRKIINPAFHPEKLKNMLSAFHLSCSDMVNKWKKLSVEGSCELDVWPYLENL

TGDVISRTA

FGSSYEEGIRIFQLQKEQTYLAIKVAMSVYIPGWRFFPTKTNKRMKQISKEVHALLGGIINKREKAMEAG

ETANSDLLGILMESNFREIQEHQNNTKIGMSAKDVIDECKLFYLAGQETTSVLLLWTMVLLSQHPDWQAR

AREEVLQVFGNNKPENDGLNHLKIVTMIFHEVLRLYPPVTVLTRMVSKDTQVGDMYFPAGVQVSLPTILV

HHDHEIWGDDAKEFNPERFAEGVSKATKNQVSFLPFGWGPRVCIGQNFAMMEAKIALAMILQRFSFELSP

SYAHAPYSLITIQPQYGAHLILRGL

 

$$$$

 

>CYP72A100P CAAP02004338.1b 90% to CAAP02004439.1

100% to CAO16050.1 = CU459449.1

32686 SLPQAT MIFHKVLRLYPLVAMLPRVVYKDTQVGDMCFPAGVQVLLSTILVHHDHEILGDD 32507

32506 AKEFNPERFAEGVLKATKNQVSFFPFGWGPRVCIGQNFAMMEAKIALAMIL*HFSFELSP 32327

32326 SYTHASFSILTMQPQYGAHLILRGLQC* 32243

 

>CYP72A100P gi|157327642|emb|CAO16050.1| unnamed protein product [Vitis vinifera]

beginning = CAAP02001786.1

92% to CAAP02000473.1

GSVIVP00009400001 in Genoscope browser not correctly assembled

57770331 to 57770582 on strand - NY* to CKLF

57769280 to 57770326 on strand - YLAG to end

NY*HAHRFLPTKMNRRMKQISKEVYALLRGIINKREKAMKAGKTANSDLLGILMESNFREIQEHQNNKKIGMS

VKDVIEECKLF

YLAGQKTTSVLLVWTMALLSEHPNWQAHAREEVLQVFGNKKWEVDGLNHLKIAT

MIFHKVLRLYPLVAMLPRV

VYKDTQVGDMCFPAGVQVLLSTILVHHDHEILGDDAKEFNPERFAEGVLKATKNQVSFFPFGWGPRVCIG

QNFAMMEAKIALAMIL*HFSFELSPSYTHASFSILTMQPQYGAHLILRGLQC*

 

>CYP72A101PX = CYP72A100P

CAAP02001786.1  pseudogene 89% to CAAP02004439.1 CYP72A

57770174 to 57770582 on strand -

note CYP72A100P may be identical to 72A101P (merge)

426 NY*HAHRFLPTKMNRRMKQISKEVYALLRGIINKREKAMKAGKTANSDLLGILMESNFRE 247

246 IQEHQNNKKIGMSVKDVIEECKLFY 172

170 YLAGQKTTSVLLVWTMALLSEHPNWQAHAREEVLQVFGNKKWEVDGLNHLK 18

 

$$$$

 

>CYP72A102P CAAP02004439.1 pseudogene 83% to CAAP02000101.1

91% to CAAP02000473.1

GSVIVP00000178001 in Genoscope browser not correctly assembled

4296186 to 4297437 on strand -

2539 NY*HAHRFLPTKTNRKMKQISKEVYALLRGIVNKREKAMKVGETTNSDLLGMLMESNFRE 2360

2359 IQEHQNNKKIRISVKDVIEECKLFYLAGQKTTSVLLVWTMVLLSEHPN*QARAREEVLQV 2180

2179 FGNKKWEADGLNHLKI (0) 2135

1719 VTMIFHEVLRLYPPIAMLPRVVYKDTQVGDMCFPTGLQVVLPTILVHHDHEIWGDD 1552

1551 AKEFNPKRFVEGVLKVTKNQVSFFPFGWGPRVCIGQNFAMMEAKIALAMIL*HFSFELSP 1372

1371 SYTHASFNILTM*PQYGAHLILHGLQC* 1288

 

$$$$

 

>CYP72A103 CAAP02002795.1 87% to CAAP02004668.1

90% to CAAP02000149.1a

25135 MKLSSVAISFAFITLLIYAWRLLNSVWLKPKKIERYLRQ

25018 QGLIGNSYRLLHGDFREMSRMIDEANSRPISLSDDIVQRVLPFHYHSIKKY (1) 24866

24228 GKNCFIWMGPKPVVNIMEPELIRDVLLKHNAFQKPPVHPLGKLLATGVIALEGEQ 24064

24063 WTKRRKIINPAFHLEKLK (0) 24010

23631 HMVPAFQLSCSEMVNKWEKKLSKDGSCELDIWPDLENLAGDVISRTAFGSSYEE 23470

23469 GRRIFQLQKEQAHLAVQVSQSIYIPGWR (2) 23386

23080 FVPTKTNKRMRQISNEVNALLKGIIERREKAMKVGETANDDLLGLLMESNYKEMQEHGE 22904

22903 RKNVGMSNKDVIEECKLFYFAGQETTSVLLLWTMVLLSKHSNWQARAREEVLQVFGNKKP 22724

22723 DGDGLNHLKI (0) 22694

22147 VTMIFHEVLRLYPPASMLIRSVYADTEVGGMYLPDGVQVSLPILLLHHDHEIWGDDAKDF 21968

21967 NPERFSEGVSKATKGQFAFFPFGYGPRVCIGQNFAMMEAKMALAMILQRFSFELSPSYAH 21788

21787 APISVITIQPQYGAHLILHGL* 21722
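
The coordinate-flanked lines in entries such as this one pair a nucleotide start and end with a translated segment, so each line should span three nucleotides per residue (a terminal * counts as one codon). A minimal sketch of that check, assuming the line layout "start residues [(digit)] end"; the names SEGMENT and segment_is_consistent are hypothetical, and the parenthesised digit is captured but not interpreted:

```python
import re

# One coordinate-flanked segment: start, residues ('*' marks a stop),
# an optional parenthesised digit, then the end coordinate.
SEGMENT = re.compile(r"^\s*(\d+)\s+([A-Z*]+)\s*(?:\((\d)\))?\s+(\d+)\s*$")


def segment_is_consistent(line):
    """Return True/False for a parsable segment line, None for anything else."""
    match = SEGMENT.match(line)
    if match is None:
        return None
    start, residues, _marker, end = match.groups()
    span = abs(int(end) - int(start)) + 1  # inclusive span, either strand
    return span == 3 * len(residues)


# Three lines copied from the CYP72A103 CAAP02002795.1 entry.
lines = [
    "24228 GKNCFIWMGPKPVVNIMEPELIRDVLLKHNAFQKPPVHPLGKLLATGVIALEGEQ 24064",
    "24063 WTKRRKIINPAFHLEKLK (0) 24010",
    "21787 APISVITIQPQYGAHLILHGL* 21722",
]
results = [segment_is_consistent(line) for line in lines]
```

Lines that fail the check are not necessarily wrong: pseudogene segments with frameshifts, and lines that give only one coordinate, fall outside this simple rule.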

 

>CYP72A103 gi|157356442|emb|CAO62605.1| unnamed protein product [Vitis vinifera]

identical to CAAP02002795.1

GSVIVP00000202001 in Genoscope browser

4708885 to 4712295 on strand –

MKLSSVAISFAFITLLIYAWRLLNSVWLKPKKIERYLRQQGLIGNSYRLLHGDFREMSRMIDEANSRPIS

LSDDIVQRVLPFHYHSIKKYGKNCFIWMGPKPVVNIMEPELIRDVLLKHNAFQKPPVHPLGKLLATGVIA

LEGEQWTKRRKIINPAFHLEKLKHMVPAFQLSCSEMVNKWEKKLSKDGSCELDIWPDLENLAGDVISRTA

FGSSYEEGRRIFQLQKEQAHLAVQVSQSIYIPGWRFVPTKTNKRMRQISNEVNALLKGIIERREKAMKVG

ETANDDLLGLLMESNYKEMQEHGERKNVGMSNKDVIEECKLFYFAGQETTSVLLLWTMVLLSKHSNWQAR

AREEVLQVFGNKKPDGDGLNHLKIVTMIFHEVLRLYPPASMLIRSVYADTEVGGMYLPDGVQVSLPILLL

HHDHEIWGDDAKDFNPERFSEGVSKATKGQFAFFPFGYGPRVCIGQNFAMMEAKMALAMILQRFSFELSP

SYAHAPISVITIQPQYGAHLILHGL
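
The "identical to" note can be spot-checked by joining the coordinate-flanked segments of the genomic entry and comparing the result with the GenPept protein. A minimal sketch, assuming segments are given in gene order; join_segments is a hypothetical helper, and only the first exon of CYP72A103 is used here:

```python
import re


def join_segments(lines):
    """Concatenate residue runs, dropping coordinates and (digit) markers."""
    parts = []
    for line in lines:
        text = re.sub(r"\(\d*\)", " ", line)  # remove markers such as (1)
        for token in text.split():
            if re.fullmatch(r"[A-Z*]+", token):  # keep residue runs only
                parts.append(token)
    return "".join(parts).rstrip("*")  # drop a trailing stop symbol


# First exon of CYP72A103 as listed under CAAP02002795.1.
exon1 = [
    "25135 MKLSSVAISFAFITLLIYAWRLLNSVWLKPKKIERYLRQ",
    "25018 QGLIGNSYRLLHGDFREMSRMIDEANSRPISLSDDIVQRVLPFHYHSIKKY (1) 24866",
]

# Start of the CAO62605.1 protein, copied from the entry above.
protein_start = (
    "MKLSSVAISFAFITLLIYAWRLLNSVWLKPKKIERYLRQQGLIGNSYRLLHGDFREMSRMIDEANSRPIS"
    "LSDDIVQRVLPFHYHSIKKY"
)

joined = join_segments(exon1)
matches = joined == protein_start
```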

 

$$$$

 

>CYP72A104P gi|147798934|emb|CAN63796.1| AM469525.2 56% to 72A7

pseudogene

5 aa diffs plus some errors to CAAP02002795.1 (same vs. CAO62605.1)

no exact match in Genoscope, may be the same seq as CYP72A103

RFVPTXTNKRMRQISNEVNALLKGIIERREKxxEVGExxTSTANXXLLGLLMESNYKEMQEHDERKNVGMS

NKDVIXECKLFYFAGQETTSVLLLWTMVLLSKHSXWQARAREEVLQVFGNKKPDGDGLXHLKI (0)

14303 VTMIFHEVLRLYPPASMIXX 14250

14251 SVYXDTEVGG

MYLPDGVXVSLPILLVHHDHEIWGDDAKDFNPERFSEGVSKATKGQFAFFPFGYGPRVCIGQNFAMMEAK

MALAMIVQRFSFELSPSYAHAPFSVITIQPQYGAHLILHGL

 

$$$$

 

>CYP72A105 CAAP02002402.1a 91% to CAAP02004668.1

no exact match in Genoscope

 9868 MKLSSVAVSFAFITLLIFAWRLLNWVWLRPKKLERCLRKQGLTGNSYRLLHGDFREMSRM 10047

10048 NNEANSGPISFSDDIVKRVLPFFNHSIQKY (1) 10137

11178 GKNSFTWLGPKPVVNIMEPELIRDVLLKHNVFQKPPPHPLGKLLATGVVALEGEQW 11345

11346 TKRRKIINPAFHLEKLK (0) 11396

11706 HMVSAFQLSCSDMVNKWEKKLSMDDSCELDIWPYLQILTGDVISRTAFGSSYEEGRRIFQ 11885

11886 LQKEQAHLVAQVTQSVYVPGWR (2) 11951

12540 FFPTKINRRMRQIRNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYREMQENDE 12716

12717 RKNVGMSIKDVIEECKLFYFAGQETTSVLLLWTMVLLSKHSNWQARAREEVLQVFGNKKP 12896

12897 DGDGLNHLKI (0) 12926

13502 VTMIFHEVLRLYPPASMLIRTVFADSQVGGLYLSDGVLIALPILLIHHNHEIWGEDAKEF 13681

13682 NPGRFSEGVSKAAKTQVSF 13738

13737 FFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFDLSPSYAHAPSS 13874

13875 LLMQPQHGAHLILHGL* 13925

 

>CYP72A105 gi|147810740|emb|CAN67452.1| 64% to 72A15

3 aa diffs to CAAP02002402.1a

MVLLSKHSNWQARAREEVLQVFGNKKPDGDGLNHLKIVTMIFHEVLRLYPPASMLIRTVFADSQVGGLYL

PDGVLIXLPILLIHHNHEIWGEDAKEFNPGRFSEGVSKAAKTQVSFFPFGYGPRICVGQNFAMMEAKMAL

AMILQRFSFDLSPSYAHAPXSLLTMQPQHGAHLILHGL

 

$$$$

 

>CYP72A106P CAAP02002402.1b pseudogene

GSVIVP00011018001 on Genoscope Browser not assembled correctly

69155641 to 69158451 on strand +

48350 ISVAISFAFITLLIYAWRLLNWVWLRPKKLERCLRQQGITGNSYRLLHGDVREMLRMISE 48529

48530 ANSRPISLSDEIVQRVLPFHYHSLKKYGIAGFL 48628

49855 SRFVPTKTNKRMRQISNEVNALLKGSIERREKAMKVGEMREHDERKNVG 50001

50002 MSNKDVIKECKLFYFAGQETTSVLLLWTMVPLSKHSNWQGRAREEVLQVFGNKKPDGDG 50178

50179 LNHLK 50193

50799 VYADTEVGGMYLPDGVQVSLPILLVHHDHEIWGDDAKDFNPERFSEGVSKATKGQFAFFP 50978

50979 FGYGPRVCIGQNFAMMEAK 51035

50717 MIKLFSILQVTMIFHEVLRLYPPASMICLC

      MALAMIVQRFS 51067

51068 FELSPSYAHAPFSVITIQPQYGAHLILHGL 51157

 

$$$$

 

>CYP72A107 CAAP02002484.1 96% to CAAP02004668.1 CYP72A

no exact match in Genoscope

10902 MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYSCLYGDFKEMSRM 11081

11082 INEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 11171

12213 GKNSFTWLGPKPVVNIMEPELIRDVFLKHNAFQKVPPHPLGKLLATGVVALEGEQW 12380

12381 TKRRKIINPAFHLEKLK (0) 12431

12655 HMVSAFQLSCSDMVNKWEKKLSLDGSCELDVWPYLENLAGDVISRTAFG 12801

12802 SSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWR (2) 12900

13371 FFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYRDMQENDE 13547

13548 RKNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQTRAREEVLRVFGNKKP 13727

13728 DGDGLNHLKI (0) 13757

14373 VTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYLPDGVQIALPILLLHHDHEIWGEDAKEF 14552

14553 NPGRFSEGVSKAAKTQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSYAH 14732

14733 APISLITMQPQYGAHLILHGL* 14798

 

>CYP72A107 gi|147791938|emb|CAN72443.1|

 gi|147791939|emb|CAN72444.1| AM462621.1 65% to 72A15

100% to CAAP02002484.1

Note CAN72443.1 is the N-terminal of the same gene

same as CAN68126.1 and 1 aa diff to CAAP02004668.1

adjacent to CAN72443.1

MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYSCLYGDFKEMSRMINE

ANSRPISFSDDIVQRVLPFHDHSIQKY

GKNSFTWLGPKPVVNIMEPELIRDVFLKHNAFQKVPPHPLGKLLATGVVALEGEQW

TKRRKIINPAFHLEKLK ()

HMVSAFQLSCSDMVNKWEKKLSLDGSCELDVWPYLENLAGDVISRTAFG

SSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWR ()

FFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYRDMQENDE

RKNVGMSIKDVIEECKLFYLAGQETTSVLLLWT

MVLLSKHSNWQTRAREEVLRVFGNKKPDGDGLNHLKI (0)

VTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYL

PDGVQIALPILLLHHDHEIWGEDAKEFNPGRFSEGVSKAAKXQVSFFPFGYGPRICVGQNFAMMEAKMAL

AMILQRFSFELSPSYAHAPISLJTXXPQYGAHLILHGL

 

>CYP72A107 gi|147781059|emb|CAN68126.1| AM465661.2 partial seq

66% to 72A15

1 aa diff to CAAP02004668.1

3 aa diffs to CAAP02002484.1

508 FFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYRDMQENDE 684

RKNVGMSIKDVIEECKLFYLAGQETTSVLLLWT

MVLLSKHSNWQTRAREEVLRVFGNKKPDGDGLNHLKI (0)

VTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYL

PDGVQIALPILLLHHDHEIWGEDAKEFNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMAL

AMILQRFSFELSPSYAHAPISLLTTHPQYGAHLILHGL
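
Notes of the form "N aa diffs to ..." can be reproduced by counting mismatched positions between two pre-aligned sequences. A minimal sketch, assuming equal-length input with no gaps and treating X as an unknown residue that is never counted as a difference (that choice is an assumption, not a rule stated in this file); count_differences is a hypothetical name. Only the C-terminal 21 residues are compared here, so the result applies to that fragment and not to the whole protein:

```python
def count_differences(seq_a, seq_b, unknown="X"):
    """Count positions where two equal-length sequences differ.

    Positions where either residue is `unknown` are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and unknown not in (a, b)
    )


# C-terminal residues copied from the CAN68126.1 and CAAP02002484.1 entries.
tail_can68126 = "APISLLTTHPQYGAHLILHGL"
tail_caap02002484 = "APISLITMQPQYGAHLILHGL"

tail_diffs = count_differences(tail_can68126, tail_caap02002484)
```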

 

$$$$

 

>CYP72A108 CAAP02004668.1 72% to CAN67740.1 CYP72A

96% to CAAP02002484.1

GSVIVP00011014001 on Genoscope Browser

69064443 to 69068470 on strand +

7761 MKLSSVAISFAFIVLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMS 7934

7935 EMINEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 8030

9054 GKNSFTWFGPKPVVYIMEPELIRDVLLKHNVFQKPPPHPLSKLLATGVVAL 9206

9207 EGEQWTKRRKIINPAFHLEKLK (0) 9272

9534 HMVSAFQLSCSDMVNKWEKKLSLDGSCELDVWPYLENLAGDVISRTAF 9677

9678 GSSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWR (2) 9779

10366 FFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVGETANHDLLGLLMESNYRDMQENDER 10545

10546 KNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQTHAREEVLRVFGNKKPD 10725

10726 GDGLNHLKI (0) 10752

11363 VTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYLPDGVQIALPILLLHHDHEIWGEDAKE 11539

11540 FNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSYA 11719

11720 HAPISLLTTHPQYGAHLILHGL* 11788

 

>CYP72A108 gi|147858656|emb|CAN80407.1| 40% to 72A10

2 aa diffs to CAAP02004668.1 and CAO21263.1

MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSXMINEANSRPIS

FSDDIVQRVLPFHDHSIQKYGEQWTKRRKIINPAFHXEKLKHMVSAFQLSCSDMVNKWEKXLSLDGSCEL

DVWPYLENLAGDVISRTAFGSSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWR

 

>CYP72A108 gi|157328551|emb|CAO21263.1| unnamed protein product [Vitis vinifera]

100% to CAAP02004668.1

MKLSSVAISFAFIVLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSEMINEANSRPIS

FSDDIVQRVLPFHDHSIQKYGKNSFTWFGPKPVVYIMEPELIRDVLLKHNVFQKPPPHPLSKLLATGVVA

LEGEQWTKRRKIINPAFHLEKLKHMVSAFQLSCSDMVNKWEKKLSLDGSCELDVWPYLENLAGDVISRTA

FGSSYEEGRRIFQLQREQAHLAIQVTRSIYVPGWRFFPTKTNRRMRQISNEVNALLKGIIEKREKAMKVG

ETANHDLLGLLMESNYRDMQENDERKNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQTH

AREEVLRVFGNKKPDGDGLNHLKIVTMIFHEVLRLYPPVSMLLRTVFADSQVGGLYLPDGVQIALPILLL

HHDHEIWGEDAKEFNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSP

SYAHAPISLLTTHPQYGAHLILHGL

 

$$$$

 

>CYP72A109 CAAP02001850.1 6 aa diffs to CAAP02003454.1

exact match to 53909800 to 53913784 + strand

GSVIVP00009051001 in Genoscope browser

 163 MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSRM 342

 343 INEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 432

1262 GKNNFIWLGPKPVVNIMQPELIRDVLLKHNAFQKPPPHPLGKLLASGISSLDGEQW 1428

1429 TKRRKIINPAFHLEKLK (0) 1479

1885 HMVSAFQLSCSDMVNKWEKQLSLDGSCELDIWPYLQNLTGDVISRTAFGSSYEEG 2049

2050 RRIFQLQKEQALLTVQVTRSVYVPGWR (2) 2130

2730 FFPTKTNRRMRQISSEVDALLKGIIEKREKAMQAGETANDDLLGLLMESNYREMQENDE 2906

2907 RKNVGMSIKDVIEECKLFYLAGQETTSALLLWTMVLLSKHSNWQARAREEVLRVFGNKKP 3086

3087 DGDGLNHLKI (0) 3116

3722 VTMIFHEVLRLYPPAPMLTRAVFADSQVGGLYLPDGVQIALPILLIHHDDKIWGEDAK 3895

3896 EFNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSY 4075

4076 AHAPISLLTIQPQHGAHLILHGL* 4147

 

>CYP72A110 CAAP02001422.1 6 aa diffs to CAAP02001850.1

no exact match in Genoscope

11379 MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSRM 11200

11199 INEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 11110

10292 GKNNFIWLGPKPVVNIMQPELIRDVLLKHNAFQKPPRHPLGKLLASGVASLEGEQW 10125

10124 TKRRKIINPAFHLEKLK 10074

 9745 HMVSAFQLSCSDMVNKWEKQLSLDGSCELDIWPYLQNLTGDVISRTAFGSSYEEG 9582

 9581 RRIFQLQKEQALLTVQVTRSVYVPGWR 9501

 8901 FFPTKTNRRMRQISSEVDALLKGIIEKREKAMQAGETANDDLLGLLMESNYREMQENDE 8725

 8724 RKNVGMSIKDVIEECKLFYLAGQETTSALLLWTMVLLSKHSNWQARAREEVLRVFGNKKP 8545

 8544 DGDGLNHLKI 8495

 7909 VTMIFHEVLRLYPPAPMLTRAVFADSQVGGLYLPDGVQIALPILLIHHDDKIWGDDAKEF 7730

 7729 NPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSYAH 7550

 7549 APISLLTMQPQHGAHLILHGL* 7484

 

>CYP72A111P CAAP02001422.1 pseudogene 96% to CAAP02003454.1

GSVIVP00009781001 on Genoscope Browser not assembled correctly

Chr19_random 699832 to 701684 on strand -

32669 GKNNFIWLGPKPVVNIMQPELIRDVLLKHNAFQKAPRHPLRKLLASGIASLEGEQW 32502

32501 TKRRKIINPAFHLEKLK 32451

32049 HMVSAFQLSCSDMVNKWEKQLSLDGSCELDIWPYLQNLTGDVISRTAFGSSYEEG 31885

31884 RRIFQLQKEQALLAVQVTRSVYVPGWR 31804

31203 FFPTKTNRRMRQISSEVNALLKGIIEKREKAMQAGETANDDLLGLLMESNYREM 31042

31041 QENDERKNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNRQACAREEVLRLF 30862

30861 GNKKPDGDGLNHLKI 30717

 

>CYP72A112P CAAP02001422.1 pseudogene fragment 52% to CAAP02002795.1

Chr19_random 732617 to 733044 on strand +

63362 VIS*TTFGSSYEEGRRILLLQEELA*LTIRIF 63457

63664 KGNKRIKKADKEIQELLRGIIDQREKAMKVCETVNDDLLSIL 63789

 

>CYP72A113 CAAP02003454.1 91% to CAN67740.1

GSVIVP00000208001 in Genoscope browser

4821416 to 4825397 on strand +

21735 MKLSSVAISFVFITLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMS 21908

21909 RMINEANSRPISFSDDIVQRVLPFHDHSIQKY (1) 22004

22833 GKNNFIWLGPKPVVNIMQPELIRDVLLKHNAFQKPPPHPLGKLLASGISSL 22985

22986 DGEQWTKRRKIINPAFHLEKLK (0) 23051

23456 HMVSAFQLSCSDMVNKWEKQLSLDGSCELDIWPYLQNLTGDVISRTAFGSSYEEGRRIF 23632

23633 QLQKEQALLAVQVTRSVYVPGWR (2) 23701

24301 FFPTKTNRRMRQISSEVDALLKGIIEKREKAMQAGETANDDLLGLLMESNYREMQENDER 24480

24481 KNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQACAREEVLRVFGNKKPD 24660

24661 GDDLNHLKI (0) 24687

25294 VTMIFHEVLRLYPPVPMLTRAVFADSQVGGLYLPDGVQIALPILLIHHDDKIWGEDAKE 25470

25471 FNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSYA 25650

25651 HAPISLLTMQPQHGAHLILHGL* 25719

 

>CYP72A114P CAAP02000680.1  pseudogene missing exons 2 and 3

7 aa diffs to CAAP02003454.1

GSVIVP00000210001 in Genoscope browser not assembled correctly

chrUn_random 4873006 to 4875944 on strand +

1243 MKLSSVAISFAFIVLLIYAWRLLNWVWLRPKKLERCLRQQGLTGNSYRLLHGDFREMSRM 1422

1423 INEANSRPMSFSDDIVQRVLPFHDHSIQKY (1) 1512

2769 FFPTKTNRRMRQISSEVNALLKGIIEKREKAMKAGETANDDLLGLLMESNYREMQENDE 2945

2946 RKNVGMSIKDVIEECKLFYLAGQETTSVLLLWTMVLLSKHSNWQACAREEVLRVFGNKKP 3125

3126 DGDDLNHLKI (0) 3155

3759 VTMIFHEVLRLYPPAPMLTRAVFADSQVGGLYLPDGVQIALPILLIHHDDKIWGDDAK 3932

3933 EFNPGRFSEGVSKAAKSQVSFFPFGYGPRICVGQNFAMMEAKMALAMILQRFSFELSPSY 4112

4113 AHAPISLTTMQPQHGAHLILHGL* 4184

 

>CYP72A115P CAAP02000101.1 N-term exon may be a pseudogene or the rest of the gene

may run off the end of the contig, 1 aa diff to CAN63404.1

chrUn_random 3956335 to 3956628 on strand -

6161 MKHSSIAISFGFLTVLISCLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLHGDFKEMSM 5982

5981 MLKEAYSRPISLSDDIAPRYELLLFIIVLKFADLFKLW 5868

 

>CYP72A116P CAAP02000101.1 N-term exon pseudogene, 93% to CAN63404.1

chrUn_random 3957647 to 3957925 on strand -

7458 MKHSSVAISFGFLTVLISYLWRLLNWVWLRPKRLERCLREQGLAGNSYRLLLHGDFKEMS 7279

7278 MMLKEAYSRPINLSDDIALCVLPFHCRFIKKYG 7180

 

>CYP72A117P CAAP02000101.1, pseudogene, missing exons 2,3, 83% to CAN63404.1

about 90% to CAAP02000149.1d

GSVIVP00000171001 in Genoscope browser not assembled correctly

chrUn_random 4136790 to 4143738 on strand -

193271 MKHNSVAISFGFLTVFISCLWMLLNWVWLRPKRLERCLREQGLAENSYSLLHGDFKEMSM 193092

193091 ILKEAYSRPISLSDDIAPRVLPFRCHFIKKY 192999

 

187549 FLPTKTNRKMKQISKKAYALLRGIINKREKTMKADKTGNSDLLVILMESNFR* 187391

187390 IQEHKNNKKIGMSVKEVIEECKIFYLAGQETTSVFLVWTMVLLSENPNWQARAREEVLQV 187211

187210 FGNKKLEANGLNHLKI (0) 187163

186751 VTMIFHEVLRLYPPVAMLTRAVYKDTQVGDMYFPAGVQVALPTILVHHDHEIWGDD 186584

186583 VKEFNPERLAEGISKAKKNQVSFFPFGWGPQACIGQNFAMMEAKIALAMILQHFLFELSP 186404

186403 SYAHAPFNILTMQLQYGGHLILHGLQC 186323

 

>CYP72A118P gi|147818466|emb|CAN71976.1| 51% to 72A10 probable pseudogene

80% to CAAP02000149.1d

chrUn_random 57735623 to 57737209 on strand –

first line does not match

     MEAXELGVXETEEXREMPLESKAWLGIRIGYYKGDSKEMSRMMKEAYSRPISLSDDIVQRVLPFHCHFIKKY

1865 GKNFFTWVGPSPRVNIMEPELMRDVLLKSNIFQKTPSHPLVKLLVSGLVALEGEQWAKRRKIINPAFHPEKLK 1649

1598 RKIINPVFHPE 1566 (small duplication)

1271 NMLSAFHLSCSDMVNKWKKLSVEGSCELDVWPYLENLTGDVISRTAFGSSYEEGIRIFQLQKEQT

     YLAIKVAMSVYIPGWR 1029

 

$$$$

 

>CYP72A119P gi|147779725|emb|CAN67214.1| AM437669.2

45% to 72A8 C-helix

CAAP02016581.1, 74% to CAN71976.1 pseudogene

876  KLGKNSFTWVSPNPRVNIM

     KPNIFQKTPSHPLVKXLVSGLVAQEGEQWAKRRKIINPVFHPEKLK

1151 RKIINPAFHPE 1183 (small duplication)

1443 NMLPTIHLSCS 1475

1475 LSVEGSRESDVWPYLENLTWDVIARTAFGSSYEEGRKIFQLQKEQTYLAINVATWVNIPGWTYA 1666

 

AM437669.2 4 aa diffs to CAAP02003489.1, 68% to CAAP02000473.1

chrUn_random 69166373 to 69166585 on strand –

from GKNS to FHPEK 1 aa diff

15215 GKNSFTWVGPN 15247

15249 PRVNIMKPELMRDVLLK

15298 RPNIFQKTPSHPLVKXLVSGLVAQEGEQWAKRRKIINPVFHPEKLK 15435

15486 RKIINPAFHPE 15518

      MLPTIHLSCS

      LSVEGSRESDVWPYLE

15860 NLTWDVIARTAFGSSYEEGRKIFQLQKEQTYLAINVATWVNIPGWTY 16000

 

>CYP72A119P CAAP02003489.1 pseudogene CYP72A 4 aa diffs to CAN67214.1

39277 GKNSFTWVGPNPRVNIMKPELMRDVLLKPNIFQKTPSHPLVKLLVSGLVAQEGEQWAKRR 39098

39097 KIINPVFHPEKLK 39059

38658 NLTWDMIARTAFGSSYEEGRKIFQLQKE*TYLAINVATSVNIPGWTY 38518

 

$$$$

 

CYP72D SUBFAMILY (4 genes) [1 pseudogene]

 

>CYP72D3 gi|147795107|emb|CAN60851.1| 43% to 72A15 yellow region too long

CAAP02006515.1a 1-3060 runs off end, missing exon 1

87% to CAAP02007230.1

GSVIVP00009515001 on Genoscope browser

chrUn_random 60546398 to 60549431 on strand + missing exon 1

MAYSFAILTMYTLSRVVYSIWWRPKSLEKQLRRQGIRGTRYKLLFGDAKAMKQSFMEARSKPMALNHSIV

PRVVPFYHEIAQKY

GKVSVSWNFTTPRVLIVEPELMRLILTSKNGHFQRLPGNPLGYLLSRGLSYLQGEK

WAKRRKLLTPAFHFEKLK

SYHRTVRGHHSRNGPDSAESYGAICLFGSWGVTSLGFELKTKVFLVALL

GMVPAFSVSCRKLIERWKNLVAPQGTYELDMMHEFQNLTGDVISQVAFGSNYEEGKKVFELQKEQAVLVMEAF

RTFYIPGFRFVPIGKNKKRYYIDSEIKAILKKIILKRKQTMKPGDLGNDDLLGLLLQCQEQTDSEMTIED

VIEECKLFYFAGQETTANWLTWTILLLSMHPNWQEKAREEVLQLCGKKMPDIEAINRLKIVSMILHEVLR

LYPPVTQQFRHTCERINIAGMCIPAGVNLVLPTLLLHHSPEYWGDDVEEFKPERFSEGVSKASKGDQIAF

YPFGWGHRICLGQGFAMIEAKMALAMILQHFWFELSPTYTHAPHTVITLQPQHGAPIILHEI

 

>CYP72D3 CAAP02015403.1 = CAN60851.1 4 aa diffs, runs off end

1507  MAYSFAILTMYTLSRVV

1456  YSVWWRPKSLEKQLRRQGIRGTRYKLLFGDAKAMKQSFMEARSKPMALNHSIVPRVL  1286

1285  PFYHEIAQKY  1253

1178  GKVSVSWNFTTPRVLIVEPELMRLILTSKNGHFQRLPGNPLGYLLTRGLSYLQGEKWAK  1002

1001  RRKLLTPAFHFEKLK  957

232   GMVPAFSVSCRKLIERWKNLVAPQGTYELDMMPEFQ  125

124   NLTGDVISQVAFGSNYEEGKKVFELQKEQAVLVMEAFRTFY  2

 

>CYP72D4 gi|147795108|emb|CAN60852.1| AM443849.2 43% to 72A14

CAAP02006515.1b 5359-7817 adjacent to CAN60851, 87% to CAN60851

97% to CAAP02007230.1, 100% to CAO16149.1

GSVIVP00009516001 in Genoscope Browser

chrUn_random 60551733 to 60554191 on strand +

MAYSFAILTVYTLLRVVYSIWWRPKSLEKQLRRQGIRGTHYKLLFGDAKAMKQSFVEARSKPMALNHSIV

PRVTPFYHEMAQKYGKVSVSWHFTTPRVLIVEPELMRMILXYKNGHLXRLPGNPLGYHLSRGLLSLEGEK

WAKRRKLLSPAFHLEKLK

GMMPAFSTSCHXLIERWKNLVGPQGTYELDVMPEFQ

NLTGDVISRTAFGSSYEEGRRVFELQKEQIVLVMEDFRNFYIPGFRFVPTRK

NKRRYYMDSEIKAMIKKIILKKKQTLKNGDPGNDDLLGLLLQCQEQTDSEMTIEDVVEECKLFYFVGQET

TANWLTWTILLLSMHPNWQEKARAEVLQICGKKMPDIEAISNLKIVSMILHEVLRLYPPVIMQFRHTRER

INIAGMYIPAGVDLVLPTVLLHHSPEYWGDDVEEFKPERFSEGVSKASKGDQTAFYPFGWGHRICLGQGL

AMIEAKMALAMILQHFWFELSPAYTHAPYRIITLQPQYGAPIILHQI

 

>CYP72D5 CAAP02007230.1 87% to CAN60851.1, 100% to CAO41622.1

GSVIVP00013480001 in Genoscope Browser

chrUn_random 89958971 to 89961429 on strand +

1439 MAYSFAILTVYTLLRVVYSIWWRPKSLEKQLRRQGIRGTHYKLLFGDAKAMKQSFMEARS 1618

1619 KPMALNHSIVPRVIPFYHEMAQKY 1690

1768 GKVSVSWHFTTPRVLIVEPELMRMILKYKNGHLHRLPGNPLGYHLSRGLLSLE 1926

1927 GEKWAKRRKLLSPAFHLEKLK 2019

2699 GMMPAFSTSCHDLIERWKNLVGPQGTYELDVMPEFQNLTGDVISRTAFGSSYEEGRRVFE 2878

2879 LQKEQIVLVMEDFRNFYIPGFR 2944

3035 FVPTRKNKRRYYMDSEIKAMIKKIILKKKQTLKNGDPGNDDLLGLLLQCQEQTDSEMTI 3211

3212 DDVVEECKLFYFVGQETTANWLTWTTLLLSMHPNWQEKARAEVLQICGKKMPDIEAISNL 3391

3392 KI 3397

3469 VSMILHEVLRLYPPVIMQFRHTGERINIAGMCIPAGVDLVLPTALLHHSPEYWGDDVEEF 3648

3649 KPERFSEGVSKASKGDQTAFYPFGWGHRICLGQGLAMIEAKMALAMILQHFWFELSPTYT 3828

3829 HAPHRIITLQPQYGAPIILHQI* 3897

 

>CYP72D6 CAAP02003169.1 48% to 72A15, 75% to CAAP02007230.1, 76% to CAN60851.1

GSVIVP00032271001 in Genoscope Browser

Chr4 1480668 to 1483369 on strand +

19453 MAFSFAILVVYGLLRAVYTIWWRPKSLEKQLRQQGIRGTRYKPMYGDMKALKLSFQEAQS 19632

19633 KPMTLNHSIVPRVIPFFHQMFQNY 19704

20367 GKISMSWIFTRPRVMIVDPELIRMILADKNGQFQKPPLNPLVDLLTLGLSTLE 20525

20526 GEQWAKRRKLITPAFHVEKL 20585

20884 GMVPAFSMSCCNLIERWKNWVGPQGTYELDVMPEFQNVTGDVISRAAFGSSYEEGKKVFE 21063

21064 LQKEQAVLVIEASRAIYLPGFR 21129

21219 FVPTVKNRRRYHIDNEIKAMLRSMIDRKKQAMKNGDSGYNDDLLGLLLQLTEEID 21383

21384 NEMRIEDLIEECKLFYFAGQETTANLLTWTMILLSMNPKWQDKAREEVLQICGKKIPDLE 21563

21564 AIKHLKI 21584

21726 VSMILHEVLRLYPSVVNLLRYTHKRTDVAGLSIPAGVELYLPTILLHHSPEYWGDDVEEF 21905

21906 KPERFSEGVSKASKGDQIAFYPFGWGPRICLGQSFAMIEAKMALAMILQNFWFELSPTYT 22085

22086 HAPYTVITLQPQYGAPIILHQI* 22155

 

>CYP72D6 gi|147773778|emb|CAN65255.1| 53% to 72A8

1 aa diff to CAAP02003169.1 missing first two exons

GMVPAFSMSCCNLIERWKNWVGPQGTYELDVMPEFQNVTGDVISRAAFGSSYEEGKKVFELQKE

QAVLVIEASRAIYLPGFRFVPTVKNRRRYHIDNEIKAMLRSMIDRKKQAMKNGDSGYNDDLLGLLLQLTE

EIDNEMRIEDLIEECKLFYFAGQETTANLLTWTMILLSMNPKWQDKAREEVLQICGKKIPDLEAIKHLKI

VSMILHEVLRLYPSVVNLLRYTHKRTDVAGLSIPAGVELYLPTILLHHSPEYWGDDVEEFKPERFSEGVS

KASKGDQIAFYPFGWGPRICLGQSFAMIEAKMALAMILQNFWFELSPTYTHAPYTVITLQPQYGAPIILH

QI

 

>CYP72D7P CAAP02000210.1 pseudogene, one stop codon, insert in C-helix 59% to CAAP02007230.1

GSVIVP00016465001 in Genoscope Browser (missing C-helix region)

chr11 1013195 to 1015601 on strand -

131775 MAVSMFSCLLISSLVLLLYGVLRVSYSIWWKPKWLEKRLRQQGIRGTPYKLVMGDMKEYI 131596

131595 RLITEAWSKPMNLNHHIVSRVDPFTQNNMQQY

131412 GKVSLFWAGTTPRLIVMDPGMIKEVLSNKQGHFQKPYISPLILTLARGLTALEGEVWAK 131236

131317 RRIINPAFHLEKLK 131192

131015 VMIPAFTTSCSMLIERWKELASLQETCEVDIWPELQNLTRDVISRAALGSSFEEGRQ 130845

130844 IFELQKEHITLTLEAMQTLYIPGFR 130770

130633 FIPTKKNQRRKYLQKRTTSMFRDLIQRKKDAIRTGQAEGDNLLGLLLLSSSQNNLPEN 130460

130459 VMSTKDNAITLEEVIEECKQFYLAGHETTSSWLTWTVTVLAMHPNWQEKAREEVMQICGK 130280

130279 KEPDSEALSHLKI 130241

129794 VSMILYEVLRLYPPVIAVYQHAYKETKIGTISLPAGVDLTLPTLLIHHDPELWGDDAEEF 129615

129614 KPERFAEGVSKASKDQLAFFPFGWGPRTCIGQNFAMIEAKVALAMILQHFSFELSPSYTH 129435

129434 APHTVMTLQPQHGAQLKFYQL* 129369
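
The tally in the CYP72D subfamily heading (4 genes, 1 pseudogene) can be reproduced from the entry names alone. A minimal sketch, assuming that a trailing P on the CYP name marks a pseudogene, as it does for the entries in this file, and that repeated names are the same gene from different accessions; NAME and tally are hypothetical names:

```python
import re

# CYP + family number + subfamily letters + gene number + optional P suffix.
NAME = re.compile(r"^>(CYP\d+[A-Z]+\d+)(P?)(?=\s|$)")


def tally(headers):
    """Return (number of distinct genes, number of distinct pseudogenes)."""
    genes, pseudogenes = set(), set()
    for header in headers:
        match = NAME.match(header)
        if match is None:
            continue  # names outside the plain pattern are skipped
        name, p_suffix = match.groups()
        (pseudogenes if p_suffix else genes).add(name)
    return len(genes), len(pseudogenes)


# Header prefixes of the CYP72D entries listed above.
cyp72d_headers = [
    ">CYP72D3 gi|147795107|emb|CAN60851.1|",
    ">CYP72D3 CAAP02015403.1",
    ">CYP72D4 gi|147795108|emb|CAN60852.1|",
    ">CYP72D5 CAAP02007230.1",
    ">CYP72D6 CAAP02003169.1",
    ">CYP72D6 gi|147773778|emb|CAN65255.1|",
    ">CYP72D7P CAAP02000210.1 pseudogene",
]

gene_count, pseudogene_count = tally(cyp72d_headers)
```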

 

CYP73 family

 

>CYP73A78 gi|147821469|emb|CAN70035.1| = AM455281.2

CAAP02004907.1 7360-5632 (-) strand, 1 aa diff

MAHLLNKPVFFSTLLTIILLSSTRLLASYLSISPPLIASFLPLAPLILYLFYSIAKRSASLPPGPLSIPL

FGNWLQVGNDLNHQLLASMAQKYGPVFLLKLGSKNLAVVSDPELASQVLHTQGVEFGSRPRNVVFDIFTG

NGQDMVFTVYGDHWRKMRRIMTLPFFTNKVVHHYSEMWEEEMELVVDDLRNKESVKSEGLVIRKRLQLML

YNIMYRMMFDSKFESQEDPLFIQATRFNSERSRLAQSFDYNYGDFIPFLRPFLRGYLNKCRELQSRRLAF

FNNYFVEKRREIMAANGEKHKIRCAIDHIIDAQLKGEISEANVLYIVENINVAAIETTLWSMEWAIAELV

NHPHVQCKIRDEITTILQGDAVTESNLHQLPYLQATVKETLRLHAPIPLLVPHMNLEEAKLGGYTIPKES

KVVVNAWWLANNPSWWKNPEEFRPERFLEEESGTDAVAGGKVDFRFLPFGVGRRSCPGIILALPILALVI

AKLVMNFEMRPPIGVEKIDVSEKGGQFSLHIANHSTVALTPIAA

 

>CYP73A81 CAAP02000489.1 94% to 73A78

132479  MAHLLNKPLFFTLVTIILLSSTRLLASYLPISPNIARFLPLAPLILYLFYSISKRSAS  132652

132653  LPPGPLSIPIFGNWLQVGNDLNHQLLASMAQKYGPVFLLKLGSKNLTVVSDPELASQVLH  132832

132833  TQGVEFGSRPRNVVFDIFTGNGQDMVFTVYGDHWRKMRRIMTLPFFTNKVVHQYSEMWEE  133012

133013  EMDLVVDDLRNKESVKTEGLVIRKRLQLMLYNIMYRMMFDAKFESQEDPLFIQATRFNSE  133192

133193  RSRLAQSFDYNYGDFIPLLRPFLRGYLNKCRELQSSRLAFFNNYYVEKRR  133342

133464  EIMAANGEKHKIRCAIDHIIDAQHKGEISEENVLYIVENINVAAIETTLWSMEWAIAEL  133640

133641  VNHPHVQSKIRDEITTVLQGGAVTESNLHQLPYLQATVKETLRLHSPIPLLVPHMNLEEA  133820

133821  KLGGYTIPKESKVVVNAWWLANNPEWWKNPEEFRPERFLQEESATDAVAGGKADFRFLPF  134000

134001  GVGRRSCPGIILALPILALVIGKMVMNFEMRPPIGVEKIDVSEKGGQFSLHIANHSTVAF  134180

134181  TPITA*  134198

 

>CYP73A82 gi|147775009|emb|CAN77208.1| 63% to CYP73A78

MDLILIEKALLAVFCAIILAITISKLLGKKLKLPPGPLPVPVFGNWLQVGDDLNHLNLSDLAKKFGDIFM

LRMGQRNLVVVSSPDLAKDVLHTQGVEFGSRTRNVVFDIFTGKGQDMVFTVYGEHWRKMRRIMTVPFFTN

KVVQQYRVGWEDEAARVVEDVKKNPEASTNGIVLRRRLQLMMYNNMYRIMFDRRFDSEEDPLFVKLKALN

GERSRLAQSFEYNYGDFIPILRPFLRGYLKICKEVKERRLQLFKDHFLEERKKLASTKSTDHNSLKCAVD

HILDAQQKGEINEDNVLYIVENINVAAIETTLWSIEWGIAELVNHPHIQKKLRDELNTVLGPGVQVTEPD

IQKLPYLQAVIKETLRLRMAIPLLVPHMNLNDAKLGGYDIPAESKILVNAWWLANDSSKWKKPEEFRPER

FLEEESKVEANGNDFRYLPFGVGRRSCPGIILALPILGITIGRLVQNFELLPPPGQAKLDTTGKGGQFSL

HILKHSTIVARPIEA

 

>CYP73A82 CAAP02000415.1 = CAN77208.1

86000  MDLILIEKALLAVFCAIILAITISKLLGKKLKLPPGPLPVPVFGNWLQVGDDL  86158

86159  NHLNLSDLAKKFGDIFMLRMGQRNLVVVSSPDLAKDVLHTQGVEFGSRTRNVVFDIFTGK  86338

86339  GQDMVFTVYGEHWRKMRRIMTVPFFTNKVVQQYRVGWEDEAARVVEDVKKNPEASTNGIV  86518

86519  LRRRLQLMMYNNMYRIMFDRRFDSEEDPLFVKLKALNGERSRLAQSFEYNYGDFIPILRP  86698

86699  FLRGYLKICKEVKERRLQLFKDHFLEERK  86785

86992  KLASTKSTDHNSLKCAVDHILDAQQKGEINEDNVLYIVENINVA  87120

88683  AIETTLWSIEWGIAELVNHPHIQKKLRDELNTVLGPGVQVTEPDIQKLPYLQAV  88844

88845  IKETLRLRMAIPLLVPHMNLNDAKLGGYDIPAESKILVNAWWLANDSSKWKKPEEFRPER  89024

89025  FLEEESKVEANGNDFRYLPFGVGRRSCPGIILALPILGITIGRLVQNFELLPPPGQA  89195

89196  KLDTTGKGGQFSLHILKHSTIVARPIEA  89279

 

>CYP74A13 CAAP02000041.1a CYP74A 54% to 74A4 (CAO47688.1)

in contig CU459225.1 chr3 scaffold_8

234521 MSSLSSSSSSSRSELPLLKIPGDYGLPFFGPIRDRFDYFYNQGQDEFFKTRMQKYHSTVFRAN 234709

234710 MPPGPFISSDSKVVVLLDAVSFPVLFDSSKVEKRNVLDGTFMPSTDLTGGYRVLAFLDPS 234889

234890 EPKHDLLKRFSFSLLASRHRDFIPVFRSGLPDLFTTIEDDVSSKGKANFNNIADGMYFNF 235069

235070 VFRLICGKDPSDAKIRSEGPNIFSKWLFLQLSPLMTLGLSMLPNFIEDLLLHTFPLPPFL 235249

235250 VKSDYNKLYKAFYESASSVLDEGERMGINRDEACHNLVFLAGFSTFGGMKVLFPPLIKWV 235429

235430 GLAGEKLHRELADEIRTVVKAEGGVTFAALDKMALTKSVVYEALRIGPPVPFQYGKARE 235606

235607 DMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFENPEDFVAHRFMGEGEKLLKYVYWSN 235783

235784 GRETDNPTAENKQCSGKDLVVLISKLMLVEIFLRYDTFEVESGTMVLGSAVLFKSLTKSS 235963

235964 YT* 235972

 

>CYP74A14 CAAP02000041.1b CYP74A 54% to 74A4 (CAO47689.1)

in contig CU459225.1 chr3 scaffold_8

244102 MSSSSSSSSSSRPELPLRKIPGDYGLPFFGPIRNRFDYFYNQG 244230

244231 QDEFFKTRMQKYHSTVFRANMPPGPFISSDSKVVVLLDTVSFPVLFDSSKVEKRNVFVGT 244410

244411 FMPSTDLTGGYRVLPYLDPSEPKHDLLKRFSFSLLASRHRDFIPVFRSGLPDLFSTIEDD 244590

244591 VSRKGKANFNDIADDMYFNFVFRLICGKDPSDAKIRSEGPNIFLKWLFLQLSPLLTLGLS 244770

244771 ILPNFIDDLLLHTFPFPPFLVKSDYNKLYKAFYESASSVLDEGERMGIKRDEACHNLVFL 244950

244951 AGFNSFGGMKVFFPALIKWVGLAGEKLHRELADEIRTVIKAEGGVTFAALDKMALTKSM 245127

245128 VYEALRIEPPVPFQYGKAREDMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFENPEEFV 245307

245308 AHRFMGEGEKLLKYVYWSNGRETDNPTAENKQCSGKDLVVLISRLMLVEIFLRYDTFEV 245484

245485 ESGTMLLGSSLLFKSLTKTSYT* 245553

 

>CYP74A15 CAAP02000041.1c CYP74A 56% to 74A5 (CAO47690.1 fused)

in contig CU459225.1 chr3 scaffold_8 upstream of CAAP02006275.1a

252843 MSSSSSSLPLNFDNSSSSSKLPLRSIPGDCGSPFFGPIKDRFDYFYNEGRDQFFRTRMQKY 253025

253026 QSTVFRANMPPGPSMASNPNVVVLLDAISFPILFDTSRIEKRNVLDGTYMPSTAFTGGYR 253205

253206 VCAYLDPSEPNHALLKRLFMSSLAARHHNFISVFRSCLTELFITLEDDASRKGKADFNGI 253385

253386 SDNMSFNFVFKLFCDKHPSETKLGSNGPNLVTKWLFLQLAPLITLGLSMLPNVVEDLLLH 253565

253566 TFPLPSLFVKSDYKNLYHAFYASASSILDEAESMGIKRDEACHNLVFLAGFNAYGGMKTL 253745

253746 FPALIKWVGLAGEKLHGQLADEIRSIVKAEGGVTFAALDKMALTKSVVYEALRIEPPVP 253922

253923 FQYGKAKEDMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFDNPEEFVAHRFMGDGEKM 254099

254100 LEYVYWSNGRESDDPTVENKQCPGKDLVVLLSRVMMVEFFLRYDTFNIECGTLLLGSSVT 254279

254280 FKSLTKQPTFDHKSITHVS* 254339

 

>CYP74A16 CAAP02006275.1a CYP74A, 96% to CAAP02000041.1c (CAO47690.1 fused)

in contig CU459225.1 chr3 scaffold_8

5348 MSSSSSSLPLNFVNSSSSSKLPLRSIPGDCGSPFFGPIKDRFDYFYNEGRDQFFRTRMQK 5527

5528 YQSTVFRANMPPGPFMAFNPNVVVLLDAISFPILFDTSRIEKRNVLDGTYMPSTAFTGGY 5707

5708 RVCAYLDPSEPNHALLKRFFTSSLAARHHNFIPVFRSCLTELFTTLEDDVSRKGKADFNG 5887

5888 ISDNMSFNFVFKLFCDKHPSETKLGSNGPNLVTKWLFLQLAPLITLGLSMLPNVVEDLLL 6067

6068 HTFPLPSLFVKSDYKKLYHAFYASASSLLDEAESMGIKRDEACHNLVFLAGFNAYGGMKT 6247

6248 LFPALIKWVGLAGGKLHRQLADEIRSIVKAEGGVTFAALDKMALTKSVVYEALRIEPPVP 6427

6428 FQYGKAKEDMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFDNPEEFVAHRFMGDGEKLL 6607

6608 EYVYWSNGRESDDPTVENKQCPGKDLVVLLSRVMLVEFFLHYDTFDIECGTLLLGSSVTF 6787

6788 KSLTKQPTFDHKSIKHVS* 6844

 

>CYP74A17 CAAP02006275.1b CYP74A, 84% to CAAP02000041.1c (CAO47691.1)

in contig CU459225.1 chr3 scaffold_8

11732 MSSSSDKNDLNSSSSLSKLPLRKIPGDYGLPFFGAIKDRLDYFYKQGREEFFNARMHK 11905

11906 YQSTVFRANMPPGPFMASNPNVIVLLDSISFPILFDTSKVEKRNVLDGTYMPSTAFTGGY 12085

12086 RVCAYLDPSETNHALLKRLFMSALAARHHNFIPLFRSSLSELFTSLEDDISSKGEADFND 12265

12266 ISDNMSFNFVFRLFCDKYPSETALGSQGPSIVTKWLFFQLAPLITLGLSLLPNFVEDLLL 12445

12446 HTFPLPSIFVKSDYKKLYRAFYASASSILDEAESMGIKRDEACHNLVFLAGFNAYGGMKA 12625

12626 LFPSLIKWVGSAGEKLHRELADEIRTVVKAEGGVSFAALEKMSLTKSVVYEALRIDPPVP 12805

12806 FQYGKAKEDMVIHSHDAAFEIKKGEMIFGYQPFATKDPKVFDNPEEFMGNRFMGEGERLL 12985

12986 KYVYWSNGRESGNPTVENKQCAGKDLVLLLSRVMLVEFFLRYDTFDIESGTLLLGSSVTF 13165

13166 KSITKATDS* 13195

 

>CYP74A1 CAAP02000063.1 (CAO61246.1) in contig CU459218.1

chr18 scaffold_1

61% to 74A1 Arabidopsis, 70% to 74A1 tomato

The next closest match to the tomato 74A1 is CAAP02000041.1b (60%),

so this sequence is considered the ortholog of CYP74A1. Note that it is distant from the

CYP74 gene cluster on chr 3

149291 MASPSLTFPSLQLQFPTHTKSSKPSNHKLIVRPIFASVSEKPSVPVSQSQVTPPGPIRKI 149470

149471 PGDYGLPFIGPIKDRLDYFYNQGREEFFRSRAQKHQSTVFRSNMPPGPFISSNSKVIVLL 149650

149651 DGKSFPVLFDVSKVEKKDVFTGTFMPSTEFTGGFRVLSYLDPSEPDHTKLKRLLFFLLQS 149830

149831 SRDRVIPEFHSCFSELSETLESELAAKGKASFADPNDQASFNFLARALYGTKPADTKLGT 150010

150011 DGPGLITTWVVFQLSPILTLGLPKFIEEPLIHTFPLPAFLAKSSYQKLYDFFYDASTHVL 150190

150191 DEGEKMGISREEACHNLLFATCFNSFGGMKIIFPTILKWVGRGGVKLHTQLAQEIRSVVK 150370

150371 SNGGKVTMASMEQMPLMKSTVYEAFRIEPPVALQYGKAKQDLVIESHDSVFEVKEGEMLF 150550

150551 GYQPFATKDPKIFERSEEFVPDRFVGEGEKLLKHVLWSNGPETENPTLGNKQCAGKDFV 150727

150728 VLAARLFVVELFLRYDSFDIEVGTSLLGSAINLTSLKRASF* 150853

 

>CYP74B13 AM441513 PLN 18-MAY-2007 Vitis vinifera (Pinot noir grape)

11751 MLSSTVMSVSPGVPTPSSLTPPSPPSSSPVRAIPGSYGWPVLGPIADRLDYFW 11593

11592 FQGPETFFRKRIDKYKSTVFRTNVPPSFPFFVDVNPNVIAVLDCKSFSFLFDMDVVEKKN 11413

11412 VLVGDFMPSVKYTGDIRVCAYLDTAETQHAR 11320

10198 VKGFAMDILKRSSSIWASEVVASL 10127

10125 DTMWDTIDAGVAKSNSASYIKPLQRFIFHFLTKCLVGADPAVSPEIAESGYVMLDKWVFL 9946

 9945 QLLPTISVNFLQPLEEIFLHSFAYPFFLVKGDYRKLYEFVEQHGQAVLQRGETEFNLSKE 9766

 9765 ETIHNLLFVLGFNAFGGFTIFFPSLLSALSGKPELQAKLREEVRSKIKPGTNLTFESVK 9589

 9588 DLELVHSVVYETLRLNPPVPLQYARARKDFQLSSHDSVFEIKKGDLLCGFQKVAMTDPKI 9409

 9408 FDDPETFVPDRFTKEKGRELLNYLFWSNGPQTGSPSDRNKQCAAKDYVTMTAVLFVTHMF 9229

 9228 QRYDSVTASGSSITAVEKAN* 9166

 

>CYP74B13 CAAP02000110.1 (CAO24035.1) in contig CU459253.1

chr12 scaffold_36

no heme Cys; 56% to 74B2, which also lacks the Cys; 3 aa diffs to AM441513

146202 MLSSTVMSVSPGVPTPSSLTPPSPPSSSPVRAIPGSYGWPVLGPIADRLDYFWFQGPETFFRKR 146011

146010 IDKYKSTVFRTNVPPSFPFFVGVNPNVIAVLDCKSFSFLFDMDVVEKKNVLVGDFMPSVK 145831

145830 YTGDIRVCAYLDTAETQHAR (0) 145771

144656 VKSFAMDILKRSSSIWASEVVASLDTMWDTIDAGVAKSNSASYIKPLQRFIFHFLTKCL 144480

144479 VGADPAVSPEIAESGYVMLDKWVFLQLLPTISVNFLQPLEEIFLHSFAYPFFLVKGDYR 144303

144302 KLYDFVEQHGQAVLQRGETEFNLSKEETIHNLLFVLGFNAFGGFTIFFPSLLSALSGKPE 144123

144122 LQAKLREEVRSKIKPGTNLTFESVKDLELVHSVVYETLRLNPPVPLQYARARKDFQLSS 143946

143945 HDSVFEIKKGDLLCGFQKVAMTDPKIFDDPETFVPDRFTKEKGRELLNYLFWSNGPQTGS 143766

143765 PSDRNKQCAAKDYVTMTAVLFVTHMFQRYDSVTASGSSITAVEKAN* 143625

 

AM and CAN accessions are from Velasco et al. (heterozygous Pinot Noir grapevine variety)

CAO and CAAP accessions are from Jaillon et al. (PN40024, highly homozygous; French-Italian Public Consortium)
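
The prefix rule above can be applied mechanically when sorting accessions by genome project. A minimal sketch; SOURCES and source_of are hypothetical names, and accessions with other prefixes (for example the mRNA and cultivar records in the CYP75 section) return None:

```python
# Prefix -> source project, following the note above.
# The longer prefix CAAP is listed first so the intent stays explicit.
SOURCES = (
    ("CAAP", "Jaillon et al. (PN40024)"),
    ("CAO", "Jaillon et al. (PN40024)"),
    ("CAN", "Velasco et al. (Pinot Noir)"),
    ("AM", "Velasco et al. (Pinot Noir)"),
)


def source_of(accession):
    """Return the genome project for an accession, or None if unrecognised."""
    for prefix, source in SOURCES:
        if accession.startswith(prefix):
            return source
    return None


examples = ["CAAP02000110.1", "CAO24035.1", "CAN60851.1", "AM441513", "ABC72066.1"]
assigned = {acc: source_of(acc) for acc in examples}
```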

Note: CYP75s and CYP79s are interleaved

 

CYP75 family

 

CYP75A subfamily (9 genes) [5 pseudogenes] 2 alleles, 15 orthologs from other strains

31 sequences

 

>CYP75A28 gi|83715792|emb|CAI54277.1| AJ880356

Shiraz mRNA

flavonoid-3,5'-hydroxylase 78% to CYP75A8 Catharanthus roseus

MAIDTSLLLEFAAATLLFFITRFFIRSLLPKPSRKLPPGPKGWPLLGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPEAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELCQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM

VVELMTSAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDWLLTKMMEEHTASAHERKGNPDFLDVIMAN

QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLP

KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFL

SGRNTKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDEVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV*

 

>CYP75A28 gi|85679310|gb|ABC72066.1| flavonoid 3',5'-hydroxylase 99%

only 3 aa diffs, DQ351701 cv. Sangiovese Berries genomic

RFFIRSLLLKPSRKLPPGPKGWPLLGALPLLGNMPHVALAKMAKRYGPVMFLKMGTNSMVVASTPEAARA

FLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHMLGGKALEDWSQVRAVELGHMLR

AMLELCQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDMVVELMTSAGYFNIGDFIPSIA

WLDIQGIQRGMKHLHRKFDWLLTKMMEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNL

FTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPL

NLPRVSTQACEVNGYYIPKNTGLSVNIWAIGRDPDVWESPEEFRPERFLSGRNTKIDPRGNDFELIPFGA

GRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV

 

>CYP75A28 gi|147862221|emb|CAN82592.1| AM436340.2c Pinot Noir genomic

99% only 4 aa diffs

MAIDTSLLLEFAAATLLFFITRFFIRSLLPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPEAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELCQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM

VVELMTSAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDWLLTKMMEEHTASAHERKGNPDFLDVIMAN

QENSTGEKLTITNIKALLL

NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDXVIGRSRRLVESDLP

KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWESPEEFRPERFL

SGRNEKIDPRGNDFELIPFGAGRRILRWH

 

>CYP75A28-de2b gi|157028306|emb|CAAP02012536.1| PN40024,

contig_12536

Length=4320

 

Middle exon on (-) strand near end of contig = CAN82592.1| AM436340.2c

This is a pseudogene fragment that is missing the end of the exon

4021  NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAIC  3842

3841  KESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIW  3722

 

There is another exon 1 on (-) strand at 1-382

= CAN82592.1| AM436340.2c CYP75A28

 

>CYP75A28 gi|83944624|gb|ABC48916.1| DQ298201.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 98% to CYP75A28

4 aa diffs

this seq is called VvF3’5’H-1a in Castellarin et al. BMC Genomics 2006

IGQVILSRRVFETKGSESNEFKDMVVELMTSAGYFNIGDFIPSIAWLDIQGIQRGMEHLHRKFDWLLTKM

MEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI

LKRAHEEMDKVIGRSRWLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSV

NIWAIGRDPDVWESPEEFRPERFLSGRNE

 

>CYP75A28 gi|83944626|gb|ABC48917.1| DQ298202.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 98% to CYP75A28

4 aa diffs

this seq is called VvF3’5’H-1b in Castellarin et al. BMC Genomics 2006

IGQVILSRRVFETKGSESNEFKDMVVELMTSAGYFNIGDFIPSIAWLDIQEIQRGMEHLHRKFDWLLTKM

MEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI

LKRAHEEMDKVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSV

NIWAIGRDPDVWESPEEFRPERFLSGRNE

 

>CYP75A28 gi|83944628|gb|ABC48918.1| DQ298203.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase (1 aa diff)

this seq is called VvF3’5’H-1c in Castellarin et al. BMC Genomics 2006

IGQVILSRRVFETRGSESNEFKDMVVELMTSAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDWLLTKM

MEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI

LKRAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSV

NIWAIGRDPDVWESPEEFRPERFLSGRNT

 

>CYP75A32P CAAP02004900.1b pseudogene

96% to 75A28 CAI54277.1 AJ880356.1 Shiraz mRNA

2 frameshifts

CAO16882.1 + CAO16883.1 CU459242.1

19834 MAIDTSLLLEFAAATLLFFITRFFIRSILPKPSRKLPPGPKGWPLLGALPLVGNMPHVALAKMAK 19640

19639 RYGPVMFLKMGTNSMVVASTPEAARAFLKTLDINFSSRPPNAGATLLAYHAQD 19481

19480 MVFADYGARWKLLRKLSNLHMLGG 19409

19409 KALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGS 19230

19229 ESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRMIEEHT 19053

19052 ASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLL (0) 18939

18547 NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAIC 18368

18367 KESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRL 18263

18261 SVNIWAIGRDPDVWESPEEFRPERFLSGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVL 18079

18078 VEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV* 17917

 

>CYP75A33 CAAP02004900.1a 5 aa diffs to CAN60359.1, Pinot Noir genomic

N-term corrected

100% to CAO16880.1 CU459242.1

10277 MAIDTSLLPELAAATLLFFITRFFIRSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVAL 10098

10097 AKMAKRYGPVMFLKMGTNSMVVASTP

10019 EAARAFLKTLDINFSNRPPNAGATHLAYDAQDMVFADYGARWKLLRKLSNLHMLGGKALE 9840

 9839 DWSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVFETKGSESNE 9660

 9659 FKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHE 9480

 9479 RKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLL 9381

 8991 NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLQKLPYLQAIC 8812

 8811 KESFRKHPSTPLNLPRVSNEACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFSPERF 8632

 8631 LSGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINM 8452

 8451 DEAFGLALQKAVSLSAMVTPRLHQSAYAV 8365

 

>CYP75A34 gi|147861244|emb|CAN81079.1| AM457118.1 Pinot Noir genomic

91% to CYP75A28

4 aa diffs to CAO16875.1 CU459242.1

MAIDTSLLPELAAATLLFFITRFFICSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTSSMVVASTPEAARAFLKTLDINFSNRPPNAGATHLAYGAQDMVFADYGPRWKLLRKLSNLHML

GGKALEDSSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSIANMIGQVILSRRVFETKGSESNEFKDM

VVELMTCAGYFNIGDFIPSIAWMDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVMGH

QGNSTGEKLTLTNIKALLQNLFAAGTDTSASIIEWSLAEMLKNPSILKRAQEEMDHVIGRNRRLVESDLP

KLPYLQAICKESLRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWENPEEFRPERFL

SGRNAKIDPRGNDFELIPFGAGRRICAGARMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A35 gi|147777347|emb|CAN62887.1| AM437324.2 Pinot Noir genomic

72% to CYP75A28

CAAP02001548.1 61270-63341 (+) strand, N-term corrected

100% to CAO16871.1 CU459242.1

MAIDTSFFIVSAAATLLFLIVHSFIHFLVS

RRSRKLPPGPKGWPLLGVLPLLKEMPHVALAKMAKKYGPVMLLKMGTSNMVVASNPEAAQAFLKTHE

ANFLNREPGAATSHLVYGCQDMVFTEYGQRWKLLRRLSTLHLLGGKAVEGSSEVRAAELGRVLQTMLEFS

QRGQPVVVPELLTIVMVNIISQTVLSRRLFQSKESKTNSFKEMIVESMVWAGQFNIGDFIPFIAWMDIQG

ILRQMKRVHKKFDKFLTELIEEHQASADERKGKPDFLDIIMANQEDGPPEDRITLTNIKAVLVNLFVAGT

DTSSSTIEWALAEMLKKPSIFQRAHEEMDQVIGRSRRLEESDLPKLPYLRAICKESFRLHPSTPLNLPRV

ASEACEVNGYYIPKNTRVQVNIWAIGRDPDVWENPEDFAPERFLSEKHANIDPRGNDFELIPFGSGRRIC

SGNKMAVIAIEYILATLVHSFDWKLPDGVELNMDEGFGLTLQKAVPLLAMVTPRLELSAYAA

 

>CYP75A36 CAAP02004490.1 21113-19182 PN40024

MAIDTSLLLKLAAAILLFFITRFFIRSLLPKPSRKLPP

GPRGWPLLGALPLLGNMPHVALAKMAKRYGPVMFLKMGTNCMVVASTPEAAQAFLKTLDI

NVSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRKLSNLHMLGGKALEDWSQVRTVELGH

MLRAMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVFETKGSESNEFKDMVVELMTTA

GYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVM

GHQGNSTGEKLTLTNIKALLL (0)

NLFTAGTDTSSSVIEWSLAEMLKN

PSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSAQACEV

NGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFLSGRNAKIDPRGNDFELIPFGAGR

RICAGTRMGIVLVEYILGSLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQ

SAYAV*

 

>CYP75A36 gi|86156244|gb|ABC86840.1| DQ356236.1 Sangiovese genomic

flavonoid 3',5'-hydroxylase 94% to 75A28

CAAP02004490.1 21113-19182, 3 aa diffs, N-term corrected

CAO23870.1 translated from CAAP02004490.1

21113 MAIDTSLLLKLAAAILLFFIT

RFFIRSLLPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPVMFLKMGTNCMVVASTPEAARA

FLKTLDINVSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRKLSNLHMLGGKALEDWSQVRTVELGHMLR

AMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIA

WLDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVMGYQGNSTGEKLTLTNIKALLLNL

FTAGTDTSSSVIEWSLAEMLKNPSILKRVHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPL

NLPRVSAQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFLSGRNAKIDPRGNDFELIPFGA

GRRICAGTRMGIVLVEYILGSLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV*

 

>CYP75A36 gi|83944630|gb|ABC48919.1| DQ298204.1 Cabernet Sauvignon genomic

 flavonoid 3'-hydroxylase 94% to CYP75A28

2 aa diffs to CYP75A36

this seq is called VvF3’5’H-2a in Castellarin et al. BMC Genomics 2006

IGQVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRM

IEEHTASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI

LKRAHEEMDKVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSV

NIWAIGRDPDVWESPEEFRPERFLSGRNE

 

>CYP75A36 gi|83944632|gb|ABC48920.1| DQ298205.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 94% to CYP75A28

this seq is called VvF3’5’H-2b in Castellarin et al. BMC Genomics 2006

2 aa diffs to CYP75A36

IGQVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRM

TEEHAASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSI

LKRAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSAQACEVNGYYIPKNTRLSV

NIWAIGRDPDVWESPEEFRPERFLSGRNE
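
Notes such as "2 aa diffs" count mismatched positions between two sequences of equal length. A minimal sketch of that count follows, assuming ungapped sequences that are already in register. The function name `aa_diffs` is hypothetical, and the two strings are only the first 20 residues of the second sequence line of the VvF3'5'H-2a and VvF3'5'H-2b records above, not the full entries.

```python
def aa_diffs(a, b):
    """List (position, residue_a, residue_b) for mismatches; positions are 1-based."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length (gaps are not handled)")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]

# First 20 residues of the second sequence line of each record (illustrative subset).
seq_2a = "IEEHTASAHERKGNPDFLDV"
seq_2b = "TEEHAASAHERKGNPDFLDV"

diffs = aa_diffs(seq_2a, seq_2b)
for pos, res_a, res_b in diffs:
    print(f"{res_a}{pos}{res_b}")
```

For sequences of unequal length, or with insertions and deletions, a real alignment step would be needed before counting.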

 

>CYP75A37P gi|147794774|emb|CAN60359.1| AM429113.2 Pinot Noir genomic

93% to CYP75A28, wrong N-term

CAO23867 pseudogene

MVQFKSCGTLGQRMRSIHLHPTILHGWGTPNLSVLEVWMVLELFFPNFQLLLVSRSPAMQGLPGEAPGRP

LRWLNGRKKYVYNNNNWRVDVVC

EAARAFLKTLDINFSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRK

LSNLHMLGVKALEDWSRVRTVELGHMLRAMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVFETKGSE

SNEFKDMVVELMTSAGYFNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDF

LDVVMGHQGNSTGEKLTLTNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRR

LVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSNEACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEE

FSPERFLSGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAF

GLALQKAVSLSAMVTPRLHQSAYAV

 

>CYP75A37P CAAP02002140.1c  translation = CAO23867 PN40024

CAN60359.1 pseudogene, Pinot Noir genomic

missing the N-term, which is not found in the next 15 kb

These genes are usually intact in the first exon, not split.

48449 AARAFLKTLDINFSNRPPNAGATHLAYDAQDMVFADYGPRWKLLRKLSNL 48300

48299 HMLGVKALEDWSRVRTVELGHMLRAMLELSQREEPVVVPEMLSFSVANMIGQVILSRRVF 48120

48119 ETKGSESNEFKDMVVELMTSAGYLNIGDFIPSIAWLDIQGIERGMKHLHKKFDKLLTRM 47943

47942 IEEHTASAHERKGNPDFLDVVMGHQGNSTGEKLTLTNIKALLL (0) 47814

47492 NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLPKLPYLQAIC 47313

47312 KESFRKHPSTPLNLPRVSNEACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFSPERF 47133

47132 LSGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGV 46965

46964 EINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV 46866

 

>CYP75A38v1 gi|157331175|emb|CAO63558.1| same seq as CAAP02005443.1 PN40024

next gene is CYP79A29P CAO63559

3 aa diffs to CAN82588.1| AM436340.2a Pinot Noir genomic

7 aa diffs to BAE47007.1,

5 aa diffs to DQ786631.1, 12 aa diffs to CAI54277.1,

9 aa diffs to ABC86841, 11 aa diffs to ABC72066.1,

2 aa diffs to ABC48916 and ABC48917 (partials)

MAIDTSLLLELAAATLLFFITRFFIRSLLLKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPGAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM

VVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMEHLHRKFDWLLTKMMEEHTASAHERKGNPDFLDVIMAN

QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDKVIGRSRRLVESDLP

KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWESPEEFRPERFL

SGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A38v2 gi|157332081|emb|CAO68617.1| CU460585.1 PN40024

runs off the end

2 aa diffs to DQ356237.1 Sangiovese genomic

3 aa diffs to DQ786631.1 Cabernet Sauvignon mRNA

STPGAARAFLKTLDINFSNRPPNAGASLLAYHAQD

MVFADYGARWKLLRKLSNLHMLGGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIG

QVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIQHGMKHLHRKFDRLLTKMME

EHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILK

RAHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNI

WAIGRDPDVWESPEEFRPERFLSGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFD

WKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV

 

>CYP75A38v3 gi|147862217|emb|CAN82588.1| AM436340.2a Pinot Noir genomic

97% to 75A28, 11 aa diffs

3 aa diffs to CAAP02005443.1a 8380-10309

MAIDTSLLLELAAATLLFFITRFFIRSLLLKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPGAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM

VVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMAN

QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDKVIGRSRRLVESDLP

KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFL

SGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A38v4 gi|111144659|gb|ABH06585.1| translated from DQ786631.1 Cabernet Sauvignon mRNA

flavonoid 3'5' hydroxylase

2 aa diffs to CAN82588.1 Pinot Noir genomic

4 aa diffs to CAAP02007407.1 PN40024

11 aa diffs, 97% to 75A28 CAI54277 Shiraz mRNA

MAIDTSLLLELAAATLLFFITRFFIRSLLLKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPGAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFRDM

VVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMAN

QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLP

KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFL

SGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A38v5 gi|78183426|dbj|BAE47007.1| AB213606.1 Cabernet Sauvignon genomic

flavonoid 3',5'-hydroxylase 98%

4 aa diffs to CYP75A38v4 and CYP75A38v3

7 aa diffs to 75A28, EST = EE066764

MAIDTSLLLEFAAATLLFFITRFFIRSLLLKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPGAARAFLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM

VVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMAN

QENSTGEKLTITNIKALLLNLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDLP

KLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFRPERFL

SGRNTKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDEVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A38v6 gi|86156246|gb|ABC86841.1| DQ356237.1 Sangiovese genomic

flavonoid 3',5'-hydroxylase

2 aa diffs to CAAP02008469.1 translation = CAO68617 PN40024

runs off end

8 aa diffs to 75A28

note: 79A29P is close

CAO68617 may be allelic with CAO63558, which also has CYP79A29P close by

If not, it is a nearly identical duplication

RFFIRSLLPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPVMFLKMGTNSMVVASTPEAARA

FLKTLDINFSNRPPNAGATLLAYHAQDMVFADYGARWKLLRKLSNLHMLGGKALEDWSQVRAVELGHMLR

AMLELSQRAEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDMVVELMTTAGYFNIGDFIPSIA

WLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLLNL

FTAGTDTSSSVIEWSLAEMLKNPSILKRVHEEMDQVIGRSRRLVESDLPKLPYLQAICKESFRKHPSTPL

NLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWESPEEFRPERFLSGRNEKIDPRGNDFELIPFGA

GRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV

 

>CYP75A39Pv1 gi|157332520|emb|CAO70765.1| CU460864.1 same seq as CAAP02008469.1

100% to CYP75A38v4

P450 gene does not continue upstream

ESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTASAHERKGNPD

FLDVIMANQENSTGEKLTITNIKALLL

NLFTAGTDTSSSVIEWSLAEMLKNPSILKRAHEEMDQVIGRSRRLVESDL

PKLPYLQAICKESFRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPNVWESPEEFRPERF

LSGRNEKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQK

AVSLSAMVTPRLHQSAYAV

 

>CYP75A39Pv2 gi|157023020|emb|CAAP02017822.1| PN40024, contig_17822 Length=1454

Identical to CAN82588.1 and AB213606.1 and DQ356237.1

Does not extend, pseudogene fragment

609  ESNEFKDMVVELMTTAGYFNIGDFIPSIAWLDIQGIQRGMKHLHRKFDRLLTKMMEEHTA  430

429  SAHERKGNPDFLDVIMANQENSTGEKLTITNIKALLL  319

 

>CYP75A40P gi|147819898|emb|CAN60738.1| AM440112.2 Pinot Noir genomic

9 aa diffs to CYP75A38v3, end of the gene is missing

probably pseudogene

94% to CYP75A28

MAIDTSLLLELAAATLLFFITRFFIRSLLPKSSRKLPPGPKGWPLVGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNGMVVASTPGAARAFLKTLDINFSNRPLNAGATLLAYRSQDMVFADYGARWKLLRKLSNLHML

GGKALEDWSQVRAVELGHMLRAMLELSQRTEPVVVPEMLTFSMANMIGQVILSRRVFETKGSESNEFKDM

VVELITTAGYFNIGDFIPSIAWLDIQGIQHGMKHLHRKFDRLLTKMMEEHTASAHERKGNPDFLDVIMAN

QEKSTGEKLTITNIKALLLVGTIWHRNLWYNIHVIQHAILYDHCSEYGILQIGIRAFVG

 

>CYP75A41 gi|157333816|emb|CAO18026.1| PN40024

7 aa diffs to CYP75A34

MAIDTSLLPELAAATLLFFITRFFIRSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPEAARAFLKTLDINFSNRPPNAGATHLAYGAQDMVFADYGPRWKLLRKLSNLHML

GGKALEDSSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSIANIIGQVILSRRVFETKGSESNEFKDM

VVELMTCAGYFNIGDFIPSIAWMDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVMGH

QENTTGEKLTLSNIKALLQNLFAAGTDTSASIIEWSLAEMLKNPSILKRAQEEMDHVIGRNRRLVESDLP

KLPYLQAICKESLRKHPSTPLNLPRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWENPEEFRPERFL

SGRNAKIDPRGNDFELIPFGAGRRICAGARMGIVLVEYILGTLVHSFDWKIPDGVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A41 gi|147862169|emb|CAN82604.1| AM436584.2 Pinot Noir genomic

86% to CYP75A28

CAAP02012125.1 2184-4508, 1 aa diff

CAAP02007036.1 12858-10533 (-) strand, 1 aa diff, no seq gap

MAIDTSLLPELAAATLLFFITRFFIRSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTNSMVVASTPEAARAFLKTLDINFSNRPPNAGATHLAYGAQDMVFADYGPRWKLLRKLSNLHML

GGKALEDSSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSIANIIGQVILSRRVFETKGSESNEFKDM

VVELMTCAGYFNIGDFIPSIAWMDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDV

(small seq gap)

NLFAAGTDTSASIIEWSLAEMLKNPSILKRAQEEMDHVIGRNRRLVESDLPKLPYLQAICKESLRKHPSTPLNL

PRVSTQACEVNGYYIPENTRLSVNIWAIGRDPDVWENPEEFRPERFLSGRNAKIDPRGNDFELIPFGAGR

RICAGARMGIVLVEYXLGTLVHSFDWKMPDGVEINMDEAFGLALQKAVSLSAMVTPRLHQSAYAV

 

>CYP75A42 gi|147802021|emb|CAN61852.1| Pinot Noir genomic

91% to CYP75A28

7 aa diffs to 75A41

MAIDTSLLPELAAATLLFFITRFFIRSLFPKPSRKLPPGPRGWPLLGALPLLGNMPHVALAKMAKRYGPV

MFLKMGTXSMVVASTPEAARAFLKTLDINFSNRPPNAGATHLAYGAQDMVFADYGPRWKLLRKLSNLHML

GGKALEDSSQVRTVELGHMLRAMLELSQREEPVVVPEMLSFSIANIIGQVILSRRVFETKGSESNEFKDM

VVELMTCAGYFNIGDFIPSIAWMDIQGIERGMKHLHKKFDKLLTRMIEEHTASAHERKGNPDFLDVVMGH

QENTTGEKLTLSNIKALLQNLFAAGTDTSASIIEWSLAEMLKNPSILKRAXEEMDXVIGRXRRLVESDLP

KLPYLQAICKESXRKHPSTPLNLPRVSNEACEVNGYYIPKNTRLSVNIWAIGRDPDVWESPEEFSPERFL

SGRNAKIDPRGNDFELIPFGAGRRICAGTRMGIVLVEYILGTLVHSFDWKMPDGVEINMDEAFGLALQKA

VSLSAMVTPRLHQSAYAV

 

>CYP75A43 gi|147852187|emb|CAN80142.1| Pinot Noir genomic

70% to CYP75A28, 73% to CYP75A41

MSRSSRRLPPGPRGWPVVGCLPLLGAMPHVALAQLAQKYGAIMYLKLGTCDVVVASKPDSARAFLKTLDL

NFSNRPPNAGATHIAYEAQDFVFADIGPRWNLLRKLTSLHMLGAKSFKDWGAIRGAEIGHMIQAMCELSR

RGEPVVVPEMVSCALANIIGQKSLSRRVFETQGSESNDFKEMVVELMRLAGLFNVGDFIPSIAWMDLQGX

EGKMKLLHNKFDALLTRMIEEHSATAHERLGNPDILDVVMAEQEYSCGVKLSMVNIKALLLNLFIAGTDT

SSGTIEWALAEILKNPTMLKRAHAEMDRVIGKNRLLQESDVPKLPXLEAICKETFRKHPSVPLNIPRVSA

NACEVDGYYIPEDTRLFVNVWAIGRDPAVWENPLEFKPERFLSEKNARISPWGNDFELLPFGAGRRMCAG

IRMGIEVVTYALGTLVHSFDWKLPKGDELNMDEAFGLVLQKAVPLSAMVTPRLHPSAYKAQV

 

>CYP75A43 CAAP02001252.1 PN40024 genomic

2 aa diffs to CAN80142.1

35518  MVQIDEL

35539  LFTALVFLVTNFFVKRITSMSRSSRRLPPGPRGWPVVGCLPLLGAMPHVALAQLAQKYGA  35718

35719  IMYLKLGTCDVVVASKPDSARAFLKTLDLNFSNRPPNAGATHIAYEAQDFVFADIGPRWN  35898

35899  LLRKLTSLHMLGAKSFKDWGAIRGAEIGHMIQAMCELSRRGEPVVVPEMVSCALANIIGQ  36078

36079  KSLSRRVFETQGSESNDFKEMVVELMRLAGLFNVGDFIPSIAWMDLQGTEGKMKLLHN  36252

36253  KFDALLTRMIEEHSATAHERLGNPDILDVVMAEQEYSGGVKLSMVNIKALLL (0)  36408

       NLFIAGTDTSSGTIEWALAEILKNPTMLKRAH  36603

36604  AEMDRVIGKNRLLQESDVPKLPYLEAICKETFRKHPSVPLNIPRVSANACEVDGYYIPED  36783

36784  TRLFVNVWAIGRDPEVWENPLEFKPERFLSEKNARISPWGNDFELLPFGAGRRMCAGIRM  36963

36964  GIEVVTYALGTLVHSFDWKLPKGDELNMDEAFGLVLQKAVPLSAMVTPRLHPSAYKAQV* 37143

 

CYP75B subfamily (2 genes) [3 pseudogenes] 4 orthologs 2 alleles

21 sequences

 

>CYP75B32v1 gi|83715794|emb|CAI54278.1| AJ880357.1 Shiraz mRNA

flavonoid-3'-hydroxylase

same as AJ880357

CAAP02002732.1 7596-5384 (-) strand, 1 aa diff

MNPLALIFCTALFCVLLYHFLTRRSVRLPPGLKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF

RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRSRLVTDLDLPQLT

YVQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPLPRLSPQVFGK

 

>CYP75B32v1 gi|157342333|emb|CAO64446.1| CU459229.1 PN40024

complement(join(4340871..4341512,4341761..4342213,

                     4342649..4343083))

2 aa diffs to 75B32v1 (CAI54278.1), 6 aa diffs to BAE47006

6 aa diffs to 75B32v3 BAE47005.1

6 aa diffs to DQ786632.2, 5 aa diffs to 75B32v2 CAN68303.1

MNPLALIFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF

RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRSRLVTDLDLPQLT

YVQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPRPRLSPQVFGK
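
The `complement(join(...))` location quoted in this record can be checked for internal consistency by summing the exon lengths and confirming the total is a multiple of three. This is a minimal sketch under the assumption that every interval is written as `start..end`. The function name `exon_lengths` is hypothetical.

```python
import re

def exon_lengths(location):
    """Return the length of each start..end interval in a location string."""
    spans = re.findall(r"(\d+)\.\.(\d+)", location)
    return [int(end) - int(start) + 1 for start, end in spans]

# Location string as given in the CAO64446.1 record above.
location = ("complement(join(4340871..4341512,4341761..4342213,"
            "4342649..4343083))")

lengths = exon_lengths(location)
cds_nt = sum(lengths)          # total coding nucleotides
codons = cds_nt // 3           # includes the stop codon if the CDS is complete
print(lengths, cds_nt, codons, cds_nt % 3 == 0)
```

Here the three exons sum to 1530 nt, or 510 codons, which would correspond to a 509-residue protein plus a stop codon.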

 

>CYP75B32v2 gi|147833535|emb|CAN68303.1| Pinot Noir genomic

99%, 5 aa diffs to CYP75B32v1, 2 aa diffs to CAO64446

MNPLALIFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF

RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRXRLVTDLDLPQLT

YXQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEKPLEFRPSRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPLPRLSPQVFGK

 

>CYP75B32v3 gi|78183422|dbj|BAE47005.1| AB213604.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 98%

6 aa diffs to CYP75B32v1

MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF

RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPQLT

YLQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEKPLEFRPSRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPLPRLSPQVFGK

 

>CYP75B32v4 gi|111144661|gb|ABH06586.1| translated from DQ786632.2 Cabernet Sauvignon mRNA

flavonoid 3' hydroxylase 99%

8 aa diffs

MNPLALFFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF

RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPQLT

YLQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEKPLEFRPSRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

APPLMVHPRPRLSPQVFGK

 

>CYP75B32v4 gi|78183424|dbj|BAE47006.1| AB213605.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 98% to CYP75B32

100% to DQ786632.2, 100% to AB213605.1, 4 aa diffs to AB213604

3 aa diffs to CAN68303, 8 aa diffs to 75B32v1

MNPLALFFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGKALDDF

RHIRQEEVAVLTRALARAGQTPVNLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVAAKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISVRDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPQLT

YLQAIIKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEKPLEFRPSRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

APPLMVHPRPRLSPQVFGK

 

$$$$

 

>CYP75B38 gi|83944614|gb|ABC48911.1| DQ298196.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 97% to CYP75B32

this seq is called VvF3’H-1a in Castellarin et al. BMC Genomics 2006

100% to CAN75347

VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLH

ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIKALLLNLFTAGTDTSSSTV

EWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLTYLQAIVKETFRLHPSTPLSLPRMAAESCEI

NGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLG

LRMVHLLTATLVHAFNWELPEGQVAEKLNMDEA

 

>CYP75B38 gi|83944616|gb|ABC48912.1| DQ298197.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 97% to CYP75B32

this seq is called VvF3’H-1b in Castellarin et al. BMC Genomics 2006

1 aa diff to CAN75347

VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLH

ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIKALLLNLFTAGTDTSSSTV

EWAIAELIRHPEMMAQAQQEPDAVVGRGRLVTDLDLPKLTYLQAIVKETFRLHPSTPLSLPRMAAESCEI

NGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLG

LRMVHLLTATLVHAFNWELPEGQVAEKLNMDEA

 

>CYP75B38 gi|83944618|gb|ABC48913.1| DQ298198.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 97% to CYP75B32

this seq is called VvF3’H-1c in Castellarin et al. BMC Genomics 2006

1 aa diff to CAN75347

VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLH

ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIKALLLNLFTAGTDTSSSTV

EWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLTYLQAIVKETFRLHPSTPLSLPRMAAESCEI

NGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLG

LRMVHLLTATLVHAFNWELPEGQVAEKRNMDEA

 

>CYP75B38 gi|83944620|gb|ABC48914.1| DQ298199.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 97% to CYP75B32

this seq is called VvF3’H-1d in Castellarin et al. BMC Genomics 2006

100% to CAO64444.1

VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLH

ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIKALLLNLFTAGTDTSSSTV

EWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLTYLQAIVKETFRLHPSTPLSLPRMAAESCEI

NGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLG

LRMVHLLTATLVHAFNWELPEGQVAEKLNMDEA

 

>CYP75B38 gi|157342331|emb|CAO64444.1| CU459229.1 PN40024 100% to CAN75347.1

1 aa diff to AB213603.1

complement(join(4317456..4318097,4318193..4318645,

                     4319126..4319560))

MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF

RHIRQEEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLT

YLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPRPRLSPQVFGK

 

>CYP75B38-de3b CU459229.1 1206 bp upstream of CAO64444

Same as CAAP02002916.1-de3b C-term fragment

4320766 GQVAEKLNMDKAYGLALQ*AAPLMVHPQPRLSPQGFG 4320656

 

>CYP75B38-de3c CU459229.1 same as CAAP02002916.1-de3c  C-term fragment

4309781 GLTLQRAAPLMVHPQPRLSPQGFG 4309707

 

>CYP75B38 gi|147801850|emb|CAN75347.1| Pinot Noir genomic

97% to CYP75B32

MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF

RHIRQEEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLT

YLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPRPRLSPQVFGK

 

>CYP75B38 CAAP02002916.1  100% to CAN75347.1, Pinot Noir genomic

97% to CYP75B32, 1 aa diff to AB213603

45444 MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYG 45265

45264 PLMHLRMGFVDVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRW 45085

45084 RMLRKICSVHLFSGQALDDFRHIRQ 45010

44529 EEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEM 44353

44352 VVELMVLAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSER 44173

44172 HVDLLSTLISLKDNADGEGGKLTDVEIKALLL 44077

43981 NLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLTYLQAIV 43802

43801 KETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRF 43622

43621 LPGGERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAE 43442

43441 KLNMDEAYGLTLQRAAPLMVHPRPRLSPQVFGK* 43340

 

>CYP75B38-de3b CAAP02002916.1-de3b C-term fragment

46650 GQVAEKLNMDKAYGLALQ*AAPLMVHPQPRLSPQGFGK* 46534

 

>CYP75B38-de3c CAAP02002916.1-de3c  C-term fragment

35664 GLTLQRAAPLMVHPQPRLSPQGFG 35593

 

>CYP75B38 gi|78183420|dbj|BAE47004.1| AB213603.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase 96% to CYP75B32

1 aa diff to CAO64444.1

MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF

RHIRQEEVLALMRALAREGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD

GEGGKLTDVEIKALLLNLFTAGTDTSSSTVEWAIAELIRHPEMMAQAQQELDAVVGRGRLVTDLDLPKLT

YLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVNVWAIARDPEVWEEPLEFRPNRFLPGG

ERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQR

AAPLMVHPRPRLSPQVFGK

 

$$$$

 

>CYP75B39P gi|83944622|gb|ABC48915.1| DQ298200.1 Cabernet Sauvignon genomic

truncated flavonoid 3'-hydroxylase, pseudogene

only 2 aa diffs to BAE47003.1

this seq is called VvF3’H-2 in Castellarin et al. BMC Genomics 2006

VCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMVLAGVFNIGDFVPALEWLDLQGIASKMKKLH

ARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNADGEGGKLTDVEIEALLL (0)

NLFTAGTDTSSSTV

EWAIAELIRHPEMMAQA & (23 bp deletion)

GRGRLVTDLDLPQLTYLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIP

KNATLLVNVWAIARDPEVWEKPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAG

MSLGLRMVHLLTATLVHAFNWELPEGQVAEKLNMDE

 

>CYP75B39P gi|78183418|dbj|BAE47003.1| AB213602.1 Cabernet Sauvignon genomic

flavonoid 3'-hydroxylase pseudogene

96% to CYP75B32

same deletion as DQ298200.1, 2 aa diffs

MNPLALSFCTALFCVLLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF

RHIRQ ()

EEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD

GEGGKLTDVEIKALLL (0) 1396

NLFTAGTDTSSSTVEWAIAELIRHPEMMAQA 1584 & (23 bp deletion)

1586 GRGRLVTDLDLPQLTYLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKNATLLVN 1762

1763 VWAIARDPEVWEKPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMV 1942

1943 HLLTATLVHAFNWELPEGQVAEKLNMDEAYGLTLQRAAPLMVHPRPRLSPQVFGK 2107

 

>CYP75B39P gi|147825152|emb|CAN62275.1| AM488740.1 Pinot Noir genomic

96% to CYP75B32, 1 aa diff to CAN75347, plus the same deletion as in AB213602 and DQ298200

MNPLALIFCTALFCILLYHFLTRRSVRLPPGPKPWPIVGNLPHLGPVPHHSIAALAKTYGPLMHLRMGFV

DVVVAASASVAAQFLKTHDANFSNRPPNSGAKHIAYNYQDLVFAPYGPRWRMLRKICSVHLFSGQALDDF

RHIRQEEVLALMRALARAGQTPVKLGQLLNVCTTNALGRVMLGRRVFGDGSGGEDPKADEFKEMVVELMV

LAGVFNIGDFVPALEWLDLQGVASKMKKLHARFDAFLGAIVEEHKISGSAGSERHVDLLSTLISLKDNAD

GEGGKLTDVEIKALLL (0)

NLFTAGTDTSSSTVEWAIAELIRHPEMMAQA & (23 bp deletion)

PGRGRLVTDLDLPQLTYLQAIVKETFRLHPSTPLSLPRMAAESCEINGYHIPKN

ATLLVNVWAIARDPEVWEKPLEFRPNRFLPGGERPNADVRGNDFEVIPFGAGRRICAGMSLGLRMVHLLT

ATLVHAFNWELPEGQVAEKLNMDEAYGLTLQRAAPLMVHPRPRLSPQVFGK
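
Entry names in this file follow the usual CYP pattern: family number, subfamily letter(s), gene number, an optional "P" for pseudogenes, an optional variant suffix such as "v2", and sometimes a trailing tag such as "-de3b" on fragment entries. The sketch below splits a name into those parts. The regular expression and the field name `fragment_tag` are assumptions made for illustration, and the tag is kept verbatim without interpreting it.

```python
import re

# Hypothetical pattern covering names as they appear in this file.
NAME = re.compile(r"^CYP(\d+)([A-Z]+)(\d+)(P?)(v\d+)?(?:-(\w+))?$")

def parse_cyp_name(name):
    """Split a CYP entry name into family, subfamily, number and flags."""
    m = NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognised name: {name}")
    family, subfamily, number, pseudo, variant, tag = m.groups()
    return {
        "family": int(family),
        "subfamily": subfamily,
        "number": int(number),
        "pseudogene": pseudo == "P",
        "variant": variant,        # e.g. "v2", or None
        "fragment_tag": tag,       # e.g. "de3b", or None (kept verbatim)
    }

for name in ["CYP75A28", "CYP75A39Pv2", "CYP75B38-de3b", "CYP76F10P"]:
    print(name, parse_cyp_name(name))
```

Grouping parsed entries by family and subfamily gives one way to cross-check the gene and pseudogene tallies in the subfamily headings.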

 

 

CYP76 family (24 genes) [23 pseudogenes]

 

CYP76A subfamily (5 genes) [3 pseudogenes]

 

>CYP76A10 CAAP02005158.1b 66% to CAN77399.1

23434 MEWTTNFLVWLIIPFLSALLLLLHRLKSGFNKHLPPGPPGWPIFGNIFDLGTLPHQKLAG 23255

23254 LRDTYGDVVWLNLGYIGTMVVQSSKAAAELFKNHDLSFSDRSIHETMRVHQYNESSLSLA 23075

23074 PYGPYWRSLRRLVTVDMLTMKRINETVPIRRKCVDDLLLWIEEEARGMDGTATGLELGRF 22895

22894 FFLATFNMIGNLMLSRDLLDPQSRKGSEFFTAMRISMESSGHTNFADFFPWLKWLDPQGL 22715

22714 KKRMEVDLGKSIEIASGFVKERMRQGRAEESKRKDFLDVLLEFQGDGKDEATKISEKGIN 22535

22534 IFIT (0) 22523

22430 EMFMAASETTSSTMEWAMTELLRSPESMTKVKAELGRVIGEKRKLEESDLDDLPYLH 22260

22259 AVVKETLRLHPAAPFLVPRRAVEDTKFMGYHIPKGTQVFVNVWAIGREAETWDDALCFKP 22080

22079 ERFVDSNMDYKGQNFEFIPFGAGRRICVGIPLAYRVLHFVLGSLLHHFDWQLERNVTPE 21903

21902 TMDMKERRGIVICKFHPLKAVPKIKPIST* 21813

 

>CYP76A12 gi|147791648|emb|CAN77399.1| AM476034.2 (871-3002)

44% to 76G1

CAAP02005158.1a 16011-13907 (-) strand 100% match

MVDWASNILLWCIILVIPVLFLLLHRRRSGSVRLPPGPPGWPVFGNMFDLGAMPHETLAGLRHKYGDVVW

LNLGAIKTTVVQSSKAAAELFKNQDLCFSDRTITETMRAQGYHESSLALAPYGPHWRSLRRLMTMEMLVT

KRINETAGVRRKCVDDMLSWIEEEARGVGGEGRGIQVAHFVFLASFNMLGNLMLSCDLLHPGSKEGSEFF

EVMVRVMEWPGHPNSADFFPWLRWMDPQGLRKKAERDLGIAMKIASGFVQERIKRGPAAEDHKKDFLDVL

LDFQGSGKNEPPQISDKDLNIIILEIFMAGSETTSSTVEWALTELLRHPECMAKVKAELGRVVGASGKLE

ERHIDDLQYLQAVVKETFRLHPPIPFLVPRKAVRDTNFMGYHIPKNTQLFVNVWAIGREAELWEEPSSFK

PERFLDLNHIDYKGQHFZLIPFGAGRRMCAGVPLAHRMVHLVLGSLVYHFDWQLDSSITLETMDMRENLA

MVMRKLEPLKALPKKVSL

 

>CYP76A13 gi|147791649|emb|CAN77400.1| AM476034.2 (10417-12389)

45% to 76G1

CAAP02005373.1b 19219-17303 (-) strand, 2 aa diffs

Missing N-term seq

Nearly identical to adjacent gene CAN77399.1, 3 aa diffs

MXDWASNILLWCIILVIPVLFLLLXXRRSGSVRLPPGPPG

WPVFGNMFDLGA

MPHETLAGLRHKYGDVVWLNLGAIKTTVVQSSKAAAELFKNQDLCFSDRTITETMRAQGYHESSLALAPY

GPHWRSLRRLMTMEMLVTKRINETAGVRRKCVDDMLSWIEEEARGVGGEGRGIQVAHFVFLASFNMLGNL

MLSCDLLHPGSKEGSEFFEVMVRVMEWSGHPNFADFFPWLRWMDPQGLRKKAERDLGIAMKIASGFVQER

IKRGPAAEDHKKDFLDVLLDFQGSGKNEPPQISDKDLNIIILEIFMAGSETTSSTVEWALTELLRHPECM

AKVKAELGRVVGANGKLEERHIDDLQYLQAVVKETFRLHPPIPFLVPRKAVRDTNFMGYHIPKNTQLFVN

VWAIGREAELWEEPSSFKPERFLDLNHIDYKGQHFELIPFGAGRRMCAGVPLAHRMVHLVLGSLVYHFDW

QLDSSITLETMDMRENLAMVMRKLEPLKALPKKVSL*

 

>CYP76A14P CAAP02005373.1b pseudogene 64% to CAN77399.1

5191 MERASNFLLYLIVISSSAMSFMLCRRKSGFNRLPPRPIGWPILSNMLDLGTMLHQTLAGLRHK 5003

5002 YGDVVWLRLGAIKTMVILSSKAAGELFKNHDLSFADRSIGETMRVHEYNEGSLALVPYGP 4823

3499 LTTDMFTVRRINETANVRRKCVDDMLLWIEKEALGVNGEASSVHVAEAVFLSNMLGNL 3326

3325 MLSRDVLDLRSEEGSEFFTIMSNLTEWSGHPNLADFFPWLGWLNL*GLRK 3176

2669 KSQQRDLGKAMEMASGFVNERMKKQRTEGTKRKDFLDVLLEFEGNGRDEPAKTSDRDVNI 2490

2489 FIL (0) 2481

2336 EIFMAGSETSSSIVEWVMTELLRNPKSMSKVKDELARVVGADRNVEESDIDELQYLQ 2166

2165 AVVKETLRLHPPIPFLIPRSAIQDTSFMGYHIPKDTQVLVNAWAIGRDPGS*EDPSSFKP 1986

1985 ERFLDSKKIDYKGQNFE 1935

 752 LIPFGAGRRICAGIPLAHRVLHLVLGTLLHHFDWQLEGNVTPETMDMKEKWGLVMLESQP 573

 572 LKAVPKKLT* 543

 

>CYP76A15 gi|147774514|emb|CAN76783.1| 42% to 76G1

CAAP02013124.1 = CAN76783.1

CAAP02000672.1b 113501-112101 added missing parts

113663 MELSTASIVFWSCFFSAALLLFLRLIKFTKGSTKSTPPGPQGWPIFGNIFDLGT

1403 LPHQTLYRLRPQHGPVLWLQLGAINTMVVQSAKAAAELFKNHDLSFSDRNVPFTLTAHNY 1224

1223 DQGSMALGKYGPYWRMIRKVCASELLVNKRINEMGSLRRKCVDDMIRWIEEDAAKSGAEG 1044

1043 RAGEVELPHFLFCMAFNLIGNITLSRDVVDIKSKDGHEFFQAMNGVVEWAGRPNIADFFP 864

 863 LLKRLDPLGMMRNMVRDMGQALNLIARFVKERDEERQSGMVREKRDFLDVLLECRDDEKE 684

 683 GPHEMSDNKVKIIVL (0) 639

 413 EMFFAGSDTTSSTLEWAMTELLRRPESMRKAQEELDRVVGPHGKVEESDIDQLLYLQ 243

 242 AVVKETLRLHPPIPLLLPRNALQDTNFMGYFVPKNTQVFVNAWAIGRDPDAWKEPLSFKP 63

  62 DRFLGSNLDYKGQNFEFIPF 3

     GSGRRICIGISLANKLLPLALASLLHCFDWELGGGVTPET 111981

111980 MDMNERVGITVRKLIPLKPIPKRRTV* 111900

 

>CYP76A15-de2b CAAP02000672.1b-de2b pseudogene, C-term

111539 KEERFIDSDKQ*GDGFVLMASLAGIPSTLAHKVMHLVLLGLLLHRFDWDLEWDIFPK 111369

 

>CYP76A16 gi|147774515|emb|CAN76784.1| 49% to 76G6, 44% to 76G1

CAAP02000672.1a 105702-104085 (-) strand 100% match

MSSLLWWSAFFSAALLVLLRRIKPRKGSTKLRPPGPQGWPILGNIFDLGTMPHQTLYRLRSQYGPVLWLQ

LGAINTVVIQSAKVAAELFKNHDLPFSDRKVPCALTALNYNQGSMAMSNYGTYWRTLRKVCSSELLVIKR

INEMAPLRHKCVDRMIQWIEDDATMARVQGGSGEVEVSHLVFCVAFNLIANLMLSRDFFDMKPKEGNEFY

BAMNKIMELAGKPNTADFFPFLKWLDPQGIKRNMVRELGRAMDIIAGFVKERVEERQTGIEKEKRDFLDV

LLEYRRDGKEGSEKLSERNMNIIILEMFFGGTETTSSTIEWAMTELLRKPKSMRKVKEELDRVVGPDRKV

EESDIDELLYLQAVVKETLRLHPALPLLIPRNALQDTNFMGYFIPQNTQVFVNAWSIGRDPEAWHKPLSF

KPRRFLGSDIDYKGQNFELIPFGSGRRMCIGMPFAHKVVPFVLASLLHCFDWELGSNLTPETIDMNERVG

LTLRKLVPLKAIPRKRIVRDR

 

>CYP76A17P CAAP02000672.1c pseudogene 93% to CAAP02005373.1b

117421 DLRSKEGSEFFTIMSNLTEWSGHPNLSDFFPWLGWLDLQGLRKNMERDLGKAMEMASGF 117245

117244 VNERMKKQRTEGTKRKDFLDVLLEFEGNGKDEPAKISDRDVIIFIL (0) 117107

117000 EIFLAGSETSSSIVEWAMTELLRNPKSMSEVKDELARVVGADRNVEESDIDELQYLQAVV 116821

116820 KETLRLHPPIPFLILRSAIQDTSFMGYHIPKDTQVLVNARAIGRDPGSWEDPSSFKPERF 116641

116640 LDSKKIEYKGQNFELIPFGAGRRICAGIPLAHRVLHLVLGTLLHHFDWQLKGNVTPETMD 116461

116460 MKEKWGLVMRKSQPLKAVPKKLT* 116389

 

CYP76F subfamily (5 genes) [2 pseudogenes]

 

>CYP76F2 gi|7406712|emb|CAB85635.1| putative ripening-related P-450 enzyme

same as AJ237995

MELLSCLLCFLAAWTSIYIMFSARRGRKHAAHKLPPGPVPLPIIGSLLNLGNRPHESLANLAKTYGPIMT

LKLGYVTTIVISSAPMAKEVLQKQDLSFCNRSIPDAIRAAKHNQLSMAWIPVSTTWRALRRTCNSHLFTS

QKLDSNTHLRHQKVQELLANVEQSCQAGGPVDIGQEAFRTSLNLLSNTIFSVDLVDPISETAQEFKELVR

GVMEEAGKPNLVYYFPVLRQIDPQGIRRRLTIYFGRMIEIFDRMIKQRLQLRKIQGSIASSDVLDVLLNI

SEDNSNEIERSHMEHLLLDLFAAGTDTTSSTLEWAMAELLHNPETLLKARMELLQTIGQDKQVKESDISR

LPYLQAVVKETFRLHPAVPFLLPRRVEGDADIDGFAVPKNAQVLVNAWAIGRDPNTWENPNSFVPERFLG

LDMDVKGQNFELIPFGAGRRICPGLPLAIRMVHLMLASLIHSYDWKLEDGVTPENMNMEERYGISLQKAQ

PLQALPVRV

 

>CYP76F10P CAAP02001054.1  75% to 76F2

34678 MDLMSYLLCLLVAWTSIYIVVSARRSKSGAGKLPPGPVPFPIIGNLLNLGNKPH 34517

34516 ESLANLAKIYGPVMSLKLGCVTTVVITSATMAKEVLQKKDQSFCNRTIPDALRALNHNQI 34337

34336 SMVWLPVSTKWRTLRKICNSHIFTNQKLDSSNYLRHQKVQDLLANVEQSCQAGDVVDIGQ 34157

34156 EAFRTTLNLLSNTTFSVDLVEPSSDTVQEFKELVRHMMEEAAKPNLADYFPVVRKIDPQG 33977

33976 IRRRMAIHFGKMIKVLDKKVKQRLRSRQVQGWMASSDVLDTLLNISEDSNNFLDITHIDH 33797

33796 LLL (0)

32324 DLFVAGTDTTANTLEWAMAELLHNPETLLRVQAELRQTIGKDKLVKESDIARLPYLQA 32151

32150 VVKETFRLHPAVPFLLPRKVEVDTEMCGFIVPKDAQVLVNVWAIGRDPNLWENPNLFMPE 31971

31970 RFLGSDMDVRGQNFELIPFGAGRRICPGLLLGIRMVQLMLASLIHSNDWKLEDGLTPENM 31791

31790 NMEEKFGFTLQKAQPLRVLPIHV 31722

 

>CYP76F10P gi|147816105|emb|CAN66326.1| AM481161.2

60% to 76C1

2 aa diffs to 76F10P

4841 MDLMSYLLCLLVAWTSIYIXVSARRSKSGAGKLPPGPVPFPIIGNLLNLGNKPHESLANL 4662

4661 AKIYGPVMSLKLGCVTTVVITSATMAKEVLQKKDQSFCNRTIPDALRALNHNQISMVWLP 4482

4481 VSTKWRTLRKICNSHIFTNQKLDSSNYLRHQKVQDLLANVEQSCQAGDVVDIGQEAFRTT 4302

4301 LNLLSNTTFSVDLVEPSSDTVQEFKELVRHMMXEAAKPNLADYFPVVRKIDPQGIRRRMA 4122

4121 IHFGKMIKVLDKKVKQRLRSRQVQGWMASSDVLDTLLNISEDSNNFLDITHIDHLLL 3951

2471 DLFVAGTDTTSNTLEWAMAELLHNPETLLRVQAELRQTIGKDKLVKESDIARLPYLQA 2298

2297 VVKETFRLHPAVPFLLPRKVEVDTEMCGFIVPKDAQVLVNVWAIGRDPNLWENPNLFMPE 2118

2117 RFLGSDMDVRGQNFELIPFGAGRRICPGLLLGIRMVQLMLASLIHSYDWKLEDGLTPENM 1938

1937 NMEEKFGFTLQKAQPLRVLPIHV 1869

 

>CYP76F11 CAAP02001054.1  pseudogene 71% to 76F2

10316 MDLFSCLLCLLVAWASIYIVVSARRRKSGAGKLPPGPVPFPIIGNLLNLGNKPH 10155

10154 ESLANLAKIHGPVMTLELGCVTTVVITSATMAKEVLQKKDQSFCNRTIPDALRAHNHNQL 9975

 9974 SVVWLPASTKWRTLRK 9927

      NSHIFTSQKLDSNAHL (this seq is inverted)

 9885 NCQAGDVVDIGLEAFRTTLNLLSNTIFSVDLVEPSSDTVQEFKELVRNMMEEAAKPN 9715

      (gap)

 9713 MAIHFGNMIEVFDKMVKQRLRSRQVQGWMASSDVLHILLTISEDSNNVLDITNIDHLLL 9537

 9389 DLFAAGTDTTTNTLEWAMA 9333

 9333 KLLHKPETLRRVQVELLQTIGKDKLVKESDIAQLPYLQAVVKETFRLHPAVPLLLPRKAD 9154

 9153 VDTDICGFIVPKDAQVLVNVWAIGRDPNLWENPNSFMPERFLGSDMDVRGQNFELIPFGA 8974

 8973 GRRICPGIRMIHLMLASLLHSYDWKLEDGVTPENMNMEEKFGVTLQNAQPLRALPT 8806

 8805 LV* 8797

 

>CYP76F11 gi|147802689|emb|CAN72997.1| AM480526.2 60% to 76C1

3 aa diffs to 76F11

11345 MDLFSCLLCLLVAWASIYIVVSARRRKSGAGKLPPGPVPFPIIGNLLNLGNKPHESLANL 11166

11165 AKIHGPVMTLELGCVTTVVITSATMAKEVLQKKDQSFCNRTIPDALRAHNHNQLSVVWLP 10986

10985 ASTKWRTLRK 10956

10911 NSHIXTSQKLDSNAHL 10958 (this seq is inverted)

10914 NCQAGDVVDIGLEAFRTTLNLLSNTIFSVDLVEPSSDTVQEFXELVRNMMEEAAKPN 10744

      (gap)

10742 MAIHFGNMIEVFDKMVKQRLRSRQVQGWMASSDVLHILLTISEDSNNVLDITNIDHLLL 10566

10415 DLFAAGTDTTTNTLEWAMA 10359

10359 KLLHKPETLRRVQVELLQTIGKDKLVKESDIAQLPYLQAVVKETFRLHPAVPLLLPRKAE 10180

10179 VDTDICGFIVPKDAQVLVNVWAIGRDPNLWENPNSFMPERFLGSDMDVRGQNFELIPF 10006

      XAGRRICPGIRMIHLMLASLLHSYDWKLEDGVTPENMNMEEKFGVTLQKAQPLRALPTLV 9826

 

>CYP76F12 CAAP02002347.1d = CAN79423.1 86% to CYP76F2

CAN79423.1 has 2 aa diffs and a small deletion

24772  MEMLSCLLCFLVAWTSIYIMFSVRRGSQHTAYKLPPGPVPLPIIGNLLNLGNRPHESLAE  24951

24952  LAKTYGPIMTLKLGYVTTIVISSAPMAKEVLQKQDLSFCNRFVPDAIRATNHNQLSMAWM  25131

25132  PVSTTWRVLRKICNSHLFTTQKLDSNTHLRHHKVQELLAKVEESRQAGDAVYIGREAFRT  25311

25312  SLNLLSNTIFSVDLVDPISETVLEFQELVRCIIEEIERPNLVDYFPVLRKIDPQGIRRRL  25491

25492  TIYFGKMIGIFDRMIKQRLQLRKMQGSIATSDVLDTLLNISEDNSNEIERNHMEHLLL (0) 25677

       DLFVAGTDTTSSTLEWAMAE  25851

25852  LLHNPEKLLKARVELLQTIGKDKQVKESDITRLPFLQAVVKETFRLHPVVPFLIPHRVEE  26031

26032  DTDIDGLTVPKNAQVLVNAWAIGRDPNIWENPNSFVPERFLELDMDVKGQNFELIPFGAG  26211

26212  RRICPGLPLATRMVHLMLASLIHSCDWKLEDGMTPENMNMEDRFGITLQKAQPLKAIPIRV*  26397

 

>CYP76F13P CAAP02002347.1c pseudogene, missing N-term, with an insertion in exon 2

93% to 76F2 CAB85635.1

11894  KVQELLANVEQRCQAGGPVDIGREAFRTSLNLLSNAIFSVDLVDPISETAQEFKELVRGV  11715

11714  MEEAGKPNLVDYFPVLRQIDPQGIRRGLTIYFGRMIEIFDRMIKRRLRLRKMQGSIASSD  11535

11534  VLDILLNISEDNSNEIERSHMEHLLL (0)

       DLFVAGTDTTSSTLEWAM  11250

10487  AELLYNPEKLLKARMELLQTIGQDKQVKESDITRLPYVQAVVKETFRLHPAVPFLL  10320

10319  PRRVEEDTDIQGFTVPKNAQVLVNAWAIGRDPNTWENPNSFVPERFLGLDMDVKGQNFEL  10140

10139  IPFGAGRRICPGLPLAIRMVHLMLASLIHSYDWKLEDGVTPENMNMEESFGLSLQKAQPL  9960

9959   QALPVRV  9939

 

>CYP76F14 CAAP02002347.1b 98% to 76F2 CAB85635.1 7 aa diffs

6638