This file includes 222 sequences found in GenPept by searching for

Vitis[orgn] AND P450 on Sept. 20, 2007.  These start with CAN.

Note: on Oct 4, the same search found 642 accessions.  These include

416 sequences from the other grape genome project, starting with CAO.

Click here for a link to those 416 sequences.
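
The GenPept search described above could also be expressed as a scripted query. The following is a minimal sketch, assuming NCBI's E-utilities esearch endpoint and the "protein" database as stand-ins for the GenPept web search; neither is named in these notes, and the function name build_esearch_url and the retmax value are illustrative only. It only builds the URL and sends no request. A search run today would return different counts from those recorded in 2007.

```python
from urllib.parse import urlencode

# Assumption: the E-utilities esearch endpoint; the original notes only say "GenPept".
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term="Vitis[orgn] AND P450", db="protein", retmax=1000):
    """Return an esearch URL for the given Entrez query (no request is sent)."""
    params = {"db": db, "term": term, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_esearch_url())
```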

 

These automated assemblies have not been checked against known

P450s for errors in assembly, gene fusions etc.

 

262 accessions from the grape genome project in the WGS section have been

mined for P450s and they have been assembled and sorted into family groups. 

(see bottom of file for a complete list of the CAAP accessions)

 

591 sequences are present below, but some are duplicates. Gene sequences are being clustered into bins of identical or presumed identical genes, each indicated with a #

followed by a number.  Pseudogenes are being labeled in a similar way with an

@ sign followed by a number.  (in progress)

Oct. 4, 2007, revised Nov. 14, 2007
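
The bin markers described above lend themselves to simple parsing. The sketch below is illustrative only: it assumes that a marker applies at least to the first record that follows it (whether it also covers later records is not stated in these notes), and the function name bin_first_headers is hypothetical.

```python
import re

# A marker line is '#' (gene bin) or '@' (pseudogene bin) followed by a number.
MARKER = re.compile(r"^([#@])(\d+)$")


def bin_first_headers(lines):
    """Map each bin marker to the name on the first '>' header that follows it.

    Assumption: only the first record after a marker is attributed to the bin.
    """
    bins = {}
    pending = None
    for raw in lines:
        line = raw.strip()
        m = MARKER.match(line)
        if m:
            kind = "gene" if m.group(1) == "#" else "pseudogene"
            pending = (kind, int(m.group(2)))
        elif line.startswith(">") and pending is not None:
            bins[pending] = line[1:].split()[0]
            pending = None
    return bins


example = """#9
>CYP71AH4 CAAP02005003.1a, 53% to 71B.d, 64% to 71A9
@7
>CYP71AH5P CAAP02005003.1b, pseudogene 70% to CAAP02005003.1a
>CYP71AH6 Gossypium raimondii 58% to CAAP02005003.1a
""".splitlines()

if __name__ == "__main__":
    print(bin_first_headers(example))
```

In this example the third header has no marker of its own, so it is left unbinned rather than inheriting @7.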

 

216 sequences (160 genes and 56 pseudogenes) are now named, with 375 sequences remaining to be named.

 

Last modified Feb. 26, 2008

 

P450 sequences in CYP family order

 

Table of 49 P450 families present

CYP83-like sequences here merged with CYP71AT

(missing 91 (merged with CYP81), 95 (part of the CYP72 family), 99 (grass specific),

702 (Brassicales only), 705 (part of CYP712), 708 (Brassicales only), 713 (merged with CYP71A),

717 (merged with CYP81), 719 (Ranunculales), 723 (grass specific), 725 (Taxus only, overlaps 716),

726 (part of CYP71, Euphorbia), 729, 730 (protist contaminant),

731 (protist contaminant), 732 (protist contaminant))

 

The only missing family that appears lost in Vitis is CYP729

 

The green shading below indicates that those families have had their genes named.

 

CYP51   2 genes, 1 pseudogene

CYP71  24 genes, 28 pseudogenes

CYP72                            55 sequences

CYP73   3 genes, 0 pseudogenes

CYP74   7 genes, 0 pseudogenes

CYP75                            43 sequences

CYP76                            46 sequences

CYP77   2 genes, 0 pseudogenes

CYP78   7 genes, 0 pseudogenes

CYP79                            26 sequences

CYP80   6 genes, 0 pseudogenes

CYP81                            43 sequences

CYP82                            82 sequences

CYP84   3 genes, 0 pseudogenes

CYP85   2 genes, 1 pseudogene

CYP86   6 genes, 1 pseudogene

CYP87   7 genes, 0 pseudogenes

CYP88   2 genes, 0 pseudogenes

CYP89                            30 sequences

CYP90   4 genes, 1 pseudogene

CYP92   6 genes, 1 pseudogene

CYP93   4 genes, 0 pseudogenes

CYP94   9 genes, 0 pseudogenes

CYP96   6 genes, 2 pseudogenes

CYP97   3 genes, 0 pseudogenes

CYP98   1 gene,  0 pseudogenes

CYP701  1 gene,  0 pseudogenes

CYP703  1 gene,  0 pseudogenes

CYP704  6 genes, 0 pseudogenes

CYP706  9 genes, 7 pseudogenes

CYP707  5 genes, 0 pseudogenes

CYP709  1 gene,  0 pseudogenes

CYP710  1 gene,  1 pseudogene

CYP711  1 gene,  1 pseudogene

CYP712  2 genes, 2 pseudogenes

CYP714                           22 sequences

CYP715  1 gene,  0 pseudogenes

CYP716                           28 sequences

CYP718  0 genes, 1 pseudogene

CYP720  1 gene,  0 pseudogenes

CYP721  5 genes, 3 pseudogenes

CYP722  1 gene,  0 pseudogenes

CYP724  2 genes, 0 pseudogenes

CYP727  1 gene,  0 pseudogenes

CYP728  6 genes, 2 pseudogenes

CYP733  1 gene,  0 pseudogenes

CYP734  2 genes, 0 pseudogenes

CYP735  1 gene,  0 pseudogenes

CYP736  8 genes, 4 pseudogenes

 

Totals 160   +   56        +     375   =   591

160 named genes, 56 named pseudogenes = 216 named sequences, 375 to name.
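
The totals above can be re-derived from the family table. The sketch below copies the 49 rows of the table into a string and sums the three columns; the row patterns and the function name tally are illustrative assumptions about how such a table might be parsed, not part of the original notes.

```python
import re

TABLE = """\
CYP51 2 genes, 1 pseudogene
CYP71 24 genes, 28 pseudogenes
CYP72 55 sequences
CYP73 3 genes, 0 pseudogenes
CYP74 7 genes, 0 pseudogenes
CYP75 43 sequences
CYP76 46 sequences
CYP77 2 genes, 0 pseudogenes
CYP78 7 genes, 0 pseudogenes
CYP79 26 sequences
CYP80 6 genes, 0 pseudogenes
CYP81 43 sequences
CYP82 82 sequences
CYP84 3 genes, 0 pseudogenes
CYP85 2 genes, 1 pseudogene
CYP86 6 genes, 1 pseudogene
CYP87 7 genes, 0 pseudogenes
CYP88 2 genes, 0 pseudogenes
CYP89 30 sequences
CYP90 4 genes, 1 pseudogene
CYP92 6 genes, 1 pseudogene
CYP93 4 genes, 0 pseudogenes
CYP94 9 genes, 0 pseudogenes
CYP96 6 genes, 2 pseudogenes
CYP97 3 genes, 0 pseudogenes
CYP98 1 gene, 0 pseudogenes
CYP701 1 gene, 0 pseudogenes
CYP703 1 gene, 0 pseudogenes
CYP704 6 genes, 0 pseudogenes
CYP706 9 genes, 7 pseudogenes
CYP707 5 genes, 0 pseudogenes
CYP709 1 gene, 0 pseudogenes
CYP710 1 gene, 1 pseudogene
CYP711 1 gene, 1 pseudogene
CYP712 2 genes, 2 pseudogenes
CYP714 22 sequences
CYP715 1 gene, 0 pseudogenes
CYP716 28 sequences
CYP718 0 genes, 1 pseudogene
CYP720 1 gene, 0 pseudogenes
CYP721 5 genes, 3 pseudogenes
CYP722 1 gene, 0 pseudogenes
CYP724 2 genes, 0 pseudogenes
CYP727 1 gene, 0 pseudogenes
CYP728 6 genes, 2 pseudogenes
CYP733 1 gene, 0 pseudogenes
CYP734 2 genes, 0 pseudogenes
CYP735 1 gene, 0 pseudogenes
CYP736 8 genes, 4 pseudogenes
"""

# Named families list genes and pseudogenes; unnamed ones list raw sequences.
GENE_ROW = re.compile(r"^(CYP\d+)\s+(\d+) genes?,?\s+(\d+) pseudogenes?$")
SEQ_ROW = re.compile(r"^(CYP\d+)\s+(\d+) sequences$")


def tally(table):
    """Return (families, genes, pseudogenes, unnamed sequences) for the table."""
    families = genes = pseudo = unnamed = 0
    for row in table.strip().splitlines():
        row = row.strip()
        named = GENE_ROW.match(row)
        raw = SEQ_ROW.match(row)
        if named:
            genes += int(named.group(2))
            pseudo += int(named.group(3))
        elif raw:
            unnamed += int(raw.group(2))
        else:
            raise ValueError("unrecognized row: " + row)
        families += 1
    return families, genes, pseudo, unnamed


if __name__ == "__main__":
    fam, g, p, u = tally(TABLE)
    print(fam, g, p, u, g + p + u)
```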

 

#1

>CYP51G CAAP02000072.1 81% to 51G1 Arab.

190429  MDVDNKFFNVALLIVATVVVAKLISALLIPKSRKRLPPTVKAFPVIGGLLRFLKGPVV  190256

190255  MLREEYPKLGSVFTLNLLNKNITFFIGPEVSAHFFKAPEADLSQQEVYQFNVPTFGPGVV  190076

190075  FDVDYSVRQEQFRFFTESLRVTKLKGYVDQMVTETE (0) 189968

188248  DYFSKWGDSGEVDLKYELEHLIILTASRCLLGQEVRDKLFADVSALFHDLDNGMLPISV  188072

188071  IFPYLPIPAHRRRDQARTKLAHIFANIIASRRETGKSENDMLQCFMDSKYKDGRQTTEAE  187892

187891  VTGLLIAALFAGQHTSSITSTWTGAYLFRHKEFLSAVLDEQKNLMKKHGNKVDHDILSEM  187712

187711  DVLYRCIKEALRLHPPLIMLLRSSHSDFSVTTKDGKEYDIPKGHIVATSPAFANRLPHIY  187532

187531  KDPERYDPDRFAVGREEDKVAGAFSYISFGGGRHGCLGEPFAYLQIKAIWSHLLRNFEFE  187352

187351  LISPFPEIDWNAMVVGVKGKVMVRYKRRVLPVD*  187250
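
Each translated segment in records like the one above is flanked by a pair of nucleotide coordinates, so a segment of N residues (counting a terminal * as one codon) should span 3 x N nucleotides. The sketch below is a rough consistency check under that assumption; the line pattern and the function name check_segment are illustrative, and the "(0)" marker that appears at exon boundaries is simply skipped. Descending coordinates presumably indicate the minus strand, so the absolute difference is used.

```python
import re

# start, peptide (may contain X or *), optional "(digit)" marker, end
SEGMENT = re.compile(r"^\s*(\d+)\s+([A-Z*]+)\s*(?:\(\d\))?\s+(\d+)\s*$")


def check_segment(line):
    """Return (start, end, expected_nt, observed_nt) for a segment line.

    Returns None when the line is not of the form 'start PEPTIDE end'.
    """
    m = SEGMENT.match(line)
    if not m:
        return None
    start, seq, end = int(m.group(1)), m.group(2), int(m.group(3))
    expected = 3 * len(seq)          # three nucleotides per residue or stop
    observed = abs(end - start) + 1  # inclusive span on either strand
    return start, end, expected, observed


if __name__ == "__main__":
    for text in (
        "159495  VKEEHENFIDVLLQTERDRT  159436",
        "190075  FDVDYSVRQEQFRFFTESLRVTKLKGYVDQMVTETE (0) 189968",
    ):
        start, end, expected, observed = check_segment(text)
        status = "ok" if expected == observed else "check"
        print(start, end, expected, observed, status)
```

A mismatch does not necessarily mean the sequence is wrong; it may only flag a mistyped coordinate or a deliberate gap in a pseudogene.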

 

#2

>CYP51G CAAP02000381.1 = AM475390.2, 81% to 51G1 Arab. 90% to CAAP02000072.1

97293   MDVDNKFFNAAFLLVATLVVAKLISALIIPRSKKRLPPTIKAFPLIGGLIRFLKGP  97460

97461   VVMLREEYPKLGSVFTLKLLNKNISFFIGPDVSAHFFKAPESDLSQQEVYRFNVPIFGPG  97640

97641   VVFDVDYSVRQEQFRFFTEALRVTKLKGYVDQMVMEAE (0) 97754

104754  DYFSKWGDCGEVDLKYELEHLIILTASRCLLGQEIRNKLFADVSALFHDLDNGMLPISV  104930

104931  IFPYLPIPAHRRRDQARKKLAEIFANIIASRKETGKSENDMLQCFIASKYKDGRPTTESE  105110

105111  VTGLLIAALFAGQHTSSITSTWTGAYLLRHKEYLSAVQDEQRSLMKKYGSKVDHDILSEM  105290

105291  DVLYRCIKEALRLHPPLIMLLRSSHTDFSVTTRDGKEYDIPKGHIVATSPAFANRLPHIY  105470

105471  KDPDRYDPDRFAVGREEDKAAGAFSYISFGGGRHGCLGEPFAYLQIKAIWSHLLRNFELE  105650

105651  LISPFPEVDWNAMVVGVKGKVMVRYKRRELPVN*  105752

 

>CYP51G1 AM475390.2 Vitis vinifera (Pinot noir grape) = CAAP02000381.1

9521  MDVXXKFFNAXFLLVATLLVAKLISALIIPRSKKRLPPTIKAFPLIGGLIRFLKGPVVML  9342

9341  REEYPKLGSVFTLKLLNKNISFFVGPDVSAHFFKAPESDLSQQEVYRFNVPIFGPGVVFD  9162

9161  VDYSVRQEQFRFFTEALRVTKLKGYVDQMVMEAE   (0) 9060

3862  DYFSKWGDCGEVDLKYELEHLIILTASRCLLGQEIRNKLFADVSALFHDLDNGMLPISV  3686

3685  IFPYLPIPAHRRRDQARKKLAEIFANIIASRKETGKSENDMLQCFIDSKYKDGRPTTESE  3506

3505  VTGLLIAALFAGQHTSSITSTWTGAYLLRHKEYLSAVQDEQRSLMKKYGSKVDHDILSEM  3326

3325  DVLYRCIKEALRLHPPLIMLLRSSHTDFSVTTRDGKEYDIPKGHIVATSPAFANRLPHIY  3146

3145  KDPDRYDPDRFAVGREEDKAAGAFSYISFGGGRHGCLGEPFAYLQIKAIWSHLLRNFELE  2966

2965  LISPFPEVDWNAMVVGVKGKVMVRYKRREL  2876

 

@1

>CYP51G1 pseudogene CAAP02006913.1

792 SHIFIGGGRNRCLGQHFAYLQVKAMWSHLL*NFEL*PISPFSKINWNAMVVGV 950

 

>CYP71AH1 old 71A11 tobacco

MKFLLVVASLFLFVFLILSATKRKSKAKKLPPGPRKLPVIGNLLQIGKLPHRSLQKLSNEYGDFIFLQLGSVPTVV

VFSAGIAREIFRTQDLVFSGRPALYAGKRFSYNCCNVSFAPYGNYWREARKILVLELLSTKRVQSFEAIRDEEVSS

LVQIICSSLSSPVNISTLALSLANNVVCRVAFGKGSDEGGNDYGERKFHEILFETQELLGEFNVADYFPGMAWINK

INGLDERLEKNFRELDKFYDKIIEDHLNSSSWMKQRDDEDVIDVLLRIQKDPNQEIPLKDDHIKGLLADIFIAGTD

TSSTTIEWAMSELIKNPRVLRKAQEEVREVAKGKQKVQESDLCKLEYLKLVIKETLRLHPPAPLLVPRVTTASCKI

MEYEIPADTRVLINSTAIGTDPKYWENPLTFLPERFLDKEIDYRGKNFELLPFGAGRRGCPGINFSIPLVELALAN

LLFHYNWSLPEGMLPKDVDMEEALGITMHKKSPLCLVASHYNLL

 

>CYP71AH2 tobacco

MNFLVVLASLFLFVFLMRISKAKKLPPGPRKLPIIGNLHQIGKL

PHRSLQKLSNEYGDFIFLQLGSVPTVVVSSADIAREIFRTHDLVFSGRPALYAARKLS

YNCYNVSFAPYGNYWREARKILVLELLSTKRVQSFEAIRDEEVSSLVQIICSSLSSPV

NISTLALSLANNVVCRVAFGKGSAEGGNDYEDRKFNEILYETQELLGEFNVADYFPRM

AWINKINGFDERLENNFRELDKFYDKVIEDHLNSCSWMKQRDDEDVIDVLLRIQKDPS

QEIPLKDDHIKGLLADIFIAGTDTSSTTIEWAMSELIKNPRVLRKAQEEVREVSKGKQ

KVQESDLCKLDYLKLVIKETFRLHPPVPLLVPRVTTASCKIMEYEIPVNTRVFINATA

NGTNPKYWENPLTFLPERFLDKEIDYRGKNFELLPFGAGRRGCPGINFSIPLVELALA

NLLFHYNWSLPEGMLAKDVDMEEALGITMHKKSPLCLVASHYTC

 

>71A9/CYP71AH3 Glycine max

MISFTVFVFLTLLFTLSLVKQLRKPTAEKRRLLPPGPRKLPFIG

NLHQLGTLPHQSLQYLSNKHGPLMFLQLGSIPTLVVSSAEMAREIFKNHDSVFSGRPS

LYAANRLGYGSTVSFAPYGEYWREMRKIMILELLSPKRVQSFEAVRFEEVKLLLQTIA

LSHGPVNLSELTLSLTNNIVCRIALGKRNRSGADDANKVSEMLKETQAMLGGFFPVDF

FPRLGWLNKFSGLENRLEKIFREMDNFYDQVIKEHIADNSSERSGAEHEDVVDVLLRV

QKDPNQAIAITDDQIKGVLVDIFVAGTDTASATIIWIMSELIRNPKAMKRAQEEVRDL

VTGKEMVEEIDLSKLLYIKSVVKEVLRLHPPAPLLVPREITENCTIKGFEIPAKTRVL

VNAKSIAMDPCCWENPNEFLPERFLVSPIDFKGQHFEMLPFGVGRRGCPGVNFAMPVV

ELALANLLFRFDWELPLGLGIQDLDMEEAIGITIHKKAHLWLKATPFCE

 

#9

>CYP71AH4 CAAP02005003.1a, 53% to 71B.d, 64% to 71A9, 62% to 71AH2 Nicotiana tabacum DQ350356.1

Note: 71A9 is 58% to 71AH2, so it is probably misnamed and should be CYP71AH3.

17504  MGISSFQASHSMVSQSLLLLLLVIFSALLLFLLSTKQKRKSVASRRLPPGPKKLPLIGNLHQLGSLPH  17301

17300  VGLQRLSNEYGPLMYLKLGSVPTLVVSSADMAREIFREHDLVFSSRPAPYAGKKLSYGCN  17121

17120  DVVFAPYGEYWREVRKIVILELLSEKRVQSFQELREEEVTLMLDVITHSSGPVYLSELT  16944

16943  FFLSNNVICRVAFGKKFDGGGDDGTGRFPDILQETQNLLGGFCIADFFPWMGWFNKLNG  16767

16766  LDARLEKNFLELDKIYDKVIEEHLDPERPEPEHEDLVDVLIRVQKDPKRAVDLSIEKIKGVLLT (0) 16575

16475  DMFIAGTDTSSASLVWTMAELIRNPSVMRKAQEEVRSAVRGKYQVEESDLSQLIYLKLVVKE  16310

16309  SLRLHPPAPLLVPRKTNEDCTIRGYEVPANTQVFVNGKSIATDPNYWENPNEFQPE  16142

16141  RFLDSAIDFRGQNFELLPFGAGRRGCPAVNFAVLLIELALANLLHRFDWELADGMRREDL  15962

15961  DMEEAIGITVHKKNPLYLLATPAN*  15887
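
Header lines such as the one for CYP71AH4 above carry percent-identity annotations of the form "NN% to NAME". The sketch below pulls those pairs out of a header; the pattern and the function name identities are illustrative assumptions. It captures only the name immediately after "to", so a phrase like "96% to 71B.d and 71B.e" yields the first name only.

```python
import re

# "NN% to NAME", where NAME may contain letters, digits and dots (e.g. 71B.d).
IDENTITY = re.compile(r"(\d+)%\s+to\s+([A-Za-z0-9.]+)")


def identities(header):
    """Return a list of (percent, name) pairs found in a header line."""
    return [(int(pct), name.rstrip(".")) for pct, name in IDENTITY.findall(header)]


if __name__ == "__main__":
    header = (
        ">CYP71AH4 CAAP02005003.1a, 53% to 71B.d, 64% to 71A9, "
        "62% to 71AH2 Nicotiana tabacum DQ350356.1"
    )
    print(identities(header))
```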

 

@7

>CYP71AH5P CAAP02005003.1b, pseudogene 70% to CAAP02005003.1a

26375  NVAFTSFGEY*KEVRNIVILEVLSAKRVHSFQ

25611  HGWMQAIKLMFDVIAHSSGPVNSIELRVFLSNNVIC*VAFGTKFDGGGDNGTRRFPEIL  25435

25434  QETQNLLGGFCIADFFPWMGWFDKLNAWLGCQVDKNFMELNRIYDKGIEMHLDPERPEPE  25255

25254  HEDLVDVLI*VQKDLRQVVSLSNEKIKGVLT (0)

25074  VHCSD*YPFSLAGMDNAEMIRNRSVMRKAQEKVRSTVRGKYQVEESDLSQLIYLKLVVKE  24895

24894  SLRLHLPAPSLVPRKTTKNCTI  24829

24815  FPQIHVFVNGNLISIDSNYWENPNEFQPERFVDSSIDFRGQSFEFLPFGASMRGCPGANF  24636

24635  AVLLIEVALTNILHRLTGNFLMG  24567

 

>CYP71AH6 Gossypium raimondii 58% to CAAP02005003.1a, 53% to 71A9/71AH3

CO072855.1 CO095493.1, CO072856.1

MDFQFILTLSFIAFTLMVFKYKARTRRLPPGPWKLPIIGNLHQLGDSSHKSIQRLSQ

QYGPMMFLQLGAVPTLVISSADAAMAIFKGPGGGYDLAFSGRPTNLYVAKKLSYEYNGIT

FAPYGELWREMRKIAVAELLSSKRVQSFRTIREEEVAAMLNHIDIASSSSAPVNLKKLSL

LLANHVVCRVTFGKKYGGGGDGGTNRFDRVLHEVQHLVGEFVVSDYFPWMWWVNKLNGMETRVEKNFEELDKLY

DEVIADHVAPTRTKANHEDIVDVLLRLQKDARQLITLNNQQIKGVLTDMFIAGTDTTAS

SLVWTFTELIRNPPSMEKVKYEVRKVGNGRDKIEESDIPKLHCLHSVIKETLRLHPPAPL

LVPRETTEDCVVGDYEIPAKTRVIINAKSIGTDPKYWENPHDFQPDRFMKSSVDFKGQHL

EFLPFGVGRRGCPGMSFAIMLLQLMVANFLYRFDWELPEGMSVEDVDMEEELGITVFKKT

PLCLVPIRVV*

 

#10

>CYP71AP5 CAAP02001743.1a, 43% to 71B2, 51% to 71B.c 53% to 71A1, 78% to 71AP4

15554  MALLQWLKEGFLPSFLFAGIILVAVLKFLQKGMLRKRKFNLPPSPRKLPIIGNLHQLGNMPHIS  15363

15362  LHRLAQKFGPIIFLQLGEVPTVVVSSARVAKEVMKTHDLALSSRPQIFSAKHLFYDCTDI  15183

15182  VFSPYSAYWRHLRKICILELLSAKRVQSFSFVREEEVARMVHRIAESYPCPTNLTKILGL  15003

15002  YANDVLCRVAFGRDFSAGGEYDRHGFQTMLEEYQVLLGGFSVGDFFPSMEFIHSLTGMK  14826

14825  SRLQNTFRRFDHFFDEVVKEHLDPERKKEEHKDLVDVLLHVKEEGATEMPLTMDNVKAIIL (0)  14646

14513  DMFAAGTDTTFITLDW  14466

14465  GMTELIMNPKVMERAQAEVRSIVGERRVVTESDLPQLHYMKAVIKEIFRLHPPAPVLVPR  14286

14285  ESMEDVTIDGYNIPAKTRFFVNAWAIGRDPESWRNPESFEPQRFMGSTIDFKGQDF  14118

14117  ELIPFGAGRRSCPAITFGAATVELALAQLLHSFDWELPPGIQAQDLDMTEVFGITMHR  13944

13943  IANLIVLAKPRFP* 13902

 

#7

>CYP71AS3.a CAAP02000057.1 Vitis vinifera 6 genes in a cluster 62% to CYP71AS1

177875  MELYSPSMWLHLLLLLLPLMFLIKRKIELTGQKKPLPPGPTKLPIIG 177735

177734  NLHQLGALPHYSLWQLSKKYGSIMLLQLGVPTVVVSSAEAAREFLKTHDIDCCSRPPLV  177558

177557  GLGKFSYNHRDISFAPYGDYWREVRKICVLEVFSTKRVQSFQFIREEEVALLIDSIVQSS  177378

177377  SSGSPIDLTERLMSLTANIICRIAFGKSFQASEFGDGRFQEVVHEAMALLGGLTAADFFP  177198

177197  YVGRIVDRLTGLHGRLERSFHEMDGFYQQVIEDHLNPGRVKEEHEDIIDVLLRIEREQSE  177018

177017  SSALQFTKDNAKAIVM (0) 176970

176205  DLFLAGVDTGAITVSWAMTELARNPRIMKKAQAEVRNSIGNKGKVTEGDVDQ  176050

176049  LHYLKMVVKETLRLHPPAPLLLPRETMSHFEINGYHFYPKTQVHVNVWAIGRDPNLWKNP  175870

175869  EEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNG  175690

175689  MKETDISMEEAAGLTVRKKFALNLVPILHHC*  175594

 

@5

>CYP71AS3-de1b CAAP02000057.1 54% to CYP71B.a

178567  RLTRLYGWLERRTSYELDGFY*QVIGLHDLKDVKEDFIDVLLQTERD  178427

 

#6

>CYP71AS4.b CAAP02000057.1 Vitis vinifera 6 genes in a cluster

170360  MALYSPSMWLHLLLLLLPLMYLIKRRIELKGQKKPLPPGPTKLPIIG 170220

170219  NLHQLGTLPHYSWWQLSKKYGPIILLQLGVPTVVVSSAEAAREFLKTHDIDCCSRPPLV  170043

170042  GLGKFSYNHQDIGFAPYGDYWREVRKICVHEVFSTKRLQSFQFIREEEVALLIDSIAESS  169863

169862  SSGSPIDLTERLMSLTANIICRIAFGKSFQVSEFGDGRFQEVVREAMALLGGFTAADFFP  169683

169682  YVGRIVDRLTGLHGRLERSFLEMDGFYQRVIEDHLNPGRVKEEHEDIIDVLLKIQRERSE  169503

169502  SGAVQFTKDSAKAILM  (0) 169455

169009  DLFLAGVDTGAITLTWAMTELARNPRIMKKAQVEVRNSIGNKGKVTEGDVDQ  168854

168853  LHYLKMVVKETLRLHPPVPLLLPRETMSHFEINGYHIYPKTQVQVNVWAIGRDPNLWKNP  168674

168673  EEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMVIATVELALANLLYRFNWNLPNG  168494

168493  MREADINMEEAAGLTVRKKFALNLVPILHHC*  168398

 

#8

>CYP71AS4v2 CAN60733.1| 73% to CAN83446.1 62% to 71AS1 55% to 71B34

96% to 71B.d and 71B.e, possible allele of 71B.b, since CAN83446.1 = 71B.e

MELYSPSIWLCLLLLLLPLMYLIKRRIELKGQKKPLPPGPTKLPIIGNLHQLGALPHYSWWQLSKKYGPI

MLLQLGVPTVVVSSVEAAREFLKTHDIDCCSRPPLVGLGKFSYNHRDIGFAPYGDYWREVRKICVLEVFS

TKRVQSFQFIREEEVTLLIDSIAQSSSSGSPIDLTERLMSLTANIICRIAFGKSFQVSEFGDGRFQEVVH

EAMALLGGFTAADFFPYVGRIVDRLTGHHGRLERSFLEMDGFYERVIEDHLNPGRVKEEHEDIIDVLLKI

ERERSESGAVQFTKDSAKAILMDLFLAGVDTGAITLTWAMTELARNPRIMKKAQVEVRNSIGNKGKVTEG

DVDQLHYLKMVVKETLRLHPPAPLLVPRETMSHFEINGYHIYPKTQVXVNVWAIGRDPNLWKNPEEFLPE

RFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNGMREADINMEEAAGLTV

RKKFALNLVPILHHC

 

#5

>CYP71AS5.c CAAP02000057.1 Vitis vinifera 6 genes in a cluster

157794  MALYSPSIWLHLLLLLLPLMFLIKRKIELKGQKKPLPPGPTKLPIIG 157654

157653  NLHQLGALPHYSLWQLSKKYGSIMLLQLGVPTVVVSSAEAAREFLKTHDIDCCSRPPLV  157477

157476  GLGKFSYNHRDISFAPYGDYWREVRKICVLEVFSTKRVQSFQFIREEEVALLIDSIVQSS  157297

157296  SSGSPIDLTERLMSLTANIICRIAFGKSFQASEFGDGRFQEVVHEAMALLGGLTASDFFP  157117

157116  YVGRIVDRLTGLHGRLERSFHEMDGFYQQVIEDHLNPGRVKEEHEDIIDVLLRIEREQSE  156937

156936  SSALQFTKDNAKAILM  (0) 156889

156035  DLFLAGVDTGAITVAWAMTELARNPGIMKKAQAEVRSSIGNKGKVTESDVDQ  155880

155879  LHYLKVVVKETLRLHPPAPLLLPRETMSHFEINGYHIYPKTQVHVNVWAIGRDPNLWKNP  155700

155699  EEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNG  155520

155519  IREADISMEEAAGLTVRKKFALNLVPILHHC*  155424

 

@4

>CYP71AS5-de1b CAAP02000057.1 65% to CYP71B.c

159495  VKEEHENFIDVLLQTERDRT  159436

 

#4

>CYP71AS6v1 .d CAAP02000057.1 Vitis vinifera 6 genes in a cluster

152305  MALYSPSIWLHLLLLLLPLMFLIKRKIELKGQKKPLPPGPTKLPIIGNLHQLGALPHYSL  152126

152125  WQLSKKYGSIMLLQLGVPT 152069

152068  VVVSSAEAAREFLKTHDIDCCSRPPLVGPGKFSYNHRDIGFAPYGDYWREVRKICVLEVF  151889

151888  STKRVQSFQFIREEEVTLLIDSIAQSSSSGSPIDLTERLMSLTANIICRIAFGKSFQVSE  151709

151708  FGDGRFQEVVHEAVALLGGFTAADFFPYVGRIVDRLTGLHGRLERSFLEMDGFYERVIED  151529

151528  HLNPGRVKEEHEDIIDVLLKIERERSESGAVQFTKDSAKAIIM (0) 151400

150931  DLFLAGVDTGAITLTWAMTELARNPRIMKKAQVEVRSSIGKKGKVTKGDVDQLHYLKMVV  150752

150751  KETLRLHPPVPLLVPRETMSHFEINGYHIYPKTQVHVNVWAIGRDPNLWKNPEEFLPERF  150572

150571  MDNSVDFRGQHFELLPFGAGRRICPGMYMAIATVELALANLLYRFNWNLPNGMREADISM  150392

150391  EEAAGLAVRKKFALNLVPILHHC* 150320

 

>CYP71AS6v2 gi|147855782|emb|CAN83446.1a 2 genes 55% to CYP71B

97% to 71B.d missing some seq after LPII

This may be an allele of 71B.d since it is upstream of 71B.e

MALYSPSXWLHLLLLLLPLMYLIKRXIELKGQKKPLPPGPTKLPII

 

VSSAEAAREFLKTHDIDCCSRPPL

VGXGKFSYNHRDIGFAPYGDYWREVRKICVLEVFSTKRVQSFQFIREEEVTLLIDSIAQSSSSGSPIDLT

ERLMSLTANIICRIAFGKSFQASEFGDGRFQEVVHEAMALLGGFTAADFFPYVGRIVDRLTGLHGRLERS

FLEMDGFYQRVIEDHLNPGRVKEEHEDIIDVLLKIERERSESGAVQFTKDSAKAILMDLFLAGVDTGAIT

LTWAMTELARNPRIMKKAQVEVRNSIGNKGKVTEGDVDQLHYLKMVVKETLRLHPPAPLLVPRETMSHFE

INGYHIYPKTQVHVNVWAIGRDPNLWKNPEEFLPERFMDNSVDFRGQHFELLPFGAGRRICPGMYMAIAT

VELALANLLYRFNWNLPNGMREADINMEEAAG

 

#3

>CYP71AS7v1.e CAAP02000057.1 Vitis vinifera 6 genes in a cluster, 58% to CYP71AS1

135067  MAPYSPDLWLPLVLLFLSLLFLLKKILELKEQKGPPGPPKLPIIG 134933

134932  NLHQLGALIHQSLWQLSKKYGPVMLLHLGFVPTLVVSSAEAAKKVLKDHDISCCSRPPLI  134753

134752  SIGRLSYNYLDISFAPYGPYWREIRKICVLQLFSTNRVQSFQVIREAEVALLIDSLAQSS  134573

134572  SSASPVDLTDKIMSLTANMICRIAFGRSFEGSEFGKGRFQEVVHEATAMMSSFFAADFFP  134393

134392  YVGRIVDRLTGIHERLEKSFHELDCFYQQVIEEHLNPGRMKEEHEDIIDVLLNIEKEQDE  134213

134212  SSAFKLTKDHVKAILM (0) 134165

134087  DLFLAGVDTGAITVVWAMTELARKPGVRKKVQDEVRSHIRERGKVRESDIEQ  133932

133931  FHYLKMVVKETLRLHPPVPLLLPKETMSTIEISGYQIYPKTQVYVNVWAIGRDPNLWNNP  133752

133751  EEFFPERFIDNSVDFKGQHFEFLPFGAGRRVCPAMNMAIAMVELTLANLLYHFNWKLPHG  133572

133571  MKEGDINMEEAPGLSVHKKIALSLVPIKYP*  133479

 

>CYP71AS7v2

CAN83446.1b 2 genes

Contains some intron seq. This seq is orthologous to CYP71AS7v1

KKILELKEQKGPPGPPKLPIIGNLHQLGALIHQSLWQLSKKH

GPVMLLHLGFVPTLVVSSAEAAKKVLKDHDISCCSRPPLISIGRLSYNYLDISFAPYGPYWREIRKICVL

QLFSTNRVQSFQVIREAEVALLIDSLAQSSSSASPVDLTDKIMSLTANMICRIAFGRSFEGSEFGKGRFQ

EVVHEATAMMSSFFAADFFPYVGRIVDRLTGIHERLEKSFHELDCFYQQVIEEHLNPGRMKEEHEDIIDV

LLNIEKEQDESSAFKLTKDHVKAILMAYFFEQDLFLAGVDTGAITVVWAMTELARKPGVRKK

Missing some seq here

EKFRESDI

EQFHYLKMVVKETLRLHPPVPLLLPKETMSTIEISGYQIYPKTQVYVNVWAIGRDPNLWNNPEEFFPERF

IDNSVDFKGQHFEFLPFGAGRRVCPAMNMAIAMVELTLANLLYHFNWKLPHGMKEGDINMEEAPGLSVHK

KIALSLVPIKYP

 

@3

>CYP71AS8P.f CAAP02000057.1 Vitis vinifera 6 genes in a cluster

pseudogene 74% to .c, missing first exon

126787  NLLLAGVNTSASTVVWAMAELARNPIVMKKAQAEVRSVIGN  126660

126659  KGKVTESDLDQLLYFKLVVKETFRLHPPSPLLLPRETMSHFQMNGYHIHPKTRVHVNV*A  126480

126479  IGRDPNVWKNPKEFFPESFIDNSIDFKGQHFELLPFGAGRRVCPAINMGIAMLELTFANL  126300

126299  LYHFNWKLPHGMKEEDINMEEGAGITSPKKFALILRPTQYP*  126174

 

@2

>CYP71AS9P.fg CAAP02000057.1 pseudogene 60% to CYP71B.d

122651  FLAGAKQECPTMV*EMAELARNPRTMKKTQAEVRSCAGKQGKVLGT

122506  DLDQLNYLKMMTMKEMLRLYPSV

        TILPTETMQHFNIN

        VYPKTQFLQLDVLAIGKDP  122327

122326  NIWEN

122309  PEEFSLERF  122283

 

@6

>CYP71AS10P CAAP02000950.1  pseudogene (+) strand, 49% to CYP71AS5.c   CAAP02000057.1

8049 VIKKATVVLASFSREDFFQFGGWIIDKFIGVHA*REKSFHIFDQFYQKVIDDHLDLNRP 8225

8226 KPEHEDIVDVLLGL*KDQTNV 8288

8601 NLFLGII*ATTITIVWALTELAKNPRVMKVAQAEIKSCLGYKLMVEESDLDRFQYLKIVF 8780

8781 K 8783

8780 QTLRLHPPLVMLTPWETVAHCKIGGYDVYPKTRIHINVWVIGKDPRVWDNLEEFNPERF 8956

8957 MNSDIDFRGQHFALVPLGAGRRLCLGMNIATTIMELTLANLLYSFD*RLPSGMKMEEIST 9136

9137 EEGFGSPGHKNEPLYLIP 9190

 

>CYP71AS11P AM481172 missing part of exon 1

CAN66328.1

this part 66% to CYP71AS7v1

11384 MATYSPFLWLPLLLLLPSLFFLIKRTVDQ*RVQREQLPPGLPIIGNLHQLGQLPHQS 11214

11213 LWQLFHKYG 11187

11185 TVIVLHLGFVPTLVVSSAEAARVVLKTRD 11099

(gap)  this part 73% to 71AS7v1

10117 NLLLAGVNTSASTVV*AMAELARNPRVMKKAQAEVRSVMGNKGKVTESDLDQLLYLKLVV 9938

 9937 KEIFRLHPPGPLLLPRETMSHFQMNGYHIHPKTRVHVNV*AIGREPNVWKNPEEFFPLRF 9758

 9757 IDNSIDFKGQHFELLPFGAGRRVCPAINMGIAMVELTFANLLYHFNWKLPHGMKEEDINM 9578

 9577 EEGAGITSPKKFAFILRP 9524

 

>CYP71AS12P second pseudogene on AM481172 56% to CYP71AS6v1

8682 VMLLQLGSVPTVVVSSA*ATKEVKT 8608

7225 FLAGAKQECPTMV*EMAELARNPRIMKKTQAEVRSCAGKQGKVLGT 7088

7080 DLDQLNYLKMMTMKEMLRLYP

7018 FSHTILPTETMQHFNIN 6968

6966 SSSVYPKTQFLQLDVLAIGKDP 6901

6900 NIWENTQKNF 6871

6883 PEEFSLERF 6857

 

>CYP71AT3 CAAP02000328.1a, 92% to CAN64422.1

(CAO61025.1)

46439  MTLLLFVILAFPLFLLFLYRKHRKNGGLLPPGPPGLPFIGNLHQMDNSAPHRYLWQLS  46612

46613  KQYGPLMSLRLGFVPTIVVSSAKIAKEVMKTQDLEFASRPSLIGQQRLSYNGLDLAFSPY  46792

46793  NDYWREMRKICVLHLFTLKRVKSYTSIREYEVSQMIEKISKLASASKLINLSEALMFLTS  46972

46973  TIICRVAFGKRYEGEGCERSRFHGLLNDAQAMLGSFFFSDHFPLMGWLDKLTGLTARLEK  47152

47153  TFREMDLFYQEIIEEHLKPDRKKQELEDITDVLIGLRKDNDFAIDITWDHIKGVLM (0)  47320

47389  NIFLGGTDTGAATVTWAMTALMKNPRVMKKAQEEVRNTFGKKGFIGEDDVEKLPYLKA  47562

47563  VVKETMRLLPSVPLLVPRETLQKCSLDGYEIPPKTLVFVNAWAIGRDPEAWENPEEFMPE  47742

47743  RFLGSSVDFRGQHYKLIPFGAGRRVCPGLHIGVVTVELTLANLLHSFDWEMPAGMNEEDI  47922

47923  DLDTIPGIAMHKKNALCLVAKKYN*  47997

 

>CYP71AT4 CAAP02000328.1b, 96% to CAN64422.1 but 5.8 kb upstream, different gene

76820  MTVLLFVILAFPLLLLFLHRKHRKNGGLLHLPPGPPGLPVIGNLHQMDNSAPHRYLWQLS  76999

77000  KQYGPLMSLRLGFIPTIVVSSARIAKEVMKTHDLKFASRPSLIGPRRLSYNCLDLAFSPY  77179

77180  NDYWREMRKICVLHLFTLKRVQSYTPIREYEVSQMIEKISKLASASKLINLSETVMFLTI  77359

77360  TIICRVSFGKRYEDEGCETSRFHGLLNDAQAMLGSFFFSDHFPLMGWLDKLTGLTARLEK  77539

77540  TLRDMDLFYQEIIEDHLKPDRKKQEQEDITDVLIELQKDNSFAIDITWDHIKGVLM (0)  77707

77780  NIFVGGTDAGTATVIWAMTALMKNPRVMKKAQEEVRNTFG  77899

77900  KKGFIGEDDVEKLPYLKAVVKETMRLLPAAPLLLPRETLQKCSIDGYEIPPKTLVFVNAW  78079

78080  AIGRDPEAWENPEEFIPERFLGSSVDFRGQNYKLIPFGAGRRVCPAIHIGAVTVELTLAN  78259

78260  LLYSFDWEMPAGMNKEDIDFDVIPGLTMHKKNALCLMAKKYN*  78388

 

>CYP71AT5P gi|147832399|emb|CAN64422.1| 48% to CYP83A2/83B1

CAAP02000328.1c 84167-85735 100% match, 96% to CYP71AT4

MTVLLFVILAFPLLLLFLHRKHRKNGGLLHLPPGPPGLPFIGNLHQMDNSARHRYLWQLSKQYGSLMSLR

LGFIPTIVVSSARIAKEVMKTHDLEFASRPSLIGPQRLSYNCLDLAFSPYNDYWREMRKICVLHLFTLKR

VQSYTPIREYEVSQMIEKISKLASASKLINLSETLMFLTSTIICRVAFGKRYEDEGFERSRFHGLLNDAQ

AMLGSFFFSDHFPLIGWLDKLTGLTARLEKTFRDMDLFYQEIIEDHLKPDRKKQEQEDITDVLIGLQKDN

SFAIDITWDHIKGVLM (0)

NIFVGGTDTGAATVIWAMTALMKNPRVMKKAQEEVRNTFGKKGFIGEDDVEKLP

YLKAVVKETMRLLPAVPLLIPRETLQKCSIDGYEIPPKTLVFVNAWAIGRDPEAWENPEEFIPERFLGSS

VDFRGQNYKLIPFGAGRRVCPGIHIGAVTVELTLANLLYSFDWEMPAGMNKEDIDFDVIPGLTMHKKNAL

CLMAKKYN*

 

>CYP71AT6P CAAP02000328.1d, pseudogene 100% to CAN64424.1 in overlaps, 69% to CAN64422.1

94192  ILLALPLILLEIRETMEECFFRPPGPPGLPFIGNLLHLDKSAPHRYLWQLSEKYGAL  94362

94363  MFLRLGFVPTLVVSSARMAEEVMKTHDLEFSSRPSLLGQQKLS*NGLDLAFAPYTNYWRE  94542

94543  MKKICTLHLFNSKRAQSFRSIREDEVSRMIEKISKFASASKLVNLSETLHFLTSTIICRI  94722

94723  AFSKRYEDEGWERSRFHTLLSEAQAIMGASFFKDYFPFMGWVDKLTGLTARLQKILRELD  94902

94903  LFYQEIIDHLNPERTKYEQEDIADILIG

       RINDSSFAIDITQDHIKAVVM

95017  NIFVGGTDTIAAILVWAMTALMKDPIVMKKAQEEIRNIG  95259

95260  GKKGFRDEDDIEKLPYLKALTKETMKLHPPIPLIPRATPENCSVNGCEVPPKTLVFVNA  95436

95437  WAIGRDPESRENPHEFNPERFLGTFIDFKGQHYGLMAFRAGRRGCPGIYLRTVIIQLALG  95616

95617  NLLYSFDWEMPNGMTKEDIDTDVKHGVTM  95703

 

>CYP71AT6P CAN64424.1 44% to 83A2/B1, pseudogene

MKKICTLHLFNSKRAQSFRSIREDEVSRMIEKISKFASASKLVNLSETLHFLTSTIICRIAFSKRYEDEG

WERSRFHTLLSEAQAIMGASFFKDYFPFMGWVDKLTGLTARLQKILRELDLFYQEIIDHLNPERTKYEQE

DIADILI

 

GGTDTIAAILVWAMTALMKDPIVMKKAQEEIRNIGGKKGFRDEDDIEKLPYLKALTKETMKLH

PPIPLIPRATPENCSVNGCEVPPKTLVFVNAWAIGRDPESRENPHEFNPERFLGTFIDFKGQHYGLMAFR

AGRRGCPGIYLRTVIIQLALGNLLYSFDWEMPNGMTKEDIDTD

GHFTGQLGQLAGNILGGFRQLRFSGVSITMWKLKRWKLRVHETQKNI

 

>CYP71AT7 CAAP02000328.1e, 84% to CYP71AT8 (CAAP02000328.1f, starting at 104360)

99326   MMILLLILLALPLFLLFLLRNRRRTPLPPGPPGLPLIGNLLQLDKSAPHIYLWRLS  99493

99494   KQYGPLMILRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSLLGLRKLSYNGLDVAFSPY  99673

99674   NDYWREMRKICVLHLFNSKRAQSFRPIREDEVLEMIKKISQFASASKLTNLSEILISLTS  99853

99854   TIICRVAFSKRYDDEGYERSRFQKLVGEGQAVVGGFYFSDYFPLMGWVDKLTGMIALADK  100033

100034  NFKEFDLFYQEIIDEHLDPNRPEPEKEDITDVLLKLQKNRLFTIDLTFDHIKAVLM (0)  100201

100333  NIFLAGTDTSAATLVWAMTMLMKNPRTMTKAQEELRNLIGKKGFVDEDDLQKLPYLKAIV  100512

100513  KETMRLHPASPLLVPRETLEKCVIDGYEIPPKTLVYVNAWAIGRDPESWENPEEFMPERF  100692

100693  LGTSIDFKGQDYQLIPFGGGRRICPGLNLGAAMVELTLANLLYSFDWEMPAGMNKEDIDI  100872

100873  DVKPGITMHKKNALCLLARIPMH*  100944

 

>CYP71AT8 CAAP02000328.1f, 71% to CAN64422.1

104360  MMILLLILLALPLFLLFLLRNQRRAPLPPGPPGLPFIGNLLQLDKSAPHLYLWRLS  104527

104528  KQYGPLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSLLGQQKLFYNGLGLTFTPY  104707

104708  NDYWREMRKICVLHLFNSKRVQSFRYIREDEVLEMIKKISKFASASKLTNLSEILIPLTS  104887

104888  TIICRVAFGKRYDDEGCERSRFHELLGGIQTMAIAFFFSDYFPLMSWVDKLTGMISRLEK  105067

105068  VSEELDLFCQKIIDEHLDPNKPMPEQEDITDILLRLQKDRSFTVDLTWDHIKAILM (0) 105235

105353  DIFIAGTDTSAATLVWAMTELMKNP  105427

105428  IVMKKAQEEFRNSIGKKGFVDEDDLQMLCYLKALVKETMRLHPAAPLLVPRETREKCVID  105607

105608  GYEIAPKTLVFVNAWAIGRDPEFWENPEEFMPERFLGSSIDFKGQDYQFIPFGGGRRACP  105787

105788  GSLLGVVMVELTLANLLYSFDWEMPAGMNKEDIDTDVKPGITVHKKNALCLLARSHT*  105961

 

>CYP71AT8 AM489206.2a 58% to 71AT1 tomato

1212 MMILLLILLALPLFLLFLLRNQRRAPLPPGPPGLPFIGNLLQLDKSAPHLYLWR 1373

1374 LSKQYGPLMFLRLGFVPTLVVSSARMAKEVMKTHDLEFSGRPSLLGQQKLFYNGLGLTFT 1553

1554 PYNDYWREMRKICVLHLFNSKRVQSFRYIREDEVLEMIKKISKFASASKLTNLSEILIPL 1733

1734 TSTIICRVAFGKRYDDEGCERSRFHELLGGIQTMAIAFFFSDYFPLMSWVDKLTGMISR 1910

1911 LEKVSEELDLFCQKIIDEHLDPNKPMPEQEDITDILLRLQKDRSFTVDLTWDHIKAILM (0) 2087

2206 DIFIAGTDTSAATLVWAMTELMKNPIVMKKAQEEFRNSIGKKGFVDEDDLQMLCYLKA 2378

2379 LVKETMRLHPAAPLLVPRETREKCVIDGYEIAPKTLVFVNAWAIGRDPEFWENPEEFMPE 2558

2559 RFLGSSIDFKGQDYQFIPFGGGRRACPGSLLGVVMVELTLANLLYSFDWEMPAGMNKEDI 2738

2739 DTDVKPGITVHKKNALCLLARSH 2807

 

>CYP71AT9 CAAP02000328.1g, 73% to CAN64422.1

(CAO61031.1) on contig CU459218.1 chr18 scaffold_1