Rat cytochrome P450s
108 rat P450 sequences. This is the beginning of a revision of the rat P450s.
I am currently looking for more members in the 7 gene clusters seen in mouse.
The April 1, 2004 Nature issue on the rat genome had a figure showing 84 P450s
in the rat on a tree diagram. I am looking for these and more.
There are some major nomenclature problems caused by naming the rat genes for
the closest match in the database, usually a mouse gene. This does not work when
there is no orthologous relationship. Many of the names in GenBank will need
to be changed.
The 4F gene cluster appears to be conserved, with all 9 functional genes occurring
in the same order and orientation as in the mouse 4f cluster.
4F5 is the ortholog of 4f16, 4F4 is the ortholog of 4f15, 4F1 is the ortholog
of 4f14 and 4F6 is the ortholog of 4f13. The other new rat genes (4F39, 4F17,
4F37, 4F40 and 4F18) will be named for their orthologs in the mouse.
The pseudogenes are not conserved.
Gene order and orientation (+/-) is:
4F39+, 4F17+, 4F5/4f16+, 4F37+, 4F40+, 4F4/4f15+, 4F1/4f14-, 4F6/4f13-, 4F18+
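The order/orientation notation above is regular enough to read mechanically. The following is a minimal sketch, assuming each entry has the form rat name, optional "/mouse ortholog", then a strand sign; the function name parse_cluster is hypothetical and only for illustration.

```python
import re

# Gene order string copied from the 4F cluster listing above.
ORDER_4F = ("4F39+, 4F17+, 4F5/4f16+, 4F37+, 4F40+, "
            "4F4/4f15+, 4F1/4f14-, 4F6/4f13-, 4F18+")

def parse_cluster(order):
    """Return (rat_gene, mouse_ortholog_or_None, strand) for each entry."""
    entries = []
    for token in order.split(","):
        token = token.strip()
        # Assumed entry shape: NAME[/ortholog] followed by + or -
        match = re.fullmatch(r"([0-9A-Za-z]+)(?:/([0-9A-Za-z]+))?([+-])", token)
        if match is None:
            raise ValueError(f"unrecognised entry: {token!r}")
        entries.append((match.group(1), match.group(2), match.group(3)))
    return entries

genes_4f = parse_cluster(ORDER_4F)
minus_4f = [rat for rat, _, strand in genes_4f if strand == "-"]
print(len(genes_4f), minus_4f)
```

Under these assumptions the listing yields 9 genes, with 4F1 and 4F6 on the minus strand.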
The CYP2ABFGST cluster has 14 full length genes, one complete pseudogene with
a few splice site errors (2B16P) and 9 small pseudogene fragments.
The gene order is:
2S1-, 2B1+, 2B2+, 2B3+, 2B16P+, 2B14P+, 2B21-, 2B12+, 2BNEW+, 2B15+, 2G1+,
2A3+, 2ANEW+, 2A2+, 2F4+, 2T1+
Only 2S1 and 2B21 are oriented opposite to the major orientation of the cluster (+).
2b23 in mouse is also (-), and these two appear to be in orthologous locations, so
the orientation may be preserved. In the mouse, 2a22 is oriented opposite to the
other genes, but it was on a small contig that might be incorrectly oriented.
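The orientation statement for the CYP2ABFGST cluster can be checked directly against the gene order string. This is only a sketch: it assumes the last character of each entry is the strand and that a trailing "P" in a name marks a pseudogene, as in 2B16P and 2B14P.

```python
from collections import Counter

# Gene order string copied from the CYP2ABFGST listing above.
ORDER_2ABFGST = ("2S1-, 2B1+, 2B2+, 2B3+, 2B16P+, 2B14P+, 2B21-, 2B12+, "
                 "2BNEW+, 2B15+, 2G1+, 2A3+, 2ANEW+, 2A2+, 2F4+, 2T1+")

def parse_order(order):
    """Split 'NAME+' / 'NAME-' entries into (name, strand) pairs."""
    tokens = [tok.strip() for tok in order.split(",")]
    return [(tok[:-1], tok[-1]) for tok in tokens]

entries = parse_order(ORDER_2ABFGST)
# The major orientation is taken to be the most common strand in the list.
major = Counter(strand for _, strand in entries).most_common(1)[0][0]
opposite = [name for name, strand in entries if strand != major]
# Assumption: names ending in "P" are pseudogenes, the rest full length genes.
full_length = [name for name, _ in entries if not name.endswith("P")]
print(major, opposite, len(full_length))
```

With these assumptions the major orientation is (+), the opposite-strand genes are 2S1 and 2B21, and 14 entries are full length genes.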
The rat has three genes between 2B21 and 2G1. The mouse has 2b19 in this
location, so the rat may have expanded the 2b19 gene to three genes.
If we assume this is correct, there is a reasonable orthologous relationship
between genes in the rat and mouse clusters:
2S1/2s1, 2B1/2b10, 2B2/2b13, 2B3/2b9, 2B21/2b23, 2B12/2b19, 2BNEW/2b19,
2B15/2b19, 2G1/2g1, 2A3/2a5, 2ANEW/2a22, 2A2/2a12, 2F4/2f2, 2T1/2t4.
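The proposed rat/mouse pairs can be grouped by mouse gene to make the suggested 2b19 expansion explicit. A minimal sketch, assuming every pair is written rat/mouse and separated by commas; group_by_mouse is a hypothetical helper name.

```python
from collections import defaultdict

# Ortholog pairs copied from the listing above (rat/mouse).
PAIRS = ("2S1/2s1, 2B1/2b10, 2B2/2b13, 2B3/2b9, 2B21/2b23, 2B12/2b19, "
         "2BNEW/2b19, 2B15/2b19, 2G1/2g1, 2A3/2a5, 2ANEW/2a22, 2A2/2a12, "
         "2F4/2f2, 2T1/2t4")

def group_by_mouse(pairs):
    """Map each mouse gene to the list of rat genes paired with it."""
    groups = defaultdict(list)
    for pair in pairs.split(","):
        rat, mouse = pair.strip().split("/")
        groups[mouse].append(rat)
    return dict(groups)

groups = group_by_mouse(PAIRS)
# Mouse genes paired with more than one rat gene suggest an expansion in rat.
expanded = {mouse: rats for mouse, rats in groups.items() if len(rats) > 1}
print(expanded)
```

Only 2b19 maps to more than one rat gene (2B12, 2BNEW and 2B15); all other pairs are one to one.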
Last modified Oct. 12, 2004
D. Nelson
>CYP1A1
X00469
MPSVYGFPAFTSATELLLAVTTFCLGFWVVRVTRTWVPKGLKSP
PGPWGLPFMGHVLTLGKNPHLSLTKLSQQYGDVLQIRIGSTPVVVLSGLNTIKQALVK
QGDDFKGRPDLYSFTLIANGQSMTFNPDSGPLWAARRRLAQNALKSFSIASDPTLASS
CYLEEHVSKEAEYLISKFQKLMAEVGHFDPFKYLVVSVANVICAICFGRRYDHDDQEL
LSIVNLSNEFGEVTGSGYPADFIPILRYLPNSSLDAFKDLNKKFYSFMKKLIKEHYRT
FEKGHIRDITDSLIEHCQDRRLDENANVQLSDDKVITIVFDLFGAGFDTITTAISWSL
MYLVTNPRIQRKIQEELDTVIGRDRQPRLSDRPQLPYLEAFILETFRHSSFVPFTIPH
STIRDTSLNGFYIPKGHCVFVNQWQVNHDQELWGDPNEFRPERFLTSSGTLDKHLSEK
VILFGLGKRKCIGETIGRLEVFLFLAILLQQMEFNVSPGEKVDMTPAYGLTLKHARCE
HFQVQMRSSGPQHLQA
>CYP1A2
K02422
MAFSQYISLAPELLLATAIFCLVFWVLRGTRTQVPKGLKSPPGP
WGLPFIGHMLTLGKNPHLSLTKLSQQYGDVLQIRIGSTPVVVLSGLNTIKQALVKQGD
DFKGRPDLYSFTLITNGKSMTFNPDSGPVWAARRRLAQDALKSFSIASDPTSVSSCYL
EEHVSKEANHLISKFQKLMAEVGHFEPVNQVVESVANVIGAMCFGKNFPRKSEEMLNL
VKSSKDFVENVTSGNAVDFFPVLRYLPNPALKRFKNFNDNFVLSLQKTVQEHYQDFNK
NSIQDITGALFKHSENYKDNGGLIPQEKIVNIVNDIFGAGFETVTTAIFWSILLLVTE
PKVQRKIHEELDTVIGRDRQPRLSDRPQLPYLEAFILEIYRYTSFVPFTIPHSTTRDT
SLNGFHIPKECCIFINQWQVNHDEKQWKDPFVFRPERFLTNDNTAIDKTLSEKVMLFG
LGKRRCIGEIPAKWEVFLFLAILLHQLEFTVPPGVKVDLTPSYGLTMKPRTCEHVQAW
PRFSK
>CYP1B1
U09540
MATSLSADSPQQLSSLSTQQTILLLLVSVLAIVHLGQWLLRQWR
RKPWSSPPGPFPWPLIGNAASVGRASHLYFARLARRYGDVFQIRLGSCPVVVLNGESA
IHQALVQQGGVFADRPPFASFRVVSGGRSLAFGHYSERWKERRRAAYGTMRAFSTRHP
RSRGLLEGHALGEARELVAVLVRRCAGGACLDPTQPIIVAVANVMSAVCFGCRYNHDD
AEFLELLSHNEEFGRTVGAGSLVDVMPWLQLFPNPVRTIFREFEQINRNFSNFVLDKF
LRHRESLVPGAAPRDMMDAFILSAEKKATGDPGDSPSGLDLEDVPATITDIFGASQDT
LSTALLWLLILFTRYPDVQARVQAELDQVVGRDRLPCMSDQPNLPYVMAFLYESMRFT
SFLPVTLPHATTANTFVLGYYIPKNTVVFVNQWSVNHDPAKWSNPEDFDPARFLDKDG
FINKALASSVMIFSVGKRRCIGEELSKTLLFLFISILAHQCNFKANQNEPSNMSFSYG
LSIKPKSFKIHVSLRESMKLLDSAVEKLQAEEACQ
>CYP2A1-de2b
exon 2 pseudogene Chr1 (-) only 240bp from Cyp2a22 ortholog start Met
82084718
YNAVKEALVDQAEGFSGQGEQA 82084653
>CYP2A1
NP_036824 88% to 2A2 chr1 (+) Cyp2a22 ortholog
82084958
MLDTGLLLVVILASLSVMLLVSLWQQKIRGRLPPGPTPLPFIGNYLQLNTKDVYSSITQ 82085134
82085434
LSERYGPVFTIHLGPRRVVVLYGYDAVKEALVDQAEEFSGRGEQATYNTLFKGY 82085595
82088031
GVAFSSGERAKQLRRLSIATLRDFGVGKRGVEERILEEAGYLIKMLQGTC 82088180
82088398
GAPIDPTIYLSKTVSNVISSIVFGERFDYEDTEFLSLLQMMGQMNRFAASPTG 82088556
82089778
QLYDMFHSVMKYLPGPQQQIIKVTQKLEDFMIEKVRQNHSTLDPNSPRNFIDSFLIRMQE 82089957
82093158
EKNGNSEFHMKNLVMTTLSLFFAGSETVSSTLRYGFLLLMKHPDVE 82093295
82093737
AKVHEEIEQVIGRNRQPQYEDHMKMPYTQAVINEIQRFSNLAPLGIPRRIIKNTTFRGFFLPK 82093925
82094440
ATDVFPILGSLMTDPKFFPSPKDFDPQNFLDDKGQLKKNAAFLPFST 82094580
82098022
GKRFCLGDGLAKMELFLLLTTILQNFRFKFPMKLEDINESPKPLGFTRIIPKYTMSFMPI 82098201
>CYP2A2-de2b
exon 2 pseudogene Chr1 (-)
82115528
LKPHWVVVLYEWDAVKEALGDQAEELSG*GEQANL 82115445
>CYP2A2
J04187 Cyp2a12 ortholog
82117349
MLDTGLLLVVILASLSVMFLVSLWQQKIRERLPPGPTPLPFIGNYLQLNMKDVYSSITQ 82117525
82117991
LSERYGPVFTIHLGPRRIVVLYGYDAVKEALVDQAEEFSGRGELPTFNILFKGY 82118152
82123228
GFSLSNVEQAKRIRRFTIATLRDFGVGKRDVQECILEEAGYLIKTLQGTC 82123377
82123595
GAPIDPSIYLSKTVSNVINSIVFGNRFDYEDKEFLSLLEMIDEMNIFAASATG 82123753
82124978
QLYDMFHSVMKYLPGPQQQIIKVTQKLEDFMIEKVRQNHSTLDPNSPRNFIDSFLIRMQE 82125157
82139054
EKYVNSEFHMNNLVMSSLGLLFAGTGSVSSTLYHGFLLLMKHPDVE 82139191
82139607
AKVHEEIERVIGRNRQPQYEDHMKMPYTQAVINEIQRFSNLAPLGIPRRIIKNTTFRGFFLPK 82139795
82140311
GTDVFPIIGSLMTEPKFFPNHKDFNPQHFLDDKGQLKKNAAFLPFSI 82140451
82141451
GKRFCLGDSLAKMELFLLLTTILQNFRFKFPMNLEDINEYPSPIGFTRIIPNYTMSFMPI 82141630
>CYP2A3
J02852 NM_012542 exon 4 in a seq gap in genome seq chr1 (+) Cyp2a5 ortholog
82023007
MLASGLLLVASVAFLSVLVLMSVWKQRKLSGKLPPGPTPLPFIGNYLQLNTEKMYSSLMK 82023186
82023453
ISQRYGPVFTIHLGPRRVVVLCGQEAVKEALVDQAEEFSGRGEQATFDWLFKGY 82023614
82024296
GVAFSSGERAKQLRRFSIATLRDFGVGKRGIEERIQEEAGFLIESFRKTN 82024445
GALIDPTFYLSRTVSNVISSIVFGDRFDYEDKEFLSLLRMMLGSFQFTATSTG
82026488
QLYEMFSSVMKHLPGPQQQAFKELQGLEDFITKKVEQNQRTLDPNSPRDFIDSFLIRMLE 82026667
82028068
EKKNPNTEFYMKNLVLTTLNLFFAGTETVSTTLRYGFLLLMKHPDIE 82028208
82028659
AKVHEEIDRVIGRNRQAKYEDRMKMPYTEAVIHEIQRFADMIPMGLARRVTKDTKFREFLLPK 82028847
82029417
GTEVFPMLGSVLKDPKFFSNPNDFNPKHFLDDKGQFKKSDAFVPFSI 82029557
82030741
GKRYCFGEGLARMELFLFLTNIMQNFCFKSPQAPQDIDVSPRLVGFATIPPNYTMSFLSR 82030920
>CYP2A3-de1b
exon 1 pseudogene Chr1 (+)
82052140
MLGSRLLLVAVLSCLCVMVFMPVWQQQYRDTIPPG 82052244
>CYP2B3-se1[9]
exon 9 100% match to 2B3 chr1 (+)
81263180
GKRMCLGEGIARSELFLFFTTILQNYSVSSPVDPNTIDMTPKESGLAKVAPVYKICFVAR* 81263362
>CYP2B3-se2[1]
duplicate exon 1 100% match Chr1 (-)
81308557
MDTSVLLLLAVLLSFLLFLVRGHAKVHGHLPPGPRPLPLLGNLLQMDRGGFRKSFIQ 81308387
>CYP2B1
J00719 Rn.91353 chr1 (+) 1 aa diff to CYP2B1
81344956
MEPSILLLLALLVGFLLLLVRGHPKSRGNFPPGPRPLPLLGNLLQLDRGGLLNSFMQ 81345126
81357886
LREKYGDVFTVHLGPRPVVMLCGTDTIKEALVGQAEDFSGRGTIAVIEPIFKEY 81358047
81358200
GVIFANGERWKALRRFSLATMRDFGMGKRSVEERIQEEAQCLVEELRKSQ 81358349
81360925
GAPLDPTFLFQCITANIICSIVFGERFDYTDRQFLRLLELFYRTFSLLSSFSS 81361083
81361768
QVFEFFSGFLKYFPGAHRQISKNLQEILDYIGHIVEKHRATLDPSAPRDFIDTYLLRMEK 81361947
81362389
EKSNHHTVFHHENLMISLLSLFFAGTETSSTTLRYGFLLMLKYPHVA (1) 81362529
81363958
EKVQKEIDQVIGSHRLPTLDDRSKMPYTDAVIHEIQRFSDLVPIGVPHRVTKDTMFRGYLLPK 81364146
81364315
NTEVYPILSSALHDPQYFDHPDSFNPEHFLDANGALKKSEAFMPFST 81364455
81368014
GKRICLGEGIARNELFLFFTTILQNFSVSSHLAPKDIDLTPKESGIGKIPPTYQICFSAR 81368193
>CYP2B2
J00720 Rn.91353 chr1 (+) 4aa diffs with CYP2B2 14aa diffs to CYP2B1
81423536
MEPSILLLLALLVGFLLLLVRGHPKSRGNFPPGPRPLPLLGNLLQLDRGGLLNSFMQ (0) 81423706
81426789
FREKYGDVFTVHLGPRPVVMLCGTDTIKEALVGQAEDFSGRGTIAVIEPIFKEY (1) 81426950
81427104
GVFFANGERWKALRRFSLATMRDFGMGKRSVEERIQEEAQCLVEELRKSQ (1) 81427253
81429793
GAPLDPTFLFQCITANIICSIVFGERFDYTDRQFLRLLELFYRTFSLLSSFSSQ 81429954
81430659
VFEFFSGFLKYFPGAHRQISKNLQEILDYIGHIVEKHRATLDPSAPRDFIDTYLLRMEK 81430835
81431274
EKSNHHTEFHHENLMISLLSLFFAGTETGSTTLRYGFLLMLKYPHVT (1) 81431414
81432829
EKVQKEIDQVIGSHRPPSLDDRTKMPYTDAVIHEIQRFADLAPIGLPHRVTKDTMFRGYLLPK 81433017
81433190
NTEVYPILSSALHDPQYFDHPDTFNPEHFLDADGTLKKSEAFMPFST (1) 81433330
81436959
GKRICLGEGIARNELFLFFTTILQNFSVSSHLAPKDIDLTPKESGIAKIPPTYQICFSAR 81437138
>CYP2B3
M20406 chr1 (+) exon 9 not adjacent to this gene. Found at 81263180-81263359
81486567
MDTSVLLLLAVLLSFLLFLVRGHAKVHGHLPPGPRPLPLLGNLLQMDRGGFRKSFIQ (0) 81486737
81514647
LQEKHGDVFTVYFGPRPVVMLCGTQTIREALVDHAEAFSGRGIIAVLQPIMQEY (1) 81514808
81514950
GVSFVNEERWKILRRLFVATMRDFGIGKQSVEDQIKEEAKCLVEELKNHQ (1) 81515099
81516395
GVSLDPTFLFQCVTGNIICSIVFGERFDYRDRQFLRLLDLLYRTFSLISSFSSQ (0) 81516556
81530756
MFEVYSDFLKYFPGVHREIYKNLKEVLDYIDHSVENHRATLDPNAPRDFIDTFLLHMEK (0) 81530932
81531383
EKLNHYTEFHHWNLMISVLFLFLAGTESTSNTLCYGFLLMLKYPHVA (1) 81531523
81536877
EKVQKEIDQVIGSQRVPTLDDRSKMPYTEAVIHEIQRFSDVSPMGLPCRITKDTLFRGYLLPK (0) 81537065
81537233
NTEVYFILSSALHDPQYFEQPDTFNPEHFLDANGALKKCEAFMPFSI (1) 81537373
GKRMCLGEGIARSELFLFFTTILQNYSVSSPVDPNTIDMTPKESGLAKVAPVYKICFVAR
>CYP2B32P
pseudogene partial Chr1 (+)
81806528
VLLLLTLIVGFLLFLVSQSQPKTHGHLPPGLCPLPFLGNLLQIKRRGLLNSFMQ 81806689
81808348
AQEKYGDVLTVHPGPRPVVRLCGTDTIREFLFDQAGTFSGQGTVAVLNPVVHGY 81808509
exon 3 missing
81809871
GVPLIPTSFFQRIAANIICSIVFGECFDYKDHQFLHLLDLIYQTFALMAPCPARS 81810035
81810759
VFQLFSGFLKYFPGVHKQISKNLQEILNYIGHSVEKHMATLDPSAPRDFINTYLLHMEN 81810935
81811666
EKSNHHTEFHHQTSVLSHFFDGTETTSTTLCCSFLIMLKYHHVK 81811797
>CYP2B12-de9b
exon 9 Chr1 (-)
81829155
GKFICLGEGIG*NESFIFFTGILQNLSLASPVAPENIDLTPIKSGAGKIPSTYQIHILSR 81829012
>CYP2B12
X63545, S48369, NM_017156 Rn.108913 chr1 (+) 87% to 2b19 possible ortholog
81858238
MEFGVLLLLTLTVGFLLFLVSQSQPKTHGHLPPGPRPLPFLGNLLQMNRRGFLNSFMQ 81858411
81860089
LQEKYGDVFTVHLGPRPVVILCGTDTIREALVDQAEAFSGRGTVAVLHPVVQGY 81860250
81860393
GVIFATGERWKTLRRFSLVTMKEFGMGKRSVDERIKEEAQCLVEELKKYK 81860542
81860739
GAPLNPTFLFQSIAANTICSIVFGERFDYKDHQFLHLLDLVYKTSVLMGSLSSQ 81860900
81861646
VFELYSGFLKYFPGAHKQIFKNLQEMLNYIGHIVEKHRATLDPSAPRDFIDTYLLRMEK 81861822
81862554
EKSNHHTEFNHQNLVISVLSLFFAGTETTSTTLRCTFLIMLKYPHVA 81862694
81864745
EKVQKEIDQVIGSHRLPTPDDRTKMPYTDAVIHEIQRFADLTPIGLPHRVTKDTVFRGYLLPK 81864933
81865086
NTEVYPILSSALHDPRYFEQPDTFNPEHFLDANGALKKSEAFLPFST 81865226
81868929
GKRICLGEGIARNELFIFFTAILQNFTLASPVAPEDIDLTPINIGVGKIPSPYQINFLSR 81869108
>CYP2B14P
U33540 exon 1 add Chr1 (+) exons 7,8,9 72% to 2B21 to this pseudogene
81706300
MKPNVLLLLAILLSFLLFLVRGHAKVHGHLPPGPRPLPILGNLLQMDRGGLLQSF 81706464
81728276
EKVQKEIGEVTGSHWFPILYSSKIPNTEAVIPEIQR 81728383
81728385
FSDLSSVVLPQRVTKDTFFQGFLLHK 81728462
81728634
NTEVYPILSSVLHDPQ 81728681
81728681
VLEYPVTFNPEHFLDANGALKKNEAFTPFSR 81728773
>CYP2B21
AF159245 Chr1 (-)
81765226
MDPSVLLLFALFTGFLLLLIRGQGNGYGHLPPGPCPLPLLGNVLQMDRRGLLKSFIQ 81765056
81759108
LRDKYGDVVTVHLGPRPIVMLYGTETIREALVDHAEAFSGRGTVAVVQPIIQDY 81758947
81758804
GMIFANGERWKILRRFSLATMRDFGMGKRSVEERIKEEAQCLVEELKKYK 81758655
81757889
GAPLDPTFHLQCITANIICSIVFGERFDYTDHQFLHLLDLFYEILSLVSSFSSQ 81757728
81749057
VFELFPGFLKYFPGTHRHISKNIEEILNFIGHCVEKHRATLDPSTPRDFIDTYLLRMEK 81748881
81748412
EKLNHHTEFHHQNLMMSVLSLFFAGTETSSTTLRYGFLLMLKYPHVA 81748272
81747109
EKVQKEIDQVIGSHRVPTLDDRIKMPYTDAVIHEIQRFSDLVPIGLPHRVTKDTLFRGYLLPK 81746921
81746748
NIEVYPILSSALHDPQYFEHPDTFNPEHFLDANGALKKNEAFLPFST 81746608
81736831
GKRVCLGEGIARNELFLFFTTILQNFSVSSPVSPKDIDLTPKESGFAKIPPTYQICFLSRQLG 81736643
>CYP2B31
86% to 2b19 possible ortholog
81918041
MELGVFLLLTFTVGFLLLLASQNRPKTHGHLPPGPRPLPFLGNLLQMNRRGLLRSFMQ 81918214
81919826
LQEKYGDVFTVHLGPRPVVILCGTDTMREALVDQAEAFSGRGTVAVLHPVVQGY 81919987
81920130
GVIFANGERWKILRRFSLVTMRNFGMGKRSVEERIKEEAQCLVEELKKYK 81920279
81922129
GALLNPTSIFQSIAANIICSIVFGERFDYKDHQFLRLLDLIYQTFSLMGSLSSQ 81922290
81923031
VFELFSGFLKYFPGVHKQISKNLQEILNYIDHSVEKHRATLDPNTPRDFIDTYLLHMEK 81923207
81923977
EKSNHHTEFHHQNLVISVLSLFFAGTETTSTTLRYSFLIMLKYPHVA 81924117
81926113
EKVQKEIDQVISSHRLPTLDDRIKMPYTDAVIHEIQRFADLAPIGLPHRVTKDTMFRGYLLPK 81926301
81926476
NTEVYPILSSALHDPRYFDHPDTFNPEHFLDANGTLKKSEAFLPFST 81926616
81930286
GKRTCLGEGIARNELFIFFTALLQNFSLASPVAPEDIDLTPINSGAGKIPSPYQINFLSR 81930465
>CYP2B15
D17343 to D17349 86% to 2b19 exons 2-4 in a seq gap in the genome seq Chr1 (+)
81945068
MELGVLLLLTFTVGFLLLLASQNRPKTHGHLPPGPRPLPFLGNLLQMNRRGLLRSFMQ 81945241
LQEKYGDVFTVHLGPRPVVILCGTDTIREALVDQAEAFSGRGTVAVLHPVVQGY
GVIFANGERWKILRRFSLVTMRNFGMGKRSVEERIKEEAQCLVEELKKYK
ALLNPTSIFQSIAANIICSIVFGERFDYKDHQFLRLLDLIYQTFSLMGSLSSQ
81950148
VFELFSGFLKYFPGVHKQISKNLQEILNYIDHSVEKHRATLDPNTPRDFINTYLLRMEK 81950324
81951073
EKSNHHTEFHHQNLVISVLSLFFTGTETTSTTLRYSFLIMLKYPHVA 81951213
81953132
EKVQKEIDQVIGSHRLPTLDDRTKMPYTDAVIHEIQRFADLIPIGLPHRVTNDTMFLGYLLPK 81953320
81953491
NTEVYPILSSALHDPRYFDHPDTFNPEHFLDVNGTLKKSEAFLPFST 81953631
81957185
GKRICLGEGIAQNELFIFFTAILQNFSLASPVAPEDIDLSPINSGISKIPSPYQIHFLSRCVG 81957373
>CYP2B16P
U33541 to U33546 bad boundary introns 1,5,7 chr1 (+)
81633949
MEPSVLLLLAVLLSFLLLLVRGHAKIHGRLPPGPCPVPLLGNLLQMDRRGLLKSFIQ (?) 81634119
81641847
LQEKYGDVFTVHLGLRPVVVLCGTQTIREALVDHAEAFSGRGTIAGLEPVFQDY (1) 81642008
81642149
GIFFSSGEQWKTLRRFSMATMRDFGMRKKSVEERIKEESQCLVEELKKYQ (1) 81642298
81642886
GAPLDPTFLFQCITSNIICSIVFGECFDYTDHQFLHLLDLMYQTFSLLSSIFSQ (0) 81643047
81645234
VFELFPGVLKYFPGAHRQISRNLHEILDFIGQSVEKHRATLDPNAPRDFIYTYLLHMEK (?) 81645410
81645864
QKSNHYTEFHHWNLLSSVLSLFFAGTETSSTTLRYGFLIMLKYPHIT (1) 81646004
81654168
EKVQKEIDCVIGSHRLPTLDDRSKMPYTEAVIHEIQRFSDLAPIGTPHRVIKDTIFRGYLLPK (?) 81654356
81654524
QNTEVFPILSSVLHDPQYFEQPDIFNLQHFLDANGALKIIEAFLPFST (1) 81654667
81659608
GKRICLGESIARNELFLFFTTILQNFSVSSPVAPKDIDLTPKESGIGRIPQVYQICFLAH* 81659781
>CYP2C6v1_v1-de1b2b3b4b5b
upstream pseudogene 96% identical to seq c
93% identical to seq upstream of CYP2C6v2 allele (temp name = CYP2Cnewb)
243935799
MDLVMLLVLTLSCLIFLSIWRQSSGRGKLP 243935888
243935888
SGPTPLPIIGNFFHLDLKNITQSLTN 243935965
243937699
FSKVNGSVFTLYFGMKPIVILHGYEAIKEGLIDHGEEFTERGSFPVAEKINKGL 243937860
243938035
GIAFSHGNRWKEIRRFTLMTLQNLGMGKKSIEDRVQEESRCLV 243938163
243939079
GSPCDPTFILGCAPCNVICSIIFQNCFDYKDQDFLSLVEKLNENIKIVSSPWI* 243939231
243940291
FCSSFPVFIDYCPGSHMTLAKNVYHTRNYILKKIKEHQESLDVTNPHDFIDYYLINWKQ 243940467
>CYP2C6_v1
M13711 two aa changes to match many ESTs (lower case mi) due to frameshift
97% to 2C77 and 2C6v2
243955584
MDLVMLLVLTLTCLILLSIWRQSSGRGKLPPGPIPLPIIGNIFQLNVKNITQSLTS 243955751
243964779
FSKVYGPVFTLYFGTKPTVILHGYEAVKEALIDHGEEFAERGSFPVAEKINKD 243964937
243965112
LGIVFSHGNRWKEIRRFTLTTLRNLGMGKRNIEDRVQEEARCLVEELRKTN 243965264
243966104
GSPCDPTFILGCAPCNVICSIIFQNRFDYKDQDFLNLMEKLNENMKILSSPWTQ 243966265
243967336
FCSFFPVLIDYCPGSHTTLAKNVYHIRNYLLKKIKEHQESLDVTNPRDFIDYYLIKWKQ 243967512
243984646
ENHNPHSEFTLENLSITVTDLFGAGTETTSTTLRYALLLLLKCPEVT 243984786
243989157
AKVQEEIDRVVGKHRSPCMQDRSRMPYTDAmiHEVQRFIDLIPTNLPHAVTCDIKFRNYLIPK 243989345
243990948
GTTIITSLSSVLHDSKEFPDPEIFDPGHFLDGNGKFKKSDYFMPFSA 243991088
243992245
GKRMCAGEGLARMELFLFLTTILQNFKLKSVLHPKDIDTTPVFNGFASLPPFYELCFIPL 243992424
>CYP2C6P
M18336 J03509 M18774 an alternate splice version of 2C6
exon 8 is skipped and replaced by a cryptic exon just past the true exon 8
The GT boundary of the true exon 8 are the first two nucleotides of CYP2C6_v3
Cryptic exon 8
MDLVMLLVLTLTCLILLSIWRQSSGRGKLPPGPIPLPIIGNIFQLNVKNITQSLTSFSKV 200
201
YGPVFTLYFGTKPTVILHGYEAVKEALIDHGEEFAERGSFPVAEKINKDLGIVFSHGNRW 380
381
KEIRRFTLTTLRNLGMGKRNIEDRVQEEARCLVEELRKTNGSPCDPTFILGCAPCNVICS 560
561
IIFQNRFDYKDQDFLNLMEKLNENMKILSSPWTQFCSFFPVLIDYCPGSHTTLAKNVYHI 740
741
RNYLLKKIKEHQESLDVTNPRDFIDYYLIKWKQENHNPHSEFTLENLSITVTDLFGAGTE 920
921
TTSTTLRYALLLLLKCPEVTAKVQEEIDRVVGKHRSPCMQDRSRMPYTDAMIHEVQRFID 1100
1101
LIPTNLPHAVTCDIKFRNYLIPK 1169
CYP2C6_v2
CK224594.1 CK224593.1 note: the _v2 means alternative splice version 2
CYP2C6_v3
CK224595.1 CK224596.1 (3 nuc shorter at the joint uses the second AG)
Beginning of exon 7 AGCTAAAG TCCAGGAAGA GATTGATCGT 243989183
GTGGTTGGCA AACATCGCAG CCCTTGCATG CAGGACAGGA GCCGCATGCC CTACACAGAT 243989243
GCCATGATTC ATGAGGTCCA GAGGTTCATT GACCTCATTC CTACCAACCT GCCACATGCG 243989303
GTGACCTGTG ACATTAAGTT CAGGAACTAC CTAATACCCA AG GT end of exon 7
Beginning of cryptic exon out of frame agcaggtaa tagaaactca 243991103
tttccatggt tccagtgaca tgcagaaccg tggggactta gagtgtgact ctacatgtgc 243991163
tgatagcttg catctgcatg ataaggagca taattttcat tgtgtatgca ctgtcctgga 243991223
tatgaccacc ttctttatca gggt end of cryptic exon
normal exon 9
1328
GKRMCAGEGLARMELFLFLTTILQNFKLKSVLHPKDIDTTPVFNGFASLPPFYELCFIPL
>CYP2C6v2-de1b2b3b4b4c5b
upstream pseudogene
EST CK224599.1 = 100% match (with 4 frameshifts) so this is a real gene
clone_lib="RALIUNN03 Sprague-Dawley rat female liver
The CYP2C6_v1 sequence is also seen in this same mRNA library
This GNOMON prediction adds two upstream exons that do not belong to this gene
58596732 MDLVMLLVLTLSCLILLSIWRQSSGRGKHP
58596643 exon 1 frameshift
58596643 SGPTPLPIIGNFFHLDLNNITQSLTS (0)
58596566 exon 1
58594823 FSKVNGSVFTLYFGMKLIVILHGYAATKEGLIDHGEEFTKRGSFPVAEKINKGL (1) exon 2 58594662
58594487 GIAFSHGNRWKEIRRFTLMTLQNLGMGKESIEDRVQEETQCLV*ELRKTN
(1) exon 3 58594338
58593451
GSPCDPTFILGCAPCNVICSIIFQNCFDYKDQDFLSLMEKLNENIKIVSSPW 58593296
58592013
GSPCDPTFILGCAPCNVICSIIFQNCFDYKDQDFLSLMEKLNENIKIVSSPW 58591858
58590797
FCSSFPVFIDYCLGSHMTLA 58590738
58590736
NVYHTRNYILKKIKEHQESLDVTNPHDFIDYDLIKWKQ 58590620
AVSIKRNS
>CYP2C6v2
allele not in figure, 13 aa diffs to CYP2C6_v1 XM_215255 NW_047916
we are assigning this allele status but it may be a separate gene
(temp name = CYP2Cnewb)
58578624
MDLVMLLVLTLTCLILLSIWRQSSGRGKLPPGPIPLPIIGNIFQLNVKNITQSLTS (0) 58578457
58576741
FSKVYGPVFTLYFGLKPTVILHGYEAVKEALIDHGEEFAERGSFPVVEKINKDL (1) 58576583
58576405
GIAFSHGNRWKEIRRFTLTTLRNLGMGKRNIEDHVQEEARCLVEELRKTN 58576256
58575415
GSPCDPTFILGCAPCNVICSIIFQNRFDYKDQDFLNLMEKLNENMKVLSSPWTQ 58575254
58574189
FCSFFPVLIDYCPGSHTTLAKNIYYIRNYLLKKIKEHQESLDVTNPRDFIDYYLIKWKQ 58574013
58554666
ESHNPHLEFTLENLSVTVTDLFGAGTETTSTTLRYALLLLLKYPEVT 58554526
58534931
AKVQEEIDRVVGKHRSPCMQDRSRMPYTDAMIHEVQRFIDLIPTNLPHAVTCDIKFRNYLIPK 58534743
58533131
GTTIITSLSSVLHDSKEFPDPEIFDPGHFLDGNGKFKKSDYFMPFSA 58532991
58531833
GKRMCAGEGLARMELFLFLTTILQNFKLKSVLQPKDIDTTPVFHGFASLPPFYELCFIPL 58531654
>CYP2C7
M18335 exons 1,2,3 and 6 are in sequence gaps 93% to 2C7 variant and 2C81
the yellow labels are from a random Chr1 piece that is similar to the CYP2C7 N-term
differences with the published 2C7 sequence M18335 are in cyan
MDLVTFLVLTLSSLILLSLWRQSSRRRKLPPGPTPLPIIGNFLQIDVKNISQSLTK
FSKTYGPVFTLYLGSQPTVILHGYEAIKEALIDNGEKFSGRGSYPMNENVTKGF
GIVFSNGNRWKEMRRFTIMNFRNLGIGKRNIEDRVQEEAQCLVEELRKTK
243849546
GSPCDPSLILNCAPCNVICSITFQNYFDYKDKEMLTFMEKVNENLKIMSSPWMQ
243849385
243847566
VCNSFPSLIDYFPGTHHKIAKNINYMKSYLLKKIEEHQESLDVTNPRDFVDYYLIKQKQ 243847390
243829444
GSPCDPSLILNCAPCNVICSITFQNHFDYKDKEMLTFMEKVNENLKIMSSPWMQ 243829283
this duplicate exon 4 is not in the right sequence order
ANNIEQSEYSHENLTCSIMDLIGAGTETMSTTLRYALLLLMKYPHVT
243803857
AKVQEEIDRVIGRHRSPCMQDRKHMPYTDAMIHEVQRFINFVPTNLPHAVTCDIKFRNYLIPK 243803669
243800623
GTKVLTSLTSVLHDSKEFPNPEMFDPGHFLDENGNFKKSDYFLPFSA 243800483
243799465
GKRACVGEGLARMQLFLFLTTILQNFNLKSLVHPKDIDTMPVLNGFASLPPTYQLCFIPS 243799286
>CYP2C7 variant unmapped 93% to 2C7
88% to 2C81
3463873
MDLVTFLVLTLSSLILLSLWRQSSRRRKLPPGPTPLPIIGNFLQIDVKNISQSLTK 3464040
3479907
FSKTYGPVFTLYLGSQPTVILHGYEAIKEALIDNGEKFSGRGSYPMIENVTKGF
3480068
3480234 GIVFSNGNRWKEMRRFTIMTFRNLGIGKRNIEDRVQEEAQCLVEELRKTK 3480383
3489182 GSPCDPSLILNCAPCNVICSITFQSHFDYKDKEMLTFMEKVNENLKIMSSPWMQ 3489343
3491162 VCNSFPSLVDYFPGTHHKIAKNINYMKSYLLKKIEEHQESLDVTNPRDFVDYYLIKQKQ 3491338
3505354
ANNIEQSEYSHENLTCSIMDLIGAGTETMSTTLRYALLLLMKYPHVT 3505494
3406504 AKVQEEIDRVVGKHRSPCMQDRSRMPYTDAMIHEVQRFIDLIPTNLPHAVTCDIKFRNYLIPK
3406692
3408304 GTTIITSLSSVLHDSKEFPDPEIFDPGHFLDGNGKFKKSDYFMPFSA 3408444
3409602 GKRMCAGEGLARMELFLFLTTILQNFKLKSVLQPKDIDTTPVFPGFASLPPFYELCFIPS 3409778
New frags on the plus strand between 2C7 and 2C6
>CYP2C79-se1[9]
frag q Exon 9 100% to 2C79
243885148
GKRICVGEGLARTELFLFLTTILQNFNLKSPVDLKELDTNPVANGFVSVPPKFQICFIPI* 243885330
>CYP2C-se6[9]
frag p exon 9 100% to CYP2C82P-de9b
243895387
GKWICVREDLAQMTLFLFCPTILKNFNLKSQVNPKEL 243895497
>seq upstream of 2C11
>CYP26A1
AF439720, NM_130408 Chr1 1Mb upstream of CYP2C cluster
242138769
MGLPALLASALCTFVLPLLLFLAALKLWDLYCVSSRDRSCALPLPPG
TMGFPFFGETLQMVLQ (0) 242138581
242138389
RRKFLQMKRRKYGFIYKTHLFGRPTVRVMGADNVRRILL
GEHRLVSVHWPASVRTILGAGCLSNLHDSSHKQRKK (0) 242138165
242137906
VIMQAFNREALQCYVPVIAEEVSGCLEQWLSCGERGLLVYPEV
KRLMFRIAMRILLGCEPGPAGGGEDEQQLVEAFEEMTRNLFSLPIDVPFSGLYR
(0) 242137616
242137537
GVKPRNLIHARIEENIRAKIRRLQAAERNAGCKDALQLLIEHSWERGERLDMQ
(0) 242137379
242136717
ALKQSSTELLFGGHETTASAATSLITYLGLYPHVLQKVREEIKSK (0) 242136583
242136000
GLLCKSHHEDKLDMETLEQLKYIGCVIKETLRLNPPVPGGFRVALKTFELN (0) 242135848
242135595
GYQIPKGWNVIYSICDTHDVADSFTNKEEFNPDRFTSLHPEDTSRFSFIPFGGGLRSCRSKEFAKI
LLKIFTVELARRCDWQLLNGPPTMKTSPTVYPVDNLPARFTHFQGDI* 242135254
>CYP26C1
XM_217935 94% TO 26C1 MOUSE Chr1 1Mb upstream of CYP2C cluster
242151281
MFSWGLSCLSMLGAAGTALLCAGLLLGLAQQLWTLRWTLSRDWASTLPLPKG
SMGWPFFGETLHWLVQ (0) 242151079
242150553
GSRFHSSRRERYGTVFKTHLLGRPVIRVSGAENVRTILLGEHRL (0) 242150422
242149883
VLARVFSRPALEQFVPRLQEALRREVRSWCAAQRPVAVYQAAKALTFRMAAR
ILLGLQLDEARCTELAQTFERLVENLFSLPLDVPFSGLRK
(0) 242149608
242148160
GIRARDQLYQHLDEVIAEKLREELTAEPGDALHLIINSARELGRELSVQELK (0) 242148005
242146368
ELAVELLFAAFFTTASASTSLILLLLQHPAAIAKIQQELSAQGLGSPCSCAPRASGSRP
DCSCEPDLSLAVLGRLRYVDCVVKEVLRLLPPVSGGYRTALRTFELD (0) 242146051
242144220
GYQIPKGWSVMYSIRDTHETAAVYRSPPEGFDPERFGVESEDARGSGGRFHYI
PFGGGARSCLGQELAQAVLQLLAVELVRTARWELATPAFPVMQTVPIVHPVD
GLLLLFHPLPTLGAGDGSPF* 242143843
>CYP2C11
J02657 72% to CYP2C6_v1
243377899
MDPVLVLVLTLSSLLLLSLWRQSFGRGKLPPGPTPLPIIGNTLQIYMKDIGQSIKK 243378066
243379842
FSKVYGPIFTLYLGMKPFVVLHGYEAVKEALVDLGEEFSGRGSFPVSERVNKGL 243380003
243380160
GVIFSNGMQWKEIRRFSIMTLRTFGMGKRTIEDRIQEEAQCLVEELRKSK 243380309
GAPFDPTFILGCAPCNVICSIIFQNRFDYKDPTFLNLMHRFNENFRLFSSPWLQVCNT
FPAIIDYFPGSHNQVLKNFFYIKNYVLEKVKEHQESLDKDNPRDFIDCFLNKMEQEKH
NPQSEFTLESLVATVTDMFGAGTETTSTTLRYGLLLLLKHVDVTAKVQEEIERVIGRN
RSPCMKDRSQMPYTDAVVHEIQRYIDLVPTNLPHLVTRDIKFRNYFIPKGTNVIVSLS
SILHDDKEFPNPEKFDPGHFLDERGNFKKSDYFMPFSA
243416959
GKRICAGEALARTELFLFFTTILQNFNLKSLVDVKDIDTTPAISGFGHLPPFYEACFIPVQRADSLSSHL*
243417171
>CYP2C24
92% to 2C80, M86678 has alternative splice first exon
no ESTs have this splice
CK481568.1 matches exons 1,2,3,4
CO565602.1 matched the end of the gene sequence and extends it a little 6 aa
Used this EST to blast the trace files to find the end of exon 7
MDPVLVLVLTLSCLLLLSLWRQSSGRGKLPPGPTPLPIIGNILQIDVKDISKSFTN CK481568.1 exon 1
QLSCSRKFGLTCGPEAQ
243522306 FTDKLTAKCHSSVSLHIDLPGNLL
243522235 yellow region not P450 seq.
243522073
FSKIYGPVFTLYFGPKPTVVVHGYEAVKEALDDLGEEFSGRGSFPIVERMNNGL 243521912
243521366
GVIFSNGTKWKELRHFSLMTLRNFGMGKRSIEDRIQEEASCLVEELRKTN
243521217
243518830
GSLCDPTFILSCAPSNVICSVVFHNRFDYKDENFLNLMEKLNENFKILNSPWMQ 243518669
VCNALPAFIDYLPGSHNRVIKNFAEI
676
677
KSYILRRVKEHQETLDMDNPRDFIDCFLIKMEQEKHNPRTEFTIEILMATVSDVFVAGSE 856
857
TTSTTLRYGLLLLLKHIEVT
AKVQEEIDHVIGRHRRPCMQDRTRMPYTDAMVHEIQRY 1030
gnl|ti|132779224 rts18e73.g from trace files for exon 7
AKVQEEIDHVIGRHRRPCMQDRTRMPYTDAMVHEIQRYINLIPNNVPHAATCNVRFRNYVIPK
>CYP2C80
XM_217906.2 GNOMON exon 2 on AC109577.4 in HTGS 92% to 2C24, 73% to 2C11
MGWLSDP wrong N-term from GNOMON prediction (temp name = CYP2CNEWC)
Correct N-term possibly in a sequence gap
244632544
FSEVYGPVFTLYFGLKPTVVVYGYEVVKEVLDGEEFSGRGVFPIVTKVNNDL 244632389
this exon 2 does not match 2C24
244632205
GVIFSNGTKWKELRRFSLMTLRNFGMGKRSIEDRIQEEASCLVEELRKTN 244632056
244628281
GSLCDPTFILSCAPSNVICSVIFHNRFDYKDENFLNLMEKFNENFKILNSPWMQ 244628120
244624041
VCNAIPAFIDYLPGSHNKVIKNFAEIKSYILRRVKEHQETLDMDNPRDFIDCFLIKIE 244623868
244620080
QEKHNPCTEFTIQSLVATVTDVFVAGSETTSTTLRYGLLLLLKHTEVT 244619937
244619006
AKVQEEIDHVIGRHRRPCMQDRTRMPYTDAMVHEIQRYINLIPNNVPHAATCNVRFRNYVIPK 244618818
244616897
GTDLITSLTSVLHDDKEFPNPEVFDPGHFLDEHGNFKRSDYFMPFSS 244616757
244614348
GKRMCVGEALARMELFLLLTTIVQNFNLKSFVATKDIDTTPLTNTFGCVPPSYQLYFTPR* 244614166
>2C80
EST no ESTs have this splice
CK481568.1 matches second exon
MDPVLVLVLTLSCLLLLSLWRQSSGRGKLPPGPTPLPIIGNILQIDVKDISKSFTN
FSKIYGPVFTLYFGPKPTVVVHGYEAVKEALDDLGEEFSGRGSFPIVERMNNGL
GVIFSNGTKWKELRHFSLMTLRNFGMGKRSIEDRIQEEASCLVEELRKTN
GSLCDPTFILSCAPS
>CYP2C79
XM_219933 minus strand 72% to 2C6_v1 95% to seq e, 100% to seq q (exon 9),
93% to seq z (exon 5) (temp name = CYP2CNEWD)
244590183
MILGVFLGLFLTCLLLLSLWKQNFQRRNLPPGPTPLPIIGNILQIDLKDISKSLRN 244590016
244575990
FSKVYGPVFTLYFGRKPAVVLHGYEAVKEALIDHGEEFAGRGIFPVAEKFNKNC 244575829
244575612
GVVFSSGRTWKEMRRFSLMTLRNFGMGKRSIEDRVQEEARCLVDELRKTN 244575463
244553851
GVPCDPTFILGCAPCNVICSIVFQNRFDYKDQEFLALIDILNENVEILSSPWIQ 244553690
244525726
ICNNFPAIIDYLPGRHRKLLKNFAFAKHYFLAKVIQHQESLDINNPRDFIDCFLIKMEQ 244525550
244524359
EKHNPKTEFTCENLIFTASDLFAAGTETTSTTLRYSLLLLLKYPEVT 244524219
244517844
AKVQEEIDHVIGRHRSPCMQDRHHMPYTDAVLHEIQRYIDLLPTSLPHALTCDMKFRDYFIPK 244517656
244516177
GTTVIASLTSVLYDDKEFPNPEKFDPSHFLDENGKVKKSDYFFPFST 244516037
244496745
GKRICVGEGLARTELFLFLTTILQNFNLKSPVDLKELDTNPVANGFVSVPPKFQICFIPI 244496566
>CYP2C79-de9b
exon 9 62% to 2C79 2 aa diffs to seq d and seq p minus strand
244491372
G*WICVREDLAQMTLFLFCPTILKNFNLNSQVNPKEL 244491262
interval between 2C79 and 2C6
>CYP2C6-se1[1:2:3:2:3]
frag n exons 1,2,3 2C6 like pseudogene plus strand exon 2,3 100% to seq m
244044941
MDHTTGTYTLSLILLSL*RQSSGRGKIPPGPTPLPIIDNLLQLDIKNVTQYLAN (0) 244045102
244050420
LSKVHGPVLTLYFWMKSNVVLHVDEAVNEDLIDHGE*FAVRRSIPLAEKLIKAL 244050581
244050793
XXXXXXXXXXXXXKTFTLMTLQNLRMGKGNIEDHVQE*AQ 244050873
frag m Exons 2,3 2C6 like pseudogene 100% to seq n
244052306
LSKVHGPVLTLYFWMKSNVVLHVDEAVNEDLIDHGE*FAVRRSIPLAEKLIKAL 244052467
244052679
XXXXXXXXXXXXXKTFTLMTLQNLRMGKGNIEDHVQE*AQ 244052759
>CYP2C7-se2[2:3]
frag k exons 2,3 = 100% to 2C7 variant, 2 aa diffs to 2C7
exons 2,3,6,7,9 (6,7 and 9 have 1 aa diff to 2C7)
244064158
FSKTYGPVFTLYLGSQPTVILHGYEAIKEALIDNGEKFSGRGSYPMIENVTKGF 244064319
244064485
GIVFSNGNRWKEMRRFTIMTFRNLGIGKRNIEDRVQEEAQCLVEELRKTK 244064634
>CYP2C7-se1[6:7:9]
frag j exons 6,7,9 (6,7 and 9 have 1 aa diff to 2C7)
244103321
ANNIEQSEYSHENLTCSIMDLIGAGTETMSTTLRYALLLLMKYPHVT 244103461
244120225
AKVQEEIDRVIGRHRSPCMQDRKHMPYTDAMIHEVQRFIDFVPTNLPHAVTCDIKFRNYLIPK 244120413
244124319
FLXXXLQNFNLKSLXHPKDIDTMPVLNXXASLPPTYQLCFIPS 244124447
>CYP2C13-se1[6]
frag h 72% to 2C13 exon 6 plus strand 100% to seq s
70% to 2C12 exon 6 h
244165142
ENGNQQMNYTQEHLATMVTDLL 244165207
244165209
FGGRETLNSTMRFAFLFLMKYPYTT 244165284
>CYP2C22-se1[8]
frag g exon 8 72% to 2C22 minus strand
244201638
KFDHGNFLDDR 244201606
244201606
GNFK*NDYFMAFLA 244201565
>CYP2C13-se3[1:2:3:2:3:]
frag f Exons 1,2,3,2,3 exon 1 = 66% to 2C13 Minus Strand
exons 2,3 = 57% to 2C13
two identical copies of exons 2,3 100% to seq v exons 2,3
244215468
SQSFLLLLSLSSQISSKGKLPLDPTSLPILGYFF*VLMKDICQSLIN 244215328
244214467
FLKTSGPLYTQHFSLQPAVVFCGYAAVKGAFVDHSR*FS*RGWFSIFGKFSKVQ 244214306
244214137
GIGFSHKNVWKVKRFFTLITLKNLHMGNDNIKNKVQEEAQCLVKELKKIN 244213988
244213484
R*FS*RGWFSIFGKFSKVQ 244213428
244213259
GIGFSHKNVWKVKRFFTLITLKNLHMGNDNIKNKVQEEAQCLVKELKKIN 244213110
>CYP2C82P
frag e Exons 1,4,4,5,6,7,8,9 almost an exact duplicate of seqs w,x,y,z,
exons 6-9 of the wxyz cluster in a seq gap Plus Strand
244218695
MDPVVVLMPSFSSLLLLSLWRQNSWRRKLPPGPNPLPIIGSFLQIDLNDLCQSLINE (0) 244218865
244233879
LILSYASCNVICSITFQNRFDYKDKEILTLMEKVNENVKIMSSPWIQ 244234019
244240189
GVPCDPTFILGCAPCNVICSIVFQNHFNYKGQEFLALIDTLNENVEILSSPWIQ 244240350
244265531
ICNNFPAIIDYLPGRHRKLLKKFAFAKHYFLAKVIQHKESLDINNPRDFIDCFLIKMEQ 244265707
244266904
KHNPKTEFTCKNLIFTASDLFAAGTETTSPTLRYSLLLLPKYPEV 244267038
244273480
AKVQEEIDHVIGRHRSPCMQDRHHMPYTDAVLHEIQ*YIDLLPTSLPHALTCDMKFRDYFIPK 244273668
244275197
GTTVIASLTSVLYDDKEFPNPEKFDLSHFLDENGKFKKSDYFFPFST 244275337
244286429
GKRICVGEGLAQTELFLFLTTILQNFNLKSPVDLKELDTNPVANGFVSVPPKFQICFIP 244286605
>CYP2C82P-de9b
frag d Exon 9 identical to seq p
244289962
GKWICVREDLAQMTLFLFCPTILKNFNLKSQVNPKEL 244290072
>CYP2C77-de1b2b3b4b5b
frag c Pseudogene 96% to 2C6_v1 exons 1-5 with partial deletion of exon 3 Plus Strand
244337898
MDLVMLLVLTLSCLILLSIWSQSSGRGKLP 244337987
244337987
SGPTPLPIIGNFFHLDLKNITQSLTS 244338064
244339793
FSKVNGSVFTLYFGMKPIVILHGYEAIK*GLIDHREEFTERGSFPVAEKINKGL 244339954
244340129
GIAFSHGNRWKEIRRFTLMTLQNLGMGK 244340212
244341157
GSPCDPTFILGCAPCNVICSIIFQNSFDYKDQDFLSLMEKLNENIKIVSSPWI* 244341318
244342872
FCSSFPVFIDYCPGIHMTLA 244342931
244342933
KNVYHTRNYILKKIKEHQESLDVTNPHDFIDYYLIKWKQ 244343049
>CYP2C77
variant of 2C6 13 aa diffs to CYP2C6_v1, 16 aa diffs to 2C6v2
This gene has three frameshifts
244357850
MDLVMLLVLTLTCLILLSIWRQSSGRGKLPPGPIPLPIIGNIFQLNVKNITQSLTS (0) 244358017
244359760
FSKVYGPVFTLYFGMKPTVILHGYEAVKEALIDHGEEFAERGSFPVAEKINKDL (1) 244359921
244360096
GIIFSHGNRWKEIRRFTLTTLRNLGMGKRNIEDRVQEEARCLVEE 244360230
244360232
MRKTN 244360246
244361085
GSPCDPTFILGCAPCNVICSIIFQNRFDYKDQDFLNLMEKLNENMKILSSPWTQ 244361246
244362321
FCSFFPVLIDYCPGSHTTLAKNVYHIRNYL 244362410
244362412
LKKIKEHQESLDVTNPQDFIDYYLIKWKQ 244362498
244381928
ESHNPHSEFTLENLSITVTDLFGAGTETTSTTLRYALLLLLKYPEIT 244382068
244392235
AKVQEEIDRVFGKHRSPCMQDRSRMPYTDAMIHEVQRFIDLIPTNLPHAVTCDIKFRNYLIPM 244392423
244394012
GTTIITSLSSVLHDSKEFPNPEIFDPGHFLDGNGKFKKSDYFMPFSA 244394152
244395307
GKRMFAGEGLA 244395339
244395341
RMELFLFLTTILQNFKLKSVLQPKDIDTTPVFHGFASLPPFYELCFIPL 244395487
>CYP2C82P-se[1:4:4:5]
frag z Exon 5 minus strand 1 aa diff to seq e
243632036
ICDNFPAIIDYLPGRHRKLLKKFAFAKHYFLAKVIQHKESLDINNPRDFIDCFLIKMEQ (0) 243631860
frag y Exon 4 minus strand 92% to seq e
243654367
GVPCDPTFILGCAPCNVICSIVFQNHFNYKDQEFLALIE 243654251
243654249
LNENVEILSSP*IQ 243654208
frag x exon 4 minus strand 100% to seq e short exon 4
243659542
LILSYASCNVICSITFQNRFDYKDKEILTLMEKVNENVKIMSSPWIQ 243659402
frag w Exon 1 minus strand 100% to seq e
243675609
MDPVVVLMPSFSSLLLLSLWRQNSWRRKLPPGPNPLPIIGSFLQIDLNDLCQSLIN 243675442
>CYP2C13-se4[1:2:3]
frag v Exon 1 (+) 59% to 2C13
243678671
FLLLLSLSSQISSKGKLPLDPTSLPILGYFF*VLMKDICQSLIN 243678802
Exon 2 (+) 48% to 2C79
243679647
FLKTSGPLYTQHFSLQPAVVFCGYAAVKGAFVDHSR*FS*RGWFSIFGKFSKVQ 243679808
Exon 3 (+) 100% to seq f
243679977
GIGFSHKNVWKVKRFFTLITLKNLHMGNDNIKNKVQEEAQCLVKELKKIN 243680126
>CYP2C7-se4[8:9]
frag u Exon 8 minus strand exon 8 = 87% to frag 2, 8+9 = 63% to 2C7
243726168
GVMVITSLSSALHDNKEFPNPKRFDPG*FLDRNGNFKKTDYFILFSA 243726028
Exon 9 minus strand 60% to 2C7
243723025
CVGEGLTPIELFLFLTRILQNFNLKHLTHTEAVDTTPVLSRLTSVSPALKLFFIP 243722861
>CYP2C7-se3[8]
frag t Exon 8 minus strand 82% to 2C7
243749788
TIVIT*LTSVLHDSKKFPNPEMLDSGHFLDENGNFKKSEYFMPFSA 243749651
>CYP2C13-se2[6:7]
frag s Exons 6-7 minus strand 72% to 2C12 exon 6 100% to seq h
243766431
ENGNQQMNYTQEHLATMVTDLL 243766366
243766364
FGGRETLNSTMRFAFLFLMKYPYTT 243766290
243760156
XQINEEIGQVIWRHHSPSMLDWSHMIYTNAMVHEVQRYIDLAPNGVVCEVNCDTKYPRDYFIPK 243759968
>CYP2C7-de7b
frag r Exon 7 (+) 100% to seq a CYP2C81-de7b
243792966
RVQEEIDQVIGRNPSPCMQDRSHMPYTNAMVHEVQR*SNIVPNNIVYEVTCDTKFRNYFIPK 243793151
>CYP2C81
93% to 2C7 28 aa diffs missing exon 1 Plus Strand, 91% to seq j (exons 6,7)
93% to seq k (exons 2,3)