Atlantic Salmon cytochrome P450s assembled from ESTs

 

The CYP2008 Bioinformatics class found and assembled P450s from

over 400,000 Salmo salar ESTs in the NCBI EST database (dbEST).

This was their first assignment in the course.

 

54 CYP Proteins assembled from heterozygous, polymorphic Salmo salar ESTs

 

CYP1A   whole TRANSLATION plus partial paralog Fazle Chowdhury

CYP1B   partial seq. Found by Ali Ellebedy

CYP1C.a partial seq Found by Ali Ellebedy

CYP1C.b partial seq

CYP1C.c partial seq

CYP1D   not found in ESTs

CYP2K.a whole TRANSLATION Brandon Hale, Frank Zhang, Rubi Mahato

CYP2K.b partial seq.

CYP2K.c partial seq. Brandon Hale

CYP2M1  whole TRANSLATION ortholog to trout 2M1 Yanhua Qu, Brandon Hale, Rubi Mahato, Mekel Richardson, Julie Philippart

CYP2M2  missing 27 aa in middle, 68% to 2M1 Yanhua Qu

CYP2P   partial seq. Julie Philippart

CYP2R1  not found in ESTs

CYP2U1  missing 60 aa in middle, found in 2D6 search Frank Zhang

CYP2X.a partial seq

CYP2X.b partial seq

CYP2X.c partial seq

CYP2X.d C-term

CYP2Y.a partial N-term Brandon Hale

CYP2Y.b partial seq. Brandon Hale

CYP2AD  partial seq. Mekel Richardson

CYP2AE  partial seq. N-term

CYP3A.a whole TRANSLATION Fazle Chowdhury

CYP3A.b partial 91% to CYP3A.a Fazle Chowdhury

CYP3A.c whole TRANSLATION 63% to 3A48

CYP4F   whole TRANSLATION plus partial paralog? Mitzi Dunagan, Rubi Mahato, Julie Philippart

CYP4T   whole TRANSLATION plus paralog Mitzi Dunagan, Akshata R. Udyavar, Brandon Hale, Julie Philippart, Ali Ellebedy

CYP4V.a partial seq. Akshata R. Udyavar

CYP4V.b partial seq. Ryoko Tsukahara

CYP5A   partial seq with 3 missing pieces Yanhua Qu

CYP7A   partial seq. Mekel Richardson

CYP7C.a whole TRANSLATION Ryoko Tsukahara, Mekel Richardson

CYP7C.b partial seq. 88% to CYP7C.a

CYP8A1  partial seq.

CYP8A2  partial seq.

CYP8B   partial seq.

CYP11A  partial seq.

CYP11B  partial seq.

CYP17A  not found in ESTs

CYP19A  brain partial seq

CYP19A  ovary partial seq

CYP20   whole TRANSLATION

CYP21   not found in ESTs

CYP24   partial seq. missing middle region Ryoko Tsukahara

CYP26A1 partial seq. C-term Frank Zhang

CYP26B  not found in ESTs

CYP26C1 partial seq. N-term mine

CYP27A.a partial seq Akshata R. Udyavar

CYP27A.b partial seq

CYP27B  not found in ESTs

CYP27C  partial seq Akshata R. Udyavar

CYP39A1 partial seq Ryoko Tsukahara

CYP46   whole TRANSLATION plus partial paralog Mitzi Dunagan, Brandon Hale

CYP51   partial seq.

 

DNA translator

http://ca.expasy.org/tools/dna.html
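
As a rough illustration of what a DNA translator such as the one linked above does, the Python sketch below translates a nucleotide sequence in all six reading frames using the standard genetic code. The function names and the toy input are illustrative assumptions; this is not the tool's actual implementation.

```python
# Minimal six-frame translation sketch (standard genetic code, NCBI table 1).
# All names and the toy sequence are hypothetical, for illustration only.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Codons are generated in TCAG order, matching the amino acid string above.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna, frame=0):
    """Translate one forward frame (0, 1 or 2); unknown codons become X."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def six_frame_translation(dna):
    """Return three forward-frame and three reverse-frame translations."""
    rc = reverse_complement(dna)
    return [translate(dna, f) for f in range(3)] + [translate(rc, f) for f in range(3)]


# Toy input, not a real EST.
frames = six_frame_translation("ATGGTGCTG")
print(frames[0])  # first forward frame
```

For an EST, the frame that reads through without stop codons (*) and matches a known P450 is usually the one of interest; a switch of frame partway through a read is a hint of a sequencing error.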

 

 

 

CYP1A1      Salmo salar (salmon)

            AF361643

            Christopher Rees, Weiming Li

            submitted to nomenclature committee Nov. 9, 2001

            a second gene is being isolated so this is called 1A1

            rather than just CYP1A.  This does not imply orthology to the

            mammalian 1A1, 1A2.  The CYP1A gene duplications in fish and mammals

            occurred independently.

 

>CYP1A AF361643       

Salmo salar cytochrome P450 1A (CYP1A) mRNA, complete cds

MVLMILPIIGSVSVSEGLVAMVTLCLVYMFMKYKHTEIPEGLKR

LPGPKPLPIIGNVLEVHNNPHLSLTAMSERYGSVFQIQIGMRPVVVLSGSETVRQALI

KQGEDFAGRPDLYSFKFINDGKSLAFSTDKAGVWRARRKLAMSALRSFATLEGSTPEY

SCALEEHVCKEGEYLVKQLTSVMDVSGSFDPFRHIVVSVANVICGMCFGRRYSHDDQE

LLSLVNLSDEFGQVVGSGNPADFIPILRYLPNRTMKRFMDINDRFNAFVQKIVSEHYE

SYDKDNIRDITDSLIDHCEDRKLDENANIQVSDEKIVGIVNDLFGAGFDTISTALSWA

VVYLVAYPEIQERLHQELTEKVGLNRTPRLSDKTNLPLLEAFILEIFRHSSFLPFTIP

HCTIKDTSLNGYFIPKDTCVFINQWQVNHDPELWKEPSLFNPDRFLSADGTELNKLEG

EKVLVFGMGKRRCIGEAIGRNEVYLFLAILLQRLRFQEKPGHPLDMTPEYGLTMKHKR

CQLKASLRPWGQEE

 

>CYP1A AF364076 has an extra C base after NGYF that causes a frameshift

Salmo salar cytochrome P450 1A mRNA, complete cds

Differs from AF361643 at blue and green

matches DR696646.1, DY719050.1 at blue

matches C-term at blue C CB505556.1, DY719520.1,

CA044359.1, DY692330.1, DY692329.1, CK878675.1,

EG852844.1, BQ036391.1

AM402919.1 matches grayed aa.

MVLMILPIIGSVSVSEGLVVMVTLCLVYMIMKYMHTEIPEGLKR

LPGPKPLPIIGNVLEVHNNPHLSLTAMSERYGSVFQIQIGMRPVVVLSGSETVRKALI

KQGEDFAGRPDLYSFKFINDGKSLAFSTDKAGVWRARRKLAMSALRSFATLEGSTPEY

SCALEEHVCKEGEYLVKQLTSVMDVSGSFDPFRHIVVSVANVICGMCFGRRYSHDDQE

LLSLVNLSDEFGQVVGSGNPADFIPILRYLPNRTMKRFMDINDRFNAFVQKIVSEHYE

SYDKDNIRDITDSLIDHCEDRKLDENANIQVSDEKIVGIVNDLFGAGFDTISTALSWA

VVYLVAYPEIQERLHQELTEKVGLNRTPRLSDKTNLPLLEAFILEIFRHSSFLPFTIP

HCTIKDTSLNGYFHSQGHLCLHQPVAGQPPGAVEGAFFIQPYRFLSADGTELNKLEGE

KVLVFGMGKRRCIGEAIGRNEVYLFLAILLQRLCFQEKPGHPLDMTPEYGLTMKHKRC

QLKASLRPWGQEE

 

>CYP1A revised AF364076 Corrected seq with frameshift removed

found by Fazle Chowdhury

MVLMILPIIGSVSVSEGLVVMVTLCLVYMIMKYMHTEIPEGLKR

LPGPKPLPIIGNVLEVHNNPHLSLTAMSERYGSVFQIQIGMRPVVVLSGSETVRKALI

KQGEDFAGRPDLYSFKFINDGKSLAFSTDKAGVWRARRKLAMSALRSFATLEGSTPEY

SCALEEHVCKEGEYLVKQLTSVMDVSGSFDPFRHIVVSVANVICGMCFGRRYSHDDQE

LLSLVNLSDEFGQVVGSGNPADFIPILRYLPNRTMKRFMDINDRFNAFVQKIVSEHYE

SYDKDNIRDITDSLIDHCEDRKLDENANIQVSDEKIVGIVNDLFGAGFDTISTALSWA

VVYLVAYPEIQERLHQELTEKVGLNRTPRLSDKTNLPLLEAFILEIFRHSSFLPFTIP

HCTIKDTSLNGYFIPKDTCVFINQWQVNHDPELWKEPSSFNPDRFLSADGTELNKLEGE

KVLVFGMGKRRCIGEAIGRNEVYLFLAILLQRLCFQEKPGHPLDMTPEYGLTMKHKRC

QLKASLRPWGQEE
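
To make the effect of the frameshift concrete, the following Python sketch compares the affected line of the AF364076 translation with the same line of the corrected sequence and reports where the two agree and where they diverge. The function and variable names are hypothetical; the two strings are copied from the entries above.

```python
# Locate the region where a frameshifted translation departs from the
# corrected one. A sketch only; names are illustrative assumptions.

def divergent_region(a, b):
    """Return (shared prefix, middle of a, middle of b, shared suffix)."""
    limit = min(len(a), len(b))
    prefix = 0
    while prefix < limit and a[prefix] == b[prefix]:
        prefix += 1
    suffix = 0
    # Do not let the suffix run back into the shared prefix.
    while suffix < limit - prefix and a[-1 - suffix] == b[-1 - suffix]:
        suffix += 1
    return (a[:prefix], a[prefix:len(a) - suffix],
            b[prefix:len(b) - suffix], a[len(a) - suffix:])


# Line beginning HCTIKDTSLNGYF from AF364076 (as deposited) and from the
# corrected sequence.
deposited = "HCTIKDTSLNGYFHSQGHLCLHQPVAGQPPGAVEGAFFIQPYRFLSADGTELNKLEGE"
corrected = "HCTIKDTSLNGYFIPKDTCVFINQWQVNHDPELWKEPSSFNPDRFLSADGTELNKLEGE"

start, mid_deposited, mid_corrected, end = divergent_region(deposited, corrected)
print("shared start:", start)
print("deposited   :", mid_deposited)
print("corrected   :", mid_corrected)
print("shared end  :", end)
```

The two translations agree through NGYF, differ over a stretch of about 30 residues, and then agree again from RFLSADG onward.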

 

DY692329.1, CB505556.1, DY719520.1, CK878675.1, CB504968.1,

DY692330.1

HCTVKDTSLNGYFIPKDTCVFINQWQVNHDPELWKEPSSFNPDRFLSADGTELNKLEG

 

>CYP1A There seem to be two very similar sequences with only minor changes

DY692329.1, CB504968.1, CB505556.1

GRRYSHDDQELLGLVNLSDEFGQVVGSGNPADFIPILRYLPNRTMKRFMDINDRFNTFVQ

KIVSEHYESYDKDNIRDITDSLIDHCEDRKLDENANVQVSDEKIVGIVNDLFGAGFDTIS

TALSWAVVYLVAYPEIQERLHQELKEKVGMTRTPRLSDKTNLPLLEAFILEIFRHSSFLP

FTIP

 

>CYP1B EG855910.1 opp end = EG855909

70% to 1B1 fugu

EG877309.1 opp end = EG877308, DY700468.1, EG856153.1

Found by Ali Ellebedy

GTDAFIMALDHSQDSSPGVSPGKDYVPPTIGDIFGASQDTLSTALQWIILILVRFPHIQL

RLQEEVDKVVYRSRLPTIEDQSQLPYVMAFIYEVMRFTSFVPLTIPHSTITDTTIMGYTI

LKDTVIFLNQWSINHDPARWTQPETFDPLRFLDQDSSLNKDLASSVLIFSLGKRRCIGEE

LSKMQLFLFTALLAHQAHFSPDPDKLPTIDYTYGLTLKPNNFSIAVNLRDSMDVLEEASQ

KPLYGETQEDTGNSRSD*

 

>CYP1C.a 81% to CYP1C2

EG933863.1, EG762705.1, DY701849.1, EG806773, EG890885.1

DW472969.1, EG765700.1, EG759216.1

Found by Ali Ellebedy

GVVLNGDASIREALVQHSTEFAGRPNFVSFQSVSGGNSMTFTNYSKQWRTHRKIAQSTIR

AFSSANSQTKKAFEQHIVAEATELIEAFLKLKGQFFNPAHELTVAAANVICALCFGKRYG

HDDIEFRTLLGSVDKFGETVGAGSLVDVMPWLQYFPNPVRRVYQNF

KDLNKEFFTFVRDKVVEHRETFDPEVTRDMSDAIIGVIDKADSDTGLTEAHTEGTVS

DLIGAGLDTVSTCLHWMLLLLVKYPNIQTKLQEQIDKVVGRDRLPCIEDKASLAYLDAVI

YETMRYTSFVPLTIPHSTTSDVTIEGFHIPKDTVVFINQWSVNHDPLQWKDPHLFDPSRF

LDENGALDKDLTSSVMIFSAGKRRCIGDQIAKVEVFLFSAILIHQCTFENNPSQDLSLDC

SYGLTLKPLNYKISAQLRGELLTGA*

 

>CYP1C.b DW541384.1 DY721531.1

MALLDTEFGVKGSSIIREWSGQVQPALVASFVFLFCLEACLWVRNLRLKRRLPGPFAW

PVVGNAMQLGQMPHITFSKLAKKYGNVYQIRLGCNNIVVLNGDTAIREALVQHSTEFAGR

PNFISFQMISGGRSLTFTNYSKQWKMHRKIAQSTIRAFSSANSQTKKAFEHHVLGECMDL

VQVFLRRSADGRYFNPAHEFTVAAANVICALCFGKRYGHDDIEFRTLLGRMDRFGETVGA

GSLVDVMPW

 

>CYP1C.c CX353507 C-term 86% to 1C.a might be the opp end of CYP1C.b

EG763264.1 2aa diffs. EG885248.1 CX154036.1

STIFQWILLLLIKYPNIQTKLQEQIDKVVGRDRLPSIEDKASLAYLDAVIYETMRYTSFV

PLTIPHSTTSDVTIEGFHIPKDTVVFINQWSVNHNPLKWKDPHLFDPSRFLDENGALDKD

LTNSVMIFSTGKRRCIGNQIAKVETFLFTAVLLHQCTFESNPSEALTLDCSYGLTLKPLH

YTITTKLRGKLLALVSPA*

 

>CYP2K.a DY730660.1 47% to 2K16 found by Brandon Hale, Frank Zhang,

Rubi Mahato

EG806893.1, DW567947.1, CK889346.1, EG853617.1, DY741080.1

EG929755.1, DY741081.1, EG851124.1, EG779526, BG935485.1

DN047609.1, CB510582.1, EG851123.1, EG853616.1

MSVLELFSLSGMSVMFITLNLIILIFIINRNTNPKNFPPGPRGLPLLGNTLNLDLKKPYQTMMEMKDKFGPVFSI

QMGLRKIVVLCGYEMVKEALVTQADQFAERPDIPLFKQITRGNGIIFGHGDSWRTIRRFT

LTVLRDLGMGKRNIEEKIIEESENLVKSFAAHNGDGFQTTIPLNAAASNIIVSLLMGHRM

EYDDPIFIKLLEMNYESFRLASGPFIQLYNMYPVIHPLPGPHHKVLAYQDNLKAFFRKSF

IQHRQILDENDSRSYIDAFLKKQQE

EKDNPSSHFHEWNLLCSVTNMFVAGTETTSSTLAW

ALVIMIKYPEIQSKVHEEIDKVISGSTPRIQHRQMMPFTDAVIHETQRFADILPMG

LPHETTADISFKGFFIPKGTYIIPLLRSVHRDKAHWEKPDDFYPHHFLNVD

DKFVKREAFMPFSAGRRVCVGETLARMELFLFYTSLMQRFSFLPP

IGMTADDVDISTCGGLGLAAPPVKVRALPRFHDT*

 

>CYP2K.b DV106223 75% to 2K10, 59% to 2K.a

CK889250.1, CA037986.1

EG843507.1

PWLGPWINNLTRLKKNIADMKMDVTELVRGLKETLNPQMCRG

FVDSFLVRKQTLEESGNMDSLYHDDNLVISVT

NMFGAGTDTTGTTLRWCLLLMAKYPHIQDQVQEEISRVIGSRQPLVEDRKNLPYTDAVI

HETQRLASILPIAIPHTTSRDITFQGYFIKKGTSVIPLLTSVLQDNSEWESPHTFTPSHF

LDEHGGFVKRDAFMAFSAGHRVCLGEGLARMELFLFFTSLLQRFRFTPPPGVTEDDLDLT

PFVGFTLNPSPHQLCAVSRL*

 

>CYP2K.c EG832531.1 opp end = EG832542, EG883217.1, EG842866.1

EG883216.1, EG842865.1, DN162836.1, CA061778.1

Found by Brandon Hale  

MSLIEGLLQTSSTVTLLGTVLFLLVLYLRSSGSSSEEQGKEP

PGPRPLPLLGNMLQLDFKKPYCT

LCELSKKYGSIFTVHFGPKKVVVLAGYKTVKQALVNQAEDFGDRDITPV

FYEFSQGHGILFANGDSWKEMRRFALTNLRNFGMGKKGSEEKILEEIHYLIEVLEKHEGK

AFDTAQPVIYAVSNIISAIVYGSRFEYTDPLFTGMADRSNESIHLTGSASIQIYNMF

(157 aa gap)

GIFHQKKGQSVIPLLT

SVLQDDSEWESPHSFTPSHFLDEHGRFVKRDAFMAFSAGRRVCLGEGLARMELFLFFTSL

LQRFRFTPPPGVTEDDLDLTPSVGFTLNPSPHQLCAVSRL*

 

>CYP2M1 ortholog to trout CYP2M1 95% to 2M1

CX353908 DW539128.1 DW562631.1 DY725141.1 DY728977.1

DY732623.1 DW541770.1 DY706581.1, DY701550.1 DW549411.1,

CK896372.1, BQ036463.1, EG355231.2, CA053315.1, DY725142.1

AM397486.1

found by Yanhua Qu, Brandon Hale, Rubi Mahato, Mekel Richardson,

Julie Philippart

MDVLHILQTNFVSI

VIGVVVIILLWMNRGKQSNSRLPPGPAPIPLLGNLLGMDVKAPYKLYMELSKKYGSVFTV

WLGSKPVVVISGYQAIKDAFVTQGEEFSGRANYPVIMTVSKGYGVLVSSGKRSKDLRRFS

LMTLKTLGMGRRSIEERVQEEAKMLV

KAFSEYGDSVVNPKELLCNCVGNVICSIVFGH

RFENDDPMFQLIQKAVDAYFNVLSSPIGAMYN

MFPRIIWYFPDKHHEMFAVVNKAIAYIQEQAEIRLKTLDTSEPQDFIEAFLVKMLEEKDD

PNTEFNNDNMVMTAWSLFAAGTETTSSTLRQSFLMMIKYPHIQASVQKEIDEVIGSRVPT

VDDRVKMPYTDAVIHEVQRYMDLSPTSVPHKVIRDTEFYNYHIPEGTMVLPLLSSVLADP

KLFKNPDEFDPENFLDENGVFKKNDGFFAFGVGKRACLGE

ALARVELFLFFTSLLQRFTFTGTKPPEEINIEPACSSFCRMPRSYDCYIKLRTEE*

 

>CYP2M1 paralog CK888131

95% to 2M1 above, paralog seq

E*LCNCVGNVICSIVFGHRFENDDPMFQLIQKAVDAYFNVLSSPIGAMYNMCPRIIWCFP

DKHHEMFAVVNKGIAYIQEQADIRLKTLDTSEPQDFIEAFLVNMLAEKDDPNTEFNNDNM

VMTAWSLFAAGTESTSSTLRQSFLMMITYPHIQASVQKEIDEVIGSRVPTVDDRVKMPYTD

 

>CYP2M2  68% to CYP2M1 found by Yanhua Qu

EG828494.1, opp end = EG828495 not CDS

DY732623.1, EG830130.1 opp end = EG830129

MDLLHVLQTNILSIIMAIVVIILLWKYMGKQSSYGRLPPGPSPIPLLGNLLQMDLKRPDMSYMEFS

KKYGSVFTVWLGSKPVVVISGYQAIKDAFVTQGEDFNGRANSAITPKLINEHGVGLSNGQ

RWKTLRSFSLMSLKNLGKGCRSLEERVQVEAKSLVKAFSEYGDSTVNPKELLFHSIINLF

WSIVFGRRFEYNDPEFQILYKPVYTYFDMLKSKVAMLYNISPRIVECFPGKHHELF

KAIDKAKAYIREE

(27 aa gap)

KGQPETEFNYDNLFPCVWDLFAAGTETQSSTLSHACLMMIKYPDIQEKVQKEIDEVIGSN

RVPTVDDRHKMPYTDAVIHEIQRSMSLAPIAFPHQMTRDTTFHNYHIPKGTTVFPLLSSV

LFDPKLFKNPDEFDPENFLDENGVFKENNGFLVFGLGKRYCLGDGMGIPRTVLFLFFTSL

LQRFTFRGTKPPEEIDASVVSYFHGRLARPYTCYVKLRTPNI*

 

>CYP2P.a fragment  Atlantic salmon

                GenEMBL BI468047 EST00457

                77% to CYP2P10

EG901171.1 opp end = EG901174, CA044176.1

found by Julie Philippart

1   DPSSPRDFIDCFLNEIEKCEDDTRAGFNLENLSFCTLDLFVAGTETTSTTLYWGLLFMIN 180

181 YPEIQAKVQAEIDAVVRSSRQPSMEDRDSMPYTDAVIHETQRMGNIIPLNVSRMATKDTE 360

361 VGGYTIPKNTIVLGTLQSILFDESEWETPHTFNPGHFLDQEGKFRKRDAFLPFSLGKRVC 540

541 PGEQLAKMELFLFFTSLLQRFTFFSPPGVEPSLDFQMGATHSPKPYQLCATPR*

 

>CYP2U1 DW576911.1 found by Frank Zhang

Top seq is 59% to CYP2U fugu

131  MELWHELLGTSALSHVCILALTVFVAVYYIMHTFRKHQDFSNIPPGPKPLPIVGNFGGFL  310

311  VPNFIWRRLRREDDPKSKTRALISPPVIITEQAKVYGNIYSMWVGSQLVVVLNGYEVVR  487

488  DALSNRADVFSDRPEIPTVTIMTKRKGIVFAPYGPVWRRQRKFCHTTLRNFGLGKLSLEP  667

668  CILEGLAVVKSELLRLSEQDTEGSGVDLTPLITNSV  775

(60 aa gap)

DY739225, DY738492.1, 75% to CYP2U fugu

QVERDITAFLKQIITRHRETLDPANPRDLIDMYLVEMLAQEAAGETDSSFSEDYLF

YIIGDLFIAGTDTTTNTVLWMILYMAVFPDIQERVQAEMDAVVGPDRVPSLTDKGSLPFT

EATIMEVQRMTVVVPLAIPHMASETTEFRGYTIPKGTVIIPNLWSVHRDPTVWEQPDDFN

PSRFLDDQGNLLRKECFIPFGIGRRVCMGEQLAKMELFLMFTSLLQAFSFSLPEGLAPPP

MHGRFGLTLAPCPYTVAVRPRR*

 

>CYP2X.a CA042231 80% to 2X8 danio C-term, CK887325.1 

GEISHIIAPSAKSVGKSMNPQVLFHNAASNIICLVLFGSRYDYNDEFLKTFVKLYTENAK

IANGPWAMLYDTVPMLRYLPLPFQKAFKNATCVKQMSVGLITQHKETRNPGAPRDFIDCY

LDELDKRGDDGSSFSEAQLIMYVLDLHFAGTDTTSNT

LLTAFLYLMTHPEIQERCQQEIDTVLEGKEHASFEDRHRMPYTQAVTHESQRIASTVPLS

VFHSTTRDTEVMGYSIPKGTLIIPNLTSVLSEEGQWKFPHEFNPLNFLNEQGEFEKPEAF

MPFSAGPRMCLGEGLARMELFLIMVT

 

>CYP2X.b DW532476 57% to CYP2X10, 58% to CYP2X.a mid region

CB504239

HEGIQDLTKIAIGPWAMFYDIIPALRALPLPFKKAFHIYDEIKEHAQKAVTNHKSSRVSG

EPRDLIDCYLDQIDMTGDGGSTFNDVQMVFLLIDLFLAGTDTTSNTLRCAVLHLMTNQHI

QERCHREIEDVLEGRSCALFEDRHAMPYVQAMIHESQRMADTVPLSVFHMTSCNTQLQGY

QLP

 

>CYP2X.c N-term 2 frameshifts EG647534 65% to 2X11 danio

LILLLFFFIRVRRPKNFPPGPRPLPILGNLLQLDPANPLKDLERLKRRYGNVFSLYIGSR

PAVVLNGLEVVREALVTRAAEYAGRPTHLMISHLFKGKGVVMANYGSSWRDHRRFALTTL

RNFGLGKRS

MEERILEE

VSHICTELESSAGSSMDPQHLFHLAASNIICS

 

>CYP2X.d C-term CA042805  70% to CYP2X1

DW532475.1, CA767912.1, CA038044.1

TRMIHESQRMADTVPLSVFHMTSCNTQLQGYQLXQGTMVIPNLSSVLHEEGQWKFPQEFN

PDNFLNEDGEFVKPEAFLPFSAGSRVCLGEGLARTELFLILVTLLRRFQFVWPEQEGAPD

FRKVFGIAQAQKPFRLGVRLRSSQ*

 

>CYP2Y.a  N-term Found by Brandon Hale  

Top = 68% to 2Y1

compiled from:

salmon EST sequence DY725570.1 opp end = DY725569 no CDS

 

MEMCSSVVLGGLVLLLLLWLFRLRRQRHVHLPPGPCALPLLGNL

HQIDKQAPFKTLTKWSGVYGPVMTVYLGPQRAVVLVGYEAVKEALVDQAEDFTGRAPVPFLVLVTR

GYGLAISNGERWRQLRRFTLTTLRDFGMGRKRMEEWIQEESQHLIDSLDATKAVPFDPHPFLSRTV

SNVICSLVFGQRFGYDDDNFLHLLNILSAVLRFGSSPCEQLYNIFPWLMERLPGRQ

 

>CYP2Y.b  DY717079 77% to CYP2Y4 danio C-term Found by Brandon Hale  

QRVPQMEDRKSLPFTEAVIHEVQRFLDIVPLNLPHYATKNISFRGYTIPQGTVILPMLHS

VLRDQNHWATPTTFNPNHFLDQNGNFQTNPAFLAFSAGKRACVGESLARMEIFLFLVSLL

QHFSFSCPGGPDSIDLSPEFSSFTNVPRHYQLIATPR*

 

>CYP2AD.a AM402774 53% to CYP2AD2 danio aa 112-193

VTFKGIGITLSNGHMWKNQRKFAHTHLRYFGEGKKGLEHYIQLESNFLCEAFREEQGGGF

NPHYILNNAVGNIISCVVFGVTFKY

 

>CYP2AD.b AM397512 C-term 68% to CYP2AD2 danio found by Mekel Richardson

ETQRIGNILPLGFPKMASKDTKLGEYFIPKGTVVNTNLSSVLFDKDEWETPNTFNPEHFL

DSEGQFRRRDAFLPFSAGKRACVGEHLARMELFLFFSSLLQRFSLSPVSGAMPSLDGVLG

FTHSPAEFLVRALPR

 

>CYP2AE N-term CB510051 61% to CYP2AE1 danio

MLEALVCLVGQWIDTKGVLLFLLVLLVTKYIHDLPPKNYPPGPFPLPFVGNMLNISIK

DYIGSFKKFVESYGDVTTLDLGGGNRCVLLSGLRGFKEAFVDQADTFTDRPSYPLNDRIS

RGLGLISSNGHMWRQQRRFAVSTLKYFGVGKKTLETSILQESHFLCDVYLA

 

>CYP3A.a found by Fazle Chowdhury

DW557151.1, EG828682.1 EG885620.1, EG885621.1, DQ361036

59% to 3A48, 60% to CYP3A.c

MSFLPYFSAETWTLLALLITLIVVYRYWPYGVFTKMGIPGPKPLPYIGTMMEYKKGFTNFDTECFQKYGRIWGIYDGRQ

PVLCIMDKSMIKTILIKECYNIFTNRRNFTFLSGELFDAVTIAEDDTWRRIRSVLSPSFTSGRLKEMFGIMKRHSANLLNGMKKQADKDQAIEVKEFFGPYSMDVVTSTAFSVDIDSLNNPSDPFVYNIKKMLKFDMFNP

LLLLIVLFPFIGPILDKMKFTFVPTEVTDFFYASLAKIKSGRDTGNSTSQVDFLQLMIDSQKGNDTKTGQEQTK

GLTDHEILSQAMVFIFAGYETSSSTMSFLAYNLATNPHTMTKLQEEIDTVFPNKAPIQYEALMQMDYLDCVLN

ESLRLFPVSPRLERVAKKTVEINGIVIPKDCVVLVPTWTLHRDPEIW

SDPEEFKPERFSKENKESIDPYTYMPFGAGPRNCIGMRFALIMIKLAMVEILQSFTFSVCDETEIPLEMDKQ

GLLMPKRPIKLRLEPRSNTPSNTTAISF*

 

>CYP3A.b  second gene 91% to first gene, found by Fazle Chowdhury

EG895305.1 opp end = EG895306

EG902122.1, opp end = EG902123

MSFLPYLSAETWTLLALLITFIVVYGYWPYGVFTKMGVPGPKPLPYFGTMME

YRKGFTNFDTECFQKYGRIWGIYDGRQPVLCTMDKSMIKTILIKECYNIFTNRRNFMFLN

GELFDALSFAEDDTWRRIRSVLSPSFTSGRLKEMFGIMKRHSANLLNGMKKQADKDQTIE

VKEFFGPYSMDVVTSTAFSVDIDSLNNPSDPFVSNVKKMIKFDMFNPLLLLFV

LFPFIGPILEKMKFSFFPSAVTDFFYASLAKIKSGRD
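
Percent identity figures such as "91% to first gene" can be approximated by counting identical positions between two sequences. The Python sketch below does this without gaps, position by position, on the first 52 residues of CYP3A.a and CYP3A.b copied from the entries above. The figures quoted on this page presumably come from alignments over the full overlap, so this ungapped count is only an illustration, and the function name is hypothetical.

```python
# Ungapped percent identity over the shared length of two sequences.
# A sketch under stated assumptions, not the method used for the page's figures.

def percent_identity(a, b):
    """Return (matches, compared positions, percent) for position-wise comparison."""
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("both sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))  # zip stops at the shorter one
    return matches, n, 100.0 * matches / n


# N-terminal 52 residues of each entry.
cyp3a_a = "MSFLPYFSAETWTLLALLITLIVVYRYWPYGVFTKMGIPGPKPLPYIGTMME"
cyp3a_b = "MSFLPYLSAETWTLLALLITFIVVYGYWPYGVFTKMGVPGPKPLPYFGTMME"

matches, compared, pct = percent_identity(cyp3a_a, cyp3a_b)
print(f"{matches}/{compared} identical ({pct:.1f}%)")
```

Over this short N-terminal stretch the two genes differ at five positions, which is close to the 91% reported for the longer overlap.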

 

>CYP3A.c third gene

EG943017.1 DY734442.1, EG943028.1

63% to CYP3A48, 60% to CYP3A.a

MWYFLSVSTETWTLIAILFALFMWYGYAPYGFFKKLGIQGPKPLPFIGTFLEY

KRGLFIFDNDCYQKYGDVWGLFDGRLPVMGVMDTAMIKTILVKECYSVFTNRRDFGLNGE

LHDAVTTVEDNEWKRIRSTLSPSFTSGRLKDMFKIMTQHSRNLVKFLQKKVDSDEVLEVK

EIFGAYSMDVVASSAFSVDIDSINQPNDPLVVNIKKLLKFNLLSPLLILVVLFPFMRPLL

EKCKVSFLPAGAMKFFYSFLRKIKAERSKNVHNT

RVDLMQLMVDSQIPQDHSSKEAAHK

GLTDHEILSQALTYIFGGYETSSSTLGYLSYNLATNPDVQAALQDEIDKIFPDKAPPTYE

GLMQMEYLDMVINETLRVYPIISRLERVAKATVEVNGVTIPKGTVVMIPIMVLHHHPTHW

PKPEVFRPERFSKENRENIDPYTFLPFGMGPRNCIGMRFALQTIKLVIVEILQNFSFVTC

KETEVPLELFDNGFVAPTKPIKLKLVPRVLAPS*

 

>CYP4F found by Mitzi Dunagan, Rubi Mahato, Julie Philippart

DW540843.1, DY716972.1, DW540844.1, DW573410.1, DW584204.1

CA057516.1, DW573720.1, DY717493.1

61% to 4F43 Danio, 62% to 4F28 Fugu, 59% to 4F3 human

MAVIDAVLDRLLDGLLSLGRLLSPLY

SLFVLLQLSALVVLLLLSLRVVCLLWSHAQFTRRLQCFSKPPTQNWLMGHLGEMRSTEEG

LQAVDQMVRTYSHSCSWFLGPFYSLVRLFHPDYIKPLLLAPASITVKDELFYGFLRPWLG

QSLLLSNGEDWSRRRRLLTPAFHFDILKNYVKIFNHSSDIMHF

KWRRLVAEGESRQDMFSHISLMTLDSLLRCTFSYNSNCQESSSEYIAAIFELSTLVIE

RRGRILHHWDWLYWRSPEGQRFKQACNVVHRFTRTVVQERRAQLLHQGEPESHTDTTGGE

EKRKRVADFIDLLLLSKDEEGHGLTDEGIKAEADTFMFGG

HDTTASGISWVLYNLSQHQDYQDRCRAEVNDLLQDRETEDLDWEDLSSLPFTTMCIKESLRLHSPVSAVTRRYTKDITVPGGRVIPQGSICLVSIYGTHHNPEIWPDPDVYNPMRFDPEN

SKDRSSHAFIPFSSGPRNCIGQKFAMAELRVVVALTLRRFHLTPGGVEVRRLPQLVLRA

EGGLWLTLETLDTPQD*

 

>CYP4F paralog? 94% DW535744.1 C-term (only one, poor seq?)

opp end = DW535743.1

XXXXXXXXXXXXXX

LLSLGRLLSPLYSLFVLLQLSALVVLLLLSLRVVCLLWSHAQFTRRLQCFSKPPTQNWLM

GHLGEMRSTEEGLQAVDQMVRTYSHSCSWFLGPFYSLVRLFHPDYIKPLLLAPASITVKD

ELFYGFLRPWLGQSLLLSNGEDWSRRRRLLTPAFHFDILKNYVKIFNHSSDIMHFKWRRL

VAEGESRQDMFSHISLMTLDSLLRCTFSYNSNCQESSSEYIAAIFELSTLVIERRGRILH

HWDWLYWMSPEGQRFKQACNVVHRFTRTVVQERRAQLLHQGEP

(70 aa gap)

YQDRCRAEVNDLLQDRETEDLDWEDLSSLPFTPMCIKEFLRLHSPVFAVTRQYTKDITVP

GGRVIPQGSICLVSIYGTHHNPEIWPDPDVYNPMRFDPENSKDRSSHAFIPFFSGPRNCI

GQKFAMGELRVVVGLTLRRFRLTPGGVEVRGLPQLVLRAEGGLWLTLETLDTPQD*

 

>CYP4T salmon found by Mitzi Dunagan, Akshata R. Udyavar, Brandon Hale,

Julie Philippart, Ali Ellebedy

DY733068.1, DY712589.1, DY740889.1, DW555897.1, EG891086.1, DY713326.1

DY739259.1, EG891086.1

MELFETLKKVTLDSYRIHHLVAIFSLVYVILKISKLIVK

RNEWIRALETFPGPPKHWLFGHVREFKQDGNDMYKVVKWGESYPLAFQMWFGPFVSILNIHHPD

YVKTILASTEPKDDLSYRFLIPWIGDGLLVSGGQKWFRHRRLLTPGFHYDVLKPYVKMMSDSAK

TMLDKWETHSKSDESFELFEHVSLMTLDSIMKCAFSSNTNCQTVRGGESGTNSYIKAVYELSDL

VNVRLRTFPYHSEWIFQLSPHGYKYRKACNVAHSHTEEIIRKRKEALKDEKELGRIQAK

RNLDFLDILLCARDEDQQGLSDEAIRAEVDTFMFEGHDTTASGISWTLYSLACNPEHQQI

CRDEVISALEGRDTME

WEDLSKIPYTTMCIKESLRLYPPVPGMSRKLTKPMTFFD

GRTVPKGCLVGTSIFGIHRNATVWENPNAFDPLRFLPKNSAKRSPHAFVPFSAGPRNCIG

QNFAMNELKVVVAQTLKRYQLTEDPMKKPKMI

PRLVLRSLNGIHVKIKPVDLVP*

 

>CYP4T paralog 95% to other salmon seq, found by Brandon Hale, Julie Philippart

EG839199.1, DW548509.1, CX352982.1,

Similar paralog seq 95%

EG839200 = opp. end of clone DY705018.1, CA055070

51   MELFETLKKVTLDSYRIRHLVAIFSLVYVVLKISKLIVKRNEWIRALETFPGPPKHWLFG  230

231  HVREFKEDGTDMYKVVKWGESYPPAFQMWFGPFVSFLNIHHPDYVKTILASTEPKDDLLY  410

411  RFLIPWIGDGLLVSEGLKWFRHRRLLTPGFHYDVLKPYVKLMADSAKTMLDKWETHSKFD  590

591  ESFELFEHVSLMTLDSIMKCAFSSNTNCQTVQGGESGTNSYIKAVYELSDLVNVRFRIFP  770

771  YHSEWIFQLSPHGYKYRKACNVAHSH  848

TEEIIRKRKEALKDEKELGRIQAKRNLDFLDILLCARDEDQQGLSDEAIRAEVD

TFMFEGHDTTASGISWTLYSLACNLEHQQICRDEVISALEGRD

TMEWEDLSKIPYTTMCIKESLRLYPPVPGMSRKITKPITFFDGRTVPEGCLVGTSIF

GIHRNATVWENPNAFDPLRFLPENSAKRSPHAFVPFSAGPRNCIGQNFAMNEMKVVVAQT

LKRYQLTEDPMKKPKMIPRLVLRSLNGIHLKIKPVNLEP*

 

>CYP4V.a

BQ036296 N-term 70% to CYP4V5 found by Akshata R. Udyavar

DY733547 C-term 72% to 4V5

CB506320 C-term

HPLKKYFQDWNELRPIPGVDGAYPIIGNALLFSTNAGDFFNQIIEGTKEFRHLPLLKVWV

GPLPLVVLFHAETVEGILHSSKHIDKAYFYRFMQPWLGTGLLTSTGDKWRGRRKMLTPTF

HFSILAEFLEVMNEQSEVLTQKLEKQAGGDPFNCFSYITLCALDIICETAMGKNIYAQSN

SESE

(33 aa gap)

KDHDNRLRILHSFTQSVIKERAESMENAGSDSESDHGIKSRRLAFLDMLLKATDEEGNYL

SHSDIQEEVDTFMFEGHDTTAASMNWTLHLLGSYPEVQTKVQEELQVVFGSSNRSVTVDD

LKRLRYLECVIKETLRLFPAVPMFARTVSDDCHINGFKIPKGVNALIIPFALHRDPRYFP

DPEEFRPERFLIENSTGRHPYAYIPFSAGPRNCIGQRFAMMEEKVVLSSVLRHFSVRACQ

SREELRPLGDLILRPEKGIWITLEKRQC*

 

>CYP4V.b CK896567.1 84% to 4V5 found by Ryoko Tsukahara

75% to 4V.a seq above

FLDMLLKTTDEEGNKLTHQDIQEEVDTFMFRGHDTTAAAMNWAIHLLGSHPEVQRKVQQE

LQEVFGVSDRPINTEDLKKLRYLECVIKESLRLFPSVPFFARSICEDCHINGFKVPKGAN

AIIMPYSLHRDPRHFPQPEEFRPERFMPENCVGRHPYAFIPFSAGLRNCIGQRFAVMEEK

VILASILRYFNVEACQK

 

>CYP5A1 DW567936 65% to 5A1 fugu even with gaps

DY702743.1, AJ425528 found by Yanhua Qu

MNILKMPVPSGVSVSVGLFMIFLALLYWYATFPYSALARCGIRHPKPSPFFGNMFLFRQG

FFGVHTDLIHKYGRVCGYYLGRRPVVVVADPDMLRQIMVKDFSTFPNRMTIRSATKPMSD

CLLMLRNEHWKRVRSILTPSFSAAKMKEMGPLINMATDTLLTNLLGHVESAESFDIHRCF

GCFTMDVIASVAFGTQVDSQKDPDDPFVHHAQKFFSFSFFRPIMFVFIAFPFLAPLARVIPF

(20 aa gap)

RDEQPVEERRRDFLQLMLDTRSTKECVPLEHFDVVNHADELAHTHDSGEQENGGAGSHES

PNRRSVQTQKRMMSEDEIVGQAFVFFLAGYETSSNTLAFTCYLLALHPECQSKLQAEVDD

FFTRYDSPDYTNVQDLKYLDMVISEALRLY

(16 aa gap)

NGQFLPKGATLEIPAGYLHYDPEYWPEPEKFIPDRFTAEAKASRHPFVYLPFGA

RTCVGMMLAQLKIKMALVHVFR

(missing 42 aa at end)

 

>CYP11A CA063876 86% to 11A fugu

GDMLQMLKMIPLVKGALKETLRLHPVAVSLQRYITEDIVIQNYHIPCGTLVQLGLYAMGR

DPDVFPRPEKYLPSRWLRTENQYFRSLGFGFGPRQCLGRRIAETEMQLFLIHMLENFRVD

KQRQVEVHSTFELILLPEKPILLTLKPLKSSQ*

 

>CYP11B1 DQ352841    

Salmo salar testicular cytochrome P45011beta mRNA, complete cds

Duplicated seq 75% to 11B1 fugu

PAVRRFLPLLDEVARDFCRLLVTRVEKEGGEEERGHSLTFDPSP

DLFRFALEASCHVLYGERIGLFSTSPSQESQKFIFAVERMLATTPPLLYLPPRLLWRL

GAPLWTQHATAWDHIFSHAEKRIQRGVQRLRSTQAAGGGSGGTEGEFTGILGQLMDKG

QLSLELIRANITELMAGGVDTTAVPLQFALYELGRNPLVPLQFALYELGRNPAVQEQV

RGQVRAAWARAGGDAHKALQGAPLLKGLVKETLRLYPVGITVQRYPVRDIIIQNYHIP

AGTCVQACLYPL

 

>CYP7A CA042205.1 79% to CYP7A1 fugu aa 128-310

found by Mekel Richardson

QTFLKTLQGEALPSLIETMMENLQSVMLQSDTLSPSKDRWDVDGIFAFCYKVMFESGYLT

LFGKDLGNNKNAARQEAQKALVLNALENFKEFDKIFPALVAGLPIHVFKSAHSARENLAK

TMLAENLSKRQNISDLISLRMLLNDTLSTFNDLSKARTHVALLWASQANTLPATFWSLLY

MIR

 

>CYP7A DY692254 83% to 7A1 fugu C-term found by Mekel Richardson

TRYRIRKDDVIALYPQMLHFDPKIYEDPLTYKYDRYLDDNGQEKTTFYREGRKLRYYYMP

FGSGVTKCPGRFFAVHEIKQFLALVLSYFNMELLDSAVKVPPLDQSRAGLGILQPTYDVD

FRYKLKTQ*

 

>CYP7C.a DW542316.1 56% to 7C1, 88% to DW550033

DW546163.1, DW574259.1 found by Ryoko Tsukahara, Mekel Richardson

MLEFVLPLFLGFLALYLLSVRFGRTRRDGEPPLINGWIPFLGKAMEFGKNAHGFL

AAHKEKHGDVFTVLIAGKYMTFIMNPLLYPYVIKHQKQLDFHEFSDQVAPLTFGYPPVRS

FFFSGMEEHIQRSFRLLQGDNLNNLTESMMGNLMFVFRQDYLTGESEWRTESVYQLCNSI

MFEATFLTLFGKPAHTSRHSGMVTLREDFVKFDTMFPLLIARIPI

SLLGGTKAIRDKLINYFHPQRMSPWSNTSGF

IKERAALFEQYDSMRDVDKAAHHFATLWASVGNTVPATFWAMYYLVTHPEALAVVREEIHG

VLQVSGIETDHNRDIAFTREQLDSLLYLESSINESLRLSSASMNIRVAQEDFSLRLEGE

RSIGVRKGDIVSLYPQSMHMDPGIYKNPEIYKFDRYIENGKEKTDFYKDGQKLKYYRMSF

GSGSTKCPGRYFAVNEIKQFLSLLLLYFDVEVLEGQEPC

TLDPSRAGLGILLPASDVQIRYRLR*

 

>CYP7C.b DW550033.1 58% to 7C1, 88% to CYP7C.a

opp end = DW550032

DY731800.1

KEKHGDVFTVLIAGKYMTFIMNPLLYPYVIKHGKQLDFHEFSDQVAPVTFGYPPVRSGKF

PGMDEHIQRSFRLLQGDNLDNLTESMMGNLMLVFHKDYLDGESEWRTESMYQFCNSVMFE

ATFLTMYGKPAHTNRHSGMVTLRQDFVKFDNMFPLLIARIPISLLGGTKTIRDKLINYFH

PQRMSTWSNTSGFIKERAALFEQYDCMGDVDKAAHHFAILWASVGNTVPATFWAMYYLLT

HPEALAVVREEIHCVLKVSGLEADHNQDA

TFTREQLDSLLYLESSINESLRLSSASMNIRVAQEDFSLRLEGERSIGVRKGDIISLYPQ

SMHMDPEIYENPEMYKFDRYVEDGKEKTDFFKDGQKLKYYRMPFGSGSTKCPGRYFAVNE

IKQFLSLLLLHFDMEVVEGQEPC

SLDFSRAGLGILLPATEVQIHYRPRQARGEE*

 

>CYP7C DW542061.1 90% to CYP7C.a, only 3 aa diffs to CYP7C.b

EG769244

QLDSLLYLESSINESLRLSSASMNIRVAQEDFSLRLEGERSIGVRKGDIISLYPQSMHMD

PEIYENPEMYKFDRYVEDGKEKTDFFKDGQKLKYYRMPFGSGSTKCPGRYFAVNEIKQFL

SLLLLHFDMEVVEGQEPC

SLDSSRAGLGILLPATDVQIHYRPRQAREEE*

 

>CYP8A2 DV107567.1 71% to 8A2 fugu, 66% to 8A1 fugu

FERPPGQVKKFYKGGERLKYYTMPWGAGDNMCVGRHFAVSGIKQFVFMVLSRLDLELCDP

TAIVPPVNPSRYGFGMLQPDGDLEVRYRLKTLH*

 

>CYP8A1 EG779368.1 71% to 8A1 fugu 60% to 8A2 fugu

GQEGSVKKDFFKGGRRLKYYTMPWGAGTNGCVGKRFAISSIRQFVYLVLSHLELELCDPE

AQMPEVNTSRYGFGMLQPEGDLAIRYKPRRSR

 

>CYP8B DW582478.1 76% to 8B.a danio