Amphioxus (cephalochordate) ESTs, WGS and HTGS sequences
Branchiostoma floridae (many seqs)
Branchiostoma belcheri (1 seq)
D. Nelson
August 13, 2004, added CYP51 Jan. 19, 2005
Many new WGS Trace file sequences. Modified May 10, 2005
To retrieve trace archive files such as AFSA125350.y1, use a URL of this form:
http://www.ncbi.nlm.nih.gov/Traces/trace.cgi?size=1&cmd=retrieve&s=&m=&retrieve=Search&val=TRACE_NAME%3D%27ATUP646033.g1%27&file=trace&gz=on&fasta=on&scfrcf=scf&dopt=fasta
and enter the accession number in the search window as shown here:
TRACE_NAME='ATUP646033.g1'
The TRACE_NAME= prefix limits the search to the appropriate field. The quotes around the accession number are required.
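The retrieval URL above can be assembled for any trace name with a few lines of Python. This is a minimal sketch: the function name trace_fasta_url is hypothetical, the parameter set is copied from the example URL, and it is an assumption that the server still accepts this form of request.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.ncbi.nlm.nih.gov/Traces/trace.cgi"


def trace_fasta_url(trace_name: str) -> str:
    """Build the trace archive URL for one trace name.

    The parameters and their order are copied from the example URL in
    these notes; nothing here is taken from official API documentation.
    """
    params = [
        ("size", "1"),
        ("cmd", "retrieve"),
        ("s", ""),
        ("m", ""),
        ("retrieve", "Search"),
        # The quotes around the trace name are part of the query.
        ("val", f"TRACE_NAME='{trace_name}'"),
        ("file", "trace"),
        ("gz", "on"),
        ("fasta", "on"),
        ("scfrcf", "scf"),
        ("dopt", "fasta"),
    ]
    # urlencode percent-encodes "=" as %3D and "'" as %27 inside values.
    return BASE_URL + "?" + urlencode(params)


print(trace_fasta_url("ATUP646033.g1"))
```

Calling the function with ATUP646033.g1 reproduces the example URL character for character, so other trace names from this file can be substituted directly.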
CYP2 clan
>AFPZ293133.y1
exon 1 90% to AC150407.1 gene at 44817-45095
ATGN222242.g1,
ATGI214615.g1, AFSA852354.g2, AFSA840251.g2
AFSA264268.g2,
AFSA229077.g2, ATGN243325.b1, ATGI179003.g1
ATUP200634.x2,
Exon 2
ATUP200634.y2 mate pair to ATUP200634.x2
AFSA601420.g2,
AFSA489196.b2, AFSA489196.g2
MESAVSFVSGLLANLTLQSILVLVLAFLVTYWLLGTGDRQKNLPPGPRGLPLLGN
LLSFRPSYLLSNLAAWRDKYGDVFCVRIANRLAVVLNG
()
HKAIQDALVKQPEVFSNRPPPFIDSAKDQG
AFSA125350.y1
walk with this one; it overlaps AFSA489196.g2 on the way to exon 3
>AFSA820329.b2
new exon 3 AFSA896985.b2 ATUP864470.b1 ATUP552587.x1
AFPZ806277.x1
AFPZ776469.x1 AFPZ908625.y1 ATGI225190.b1 AFPZ278931.y1
AFSA277571.g2
Exon 4
ATGN165786.g1 exon 4 ATUP951310.g1 ATUP682533.g1 ATUP194164.y2
ATGI58079.g1
ATWW119241.g1 ATUP547159.g1 AFPZ278931.x4 AFPZ278931.x1
ASWX119777.b2 (joins exon 3 of AFSA820329.b2)
ASWX119777.g2 exon 2 seq with frameshift mate pair of ASWX119777.b2
AFSA254465.b2
exon 2 seq
ATUP194164.x2
mate pair of ATUP194164.y2,
AFPZ526846.y1
HKAIQDALVKQPEVFSNRPPPMLDSAKDQG
GVAMSEYGEDWKVKRRIGLT
ALRQFGMGKRSLEGKITEEARILCDVLAEKNGTATDMSLLLSNAVSNVICAMSF
GERFEHNDMEFQRLMRLMSEMVGGSGGNAGSSISRFIPLVRKLPFFKKGLERRV
KMSLEVVDFIKSKIKEHKETFDPADIRDIIDVYLMETQQQTPDDADRTITEMGMINTMRD
LFIAGAETTATTLKWGLLYLARHLEVQRKVQDEIDREFGASPPTLSQRGKLPYTEATILE
IQRIRPIAPLAVPHTTSTATVLHGFDIPADTFVIPNLWSAMMDPAVAPDPETFNPDRFLD
EDGTVVRPEWLIPFSL
GRRQCLGEQLAKMELFLFLTTLLQHFTFKLPDGAPALSMEGSLGIVLAPKAYQICAVPRDN*
>AFPZ869380.y1
AFSA270651.b2 (4 aa diffs to ATGN165786.g1) exon 4
GRRQCLGEQLAKMELFLFLATLLQHFTFKLPYGAPAPSMEGSMGIVLAPKAYQICAVPRDN*
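Counts such as "4 aa diffs" can be checked by a position-by-position comparison of two equal-length peptides. The sketch below is illustrative only: the helper name aa_differences is hypothetical, and it assumes the two sequences are already aligned without gaps, which holds for the two exon 4 peptides listed above.

```python
def aa_differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, residue in a, residue in b) for each mismatch.

    Assumes an ungapped alignment, so both sequences must have equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length for ungapped comparison")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# Exon 4 peptides copied from the two entries above.
exon4_first = "GRRQCLGEQLAKMELFLFLTTLLQHFTFKLPDGAPALSMEGSLGIVLAPKAYQICAVPRDN*"
exon4_second = "GRRQCLGEQLAKMELFLFLATLLQHFTFKLPYGAPAPSMEGSMGIVLAPKAYQICAVPRDN*"

diffs = aa_differences(exon4_first, exon4_second)
print(len(diffs))  # 4
for pos, a, b in diffs:
    print(pos, a, b)
```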
>ASFW203349.g2
WGS exon 1 94% to AC150407.1 gene at 44817-45095
AFSA277180.g2,
AFSA337814.g2, AFSA337814.b2
MESAVSFVSGLLANLTLQSTLVLVLAFLVTYWLLGAGDRQKNLPPGPRGLPLLGNLLSF
RPSHLLSNLAAWRQQYGDVFCVRIANRLAVVLNG
>AFPZ80615.b2
ASWX143119.b2 ATUP19565.y2 ATUP590430.x1
ATUP871195.g1
AFPZ617150.x1
ATGI45147.b1 AFPZ676420.b2 ATUP848153.y1(poor seq)
New exon 3
seq
Exon 4
ATGI75778.b1 ATWX106000.g1 ATWX78968.b1 ATUP879942.b1
ATUP25175.g1
AFSA516244.g2 AFSA298646.b2 AFPZ676420.b2(joins with exon 3)
ATGI213326.g1
overlaps ATUP871195.g1
exon 3 and goes 800bp upstream
Has exact match to exon 2 from AC150407.1 first seq.
Note: there are two seqs with exact aa seq but one silent nuc diff.
These are probably from different genes.
ATGI213326.g1 AFPZ69924.g2 ASWX145321.g2 ATUP276953.x2 are 100% identical in the exon 2 region
HKAIQDALVKQPEVFSNRPPAVVDSANDQ
GVVMAQYGEGWKVKRRLGLTALRQFGMGKRSLEGKITEEARAVCDILAEK
DGAATDISLLLSNGVSNVICSMSFGERFEYNDTEFQRLMRLMSELVTGSAISR
FNPYVRKLPFIKKGVESRMKMAKEITEFIKAKIKEHKDTFDPADIRDIIDVYLMETQQQI
PDDGDRTITEMGMINTMRDLFIAGAETTATTLKWGLLYLARHLEVQRKVQDEIDQEFGSS
PPTLSQRGKLPYTEATILEIQRIRPIVPLSVPHTTSAATVLHGFDIPANTFVIPNLWSAM
MDPTVAPDPETFNPDRFLDEDGTHVRPEWFIPFSL
(1)
GRRQCLGEQLAKMELFIFLTTLLQHFTFKLPDGAPAPSMDGSLGVVLAPDPYQICAIQRD
>AC150407.1
two linked genes 5000-6800 region and 44000-48000 region
AFSA41261.x1
WGS exon 1 96% to AC150407.1 gene at 44817-45095
ATUP33911.b1,
ASWX26139.b2, ATUP603318.x1, AWXX12138.b1,
AFPZ106727.x1
ATWW205625.g1,
ATWW98645.b1, AFSA387158.g2, AFSA192393.g2, ATGN203140.g1
ATUP879942.g1,
AFSA748020.g2, ATGN123505.b1, ATUP723083.y1, AFPZ103877.x1
APWS99281.g1,
ATUP590430.y1 (mate pair to exon 1 sequence ATUP590430.x1)
ATUP871195.b1
(mate pair to exon 1 sequence ATUP871195.g1)
Note: ATGN123505.b1 overlaps with AFSA940316.g2 to join exons 1 and 2
AC150407.1 is missing exons 1 and 2 (sequence gap)
exon 2 WGS
seqs = APNK3267.b2, AFPZ279007.x4, AFPZ279007.x1, ATUP615432.y1
ATWX107943.g1,
ATGN209312.b1, ATUP704177.b1, AFSA940316.g2
exon 2
from ATUP615432.y1 (overlaps exon 3)
The next 8 are all exon 2 seqs that differ from AFPZ80615.b2 exon 2 at one nucleotide.
AFPZ279007.x4
AFPZ279007.x1 ATWX107943.g1 AFSA940316.g2
APNK3267.b2
ATUP615432.y1 ATGN209312.b1 ATUP704177.b1
Exon 3 WGS
sequences ATUP603318.y1 (mate pair to exon 1 sequence ATUP603318.x1)
ATWW25636.b1,
AWYB3861.g1, ATUP183373.b1, ATUP688580.x1, ATUP664516.g1
ASWX26139.g2
(mate pair to exon 1 sequence ASWX26139.b2)
ATUP239851.g1,
ATWX22588.b1 ATUP615432.y1 ASFW157059.g2 ATWW157810.g1
AFPZ279007.x1
AFPZ279007.x4 ATUP704177.b1
Exon 4
AFPZ919116.b2 ASWX175598.x1 ATGN296415.g1 ASFW157059.g2
MESAVAFASGLLANLTLQSTLVLVLAFLTTYWLLGAGGRQKNLPPGPRGLPLLGNL
LSFRPSRLLSNLAAWRQQYGDVFCVRIANRLTVVLNG
HKAIQDALVKQPEVFSNRPPAVVDSANDQ
5313
GVVMAQYGEGWKVKRRLGLTALRQFGMGKRSLEGKITEEARAVCDILAEKDGAATD
ISLLLSNGVSNVICSMSFGERFEYNDTEFQRLMRLMSELVGGSAISRFNPYVRKL
PFIRKGVESRVKMSMEIVEFIKLKIKEHKETFDPADIRDIIDVYLMETQQQTPDDGDR
TITEMGMINTMRDLFIAGAETTATTLKWGLLYLARHLEVQRKVQDEIDREFGSSPPTLS
QRGKLPYTEATILEIQRIRPIVPLSVPHTTSTATVLHGFDIPANTFVIPNLWSAMMDPTV
APDPETFNPDRFLDEDGTLVRPEWFIPFSL
(1) 6266
6618
GRRQCLGEQLAKMELFIFLTTLLQHFTFKLPDGAPAPSMDGSLGVVLAPNPYQICAVPRDN* 6803
>AC150407.1
two linked genes 5000-6800 region and 44800-48000 region
WGS exon 1
= AFPZ47185.g2,
APWS114129.b1, AFSA346565.b2, APNK34495.b3
WGS exon 2
= ATUP592730.y1
ATWX46513.g1,
ATGN30351.b1, ATGN26143.g1, ATGN270605.g1, ATUP890454.x1
APWS24002.b1,
ATGI65252.g1, ATGN86876.g1, ATWW134155.g1, AFPZ623416.g2
Exon3
ATGI94162.b1 ASWX162986.g2 AFPZ47185.b2
AFPZ855725.x1 AFPZ929344.x1
ASWX162986.b2
ATGI65252.g1
Exon 4
ASWX162986.b2 ATUP592730.x3 ATUP592730.x1 AFSA165640.b2 AFSA319547.b2
ATWW189487.g1
44817
MESVVPFASGLLANLTLQSTLVLVLAFLTTYWLLGAGGRQKKPPPGPRGLPLLGNL
LSFRPSRLLSNLAAWRQQYGDVFCVRIANRLAVVLNG 45095 (2)
46004
HKAIQDALVKQPEVFSDRPSPFRFSDKDQ 46090 (1)
46542
GVVMAQYGESWKVKRRLGLTALRQFGMGKRSLEGKITEEARAVCDILAEKDGA
AMDVSLLLSNGVSNVICSMSFGERFEYNDEEFQRLMRLMSELVSAGGISRFIPLVRKLPF
LNEGSKNRAKMSMEIVEFIKVKIKEHKETFDPADIRDIIDVYLMETQQQTPDDVDR
TITEMGMIGTVRDLFIAGAETTATTLKWGLLYLARHLEVQRKVQDEIDREFGSSPPTLS
QRGKLPYTEATILEIQRIRPIVPLSVPHTTSAATVLHGFDIPANTFVIPNLWSAMMDSAV
APDPETFNPDRFLDEDGTLVRPEWFIPFSL (1) 47495
47838
GRRQCLGEQLAKMELFIFLTTLLQHFTFKLPEGAPAPSMDGSLGVVLAPKPYQICGVPR* 48017
>AC150409.1 Branchiostoma floridae very similar to above clone
45% to 2U1 fugu
AFPZ598379.y1
AFPZ177295.y01 AFPZ177295.y1 ATUP642958.x1
ATGI278297.g1
ATUP598704.y1 ATUP909233.y1
AFSA820152.g2
exon 3
with some frameshifts
97% to
AC150407.1 46000 might be an allele
exon 3,4
ATWW157810.g1
exon 4
ATUP610859.x1 ATWX26371.g1 ATUP919332.x1 ATWW121802.b1 ATGI269677.g1
ASFW189763.g2
ASFW131684.g2 AFSA849352.b2 AFSA173661.g2 AFPZ99561.y1
10750
MESAVAFASGLLANLTLQSTLVLVLAFLTTYWLLGAGGRQKNLPPGPRGLPLLGNLLSFR 10571
10570
PSRLLSNLAAWRQQYGDVFCVRIANRLTVVLNG 10472 (2)
8793
HKAIQDALVKQPEVFSDRPSPFRFSDKDQ 8707 (1)
8256
GVVMAQYGESWKVKRRLGLTALRQFGMGKRSLEGKITEEARAVCDILAEKDGTATDIS 8083
8082
LLLSNGVSNVICSMSFGERFEYNDAEFQRLMRLMSELVSAGGISRFIPLVRKLPFFNEGS 7903
7902
KNRAKMSMEIV 7870
EFIKAKIKEHKETFDPADIRDIIDVYLMETQQQTPDDVDRTITEMGMIGTVRDLFI
AGAETTATTLKWGLLYLARHLEVQRKVQDEIDREFGSSPPTLSQRGKLPYTEATILEIQR
IRPIVPLSVPHTTSAATVLHDFDIPANTFVIPNLWSAMMDPTVAPDPETFNPDRFLDEDGTLV
RPEWFIPFSL
(1)
GRRQCLGEQLAKMELFTFLTTLLQHFTFKLPDGAPAPSMDGSLGVVLAPKPYQICAVPRDN*
>50% to CYP2U1 zebrafish ASFW117295.g2 AFSA812739.g2 ASFW150452.g2 New seq
TTWKGGVFFLPRALKPRRGRPKVREEIAREFASPVPPWSERERLPYTEATIMEIQGIRPIVPLNIFHGN
TSATTLYGYDIPAGTYIIPSLWSAMMDPKVTPEPEEFRPERFLDDEGKVVKPEWFLPFSA
(1)
GRRRCLGEQLAKMELFLFYTSLLQHFTFKLPDGAPAPPMDGSLGFVLSPPAYDICAVPRHSS*
>62% to 2U1 zebrafish
AFSA220461.b2
ATWX77582.b1 APWS173478.g1 new C-term exon
GRRICLGEQLAKMELFLFLTSLLQQFTFKLPEGAPKPDMCGEIGATLLPKPYNIQAISRKK*
>DE198043.1
genomic survey sequence. NEW 1/6/06
Length=653
43% to 2U1 Fugu
VQTTVRAELDRVLMRGESVSAAHRRALPVTEATVMEILRLATPSPLNFRATACDVTLRG 476
YRLPEGTWTLMNCWAVHRDPLQWTEPDTFDFTRFLDREGRVTTPPAWRPFGIGTRS 308
>DE197854.1
genomic survey sequence. NEW 1/6/06
DE000432
Length=702
51% to 2U1
71
SVLHRYIIPKDTIVFAGQWSVHHDPELFPEPDMFDPERFLDDEGNFKNIEYFMPFSM (1) 241
375
GPRSCMGQPLAEVQLFLLFTNLMQNFKLKLPEGAAKPSSEGVMGITLAPKPFDLV 539
>DE017611.1 genomic survey
sequence. NEW 1/6/06
Length=625
46% to 2K11
GVLFAAYGPDWKHQRKFALMTLRDFGVGKRSLEGKIR 373
EEADALIQEVESKNGLPFDIKQMLPNAVSNVICSIAFGNRFEYGDPEFLRLIGLLNAAVE 553
AQPSRDILPNIHPVFRRLPFGS 619
>DE195161.1
genomic survey sequence. NEW 1/6/06
67% to
2N11
LFLAGTDTTSTTLRWALLYMILHPDIQEKVQQEIDSVLGPNQEPEMAHR 322
>DE189345.1
genomic survey sequence. NEW 1/6/06
61% to
2N11
LFLAGTDTTATTLHWAVLFMILHPDIQQKVQQEIDSVLGPNQDPSMEHR
>DE013036.1 genomic survey
sequence. NEW 1/6/06
Length=592
44% to 2Z2
VIYDLFFTGAETSSTCLRWAVFLMAVYPDVQARIHREVDTVLGSDGEVTLDKRAALPFL 386
DATISEVYYLNS 350
>DE012415.1 genomic survey
sequence. NEW 1/6/06
58%
to 2N11
AGTDTTATTLHWAVLFMILHPDIQQKVQQEIDSVLGPNQDPSMEHR
>CF918864
BI377274 Amphioxus 5-6 hrs cDNA 45% to 2U1 fugu
RVRRDATVSLAHRPEMPYTDAFLHEVLRIRPPGPLSVPHMAGPGATLNGYEIPQNTQVYA
NLWSLHMDPEYWPEPERFDPTRFIGPDGKVLPNPPSYAPFSLGRRACPGKQLAKSEAFLF
LVTMVQRFSFKLPEGAPVPPMDGVMGFSLAAQPHSLCAISRN*
>CF918826
BI383662 Amphioxus 5-6 hrs cDNA 51% to 2U1 fugu
AAESGTRPDYIIPQDAMIFVNLWSVHMDPQLFPDPNTFRPERFLDQDGNFVKQAVIPFGI
GPRVCLGEQLAKMEVFMLFVSLMQRFTFHLPEGAPEPSMLGKLASAINVPCPFELCAVAR*
>BI387982
Amphioxus 26hr cDNA library 48% to 2N2 zebrafish
NGKPVPKPAALMPFSA
GRRACPGEAVXKADTFLLLGGLVQNFRFSIPEGEGPPDLTPDDKTGGDTCIPYPYKVVMSCRKCML*
>BI388387.1
C-helix to mid
EGANYSDGCXGVIFAPYGSFWKEQRKFTLMSLRDFGFGNRSIYGKIVEESQVLQSVIAKF
DGQPFSTHRLLHNAVANVTCNILFGDRWEYDDPLFQRMMDALNYMVSTNVFAVPQNFIPF
TRYIPGWAGRLEPWLKKFLSIMGYLREELDKHKVIFDPTDLRDFINTYLLEIQNQ
>BI387848
Amphioxus 26hr cDNA library
52% to 2U1
FUGU, 50% to 2U1 mouse 75% to BI377261
AFSA235046.g2 ATGI91479.g1 ATUP593811.y1 AFSA108094.b2
RHASDLLLDGTETTGNTLLWALLYMTQNPTIQHK
(0)
VQQELDAVVGESQPTLSHRSQLPYVNACLLETMRIRTLVPLAVPHATTQDVTIQEFDIPQGTQ
(0)
VLPNLYSLHMDPTYWPDPDRFDPERFLDAEGNVINKPQSFMPFGG (1)
GRVCLGEQLARMELFLFFSTLLQSFTFKTPEGAPPPKTDGGLGITWTP
>AFPZ7602.y1
ATGI55268.b1
VILNLYSLHVDPTYWPDPERFDPERFLDAEGNVINKPESFMPFAG
(1)
>ASWX176511.y1 87% to BI377261
ATUP829661.g2
AFPZ642936.g2 ATUP921353.y1 ATWW61130.b1
VLTNLHSLHMDPAYWPDPDRFDPERFLDAEGKVINKPKSFMPFSG
(1)
>ATUP598105.y1 ATGN136393.b1 ATUP193767.y2 ATGI126577.b1
ATWW83807.g1
VHEELDAVVGESLPTLSHRSLLPYVNACLQEVMRIRPVGPLAIPHATTEAVRVRGYDIPKRTQ
(0)
VLLNLYSLHMDPAYWPDPDRCDPERFLDAEGNVINKPESFMPFGG
(1)
GRVCLGEQLARIELFLFFSTLLQSFTFKTPEGAPPPNADGILGLTLAPHPFQLCAIPR*
>ATUP937768.y3 ATUP937768.y1 ATUP905825.y1
ATWW1274.b2
AFSA664077.g2
VLFNLYSLHMDPAYWPDPDRFDPERFLDAEGNVINKPESFMPFGG
(1)
GRRVCLGEQLARMELFLFFSTLLQSFTFKTPEGAPPPNTDGIFRLTLKP
HPFQLCAIRR*
>AFSA690405.b2
exon 1 and part of exon 2 89% to AFSA636542.b2
walk to ATWW106344.b1 APWS97989.g1
note: this is probably a poor version of APWS97989.g1
downstream the sequences are the same
MAILFSWIVESVLEILQISGLTLQTILVFCVPFLLACTF*KRPRNLPXYPAGRVPVLGH
849
LLALGRAPHLKLTXWRRQYGDVFTVRMGMEDVVVLNGYTAVRDALVDRSELFASRPPNYL
669
FDLTVGFGE
()
DIVTARWGSQFX
QRRRL
>ATUP47463.b1
exon 1,2 87% to AFSA636542.b2
MAAVVSWISESVQEIPQISGLTLQTCLVFXAAFLLTCALXRRPRNLPPYPAGHVPVLG
791
HLLALGRAPLLKLTAWRRQYGDVFTVRMGMEDVVVLNGYTAVKDALVDRSELFASRPPNY
611
LFDSSVGFGK
()
DIGAARWGTGLKQRRRFATAALKHLGMKVGTGSVEDNIRQEASCLRKR
(0)
>ATGI68302.b1
exon 1 82% to ASWX66916.b2
MAVIVSWIAELVWEIFQISGLTIQTFLVFCVVFLLAYVLLKRHKNLPPYPAGRVPVLGHL
326
LALGREPPLKLTAWRRQYGDVFTVRMGMEDVVVLNGYTAVKDALVDRSELFASRPPNYLL
506
DAIVGCGK
()
>AFPZ28428.y1
exon 1 79% to AFSA636542.b2
MATAVFRWIIQSVQDTLQIYGLNLQSLLVFCTAFVLACALLKRSPNLPPYPAGRVPVLG
304
HLLALGRAPHLKLTAWRRQYGDVFTVRMGMEDAVVLNGYTAVKDALVDRSELFASRPPNY
484
LFDLTVDSGK
()
>ASWX66916.b2
exon 1 89% to AFSA636542.b2
walked to AWXX13027.b1 ATUP266482.b1
mate pair = ASWX66916.g2 exon 3
AFSA35511.g2 exons 2,3 ATGN165304.g1 ASFW57081.b2
walked upstream to ATUP266482.b1 which = ASWX66916.b2 join seqs.
MAAVVSWIAESVLEILPMSGPTLQTFLVFCVAFLLTWALLRRPRNLPPYPAGRVPVLG
489
HLLALGRAPHLQLTAWRRQYGDVFTVRMGMEDVVVLNGYTAVKDALVDRSELFASRPPNY
669
LFDSSVGFGK
()
DIGAARWGTELKQRRRFATAALKHLGMKVGTGSVEDNIRQEASCLRNR (0)
IAEYHGQPFGISNDMKVAVANVICSMAFGRRYGYEDETFRELSEAIRNLLAEIGSGQFISVFPLLRFVPG
>ATWX43498.b1
exons 2,3 very similar to AFSA35511.g2
walked to
ATUP237571.b1 no obvious exon kept walking to ASWX77262.g2 = exon 4
ASFW107932.b3
exon 4 walked to AFPZ187653.x1 exon 4,5 ASWX45971.b2
DIGAAPLGDRVEAEKRFATAALKHLGMKVGTGSVEDNIRQEASCLRKR
(0)
IAEYHGQPFAISNDMKVAVANVICSMAFGRRYGYEDETFRELSEAIRNLLAEIGSGQFISVFPLLRFVPG
()
ACKEVLKHLSKIHEVLWDEIARHRENFDRENPRDFLDFCLLELEQREK
VEGLTEENVLYMAQNLFLAGTDTTANTLLWSLLYMTLNPDIQNK
(0)
VHEELDA
>AFPZ601018.b2 ATGN133651.b1 ATGN143242.b1
walked to ATUP71680.y1 (exon 5)
walked to ATUP705359.g2 (exon 4)
walked to ATGI77993.b1 (exon 3)
walked to AFPZ24940.g2 (exon 2)
walked from exon 7 downstream to AFSA524984.b2 to try to find a mate pair to exon 1; this did not work
tried finding more exon 3,4 hits to look for more mate pairs
ATUP206044.x2 mate pair = ATUP206044.y2 = exon 1
MTGAVQWIADSVQEILQISELTLQTFLVLCSTFLLACVVFNRSRSRNL
PPYPAGRVPVLGHLLALGRAPLLKLTAWRRQYGDVFTVRMGMEDVVVLNGYTAVQDALVD
RSELFASRSPFYYLFDALFAFGK
(1)
DIISARWGSGFRQKKRFATTVLKNLGMRVGRGSIEDSIREEASCLRNR
(0)
IAENNGQPFDIAHDVAVAVANIICSMAFGKRYDYEDETFRELTKAIATISIELGAGHIT
SVFPLLRFVPV (1)
VLYNHSHLYATVNRPIIKALEASSKVKNVMREEISRHREHLDRENPRDFLDLCLLELEQQE
KVEGLTEENVFHMAQDLFLGGTDTTANTLTWSLLYMTLNPDVQNK
(0)
VHEELDAVVGESLPALSHRSQLPYVNACLLETMRIRTIVPLASHATTQEVKVQGYDIPKGTQ
(0)
LMLTSPHMDPANWPDPDPFDPERFLDAEGNVIKKPESFMPFSG
(1)
GRRVCLGEQLARMELFLFFSTLLQSFTFKTPVGAPPPNTDGIPGLTFMPHPFQLLAIER*
>APWS97989.g1
exon 1 93% to AFSA636542.b2
ATUP196459.x2
AFSA321451.g2 ASFW202410.g2
Walked upstream to ATGI153668.g1 AFPZ313895.x1 mate pair AFPZ313895.y1 to try to find a mate pair in the C-term part
AFPZ313895.x1 mate pair AFPZ313895.y1 = AFSA636542.b2 seq exon 3
These two seqs are 95% identical
APWS97989.g1
exon 4 end of exon 4 = BI377261 join seqs.
BI377261
Amphioxus 5-6 hrs cDNA 49% to 2U1 fugu 75% to BI387848
AFPZ459499.y1
ATUP541153.g1 AFSA636542.b2
MAIIVSWIVESVLEILQISGLTLQTILVFCVAFLLACTFWKRPRNLPPYPAGRVPVLGH
585
LLALGRAPHLKLTAWRRQYGDVFTVRMGMEDVVVLNGYTAVRDALVDRSELFASRPPNYL
405
FDLTVGFGE
()
Missing
exon 2
VAEYEGKPIDIAHGINVAVANVICSMTFGKRYDYEDETFRELSEAVVTIMSELGAGQIIS
VFPLLRFVPG (1)
ASYSVSAQLAKIQKVLREEMSRHREHLDRENPRDFLDFCLLELEQQEKVAGLTEENVLYMAQ
NLFFAGTDTTTNTLRWSLLYMALNPDIQKK
(0)
VQEELDAIVGESLPTLSHRSQLPYVNACLLETMRIRHIGPLAVPHATTDTVKVKEYDIAKGTQ (0)
VLPNLHSLHMDPAYWPDPERFDPERFLDAEGNVINKPESFMPFSG
(1)
GRRVCLGEQLARMELFLFFSTLLQSFTFKTPEGAPPPSTDGVF
GVTLTPHPFQLCAIPR*
>AFSA636542.b2
ATUP541153.g1 ATUP933964.y1
ATUP933964.x1 ATUP738986.y1
ATGI10244.g1
ATGN171873.g1 ATUP926693.b1 (exon
6) AFSA482736.g2 (exon 6)
AFSA726698.g2
(exon 6)
34% to 2N1
35% to 2D4
MAVIVSWIVESVLEILQISGLTLQTILVFCVAFLIACTFLLK
RPRNLPPFPAGRVPVLGHLLALGRAPHLTLTAWRRQYGDVFTVRMGMEDVVVLNGYTAV
KDALVDMSELFASRPPNYLFDLTVGFGE (1)
DIVTARWGSKFRQRRRFATTALRNLGMKVGTGSIEEKIREEAIRLRNR
(0)
VAEYEGKPIDIAHGINVAVANVICSMTFGKRYDYEDETFRELSEAVVTI
MSELGAGQIISVFPLLRFVPG
(1)
ASYSVSGQLAKIQKVLREEMSRHREHLDHENPRDFL
DFCLLELELQEKVAGLTEENVLYMTQNLFFGGTDTTTNTLLWSLLYMILNPDIQKK (0)
AQEELDAVVGESLPTLSHRSQLHYVNACLLEVMRIRH
IGPLAVPHATTDTVKVKEYDIAKGTQ
(0)
VLPNLHSLHMDPAYWPDPDRFDPVRFLDAE
GNVINKPESFMPFSG (1)
GRRVCLGEQLARMELVLFFSTLLQSFTFKTPEGAPPPSTDGIFGITLTPHPFQLCAIPR
>exon 1
ATGN171873.g1 APWS102929.b1 ATGI10244.g1
ATGGCTGTAATTGTCAGCTGGATAGTTGAGTC
CGTCCTGGAGATTTTGCAGATCTCCGGGCTGACTCTGCAAACAATTCTCGTCTTCTGTGT
GGCCTTCCTCATTGCGTGCACGTTCTTGTTAAAGCGCCCCAGGAACCTGCCACCTTTCCC
GGCAGGACGCGTGCCTGTTCTCGGGCACCTCCTCGCCTTGGGCCGAGCGCCTCACCTCAC
GCTGACGGCGTGGAGGCGGCAGTACGGGGACGTCTTCACCGTCAGGATGGGGATGGAAGA
TGTGGTGGTTCTGAACGGCTACACTGCCGTCAAGGATGCGCTCGTGGACATGTCCGAGCT
GTTCGCGTCCAGGCCGCCAAACTACCTGTTCGATTTGACAGTTGGATTCGGAGAAGGT (1)
>DIVT
exon 2 ATGI10244.g1
AGACATTGTTACTGCACGTTGGG
GGAGCAAGTTCAGACAGAGACGGAGGTTTGCTACCACGGCGTTAAGGAACCTCGGCATGA
AGGTCGGCACTGGCAGCATTGAAGAGAAAATCCGAGAGGAAGCTATACGTCTCCGCAACA
GGGT
>VAE
exon 3 ATUP738986.y1
AGGTTGCAGAATACGAGGGAAAAC
CTATTGATATCGCCCATGGTATCAACGTGGCGGTCGCGAACGTCATCTGCTCCATGACGT
TCGGAAAGCGCTACGACTACGAGGATGAAACGTTCCGGGAGCTCTCTGAGGCGGTTGTGA
CAATAATGTCTGAGCTTGGAGCGGGGCAGATTATCAGTGTCTTCCCCCTGTTACGGTTTG
TTCCAGGAGGT
>ASYS
exon 4
AGCCAGCTACAGTGTATCTGGACAACTGGCGAAGATCCAAAAGGT
GTTGAGGGAAGAAATGTCTCGCCATCGAGAACACCTGGATCACGAGAACCCACGAGACTT
CCTCGACTTCTGCCTGCTGGAGCTGGAACTGCAGGAAAAGGTGGCTGGTCTGACGGAAGA
GAACGTCCTGTATATGACACAGAACCTTTTCTTCGGTGGAACAGACACGACCACCAACAC
ATTGCTGTGGAGTCTACTCTACATGATTTTGAACCCAGACATCCAAAAGAAGGT
>AQEEL
exon 5
AGGCACAAGAGGAGCTTGATGCCGTTGTTGGTGAGAGTCTGCCCACCCTGTCCC
ACCGTTCCCAGCTGCACTACGTGAACGCCTGCCTGTTGGAGGTCATGAGGATCCGCCATA
TCGGGCCTCTTGCCGTTCCCCACGCCACCACAGACACGGTCAAAGTGAAGGAGTACGACA
TCGCTAAGGGAACCCAGGT
>VLP
exon 6 AFSA726698.g2 ATUP926693.b1 AFSA482736.g2
ATUP933964.y1
AGGTACTACCGAA
CTTGCACTCCCTCCACATGGACCCCGNCTACTGGCTTGATCCGGACC
GTTTTGACCCCGTAAGATTCCTGGACGCGGAA
GGGAACGTCATCAACAAGCCTGAGTCCTTCATGCCTTTTTCTGGAGGT
>GRR
exon 7 ATUP933964.x1
AGGCCGACGTGTGTGTCTTGGTGAGCAGCTGGCCAGGATGGAACTTGTCCTG
TTCTTCTCGACTCTACTGCAGTCCTTCACCTTCAAGACGCCAGAGGGCGCCCCTC
CTCCAAGCACTGACGGCATCTTTGGGATAACATTGACACCGCATCCGTTCCAGCTTTGTG
CAATACCACGTTAG
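The exon sequences listed above appear to carry two flanking intron bases at each internal boundary (a leading AG acceptor and a trailing GT donor), with the number in parentheses giving the intron phase. A minimal sketch of how such exons could be joined and translated follows. The function names are hypothetical, the flanking-base convention is an inference from the sequences in this entry rather than a stated rule, and the standard genetic code is assumed.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate in frame 1; codons with ambiguous bases (e.g. N) become X."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def strip_splice_sites(exon: str, first: bool, last: bool) -> str:
    """Remove the assumed flanking intron bases: leading AG and trailing GT."""
    seq = exon.upper()
    if not first:
        if not seq.startswith("AG"):
            raise ValueError("expected leading AG acceptor bases")
        seq = seq[2:]
    if not last:
        if not seq.endswith("GT"):
            raise ValueError("expected trailing GT donor bases")
        seq = seq[:-2]
    return seq


def join_and_translate(exons: list[str]) -> str:
    """Concatenate exons in order after trimming splice sites, then translate."""
    n = len(exons)
    cds = "".join(
        strip_splice_sites(e, first=(i == 0), last=(i == n - 1))
        for i, e in enumerate(exons)
    )
    return translate(cds)


# Start of exon 1, copied from the entry above.
print(translate("ATGGCTGTAATTGTCAGCTGG"))  # MAVIVSW

# End of exon 1 (phase 1) joined to the start of exon 2.
print(join_and_translate(["TTCGGAGAAGGT", "AGACATTGTTACT"]))  # FGEDIVT
```

In the second call the final G of exon 1 combines with the first two bases of exon 2 to give GAC (D), which is what a phase 1 intron implies and matches the GFGE (1) DIVT junction in the peptide listing.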
Other closely related exons
>ATUP699472.x1
exons 6,7
VLLNVYSLHMDPAYWLDPDRFDPERFLDAEGKVINKPESFLPFGG
(1)
GGRVCLGEQLARMELFLFFTTLLQSFTFKPPEGASPPNADGILGLTLAPHPFQLSAIPR*
>AFPZ728456.y1
exons 5,6
VHEELDAVVGESLPTLSHRSQLPYVNACLQEVMRIRPVGPLAIPHATTEAVKVRGYDIPKRTQ
(0)
VLLNLYSLHMDPAYWPDPDRFDPERFLDAEGKVINKPDSFLPFGG
(1)
>AFPZ476483.b2
exons 5,6
VHEELDAVVGESLPTLSHRSQLPYVNAC
LQEVMRIRPVGPLAIPHATTEAVKVRGYDIPKRTQ
(0)
VLLNLYSLHMDPAYWPDPDGFDPEXFLDAEGKVXHKPES
>exons
5,6
AFSA16336.x4 AFSA16336.x1 AFPZ506410.x1 APNK80508.g2 ASWX68286.g3
ATUP343092.y1
ATGN182700.g1 ATUP756295.y1 AFPZ866552.y1 ATUP443435.g1
ASFW36405.b2 AFSA625448.b2 AFSA427303.b2
AFSA716480.g2 AFPZ471003.x1
VQQELDAVMGASLPSLSHRSKLPYVNACLMETMRIRTLLSVILHATAQEVKVQGYDIPKGTR
(0)
VLMNMHSLHMDPAYWPDPDRFDPERFLDAEGNVINKLPSFMPFSG (1)
AGGTACAGCAGGAGCTTGATGCC
GTTATGGGCGCGAGTCTGCCCAGCCTGTCCCACCGCTCCAAGCTGCCCTACGTGAACGCC
TGCCTGATGGAGACCATGCGGATCCGCACTCTTCTGTCTGTCATCCTTCACGCCACCGCG
CAGGAGGTCAAAGTGCAGGGATACGACATTCCTAAGGGAACTCGGGT
AGGTGTTGATGAACATGC
ACTCCCTCCACATGGACCCCGCCTACTGGCCTGACCCGGACCGGTTTGACCCCGAAAGGT
TTCTGGACGCGGAAGGGAACGTCATCAACAAACTTCCATCCTTCATGCCTTTTTCAGGAGGT
>ATGI42736.b1 ATGN217089.g1 ATUP49594.g2 AFSA786188.b2
AFSA126109.g2 AFPZ657783.y1 ATUP738387.x1
AFPZ495923.y1 dup. exon 5 (pseudogene) exon 6 and
part of 7
VHEELDAVVGASLPALSDRSQLL
YVNACLLETMRIRTLVPVSLPH
VQQELDAVVGASLPALSHRSQLPYVNACLMETMRIRTLLSVILHATAQEVKVQGYDISKGTR
(0)
VLMNMHSLHMDPAYWPDPDRFDPERFLDAEGNVINKLPAFMPFSG (1)
GHRVCLGEQLARMELFLFFSTLLQSFTIKTPEGAPPPNTDGIFGLALKPHPFQLCAIPR*
AGGTGTTGATGAACATGC
ACTCCCTCCACATGGACCCCGCCTACTGGCCTGACCCGGACCGGTTTGACCCCGAAAGGT
TTCTGGACGCGGAAGGGAACGTCATCAACAAACTTCCAGCCTTCATGCCTTTTTCAGGAGGT
>AFSA241515.g2 AFPZ140710.y1 APWS45577.b1
ASWX65492.b2
ATUP320554.x1 ATGN323264.g1 ATGN284296.b1
ATGI170827.b1
ATUP12995.x2 ATUP716729.x1 AFSA152443.b2
VLMNMYSLHMDPVYWPDPDRFDPERFLDAEGNVINKPESFMPFGG
(1)
GRRVCLGEQLARMELFLFFSTLLQSFHFKTPEGAPAPCADGIFRMTVTPHPFELCAIPV*
>AFSA83521.b2
VLMNMYSIHMDPVYWPDPDRFDPERFLDAEGNVINKPESFMPFGG
(1)
GRRVCLGEQLARMELFLFFSDLLQSFTFKTPEGAPAPCADGIFPMTLTPXPFELCAIPR*
>APWS92234.g2 ATWX24634.b1 ATGN357284.b1
ATUP895861.x1 ATGI42736.g1
ATGI104268.g2 ATWW86466.b1 ATWW117588.b1
ATUP559927.b1 AFPZ509619.y1
ATGN267193.b1
IHEELDAVVGESLPALSHRPQLPYVNACLLETLRIRTLV
XXXXHATTQDVKVQQFDIPKGTQ
(0)
VLPNLHSLHTDPAYWPDPDRFDPERFLDAEGNVINKPESFMPFSG
(1)
>ATUP710771.b1
1 aa diff to APWS92234.g2
AFSA525510.b2
VLPNLHSLHTDPAYWPDPDRFDPERFLDAEGNVINKPESFMPFGG (1)
>AFPZ185379.x1
ATGI56471.b1 ATGN311806.g1 ATUP672818.b1 AFSA909926.b2
AFSA523356.b2 AFSA330395.g2 AFSA330395.b2
ATGI93799.g1
APWS110835.b1 ASWX76430.g2
GTQCKLHACRSTLEDPLEQQAKLSSLTEENVLHMAGDLFLAGTETTTNTLQWSLLYMTLNPDIQNK
(0)
VQEELDAVVAESLPTLSHRSQLPYVNACLLEVMRIRTLIPAVRHVTTQEVKVQEYHISMGTW
(0)
VLANLHSLHTDPAYWPDPDRFDPERFLDAEGNVINNPKSFMPFGG
(1)
GRRACLGEQLARMELFLFFSTLLQSFTFTTPEGALPPNTDGVFGLTLVPHPFQLCATPR*
>ATUP912010.y1
ATGI128166.b1 ASFW164761.b2 AFSA840082.g2
AFSA174046.g2 AFSA315286.g2 AFPZ159110.y1
ATWW201417.g1
AFSA778163.g2
VLVNLHSLHMDPVYWPDPDRFDPERFLDAEGNVVNKPQSFMPFAG
>ATUP680104.g1
VILNLHSVHMDPAFWPDPDRFDPDRFLDAEGNFINKPESFMPFSA
(1)
>ATWW125683.g1
VHEELDAVVGASLPTLAHRSQLPYVNAFLMEVMRIRYVGPLGVPHATTAAVKVQEYDIPEGTQ
(0)
IILNLHSVHMDPAFWPDPDRFDPDRFLDAEGNFINKPESFIPFSA
>APWS102434.g1
ATWW177217.g1
KVQGYDIPKGXX
VLMNLYSLHMDPAYWPDPDRFDPERFLDAEGNLINKPESFMPFG
(1)
>ATWW233361.g1
ATUP551452.y1
VHEELDAVVGESLPALSHRSQLPYVNACLMEIMRIRYVGPLSVPHATTAPVKVQEYDIPKGTQ
(0)
VIVNLHSLHVDPAYWPDPDRFDPDRFLDAEGNFINKPESFMPFS
>AFPZ859823.x1
AWYB2850.g1 ATWW63772.b1 AFPZ870007.y4 AFPZ870007.y1
mate pair
AFPZ859823.y1 = exon 7 ATUP557464.y1
ATUP557464.y1
ASFW50972.b2 AFSA305932.b2
AFPZ122560.y1 ATUP820771.x1
Almost 51% to CYP2U1 human
VWTKIQFSNIPLLITIVSGKLVTRFLFPVLFLPLVNR ???? uncertain
AFMEVLKQNSRVHEVLWDEIARHRETFDSENPRDFIDFCLLELEQQE
KVDGLTEENVMYMAQDLFFAGTETATNTLLWSLLYMTLNPGVQQK
(0)
VHEELDTVVGASLPTLSHRSRLPYANACLMETMRIRHIAPLIIPHATTDTVRVQEYDIPEGTQ
(0)
VLMNMYSLHMDPAYWPDPDRFDPERFLDAEGNVINKPESFMPFGG
(1)
GRRVCLGEQLARMELFLFFSTLLQSFSFKTPEGAPAPCADGIFRMTLTPHPFELCAIPR*
>ATWW225973.b1 CYP2U like ATUP29908.b1 ATUP481728.g1
APNK56784.b2
AFSA784029.g2
ATUP411423.g1
FLDSDGKVVTRPESFMPFST
(1)
GRRVCLGEQLAKMELFLLFSSLLKHFTLKLPEGAAAPSTDGIMGFFYVPPKVNMCITKR*
>ATGI187647.g1
MWLMTITVGLVTLILVKWLKDYVQRWRMPPGPFFWPVIGNLSCKYRGS (0)
>ATGI151113.b1 ATWW15542.g1
SYLTFIDLAKTYGDVFSLKMGMTDVVVLNSLDAVKEAFVKKGEDFAGRPKMT (1)
>AFPZ866519.x1 66% to C-helix of CYP17A2
TDISSEGGKDIAFADYSPTWKLHRKLFHSAIR
(2)
>ATGI157309.b1
ATGN157240.b1 ATUP362994.b1 AFPZ866519.x1
GYASAQNLQSKVHESLEDTIAVFSKMEGQAVDLEDYIYQLVYNVICSAAFGTR
(2)
>AFPZ866519.y1
YNMDDEDFDTLMKISKDTTETFGQGLLADVYPVLRFLPSS
(1)
>AFPZ295620.x1
SVTANRKMTHQLMEIMQRHLEQHRESFDP
(1)
IPLNEYQCTLLQITSVTSQITMIKAQKDAEEEGIQDIDSLTDTHLRQLIGDISF
(1)
>ASWX154218.b2 I-helix to
EXXR region ATGN97768.g1 AFSA255326.b2
47% to
CYP1A7
AGTISTILTLRWAILYLAVHPEIQEKVAAELDSVVGRDRLPELSDREATPYTEAIFHEVMRMASMDPV
SLPHATTVDTTLS
()
GYQIPKGTWILPNLWALHHDPDTWGDPDVFRP
>AFPZ295620.y1
ASWX154218.g2 (very
end + downstream seq)
DVFRPERFLDESGKPIPKPAALMPFG
(2)
VGRRACPGEALGKADTFLLLGGLVQNFRFSIPEGEGPPDLTPDEIGQ
GSISIPYPYNVVMTCRK*
35% to Xenopus CYP17 and 36% to CYP1A6 and CYP1A7
MWLMTITVGLVTLILVKWLKDYVQRWRMPPGPFFWPVIGNLSCKYRGS (0)
SYLTFIDLAKTYGDVFSLKMGMTDVVVLNSLDAVKEAFVKKGEDFAGRPKMT (1)
TDISSEGGKDIAFADYSPTWKLHRKLFHSAIR
(2)
GYASAQNLQSKVHESLEDTIAVFSKMEGQAVDLEDYIYQLVYNVICSAAFGTR
(2)
YNMDDEDFDTLMKISKDTTETFGQGLLADVYPVLRFLPSS (1)
SVTANRKMTHQLMEIMQRHLEQHRESFDP
(1)
IPLNEYQCTLLQITSVTSQITMIKAQKDAEEEGIQDIDSLTDTHLRQLIGDISF
(1)
AGTISTILTLRWAILYLAVHPEIQEKVAAELDSVVGRDRLPELSDREAT
PYTEAIFHEVMRMASMDPVSLPHATTVDTTLS
()
GYQIPKGTWILPNLWALHHDPDTWGDPDVFRPERFLDESGKPIPKPAALMPFG
(2)
VGRRACPGEALGKADTFLLLGGLVQNFRFSIPEGEGPPDLTPDEIGQGSISIPYPYNVVMTCRK*
>DE040433.1 Amphioxus genomic survey sequence. No introns, NEW 1/6/06
41% to 1B1 Danio, 40% to CYP1C1 fugu, 39% to 1A1 human, 39% to Xenopus 1A6, 1A7
trace file 630869645 632546376 539391436
MAAVATAALFGLSYLQVVLIAVLLVLVAAVVASSLRQNTPSLPPGPWGF
PVVGIFPALGSRPHHAFSRMAEKYGDVFRVKFGSRT
VIILNGIDMVKDACVKQSACFAGRPALYSFKQVKNGITFKTYSPSWVARKKVTVGALKGF
VNGRVGALTASAETMITEEAQELARVFLSKSGQPSNPEEYAHTAVANVVCALCFGKRYEH
GDQEFRQLLRNTEKFRQAIGAGNPADFMPWLRFFPNKNMKLFKEAMESSTQLFDKHINAH
LQTYDPSVIRDIADALIYNMRENKEAGLTDEFVLECVIDIFGAGQDTTSQMLHWAFLY
MLVFPDVQARVQREIDGVVGRERAPTLADEASLPYTVAVIQEIVRHTGVVPMSIPHLTTK
DTQLHGYTLPKDTIVFANLFSVGHDRRIWGDPSSFRPERFLDPSGTTLDPAAVEKNLPFS
AGKRRCPGEHLAKQEMFLFFSILLQQCSFERVNGTASPTLEGTFGLVMRPQPYSMIVRPR*
>gi|62381799|gb|DN791732.1| 90857715 Sea Urchin primary mesenchyme cell cDNA library Strongylocentrotus purpuratus cDNA clone PMCSPR2-126F11 5', mRNA sequence.
Length = 983
Score = 259 bits (662), Expect = 5e-68
Identities = 124/306 (40%), Positives = 189/306 (61%)
Frame = +3
Query: 172 LVYNVICSAAFGTRYNMDDEDFDTLMKISKDTTETFGQGLLADVYPVLRFLPSSSVTANR 231
++YNV+ FG Y ++D + M ++ D + G GL AD++ +++P+S +
Sbjct: 84 IMYNVLAHLCFGLSYELEDPNVTQWMDVNNDVNDKLGLGLAADIFSWAKYIPTSGPRMIK 263
Query: 232 KMTHQLMEIMQRHLEQHRESFDPIPLNEYQCTLLQITSVTSQITMIKAQKDAEEEGIQDI 291
++T + ++ +++ RE +DP +N++ LL KAQ+DA +EG +++
Sbjct: 264 EITETMFGFLRSQVDEAREHYDPENINDFYSLLL------------KAQEDARKEG-ENV 404
Query: 292 DSLTDTHLRQLIGDISFAGTISTILTLRWAILYLAVHPEIQEKVAAELDSVVGRDRLPEL 351
D LTDTH+ Q + DI AG +T+ TL WA+ L +PEIQ K+ AE+D V+GRDRLP +
Sbjct: 405 DKLTDTHIFQTVADIFGAGIQTTVETLYWAMALLVTYPEIQAKIRAEIDDVIGRDRLPTI 584
Query: 352 SDREATPYTEAIFHEVMRMASMDPVSLPHATTVDTTLSGYQIPKGTWILPNLWALHHDPD 411
+DR PYTEA +EV+R +S+ P+++PHAT+ DT GY IPKGT ++ N ++H+DP
Sbjct: 585 NDRGNLPYTEASLYEVLRYSSIAPIAVPHATSRDTEFGGYHIPKGTTVMINTHSMHYDPQ 764
Query: 412 TWGDPDVFRPERFLDESGKPIPKPAALMPFGVGRRACPGEALGKADTFLLLGGLVQNFRF 471
W PD F PE FLD+ G P + +PFG GRR C GEA+ KAD FL+ G +QN+ F
Sbjct: 765 EWDQPDKFLPEHFLDDGGTIREHPPSFLPFGAGRRGCLGEAVAKADLFLIFXGFLQNYTF 944
Query: 472 SIPEGE 477
S G+
Sbjct: 945 SKAPGK 962
>BI385897
Amphioxus 26hr cDNA library 57% to 3A65 zebrafish, 62% to 3A49 fugu
RFFSTRVREVNGLHIPAGMIVNIPVYAIHYDADLWPEPEKFKPERFTKEEKESRDPYAYL
PFGSGPRNCVGMRLAQLELKFALAKMLQKFRFVTCDKTDIPVRLQNTLGNQIEGGLFLKV
EART*
>AU234604
Amphioxus Notochord cDNA Branchiostoma belcheri (two parts)
42% to 5A1
fugu
HEGKGVGKYIGRTPHLQISDPEMLREIFVKQFHKFANRAPEGMALDVKPQSRMLTQLVDE
DWKNVRSTISPAFSGGKLKQMTEAINSCADLLVGNIGKFGEKGESFDTKELTGAFTTD
(seq gap)
44% to 5A1 47% to 3A49
IPKQMMILIPVLGIHYDPERWPEPYKFIPERFTKEEKEKRDPFDWLPFGAGPRNCIGIRLAM
MELXGGLARVLMK
TGPXTDIPLKXMKNKQXPTPENGIRLXAELXHPGXD*
>AC150395.1 137000-149000 region - strand in HTGS; first exon is a best guess
no matches to other P450s or ESTs can yet identify this N-terminal exon.
149120
MLPPNLSQEGCINDQTHRVSS (2) 149058 (possible exon 1)
148997
MFSDIPFFYDRAHIISIWYNRKL (2) 148929 (possible exon 1)
148908
MKDGRLAFRTFFCKQSLHTLSKNDK (2) 148834 (possible exon 1)
148551
YATWPYNTFKKLGIPGPPPLPLIGNLIDYKK 148459
147339
GLSNIDLEWMKKYGKYWG 147286
146527
VYEGQLPVLIVADTKLIKQINVKEFPNFANRR 146432
145813 LMPGNGPVMKYSLTVLQDAEWKRVRSYMSPFNSAYSLKQ
145697
QLCYLIENTSDNLVAAMKRYHDAGQYVDVKE
144646
IFGCYTMDVISSTGFGTDVNSLSDPDSIFIKNVKKFYAIGALSPFTLLT 144500 (1)
143785
FGFPWFAFFLDRNNWFFNIVPPPVFNFFADAIRKVISIRESNPAESD (0) 143639
143212
KRVDVMQLLLKSHNTALDEPGNEGNIKH 143129 (1)
142553
GLSYNEILANGFIFWIGGYDTTATTISFLAYNLALNPDIQERVIAEIDEIMRGRVGHIFEYLR (0) 142368
141604
ECMDYKAASEMKYLKMCVDETLRMYPPSQR 141515
141141
AKEDIDLDGVKIPKGMCVQFSSFAIHYDPDNWPDPEKFDPER 141016
139356
FTPEEKKKRDPYAYVAWGVGPRSCVMKRLGMLEVKFAIAKILMKYRLRPCEKTQ 139195
138066
IPIRVKVSNLTQPDHGMFLKLEARTDI* 137883
>ATGN234930.b1
exon 2 seq AFSA808968.g2 55% to CYP3C zebrafish
(2)
YFMWPYSAFEDLGIPGPKPLPLFGNYLSYGK (0)
SAGEFDRECYKKFNKVYG (2)
CYP4F like (looking; there are several genes, possibility of hybrid assembly here. These are the same as the CYP4T section)
>AFPZ699255.y1 AFPZ733698.x1 ATUP16248.y2
AFPZ163606.x1
APWS139179.b1
ATUP98469.x4 AFSA796551.b2
ATUP336859.y1
AFPZ793080.x1
48% to 4T5
43% to 4F28 more like 4T
DQLLSHDHNCRYRTCWRTPVIALTFCSHPETVKPILSNK
(1)
AQKTEWMYRFFRPWL
(1)
GDGLLLSDGPKWQRNRRLLTPAFHFDILKHYVQLFSESTA
VLL(0)
DKWMSRGPGASVELFDHIGLMTLDNILKCSLGYNSRCQTDG (2)
IKWMSRGPDASVELLDDIGLMTLDNILKCSLGYNSRCQTDG (2)
SAPYILAVNDLTRLFAERGDQPLHYFDFIYYLSSDGRR
(2)
XXXXXNMVHRHSAEIIRQRKDTLKEQSDGDSA--KKYLDFLDILLRAK
(0)
DEDGNGLTDAEIRDEVDTFVFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEVLQGRTELTW (2)
DNLSSLKYITLCITESLRM
YPPVQRLFRQLEKPMTFFDGRTLPQ (1)
GSPTMTDIAGTHRNRDIWPNPT
(0)
VYDPYRFSPENSANRHPYAFLPFSAGPR
(2)
NCIGKNFAMNEMKVSVALILQHFQLELDETKSPAVPFDSLTVQAKDGIWVKLHPVKNDT*
CYP4T like (looking; at least 4 different sequences, same as 4F like)
>
CA385834.1| Oncorhynchus
mykiss cDNA clone
Amphi.
1
EPKDRVSYAWLKPWIGDGLLVSEGQKWFRNRRLLTPGFHFDVLKPYVKVFSECTNIML 58
EPKD
+SY +L PWIGDGLLVSEGQKWFR+RRLLTPGFH+DVLKPYVK+ ++ ML
Sbjct: 390
EPKDDLSYRFLIPWIGDGLLVSEGQKWFRHRRLLTPGFHYDVLKPYVKMMADSAKTML 563
MELFETLKKVTLDSYRIHHLVAIFSLVYVILKISKLIVKRNEW
IRALETFPGPPKHWLFGHVREFKQDGNDMYKVVKWGESYPLAFQMWFGPFVSILNIHHPDYVKTILAST
ATGN270676.g1
FPLWIGPFRVVLSLVHSDYIKEIVNSP (1)
AFPZ138711.y1
SEVHCLFQVRPDETGFTIVPQWAAKFKFAFPLWIGP
EPKDDLSYRFLIPWIGDGLLVSEGQKWFRHRRLLTPGFHYDVLKPYVK
MMADSAKTMLDKWETHSKSDESFELFEHVSLMTLDSIMKCAFSSNTNCQTVRG
GESGTNSYIKAVYELSDLVNVRFRTFPYTASGSST
>AFSA90852.g2
MNCLNLGISVPLSLSTFAMVSIPAQWLPHWETGYLRTACLTVLVAVAVQLVFRFLRAL
LWKRYIQKVLAPFPGQPAHWLFGHMRE
(0)
ATGAATTGCTTAAACTTGGGAATTTCTG
TCCCTTTATCCTTAAGCACCTTTGCTATGGTGTCCATACCGGCGCAGTGGTTGCCCCACT
GGGAGACCGGTTACCTGCGGACCGCCTGCCTGACCGTGCTGGTTGCCGTGGCCGTTCAGC
TGGTGTTCAGGTTCCTCCGTGCGTTGCTATGGAAACGGTACATTCAGAAAGTTCTGGCAC
CATTCCCAGGACAACCTGCACACTGGCTGTTTGGTCATATGAGAGAGGTGAGGGATTGT
ATGN26250.g1
AFSA103155.g2 AFSA90852.g2 AFPZ722043.x1
AFPZ440133.b2
walked up
from ASWX93226.g2
AFPZ138711.y1
AFPZ115087.x1 ASWX102329.g2 ASWX93226.g2
ASWX155166.g2
AFPZ440133.b2
ATGN270676.g1
VRPDETGFTIVPQWAAKFKFAFPLWIGPFRVVLSLVHSDYIKEIVNSP (1)
AGGTCCGGCCGGATGAAACCGGTTTCACCATCGTGCCACAG
AGTGGGCAGCGAAGTTCAAGTTTGCCTTCCCGCTCTGGATCGGGCCGTTCCGCGTGGTT
CTCAGCCTCGTACATCCCGACTACATCAAGGAGATCGTCAACTCACCAGGT
walked up
to ASWX93226.g2
walked up
to ATGI120597.g1 ATUP404745.b2 ATGN167137.b1
AFPZ550292.x1
AFPZ345721.x1 ATUP332415.x1 ATUP404745.b2
ATGN270676.g1
ATGN167137.b1
(1)
EPKDRVSYAWLKPWI (1)
AGAACCAAAGGACAGGGTGTCATATGCCTGGCTGAAACCATGGATAGGT
>ATUP272608.y2
mate pair C-helix
APNK110373.g1
APNK106080.b1 ASWX61863.b2 ATUP449675.g1
AFPZ713557.b2
AFSA198365.g3
walked
down to AFPZ81376.b2 ATUP272608.y2 ATGI238519.b1
AFSA874000.b2
AFSA451698.b2
walked
down to APWS127433.g1 ATGN176879.g1 ATGN84797.g1
AFPZ133921.y1 mate pair
GDGLLVSEGQKWFRNRRLLTPGFHFDVLKPYVKVFSECTNIML (0)
AGGTGACGGGTTGTTGGTCAGCGAGGGACAGAAATGGT
TCCGTAACCGGCGCCTCCTCACGCCGGGGTTTCACTTCGACGTGCTGAAGCCGTACGTCA
AGGTCTTCTCTGAATGTACCAACATCATGCTAGT
>ATGI238519.b1 ATGN124963.b1 ATGN84797.g1
AFPZ612539.b2 ATGI238519.b1
AFSA451698.b2 AFPZ612539.g2 APWS127433.g1 AFPZ133921.y1 (short with mate pair)
these next seqs are 100% aa matches with three nuc diffs
AFPZ115087.y1
ASWX93226.b2 ASFW53123.g2 AFSA488932.b2 AFSA447520.b2
AFSA13398.y1 AFPZ368667.y1 AFPZ407080.b2
DRWADLAPGTPVEMFHYASAMTLDSLMRCALSVRSDCQRDSDGSP
(2)
AGGACAGGTGGGCAGATCTGGCACC
TGGCACACCTGTGGAGATGTTCCACTACGCCAGCGCCATGACACTGGACAGCCTGATGCG
ATGTGCGCTCAGCGTGCGCTCGGACTGCCAGCGGGACAGCGACGGGAGTCCGT
>AFPZ133921.y1
no intron between exons 5 and 6
this seq also reaches exon 7
APWS127433.g1 ATGN124963.b1 ATGN84797.g1
AFSA451698.b2
AFPZ612539.b2 ATGN176879.g1 AFPZ612539.g2
MTLDSLMRCALSVRSDCQRDS
DGSPYIRAVYDLTKCVVERGRYQPFHIPLIFHLSPTGFR
(2)
AGGACAGGTGGGCAGATCTGGCACCTG
GCACACCTGTGGAGATGTTCCACTACGCCAGCGCCATGACACTGGACAGCCTGATGCG
ATGTGCGCTCAGCGTGCGCTCGGACTGCCAGCGGGACAGCGACGGGAGTCCGTACATCCG
CGCCGTGTACGACCTGACGAAGTGCGTGGTGGAGCGTGGTCGCTACCAACCGTTTCACAT
TCCCCTCATCTTCCACCTCAGTCCAACTGGCTTTAGGT
>ATGN249826.g1
ASWX134355.g2 AFPZ468928.y1 ASWX173456.y1
the following are 100% aa seq matches with three nuc diffs
AFSA56765.y1
AFPZ522964.x1
(0)
DKWTKLGSGCSVEMFEHVSLMTLDSILKCSLSYHSNCQTDR (2)
AGGACAAATGGACCAAGCTTGGCTCTGGATGCTCT
GTGGAGATGTTTGAACACGTCAGCCTGATGACTCTGGACAGCA
TCCTGAAATGTAGTCTCAGTTACCATAGCAACTGCCAGACTGACAGGT
>ATUP915634.x1
mate pair SWX165907.b2 mate pair ATWX7439.g1 ATUP623212.x1
AFSA620746.b2
ATGI69856.b1 ATWW161831.g1 AFPZ932881.y1
the following have 1 aa diff AFPZ218223.x1 APWS83183.b1
ATUP402691.g1
mate pair ATGI268005.g1
AFSA650722.b4
(0)
ENWEEFGAGASIDVFQHVSLMTLDSMLKCALSQNTGCQKR (2)
AGGAAAACTGGGAAGAGTTCG
GGGCTGGTGCCTCTATAGATGTGTTCCAACACGTCAGCCTGATGACTCTGGACAGCATGC
TGAAATGTGCTCTCAGTCAGAACACTGGCTGTCAGAAAAGGT
>ATUP430247.b1
mate pair AFPZ224663.x3 mate pair
AFPZ575633.x1 ATUP555575.x1 ATUP546626.b1 AFSA312247.g2
the following have 1 aa diff APWS45028.g1 AFSA160628.b2
the following have 2 aa diffs AFPZ818365.b2
(0)
ENWEESGAGTSIDVFQHVSLMTLDSMLKCALSQDTGCQKR (2)
AGGAAAACTGGGAAGAGTCTGGGGCTGGCACCTCCATAGATGTGTTTCAACACGTCAGCCTGA
TGACTTTGGACAGTATGCTGAAGTGTGCTCTCAGTCAGGACACCGGCTGTCAGAAAAGGT
>AFPZ354459.y1
and 100% matches
AFSA73126.x1
ATGN346495.g1 ATGN268793.g1 ATGN182214.b1
ATUP895246.y1 ATWW135985.b1 ATWW155643.g1
ATUP777488.y1
ATWW119927.b1 ATUP289252.x2 ATUP174497.b1
AFPZ892508.y1
AFSA908409.b2 AFSA735851.b3
100% aa
with 1 nuc diff ATGI117873.b1 AFPZ107894.y1 AFSA27567.b2
ATGN330749.b1
ATUP754095.x1 AFSA311147.b2
AFSA307647.b2
AFPZ683302.b2
1 aa diff
APWS50764.b1 ATUP606978.x1 ATGN213407.g1
ATUP850436.x1
ATWW227658.g1 AFPZ789326.y1
ASFW186505.g2
100% aa
seq with 2 nuc diffs APWS109107.b1 ASWX40589.g2
ATUP750108.x1
ATUP733499.x1 ASFW37910.g2 AFSA448858.b2
AFSA346719.g2
100% aa
seq with 3 nuc diffs AFPZ492058.x1 AFPZ290838.x1
APWS27430.b1
ATWW70674.b1 ASFW45413.b2 AFSA582563.b2
AFSA108008.b2
(0)
DKWSRVAAGSSVELFDHVSLLTLDSMLKCSLGYRSDCQTDG (2)
AGGACAAGTGGAGCAGAGTTGCTGCGGGCTCCTCCGTGGAACTGTTT
GATCACGTGAGCCTGCTGACGTTGGACAGCATGCTGAAGTGCAGCCTTGGTTACCGTAGT
GACTGTCAAACTGACGGGT
>ATGN258597.b1
AFPZ55223.b2 AWYB1196.b1 ATUP857105.g1
ATGI235564.b1
AFSA312605.b2
1 aa diff ASWX177131.g2 ATUP648660.b1 ASFW159249.b2
(0)
EKWLSRGPGASVELFDQVGLMTLDNILKCSLGYHSNCQTDG (2)
AGGAAAAGTGGCTGTCACGTGGTCCAGGCGCGTCTGTGGAGCTGTTTGACCAGGTCGGCCTGAT
GACGTTGGACAACATCCTGAAATGCAGCCTCGGTTACCATAGCAACTGCCAGACTGACGGGT
>ATUP337506.y1
mate pair ATGI217395.b1 ATGN85884.g1
ASFW173326.b2
AFSA644705.g2 AFSA137556.g2
APNK84474.g2
1 aa diff, three nuc diffs ATGI10129.b1 APNK71044.b2
ATUP412849.x1
ATGN367966.b1 ATGI161407.b1 ATUP795448.x1
AFPZ717049.x1
(0)
AKWRQLGAGASIDMFEHVSLMTLDSMLKCALTVESNCQVDR (2)
AGGCCAAGTGGAGGCAGCTTGGTGCGGGTGCATCCATCGACATGTTTGAGCAC
GTGAGTCTGATGACGCTGGACAGTATGCTGAAGTGTGCGCTCACAGTGGAGAGTAACTGT
CAGGTGGACAGGT
>ATUP337506.y1
mate pair ATGN85884.g1 AFPZ139710.y1 AFPZ433829.y1
APNK24749.b2
APNK41276.b2
same aa seq 2 nuc diffs ATUP587507.x1 AFSA873673.b2
AFSA471589.g2
KQNSYIAAVFSLTKLALQRFHLFPLHSDLIYYLTPMGYRLVQSKGSLSFSTTQ (2)
AGAAAACAGAACTCGTATATTGCTGCTGTATTCTCCCTGACCAAG
TTGGCTCTACAGCGTTTCCACCTCTTCCCTCTGCACAGTGATCTGATCTACTACCTCACC
CCTATGGGATACAGGTTGGTACAGTCCAAGGGCTCTCTAAGCTTCTCTACCACACAGT
There are at least 7 different exon 5 sequences
(0) ENWEEFGAGASIDVFQHVSLMTLDSMLKCALSQNTGCQKR (2)
(0) ENWEESGAGTSIDVFQHVSLMTLDSMLKCALSQDTGCQKR (2)
(0) DKWTKLGSGCSVEMFEHVSLMTLDSILKCSLSYHSNCQTDR (2)
(0) DKWSRVAAGSSVELFDHVSLLTLDSMLKCSLGYRSDCQTDG (2)
(0) EKWLSRGPGASVELFDQVGLMTLDNILKCSLGYHSNCQTDG (2)
(0) AKWRQLGAGASIDMFEHVSLMTLDSMLKCALTVESNCQVDR (2)
(0) DRWADLAPGTPVEMFHYASAMTLDSLMRCALSVRSDCQRDS (no intron)
>ATWW176870.g1
ATGN284985.b1
FRKACKTAHDFSDEVIRKRRTELQQQGCHQNDTANSSEDGGKKRYLDFLDILLQAR
AGGTTTCGTAAAGCATGTAAAACTGCTCATGACTTCTCTGATGAAGTCATCAGAAAGAGGCGGACAGAGC
TCCAACAGCAAGGCTGTCATCAGAACGACACAGCGAACAGCTCGGAAGATGGGGGCAAGA
AACGATACCTGGACTTCTTGGACATCCTGCTACAAGCAAGGGT
>ATGN26250.b1
I-helix ATUP147621.b1 AFPZ133921.x1
AWXX12803.b1
ATWW81852.b1
ATWW176870.g1 ATUP449675.b1 AFSA492010.b2
60% to CYP4F42 Xenopus 59% to 4T5
DEDGKGLSEREIRDEVDTFMFEGHDTTASGVSWILYNLAKHPACQDRCRAEVDAVLQGRAEVKW
>ATWW81852.b1
ATUP449675.b1 ATGI42359.g1 ATUP272608.x2 ATGN148197.b1 EXXR exon
AFSA105706.g2
AFPZ660013.g2 ATGN130493.b1 ATUP427110.b1 walk to ATUP164815.x1
EDLSKLPYTTMCIKESLRMHSPVPGVTRLTTQPHTFPDGRSIPA
AGGATTTTTTGCTTACATGTAGGGAGGACCTGTCC
AAGCTGCCCTACACCACCATGTGTATCAAGGAGAGTCTGCGGATGCACTCCCCTGTCGGGGGTGACACGGCTCA
CCACACAGCCGCACACCTTTCCTGATGGGAGAAGCATCCCCGCAGGT
>ATUP164815.x1
AFPZ309668.x1 ATGI130869.b1 ATWW159270.b1 ATUP541926.g1
APNK46679.g2
GCTAPILGAPGCTCTQYFEPCY
(0)
EFDPERFSPENSKGRSSHAFIPFSAGSR
AGGAATTTGACCCTGAGCGTTTCTCGCCTGAGAAC
TCCAAGGGCCGCTCTTCCCATGCCTTCATTCCTTTTTCAGCTGGATCTCGGT
>AFPZ309668.x1
AWYB4583.g1 AFPZ660013.b2 ATUP427110.g1
NCIGQHFAMNELKVTVALTLQRYRLELDETRPPYRVARLITRTRDGLWLKVYPRGADN*
50-52% to CYP4F sequences, this seq intact
MNCLNLGISVPLSLSTFAMVSIPAQWLPHWETGYLRTACLTVLVAVAVQLVFRFLRAL
LWKRYIQKVLAPFPGQPAHWLFGHMRE
(0)
VRPDETGFTIVPQWAAKFKFAFPLWIGPFRVVLSLVHSDYIKEIVNSP (1)
EPKDRVSYAWLKPWI
(1)
GDGLLVSEGQKWFRNRRLLTPGFHFDVLKPYVKVFSECTNIML
(0)
DRWADLAPGTPVEMFHYASAMTLDSLMRCALSVRSDCQRDS ATGN124963.b1
DGSPYIRAVYDLTKCVVERGRFPPFHIPLIFHLSPTGFR (2)
ATGN124963.b1
FRKACKTAHDFSDEVIRKRRTELQQQGCHQNDTANSSEDGGKKRYLDFLDILLQAR
(0)
DEDGKGLSEREIRDEVDTFMFEGHDTTASGVSWILYNLAKHPACQDRCRAEVDAVLQGRAEVKW
(2)
EDLSKLPYTTMCIKESLRMHSPVPGVTRLTTQPHTFPDGRSIPA
(1)
GVSVSIGVHSLHHNIHVWGDNVM
(0)
EFDPERFSPENSKGRSSHAFIPFSAGSR
(2)
NCIGQHFAMNELKVTVALTLQRYRLELDETRPPYRVARLITRTRDGLWLKVYPRGADN*
DKWTKLGSGCSVEMFEHVSLMTLDSILKCSLSYHSNCQTDR (2) ASWX173456.y1
QSSAYIRAVYDITRLFVE
RIRFPPYYSDFIYSLSGTGSFDRRRSGCGVVLWL (1) ASWX173456.y1
DRWADLAPGTPVEMFHYASAMTLDSLMRCALSVRSDCQRDS ATGI238519.b1
DGSPYIRAVYDLTKCVVERGRYQPFHIPLIFHLSPTGFR (2)
AFPZ133921.y1
Missing piece
ETDLGIAIYGCHHNSALWENPE capitella
GCTIGVSIYGIHMNSTVWENPY danio
GTRIGTSVFGIHRNATVWENPT tetraodon
ESRIGTSVFGIHRNASLWENPN fugu
GVSVSIGVHSLHHNIHVWGDNVM ATUP164815.x1
GTLVGLSIYAIHKNPAVWEDPE xenopus
>ATUP164815.x1
ATGI130869.b1 ATWW159270.b1 ATUP541926.g1
APNK46679.g2
GVSVSIGVHSLHHNIHVWGDNVM (0)
AGGTGTTTCTGTGAGCATTGGAGTGCACAGCTTACATC
ATAACATCCATGTGTGGGGAGACAACGTCATGGT
CYP4T5 Fugu rubripes (pufferfish)
No accession number
Scaffold_8637
78% to 4T2
508 MEITRALVVLGWSHFYQLLALFCLAIVLYKLTVLLMLKRALIRNFESFPGPPGHWLFGNILE 693 (0)
902 FKQDGNDLDKLVKFGQKYPYCFPLWFGPFVCFLNIHHPEYVKTILAST 1045 (1)
1142 EPKDDLAYSFIQNWI 1186 (1)
1291 GNGLLVSQGQKWFRHRRLLTPGFHYDVLKPYVKLMAHSTKTML 1419 (0)
1673 DKWESYAKTNKPLEVFEYVSLMTLDTILNCAFSYDSNCQTER 1798 (2)
2267 KNTYIKAVYELSNLINLRFRIFPYHNDLIFYLSPHGFRYRKACMVAHSHT 2416 (1)
2521 EEVIKKRREALKKEKELERIQAKRNLDFLDILLFAK 2638 (0)
3171 DENQQGLLDEDIRAEVDTFMFEGHDTTASGISFLLYNLACHPKHQKLCRKEIMQVLHGKDTMDW 3362 (2)
3457 EDLNKIPYTTMCIKESLRMHPPVPGISRKTTKPITFFDGRTLPA 3588 (1)
392 ESRIGTSVFGIHRNASLWENPNV 457 (1 0) this exon from Fugu LPC.11421.x1
fdhwrflpenvskrsphafvpfsagpr this exon from 4T2 Dicentrarchus labrax
NCIGQNFAMNEMKVVIAMTLLKYELLEEPTLKPKIIPRLVLRSLNGIHIKIKNANQN*
search with this Dicentrarchus CYP4T2 seq
gacaaatggg gaagttatgc aaacagcaac
541 gagtcctttg aattgtttca acatgtgagc cttatgactc tggacagcat cttgaagtgt
601 gctttcagct acaacagcaa ctgtcagact gagagtggaa caaatgtgta catcaaagca
661 gtgtatgaac tcagtgatct gataaacctg cggttgagga catttccata ccacagtgac
721 ctaattttct acctcagccc acatgggtac agatacagaa aggcaatcaa agtggctcag
781 agtcataca
fugu   1   DKWESYAKTNKPLEVFEYVSLMTLDTILNCAFSYDSNCQTER-KNTYIKAVYELSNLINL 59
           DKW SYA +N+  E+F++VSLMTLD+IL CAFSY+SNCQTE   N YIKAVYELS+LINL
dicent 511 DKWGSYANSNESFELFQHVSLMTLDSILKCAFSYNSNCQTESGTNVYIKAVYELSDLINL 690
                                                     QSSAYIRAVYDITRLFVE
Query: 60  RFRIFPYHNDLIFYLSPHGFRYRKACMVAHSHT 92
           R R FPYH+DLIFYLSPHG+RYRKA  VA SHT
Sbjct: 691 RLRTFPYHSDLIFYLSPHGYRYRKAIKVAQSHT 789
           RIRFPPYYSDFIYSLSGTGSFDRRRSGCGVVLWL
ASWX173456.y1
QSSAYIRAVYDITRLFVE
RIRFPPYYSDFIYSLSGTGSFDRRRSGCGVVLWL (1)
AGCCAGTCCAGCGCGTACATCCGTGCTGTGTATGACATCACGAGGCTGTTTGTCGAGCGTATTCGCTT
TCCGCCGTACTACAGTGACTTCATCTACTCGCTCAGCGGTACCGGCTCATTCGATCGACG
GCGGTCGGGGTGTGGTGTGGTTTTGTGGTTGGGT
>AFSA29926.g2
ATGI49052.b1 65% to CYP4F42 Xenopus
DEDGTGLTDAEIRDEVDTFLFEGHDTTASGISWALYHLAKHPEYQDRCRREAEGLLQGRTEMTW
>APWS109579.g1
74% to CYP4F42 Xenopus
DEDGNGLSDVEIRDEVDTFMFEGHDTTASGLSWTLYNLARHPEHQERCRQEARSVLQGRSVVTG
>ATUP337506.x1
DEDGNGLSDVEIRDEVDTFMFEGHDTTASGLSWTLYNLAKHPEHQERCRQEARSVLQGRSVVNR
>AFSA246853.b2
AFPZ352137.x1 AFPZ224663.y3 APWS16853.b1
HKAACNIVHKYSEEIILQRKEVLKQQSAGDSTHGKKYLDFLDILLRAK
(0)
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKCRKEAQEVLQGRTVDTW
>ASWX6756.g2
ATGI162241.b1 AFPZ352137.x1 ATUP390177.g1 ATUP407011.b2
AFSA735184.b2
AFSA426465.b2 AFPZ654427.g2 ATWW138964
5 nuc diffs to ATUP664988.g1, 1 aa diff
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKCRKEAQEVLQGRTEVTW
>ATWW138964.g1
FRDEVDTFMFEGHDTTASGLALTLYCLARHPGHQDKCRKEAQEVLQGRTEVTW
>ATUP664988.g1
ATWW180530.g1 AFSA128002.b2 AFSA100306.g2 AFPZ619699.x1
ATGN318029.g1
AFPZ458632.x1
YRKACNLVHEYAKRIIAERREALKQRLTEDDEETNKKKYLDFLDILLKAR
(0)
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRREAQEVLQGRTEVTW
>ATUP172867.g1
ATWX43092.b1 ASWX40865.b2
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARYPGHQDKCRKEAQEVLQGRT
EVTW
>ATWX40562.b1
ATUP926382.b1 ATUP177523.b1 ATUP539865.g1
note same aa seq, 3 nuc diffs to ATUP664988.g1
ATUP16248.y2
ATGN214275.b1 ATUP837406.y1 ATUP298210.b1
AFPZ694240.b2
AFSA5009.x1
note same aa seq, 4 nuc diffs to ATUP664988.g1
72% to 4F42 Xenopus 60% to 4T5 52% to 4V5
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEVLQGRTEVTW
>AFPZ112195.x1
ATGN200959.g1 ATUP907530.x1
4 nuc diffs to ATUP664988.g1 2 aa diffs to ATUP664988.g1
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLACHPGHQEKCRLEAQEVLQGRTEV
>ATGI22425.b1
ATUP539865.g1
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRREAQEVLQGGTEVTW
>AFPZ112195.x1
probably same as ATWX40562.b1 with two errors
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLACHPGHQEKCRLEAQEVLQGRTEVTW
>ATUP540419.g1
ASFW147712.b2 ATWW5209.g1
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEMLQGRTEVTW
>ASWX54130.g2
AFPZ224663.y3 ATGN280478.g1
ATGI221699.b1 ATUP430247.g1
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEVLQGRPDVTW
>ATUP296454.b1
ASWX46239.b2 ATGN159465.g1 ATUP915634.y1
note ATUP915634.x1 is a middle exon DKW...
DEESNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYC
DEDGNGLTNTEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEVLQGRTDVTW
>ASWX163983.b2
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKCRREAQEVLQGGTEVTW
>ATGN72941.g1
ASWX123977.g2 AFSA780539.b2
ASFW81426.b2
1 aa diff to ATWX40562.b1
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEVLQGGTEVTW
>AFSA853381.g2
APNK78662.g3 ATUP588704.x1 1 aa diff to AFSA246853.b2
YKKACNEVHQFSEKIIQQRKQDLDNLSTTETTRRQKYLDFLDILLMAK (0)
DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWILYC
>ASWX165907.g2 1 ATUP402691.b1 ATUP921527.y1
ASFW127211.b2 ASFW54195.b2
1 aa diff to ASWX46239.b2
note ASWX165907.b2 = mate pair = DKW exon
DEDSNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYC
>ATUP646033.g1
DEMYGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYC
Alignment of 4F-like I-helix exon sequences.
The green aa are probably seq errors, since they occur only once, or on the end of a seq.
The sequences from AFSA853381 to APNK78662 are short pseudogene pieces.
The last three sequences seem to be different genes and additional searches may be required.
Numbers after accessions are number of occurrences.
Those with one occurrence should be combined with others (seq error).
Those with 9 are probably two genes with identical exons.
ASWX6756    9   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKC
ATUP172867  3   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARYPGHQDKC
ATUP296454b 4   DEDGNGLTNTEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC
ASWX54130   5   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC
AFSA246853  4   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKC
ATUP540419  3   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC
ATUP664988  6/4 DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC
ATWX40562   4/6 DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC
ATGI22425   2   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC
ASWX163983  1   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKC
ATGN72941   4   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKC
AFPZ112195  1   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLACHPGHQEKC
AFSA853381  3   DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYC-----------
ATUP296454a 4   DEESNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYC-----------
ASWX165907  5   DEDSNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYC-----------
APWS109579  1   DEDGNGLSDVEIRDEVDTFMFEGHDTTASGLSWTLYNLARHPEHQERC
AFSA29926   2   DEDGTGLTDAEIRDEVDTFLFEGHDTTASGISWALYHLAKHPEYQDRC
ATGN26250   7   DEDGKGLSEREIRDEVDTFMFEGHDTTASGVSWILYNLAKHPACQDRC
:*******:**********::
**
ASWX6756    RKEAQEVLQGRTEVTW-
ATUP172867  RKEAQEVLQGRTEVTW-
ATUP296454b RKEAQEVLQGRTDVTW-
ASWX54130   RKEAQEVLQGRPDVTW-
AFSA246853  RKEAQEVLQGRTVDTW-
ATUP540419  RKEAQEMLQGRTEVTW-
ATUP664988  RREAQEVLQGRTEVTW-
ATWX40562   RKEAQEVLQGRTEVTW-
ATGI22425   RREAQEVLQGGTEVTW-
ASWX163983  RREAQEVLQGGTEVTW-
ATGN72941   RKEAQEVLQGGTEVTW-
AFPZ112195  RLEAQEVLQGRTEVTW-
AFSA853381  -----------------
ATUP296454a -----------------
ASWX165907  -----------------
APWS109579  RQEARSVLQGRSVVTGW
AFSA29926   RREAEGLLQGRTEMTW-
ATGN26250   RAEVDAVLQGRAEVKW-
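The note before this alignment says that variants seen only once are probably sequencing errors and should be combined with others. A minimal sketch of one way to find the best candidate to combine with, by counting mismatches against the recurring variants. The helper names hamming and closest_variants are hypothetical, and plain mismatch counting is an assumption; it only works for sequences of equal length.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_variants(query, variants):
    """Return (name, distance) pairs, nearest first."""
    scored = [(name, hamming(query, seq)) for name, seq in variants.items()]
    return sorted(scored, key=lambda item: item[1])


# Variants seen more than once (occurrence counts 9, 2 and 4 in the alignment).
RECURRING = {
    "ASWX6756": "DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKCRKEAQEVLQGRTEVTW",
    "ATGI22425": "DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRREAQEVLQGGTEVTW",
    "ATGN72941": "DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQEKCRKEAQEVLQGGTEVTW",
}
# ASWX163983 is listed with a single occurrence.
SINGLETON = "DEDGNGLTDAEIRDEVDTFMFEGHDTTASGLAWTLYCLARHPGHQDKCRREAQEVLQGGTEVTW"

ranked = closest_variants(SINGLETON, RECURRING)
print(ranked)
```

The nearest recurring variant is only a candidate; whether a single difference is a read error or a real second gene still has to be judged from the trace files.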
AFSA246853.b2 +
AFPZ352137.x1 +
AFPZ224663.y3 +
APWS16853.b1 +
ASWX54130.g2 +
ASWX46239.b2 +
ATGN280478.g1 +
ATUP296454.b1 +
ATGN159465.g1 +
ATUP915634.y1 +
ATGI221699.b1 +
ATGI162241.b1 +
ATUP390177.g1 +
ATWW5209.g1 +
ATUP540419.g1 +
ATUP407011.b2 +
ATUP430247.g1 +
ASWX6756.g2 +
ASFW147712.b2 +
AFSA735184.b2 +
AFSA426465.b2 +
AFPZ654427.g2 +
ATWW138964.g1 +
AFSA5009.x1 +
AFPZ133921.x1 +
AFPZ112195.x1 +
ATUP16248.y2 +
ATGI22425.b1 +
APNK78662.g3 +
ASWX163983.b2 +
ASWX165907.g2 +
ASWX40865.b2 +
ATUP588704.x1 +
ATWX40562.b1 +
ATWX43092.b1 +
ATGN214275.b1 +
ATGN200959.g1 +
ATUP402691.b1 +
ATGN318029.g1 +
ATUP837406.y1 +
ATUP921527.y1 +
ATUP298210.b1 +
ATUP926382.b1 +
ATUP907530.x1 +
ATGN72941.g1 +
ATWW180530.g1 +
ATWW176870.g1 +
ATWW81852.b1 +
ATUP664988.g1 +
ATUP646033.g1 +
ATUP539865.g1 +
ATUP449675.b1 +
ATUP177523.b1 +
ATUP172867.g1 +
ASWX123977.g2 +
ASFW127211.b2 +
ASFW81426.b2 +
ASFW54195.b2 +
AFSA853381.g2 +
AFSA849885.b2 +
AFSA780539.b2 +
AFSA128002.b2 +
AFSA100306.g2 +
AFPZ458632.x1 +
AFPZ619699.x1 +
AFPZ694240.b2 +
100% matches to AFPZ760949.g2 exon
AFPZ50624.g2
ATWX87052.g1 ATUP695364.x1 ATWW34381.b1
ATUP656534.b1
ATUP453935.b1 AFPZ760949.g2
(0)
DQLLSHDHNYHYFTCWQTPIIALTFCSHPETVKLILSNKS (1)
99% match to AFPZ760949.g2 exon
ASWX83469.g2
one base missing, seq error?
97% to AFPZ760949.g2 exon
ASFW95633.g2
stop codon and two other changes probable seq error
*LSHDHTFHYFTCWQTPIIALTFCSHPETVKLILSNKS
89% to AFPZ760949.g2 exon
AFSA23133.b2
ATGN291068.b1 ATGI170747.g1 ATWW165626.b1
ASFW10811.g2
AFSA451581.g2 AFSA330051.g2 AFPZ476812.b2
AFPZ466033.x1
AFPZ632859.y1 AFPZ605110.g2
(0)
DQLLSHDHNCRYRTCWRTPVIALTFCSHPETVKPILSNKS
(1)
AFPZ434251.y1
another seq by itself; this matches at 96% to 7 other seqs,
which are probably the correct seq, this being a little off
DLLLSHDHNYHYFTCWQTPIIASTFCSHPESLV PSSLCR
AGGACCTGTTACTGTCTCATGATCACAACTATCA
CTACTTTACCTGCTGGCAGACTCCGATAATCGCCTCAACATTCTGCAGCCATCCCGAGAG
CCTGGT
The seven seqs are the same as listed above
AFPZ50624.g2 ATWX87052.g1 ATUP695364.x1 ATWW34381.b1
ATUP656534.b1
ATUP453935.b1 AFPZ760949.g2
>AFPZ50624.g2
walked upstream to AFPZ760949.g2 (no hits)
(0)
LLSHDHNYHYFTCWQTPIIALTFCSHPETVKLILSNK
(1)
>ATUP412432.y1
ARKTEWVYRFFRPWL (1)
GDGLLLSDGPKWQRNRRLLTPAFHFDILKHYVKLFSESTAVLL (0)
>APWS102130.b1
(1)
GNSLFLSDGDQWKVHRRLLTPAFHFDILKQYVSVYNREATEMI (0)
>ATUP98469.y2
ASFW123814.b2
(0)
SAPYVLAVHDLTKLIEDRPDYLSNHIDF IYYLSADGRR (2)
APYVLAVHDLTKLIEDRPDKLSNHIDF
IYYLSKDGR
FRRACKIVHSFSAQVIKERKEELKKKDSSFKSGKCLDLLDILLKAK
FRRACKIVHSFSAQVIKERNEETEKK
>AFPZ133921.y1
exon 5 mate pair to I-helix exon 8
MTLDSLMRCALSVRSDCQRDSDGSP
(2)
ATUP207008.y2
MILTANILVFLTCFTVNSTQFSLDDCHVDSIKFD
ATGN176295.g1
MILTANILVFLTCFTVNSTQFSLDDCHVDSIKFD
>danio
4F seq BC056734
MILTANILVFLTCFTVNSTQFSLDDCH
MLLYGISPFVLSVNHVFALIFLACLLTVVKLLIVRRKGVKTMER
FPGPPAHWLFGHVKEFRQDGHDLEKIVKWMELYQFAFPLWFGPSLAVLNIHHPSYAKT
ILTTTEPKDDYAYKFFIPWLGDGLLVSTGQKWFRHRRLLTPGFHYDVLKPYVKLISDS
TKVMLDKWEVHSRSEESFELFKHVSLMTLDSIMKCAFSCNSNCQTDSGTNPYIQAVFD
LCHLVNLRFRVFPYHSKAIFHLSPHGYRFRKAASIAHNHTAEVIRKRKEVLKMEEEQG
IVKNRRYLDFLDILLSARDEHQQGLSDEDIRAEVDTFMFEGHDTTASGISWIFYNLAC
NPEHQEKCRQEIQQALDGKDTLEWEDLNKIPYTTMCIKESLRLHPPVPGISRKLTKPL
TFFDGRTVPEGCTIGVSIYGIHMNSTVWENPYKFDPLRFLPENAANRSPHAFVPFSAG
PRFVTRSV
>CYP4F39
UPSTREAM OF 4F5 chr7 (+) 94% to mouse 4f39 = ortholog
13051717 MLPITDYLLYLLGLEKTAFRVYVLSALLLFLLFLLFRLLLQAFKLFSDFRITCRRLSCFPEPPGRHWLLGHMSM 13051938
13054230 YLPNEKGLQNEKKVLDTMHHIILAWVGPFLPLLVLVHPDYIKPVLGAS 13054373
13064279 AAIAPKDEFFYSFLKPWL 13064332
>CYP4F17
= CYP4F19temp AI030199 EST CHR7 13095557
13103056 chr7 (+)
90% to 4f17 next closest 82%, probable ortholog of 4f17
13095557 MLQLSLSWLGRGPVTVSPWQLLLVVGTSLLLARILAWISAFYDNYCRLRCFPQPPSRHWFWGHLNL 13095754
13102916 VKNNEEGLQLLAEMSHQFQDIHLCWIGIFYPILRLIHPKFIGPILQA 13103056
13103866 AAAVAPKEMIFYGFLKPWL 13103922
>CYP4F5/4f16
13119940 13133265 chr7 (+) 3 aa diffs to mRNA U39207 90% to 4f16 89% to 4f37
13119940 MPWLTVSGLDLGSVVTSTWHLLLLGAASWILARILAWTYSFCENCSRLRCFPQSPKRNWFLGHLGT 13120137
13122954 IQSNEEGMRLVTEMGQTFRDIHLCWLGPVIPVLRLVDPAFVAPLLQAP 13123097
13125947 ALVAPKDTTFLRFLKPWL 13126000
>Xenopus
4F42 mRNA AB114053
MLPFLDHFLDSLNLSHSTFRVYIFYAVILFFSLIMFRTILKMVT
YIYAYIINARRLRCFPEPPRRSWLLGHLGLFMPTEEGLT EVSDAISSFRKTFLTWMGP
ISLVSVFHPDTVKPIVAASAAIAPKDDLFYGFLRPWLGDGLLLSHGEKWGRHRRLLTP
AFHFDILKSYVKIFNQSTDIMLAKWRRMTVEGPVSLDMFEHISLMTLDTLLKCTFSYD
SDCQEKPSDYIAAIYELSSLMVKREHYLPHHLDFIYNLSSNGRNFRQACKKVHEFTAG
VVQQRQKALKEKGMEEWIKSKQGKTKDFIDILLLSKVEDGNQLSDEDMRAEVDTFMFE
GHDTTASGLSWILYNLARHPEYQEKCRKEIIELLEGKILKHLEWDELSQLPFTTMCIK
ESLRLHPPVTAVSRRCTEDIKLPDGKVIPKGNTCLISIYGTHHNPEIWPNPQVYDPYR
FDPENVQERSSHAFVPFSAGPRNCIGQNFAMAEMKIVLALILYKFHVRLDETKAVRRK
PELILRAENGLWLQVEELKR
Xenopus
mRNA 4F42
atgttgcc
gtttttggac cattttctgg
121 actccttaaa
cctgagtcac tcaactttcc gagtttatat tttctatgct gttattctct
181 ttttctctct
tataatgttt cgaaccatat taaagatggt gacatatatt tatgcttata
241 tcatcaatgc
cagacgtctg cgttgttttc cagagcctcc aagacgtagc tggcttttag
301 gacatttggg
actgtttatg ccaacagagg agggccttac agaagtgagt gacgccattt
361 cttcttttcg
taaaacattt ctgacatgga tgggacccat ctctttagta tcagtgtttc
421 atccggacac
agttaaacca atagttgcag cctcagctgc cattgctcct aaagatgatc
481 tgttctatgg
tttcctcaga ccctggttag gggatggact gttgcttagc catggggaga
541 aatgggggag
gcaccggcgc cttctgacac ctgcctttca ctttgacatc cttaagagct
601 atgtgaagat
ttttaatcag agcacagata tcatgcttgc aaagtggcgg agaatgacag
661 tagagggccc
tgtgtctctg gatatgtttg aacatatcag tctgatgacc ttggatacac
721 ttcttaaatg
tactttcagc tatgacagtg actgccaaga gaagccaagt gattatattg
781 ctgctattta
tgaactgagc tcactaatgg tgaaacgtga gcactacttg ccccatcatt
841 tagattttat
ctacaacctt tcctccaatg gaaggaattt ccggcaggct tgcaaaaaag
901 tgcatgaatt
cactgccgga gtggtacagc aaagacagaa ggcattgaag gagaagggga
961 tggaggaatg
gattaagtct aaacaaggca aaaccaagga tttcattgat attctattgc
1021 tgtcaaaggt
tgaagatgga aaccagctat ccgatgaaga tatgagggcc gaagttgaca
1081 catttatgtt
tgaaggtcat gataccacag caagtggctt atcatggatt ctatacaatt
1141 tggctcgcca
ccctgaatat caggagaaat gcagaaagga gattatagag ttgctggaag
1201 ggaaaatcct
gaagcatttg gagtgggatg aattgtctca gttgccattc actacaatgt
1261 gcatcaagga
gagtctgcgg cttcaccctc ctgtaactgc agtatccaga cgctgtacag
1321 aggatatcaa
attacctgat ggcaaagtca tccccaaagg aaacacctgc ttgatcagta
1381 tttatggaac
ccaccacaac cctgagatct ggcctaatcc acaggtttat gacccatatc
1441 gatttgatcc
agagaacgtc caagaaaggt cttcccatgc atttgtacca ttctcagctg
1501 gacccaggaa
ttgtattgga cagaatttcg ctatggccga gatgaagatt gttttagctc
1561 taatccttta
caaatttcat gtgagattgg atgagaccaa ggcagtgcgc agaaaacctg
1621 agttgatcct
acgtgcagaa aatgggctct ggctgcaggt ggaagaactg aaacgt
CYP4V like
(looking at hits with megablast using 4V4 Xenopus.
All look like the 4Fs and 4Ts, CYP4s not differentiated yet?)
>BI385317.1
Branchiostoma floridae cDNA clone 54% to Xenopus laevis CYP4V4
3 ANIPFPAGSQLCIGHRVALISDKDILSSILHLF 101
Xenopus
4V4 mRNA AB114054
atggagctaa
agggagatgt
121 taatgtgctg
ttgtggacgg ctgttatcgt ggtgctgttg accctgctgg tcttttccgc
181 tttgcccgtc
ctgctggact acgtgcgtaa atgcaaagtt atgagactga ttccgggtcc
241 cggacccaac
tacccgctcg tgggggacgc gctgctccta aagagcgatg caagagaatt
301 ctttctccaa
atgtgtgaat tcgcagagga ctttagatca gaaccacttc taaaactttg
361 gattggacca
attccttttt taatagtcta ccatgcagac actctagagc catttctgag
421 cacatccaaa
catgtggaca aggcctacct ctataaattt cttcaccctt ggcttggtaa
481 aggactgcta
acaagtacag gggaaaagtg gcgtataaga agaaagatga taacgccaac
541 ctttcatttt
gcaattctct ctgagttttt ggaagtcatg aatgagcaat ccaatgtatt
601 agttgaaaag
ctccagaagc atgctgatgg ggagtctttt gattgcttta tagatgtgac
661 actttgtgta
ttggacatca tatcagaaac agccatgggg aggaaaatag aagcacagag
721 caataaagat
tctgaatatg ttcaagcaat atacaagatg gctgatttca ttcagaacag
781 acagacaaag
ccatggttgt ggagtgactc tttatatgca tacttgaaag aaggaaaaga
841 gcacaataaa
accctaaaca ttctccacac cttcactgat aaggcgattc tagaaagagc
901 tgaagagctt
aagaaaatgg aagtaaaaaa aggtgatagt gatcctgagt cagaaaagcc
961 caagaaaaga
agtgcatttc tagatatgct tctgatggca acagatgatg ccggcaataa
1021 aatgagctac
aaggatatcc gtgaggaagt tgataccttc atgtttgagg gtcatgatac
1081 aacagcatca
gccctaaatt ggacattgtt tttactgggc tcacacccag aggcgcagag
1141 acaagttcat
aaagagctgg atgaagtttt tggtaaatct gaccgtcctg tcacaatgga
1201 tgatctaaag aagttgcgtt
atcttgaagc cgtaattaaa gaatcacttc gaatattccc
1261 ccctgtcccg
atgtttggtc gaaccgttac agaggactgc actgtccgag gatttaaagt
1321 gccgaaagga
gtaaatatca ttgttattac ttactcattg catcgtgatc cagaatattt
1381 ccctgaacca
gaagaattca gaccagagag gttctttcct gaaaatgcta gtgggcgtaa
1441 tccttatgcc
tatattcctt tttctgctgg actcagaaac tgcattggtc agcgttttgc
1501 tctgatggaa
gaaaaggttg tcttatcctc catacttagg aaatactggg tagaggcaac
1561 tcagaaacgt
gacgaatgtc tccttgtagg agagctcatt ctccgccccc aggatggcat
1621 gtggattaag
ctgaagaaca gagaaactgc ctccagtgcc
CYP7 like
(looking, no match with danio mRNA via megablast)
>gnl|ti|681941888
ATWX61331.g1
AFPZ187714.y1
APNK84514.b2 ATGI163705.b1 ATUP144046.y1
ASFW156594.b2
AFSA564815.b2 AFSA527605.g2 AFSA514068.b2
AFSA775248.g2
walked upstream from AFSA527605.g2 to:
ASFW100120.b3
more matches = AFPZ788847.y1 ASFW125784.g2
walked upstream to ATGI210792.g1
walked upstream from AFPZ931995.g2 (exon 2) to
ASWX150423.b2 ATGN25235.b1 ATUP765390.x1
ATGN88506.g1 AFSA923158.g2 AFPZ422094.x1
Possible N-term exon
MILSIALIWAVVVGFCCLLWLAVGIRHR (2) Fugu N-term
MI-SGILAGCLVVLVVAILVQAVG-RKR (2)
ATGATTTCTGGGATTCTGGCCGGCTGTTTGGTGGTGTTGGTGGTGGCCATTTT
GGTACAGGCTGTCGGCAGGAAACGGT
64% to CYP7A
AFPZ788847.x1 mate pair
AFPZ395440.x1
ATUP709168.g1 AFSA796848.g2 AFPZ931995.g2
AFSA923158.b2
AFSA474510.b2 ATUP692470.g1
(2)
RDPNEPPLESGPVPYLGVALQFAMDSLKFIRSRQKKYGDVFTVKLAGKYTTFVLDPHSYSDVMRQHK
(2)
AGAGACCCGAATGAACCGCCCCTTGAG
TCTGGTCCCGTGCCCTACCTCGGGGTGGCGCTACAATTTGCCATGGACAGTCTCAAATTC
ATCCGCTCGCGACAGAAGAAGTACGGAGACGTCTTTACGGTGAAGCTTGCCGGAAAGTAC
ACGACATTTGTCCTTGACCCGCACTCCTACAGCGACGTCATGCGGCAGCACAAGT
AFPZ187714.x1 mate pair
AFSA19340.x1 ATWW110249.g1 ASFW88412.g2
These 5 have 5 nuc diffs and one aa diff: ATUP312192.b1 ATGI154523.g1
ATWW237445.b1 ATGI112989.b1 AFSA474510.b2
The first AG in the cluster of three at the beginning of this exon seq
is missing here, so one of the other two is correct
ILDFKTVGMDIVERGFGTTHFERTGRAHVLHTADAYFPVHLQGNALDPLTNTMMGHLQTAMLADI
(1)
AGTAGAGCGTGGATTTGGGACGACGCACTTCGAAAGGACGGGCCGTGCGCACGTGCTGCACACCGCTGACGC
CTATTTCCCCGTCCATCTGCAGGGGAACGCCCTGGACCCGCTCACGAACACCATGATGGG
GCATCTACAGACAGCCATGTTGGCTGATATAGGT
AFSA19340.x1 AFPZ744954.y4 AFPZ744954.y1
ATGI112989.b1
ATUP312192.b1
ATWW110249.g1 ATGI230779.g1 ASFW88412.g2
ATWW237445.b1
ATUP785498.x1
GSETGWKKDGLWSFVRRIVSEASFLTIFGKHK
(2)
AGGGTCGGAAACTGGCTGGAAGAAGGACGGGCTGTGGTCCTTTGTCCGCCGCATCG
TCTCTGAAGCGTCCTTCCTCACCATCTTCGGCAAGCACAAGT
Walked down from AFPZ744954.y4 to:
ATUP899939.x1 ATGI210792.g1 ATUP38762.b1
ATGN118038.g1
ATWW157309.g1 ATUP571743.g1 AFPZ712795.b2
QLIEQRDEVFCGGGLSGKELAGAHFSTVWASLVRSI
CYP7 amphioxus Complete seq. 41% to CYP7A1 from zebrafish and Fugu, 35% to Ciona CYP7
MISGILAGCLVVLVVAILVQAVGRKR (2)
RDPNEPPLESGPVPYLGVALQFAMDSLKFIRSRQKKYGDVFTVKLAGKYTTFVLDPHSYSDVMRQHK
(2)
ILDFKTVGMDIVERGFGTTHFERTGRAHVLHTADAYFPVHLQGNALDPLTNTMMGHLQTAMLADI
(1)
GSETGWKKDGLWSFVRRIVSEASFLTIFGKHK
SQTVEQERARLMVVMETFWDYDRKFPQVVAGIPFWMLGKAKEQRDFLL
(0)
AFLSKDNLNQRDVLQLIEQRDEVFCGGGLSGKELAGAHFSTVWASL (0)
SNTLPTAFWTLFHLLQDPVAMAAVRREVET
(0)
ETGQTVTGFRDGGEKIDFTRQQLADMTCL
(1)
GSVVNEALRVSSVSIVLRQALEETTIALNSGSTFKIRKGDRVALFPQIVHMDPEVYEDPE
(0)
TFKYDRYLENGKEKTTFYKNGKKLRHYLIPFGIGTSRCPGRFFAVNEIKQFVSLIVCYFD
MELIDKETPPLDQSRTGLGVLPPKTDPMFRYKIK*
AGAGCAACACTCTACCCACGGCCTTCTGGACACTCTTCCACCTC
CTACAGGACCCTGTTGCCATGGCTGCAGTCAGGAGGGAGGTGGAGACGGT
AGGAACCACAATAGCCCTTAAC
TCCGGCAGTACCTTCAAGATCCGCAAGGGAGACAGGGTGGCGCTGTTCCCACAAATCGTT
CACATGGACCCAGAAGT
Query: 1   SNTLPTAFWTLFHLLQDPVAMAAVRREVETETGQTVTGFRDGGEKIDFTRQQLADMTCLG 60
           +NTLP  FWTLFH+++ P AM A   EV      +         ++  TR+QL +M  L
Sbjct: 295 ANTLPATFWTLFHMIRCPAAMKAASEEVRQTFESSNQKVDPTNSRLVLTREQLDNMPVLD 354
Query: 61  SVVNEALRVSSVSIVLRQALEETTIALNSGSTFKIRKGDRVALFPQIVHMDPEVYEDPET 120
           S++ EA+R+SS S+ +R A  +  + L++  ++ IRK D +A++P ++H DPE+Y+DP
Sbjct: 355 SIIKEAMRLSSASLNVRMAKSDFLLQLDNKESYHIRKDDVIAMYPPMIHFDPEIYDDPLE 414
Query: 121 FKYDRYL-ENGKEKTTFYKNGKKLRHYLIPFGIGTSRCPGRFFAVNEIKQFVSLIVCYFD 179
           FKYDRY+ E G+EKT FY+NG+KLR+Y +PFG G ++CPGRFFAV+EIKQF+SL++ YF+
Sbjct: 415 FKYDRYIDEKGQEKTAFYRNGRKLRYYYMPFGSGVTKCPGRFFAVHEIKQFLSLLLSYFE 474
Query: 180 MELIDKET--PPLDQSRTGLGVLPPKTDPMFRYKIK 213
           MEL+D +   PPLDQSR GLGVL P  D  FRY++K
Sbjct: 475 MELLDSDVKEPPLDQSRAGLGVLQPTYDVDFRYRLK 510
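The percent identities quoted throughout these notes come from pairwise alignments like the one above. A minimal sketch of counting identities in one aligned block, assuming the BLAST-style convention of dividing by the alignment length with gap columns included. The function name alignment_identity is hypothetical.

```python
def alignment_identity(query, subject):
    """Return (identities, alignment_length) for two gapped, aligned strings."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have the same length")
    identities = sum(
        1 for q, s in zip(query, subject) if q == s and q != "-"
    )
    return identities, len(query)


# Last block of the amphioxus CYP7 vs danio CYP7A alignment above.
QUERY = "MELIDKET--PPLDQSRTGLGVLPPKTDPMFRYKIK"
SBJCT = "MELLDSDVKEPPLDQSRAGLGVLQPTYDVDFRYRLK"

identities, length = alignment_identity(QUERY, SBJCT)
print(f"{identities}/{length} = {100 * identities / length:.0f}% identity")
```

Summing the identities and lengths over every block of an alignment gives the overall figure; a single block on its own can be well above or below it.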
>CYP7A1
Fugu Scaffold_5172 Length = 18849
59% to 7A1
= LGW1565.x1 Length = 555 27-153 CYP7A1
= LGW57257.y1 50% to 7a1 238-350
= LOL6406.x1 61% to 7A1 390-436 also LOL6406.y1
= LGW154142.y1
= LGU7599.x1
insertion of 6 aa in exon 4 vs mammalian seqs, but probably real; see zfish seq below
14694 MILSIALIWAVVVGFCCLLWLAVGIRHR 14777 (2)
15070 HSSEPPVENGLIPYLGCALQFGANPLQFLRSRQKKYGHIFTCKIA 15204 frameshift
15205 GQYIHFLCDPFSYHSVIRQGRHLDWRKFHFATSVK 15310 (0 expected) bad boundary
15425 AFGHDSFDPRHGHTTENLHQ 15484
15485 TFLKTLQGEALPSLIKTMMGHLQDVMLKSDTLRRSKDHWEVDGIFAFCYK 15634 (0)
15757 VMFESGYLTLFGKELGEDTCQARQAAQKALVLNALENFKEFDKIFPALVAGLPIHVFKSAYSARE 15951 (0)
16053 NLAKTMHAEKLSKRENVSDLISMRMILNDSLSTFNDVSKARTHVALLWASQANTLPATFWSLFYMIR 16253 (2)
16383 SPDAIKAAREEAQKVFETFGVKIDPHNPTLNLTRDVLDNMPVL 16511 (1)
16744 DSIIKEAMRLSSASLNIRVAKEDFLLHLDNQEAYRIRKDDVIALYPPMLHYDPEIFEDPY 16923 (0)
17029 EYKFDRFLDENNQEKTTFTRNGRKL 17103
17104 RYFYMPFGSGVTKCPGRFFAVYEIKQFLTLVLTYFDMELLDPAIQVPPLDQSRAGLGILQ 17283
17284 PTYDVDFRYKLKLAY* 17331
danio mRNA for 7A
at
gatcctaacc atttccttca tttgggccat
61
agtggttggt ctttgctgtt gtctttggct tattacagga atacgcagaa gacatcctgc
121 agagcctcca
ttagagaatg gctggattcc cttccttggc tgtgctcttc agtttggggc
181 aaatccttta
gagtttcttc gcagcagaca gaagaagcat ggccatattt ttacatgcaa
241 gattgctggg
cagtatgttc atttcctttg tgatccattc tcctaccatg ctgtcatccg
301 tcaaggaagg
caccttgact ggaagaaatt tcactttgat gcctctgcga aggcatttgg
361 tcatgagagc
atggatccca gtcaaggtta caccactgag aatttgcatc agactttcct
421 gaagaccctg
caaggggatg ccttgtcttc tctaattgag accatgatgg aaaacctcca
481 gggcaccatg
ctgcaatccg gaatgctgaa ggccacaacc tctgaatggc aaagtgatgg
541 tatttacgcc
ttctgctaca aggtcatgtt tgaagcaggc tacctgaccc tcttcggaaa
601 ggaactggat
ggggaccaga gcattgcacg tcagcaggcc caaaaggctc tggtgctcaa
661 tgctttggac
aactttaaag agttcgataa gatcttccca gctctgatcg ctgggctccc
721 cattcatgtt
tttaagagtg cctacagcgc tcgtgagaaa cttgccaaga ctatgctcca
781 tgagaacctc
agcaggcgtg ccaatgtgtc tgatctcatc tccttgcgca tgcttttgaa
841 cgacacacta
tctaccttca acgagctgag caaagcccgg acccacgtcg ctatactttg
901 ggcttcacaa
gccaacactc tgcctgcaac cttctggact ctgttccaca tgatcaggtg
961 ccctgcggca
atgaaggctg ctagtgagga ggtgaggcaa acctttgaaa gttctaatca
1021 gaaagttgat
cctacaaatt ctcggcttgt actgacaagg gagcagttgg acaacatgcc
1081 agttttagac
agcatcatta aagaggcgat gagactgtcc agtgcatccc ttaatgtgag
1141 aatggccaag
agcgatttcc ttcttcaact agacaataag gagtcttacc acattcggaa
1201 agatgatgta
attgctatgt acccaccgat gattcacttt gatcctgaaa tttatgatga
1261 tcctttggaa
ttcaagtatg atagatacat tgatgagaaa gggcaggaaa agaccgcctt
1321 ttaccgtaat
gggcgcaagc ttcgttacta ctacatgccc tttggctctg gggtgaccaa
1381 atgcccagga
cgcttctttg ccgtgcatga aatcaagcag ttcttgtctt tgttgctatc
1441 gtactttgag
atggaacttt tggactctga tgtgaaagaa ccgccgttgg accagtctcg
1501 ggctggactg
ggtgtactgc agcccaccta tgatgttgac tttcgttaca gactcaaatc
1561 tctc
MILTISFIWAIVVGLCCCLWLITGIRRRHPAEPPLENGWIPFLGCALQFGANPLEFLRSR
QKKHGHIFTCKIAGQYVHFLCDPFSYHAVIRQGRHLDWKKFHFDASAKAFGHESMDPSQG
YTTENLHQTFLKTLQGDALSSLIETMMENLQGTMLQSGMLKATTSEWQSDGIYAFCYKVM
FEAGYLTLFGKELDGDQSIARQQAQKALVLNALDNFKEFDKIFPALIAGLPIHVFKSAYS
AREKLAKTMLHENLSRRANVSDLISLRMLLNDTLSTFNELSKARTHVAILWASQANTLPA
TFWTLFHMIRCPAAMKAASEEVRQTFESSNQKVDPTNSRLVLTREQLDNMPVLDSIIKEA
MRLSSASLNVRMAKSDFLLQLDNKESYHIRKDDVIAMYPPMIHFDPEIYDDPLEFKYDRY
IDEKGQEKTAFYRNGRKLRYYYMPFGSGVTKCPGRFFAVHEIKQFLSLLLSYFEMELLDS
DVKEPPLDQSRAGLGVLQPTYDVDFRYRLKSL
CYP8 like
(looking, no obvious CYP8s, but at least three CYP7s)
>GENE A
85% to Gene B 53% to CYP7 amphioxus 43% to 8B1 fugu, 37% to 7A1 Fugu
AFSA913951.b2
possible N-term missing start MET but similar to Gene B
Only matching seq in database probably has errors
TELLSVCLGGVLAFVLLQVITRRM
ACAGAGCTGCTGAGCGTCTGTCTGGGCGGAGTCTTAGCCTTCGTGCTTCT
ACAGGTTATAACAAGGCGTATGGT
ATUP32525.b1 mate pair to exon 2
MVTELLGVCLAVVLVFVLLQVTTRRR
(2)
ATWX106442.b1
exon 2
AFPZ751569.x1
ATUP32525.g1 (note mate pair has exon 1)
ATWW184707.g1 AFSA329657.g2
(2)
RPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWKKYGDVFTVRLAGHYTTFVLDPHSFTHAIRNSK (2)
VLDFRVFSSKIAHRAFGMPIVYGTHRDWVRADSDALYPKELQGQGLEKVTE
(0)
AWYB9202.g1
AFPZ202055.x1 ATUP209080.y2
AFSA329657.g2
VMMTNLQSAMLAATDVKAEWNKGELWSFVYRIMFSGK (0)
ATUP926791.g1
CTCAGACNGGGGTTAACGGATGTTTATGATTTNTGGTATGTAAATAGCTTAAGTGATGCTTACATAAATGACATATTAATTATGCAAATTGGTATCTAATTTTCCTAATTAGTTAAGAAATTTTGTAAACACGCTCAGTTCCATTATAGGACCATTGCAACATGTAACATTTGTAACTGAGAAGGAGAAGAATATTGATTGATAGATGTTATGCAAAATAAAGATCTAATTTGCATAATTAATGAGAAAATGCTATAATTTTATTGTGGTAAATAACAGGAAGTTCTTACTTGTTGCATTTGAAAGTTATGTATAGGTGAACATTAATTATGCAAATGAGATCCTCATATGCATAAATTACCTGAAAATGCGATAAAAGCCTTCTTTCTTAACATAGATTGTGTACGTCTGTTGTCTGATAGAGGAGATGATCAACTGATATAATTTATGCAAATGAGAACCTTATTTGCATAATTAATGACAAACA
CAATTTACACAGCAGTCGTAAAGGTACGGATCATTTGGCGAAGGTATGGGGTCGTGGAACTCTAGTTAAGATTGAAGTTGGGGATAGTACTTTTCTGAGGGCTTGTTTCAGTGTATTTTGACAAGCTTTCCTATGGAGAGATCTAACTGTGTTATTTGCCCCGTACACTCCAGGAGACCTGGTGAGCCGCCATTGGAGCCAGGCCCTCTTCCGTACCTGGGGGTCGCCCTGGAGTTTTCCAGAAACCCCCTGGGTTTCATTACTTCCCGCTGGAAGAAATATGGAGACGTGTTCACCGTGCGACTGGCCGGCCACTACACCACCTTCGTCCTGGACCCGCACTCATTCACTCACGCCATCCGGAACAGCAAGTCAGTATGAACAAACATGTTGTAAAGAACCATTAAAGGTGTGTTCACACATGCGTATAGATTGAAATCCGTAAAAATAGTAGAAATGGACAAATAGGTCCAGCGACCCNNNN
>gnl|ti|669707816 name:ATWX106442.b1 mate:669707912 mate_name:ATWX106442.g1 template:ATWX106442 end:R + strand
NGTGCTTTATGCAAC
CAATTTACACAGCAGTCGTAAAGGTACGGATCATTTGGCGAAGGTATGGGGTCGT
GGAACTCTAGTTAAGATTGAAGTTGGGGATAGTACTTTTCTGAGGGCTTGTTTCAGTGTATTTTGACAAG
CTTTCCTATGGAGAGATCTAACTGTGTTATTTGCCCCGTACACTCC
AGGAGACCTGGTGAGCCGCCATTG
GAGCCAGGCCCTCTTCCGTACCTGGGGGTCGCCCTGGAGTTTTCCAGAAACCCCCTGGGTTTCATTACTT
CCCGCTGGAAGAAATATGGAGACGTGTTCACCGTGCGACTGGCCGGCCACTACACCACCTTCGTCCTGGA
CCCGCACTCATTCACTCACGCCATCCGGAACAGCAAGTCAGTATGAACAAACATGTTGTAAAGAACCATT
AAAGGTGTGTTCACACATGCGTATAGATTGAAATCCGTAAAAATAGTAGAAATTGGACAAATATTGTCAG
ACGAGTTAGCAGGGTCACAAACTTCCTGGCTAGCTTACTCCTCTAGTATGTTCTGCGTCTACAGTTGTCG
AATTTCTACTATTTTCCAGCAACCTGTGGAGCCCGGTAGAGACTAAATAGGGTATTTGGGGATCGCTAAG
TCGGTGGGAGGTAGTTTCAGAACTCCATAAGGACCTCATACGGACTTGAATATATACGCACGTGTGGATC
TGATAGGTATACAGTTCAGTTTTTTGCTTGTTCATTTGTTTGTTCATTTGTTTGTTCATTTGTTTGTTCA
TTTGTTTGTTCATTTGTTTGTTCATTTGTTTGTTCATTTGTTTGTTCATTTGTTTGTTTGTTTGTTTGTC
TGACATAGCCGAAAAACCACTTTCAGGCGGAACACACCTGAACACCATCCCAAATGACGTCCTTGACCGA
AACTAGGTTCTCATTTTCAGCTACTATGATGATGTGTATTTAATAGTCAATCAT
>joined ATWX106442.b1 and ATUP926791.g1 and AFSA75691.x1
NNNNNNNNGGGGGGGAAGGAGAANCGGACGCTCGCACTCTCACCACTTCTCCGTGTCTAACCATCACACG
TGTGTGTATAGCGAATAACAATAACCTGTGCTGTGGGGTACAGAAGAATGTTACCTCGCCAAGACAGTTA
TATTTTGGGTAGCGTTTGTTTGTATATATGTAACGTCATGTATGTACATGTATGTATGTATGTATGTAGA
GACCAGCATAACTAAAAAAAGCCTTAATGGATTGTATTGCTATTTAGTATGTGGGTAGGTCTTGATGAGA
CCTGGAAACGATTAGATTTTGGGCCCCCTAGCAGCTTGTTACGGTACTGCAGCAGAGCTTCCTGGTTTAA
TATCTCGAGTTCTGAACATGCTGCGGCTATGATTTTTGAGTGGTAGACAGGTATTGGTGCCGAGAGTAAG
TGGTGTAGGTTTGGGCCCCCTAGCGGCTTGTTTTGAAACTGCAGGGCAGTGTCAGACTTTAAAAGGGAAT
AACTCAAGAACGGGTTAACGGATTGTTATGATTTTTGGTATGTAAATAGCTTAAGTGATGCTTTACATAA
TGACATATTAATTATGCAAATTGGTATCTAATTTTCCTAATTAGTTAAGAAATTTTGTAAACACGCTCAGTTCCATTATAGGACCATTGCAACATGTAACATTTGTAACTGAGAAGGAGAAGAATATTGATTGATAGATGTTATGCAAAATAAAGATCTAATTTGCATAATTAATGAGAAAATGCTATAATTTTATTGTGGTAAATAACAGGAAGTTCTTACTTGTTGCATTTGAAAGTTATGTATAGGTGAACATTAATTATGCAAATGAGATCCTCATATGCATAAATTACCTGAAAATGCGATAAAAGCCTTCTTTCTTAACATAGATTGTGTACGTCTGTTGTCTGATAGAGGAGATGATCAACTGATATAATTTATGCAAATGAGAACCTTATTTGCATAATTAATGACAAACA
CAATTTACACAGCAGTCGTAAAGGTACGGATCATTTGGCGAAGGTATGGGGTCGT
GGAACTCTAGTTAAGATTGAAGTTGGGGATAGTACTTTTCTGAGGGCTTGTTTCAGTGTATTTTGACAAG
CTTTCCTATGGAGAGATCTAACTGTGTTATTTGCCCCGTACACTCC
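A minimal sketch of joining two overlapping reads, as was done for the joined sequence above. It assumes both reads are already on the same strand and that the overlap matches exactly, so low-quality read ends would need trimming first. The function name merge_by_overlap and the min_overlap default are hypothetical, not the method actually used for these notes.

```python
def merge_by_overlap(left, right, min_overlap=20):
    """Join two reads on the longest exact suffix/prefix overlap.

    Returns the merged sequence, or None when no overlap of at least
    min_overlap bases is found.
    """
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None


# Toy reads (not from the trace archive) with a 6-base overlap.
merged = merge_by_overlap("AAACCCGGGTTT", "GGGTTTACGT", min_overlap=4)
print(merged)
```

Real trace reads carry base-calling errors near their ends, so an alignment-based join that tolerates mismatches is usually needed in practice; this exact-match version only illustrates the idea.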
>GENE B
AFPZ923737.x1
AFPZ877092.x1 AFSA910992.g2 AFSA351865.g2
AFPZ570243.x1
AFPZ9740.b2 ATUP4956.b1 AFPZ718164.y1 AFPZ823314.b2
MVTELLGVCLAVVLVFVLLQVTTRRR (2) probable N-term
ATGI76532.g2
2 aa diffs to gene A
RPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWRKYGDVFTVRLAGHYTTFVLDPHSFTHAIRNSR (2)
Joined AFSA351865.g2 and ATGI76532.g2
AAAAATTTTTTTTCAGGCTAGGAGGCTAACTTGAGGAATTTAAACATTTTTTTCGGGGCTTCTGGAGAAAGTACCGTCCCTGGAACAGGGAGGGGAAAGGTCGCAGTGCAAAGGGCATGTACACTCCACATGTACAGAGCTGCGGGGGATGAAGAAATAGCCGGCAGGCAGGTTAATGCACCCTGGATCCTAAATTTCGCCTTTGAGACGACATTTGAGGCGATCATTGGAGTTAGACTTTCATCGCACATGCGCCTTAACTGTGTTAGAGTCACACAGACTGACCTGCCTGCCGACCTACAGCAGGATGGTGACAGAATTGCTAGGCGTTTGTTTGGCCGTAGTCTTAGTCTTCGTGCTTCTACAGGTCACAACAAGGCGTAGGTAGGAGAAACTATTTCACGTCTTAACAACTTGTCGGAGGCTGTTGCCCGGCTTGACTACGGCGCACGCAGGGGGTATGCGATTATACTGAGTATTAAGGGGGTGAAGTTTATCTACTCACATCGTGAGTCCTATCAAAGGCATTTCATATTCATAGAT
ACACGGGTTTCTTATCCAGTCCATTCTACATCAAAGGGGTCGTGTCGAACATTAACTTGGAGTGTGAGTC
GCTCCTGTGCTATTCAACGAGTTGATGAGCACCCTATTCCACTAGACGGCGATCACGCTGAGACCTTGCT
GCGACGGTGCGACCTAAACTGGATTTAATTCTGGAACTCCTGAATTCATAATTGCAATATCATGCAAACG
TTACGTAAAAAGTATTACTGGAAAAAAGTCATATTTTTTGTTGAAGTCGTTGAGCGCTCTGTCAAGCGCG
GTCGTATAGCGAGATCACCGCCTTGTGGAATGGAGGTGTAAGTTGTTGTCACTACCTTGATACTATTCCC
CTTCCCACAGGAGACCTGGTGAGCCTCCCCTGGAGCCGGGGCCTCTCCCGTACCTAGGCGTCGCCTTGGA
GTTTTCCCGAAACCCCCTGGGCTTCATCACCTCCCGCTGGAGGAAATATGGCGACGTGTTCACCGTGCGG
CTGGCCGGCCACTACACCACCTTCGTCCTGGACCCGCACTCATTCACTCACGCCATCCGGAACAGCAGGT
TAGTATGAAAGGAAATGTCTTATAACAACTACAGTGAGCTTTTCCAAACGTGCCTTGTATATAGATCACT
GTTGTTTTAGTTTGTTTGTTCGTTTTTTGTTTGGTTGACATAGCCGATAAACCGCCTTCAGGCATAAAAC
ACCAGTTCTGTAACAAGCTGTGAAGTTAAAAAGTTAAACCCTTCCCACACCAAGATGGCGCATAGGGCGG
TGCCCATCTCTGTTTCATTAGCCCTGGGCCACACACAACGCAATCACTACAGCAGGGGGCTAGTCCACTG
GTAGTGGTGTGTGTTCAACTTCCATACTCTTTCCCGAATGCTGAGTGCTAAGCAGAGAAAGCAGCATGCA
CCATTTTTAAAGTCTTTGGTATGACTCGG
AFPZ88959.b2
mate pair of AFPZ88959.g2
AFPZ88959.b2 APNK98223.b2 ATGN213244.b1
ATWX3781.b3
ATUP385028.g1 ATGI219815.g1 ATWW61924.b1
ASWX32142.g2 APNK48064.b2
ASFW94882.g2 AFSA507746.b2 AFSA700071.b2 AFPZ958552.x1
ATGN85334.b1
GLDFRLFSSKIAHRAFGVPIFYGTPRDWVRADSDALWPKELQGQGLDKVTE
AGGGGTCTGGACTTCCGACTATTTTCGTCCAAGATAGCTCATCGCGCCTTCGGGGTGCCTATA
TTCTATGGTACACCCCGCGACTGGGTCCGAGCCGACAGTGATGCGCTATGGCCAAAGGAA
TTACAGGGGCAGGGAGTGGATAACATCACAGAGGT
ATUP4956.g1
VMMSNLQSAMLAATDVTVEWNKGELWSFVYRIIFSGK
(0)
AFSA160234.g2
walking up from ATGN200156.g1
(0) TLFGRHKEEKDETALLLHAMEEFQKYDKRFPEIISNVPWWLMGHTKKRYEYLK
(0)
ATGN200156.g1
ATGI16796.g1 ATGN186157.b1 ATUP270375.g1 AFSA312546.b2
AFPZ88959.g2
AFPZ541093.x1
Walking up
from ATGN200156.g1
AFPZ244071.x1 AFSA312546.b2 (SDFI + VRAE exons)
AFSA19010.x1
SMVSPAGLGQRGVSDFIRMRQEIYADANLTPDEITACNFATMWASL
(0)
SNTVPAAFWTLFYLLKDPVAMAAVRAE
(0)
AGAGTAACACCGTCCCTGCCGCCTTCTGGACCTTGTTCTACCTCCTGAAG
GATCCTGTCGCCATGGCTGCCGTTAGGGCGGAAGT
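The exon nucleotide sequences in these notes appear to be written with the intron's AG acceptor at the front and GT donor at the end, and the numbers in parentheses give the intron phase. A minimal sketch, assuming that layout and the standard genetic code, of how such an exon can be re-translated to check the peptide line (the function name `translate_exon` is hypothetical, not part of any tool used here):

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_exon(dna, start_phase=0):
    """Translate an exon written with flanking AG ... GT splice sites.

    start_phase is the number of bases of the first (split) codon that lie
    in the previous exon: 0, 1 or 2. Split codons are not translated.
    Returns (peptide, end_phase).
    """
    dna = "".join(dna.split()).upper()
    if not (dna.startswith("AG") and dna.endswith("GT")):
        raise ValueError("expected AG acceptor and GT donor flanks")
    core = dna[2:-2]                      # drop the intron dinucleotides
    skip = (3 - start_phase) % 3          # bases that finish a split codon
    coding = core[skip:]
    n_full = len(coding) // 3
    peptide = "".join(
        CODON_TABLE.get(coding[i:i + 3], "X")   # X for codons containing N
        for i in range(0, 3 * n_full, 3)
    )
    end_phase = len(coding) % 3           # leftover bases of the next split codon
    return peptide, end_phase


# The VRAE exon as listed above (phase 0 on both sides).
vrae = """AGAGTAACACCGTCCCTGCCGCCTTCTGGACCTTGTTCTACCTCCTGAAG
GATCCTGTCGCCATGGCTGCCGTTAGGGCGGAAGT"""
print(translate_exon(vrae, start_phase=0))
```

For first exons (starting at ATG) or last exons (ending at the stop codon) the flank check would need to be relaxed.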
Matches to
VRAE exon
AFPZ541093.x1 ATGI16796.g1 ATGN341992.g1
ATGN200156.g1
ATGN281106.g1 ATGN186157.b1 ATUP270375.g1
ATGI164468.g1
AFSA312546.b2 AFPZ416567.g2
1 aa diff:
IRAE instead of VRAE
AFPZ88959.g2 ATUP769642.y1 ATUP460828.g1
Walked down from ATGN281106.g1 to ASFW19202.g2 (82% match).
Note: ASFW19202.g2 was the limit of extension upstream from the EXXR exon.
Walked down from ASFW19202.g2 to ATUP281063.x2 and found the missing exon
SLETMKEAGKMIHVTREQLNDMKCL
(1)
Related
C-terminal sequence
>ATUP314736.g1
TFKYDRFLENGMEKTTFYKNGRKLRHYLLPFGHGVSMCPGRFFALNEI
KQFVTIVICYFNMELLEKQTPPKDQSRAGLGTLAPLKECLFRYSLK*
Walked up
from ATUP314736.g1 to AFPZ78488.b2 ATGI251736.b1
Found EXXR
exon
(1)
GSAINEALRMCSASIIIRVATDDAELALESGSTFRIRKGDRVALYPGFVHMDTEVFDDPE (0)
Walked up from AFPZ78488.b2 to AFPZ1765.y1 ATGI167277.g1 ASFW94882.b2
No exon found
Walked up
from AFPZ1765.y1 to ATUP281063.x2 ATWW105983.b1 ATUP770651.x1
ATGI221136.g1
AFSA540330.b2 ATGI167277.g1 ASFW19202.g2
ASFW19202.g2
could not be extended further upstream
ASFW94882.g2
mate pair to ASFW94882.b2 links this C-term seq to this N-term exon
Only 3 aa
diffs to AFPZ88959.b2
GLDFRLFSSKIAHRAFGVPIFYGTPRDWVRADSDALWPKELQGQGLDKVTE
ATUP770651.y1
mate pair to ATUP770651.x1
Same as
AFSA160234.g2
(1)
SYKTLFGRHKEEKDETALLLHAMEEFQKYDKRFPEIISNVPWWLMGHTKKRYEYLK (0)
The evidence suggests that Gene B is composed of two parts joined by
multiple mate pairs
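The mate-pair argument can be checked mechanically. This is a minimal sketch, assuming (from the names used in these notes) that a trace name is a clone identifier followed by a read-end letter and a read number, with .b/.g and .x/.y marking the two ends of one clone. The helper names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming pattern: CLONE.<end letter><read number>, e.g. ASFW94882.g2
TRACE_RE = re.compile(r"^([A-Z]+\d+)\.([bgxy])(\d+)$")
MATE_OF = {"b": "g", "g": "b", "x": "y", "y": "x"}


def parse_trace(name):
    """Split a trace name into (clone, end letter, read number)."""
    m = TRACE_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised trace name: {name}")
    return m.group(1), m.group(2), int(m.group(3))


def mate_pair_links(reads_a, reads_b):
    """Clones with one end among reads_a and the opposite end among reads_b."""
    ends_a = defaultdict(set)
    for name in reads_a:
        clone, end, _ = parse_trace(name)
        ends_a[clone].add(end)
    links = set()
    for name in reads_b:
        clone, end, _ = parse_trace(name)
        if MATE_OF[end] in ends_a.get(clone, ()):
            links.add(clone)
    return sorted(links)


# A few reads taken from the two parts of Gene B listed above.
upstream_part = ["AFPZ88959.b2", "APNK98223.b2", "ATGN213244.b1",
                 "ASFW94882.g2", "AFSA507746.b2"]
downstream_part = ["AFPZ88959.g2", "ATUP769642.y1",
                   "ASFW94882.b2", "AFPZ1765.y1"]
print(mate_pair_links(upstream_part, downstream_part))
```

Each clone returned has one read end in each part, which is the kind of link the statement above relies on.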
>Gene B
assembled
42% to
CYP7A fugu and 40% to CYP8B2 fugu only 31% to 8A1
MVTELLGVCLAVVLVFVLLQVTTRRR
(2)
RPGEPPLEPGPLPYLGVALEFSRNPLGFITSRWRKYGDVFTVRLAGHYTTFVLDPHSFTHAIRNSR (2)
GLDFRLFSSKIAHRAFGVPIFYGTPRDWVRADSDALWPKELQGQGLDKVTE
VMMSNLQSAMLAATDVTVEWNKGELWSFVYRIIFSGK
(0)
TLFGRHKEEKDETALLLHAMEEFQKYDKRFPEIISNVPWWLMGHTKKRYEYLK
(0)
SMVSPAGLGQRGVSDFIRMRQEIYADANLTPDEITACNFATMWASL
(0)
SNTVPAAFWTLFYLLKDPVAMAAVRAEVDQ
(0)
ETGQSLETMKEAGKMIHVTREQLNDMKCL
(1)
GSAINEALRMCSASIIIRVATDDAELALESGSTFRIRKGDRVALYPGFVHMDTEVFDDPE
(0)
TFKYDRFLENGMEKTTFYKNGRKLRHYLLPFGHGVSMCPGRFFALNEI
KQFVTIVICYFNMELLEKQTPPKDQSRAGLGTLAPLKECLFRYSLK*
>BJ652936.1|
BJ652936 Eptatretus burgeri hagfish
cDNA clone
hg128o16 5', mRNA sequence.
Length = 591
Query: 257
VPAAFWTLFYLLKDPVAMAAVRAESLETMKEAGK------MIHVTREQLNDMKCLGSAIN 310
+PAAFW
L++LL P A+ +R E + +K G+ ++ ++ L ++ CLGSAI+
Sbjct:
14 LPAAFWALYHLLCHPDALTVIRKEVDDVLKSTGQYPKPSSLLKLSPTTLPNLVCLGSAIS 193
Query: 311
EALRMCSASIIIRVATDDAELALESGSTFRIRKGDRVALYPGF-VHMDTEVFDDPETFKY 369
E+LR+CSASI IRVA DD +L LE G T
+RK D VA+YP +H+D E++
+PE +KY
Sbjct: 194
ESLRLCSASINIRVAQDDLDLELEPGRTVPLRKNDWVAMYPQTALHLDPEIYPEPEIYKY 373
Query: 370
DRFLENGMEKTTFYKNGRKLRHYLLPFGHGVSMCPGRFFALNEIKQFVTIVICYFNMELL 429
DRFLENG EKT FYK G+KL HYL+PFG GVSMCPGRF ALNEIKQF+ ++I ++E+L
Sbjct: 374
DRFLENGQEKTNFYKGGQKLHHYLMPFGSGVSMCPGRFLALNEIKQFLFLLIAVLDLEIL 553
Query: 430
EKQTPPK 436
Q K
Sbjct: 554
PDQPQVK 574
>gi|58647881|gb|CX908537.1|
JGI_CAAN1354.fwd NIH_XGC_tropTe4 Xenopus tropicalis cDNA clone
IMAGE:7686846 5', mRNA sequence.
Length = 784
Score = 249 bits (636), Expect = 5e-65
Identities = 121/260 (46%), Positives = 175/260 (67%), Gaps = 7/260 (2%)
Frame = +3
Note the strong match in yellow upstream of EXXR.
The region before the yellow is a poor match, and there might be another
small exon in this region that fits better.
Query: 177
FQKYDKRFPEIISNVPWWLMGHTKKRYEYLKSMVSPAGLGQRG-VSDFIRMRQEIYADAN 235
F K+D
+FP ++ N+P L+G TKK E L P
+ +R +S+ ++ R+ +
Sbjct:
3
FTKFDAKFPYLVINIPIALLGATKKIREELIHFFFPNKMEKRSEISEVVQERKNVLEQYE 182
Query: 236
LTPDEITACNFATMWASLSNTVPAAFWTLFYLLKDPVAMAAVRAEVDQILKETGQSLETMKEAG 291
L + A +FA +WAS+
NT+PA FW ++YL++ P A+AAVR E
++ G+
Sbjct: 183
LQDYDRAAHHFAFLWASVGNTIPATFWAMYYLVRHPEALAAVRDEIDHLLQSTGQKKGPE 362
Query: 292
KMIHVTREQLNDMKCLGSAINEALRMCSASIIIRVATDDAELALESGSTFRIRKGDRVAL
349
IH+TREQL+ M LGSAI E+ R+C+AS+ IR+ +D +L LE
T R+RK D +AL
Sbjct: 363
YDIHITREQLDSMVLLGSAIKESFRLCAASMNIRLVQEDFDLELEGNQTIRLRKDDFIAL
542
Query: 350
YPGFVHMDTEVFDDPETFKYDRFLENGMEKTTFYKNGRKLRHYLLPFGHGVSMCPGRFFA 409
YP +HMD E+++DPE +KYDRF+ENG
EK FYK G+KL+ YL+PFG G S CPGRFFA
Sbjct: 543
YPPALHMDPEIYEDPERYKYDRFVENGKEKILFYKKGKKLKEYLMPFGSGTSKCPGRFFA 722
Query: 410
LNEIKQFVTIVICYFNMELL 429
+NEIKQF+ +++ Y +MEL+
Sbjct: 723
MNEIKQFLAVLLIYVDMELV 782
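The Identities figure in a BLAST header is the number of identical columns over the alignment length, and the headers quoted in these notes read as if the percentage is truncated rather than rounded (35/80 is shown as 43%). A minimal sketch of that count for one pair of aligned rows, assuming both rows are the same length and gaps are written as "-" (the function name is hypothetical):

```python
def alignment_identity(query_row, subject_row):
    """Return (identical columns, total columns, truncated percent identity)."""
    if len(query_row) != len(subject_row):
        raise ValueError("aligned rows must have the same length")
    columns = len(query_row)
    identical = sum(
        1 for q, s in zip(query_row, subject_row)
        if q == s and q != "-"           # a gap never counts as a match
    )
    return identical, columns, 100 * identical // columns


# Last Query/Sbjct block of the Xenopus alignment above.
print(alignment_identity("LNEIKQFVTIVICYFNMELL",
                         "MNEIKQFLAVLLIYVDMELV"))   # (10, 20, 50)
```

Running it over every block of an alignment and summing the first two values gives the overall figure to compare with the header.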
>CYP7
amphioxus for comparison 54% to Gene B same family
MISGILAGCLVVLVVAILVQAVGRKR (2)
RDPNEPPLESGPVPYLGVALQFAMDSLKFIRSRQKKYGDVFTVKLAGKYTTFVLDPHSYSDVMRQHK
(2)
ILDFKTVGMDIVERGFGTTHFERTGRAHVLHTADAYFPVHLQGNALDPLTNTMMGHLQTAMLADI (1)
GSETGWKKDGLWSFVRRIVSEASFLTIFGKHK
SQTVEQERARLMVVMETFWDYDRKFPQVVAGIPFWMLGKAKEQRDFLL
(0)
AFLSKDNLNQRDVLQLIEQRDEVFCGGGLSGKELAGAHFSTVWASL (0)
SNTLPTAFWTLFHLLQDPVAMAAVRRE
(0)
TVTGFRDGGEKIDFTRQQLADMTCL
(1)
GSVVNEALRVSSVSIVLRQALEETTIALNSGSTFKIRKGDRVALFPQIVHMDPEVYEDPE
(0)
TFKYDRYLENGKEKTTFYKNGKKLRHYLIPFGIGTSRCPGRFFAVNEIKQFVSLIVCYFD
MELIDKETPPLDQSRTGLGVLPPKTDPMFRYKIK*
CYP11 like
(looking)
15th
International Conference on Comparative ENDOCRINOLOGY
MAY 23-28
2005 BOSTON
P15.3 Wed, 16:30-18:30 Spawning behavior and sex steroids in
amphioxus MIZUTA, T*, KUBOKAWA, K; Ocean Research Institute, University of
Tokyo
Abstract:
Amphioxus is the evolutionary closest animal to vertebrate. We have studied
reproductive behavior of captive amphioxus in tanks. Spontaneous spawnings were
recorded and analyzed during reproductive periods. Characteristics of the
behavior are as follows: 1) The first spawning animal was a male every spawning
day with no exception. 2) Spawnings of male and female were non-synchronous.
Spawnings lasted approximately 2 hours after the first male spawning of the
day. 3) We failed in the prediction of the spawning day in tank and the habitat, although spawning
occurs after the sunset in dark. 4) Level of gonadal maturation differed widely
among individuals even in the breeding season and from this fact we supposed
that spawning occurs irregularly when animals attain gonadal maturation. In
amphioxus as in all vertebrates and some invertebrates so far as studied,
endogenous endocrine factors would play an important role in inducing spawning.
To confirm previous reports histochemically showing existence of sex steroid
hormones in amphioxus gonads, we attempted to measure testosterone,
progesterone and estradiol-17beta in extracts of fully matured ovaries of
amphioxus by radioimmunoassay (RIA). Progesterone and a steroid-like substance
that shows a similar replacement curve with estradiol-17beta were detected.
Concentrations of these steroids in mature amphioxus ovaries were significantly
higher than those in ovaries of amphioxus collected in the non-breeding season.
In addition, we
demonstrated immunopositive reactions to antibodies against 3beta-HSD of fish
and P450scc of a commercial source in peripheral epidermal cells and
inner parts of an oocyte, respectively. These facts suggest that the pathway from cholesterol to
progesterone known in vertebrates exists at least in mature amphioxus.
CYP19 like
(looking, cannot find it with megablast of danio or Xenopus CYP19 mRNA)
>AFPZ686853.b2
possible N-term, not great. Two possible start METs
AFPZ686853.b2 AFPZ745305.x1 ATUP248884.b1
AFSA807178.g2
ATWW145756.g1 ATUP442682.g1 APNK55600.b2
MLQFLVIESRGSFPLNRSRTRHGITSQIEADGCS
MDTGEGWDVLLVVLLVVLVWYYIRETWTSGIDGIFPP
(1)
ATGTTGCAGTTTCTAGTGATAGAAAGCAGA
GGTAGCTTCCCATTGAACAGAAGTCGTACCAGACACGGCATAACCAGTCAGATCGAGGCA
GACGGCTGCAGTATGGACACAGGCGAGGGATGGGATGTTCTGTTAGTCGTGCTGCTTGTT
GTGCTTGTCTGGTACTACATCCGGGAAACCTGGACCAGCGGGATCGACGGGATATTTCCC
CCAGGT
>AFSA119741.g2
ASFW13005.g2 AFSA119741.g2 AFSA64159.g2
ATGI113449.b1 ATUP184988.g1
AFSA770314.b2 AFSA140463.g2 AFPZ766687.y1
APWS36590.b1 AFPZ686853.b2
(1)
GPPYIPLLTPLWTLWVFLHDGIWAATAGYAAKYGDFVRVWLGTEQTFIISR (2)
AGGTCCTCCGTACA
TCCCGCTATTGACGCCGCTATGGACCCTATGGGTGTTCCTTCACGATGGCATCTGGGCAG
CCACGGCCGGGTACGCCGCCAAGTACGGGGACTTCGTGCGGGTCTGGCTCGGCACCGAGC
AGACTTTCATCATCAGCAGGT
>AFSA903966.g2 AFSA119741.g2 AFPZ531554.y1
ATGI113449.b1
AFPZ766687.y1 ATUP184988.g1 ATWX16177.g1
(2)
ASAAAHVLKSSKYRARFGDPSGLAQIGMNGSGVIFNNDVQSWKFLRFFFVK (1)
AGAGCATCAGCAGCTGCGCATGTGCTTAAGTCCAGTAAGTACC
GGGCGCGGTTCGGCGACCCTTCTGGGCTTGCGCAAATCGGCATGAACGGCTCGGGCGTCA
TCTTCAACAACGACGTGCAGAGCTGGAAGTTCCTCCGCTTCTTCTTCGTCAAAGGT
AFPZ213319.y1 AFPZ116760.y1 ASWX37478.g2
ATGN339521.b1 ATGN140388.g1
ATUP248884.g1 ATWW121655.g1 AFSA777349.g2
ATGN262628.g1 ATUP212374.g1
ATUP184988.b1 ASFW193353.b3 AFPZ750454.y1
AFPZ323837.y1
(1)
VLDRAAGVSAIATRRQLANIRDIASSNPDGAVDVVTLMRRITLEIGNRLFLGVNIEN
(1)
AGTTCTTGACAGAGCAGCCGGCGTATCCGCCATTGCTACCAGACGACAACTGGCTAACATCCGGG
ACATTGCGTCGAGTAACCCGGATGGAGCAGTGGATGTCGTCACACTAATGCGCAGAATCA
CGCTGGAAATCGGAAACCGGCTATTTCTGGGTGTCAACATAGAAAATGGT
ASFW152099.b2
ATGN140388.g1 ATGN163395.g1 ATGN123302.b1
ATUP248884.g1
AFPZ895309.y1
AFSA777349.g2 ASFW152099.b2 AFPZ940540.x1
AFPZ323837.y1 ATUP212374.g1
ASFW193353.b3 AFSA522025.g2 AFPZ686853.g2
(1) DLEVVNTINGYFAAWEFFMIRPKVLQLIYPTLYRKHQTAV (2)
AGATCTGGAGGTGGTGAACACAATCAATGGATATTTCGCTGCCTGGGAGTTCTTCATGATAC
GACCCAAGGTGCTGCAGTTGATTTATCCTACCCTGTACAGAAAACACCAGACAGCAGTGT
APNK10005.g2 APNK8873.g2 ASFW152099.b2 AFSA830540.b2 AFSA765117.b2
ATGN123302.b1 ATGI177758.b1 ATUP570910.g1
AFPZ686853.g2 ATWW67966.g1
ATGN163395.g1
(2)
RALQDVVGKLVDKKRAVMNGDEAEEEFSIPKGEHDFAAALIQAQ (0)
AGGAGGGCTTTGCAAGACGTGGTGGGGAAGCTGGTGGACAAGAAGAGGGCCGTCATGAATGGAGAC
GAGGCCGAGGAAGAATTCAGCATCCCAAAAGGCGAACACGATTTCGCAGCTGCACTCATC
CAGGCGCAGGT
>ATUP846622.y1 aa 305-358
ATUP846622.y1 ATGI19659.g1 ATWW169596.g1
AFPZ766687.x1
AFPZ938114.x1
AFSA830540.b2 ATUP215781.y2 ATWW67966.g1
ATUP163149.x1
AFSA496531.g2 APWS76397.b1 ASWX46173.b2
(0) EFGQVSASCVRQCVTEMLLAGPDTMSVHIYFILLHIAEHGLENGILREIREVL
(1)
AGGAATTTGGCCAGGTGTCAGCCTCCTGTGTTCGGCAGTGCGTGACAGAAATGCTGCTTG
CCGGTCCGGACACCATGTCCGTCCACATCTACTTCATCCTCCTGCACATAGCCGAGCATG
GTCTAGAGAACGGGATACTTAGGGAAATCAGGGAAGTCTTGGGT
>aa
359-440
AFPZ750454.x3 ASWX46173.b2 ATUP637968.y1
ATUP846622.y1 ATGI38272.g1
ATGI156524.b1 ATWW50911.b1 ATUP151917.g1 ASWX78442.g2
(1)
GDRDPTRDDLSKMVFLDHVIN ESMRARPVVTFVMRHAEEEDHVDGYVIPKGTNVIINLVAVHQDPRHFP
EPETFDPDHFKEK (0)
AGGGGACCGAGATCCCACGAGAGATGATCTTAGCAAGATGG
TGTTCCTCGATCACGTGATCAACGAGAGTATGCGCGCAAGGCCAGTGGTCACTTTCGTCA
TGCGCCATGCTGAAGAGGAAGACCACGTGGACGGTTACGTCATACCAAAGGGGACCAACG
TGATCATCAACTTGGTTGCCGTGCACCAAGACCCTCGTCACTTTCCCGAGCCTGAAACGT
TCGATCCAGATCACTTCAAAGAAAAGGT
>ASWX78442.g2
AFPZ597386.y1 ASWX78442.g2 ATGN272728.g1
ATGN236719.g1 AFPZ323837.x1
AFPZ213319.x1 ATUP570910.b1 AFSA770314.g2
ATUP554961.x1
AFSA555906.g2
AFPZ556237.x1 ATWW50911.b1 AFPZ791616.y1
(0)
VPSTQFMPFGLGVRSCVGRTIAPL
QMKAVLITLLRMYQLSPSRDHQSLEVSRNLSEHPTEPGSMFLYPRLETI*
AGGTACCCTCTACCCAGTTCATGCCGTT
TGGCCTCGGCGTTCGCTCCTGTGTGGGACGAACCATCGCACCTCTTCAGATGAAGGCTGT
CCTCATCACGCTACTGCGCATGTACCAACTGAGCCCGTCACGTGATCATCAGAGCCTCGA
GGTGAN
>CYP19
amphioxus 37% to CYP19 zebrafish ovarian, 38% to brain form
two
possible start METs
MLQFLVIESRGSFPLNRSRTRHGITSQIEADGCS
MDTGEGWDVLLVVLLVVLVWYYIRETWTSGIDGIFPP (1)
(1)
GPPYIPLLTPLWTLWVFLHDGIWAATAGYAAKYGDFVRVWLGTEQTFIISR (2)
(2)
ASAAAHVLKSSKYRARFGDPSGLAQIGMNGSGVIFNNDVQSWKFLRFFFVK (1)
(1)
VLDRAAGVSAIATRRQLANIRDIASSNPDGAVDVVTLMRRITLEIGNRLFLGVNIEN
(1)
(1) DLEVVNTINGYFAAWEFFMIRPKVLQLIYPTLYRKHQTAV (2)
(2)
RALQDVVGKLVDKKRAVMNGDEAEEEFSIPKGEHDFAAALIQAQ (0)
(0)
EFGQVSASCVRQCVTEMLLAGPDTMSVHIYFILLHIAEHGLENGILREIREVL
(1)
(1)
GDRDPTRDDLSKMVFLDHVIN ESMRARPVVTFVMRHAEEEDHVDGYVIPKG
TNVIINLVAVHQDPRHFP
EPETFDPDHFKEK (0)
(0)
VPSTQFMPFGLGVRSCVGRTIAPLQMKAVLITLLRMYQLSPSRDHQSLEVSRNLSEHPTEPGSMFLYPRLETI*
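In an assembly like the one above, the phase written at the end of one exon should equal the phase written at the start of the next. A minimal sketch of that consistency check, assuming each exon is reduced to a (start phase, end phase) pair with None at the initiator Met and at the stop codon (the function name is hypothetical):

```python
def boundary_mismatches(exon_phases):
    """Return (boundary number, end phase, next start phase) for each mismatch."""
    problems = []
    for i in range(len(exon_phases) - 1):
        end_phase = exon_phases[i][1]
        next_start = exon_phases[i + 1][0]
        if end_phase not in (0, 1, 2) or end_phase != next_start:
            problems.append((i + 1, end_phase, next_start))
    return problems


# Phases as written in the amphioxus CYP19 assembly above.
cyp19_phases = [(None, 1), (1, 2), (2, 1), (1, 1), (1, 2),
                (2, 0), (0, 1), (1, 0), (0, None)]
print(boundary_mismatches(cyp19_phases))   # [] means every boundary agrees
```

A non-empty result points at the boundary whose annotation needs another look against the trace sequence.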
>CYP19A1
Fugu ov Scaffold_7098 64% to LDZ38561.x1 CYP19 Length = 14029 53% to CYP19
=
LGS44549.x1 like ovary CYP19 P450s
9466 MAAVGLDAEVLVSVSPNATEAESPGSSAGTRALIILTCLLLLVWSHTEKKSVP
9308 (1)
9242
SLLGPSFCLGFGPLLTYVRFIWTGIGTASNYYNKKYGDIVRVWVNGEETLVISR 9081 (2)
8985 ASAVHHVLKSRQYTSRFGSKQGLSCIGMNERGIIFNNNVTEWRKIRGYFTK 8830 (1)
8759
ALTGPAVQNTVEVCNSSTQAHLDRLEDLAQVDVLSLLRCTVVDISNRLFLDIPIN 8595 (1)
8499
EKELLLKIHKYFDTWQTVLIK----PDIYFKFGWIHQKHKTAA 8392 (2)
8296
RELQEAIEGLVEQKRRDLEQADKLENINFTAELLFAQ 8186 (0)
8084
NHGELSAENVMQCVLEMVIAAPDTLSVSLFFMLLLLKQNPDVELQLLQEIDAVVGK
(0 expected, bad boundary)
RQLQNGDLQKLRVLETFINECLRFHPV 7719
7718
VDFTMRRSLSDDVIEGYRVPKGTNIILNTGHMHRTEFFLRPTEFCLQNFEKN
7563 (0)
APRRYFQPFGSGPRACVGKHIAMVMMKSILVTLLSQYSVCPHEGLT 7327
7326
LDCLPQTNNLSQQPVEHQEEAQQLSMRFLPRQRGSWQTV* 7207
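The scaffold coordinates beside each exon allow a quick sanity check: the span in nucleotides should equal three times the number of residues listed (counting the stop codon). A minimal sketch, assuming the coordinates are inclusive and may run in either direction, since this gene is on the reverse strand (the function name is hypothetical):

```python
def exon_span_check(start, end, peptide):
    """Return (span in nt, 3 x residues, difference) for one annotated exon."""
    span = abs(end - start) + 1                  # inclusive, either strand
    residues = len(peptide.replace("-", ""))     # ignore alignment gap marks
    return span, 3 * residues, span - 3 * residues


# First two exons of the Fugu CYP19A1 annotation above.
print(exon_span_check(9466, 9308,
      "MAAVGLDAEVLVSVSPNATEAESPGSSAGTRALIILTCLLLLVWSHTEKKSVP"))
print(exon_span_check(9242, 9081,
      "SLLGPSFCLGFGPLLTYVRFIWTGIGTASNYYNKKYGDIVRVWVNGEETLVISR"))
```

A difference of 1 or 2 can simply reflect a split codon at a phase 1 or 2 boundary; a difference of 3 or more suggests a mistyped coordinate or a missing residue.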
>gi|47847288|dbj|AB178482.1|
Rana rugosa mRNA for P450 aromatase, complete cds
Length =
1726
Frog
Query:
9
VLLVVLLVVLVWYYIRETWTSGIDGIFPPGPPYIPLLTPLWTLWVFLHDGIWAATAGYAA 68
V+ V L++++W Y TS I PGP Y L PL T FL GI
+A+ Y +
Sbjct: 105
VVAFVFLLIIIWSYEE---TSSI-----PGPSYCLGLGPLITYGRFLWTGIGSASNYYNS 260
Query:
69 KYGDFVRVWLGTEQTFIIS 87
YG+FVRVW+ E+T IIS
Sbjct: 261
MYGEFVRVWINGEETLIIS 317
MILEALNTMQYNITEAMPSLAPATAASVVAFVFLLIIIWSYEETSSIP
GPSYCLGLGPLITYGRFLWTGIGSASNYYNSMYGEFVRVWINGEETLIISSSSA
TCHVMKHGHYVSRFGSKLGLQCIGMNENGIIFNSNPSLWKVIRPFFNRALSGPGLIQT
TEHSMKSTKRFLAKLSDVTDQVGNVNVLKLMRLIMVDTSNNLFLRIPTDENEIVLQIQ
KYFDAWETLILKPDIFFKFSWLYKKYEKSVNDLKKAVEILIEQKRQELSASDKLDEHL
DFASELIFAQNHGVLTAENVNQSIVEMLIAAPDTMSVSLYFILTLIAQHPKAEKMILD
EIHAVVGDREVQSSDMPNLKVLENFIYESMRYQPVVDVVMRKALEDDVIDGYYVKKGT
NIILNIGRMHKVEFFPKPNEFSLENFEKTVPQRYFQPFGFGPRACAGKYIAMVMMKAI
LVTLLKRYKVQTLQGRCLENIHNNNNLSTYPDESQSSLEMAFISLHTAPLAH
gi|58384757|gb|AY859423.1|
Mugil cephalus aromatase cytochrome P450 brain isoform (Cyp19b)
mRNA,
complete cds
Length =
2313
Score = 53.1 bits (126), Expect = 3e-06
Identities = 30/88 (34%), Positives = 48/88 (54%)
Frame = +2
Query:
1
MDTGEGWDVLLVVLLVVLVWYYIRETWTSGIDGIFPPGPPYIPLLTPLWTLWVFLHDGIW 60
M G +V ++LL++L++ +
TW+ + P
GP ++ L P+ + F+ GI
Sbjct: 140
MTAGTASEVASLLLLLLLLFLLLVTTWSRTHRSLIP-GPYFLAGLGPILSYIRFMWSGIG 316
Query:
61 AATAGYAAKYGDFVRVWLGTEQTFIISR 88
A Y KYG VRVW+ E+T I+SR
Sbjct: 317
TACNYYNNKYGSIVRVWINGEETLILSR 400
MMLLLLEKLTMGPMTAGTASEVASLLLLLLLLFLLLVTTWSRTH
RSLIPGPYFLAGLGPILSYIRFMWSGIGTACNYYNNKYGSIVRVWINGEETLILSRSS
AVYHVLRSAHYTSRFGSKMGLECVGMEGKGIIFNNDVPLWKKVRAYFAKALTGPGLQR
TVGICVSSTAKRLDRLQDVTDSSGHVDVLNLLRAIVVDISNRMFLRVPLNEKDLLMKI
QNYFETWQTVLIKPDIFFKMGWLYNKHKRAGKELQDAMDALLDIKRKIINETEKLDED
FDFATELIFAQNHGELSADNVRQCVLEMVIAAPDTLSISLFFMLMLLKQNPDVEMQIV
EEMNTILSERDVQNLDYQGLKVLESFINESLRFHPVVDFTMRKALEDDNIDGTAIRKG
TNIILNIGLMHKTEFFPKPKEFSLMNFDRTVPSRFFQPFGCGPRSCVGKHIAMVMMKA
ILVTLLSRYTVCPRQGCTLNSIKQTNNLSQQPVEDEHSLAMRFIPRAAQPQLKPL
>gi|44886089|dbj|AB164064.1|
Cynops pyrrhogaster P450 arom mRNA for cytochrome P450 aromatase,
complete cds
Length =
3176
Score = 54.7 bits (130), Expect = 1e-06
Identities = 35/80 (43%), Positives = 44/80 (55%)
Frame = +2
Query:
9
VLLVVLLVVLVWYYIRETWTSGIDGIFPPGPPYIPLLTPLWTLWVFLHDGIWAATAGYAA 68
+LLV ++LVW Y TS I PGP Y L P+ + FL GI +A Y
Sbjct: 320
LLLVSCFLLLVWRYEE---TSSI-----PGPGYCMGLGPVLSYCRFLRTGIGSAANYYNN 475
Query:
69 KYGDFVRVWLGTEQTFIISR 88
YGDFVRVW+ E+T IIS+
Sbjct: 476
LYGDFVRVWINGEETLIISK 535
MLLETLNPMYYNISHVVPEVSPTATVSLLLVSCFLLLVWRYEET
SSIPGPGYCMGLGPVLSYCRFLRTGIGSAANYYNNLYGDFVRVWINGEETLIISKSSA
TFHVMKHEHYTSRFGSRLGLQCVGMNENGIIFNSNPSLWKEIRPYFSKALSGPGLVQT
TDMCIKSTLTYLSRLKEVTTENGNVNVLTLMRLIMLDTSNNLFLRIPLDESEIVLKIQ
KYFDAWQALLLKPDIFFKISWMYYKYEKSAKDLKEAIEKLIEKKRKKLSTVERLEENM
DFASELIFAQNRGDLSADNVNQCILEMLIAAPDTLSVTLYFMLMLIAQHPRVEAKIME
EIKAVIGDREVRSTDMQNLKVVESFICESMRYQPVVGLVMRKALADDVIDGYYVKKGT
NIILNLGRMHRVEYFQKPNEFTLENFQKNVPYRYFQPFGFGPRACAGKYIAMVMMKAI
LVTLLKRYSVQPIMGRCLENIQNNNDLSVHPDETQSSLEMVFLPRNGTI
>gi|40021578|gb|AY489060.1|
Halichoeres tenuispinis
protogynous
Wrasse
MLMDVSSEVTVFLLLMVLLLLFTSWSRTQKQIPGPPFLAGLGPL
LTYSRFIWTGIGTACNYYNNKYGSIVRVWINSEETLILSRSSAVYHVLRSAHYTARFG
STTGLECIGMEGKGIIFNSDVQLWRKVRTYFSKALTGPGLQRTVGICVSSTAKHLERL
KEMTDPSGHVDALNLLRAIVVDISNKLFLRVPINEKDLLMKIQSYFETWQTVLIKPDI
FFKIGWLYNKHKKAAQELQDVMESLLVTKRKMIKESEKLDDDLDFATELIFAESHGEL
SADNVRQCVLEVWRSQLQYTLSISLFFMLMLLKQNPDVELRIVEEMNTVLREKGDGNL
DYQSLNVLESFINESLRFHPVVDFTMRKALEDDNIEGIKIAKGTNIILNIGLMHKTEF
FPHPTEFSLTNFDKTVPSRFFQPFGCGPRSCVGKHIAMVMMKAILVALLSRYTVCPRQ
GCTINSIRQTNDLSQQPVEDEHSLAMRFIPRATQPPLSHIFSQEM
Gen Comp Endocrinol. 1984 Oct;56(1):53-58.
In vitro conversion of androgen to estrogen in amphioxus gonadal tissues.
Callard GV, Pudney JA, Kendall SL, Reinboth R.
The ability to convert
androgen to estrogen (aromatization) is a constant feature of gonadal and
neural tissues in all major vertebrate groups. In experiments reported here,
the existence of this pathway was investigated in the protochordate amphioxus
(Branchiostoma lanceolatum). Following incubation with
[3H]19-hydroxyandrostenedione, gonadal homogenates contained authentic estrone
and estradiol-17 beta, as determined by derivative formation and
recrystallization to constant specific activity. Cephalic ("brain")
and other segments were aromatase negative. The results indicate that a potential for estrogen
biosynthesis in the gonads predates that in other tissues and arises prior to
the evolution of true vertebrates.
danio
aromatase mRNA
atg
gcaggtgatc tgctccagcc ctgtggaatg aagccggtgc gtctcggcga
61
ggctgtggtg gatcttctta tccaaagggc tcataacggc actgaaaggg ctcaggacaa
121 tgcgtgtgga
gctacagcca caatactgct gctgctactc tgcctgctgc tagccatcag
181 acaccatcga
ccacacaaat cacacattcc aggtccttct ttcttttttg gtctgggtcc
241 tattgtctcc
tactgtcggt tcatctggtc tgggatcggg actgccagca actactacaa
301 cagcaagtat
ggagacattg tgcgtgtctg gatcaatggt gaggaaactc tcatcttgaa
361 caggtcgtca
gctgtatatc acgtgttaag gaagtctttg tacacttcac gctttggaag
421 taaactgggt
ctgcagtgca tcgggatgca tgagcagggc atcatattca actcaaatgt
481 ggctctctgg
aagaaagtcc gtgcatttta tgctaaagct ctcacaggtc cagggcttca
541 gaggactatg gagatctgca
ccacctccac aaactctcac ctggacgatt tgtctcagct
601 gacggatgct
caaggacagc tggacattct taacttactg cggtgcatcg tggtggacgt
661 ttccaacaga
ctgtttctag gagtcccgct caatgagcac gatctgcttc agaagattca
721 taaatacttt
gacacctggc agactgtatt aatcaagcct gatgtctact tcagactgga
781 ctggctgcac
aagaagcaca agagagatgc tcaggagttg caggatgcca tcacagctct
841 gatcgagcag
aagaaagttc aactggcaca cgcagagaaa cttgaccacc tcgactttac
901 agcagagctg
atatttgctc agagccatgg agagctgagc gcagagaacg tcaggcagtg
961 tgtgttggag atggtgatcg cggctccaga
cactctctcc atcagtctgt tcttcatgct
1021 gctgttatta
aaacaaaatc cagatgtcga gttaaagatc ctgcaggaaa tggacagtgt
1081 tttagctggc
cagagcctcc agcactcgca tctgtccaag ctgcagatcc tggagagttt
1141 tatcaacgag
tctctacgtt ttcacccggt cgtggacttc accatgcggc gggcgctgga
1201 tgatgatgtc
atcgagggat acaacgtgaa gaaaggaaca aacatcatac tgaatgtggg
1261 tcggatgcac
agatccgaat tcttctccaa acccaatcag ttcagtcttg acaacttcca
1321 gaaaaatgtt
ccgagtcgtt tcttccagcc gttcggatcg ggtcctcggt cgtgtgtggg
1381 gaagcacatt
gccatggtga tgatgaagtc tattctggtg gctctgctgt ctcgtttctc
1441 tgtgtgtcct
atgaaggcct gtacagtaga aaacatcccg caaaccaaca acctgtcaca
1501 gcagccggtg
gaggagccgt ccagcctcag cgtgcagctt atcctcagaa acactctc
Xenopus CYP19
mRNA
atggaagcc
ttgaatccag tgcagtataa
61
catcacagaa gctgttccca ctctggcacc tgccactact ctttctctgc tgctcttcat
121 ttttgtgctc
atcattctat ggaatcaaga ggagacatct ctgataccag gcccagctta
181 ttgcatggga
ctcgggcccc tcatttctta tggccgtttt ctactgacag gaattggcaa
241 agcagcaaat
tactacaaca acatgtatgg agaatttgtg agagtctgga ttaatggcga
301 ggaaacactg
attatcagca aatcttcagc aacatttcac atcatgaaac acagccatta
361 tgtctcacgc
tttggaagca agctagggct acagtgcatt ggcatgaatg aaaatggcat
421 catattcaat
agcaacccat ctctatggaa ggtcattcgg ccatacttca tcagagcttt
481 gtctggtcca
ggacttatgc aaacaacaga aaactgtata agatctacaa atcactacct
541 ggataacctg
agtaatgtta caaatgaact gggaaatgta gatgtcctta agctaatgag
601 gcttattatg
ttagatacat caaacaatct cttcctaagg atacccttag atgaaagtga
661 aattgttctt
aagatccaga aatactttga cgcctggcag gctctgcttc tgaaaccaga
721 catcttcttt
aaaatttcct ggctgtacaa gaaatatgaa aaatcagcaa atgatctgaa
781 ggaagctatt
gaacttctca ttgaacagaa aagacagaaa ctctcaagtt ctgagaagct
841 ggatgaggat
atggattttt catcagaact catatttgcc cagaatcatg gagatctaac
901 agctgagaat
gtcaatcagt gtattctgga aatgctaata gctgctcccg ataccatgtc
961 tgtatctctc
ttcttcatgt tagttctgat tgctcaacac ccaaagatag aagaaggaat
1021 aatgaatgaa atggataaag
ttattggtaa ccgggatgta gagagcaatg acattccaaa
1081 tcttaaaatt
ctggagagct ttatttatga aagcatgagg taccaaccag tggtagacct
1141 ggttatgcgc
aaggctctgg aggatgatat cattgatggt tactatgtga agaaaggcac
1201 taacatcatt
ttaaacttgg ggcgcatgca caaaattgta tactttccaa aaccgaacga
1261 gttcaccttg
gaaaattttg aaaagacggt tccatatcgt tacttccagc cctttggctc
1321 tggtccacgt
gcatgtgccg ggaagtacat agccatggtg atgatgaaag tcattctggt
1381 tactcttctc
aagaggtaca aagtgcagac attgagagga agatgcctgg agaatatcca
1441 aaataacaat gatttgtcca tgcaccctga tgaaagtcaa
ccttccttag agatgatctt
1501 cattcctaaa
aacacagcag agttcaaact g
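The two mRNAs above are pasted in GenBank layout, with position numbers and 10-base blocks, so they need cleaning before reuse as a search query. A minimal sketch that strips everything except letters and rewraps the result as FASTA (both function names and the record name are hypothetical; only the sequence lines should be passed in, not the title line):

```python
import re


def clean_sequence(text):
    """Remove position numbers, spaces and line breaks from pasted sequence."""
    return re.sub(r"[^A-Za-z]", "", text).lower()


def to_fasta(name, seq, width=60):
    """Format a cleaned sequence as a FASTA record with fixed line width."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# First lines of the danio aromatase mRNA as pasted above.
pasted = """atg
gcaggtgatc tgctccagcc ctgtggaatg aagccggtgc gtctcggcga
61
ggctgtggtg gatcttctta tccaaagggc tcataacggc actgaaaggg ctcaggacaa"""
print(to_fasta("danio_cyp19_fragment", clean_sequence(pasted)))
```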
CYP20 (no hits with megablast using Danio CYP20 mRNA)
>ATGN150171.g1
aa 1-24
also
ATGN245625.b1 ATGN288660.b4
640
MLDYAIFAITFVVFLIATVLYLYP (0) 711
ASFW100504.g2
mate pair = C-term
165
MLDYAIFVITFVVFLIATVLYLYP (0) 94
>ATGN288660.b4
exon 2
GANKITTIPGLEPSDPK
(2)
>ATUP402034.g1
aa 41-90
DGNLGDVGRAGSLHEFLLKLHTEYGDIASFWWGQQLVVSLGAPELWKQHERIFDRP
(1)
walked
upstream of ATGN370239.b1
ASWX106517.g2 ATUP286141.x2 ATWW97828.g1
AFPZ484891.b2
AFPZ43753.b2 APWS171586.b1 ATUP727787.y1 ATUP750333.x1
AFSA945609.b2 AFSA313913.g2 ATUP328643.y1
AFPZ512582.y1
AFPZ529103.y1
poor match
on ATUP750333.x1
(1) ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE
(0)
>ATGN370239.b1
aa 143-202
518
LGQEMAKKWESMKGDQHIPLHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDIV 688
ATGI38129.b1
AFSA140408.g2 mate pair = C-term
ILQLGQEMAKKWETMEGDQHIPLHAHMIALAMKAITRSSFGDSFKDEKECVQFGRNDDIV
ATWW28266.b2
mate pair = C-term
221
VLQLGQEMASKWESTKGDQHIPLHAHMMALALKTFTRSSFGDSFKDEKECVQFGRNY 391
>ATGI38129.b1
(0)
CWNDMEERIKGSHPTEGSPREKKFKE (1)
AGTGTTGGAATGATATGGAAGAGAGGATCAAGGGAAGTCACCCC
ACGGAAGGAAGTCCCAGGGAGAAAAAGTTTAAAGAAGGT
>APWS103772.b1
(1)
ALGKLHATIARVAKYRRENPSPPQEQLFIDVLIEGNLPEEQ (0)
AGCACTGGGAAAGTTACACGCTACTATTGCACGGGTGGCAAAGTA
CCGTCGAGAGAACCCTTCCCCACCCCAGGAGCAACTCTTCATCGATGTGCTCATTGAGGG
GAATCTGCCTGAGGAGCAGGT
>ATGN158528.b1
(0)
VLCDAMTFTVGGIHTSGN (1)
AGGTCTTGTGTGATGCTATGACATTTACGGTTGGGGGAATCCATACTTCTGGAAACTGT
>ATGN168444.b1
I-helix region, mate pair = C-term
AFPZ734943.y1 ATUP79075.y1 ATGN168444.b1 ATWW144961.g1
AFSA813644.b2 AFPZ813089.b2 AFPZ604816.b2
(1) LLTWALYYIATHEEVEEKLHQELSDVLGKKGEVTPDNISQLV
(2)
AGTGCTGACATGGGCCCTGTACTACATCGCCACTCATGAGGAGGTAGAAGAG
AAACTGCACCAGGAACTGAGCGATGTCTTGNGGAAGAAAGGAGAAGTCACCCCTGACAAC
ATCTCACAACTAGTGT
>AFPZ604816.b2
AFPZ910670.x1 AFPZ33570.g2 APNK104245.g1
ATGN336865.g1 ATGN336865.b1
ATGN326825.b1 ATGN165528.g1 ATGI17171.b1
ATUP478376.x1 ATUP166894.x1
AFPZ798704.x1 AFPZ868290.x1 ASFW78821.b2
AFSA582650.b2 AFPZ450600.b2
(2) YLRQVLDESLRCAVIAPWGARYMDLDAEVGGHIVPAK (0)
AGGTACCTACG
ACAGGTTCTTGACGAGTCGTTGCGCTGTGCCGTGATCGCTCCATGGGGCGCACGTTACAT
GGACCTGGACGCTGAAGTAGGAGGCCACATTGTGCCAGCCAAGGT
>AFSA912322.b2
772 (0)
QTPVIHAFGVVLQDERIWPEPNK (2) 834
AGACCCCAGTTATTCATGCTTTTGGAGTTGTCCTCCAAGATGAGAGGATTTGGCCAGAGCCAAACAAGT
GAATTTTAATTTTCAACATTTTGGGGCTCTTAGTGTTAAAATATAGAAACTCCAGAAAA
>ATUP839681.y1
(2)
FDPDRFDAENSKGRHKLAFQPFGFAGGRKCP (1)
AGGTTTGATCCAGATAGGTTTGATGCAGAG
AACAGTAAGGGTCGTCACAAGTTGGCATTCCAGCCATTTGGGTTTGCGGGGGGTCGCAAA
TGCCCAGGT
>ATGN302804.b1
aa 410-462
AFPZ226042.y3
AFPZ39024.g2 AFPZ33570.b2 ATGN302804.b1
ATWW28672.g1 AFPZ888298.y1 AFPZ888298.x1
AFSA693004.b2
2 nucl
diffs
AFPZ336078.y1 AFPZ535923.x1 AFPZ30312.b2
APWS33488.g1
ATGN214653.g1
ATGI38129.g1 ATWW28266.g2 ASFW145026.g2 ASFW100504.b2
AFSA939980.g2 AFSA140408.b2 AFPZ444424.x1
AFPZ593083.g2
3 nucl
diffs
ATGI33407.b1 APWS15046.g1 ATGN168444.g1 ATUP527043.g1 AFSA547804.g2
52 (1)
GYRFTYTWTSVFLSILCRQFKLHLVDGQVVKPCHGLVTRPVDEIWITVTKRD* 204
AGGTTCACCTATACGTGGACATCAGTGTTCCTGTCCATCCTGTGCCGACAGTTCAAGCTCCATCTGGTG
GACGGACAGGTGGTCAAGCCGTGCCACGGGCTCGTCACGCGCCCGGTCGACGAGATCTGG
ATTACGGTCACCAAGCGTGACTAA
>CYP20
amphioxus 39% to CYP20 Danio
MLDYAIFAITFVVFLIATVLYLYP
(0)
(0)
GANKITTIPGLEPSDPK (2)
(2)
DGNLGDVGRAGSLHEFLLKLHTEYGDIASFWWGQQLVVSLGAPELWKQH
ERIFDRP (1)
(1)
ALLFKGFEPLIGAKSIQYANSVDGRTRRKLYDPSYGHNAMKHYYSIFQE (0)
(0)
LGQEMAKKWESMKGDQHIPLHAHIIALAMKAITRSSFGDAFKDEKECVQFGRNYDI (0)
(0)
CWNDMEERIKGSHPTEGSPREKKFKE (1)
(1)
ALGKLHATIARVAKYRRENPSPPQEQLFIDVLIEGNLPEEQ (0)
(0)
VLCDAMTFTVGGIHTSGN (1)
(1) LLTWALYYIATHEEVEEKLHQELSDVLGKKGEVTPDNISQLV
(2)
(2)
YLRQVLDESLRCAVIAPWGARYMDLDAEVGGHIVPAK (0)
(0)
QTPVIHAFGVVLQDERIWPEPNK
(2)
FDPDRFDAENSKGRHKLAFQPFGFAGGRKCP (1)
(1)
GYRFTYTWTSVFLSILCRQFKLHLVDGQVVKPCHGLVTRPVDEIWITVTKRD*
>CYP20
Danio rerio (zebrafish) ctg10765 74% to fugu
9501 MLDFAIFAVTFVIILIGAVLYLYP (0) 9572
SSRRASGVPGLNPTEEK (2?)
DGNLQDIVNKGSLPEFLVGLH
DEFGSVASFWFGARPVVSLGAVNQLRQHINPNWT (1) 10024
12291 TDSFETMLKSLLGYQSGSGVGLTESMMRKKVYE-GAINKTLENNFPLLLQ
(0) 12439
12929
QVEELVDKWASYPKSQHTPLCAHFL 13003
13003
GLAMKAVTQLAMGSRFRDDAEVIRFRKNHEA (0) 13095
15738 IWSEIGKGYLDGSLEKSSSRKAHYES (1?) 15815
15897 ALAEMESVLKSVAKQRPGQGSSQSFVNYLLQANLTERQ (0)
16010
16583 VMEDGMVFTLAGCVITAN (1) 16636
17689 LCIWAVHFLSVSEAVQDRLYHELVEVLGDELVSLEKIPQLR
(2) 17811
19293
YCQQVLNETVRTAKLTPVAARLQEVEGKVDQHVIPKE (0) 19403
21269
TLVIYALGVVLQDADTWSLPYR
(2) 21334
21425 FNPDRFAEESVMKSFSLLGFSGSQACPELR (2) 21514
FAYTVATVLLSTLVRRLRMHRVDGQVVEARYELVTTPKDDTWITVSKRN*
>CD784670.1
CYP20 Rhipicephalus appendiculatus cDNA (a tick)
MLDFAIFAVCFVVFLLALVLYLYPSSAKQTTIPGLEPSDKKEG
NVGDIVQAGGLQNFLISLHKEHGPIASFWIGTKLVVSIGKADLFKTQSHVFDKPAELFVL
YRDVMGAGSIFFANGAEARKRRRLIDEVLTGKSLDKFLGPIEKLCSEVVMHLKDTPDDEH
VPVYQYMYALCMKISTRLLFGEYFFDDMEVLKFSRNFELCIKELEE
>CD295714.1
CYP20 Sea urchin larva cDNA Strongylocentrotus purpuratus
106
MLDFAIFAVTFIILLVGLLIYIYPTTPQKTTTVPGLEPSDPVKGNLDEIGDAGSLHQFLT 285
286
KLHAEHGDIASFYFTDQLCVSITSPELFKEHQAVFNRPALLFKLFEPLITPDSIQYANGG 465
466
DGRKRRDLTDRCFGFQALQNFIGVFNKNHRALVKK 570
>DN668857.1 CYP20 Gasterosteus aculeatus cDNA 77%
to Danio
Conner
Creek sticklebacks
MLDFAIFAVTFVVILVGAVLYLYPSSRRASGIPGLNPTDEKDGNLQDIVDRGSLHE
FLVSLHREFGSVASFWFGGRPVVSLGSVHLLRQHINPNHTTDSFETMLKSLLGYQSAMGG
GAAETVIRKKVYENAINNTLKSNFPLVLKLVEELVGKWKSFPASQHTPLCAHLLGLAMKT
VTQLALGESFGDDAKVLSFRKNHDA
IWSEIGKGYLDGSLEKSSSRKGDYEK
ALSEMESMLLSVXEGKKAQKKQT
FVDALLQFSLTERQ
VMEDCMVFTLAGCVITAN
WGIWAVHFL
CYP21 like
(looking, searched with heme signature exon, I-helix exon and EXXR exon from
zebrafish, no hits found)
>CYP21
AL953915.4 Zebrafish exon 7 boundaries not certain, no good EST or mRNA in
vertebrata. N-term not clear
MCFSVVSVVLLLFILWMLVVKFWRQSHRRTDG (0)
4754
IVILICVSFYCPIAVFPKLLHSLYKLFFSTVSPTISGPRSL PLLGNMLDLAQDHLPIHLTALA 4942
4943
KCYGNIYRLNCGSTS
5738
AMVVLNNSEIIREALVKKWSDFAGRPYSYTG
5918
XDIVSGGGRTISLGDFSEEWKAHRRVTHSALQRCTTDSLHSVIEKQAQHLCQ 6070
7187
VLRDYSGKAVDLSEDFTVASSNVITTLTFSKA 7282
7369
YDKSSAELQKLQECLNEIVSLWGSPWISALDSFPLLR
KFPNPPFSRLMKEVARRDELIGKHIEEFK 7665 (0)
KSEHKEGGTLTSSLLKCLEPQQGAANHT (0)
8868
TLTDTHVHMTTVDLLIGGTETIAALLNWTVAFLLHRPE 8981
14582
VQDKVYEELCCVLDVRYPQYSDRHKLPYLCALISEMLRLRPVAPLAVPHRAIRNS 14746
16875
SIAGHFIPKNTIIIPNLYGAHHDPEVWDDPYSFKP 16979
17077 ERFLEGGGGSLRSLIPFGGGARLCLGEAVAKMEMFLFTAYLLREFKFLPASKEEPLP
17247
17248
ELRGVASVVLKVKPYTVIAHPREQ* 17322
CYP24 like
(looking, cannot find it with megablast of N-term part of
zebrafish CYP24 EST seq CN507760, or whole Tetraodon mRNA)
>CYP24
zebrafish ctg12249 CN507760.1 69% to human CYP24 except N-term 76% to fugu
CYP24
MRAHLQRAPQILELLKKKTAGLQHCKPTSSVCVLDSKDAAGSAPCAHS
90158
LDSIPGPTNWPLFGSLIEVIRNGGLKRQHETL 90253
91424
IHFHKKFGKIFRMKLGSFESVHIGSPCLLEALYRKEGSYPERLEIKPWKAYRDMRDEAYGLLIL 91615
91705
EGRDWQRVRSAFQQKLMKPTEVMKLDGKI 91791
SEVAADLIKRIGKVNGKMDDLYFELNKWSFET
92515
ICYVIYDKRFGLLQDSVSKEGMDFITAVKT
MMSTFGTMMVTPVELHKTLNTKTWKDHTEAWDRIFST 92850
95260
AKHYIDKNLQKQSNGEADDFLSDIFHNGNLTKKELYAATTELQVGGVET 95406
TANSMLWVIFNLSRNPCAQGKLLKEIQDVVPAGQTPRAEHIKNMPYLKACLKESMR
VSPSVPFTSRTLDKDTVLGDYTLPKG 99051
TVLMLNSQAIGVSEEYFDNGRQFRPERWLEEKSSINPFAH
VPFGVGKRMCIGRRLAELQIQLGLCWILRDYK
IVATDLEPVDSLHSGTLVPSRELPVAFVPR 101033
CYP26 like
(there appear to be only two genes, not three;
both are most like 26C)
>AFPZ916045.b2
exon 1 APWS115441.b1 APWS115441.b1 ATGN366174.g1
AFSA580900.g2
AFPZ435288.x1 ASWX124624.b2 AFPZ464480.y1 AFPZ610800.g2
MLAELLINAAVPLVLVWTLWTLWKHYSTQGDPACDLPLPKGSMGLPFIGETLAFVTQ (0)
>APWS143762.g1
exon 2 or 3
GADFSRSRHELYGDVYKTHILGRPTVRVRGADNVRKILHGENTLVT
TIWPYSIRAVLGTQNLGMSFGEEHRFRKRVVMKAFNQNAMESYLRSTQTVLRETVAQWCV
QPQPVVVYPASREMALKIAAASLIGVHTGQEDAQRVTVLFQNMIDNLFSLPVKIPFGGLLSK
>AFPZ682082.g2
ALRYRQIIDEWLEGHIKRKQRDIDNGDIGTDALSRLILAARDVGHDLNSQEIQDTA
VELLFAGHETTSSAATSLIMHLALQPQVVQKVQEDLEKHGLLQPDQPLSLEQVGRL
TYVGQVVKEVLRISPPIGGGFRKALKTFELD
>BI377228
Amphioxus 5-6 hrs cDNA 53% to 26C1 Fugu 44% to 26B1 fugu
AFPZ57964.g2
GRLTYVGQVVKEVLRISPPVGGGFRKALKTFELD
()
GFQVPAGWTVTYSIRDTHGSVGNVSSPDQFDPDRWAADSDGSRRGRHHYIPFGAGPRACAG
KEFAKLQLKLLCVELVRSCRWELADGKVPAMTAIPVPRPVNGLPVQFTP
CEPITNNTLSDATEQNTNLSVAQQN*
>46% to
Xenopus CYP26, 47% to 26C1 hum, 43% to 26A1 hum 42% to 26B1 hum
65% to the
second Amphioxus seq
MLAELLINAAVPLVLVWTLWTLWKHYSTQGDPACDLPLPKGSMGLPFIGETLAFVTQ (0)
GADFSRSRHELYGDVYKTHILGRPTVRVRGADNVRKILHGENTLVT
TIWPYSIRAVLGTQNLGMSFGEEHRFRKRVVMKAFNQNAMESYLRSTQTVLRETVAQWCV
QPQPVVVYPASREMALKIAAASLIGVHTGQEDAQRVTVLFQNMIDNLFSLPVKIPFGGLLSK
(0)
ALRYRQIIDEWLEGHIKRKQRDIDNGDIGTDALSRLILAARDVGHDLNSQEIQDTA
VELLFAGHETTSSAATSLIMHLALQPQVVQKVQEDLEKHGLLQPDQPLSLEQVGRL
TYVGQVVKEVLRISPPIGGGFRKALKTFELD
(0)
GFQVPAGWTVTYSIRDTHGSVGNVSSPDQFDPDRWAADSDGSRRGRHHYIPFGAGPRACAG
KEFAKLQLKLLCVELVRSCRWELADGKVPAMTAIPVPRPVNGLPVQFTP
CEPITNNTLSDATEQNTNLSVAQQN*
>CF919306
Amphioxus 26 hrs cDNA 61% to BI377228 52% to 26C1 fugu
ASWX177691.b2
(exon 4) APWS171289.g1 (exon 2)
46% to
26C1 43% to 26B1 44% to 26A1
GGKFSSSRHAHYGDVFKTHILGRPTIRVRGATNVRKILLGENHIVTSLWPQTFRTVLGT
GNLAMSNGEEHRLRRKVIMKAFNYEALERYVPIMQEILREAVQRWCGAPQPVTVWPMARE
MAFRVASAVLVGFQHSDEEIQHLTSLFTNMVKNLFSLPVKLPGSGLSN
(0)
GLFYRQAIDEWMMNHIQRKKEFVLQGGDSGDVLSHIMNNAKDNGEKLSDQEIQDTVVELLFAGHET
TSSAATSLIMHLALQPQVVQKVQEDLEKHGLLQPDQPLSLEQVGRLTYVGQVVKEVLRRR
PPIGGGYRRALKSFDIG
(0)
GFHVPKGWAVLYSIRDTHEASQIFSSPELFDPDRWTPETSQAPLARYDMVTFGGGPRA
CVGKEFAKLLLKLLCVELTRRCRWKLADDKLPDMKLIPIVYPADGLPVIFTP
IGGKSPGDENKNGVPYEERTRGKDCPILCSVSFEKDINVAT*
>exon 2
APWS171289.g1 AFSA937430.b2 AFSA602616.g2
APNK6178.b2
ATUP304592.g1 ATGN98133.b1 AFSA785561.b2
AGGGAGGCAAATTCAGTTCCAGCAGACATGCGCACTACGGGGATGTCTTCAAGACCCACA
TCCTGGGCCGCCCGACCATCCGCGTTAGAGGGGCGACCAACGTGCGCAAGATCCTGCTGG
GAGAGAACCACATCGTCACCAGTCTGTGGCCGCAGACGTTCCGCACGGTTCTGGGGACCG
GGAACCTCGCCATGAGTAACGGCGAGGAGCACAGGCTGCGCAGAAAGGTCATCATGAAGG
CCTTCAACTACGAGGCGCTGGAGAGGTACGTCCCCATCATGCAGGAGATCCTGCGCGAGG
CTGTCCAGCGGTGGTGCGGGGCTCCCCAGCCGGTGACTGTCTGGCCCATGGCACGGGAGA
TGGCCTTCCGTGTGGCGTCAGCCGTCCTGGTGGGCTTCCAGCACAGCGATGAGGAGATCC
AACACCTCACCTCTCTCTTCACCAACATGGTCAAGAACCTCTTCTCTCTCCCAGTCAAGC
TACCCGGGAGTGGGCTCAGTAACGT
>exon 3
ATWX28853.g1 ATUP829694.b2 ATGI148270.g1 AFSA233238.b2
AFPZ187567.x1
ATUP946171.b1 ATUP911017.x1 ATUP863742.g1
ATGI148270.g1
ATWX28853.g1 APWS145712.b1
AFPZ130036.y1
AGGGGCTGTTTTATCGACAAGCCATCGATG
AGTGGATGATGAACCACATTCAGAGGAAGAAAGAGTTTGTGCTGCAGGGTGGCGACAGTG
GAGACGTCTTGTCGCACATCATGAACAATGCGAAGGACAACGGAGAGAAGTTGTCTGACC
AGGAGATCCAGGACACGGTGGTGGAGCTGCTGTTTGCCGGGCACGAGACCACGTCCAGCG
CCGCCACCTCCCTCATCATGCACCTGGCGCTGCAGCCACAGGTGGTTCAGAAGGTGCAGG
AGGACCTGGAGAAGCACGGGCTGCTGCAGCCGGACCAGCCTCTGAGTCTGGAGCAGGTCG
GCAGGCTGACGTACGTGGGGCAGGTCGTCAAGGAGGTGCTCAGGCGGCGCCCGCCCATTG
GAGGAGGCTACCGCAGAGCGCTCAAGTCTTTTGACATCGGCGT
>exon 4
ASWX177691.b2
ATUP915224.y1 ATUP215152.y2 AFSA763907.g2
AFSA344826.g2
AFPZ764896.b2 AFPZ69374.g2
AFPZ407263.b2 AFSA77547.g2
AGGGTTTCCATGTGCCCAAGGGATG
GGCGGTACTGTACAGCATCAGAGACACACACGAAGCCTCCCAAATCTTCTCCTCGCCGGA
GCTGTTCGACCCTGACCGATGGACCCCCGAGACATCCCAGGCGCCCCTGGCCCGGTACGA
TATGGTGACGTTCGGCGGGGGACCACGAGCCTGTGTCGGGAAGGAGTTTGCCAAGCTCCT
ACTGAAGCTTCTGTGTGTGGAGCTGACGAGAAGGTGCCGCTGGAAGCTGGCAGACGACAA
GCTACCGGACATGAAGCTCATTCCCATCGTGTATCCAGCCGACGGCTTACCTGTTATCTT
CACTCCCATTGGCGGAAAGTCACCTGGTGACGAAAACAAAAATGGCGTGCCGTATGAGGA
GAGGACAAGGGGCAAGGACTGTCCTATTCTCTGCTCGGTGTCGTTTGAAAAAGACATAAA
CGTCGCGACT
CYP27
(looking, no hits with megablast using human 27A1 or Xenopus 27A)
Endocrinology. 2003 Jun;144(6):2704-16.
Cloning of a functional vitamin D receptor from the lamprey (Petromyzon marinus),
an ancient vertebrate lacking a calcified skeleton and teeth.
Whitfield GK, Dang HT, Schluter SF, Bernstein RM, Bunag T, Manzon LA, Hsieh G,
Dominguez CE, Youson JH, Haussler MR, Marchalonis JJ.
Department of Biochemistry and Molecular Biophysics, College of Medicine,
University of Arizona, Tucson, Arizona 85724, USA. kerr@medbioc.arizona.edu
The nuclear vitamin D receptor (VDR) mediates the actions of its
1,25-dihydroxyvitamin D(3) ligand to control gene expression in terrestrial
vertebrates. Prominent functions of VDR-regulated genes are to promote
intestinal absorption of calcium and phosphate for bone mineralization and to
potentiate the hair cycle in mammals. We report the cloning of VDR from
Petromyzon marinus, an unexpected finding because lampreys lack mineralized
tissues and hair. Lamprey VDR (lampVDR) clones were obtained via RT-PCR from
larval protospleen tissue and skin and mouth of juveniles. LampVDR expressed in
transfected mammalian COS-7 cells bound 1,25-dihydroxyvitamin D(3) with high
affinity, and transactivated a reporter gene linked to a vitamin D-responsive
element from the human CYP3A4 gene, which encodes a P450 enzyme involved in
xenobiotic detoxification. In tests with other vitamin D responsive elements,
such as that from the rat osteocalcin gene, lampVDR showed little or no
activity. Phylogenetic comparisons with nuclear receptors from other
vertebrates revealed that lampVDR is a basal member of the VDR grouping, also
closely related to the pregnane X receptors and constitutive androstane
receptors. We propose that, in this evolutionarily ancient vertebrate, VDR may
function in part, like pregnane X receptors and constitutive androstane
receptors, to induce P450 enzymes for xenobiotic detoxification.
CYP11
amphioxus, most similar to CYP11A1 of vertebrates
Looking for N-term
These sequences are repeat sequences. There are too many of them to be a true gene.
>APWS65319.g1 (query)
Query: 698 ETISTSDTKWRPRIAKILKPAD*LEIPGEYVGQGAFFKFINV*TMSPERNTRYLXHIXCI 519
           +T ST D  WRPRIA ILKPAD*LEI GEY G   FFKFINV*TMSPER+TR L HI CI
Sbjct: 191 QTPSTGDPTWRPRIASILKPAD*LEITGEYAGLATFFKFINV*TMSPERHTRSLTHIHCI 370
Query: 518 LESFVVVCGVSGSFVVICGV*ADRLLT 438
           LESFVVVCG+S SFVV+CG+*ADRL T
Sbjct: 371 LESFVVVCGISESFVVVCGI*ADRLQT 451
>APWS65319.g1 (query)
ATGTCTCCAGAGCGCAACACAAGGTATTTGANCCATATATANTG
CATATTAGAGTCGTTTGTGGTCGTTTGTGGTGTTTCTGGGTCGTTTGTGGTCATTTGTGGTGT
APWS32978.b1 (Sbjct)
ATGTCTCCAGAGCGCCACACAAGGTCTTTGACCCATATACACTG
CATTTTAGAGTCGTTTGTGGTCGTTTGTGGTATTTCCGAGTCGTTTGTGGTCGTTTGTGGTAT
Walked up to ATGN68471.b1
ATGN295224.g1 ATGN68471.b1 ATUP813076.x1
(mate pair to LKIV exon)
ATUP179744.g1 ATGI135158.g1 (mate pair to heme sign. exon)
ATUP548358.y1 ATWW121053.b1 APNK7243.b2 AFSA920522.b2 AFSA567674.b2
AFSA527467.g2 ATUP163891.y1
MSQVPII
(0)
ATGAGTCAGGTGCCTATCATAGT
(0)
TYSTAAVGSTSHHDDDREAKPFSALPGPPSVPVLGNFLHMWWEGLLEKEKLNKNHIMFTDFFRQYGPIFR
(2)
AGACATACAGCACAGCTGCAGTCGGGTCTACCAGCCACCACG
ATGACGACAGGGAGGCCAAGCCTTTCTCTGCCCTGCCTGGTCCC
CCCTCTGTACCAGTCCTCGGCAACTTCCTGCACATGTGGTGGGAGGGACTCCTCGAGAAA
GAAAAGCTCAACAAAAATCATATCATGTTCACAGATTTCTTTCGTCAGTATGGTCCAATATTCAGGT
>ASFW5255.g2
ATUP770573.x1 ATUP502350.b1 ATUP22951.b1
ATUP261870.b1
ATUP813076.y1
ATUP163891.y1 ATGN295224.g1
2 nuc diffs but same aa seq
AFPZ508950.x1 ATGI69596.g1 APWS104982.b1
AFSA460369.b2 ATUP958061.g1
ATGN113210.g1 ATGI237486.g1 ASWX31370.b2
AFPZ863602.y1 AFSA738793.g2
(2)
LKIVNVDMVSIKDPVAVQELFRKEGKYPARIDIKPWRRYREISGKATGVFLS (2)
AGTGTTCCGGAGGAGGGAAGTACCGGCACGCATCGACATCAAACCCTG
GAGGAGGTACAGGGAAATCTCAGGCAAGGCCACTGGAGTGTTTCTCAGGT
>ATWX54621.b1
C-helix (mate pair to GRR exon)
(2) NGKDWQKNRSIMARPMLRPKHVSTYVSNLDTVSADMIKRLRVLQARADGIEV
PNISDELFKWALE (1)
AGTAATGGCAAAGACTGGCAGAAAAACAGGTCCATAATGGCGCGACCCATGC
TACGCCCCAAACATGTGTCTACGTACGTCAGCAACTTGGACACGGTGTCAGCCGACATGA
TCAAGCGACTGCGAGTACTCCAGGCAAGGGCCGATGGGATAGAAGTTCCAAACATATCAG
ATGAGCTGTTCAAATGGGCTCTAGAATGT
>ATWX25911.b1
SICTVLFNERMGYLQDNISQDAQDFIQGIHTIFLTTNTVIFPDADVHRFLRTKPWRQSVEAWDTVF
(1)
AGCCATCTGCACGGTCCTGTTCAATGAGCGGATGGGGTATCTACAG
GACAACATCTCCCAGGATGCTCAGGACTTCATCCAGGGTATCCACACCATCTTCCTCACA
ACCAACACCGTCATCTTCCCTGACGCGGATGTGCATCGTTTCCTGAGAACCAAACCGTGG
AGACAGTCTGTGGAGGCGTGGGACACGGTTTTCCGT
>AFPZ267404.y1
(mate pair to GRR exon)
(1)
REKVMVRKLQEALEREERGEGEDDQPNFLAFVNSTGRLTKDEIYSNTIELMGAAIDT (0)
AGGTGAGAAGGTGATG
GTCCGTAAGTTACAAGAAGCTCTGGAGCGGGAGGAGAGGGGGGAGGGGGAGGACGATCAA
CCCAACTTCCTGGCATTCGTCAACAGCACAGGGAGGCTGACCAAGGATGAGATTTACTCC
AACACCATTGAGTTGATGGGTGCTGCTATTGACACGGT
>ATGN150620.g1
55% to zebrafish 27A aa 281-336
AFSA195579.g2 AFPZ323214.y1 ATGN360541.b1
ATGN150620.g1 ATWW217237.g1
ATWW208446.g1 ATUP560326.g1 APNK7248.b2
AFSA699487.b2 AFPZ633876.y1
APWS109299.g1
ATWX69434.g1 ATUP305845.g1 ATGI148000.g1 AFSA903424.g2
AFPZ643989.g2
(0)
TSNTLLWTLYELSRRPELQDRLYQEVTQVIGQDKVMTWDHLKDLHLLKAIIKETLR (2) 885
AGACCTCCAACACCCTCCTGTGGACCCTGTACGAGCTGTCACGCAGACCTGAACTCCAGGACAGACTG
TATCAGGAGGTCACACAGGTCATAGGTCAGGACAAGGTCATGACCTGNGATCACCTGAAG
GACCTGCACCTCCTGAAGGCCATCATTAAGGAGACTCTGAGGT
>ATWX69434.g1
APNK115343.b1 ATWX69434.g1 ATUP302108.b1
ATGN360541.b1 ATGN112603.g1
AFSA814391.b2 AFSA350501.g3 ATUP613311.y1
ATWW217237.g1
(2)
MYPVVHNVSRLLQEDTVLMGYRLPAK (0)
AGGATGTATCCAGTTGTCCATAATGTCAGCCGTTTGCTGCAGGAGGACACAGTGCTCATGGGATATCGG
TTACCCGCAAAGGT
walked down to
AFPZ139562.y1 ATUP806443.x1 ATGN112603.g1
ASFW123561.b2 AFPZ913658.y1
51% to 27A
AFPZ913658.y1 ATUP205039.x2 ATWX47281.g1
APWS151338.g1 ATUP560326.b1
ATWW147637.g1 ASFW123561.g2 AFSA517290.b2
AFPZ633876.x1 ATGI135158.b1 APWS157725.b1 AFSA460369.g2
AFSA100624.g2 AFSA100624.b2 ATUP806443.x1
ASFW123561.b2
(1)
TCVVAQVYAMGRDPQLFPDPDEFKPERWLRTGEAHDEINPYSSLPFGFGPRSCL (1)
AGACCTGCGTGGTTGCCCAAGTGTACGCCATGGGGCGGGACCCC
CAGCTGTTTCCTGATCCAGACGAGTTTAAACCCGAGCGCTGGTTGAGAACAGGAGAGGCT
CACGATGAAATCAACCCGTACAGCTCTCTGCCATTTGGCTTCGGGCCACGCAGTTGTCTTGGT
>AFPZ633876.x1
ATWX54621.g1 ATGN45774.g1 ATGN92985.g1 ATWW147637.g1 ATUP440242.g1
ASFW123561.g2 ASFW5255.b2 AFPZ633876.x1 AFPZ267404.x1
ATUP305845.b3
ATUP305845.b1 ASFW110757.g2 AFPZ455460.x1
(1)
GRRVAEVELQLLLAK (0)
AGGTCGTCGTGTGGCAGAGGTCGAGCTGCAACTCCTTCTTGCAAAGGT
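The exon blocks above pair a peptide, flanked by phase numbers, with a nucleotide string. A quick way to check such a pair is to translate the nucleotide block in the stated frame. The sketch below is only an illustration and rests on two assumptions read off the notation, not stated in the notes: (a) each nucleotide block carries the intron's AG acceptor in front and GT donor behind, and (b) a leading phase of 1 or 2 means that many bases of the split codon lie in the upstream exon. The function name is hypothetical, and the split codon at each end is left out of the result.

```python
# Standard genetic code in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_exon(nuc, start_phase):
    """Translate the whole codons of one exon block.

    nuc         -- nucleotide block as written in the notes, assumed to
                   include the flanking AG (acceptor) and GT (donor)
    start_phase -- leading phase number of the exon (0, 1 or 2)
    """
    core = nuc.upper()[2:-2]          # drop the assumed AG ... GT flanks
    skip = (3 - start_phase) % 3      # bases that finish the split codon
    coding = core[skip:]
    usable = len(coding) - len(coding) % 3
    # Codons containing N or other ambiguity codes are reported as X.
    return "".join(
        CODON_TABLE.get(coding[i:i + 3], "X") for i in range(0, usable, 3)
    )


# The GRR exon above: phase 1 at the start, phase 0 at the end.
GRR_EXON = "AGGTCGTCGTGTGGCAGAGGTCGAGCTGCAACTCCTTCTTGCAAAGGT"
print(translate_exon(GRR_EXON, 1))
```

For the GRR exon this gives the listed peptide without its first residue, because the leading G codon is split across the intron and only two of its bases are in this block.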
>ATWX25911.g1 last exon of a CYP27 like gene
note there is an odd repeat of the first 10aa of this seq, up to 10X, in ATWX25911.g1
AFPZ79643.b2 ATGI71035.g1 ATWX25911.g1
ATGN112603.b1 (mate pair)
ATWW217237.b1
(mate pair) ATWW212578.g1 ATWW208446.b1 (mate pair) AFPZ944117.x1
AFSA285680.b2
ATGN209347.b1 AFPZ755865.b2
(0) MSQQFVLSQVEPEEISSVAQPLLMPETPLHLRFVDRK*
AGATGTCCCAGCAGTTTGTGCTGAGTC
AGGTGGAACCAGAAGAGATTTCCTCAGTAGCGCAGCCGTTACTGATGCCGGAGACACCCC
TGCACCTCCGGTTTGTGGACAGGAAGTAA
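Several exons above are linked through mate pairs, the two end reads of one clone. In these notes a mate pair shares the template name and differs only in the direction letter after the dot (b/g or x/y). The sketch below finds such pairs in a list of read names. The naming rule is inferred from the read names in this file rather than taken from Trace Archive documentation, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Read name = template, dot, direction letter, read number (e.g. ATGN112603.b1).
TRACE_RE = re.compile(r"^([A-Z]+\d+)\.([bgxy])(\d+)$")
PARTNER = {"b": "g", "g": "b", "x": "y", "y": "x"}


def find_mate_pairs(names):
    """Return the sorted templates for which both directions are present."""
    directions = defaultdict(set)
    for name in names:
        match = TRACE_RE.match(name)
        if match:                       # names in another format are ignored
            directions[match.group(1)].add(match.group(2))
    paired = []
    for template, seen in sorted(directions.items()):
        # Test only the b and x side so each template is reported once.
        if any(d in "bx" and PARTNER[d] in seen for d in seen):
            paired.append(template)
    return paired


# Reads listed for the last exon above plus reads from upstream exon lists.
last_exon_reads = ["AFPZ79643.b2", "ATGI71035.g1", "ATWX25911.g1",
                   "ATGN112603.b1", "ATWW217237.b1", "ATWW208446.b1"]
upstream_reads = ["ATGN112603.g1", "ATWW217237.g1", "ATWW208446.g1",
                  "ATWX25911.b1"]
print(find_mate_pairs(last_exon_reads + upstream_reads))
```

The read number after the direction letter is ignored on purpose, since re-sequenced reads of the same clone end can carry a different number.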
>Assembled CYP11 seq missing exon 1, cannot identify yet. This is a guess.
Green parts match other genes.
MSQVPII (0)
(0) TYSTAAVGSTSHHDDDREAKPFSALPGPPSVPVLGNFLHMWWEGLLEKEKLNKNHIMFTDFFRQYGPIFR (2)
(2) LKIVNVDMVSIKDPVAVQELFRKEGKYPARIDIKPWRRYREISGKATGVFLS (2)
(2) NGKDWQKNRSIMARPMLRPKHVSTYVSNLDTVSADMIKRLRVLQARADGIEVPNISDELFKWALE (1)
(1) SICTVLFNERMGYLQDNISQDAQDFIQGIHTIFLTTNTVIFPDADVHRFLRTKPWRQSVEAWDTVF (1)
(1) REKVMVRKLQEALEREERGEGEDDQPNFLAFVNSTGRLTKDEIYSNTIELMGAAIDT (0)
(0) TSNTLLWTLYELSRRPELQDRLYQEVTQVIGQDKVMTWDHLKDLHLLKAIIKETLR (2) 885
(2) MYPVVHNVSRLLQEDTVLMGYRLPAK (0)
(1) TCVVAQVYAMGRDPQLFPDPDEFKPERWLRTGEAHDEINPYSSLPFGFGPRSCL (1)
(1) GRRVAEVELQLLLAK (0)
(0) MSQQFVLSQVEPEEISSVAQPLLMPETPLHLRFVDRK*
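An assembly like the one above can be checked mechanically: the phase at the end of one exon segment should equal the phase at the start of the next. The sketch below parses segments written as "(phase) PEPTIDE (phase)", joins the peptides, and lists junctions where the phases disagree. Reading the numbers in parentheses as intron phases is an interpretation of the notation used here, and the function names are hypothetical.

```python
import re

# One segment: optional leading phase, peptide (may contain spaces or a
# terminal *), optional trailing phase.
EXON_RE = re.compile(r"^\s*(?:\((\d)\)\s*)?([A-Z*\s]+?)\s*(?:\((\d)\))?\s*$")


def parse_exon(text):
    """Return (start_phase, peptide, end_phase); a missing phase is None."""
    match = EXON_RE.match(text)
    if match is None:
        raise ValueError("not an exon segment: %r" % text)
    start, peptide, end = match.groups()
    peptide = "".join(peptide.split())   # drop spaces inside the peptide
    return (int(start) if start is not None else None,
            peptide,
            int(end) if end is not None else None)


def join_exons(lines):
    """Concatenate peptides and report junctions with unequal phases.

    Each mismatch is (index of upstream segment, its end phase,
    start phase of the next segment).
    """
    exons = [parse_exon(line) for line in lines]
    mismatches = []
    for i in range(len(exons) - 1):
        end, start = exons[i][2], exons[i + 1][0]
        if end is not None and start is not None and end != start:
            mismatches.append((i, end, start))
    return "".join(pep for _, pep, _ in exons), mismatches


# Three consecutive segments from the assembled CYP11 sequence above
# (the trailing "885" annotation is omitted).
CYP11_SEGMENTS = [
    "(0) TSNTLLWTLYELSRRPELQDRLYQEVTQVIGQDKVMTWDHLKDLHLLKAIIKETLR (2)",
    "(2) MYPVVHNVSRLLQEDTVLMGYRLPAK (0)",
    "(1) TCVVAQVYAMGRDPQLFPDPDEFKPERWLRTGEAHDEINPYSSLPFGFGPRSCL (1)",
]
protein, mismatches = join_exons(CYP11_SEGMENTS)
print(len(protein), mismatches)
```

On these three segments the check reports the junction between the MYPVVHN segment, which ends in phase 0, and the TCVVAQ segment, which starts in phase 1. A disagreement of this kind may point to a segment that has not been found yet, though the notes themselves do not say so.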
more CYP27 related sequences from amphioxus
Gene B, probable CYP24 sequence
TBLASTN search of trace files with CYP11 amphioxus (above) as query
>ATGI133701.b1 ATGI133701.g1 ATGI263107.g1
(mate pair to last exon)
ATUP25462.b1
mate pair
ATUP253168.g1
AFPZ178922.x01
AFPZ178922.x1 APWS49400.g1 ASFW134828.g2 mate pair
16 aa diffs to seq AFPZ69957.g2
MYQLLSAARHQGQSLFRVCRARSLAALKTTYRPQSNKAEESVTYDTAAR
PFEEIPGPKGLPLIGTALEYTPF(1)
>ATGN202956.g1
AFSA200913.g2 ATGN202956.g1 ATGN232406.b1
ATUP741276.x1
APWS88827.g1 ASWX68424.g2
(1) GQFKMITNLRESFRERTRTYGSIYRERIGPLDLVVISDPKEIEKVFR
NEGRYPERIELASIKVYREIKKLPTGLINL (2)
3aa diffs to above seqs
APNK105102.b1
ATUP936479.x1
ATGN122794.b1
ATWW20426.b1
ASFW96552.b2
AFSA664961.g2
GQFKMITNLRGSFRERTRTYGSIYRERIGPLDLVVISDPTEIEKVFRNEGRYPERIELASIKVYREIKKLPAGLINL
AGGTCAGTTTAAAATGATA
ACAAACCTGCGGGAATCCTTCAGGGAGAGAACGAGGACTTACGGCAGTATCTACCGGGAG
AGGATCGGTCCACTTGACCTGGTGGTCATCAGCGATCCGAAAGAGATAGAGAAGGTGTTC
CGCAACGAGGGGAGATACCCGGAGCGCATTGAGCTGGCGAGCATCAAAGTCTACCGGGAG
ATAAAGAAGTTGCCAACTGGATTGATCAACCTGT
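The "3aa diffs" note above can be reproduced by comparing the two GQFKMITNLR peptides position by position. The sketch below assumes the two peptides are the same length and already aligned without gaps, which holds for this pair; the function name is hypothetical.

```python
def aa_diffs(seq_a, seq_b):
    """List (1-based position, residue in seq_a, residue in seq_b) for
    every mismatch between two equal-length, ungapped peptides."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [(pos, a, b)
            for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
            if a != b]


# Exon 2 peptide of Gene B and the variant listed with "3aa diffs".
GENE_B_EXON2 = ("GQFKMITNLRESFRERTRTYGSIYRERIGPLDLVVISDPKEIEKVFR"
                "NEGRYPERIELASIKVYREIKKLPTGLINL")
VARIANT_EXON2 = ("GQFKMITNLRGSFRERTRTYGSIYRERIGPLDLVVISDPTEIEKVFR"
                 "NEGRYPERIELASIKVYREIKKLPAGLINL")

for pos, a, b in aa_diffs(GENE_B_EXON2, VARIANT_EXON2):
    print(pos, a, b)
```

The same function gives a percent identity for equal-length segments as 100 * (1 - len(diffs) / len(seq_a)); gapped comparisons such as the BLAST percentages quoted in these notes need a real alignment first.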
>AFPZ73802.b2 ATWW116775.b1 ATUP525866.b1
AFSA402775.b2 ATUP741276.x1
The next set has the same aa seq but some nucleotide changes
ATUP25462.g1 AFSA200913.b2 AFPZ79091.g2
ATUP96117.x2 ATGI14630.g1
ATUP80026.x2 APWS81623.b1 ASWX9205.b2 ASWX9204.b2 ATUP613797.x1
ATGN332649.g1 ATGN213212.b1 ATGN297330.b1
ATGI197442.b1 ATGI153639.g1
ATWW52587.b1 ATWW71881.b1 ATUP471525.g1 ASWX134134.b2 ASFW134828.b2
AFSA527936.g2 AFSA482665.b2 AFSA737597.b2
AFSA440956.g2 AFSA93983.g2
(2) NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVQNFINYVYRWALE
(1)
AGAAACGGCCCGGAGTGGCAGCGCGTGCGCAGCTCGGTTCAGAAGGACCTCATGCGGCCTAAGACTGTC
GGTGCGTACGCCTCCCTGCAGGATGACGTCACAAGGGACTTGGTTGACGTCATCAGGGCT
CTGATAGGGAAGGAAGAGAGCGGAGGTCAAGTTCAAAACTTCATCAACTATGTGTACAGA
TGGGCGCTAGAGGGT
>ATGN332649.g1
ASFW117082.b2 AFPZ821362.g2 ATUP96117.x2 AFSA900319.g2
(1)
AISVVVLDKRLGCLTLGDLEPGSDAKLMIDGVNDFFDAFVKLEMSATGL YKYISTPTWRKFAKAVDQFHR (2)
AGCGATCAGCGTGGTTGTGCTGGACAAGCGGCTGGGGTGCCTGACCT
TGGGTGACCTTGAACCTGGTTCTGACGCAAAACTGATGATTGACGGGGTCAATGACTTCT
TCGATGCGTTCGTGAAACTGGAGATGTCAGCAACTGGCCTCTACAAGTACATCAGCACAC
CGACGTGGAGGAAGTTCGCAAAGGCAGTCGACCAGTTTCATAGGT
>ASFW117082.b2
ATUP830924.b1 ATGN22852.g1 ATUP560796.g1
ATUP234960.g1
AFSA900319.g2 AFPZ821362.g2 ATUP198761.x1
(2)
VAEKLLKEKLAKTTTEDGKPAESDTDFLQSLLSRNDVTFEEAMEMAVDLLSAGIDT (0)
AGCGTTGCTGAAAAGTTGCTGAAGGAAAAGCTG
GCTAAAACTACAACCGAAGATGGGAAACCCGCCGAGTCCGACACGGACTTCCTCCAGAGC
CTCCTGTCCAGGAATGACGTCACCTTCGAGGAGGCCATGGAGATGGCGGTGGATCTGTTG
TCTGCAGGGATTGACACGGT
Walked downstream to
AFSA197261.b2 AFPZ521495.y1 AFPZ279370.x1 ATWW68556.g1
ATWW76986.b1 AFPZ635668.x1 AFPZ710720.y1 AFPZ603520.g2
No hits found
>ASWX115484.g2
(2)
VYPTVLNNVRRLDQDIVLSGYVVPAK (0)
AGGGTTTATCCCACTGTCCCCTAA
CAACGTAGACGGTTGGACCAAGACATCGTGGTGTCTGGATATGTCGTTCCTGCCAAGGT
>ATUP741276.y1
mate pair to exon 2 ASWX115484.g2 (more accurate at N-term)
(0)
SGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFR (2)
(2) VYPTVLNNVRRLDQDIVLSGYVVPAK (0)
AGTCAGGGAACACTCTGATGTTCAATCTCTTCTGCCTGGCGAAAAACCCGGAA
GCCCAGGAGAAACTTTACCGAGAGATCCAGGAGGTGGTCCCAGCCGGGCAGCCCATAGAT
GATAAGGTGTTGAACAGGATGCACTACCTGCGGGCCGTGGTGAAGGAAACTTTCAGGT
>ATUP152918.x2
more accurate seq of VYP exon
AGGGTTTATCCAACTGTCCTAAACAACGTAAGACGGTTGGACCAGGACATCGTGTTGT
CTGGATATGTCGTTCCTGCGAAGGT
>ATUP280746.y2
(0)
TTILLAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCL (1)
>ATGN202956.b1
mate pair to exon 2
(1)
GRRFAEQELHLGLIR (0)
>ATGI263107.b1
mate pair to first exon
(0)
IVQNFHVGWAGEDMKQDNRIILAPDRDSFVFSERT*
37% to 27B1 fugu
MYQLLSAARHQGQSLFRVCRARSLAALKTTYRPQSNKAEESVTYDTAARPFEEIPGPKGLPLIGTALEYTPF (1)
(1) GQFKMITNLRESFRERTRTYGSIYRERIGPLDLVVISDPKEIEKVFRNEGRYPERIELASIKVYREIKKLPTGLINL (2)
(2) NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIRALIGKEESGGQVQNFINYVYRWALE (1)
(1) AISVVVLDKRLGCLTLGDLEPGSDAKLMIDGVNDFFDAFVKLEMSATGL YKYISTPTWRKFAKAVDQFHR (2)
(2) VAEKLLKEKLAKTTTEDGKPAESDTDFLQSLLSRNDVTFEEAMEMAVDLLSAGIDT (0)
(0) SGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAGQPIDDKVLNRMHYLRAVVKETFR (2)
(2) VYPTVLNNVRRLDQDIVLSGYVVPAK (0)
(0) TTILLAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCL (1)
(1) GRRFAEQELHLGLIR (0)
(0) IVQNFHVGWAGEDMKQDNRIILAPDRDSFVFSERT*
Gene D
probable CYP24 sequence
>AFSA245302.g2 ATUP830413.b1 ATUP956020.b1
ATUP792235.y1 AFPZ580605.x1
AFPZ494158.y1 AFPZ133883.y1 ATUP874170.b1
AFPZ330780.x1
ATGN48350.g1 mate pair to LCPT exon. Conflict; this is probably the correct N-term
MSLLPRVVRHHGRLFNVCSARSLVTYRSQSTRAEESVAYDTAAR
PFEKIPGPKGLPLIGTGLDYAPF (1)
>ATGN253365.g1
41% to CYP11A 38% to CYP27A, 73% to ATGN202956.g1
ATGN253365.g1 ATUP236935.g1 ASFW152033.b2
AFSA537353.b2 AFSA355465.g2
AFPZ611442.g2 ATUP848010.x1
(1)
GRFPIKTNLRDSYRERTKTYGSIYREKIGPRELVVISDPKDIQKVYRNE
GRYPERPQVDSIKTYREMKKLPAGIVVL
(2)
AGGTCGATTTCCAATAAAAACAAACCTGCGAGATTCATACAGAGAGAGAAC
AAAGACCTACGGGAGTATCTACCGTGAAAAGATCGGACCAAGAGAACTAGTGGTCATCAG
CGATCCGAAGGACATCCAGAAGGTGTACCGCAACGAGGGGAGATATCCGGAGCGCCCACA
GGTGGACAGCATCAAAACCTACCGGGAGATGAAGAAGCTGCCAGCTGGAATAGTGGTTCTGT
(2)
NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIKALIGKEESGGQVHNFINYVYRWTLE (1)
AGAAACGGTCCTGAGTGGCAGCGCGTGCGCAGTTCGGTTCAGAAGGACCTCATGCGGCCCA
AGACTGTCGGTGCGTACGCCTCTCTGCAGGATGACGTCACCAGGGACTTGGTTGACGTCA
TCAAGGCTCTGATAGGGAAGGAGGAGAGCGGAGGTCAAGTTCACAACTTCATCAACTATG
TGTACAGATGGACGCTAGAGTGT
>ATUP236935.g1
mate pair ATUP618228.y1 ATGN318410.b1 AFSA186811.g2
(1)
AISVVVLDKRLGCLTLGDLEPGSDAQMMIGGVNDFFNAFAKLEMSATGL YKYISTPTWRKFQKAIDQWHT (2)
AGCGATCAGCGTGGTT
GTGCTGGACAAGCGGCTAGGTTGCCTGACCTTGGGTGACCTTGAACCGGGTTCTGACGCA
CAAATGATGATTGGCGGGGTCAACGACTTCTTCAACGCATTTGCCAAACTGGAAATGTCA
GCAACTGGTCTCTACAAGTACATCAGCACACCGACCTGGAGGAAGTTC
CAAAAGGCGATCGACCAGTGGCACACGT
>AFPZ611442.b2
mate pair
AFPZ179516.y1 ASWX174932.x1 ATWW156666.g1
ATUP173093.g1
ATUP427411.g2
AFSA551799.b2 AFSA145337.b2 AFPZ611442.b2
AFPZ183898.y1 ATWX88590.g1
AWYB4327.b1 ATGN37072.g1 ATUP18347.x2
ATGI56395.b1 ATWW168445.g1
ATWW74735.b1
(2)
VAAKLLKEKLTQSTIEDGKPAESDTDFLQSLLSRNDVTFEEAMEMALDLLSAGIDT (0)
AGAGTCGCTGCGAAGTTGCTGAAGGAAAAGCTGACGCAGAGTACAATT
GAAGATGGGAAACCCGCCGAGTCCGACACGGACTTTCTCCAGAGCCTCCTGTCCAGGAAT
GACGTCACCTTCGAGGAGGCGATGGAGATGGCATTGGATCTGTTGTCTGCCGGGATCGACACGGT
>ATUP173093.g1
ATGN155684.b1
(0)
TGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAEQPIDDKVLNRMHYLRAVVKETFR (2)
AGACGGGGAACACCCTGATGTTCAACCTCTT
CTGCTTGGCGAAAAACCCGGAAGCCCAGGAGAAACTTTACCGAGAGATCCAGGAGGTGGT
CCCAGCCGAGCAGCCCATAGATGATAAGGTGTTGAACAGGATGCACTACCTGCGGGCCGT
GGTGAAGGAAACTTTCAGGT
>ATGN48350.b1
mate pair to exon 1
ATUP204612.y2 ATGN213212.g1 ATUP924192.x1
ATUP305484.g1
ATGI56395.g1
ATGN48350.b1 ATUP796046.y1 ATWW71881.g1 ATUP574827.g1
ATGI251785.g1
AFPZ551860.x1 ASFW167838.b2 ASFW29294.g2
AFSA914086.g2 AFSA905811.b2
AFSA186811.b2
(2)
LCPTVGNNIRTLDRDMVLSGYVVPAK (0)
AGGCTTTGTCCAACTGTTGGCAACAACATAAGAACGCTGGACCGAGACATGGTGTTGT
>ATGN253365.b1
short, mate pair
ATGN48350.b1 ATUP759533.x1 ATUP574827.g1
AFSA914086.g2
AFPZ115025.y1 ATUP204612.y2 APWS64696.b1
ATGN213212.g1 ATWW71881.g1
AFPZ845352.y1 AFSA905811.b2 AFSA186811.b2
(0)
TKIFMAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCI (1)
AGACGAAGATCTTCATGGCTCACGACGTCATCAGCTCGCTT
CCGGAACCGGAAGTCTACAAACCGGAAAGATGGCTCCGTGATGACGAGTCGAGCAGCGTCCAACCGTTCACCCT
GCTGCCGTTCGGCTACGGACCGAGGATGTGCATTGGT
all with same aa seq but some nuc changes, probably more than one gene
ATGN253365.b1 ASFW10928.g2 AFSA627713.g2
AFPZ69957.b2 AFPZ115025.y1
APWS64696.b1 ATUP837108.y1 AFPZ845352.y1 AFSA186811.b2
AFPZ377974.x1
AFPZ577425.x3 ATGI105841.b1 ATUP127524.x2
ATUP52374.y2 ASWX174932.y1 ASWX70755.g2 ATUP404385.b1
ATGN368254.b1 ATGI182479.b1 ATUP207917.x2
ATGI127474.b1 ATUP163752.y1 ATUP554136.y1
ASWX158863.b2 AFSA947971.g2
AFSA594073.g2 AFSA576548.b2 AFSA537353.g2
AFSA763143.b1 AFSA409073.b2
AFPZ409600.b2
(1)
GRRFAEQELHLGLIR (0)
AGGTCGGCGTTTCGCAGAACAAGAGCTTCATCTCGGATTGATCAGGGT
Possible C-term seq for Gene D
>AFPZ69957.b2 APWS64696.b1 ATUP837108.y1 ATUP782584.y1 AFPZ838779.y1
(0)
IVQNFHVGWAGEDMKQVNRLVLSPDRDSFVFSARA*
MSLLPRVVRHHGRLFNVCSARSLVTYRSQSTRAEESVAYDTAARPFEKIPGPKGLPLIGTGLDYAPF (1)
(1) GRFPIKTNLRDSYRERTKTYGSIYREKIGPRELVVISDPKDIQKVYRNEGRYPERPQVDSIKTYREMKKLPAGIVVL (2)
(2) NGPEWQRVRSSVQKDLMRPKTVGAYASLQDDVTRDLVDVIKALIGKEESGGQVHNFINYVYRWTLE (1)
(1) AISVVVLDKRLGCLTLGDLEPGSDAQMMIGGVNDFFNAFAKLEMSATGL YKYISTPTWRKFQKAIDQWHT (2)
(2) VAAKLLKEKLTQSTIEDGKPAESDTDFLQSLLSRNDVTFEEAMEMALDLLSAGIDT (0)
(0) TGNTLMFNLFCLAKNPEAQEKLYREIQEVVPAEQPIDDKVLNRMHYLRAVVKETFR (2)
(2) LCPTVGNNIRTLDRDMVLSGYVVPAK (0)
(0) TKIFMAHDVISSLPEYYPEPEVYKPERWLRDDESSSVQPFTLLPFGYGPRMCI (1)
(1) GRRFAEQELHLGLIR (0)
(0) IVQNFHVGWAGEDMKQVNRLVLSPDRDSFVFSARA*
data for do-it-yourself
>CYP11amphi
mixed seq 43% to Gene C, 35% to Gene B, 34% to Gene D
36% to 27B1 fugu, 38% to 11A1 fugu, 33% to CYP24 fugu, 32% to 27C1 fugu
37% to chicken CYP11A1, 39% to catfish Ictalurus punctatus 11A1
This is a probable CYP11A gene
(2)
EAKPFSALPGPPSVPVLGNFLHMWWEGLLEKEKLNKNHIMFTDFFRQYGPIFR (2)