>gnl|ti|647066038 1095898227332  34% to 17A1 35% to 2U1 fugu 33% to 2U1 human

74% to 1095901734433

1097567032902 1096761105127 1096123323522 1096088745900

Combined seq from CN769290 and CN769570 39% to CYP17A

EST = CN769570.1

mate pair of 1096088745900 had partial match to N-term. Walked upstream to

1097672588127; N-term still missing, end of this exon seq not certain.

Cannot walk upstream any further.

 

(1) AFNRNTNSLINSDPGPRFKILRKLASSSLKIYAEGLLGMERIAISEYCELSKKLQSIKEKPVSVHKIM (1)

AGCATTCAACAGAAATACGAACAGCCTCATTAACAGTGATCCAGGCCCGCGTTTTAAAATTTTA

CGAAAGTTAGCATCATCTTCTTTGAAAATTTACGCTGAGGGTTTATTGGGAATGGAAAGA

ATAGCAATCAGTGAATATTGTGAACTGAGTAAAAAGTTACAATCAATAAAAGAAAAACCA

GTATCGGTTCATAAAATAATGGGT

 

(0) QSTLNIICTILFNHRYEDDNQEFQNIIKYSSLIVQTFNETSYVSS

IPLLRYFPTATSRNIFEIIRLRDPILKRKLQEHRKSYDKNNLRDITDALIKVSLDSEMGE

ELTEKITDDNIEFLLNDFMIAGSETSSSTILWFIVYMLHWPEYQNKLYDEITKVASDNRY

VSLKDRPMLHLMQAAIHETLRLSSVVPLGLVHKAMENSSICGKFVPKGALILTNLWSMHH

DESYWKNAMSFYPERWLEKSGEFNYKLGYAYLPFSNGPRSCLGETLAKTELFVFITRLLK

DYRFEMPTGKELPCLDGRSGITSPPNDFEVVIIPRN*

AGCAAAGCACACTTAACATTA

TTTGTACCATTCTTTTTAATCATCGCTACGAGGATGACAACCAGGAGTTTCAGAATATCA

TAAAATACTCAAGTTTAATCGTTCAAACTTTTAATGAAACCAGTTACGTATCTTCCATTC

CATTGCTGCGCTATTTCCCAACGGCAACGTCGCGAAATATTTTTGAAATCATAAGGCTTC

GTGATCCGATTTTAAAACGAAAACTCCAAGAGCACAGAAAATCTTACGATAAGAATAATT

TACGTGACATAACCGATGCATTAATAAAAGTGTCTTTAGATTCAGAGATGGGTGAAGAAT

TAACTGAAAAGATTACTGATGATAATATTGAGTTTCTTTTAAACGATTTTATGATTGCTG

GATCCGAAACTTCATCAAGTACTATTCTTTGGTTTATTGTTTACATGTTACATTGGCCAG

AATACCAAAATAAACTTTATGATGAAATTACTAAAGTAGCATCAGATAACCGTTATGTAT

CTTTAAAGGATCGACCTATGCTTCATTTAATGCAAGCTGCAATTCATGAAACACTTAGAC

TGTCATCGGTGGTACCTCTTGGTTTGGTTCATAAAGCAATGGAGAACAGTAGCATTTGTG

GCAAGTTTGTTCCTAAGGGAGCTCTTATTTTAACAAATTTATGGAGTATGCATCACGATG

AAAGCTATTGGAAAAATGCAATGAGTTTTTACCCGGAACGTTGGCTGGAAAAATCTGGCG

AGTTCAATTATAAATTGGGGTACGCATATTTACCGTTTTCTAATGGACCTCGTAGTTGTT

TAGGAGAAACATTGGCAAAAACAGAGTTGTTTGTGTTTATTACACGATTACTTAAAGATT

ACCGATTTGAAATGCCAACTGGAAAAGAGTTACCTTGTTTAGATGGTCGTTCTGGAATCA

CCTCCCCTCCTAATGACTTTGAAGTCGTGATAATTCCAAGAAATTAA

 

>complete combined seq CN566859 CN566581 CYP2 clan member [gene 2]

1097326058990

32% to 2X9 aa 26-146 34% to CYP17A

MFLEVIGAVFIPPLIWTIWVYIKHLIDCLHYPRGPIPLPFIGNGYLIRKAEPYKELVNLGKIYG

DVFSFSVGSVRYVIVNSLEGIQEVLVKKGWQFAGRPKGP ()

SWDRSIHGLIQRDPSKKFKILRKLATSSLKIFADGLAGMESKAIEESFQLNKKLLETNGKPFSMQEIT (1)

1097329249233

(1) TLCVLNIICSILFNHRYKEDDLEFQDIIKYSNICFKERGVNNYIISIPWLRY

FPSASSRNLDEMIKIRDPLL

KKKVQEHKRSYDEYNLRDLTDALIKASNSETGQDPDEKVTDDNIVFILN

NFILAGSETSSNTILWFIVYILHWPEYQDKLYDEILKVTSGSRYPCLKDRPSLHLMQAAI

YETLRLSSVAPFGLHHKAMEKSSICGKSI

PKGALIITNLWSIHHDESYWKNAMSFYPERWLENSGEFNSKLGNAYLPFSSGPRSCIGETL 395

394 AKTELFIFISRLINDFRFVKPILEELPRLEGSFGITCTPYDFKVEIVPRSKNLLV* 227

AGATGCAACTGATAATAAATTCGCGATCCGCTATAAAGAAAAAA

GTCCAAGAGCACAAAAGATCGTATGACGAATATAATTTACGCGATCTAACAGATGCTTTA

ATAAAAGCATCAAACTCGGAGACGGGACAAGATCCGGATGAAAAAGTTACTGATGATAAT

ATTGTATTTATCTTAAATAATTTTATACTCGCAGGATCAGAGACTTCATCAAATACGATT

CTTTGGTTCATTGTTTATATTTTACATTGGCCGGAGTATCAAGATAAACTTTATGATGAA

ATTTTAAAAGTAACATCAGGTAGCCGTTACCCTTGCTTAAAAGATCGCCCGTCACTACAT

TTAATGCAAGCTGCAATTTATGAAACACTTAGGTTGTCATCGGTCGCACCTTTTGGTTTA

CATCATAAAGCCATGGAGAAAAGTAGCATTTGTGGAAAATCTATCCCTAAAGGCGCTCTT

ATAATAACCAATCTATGGAGTATACACCATGACGAAAGCTACTGGAAAAATGCAATGAGT

TTTTACCCTGAACGTTGGTTGGAAAGTTCTGGCGAATTTAACTCTAAACTAGGAAATGCG

TATTTACCATTCTCTAGTGGACCTCGTAGCTGTATTGGAGAAACATTAGCAAAAACTGAG

 

>1097329039870 1096703377333 1097664213870

MLFKVIGTILIPPLIWVVWIYIKHLVDCLSYPQGPFPLPFIGNAHLIRNRESYKVF

SEFQKIYGSVFGFSIGSTRYVVVNNLEGVQEVLIKKGSQFAGRPRRA (1)

ATGCTCTTTAAAGTCATTGG

TACAATCTTGGTTCCACCTTTAATATGGGTTGTATGGATTTATATCAAACATCTTGTTGA

CTGCTTGTCCTATCCTCAAGGACCATTTCCTCTCCCATTTATAGGAAATGCTCATTTAAT

AAGAAATAGGGAGTCTTATAAAGTGTTTTCTGAATTTCAGAAGATTTATGGCAGCGTTTT

TGGATTTAGCATTGGCTCAACCAGATATGTGGTTGTAAATAACTTAGAAGGAGTTCAAGA

GGTTTTGATCAAAAAAGGTTCACAGTTTGCAGGCCGCCCAAGACGAGCAAGT

 

>1096703827379 1095896863976

MFPEIVGAIMLPPLIWAAWIYIKHLVDCLVYPRGPFPLPFVGNAYLFSKGKPYKEFVKLG 103

KTYGDVFGFSIGSIRYVVVNSLEGIKKXXXXXXXXXXXXXXXX

ATGTTTCCTGAAATCGTTG

GCGCAATTATGCTTCCTCCCTTGATATGGGCAGCGTGGATTTACATAAAACATCTTGTTG

ACTGTTTAGTTTATCCCCGAGGACCATTTCCACTACCTTTTGTAGGAAATGCATATCTCT

TCAGTAAAGGCAAACCTTATAAAGAATTTGTTAAACTTGGAAAAACTTACGGCGATGTAT

TTGGCTTTAGCATTGGTTCAATACGATATGTAGTCGTGAACAGCTTGGAAGGTATCAAGA

AGT

 

>1095899160393 frameshifted

MFFEVIRAFFTPPLVWIIMVYIKNLIDYLYYPREPIPLPFIGNGDLIRKAEPFKEL

VNLEKKYGDVFSFRIGLVRFVVVSSLEVILEILVKKGWQANGRPKAP (1)

ATGTTTTTTGAAGTTATTCGCGCCTTCTTTACTCCACCTTTGGTATGGATTATAATGGTTTATATAAAA

AATTTAATCGATTATTTGTATTATCCACGAG

AACCGATACCACTACCATTTATTGGAAATGGTGATTTGATAAGAAAAGCAGAACCGTTTA

AAGAGTTGGTTAACCTGGAAAAAAAATATGGCGATGTTTTTAGTTTTAGGATTGGTTTAG

TCAGATTTGTGGTTGTTTCA

AGTTTAGAAGTAATTTTAGAAATACTAGTAAAAAAAGGGTG

GCAGGCAAATGGTCGTCCAAAAGCTCCAAGT

 

 

 

>1097329360095 4 aa diffs to CN566859 from PKG

FYLNNFILAGSETSSNTILWFIVYILHWPEYQDKLYDEILKVTSGSRYPCLKDRPSLHLM

QAAIYETLRLSSVAPFGLHHKAMEKSSICGKSIPKGALIITNLWSIHHDESYWKNAMSFY

PERWLESSGEFNSKLGNAYLPFSSGPRSCIGETLAKTELFIFISRLINDFRFVKPISEEL

PRLDGSFGITCTPYDFKVEIVPRSKNLLF*

TTTTATCTTAATAATTTTATACTTGCAGGATCAGAGACTTCAT

CAAATACGATTCTTTGGTTCATTGTTTATATTTTACATTGGCCGGAGTATCAAGATAAAC

TTTATGATGAAATTTTAAAAGTAACATCAGGTAGCCGTTACCCTTGCTTAAAAGATCGCC

CGTCACTACATTTAATGCAAGCTGCAATTTATGAAACACTTAGGTTGTCATCGGTCGCAC

CTTTTGGTTTACATCATAAAGCCATGGAGAAAAGTAGCATTTGTGGAAAATCTATCCCTA

AAGGCGCTCTTATAATAACCAATCTATGGAGTATACACCATGACGAAAGCTACTGGAAAA

ATGCAATGAGTTTTTACCCTGAACGTTGGTTGGAAAGTTCTGGCGAATTTAACTCTAAAC

TAGGAAATGCGTATTTACCATTCTCTAGTGGACCTCGTAGCTGTATTGGAGAAACATTAG

CAAAAACTGAGTTGTTTATTTTTATATCCCGATTAATAAATGATTTCCGATTTGTAAAAC

CGATATCAGAGGAATTACCGCGTTTAGATGGTAGTTTTGGCATCACTTGTACTCCTTATG

ACTTTAAAGTTGAAATAGTTCCAAGGAGTAAAAATTTACTGTTTTAA
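
The "4 aa diffs to CN566859 from PKG" note above can be checked by a position-by-position comparison of the two translations. The sketch below is only an illustration: the helper name `aa_diffs` is hypothetical, and it assumes the two peptide strings are already aligned without gaps (true for the PKG-to-stop region copied from the two records, but not in general).

```python
def aa_diffs(a, b):
    """List (position, residue_a, residue_b) for each mismatch between
    two pre-aligned, equal-length peptide strings (1-based positions)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Region from PKG to the residue before the stop, copied from the
# CN566859 combined sequence and from the 1097329360095 record.
cn566859 = (
    "PKGALIITNLWSIHHDESYWKNAMSFYPERWLENSGEFNSKLGNAYLPFSSGPRSCIGETL"
    "AKTELFIFISRLINDFRFVKPILEELPRLEGSFGITCTPYDFKVEIVPRSKNLLV"
)
contig = (
    "PKGALIITNLWSIHHDESYWKNAMSFY"
    "PERWLESSGEFNSKLGNAYLPFSSGPRSCIGETLAKTELFIFISRLINDFRFVKPISEEL"
    "PRLDGSFGITCTPYDFKVEIVPRSKNLLF"
)

diffs = aa_diffs(cn566859, contig)
print(len(diffs), diffs)
```

For sequences with insertions or deletions a real alignment step would be needed before counting; this helper deliberately refuses unequal lengths rather than guessing.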

 

>1097509039345 92% identical to 1096064108200, probably joins with 1095898835518

1096625274183  1095900033599  1095896933215 100% match so this similar seq is real

1097206379175 1097678021634

MFLEVAFGVVTPLFLYVIATYLDHLFKCRFYPPGPFPLPIIGNLHLIGKKPHEKFVEYSK 538

KYGEVFSLSFGMHRVVIVSGKDSIREVLVQKSNIFAGRPKNYIANIVSRGYKNIGYGDIG 718

PKWKILRKIAHSSLKNYGESTAHLETLVVRESEELHKNLYKKSNRSTKLEHKF (1)

>gnl|ti|649400787 1095898835518 93% identical to 1096064108200, 39% to 17A1 fugu

35% to 2U1

gnl|ti|647175227 1095898288652 1096602038000

(1) GVAVLNVICSIVFGKRYEYENCEFKEILTYMNYVFTGVAGTNAISFIPWLRFLPLDGLR

KLKKGLSIRDPVLRKQLLYHRETYNESNLRDYTDYVIQFSRDEAILKKFGEQLTDDY

LELLLNDIFIAGTETALTTLLWSIIYLIHWPKFQDKIYNEIVSAIGKNRYPSMKDRNMLP

LVNAALSETLRLSSVTPLGVPHKAMEDTTLLNDLKIPKGTTILTNLWQLHHNKNCWENPH

EFNPYRWFTNDQTLDSIKSMNFLPFSAGTRVCLGKGIAEVELFLFYSRLVRDFKFEVKP

GDSLPSLYGNCGLL*

AGGTGTTGCGGTATTAAATGT

CATTTGCTCTATTGTATTTGGAAAACGCTATGAGTACGAAAATTGTGAATTTAAAGAAAT

CCTAACCTACATGAATTATGTTTTTACTGGTGTAGCTGGTACAAACGCAATTTCTTTTAT

TCCGTGGCTTCGTTTCCTTCCATTAGATGGATTACGAAAATTAAAAAAAGGACTTTCAAT

TAGAGATCCGGTTCTTCGGAAGCAGTTGTTATATCACAGAGAGACCTACAATGAAAGTAA

CCTGCGTGACTATACAGACTATGTCATACAATTTTCAAGAGATGAGGCCATCTTGAAAAA

GTTTGGAGAACAGCTAACTGATGACTACTTAGAGCTTTTACTTAATGATATATTTATAGC

TGGAACTGAAACTGCATTGACAACTTTACTTTGGTCAATTATCTACCTTATTCACTGGCC

AAAGTTTCAAGACAAAATTTACAATGAAATTGTTTCAGCTATTGGTAAAAATAGATATCC

TTCTATGAAAGATCGTAATATGCTGCCTCTTGTTAACGCTGCGTTATCAGAAACATTGCG

GTTATCTTCTGTTACTCCATTAGGAGTACCTCACAAAGCTATGGAAGATACAACTCTCTT

GAATGATTTAAAGATTCCCAAAGGCACCACAATTTTAACGAACCTTTGGCAATTACATCA

CAATAAAAACTGTTGGGAAAATCCACATGAGTTTAATCCATATAGATGGTTTACTAATGA

TCAAACACTTGATTCTATAAAATCTATGAATTTTTTACCTTTTTCTGCTGGTACCAGAGT

GTGTTTAGGAAAGGGTATTGCTGAAGTTGAACTTTTTCTTTTTTACTCAAGGCTGGTTCG

TGATTTTAAGTTTGAAGTAAAACCCGGCGATAGTCTTCCAAGTTTATATGGAAATTGTGG

ATTACTCTAA

 

>gnl|ti|648017453 1095896110991     52   1e-05 35% to 17A1 fugu 34% to 2U1 fugu

gnl|ti|647987527 1095895119635

1096703762277 used this seq to walk upstream past a repeat; could not go further

71% to 1095898227332

(1) ELTTLNIICTILFNQRYEQDDDEFQNIIKYSNLSFKAFSASNLLSSIPWLRYFPTTASKYIQ 707

706 EIERLRDPILKRKLQEHRKSYDENNLRDITDALIKASIHLNAEKDSLIKVTDDNIQFILN 527

526 DLILAGSETSSSTITWFIVYMLHYPEYQDKIFNEVIKVTSGNRYPCLNDRPLLHLLQATI 347

346 HETLRLSSVAPLGLRHKAMENSTICDKPVLKGTLIITNLWSIHHDERYWKNPMSFYPERW 167

166 LNETGEFDYKLGNAYIPFSGGPRACLGETLAKTELFVIISRLVTDFYFEKSVEEDLPRLDSF 374

375 PGVTRSPYDFKVVVVSRS*

 

>gnl|ti|647193621 1095899233960 1096082123583 1097696262164 1096620040714

1097206342731

Combined sequences BP505786 and CB073123 and CB271974 40% to CYP17A [gene 3]

CN570733 same as CN570522 BP505786

50% to 1095898835518 37% to 17A1

(1) VTGVMNVLCGIVFGTQYEENDKELEKVISFKQLILDGVADTFAISFLPWLRFFPSNGLKKVRK

GVLIRDKLLRFQLKKHRETYNPVQIRDYTDYVLKYSKEFETSRNIDEQLSEDNMEMM

LQDIFISGSETTISTLLWFAVYLVNWPKYQDDIYDETIKIVGNDRYPSLSDRPKLHLFES

AMKETLRLSSVIPLGLPHRSLEETSIKKFKIPKNTNVMINLWQLHHDSKSWSDPHTFNPY

RWLNDKNIFDKSKNPNYLPFSTGLRACLGYHTTESIIFLFFTRLIRDFNLCLKPGASTP

SLNGVLRVTLTPDTSYIILKPRSNNLISQKIEA*

AGTTACTGGAGTGATGAACGTTCTTTG

TGGAATTGTTTTTGGTACACAATATGAAGAAAATGATAAAGAACTTGAAAAAGTCATATC

TTTTAAACAGTTAATATTAGATGGAGTAGCAGATACATTCGCAATATCTTTTTTGCCGTG

GTTAAGGTTTTTTCCTTCAAACGGATTAAAGAAAGTACGAAAAGGCGTGTTGATAAGAGA

TAAACTACTTAGGTTTCAATTAAAAAAACATCGAGAAACATACAATCCAGTTCAAATAAG

AGATTACACTGATTACGTACTTAAATACTCAAAAGAGTTCGAAACTTCAAGAAACATAGA

TGAGCAGTTAAGTGAAGATAATATGGAAATGATGCTTCAGGATATTTTCATTAGTGGTAG

CGAAACAACTATATCAACACTTCTTTGGTTTGCTGTTTATTTAGTTAACTGGCCAAAGTA

TCAAGATGATATCTATGATGAAACTATTAAAATAGTCGGTAATGATAGGTATCCTAGTCT

TTCAGATCGTCCAAAGCTTCATTTATTTGAAAGTGCTATGAAAGAAACTCTGCGTTTGTC

GTCTGTCATTCCATTAGGTTTACCTCACAGAAGTCTTGAAGAAACCAGCATAAAAAAATT

TAAAATTCCTAAAAATACAAACGTAATGATTAATCTGTGGCAGTTGCACCATGATAGTAA

ATCTTGGAGTGATCCTCATACATTTAATCCATATAGATGGTTAAATGACAAGAATATCTT

TGACAAAAGCAAAAACCCAAACTATCTTCCATTTTCAACCGGATTAAGAGCCTGCTTAGG

TTATCACACAACCGAATCCATCATTTTTTTGTTTTTTACCCGATTGATAAGAGATTTTAA

TCTTTGTTTGAAACCTGGCGCATCTACTCCAAGTTTAAACGGTGTTTTGCGAGTAACCTT

AACTCCTGATACGTCATACATTATTCTAAC

 

>gnl|ti|648033522 1095897342515  39% to 17A1 N-term

1095899118747 1095900033599 1096071090512  1096703396910  1096608233968  

MFLEIAFGVTAPLLLYVIATYLDHLFKCRFYPP

GPFPLPIIGNLHLIGKKPHEKFVEYSKKYGEVFSLSFGMHRVVIVSGKDSIREVLVQK

SNIFAGRPKNYIANIVSRGYKNIGYGDIGPKWKILRKIAHSSLKNYGESTKHLETLVVK

ESEELHKRLFKNCNRSTELEDEF (1)

ATGTTCTTAGAAATTGCTTTTGGAGTAACAGCTCCTCTGCTTTTGTATGTCATTGCAACTTATCTAG

ATCATTTGTTTAAATGCAGATTTTACCCGCCAGGCCCTTTTCCTTTACCGATTATTGGGA

ACTTACATTTGATTGGAAAAAAACCACATGAAAAGTTTGTAGAATATTCAAAAAAGTATG

GAGAAGTATTCAGTCTAAGTTTTGGAATGCATCGTGTTGTTATTGTTTCAGGAAAAGATT

CTATTAGAGAGGTTTTGGTTCAAAAATCAAACATTTTTGCAGGGCGTCCTAAAAACTACA

TTGCTAATATTGTATCTCGTGGTTATAAAAATATTGGCTACGGAGATATTGGACCTAAAT

GGAAAATTTTGAGGAAAATTGCTCACTCTTCTTTAAAAAACTATGGAGAGTCAACTAAAC

ATTTGGAAACGCTTGTCGTAAAAGAAAGCGAAGAGCTACACAAAAGACTTTTTAAAAATT

GTAACAGATCCACAGAGCTAGAAGATGAGTTTGGT

1096064108200 93% to 1095898835518  1097206931796(9 aa diffs)

1097206498632 walked up to 1096081234652 found mate pair 1096071090512

already known N-term seq matches 1095897342515 100%

1095897342515 38% to 17A1 fugu whole seq.

MFLEIAFGVTAPLLLYVIATYLDHLFKCRFYPP

GPFPLPIIGNLHLIGKKPHEKFVEYSKKYGEVFSLSFGMHRVVIVSGKDSIREVLVQK

SNIFAGRPKNYIANIVSRGYKNIGYGDIGPKWKILRKIAHSSLKNYGESTKHLETLVVK

ESEELHKRLFKNCNRSTELEDEF (1)

(1) GVAVLNVICFIVFAKRYENKDSEFKKILMYMNYVFSGVASTNFASFIPWLRFFPLDGLR

KLKKGLSIRDPVLRKQLLYHRETYNESNLRDYTDYVIQFSRDEAILKKFGEQLTDDYLEL

LLNDIFIAGTETALTTLLWSIIYLIHWPKFQDEIYNEIVSTIGKDRYPSMKDRNMLPLVN

AALSETLRLSSVTPLGVPHKAMEDTTLLNDLKIPKGTTILTNLWQLHHNENCWENPHEFNPYRWF

TNDQALDSIKSMNFLPFSAGTRVCLGKGIAEVELFLFYSRLVRDFKFEVKPGDSLPSLDG

NYGITLTPRIFTTFVVARNDSLVAQNHSL*

 

 

>gnl|ti|647182814 1095899213949  1095958075467  1095733042694

1097672545497

54% to 1095898835518, 36% to 17A1 36% to 2U1

walked upstream to 1097672406696 which mate pairs to exon 2 below

(1) GVAVLNVICFIVFGERYQYSDPAFIEILTTINNIVSGLSNTTAVDFLPGLRYLQFSEIK 256

257 KLKSSLVIYFRLLNDQLKKHKKTFDENNIRDFTDSIIKFSKDETMENKFEEELTDEHLEH 436

437 VIGDMFIAGSETTLTSLLWLIIYMIHYPKYQEEIFEEITRVIGENRYPQLSDRDSLHLVK 616

617 ASIKECLRLSSIIPLGVPHKTMSDTTLIGYNIPKNTTVIINHWQIHNDTNHWKNPNEFNP 796

797 HRWIDDDSKFDATRATSYLPFSAGTRVCLGKTVAETELFFFFTRLIRDFKFE

GVPGCPLPSLIGKCSITLAPEEFNVHVTPRINSLMFSKNVLPE*

 

>combined seq CN774619 CN775634 CYP2 clan member  [Gene 1]

 

32% to CYP1C1 aa 173-297 29% to 17A2

  2 ESEELHKRLLMKSKTSVDLKTEFGAAIINVICFIVFGERYQYSNSEFKEVLTTINNIV 175

176 DGLSNTTAVGFLPWLRFLPFSPIKKLSISLSKYIRFLNDKLTKHKETFNENKIRDS 343

344 TDSIIN 361

>1096526199166 frame3_ORF1 7aa diffs to CN774619 may be same gene

(1) GAAIINVICFIVFGERYQYSDSEFKEVLTTINDIVDGLSNTTAVGFLPWLRFLPFSPIKKLSIS

LSKYVRFLNDKLKKHKETFDEKKIRDFTDSIINFSNNEAVKQKFKNVDEHLEPVIGDLFI

TGSETTLTSLLWLILYMMHYPKYQQEIFKEITTVIGEDRYPCLNDRDSLHLVKAALKECL

RLSSIVPLGLPHKTTKETVLMGHSIPGNATVMINHWQIHNDTNYWENPNEFNPYRWIGKD

KKFDPSKATSFLPFSAGTRVCLGKTVAENELFFFFSRLIRDFNFECIPGCPPPSLIGKCN

ITHAPKQFCAYLTPRINNLM*

AGGTGCTGCAATTATAAACGTGATTTGTTTCATTGTTTTTGGGGAAAGATACCAGTATTCAGATTCAGAAT

TTAAAGAAGTTCTTACAACAATAAATGATATAGTCGATGGGTTGTCAAATACAACTGCTG

TTGGATTTTTGCCGTGGTTGAGATTTTTACCGTTTTCTCCAATAAAAAAACTGAGTATTT

CACTTTCAAAATATGTTCGTTTTTTAAACGATAAGTTGAAAAAACATAAGGAAACATTTG

ATGAAAAGAAAATTCGAGATTTTACTGATTCTATTATAAATTTTTCTAATAACGAAGCTG

TCAAACAAAAATTTAAAAACGTTGATGAACATTTAGAGCCTGTGATTGGGGATTTATTTA

TAACGGGTAGTGAGACCACATTAACATCTTTATTGTGGTTAATTCTTTATATGATGCATT

ATCCCAAATATCAACAAGAAATTTTTAAAGAAATTACAACGGTTATTGGTGAAGACCGGT

ACCCATGTTTAAATGACCGTGATTCTTTGCATCTTGTTAAAGCCGCATTAAAAGAGTGTC

TGCGTTTATCTTCAATTGTTCCTCTTGGATTACCACACAAAACAACCAAAGAAACAGTTC

TTATGGGACATAGCATTCCTGGGAATGCAACAGTCATGATTAATCATTGGCAGATTCATA

ACGATACTAACTACTGGGAAAATCCTAACGAATTTAATCCTTATCGGTGGATTGGTAAAG

ATAAGAAATTTGATCCAAGTAAAGCAACAAGTTTTTTACCTTTTTCAGCCGGTACAAGAG

TTTGTTTAGGGAAAACAGTTGCTGAAAATGAACTATTTTTCTTCTTTTCTAGATTAATTC

GAGATTTTAACTTTGAGTGCATACCTGGTTGTCCACCTCCAAGTTTAATTGGTAAATGCA

ATATTACTCATGCTCCAAAACAGTTTTGCGCATACTTGACTCCAAGAATAAACAATCTAATGTAA

 

>whole gene 1095899272864 1096526199166

MWYEIICGLIISILLYIIGSYLMHLLECRKYPLGPFPIPIFGNLHLLGTEPHKILAAYS

KKYGAVFSISLGLQRIVIISDITTTREALVQKASIFAGRPKSYLIQLISSGYKGIAFMDY

GSFWKVLRKVSHSSLKIYGEGHERFEKILTKESEELHKRLLKKSNNSVELKSEF (1)

GAAIINVICFIVFGERYQYSDSEFKEVLTTINDIVDGLSNTTAVGFLPWLRFLPFSPIKKLSIS

LSKYVRFLNDKLKKHKETFDEKKIRDFTDSIINFSNNEAVKQKFKNVDEHLEPVIGDLFI

TGSETTLTSLLWLILYMMHYPKYQQEIFKEITTVIGEDRYPCLNDRDSLHLVKAALKECL

RLSSIVPLGLPHKTTKETVLMGHSIPGNATVMINHWQIHNDTNYWENPNEFNPYRWIGKD

KKFDPSKATSFLPFSAGTRVCLGKTVAENELFFFFSRLIRDFNFECIPGCPPPSLIGKCN

ITHAPKQFCAYLTPRINNLM*

 

>gnl|ti|655005893 1095958068757     44   0.002 43% to 4V5 fugu 36% to 4T5

gnl|ti|651153924 1095901025079 N-term

gnl|ti|651153911 1095901025066

1097206604076 1097206339312 

complete gene, no introns; ESTs = CV566433.1 CX054637.1 CV566166.1

MVSVFYILFSGLVFYVVSKILWKLWRNSYGLSSIVTPPNVPFFGTSLYLHSDA

RKFFFQLYDYTRRYGDVFCIWLGPKPVICSSSVKFSEAVLSSQKVITKGFSYDFLHDWLK

TGLLTSTGSKWKTRRRLLTPSFHFSILNNFIKIFEEQASILVDKLAVAADNKEVVDVQVP

IGLATLDIICETSMGVKVNAQSHPDSEYVKAITVLNEEIQMRQKFPWLWFDAIYKLL 568

567 PCGKRFYKALDVAHKLSFDVINERMQMKIQESYCETASDEKKFFLDLLLDIYRKGKI 397

396 DTEGIQEEVDTFMFEGHDTTSAALGWTLWLLGKNPDVQKKLHKEIDEIELNGGSLYDKVR 217

216 QSKYLEIILKESLRMHPPVPMYGRTVEEDMTIDGQFVPKGAQIVLLVLILHSNPDYWEN 40

39  PNDFIPERFEADSYEKRNPYSYVPFSAGPRNCIGQKFAMIEEK

ILLYSIMKNFHLKSMQNENEVFGTLDIIHKSINGINIKFTRR*

ATGGTATCAGTTTTTTATATATTATTTAGTGGACTT

GTTTTCTATGTTGTTAGTAAGATATTGTGGAAGTTATGGAGAAATTCATATGGTTTATCA

TCAATAGTTACACCTCCAAATGTACCATTTTTTGGAACATCTTTGTACTTGCATAGTGAT

GCCCGCAAATTTTTTTTCCAACTATATGACTACACAAGAAGATATGGCGATGTGTTTTGC

ATTTGGTTGGGGCCAAAACCAGTAATATGTTCTTCCTCTGTAAAATTCTCAGAAGCAGTA

TTAAGTAGTCAGAAAGTTATCACCAAAGGATTTTCTTATGATTTTTTGCATGACTGGTTA

AAAACTGGGTTACTTACAAGCACAGGATCAAAATGGAAAACACGTAGAAGGCTACTAACT

CCAAGTTTTCATTTTTCTATACTCAATAACTTTATTAAAATATTCGAAGAGCAAGCATCC

ATTCTGGTGGACAAACTAGCTGTAGCTGCTGACAACAAGGAAGTTGTAGATGTGCAAGTA

CCTATTGGTTTGGCAACCTTGGATATAATCTGCGAAACTTCAATGGGTGTAAAAGTAAAT

GCACAAAGTCATCCAGATTCTGAGTATGTTAAAGCT

ATCACAGTTTTAAATGAAGAAATTCAAATGCGTCAAAAGTTTCCTTGGCTTTG

GTTTGATGCCATTTACAAACTGTTGCCTTGTGGGAAAAGGTTTTATAAGGCTTTAGATGT

TGCTCATAAGCTATCTTTTGATGTAATAAATGAACGCATGCAAATGAAAATTCAAGAATC

TTATTGTGAGACTGCGTCAGATGAAAAGAAATTTTTTTTAGATTTATTGTTAGATATATA

TCGCAAAGGTAAAATTGACACTGAAGGTATTCAAGAAGAAGTTGATACTTTTATGTTTGA

AGGTCATGATACAACTTCAGCTGCACTAGGTTGGACTCTTTGGTTGTTAGGAAAAAATCC

AGATGTTCAAAAAAAGCTGCACAAAGAAATTGATGAGATAGAGTTAAATGGAGGTTCACT

TTATGATAAAGTCAGACAGTCTAAATACCTTGAAATTATTCTTAAAGAATCATTACGAAT

GCATCCTCCTGTCCCTATGTATGGAAGAACAGTTGAGGAAGATATGACTATTGATGGTCA

GTTTGTTCCCAAAGGAGCACAAATAGTTCTTTTAGTTTTAATCTTGCACTCAAACCCTGA

TTATTGGGAAAACCCAAATGATTTTATACCTGAACGT

TTTGAAGCTGATAGTTATGAAAAGCGCAACCC

ATACAGTTATGTACCTTTTTCTGCTGGACCAAGGAATTGCATTGGCCAAAAATTTGCCAT

GATTGAAGAGA

AAATATTACTGTATAGCATAATGAAAAACTTCCATCTTAAGTCAATGCAGAATGAAAATG

AGGTTTTTGGTACTCTTGATATAATTCATAAGTCAATTAATGGAATTAATATAAAGTTCA

CAAGAAGATAA
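
Because this gene is annotated as complete with no introns, its nucleotide block should translate directly into the peptide given above it. The following is a minimal sketch of that check, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative, not part of any tool used for these notes. The `offset` parameter is a plain reading-frame shift and does not model the (0)/(1)/(2) intron phase marks or the leading AG splice-site bases shown on other records.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate DNA starting at `offset`; whitespace and line breaks are
    ignored, and codons with non-ACGT characters are reported as 'X'."""
    dna = "".join(dna.split()).upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


# First nucleotide line of the 1095958068757 gene, copied from above.
first_line = "ATGGTATCAGTTTTTTATATATTATTTAGTGGACTT"
print(translate(first_line))
```

Pasting the full nucleotide block (line breaks included) into `translate` and comparing the result with the annotated peptide is a quick way to spot frameshifts or copy errors.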

 

>1096064105622 very similar to 1095958068757 varies at N-term 86%

346 MIYASYLVLVGLFVFFVSKILWKLWKSSYGLETIATPPNIPVFGTSLYLHSDARKFFFQL 525

526 SEFTKKYGTVFCIWLGPKPMIISSSVKFSEAVLSSQKVITKGFSYDFLHDWLKTGLLTST 705

706 GSKWKTRRRLLTPSFHFSILNNFIKIFEEQASILVDKLAVAADNKEVVDVQVPIGLATLD 885

886 IICETSMGVKVNAQSHPDSEYVKAITVLNEEFVMRIKYPWXLWFDVIYKLLPCGKR

34 aa gap between these two seqs

>CV564924.1 EST 93% to 1095958068757

EKKFFLDLLWDIYRKGEIDTEGIQEEVDTFMFEGHDTTSAALGWTLWLLGKNPDVQRKLHKEIDEIE

LNGGSLYDKVRQSKYLENILKESLRMHPPVPMYGRTVEEDMTIDDQFIPKGAQIILLVLM

LHSNPEYWENPNDFMPERFEADSYEKRNPYSYVPFSAGPRNCIGQKFAMIEEKILLYSIM

KNFHLKSMQDENEVFGTVDVIHKSINGINIMFTRR

GAAAAGAAATTTTTTTTAGATTTGTTATGGGATATATATCGAAAAGGTGAAATTGACACTGAAGGTATTCAAGAAGAAGTTGATACTTTTATGTTTGAAGGTCATGATACTACTTCAGCTGCACTAGGTTGGACTCTTTGGTTGTTAGGAAAAAATCCAGACGTTCAAAGGAAGTTGCACAAAGAAATTGATGAAATAGAGTTAAATGGAGGTTCACTTTATGATAAAGTTAGACAGTCTAAATACCTTGAAAATATTCTTAAAGAATCATTACGAATGCATCCTCCTGTCCCTATGTATGGAAGAACAGTTGAGGAAGATATGACTATTGATGATCAGTTTATTCCCAAAGGAGCACAAATTATTCTTTTAGTCCTAATGTTGCATTCGAACCCAGAATATTGGGAAAATCCAAATGATTTCATGCCTGAACGTTTTGAAGCTGATAGTTATGAAAAGCGCAACCCATACAGTTATGTACCTTTTTCTGCTGGACCAAGGAATTGCATTGGCCAAAAATTTGCCATGATTGAAGAGAAAATATTACTGTACAGCATAATGAAAAACTTCCATCTTAAGTCAATGCAGGATGAAAATGAAGTATTTGGGACTGTTGATGTAATCCATAAATCAATTAATGGAATTAATATAATGTTCACCAGAAGAAAAGGAAAAACTTATCTTGTTTAGTTTAGTTCATTATTTATCAGTAATTTGAAATAAT

 

>1096064105622 90% to 1095958068757 varies at N-term

1096071088011 joins CV564924.1 EST

MIYASYLVLVGLFVFFVSKILWKLWKSSYGLETIATPPNIPVFGTSLYLHSDARKFFFQL

SEFTKKYGTVFCIWLGPKPMIISSSVKFSEAVLSSQ

KVITKGFSYDFLHDWLKTGLLTSTGSKWKTRRRLLTPSFHFSILNNFIKIFEEQASILV

DKLAVAADNKEVVDVQVPIGLATLDIICETSMGVKVNAQSHPDSEYVKAITVLNEEFVMR

IKYPWLWFDVIYKLLPCGKRFYKALDVAHKLSFDVINERMQMKIRESYCETASDEKKFFL

DLLLDIYQKGEIDTEGIQEEVDTFMFEGHDTTSAALGWTLWLLGKNPDVQRKLHKEIDEI

ELNGGSLYDKVRQSKYLENILKESLRMHPPVPMYGRTVEEDMTIDNQFIPKGAQIILLVL

MLHSNPEYWENPNDFMPDRFEADSYEKRNPYSYVPFSAGPRNCIGQKFAMIEEKILLYSIM

KNFHLKSMQDENEVFGTVDVIHKSINGINIMFTRR

 

 

>gnl|ti|655009968 1095963046224     42   0.010 46% to CYP20 35% to 27B1

419 DGGIHKFLVENHKRLGPMFSFYWGKELAVSLACPILFKEVATLFNRP 559

 

>gnl|ti|646849327 1095897329284 1097672251908 mate pair =  1097672200068 has exon 2

40% to 2X2 N-term

MLLQITCGFLFPPLIWIVWTYIKHLYDCLSYPQGPIPLPFIGNAHLLRKGEPYKELVNLGKIYGDVFGFSIGSIRYVVVNNLEGIKEVLIKKGSQFAGRPRLKFTI (1)

ATGCTTCTTGAAATTACTTGTGGGGTTCTGTTCCCACC

GTTAATATGGATTGTCTGGACATATATTAAACATCTTTATGATTGTTTGAGTTATCCACA

AGGACCAATACCACTGCCATTTATAGGAAATGCTCATCTTTTAAGAAAAGGTGAACCTTA

CAAGGAATTAGTTAATCTTGGAAAGATATATGGTGATGTTTTTGGATTTAGTATTGGTTC

AATTAGATATGTAGTTGTAAACAATTTAGAAGGTATTAAGGAAGTTTTGATTAAAAAAGG

TTCACAGTTTGCTGGTCGTCCAAGGCTAAAGTTTACTATTAGT

 

Exon 2 1097331043073  1097206900216  1097672200068  mate pair = 1097672251908

This mate pair has 2 aa diffs to 1095897329284

one nuc diff same aa seq

1096124035195 1096041114543 1097329360644 1096625189581

1095958061778 1095898207031

(1) ALSRGMNGLIMSDPSPHFRILRKLASSSLKIYAEGLDGMEKKAINEYSYLHKKLSTMNGKAVSLKRMI (1)

AGCTTTGAGTAGGGGTATGAATGGCCTTATTATGAGTGATCCT

TCACCACATTTTAGAATTTTACGAAAATTAGCATCATCTTCGTTAAAAATTTATGCTGAA

GGATTAGACGGGATGGAAAAAAAAGCTATAAATGAGTACAGTTATTTGCATAAAAAATTA

TCAACAATGAATGGAAAGGCTGTATCTTTAAAAAGAATGATAGGT

 

>1096124019772 related exon 2 5 aa diffs to 1097331043073

1096123858905 1096123680637

(1) ALTRAMNGLIISDPSPHFKILRKLASSSLKLYAEGLDGMEKKAINEYSYLHKKLSTMNGKAVSLKRMI (1)

 

>1097265020030 new N-term weak with frameshifts and a stop codon

no exact matches exist so this may be poor quality sequence

TCGCTYPQKIWNVL

WTDIKHLSDSESYPQGPISLPI

XXXAHIERKGETYREIDRLR*IYGDDIGMCIGTLRYVDVNNLEGIRDVLIYTGTQFL

ACGTGTGGGTGTACGTACCCACAGAAAATATGGAATGTC

TGGACAGATATAAAACATCTCTCAGATAGTGAGAGTTATCCACAAGGACCAATATCACTGCCAATT

GCACATATAGAAAGAAAAGGTGAGACATACAGGGAGATAGATAGACTTAGATAG

ATATATGGTGATGATATAGGTATGTGTATCGGTACACTTAGATATGT

AGATGTAAACAATTTAGAAGGTATTAGGGACGTTTTGATTTACACAGGTACACAGTTTCT

CTGGT

 

>1096110062131  related exon 2 73% to 1095897329284

1097331675401  1097646001099  1096704247756

(1) AWSRALNGLVACDPGPRFKVLRKLASSSLKIYAEGLDGMEKKAADEYSHLNKKLQTMNGKPVSLQNMI (1)

mate pair of 1097646001099 = 1097664041480, continues on 1096703402618

1097329754969 1097664053056

possible frameshift at NDRP_LHL

(1) ELGTLNIICTILFNHRYEEDDKEFQDIIKYSNLTVKIFGGTSILSSIPWLRFLPSASSRSIYE

IVRIRDPLLKKKLQEHKSSFDENNLRDVTDVLIKVSLGSDIAKGSEEKITDENIEFLLND

FIIAGSETSSSTILWFIVYLLHWPEYQDKLYNEIIKVTSGKRYPCLNDRP ?

LHLTQATIHETLRLSSVGPLAIVHKAMENSSICGKPVPKGAFILTNLWSTH

HDESYWKNPMCFYPERWLEKSGEFNSKLGYAFLPFSGGPRSCLGEALARTELFVFFSRLV

TDYRFEKPNGEELPRLNGRFGLTCSPFDFKSVVVPRC*

AGAGTTAGGTACCCTCAACATCATTTGTACTATTTTGTTCAATCATCGATATGAAGAAGAT

GATAAAGAATTTCAGGATATCATCAAATACTCAAATCTGACTGTTAAAATTTTTGGTGGA

ACAAGCATTTTATCTTCTATTCCATGGCTGCGTTTTTTACCATCAGCTTCTTCAAGAAGC

ATATATGAGATAGTAAGAATACGTGATCCACTTTTGAAAAAAAAGCTACAAGAGCACAAG

AGCTCGTTTGATGAGAATAACTTACGTGATGTGACTGATGTATTAATTAAGGTTTCTTTG

GGTTCAGATATTGCAAAAGGTTCCGAAGAAAAAATTACTGACGAAAACATAGAGTTTCTT

TTAAACGATTTCATAATTGCCGGATCAGAAACTTCATCAAGTACAATTCTTTGGTTTATT

GTTTATCTTTTACATTGGCCAGAATACCAAGATAAACTTTATAACGAAATTATAAAAGTT

ACATCAGGTAAGCGTTACCCATGTTTAAACGATCGCCCc

CTTCATTTAACGCAAGCCACAATTCATGAAACACTTCGATTGTCATCAGTAGGTCCTC

TTGCTATAGTTCATAAAGCGATGGAAAACAGTTCCATATGTGGAAAACCAGTTCCCAAAG

GAGCTTTTATACTAACAAATTTATGGAGTACACATCATGATGAAAGTTATTGGAAAAATC

CAATGTGTTTTTATCCAGAACGTTGGTTAGAAAAATCTGGTGAGTTTAATTCTAAGTTAG

GGTATGCATTTTTGCCGTTTTCAGGCGGACCTCGTAGCTGTTTAGGAGAAGCACTTGCAA

GAACAGAGTTGTTTGTCTTTTTTTCACGATTAGTAACAGATTATCGGTTTGAAAAACCAA

ATGGTGAGGAGTTACCGCGTTTGAATGGTCGTTTTGGTCTCACTTGCTCTCCTTTTGACT

TTAAATCGGTGGTTGTTCCAAGATGTTAA

 

>1097206642797 related exon 2 61% to 1095897329284

1096761288099 1096082164704 1097567110690 1097672343044

(1) DWSRTMNSLINNDLNATFKVLRKITSSSLKIYAEGLVGMEKRAIEEYTHLNKKLLSLKGQAVSIKNMI (1)

AGATTGGAGTAGAACAATGAACAGCCTCATCAATAACGACTTAAATGCAACCT

TTAAAGTTTTACGAAAAATAACATCCTCATCATTAAAGATTTATGCGGAAGGATTGGTGG

GAATGGAAAAAAGAGCTATTGAGGAATACACCCACTTAAATAAAAAGCTTTTATCATTGA

AAGGGCAAGCAGTATCTATTAAAAACATGATTGGT

 

>1097206059080 5 aa diffs to 1095898809307 might be the same gene

(1) GPCKPSHIICTILFNHRYDENDQEFQDIIKYSNLSVRASSATSLISSIPWLRFFPSTASR

NIYEIIRLRDPILKRKLQEHRSSYDENNLRDVTDSLIKVSLDSALENNSHEKITDDNIEF

LLNDFIIAGSETSSNTVLWFIVYMLHWPEYQDKLYNEILKITSGNRYPCLSDRPMLHLMQ

AAIHETLRLSSVAPLGVGHKAMESSSICGKPVPKGAFILTNLWSIHHDETHWNNAMSFYP

ERWLEKSGEFNLKLGEAYLPFSSGPRSCLGETLAKIELFVFISRLVKDYRFEKPTEEDLP

NLKGESGITRTPSEFKVMAIPRN*

AGGGCCGTGCAAACCGTCTCACATAATTTGCACAATACTTTTTAATCATCGATATGATG

AAAATGATCAAGAATTTCAAGATATCATAAAATATTCAAATTTGTCTGTTAGAGCATCTA

GTGCAACCAGTCTTATATCTTCTATTCCATGGTTACGGTTTTTTCCTTCAACTGCTTCAA

GAAATATTTATGAAATAATAAGACTTCGTGATCCGATTTTGAAACGGAAACTTCAAGAAC

ACCGAAGTTCTTATGATGAAAATAATTTACGCGATGTGACTGATTCCTTAATAAAAGTCT

CTTTGGATTCAGCATTGGAAAACAATTCACATGAGAAAATCACAGATGATAACATTGAGT

TTCTTTTAAACGATTTTATAATTGCTGGATCAGAAACGTCGTCAAACACTGTTCTTTGGT

TTATTGTTTATATGTTGCATTGGCCAGAATATCAAGATAAACTTTATAATGAAATTTTAA

AGATAACATCCGGAAATCGTTATCCATGTTTAAGCGATCGCCCTATGCTTCATTTGATGC

AAGCTGCAATTCATGAAACACTTAGACTGTCGTCAGTAGCACCTTTGGGTGTAGGTCATA

AAGCAATGGAAAGCAGTAGCATCTGTGGTAAACCTGTTCCAAAGGGTGCTTTTATATTAA

CAAACTTGTGGAGCATACATCACGATGAGACTCATTGGAATAATGCCATGAGTTTTTATC

CAGAACGTTGGCTGGAAAAATCTGGTGAGTTTAATTTGAAACTTGGTGAAGCGTACTTAC

CATTTTCAAGTGGACCGCGTAGTTGTTTGGGAGAAACATTAGCTAAAATTGAATTGTTTG

TATTTATATCACGGTTAGTAAAAGATTATCGGTTTGAAAAACCAACTGAAGAAGACTTAC

CAAACTTAAAAGGTGAATCTGGCATAACTCGCACTCCTTCTGAATTTAAAGTTATGGCTA

TTCCAAGAAATTAA

 

>gnl|ti|649393684 1095898809307 45% to 17A1 C-term. No exact matches

(1) VYLKLGEAYLPFSSGPRSCLGEALAKIELFIFISRLVKDYRFEKPTEEELPNLKGESGITRIPSEFKVMTIPRN*

AGTTTATTTGAAACTTGGTGAAGCGTACTTACCATTTTC

AAGTGGACCGCGTAGTTGTTTGGGAGAAGCATTAGCAAAAATAGAGTTGTTTATATTTAT

ATCACGGTTAGTAAAAGATTATCGGTTTGAAAAACCAACTGAAGAAGAGTTACCAAACTT

AAAAGGTGAATCTGGCATAACTCGCATTCCTTCTGAATTTAAAGTTATGACTATTCCAAGAAATTAA

 

>gnl|ti|646968536 1095898162561 83% to 1095897329284 37% to 2X2 N-term

1096041100060 1097672010393 1096602125478

MILKVIGSIFFPPLIWFVYSYIKHLIECLYYPKGPVPLPFIGNTNLLRKKETCKEFVNLGKIYGDIFGFSIGSIRYVIVNNLEGIHEVLIKKGSQFSGRPRII (1)

ATGATTCTTAAAGTCATTGGTAGCATTTTTTTC

CCGCCTCTTATTTGGTTTGTCTACAGTTACATCAAACATCTTATAGAATGTTTGTACTAT

CCGAAAGGACCAGTTCCTCTACCGTTCATAGGAAATACAAACTTATTAAGAAAAAAGGAA

ACTTGTAAAGAGTTTGTTAATCTTGGGAAGATATATGGTGATATTTTTGGATTCAGCATT

GGTTCTATTAGATATGTAATTGTTAACAACTTAGAAGGTATTCATGAAGTTTTAATTAAA

AAAGGCTCACAATTTTCTGGTCGACCAAGGATTATATGT

 

>1097509072583 new exon 3 boundary wrong

(0) LWSYTCDKESGTNLTVLDDLSNLSFDIVGDVGFGYQFNTITSHSSNEFTSAVRNLTKMQI 694

NASVFSKVLITCFPFLVKFLLLFGKRRNLIQIVYKTLNK (2)

AGCTTTGGTCATATACATGCGATAAAGA

AAGTGGGACAAACCTAACTGTTCTGGATGATTTGTCTAATCTGTCATTCGATATAGTTGG

TGATGTTGGTTTTGGGTACCAATTTAACACAATCACTTCTCATTCTAGTAATGAATTTAC

TTCAGCTGTTCGGAATTTGACTAAAATGCAAATCAATGCTAGTGTGTTCTCAAAAGTTTT

AATAACTTGTTTTCCATTTTTGGTCAAATTCTTGTTATTGTTTGGAAAGCGTAGAAATCT

TATACAGATTGTTTATAAAACTTTGAACAAGT

 

>gnl|ti|648014530 1095896049543 41% to CYP21

LKYLDCVVK

PANTILRTHVSSIHMNETIYPDPHSFKHERFMTG

AGTCCTGCAAATACAATCC

TACGAACTCATGTTAGCAGTATACACATGAATGAAACTATTTATCCAGATCCTCATTCAT

TTAAACATGAAAGGTTTATGACAGGT

>1096082202706 probably the same as 1095896049543 which has errors

1097664076692 1095994179331

(0) NIEVQEKLREDIQKNILDVNNISFEEVMSLKYLDCVVKETLRLHGPAPLLGRRTISATKF

GEYEVPANTILRTHVSSIHMNETIYPDPHSFKPERFMT (1)

AGAACATAGAAGTTCAAGAGAAACTTAGAGAAGATATCCAGAA

AAACATATTGGATGTAAATAATATTTCTTTTGAGGAAGTTATGAGTTTAAAATATTTGGA

TTGTGTCGTTAAAGAAACCTTGCGCTTACATGGACCTGCACCACTTTTAGGCAGAAGGAC

CATTAGTGCAACAAAATTTGGTGAATATGAAGTTCCTGCAAATACAATCCTACGAACTCA

TGTTAGCAGTATACACATGAATGAAACTATTTATCCAGATCCTCATTCATTTAAACCTGA

AAGGTTTATGACAGGT

 

 

>1097309000937 1097206907008 1095901911044

MFLVCLALIVLFIGLFLLCYLLKRTFHPLRLLPSPKEQLITGHNRYFHGRDHTSTYLSFN 858

EKFKEEGLCTLDTLY (1)

ATGTTTCTAGTATGTCTAGCACTCATAGTTTTATTTATTGGATTA

TTTTTACTGTGTTATTTATTAAAACGTACCTTTCACCCTCTTCGACTTTTACCATCACCA

AAAGAACAACTTATTACTGGTCATAATAGGTACTTTCACGGCCGCGACCATACTAGCACC

TATTTGAGTTTCAACGAAAAGTTTAAAGAAGAAGGTTTATGTACGCTAGATACATTATATGGT

 

>1096091465110 88% to 1097331817678

1096625274441 1096123742264 1097265046825 1095964362241

MFLICLALLILSIGLFFLRYLLKRIFHPLQLLPSPKEQLITGHISHFQGRDHSNTFLGF 701

NEKFKEEGLCTLDTLY (1)

 

>1097331817678  1096526275245  1096124165677 1096110023112 1096761988512

1096701884902

walked down from end of 1096526275245

walked farther from end of 1097672563082; ran into a repeat region

MFVICLALITLFIGLFFLRCLLKRIFHPLRLLPSPKEHLITGHISHFQGRDHSNTFLSFNEKFKEE

GLCTLDTLY (1)

ATGTTTGTGATATGTCTAGCACTCATAACTTTGTTTATTGGATTATTTTTCCTGC

GTTGTTTATTAAAACGTATCTTTCACCCTCTTCGATTATTACCATCACCAAAAGAACATC

TCATTACTGGTCATATTAGTCACTTTCAAGGCCGTGACCATTCTAACACCTTTTTGAGCT

TCAACGAAAAATTTAAAGAAGAAGGTTTATGCACGCTAGATACATTATATGGT

 

>1095899139433 1096703930092 1097509100606 1097675339850  new exon 2

VPRYVYLIAPEFIKKIFADGKLFQRPTTLKILAPLIGNSMLGSNYEDHHWQRKLFNGAFT 549

SQQLKNYFPAFLKHTNLLMK

AGTGCCCAGGTATGTTTATCTAATTGCTCCAGAATTCATTAAAAAGATATT

TGCAGATGGGAAACTTTTTCAAAGGCCTACTACATTAAAAATCTTGGCACCATTAATTGG

AAACAGCATGCTTGGTTCAAATTACGAAGACCATCATTGGCAAAGAAAGTTATTCAATGG

AGCATTTACTTCACAACAGCTGAAAAATTATTTTCCTGCATTTTTAAAGCATACTAATTT

GCTAATGAAAGT

 

>new exon 2 1095899339221 1097206043402 1097672369437 possible frameshift/insertion

(1) GFKFIYLLMPEYIKTMVSNGKVFQKSTAMKVIFPLVGNGMLVSNYEHHHWQRKLFNEAFS

AQQLKKYFPAFKEHT DLLIK (0)

AGGGTTCAAATTTATTTACCTTTTAATGCCAGAATATATTAAAACAA

TGGTTTCTAATGGCAAGGTTTTTCAAAAATCGACTGCAATGAAAGTTATATTTCCTCTAG

TTGGCAACGGTATGCTTGTGTCAAATTATGAACATCACCATTGGCAAAGAAAATTATTTA

ATGAAGCATTTTCTGCACAACAGTTAAAAAAATATTTTCCTGCATTTAAAGAGCATACTA

ATAAAAGATTTACTAATAAAAGT

 

>1095964240637 1097516021618 1096705343938 1095900018167  1096607016658  new exon 2

(1) GFRFVDLLLPEFIKTIFSDGKVFHRSNVLKVLFPLVGNGMIVSNYEDHHWQRKVLNEAFT 854

SQQLKNYFPAFTLHTDLLMK (0)

AGGTTTCAGATTTGTTGATCTATTATTGCCAGAATTTATTAAAACAA

TATTTTCTGATGGTAAAGTTTTTCACAGATCGAATGTTTTGAAAGTTTTGTTTCCTCTAG

TTGGAAATGGTATGATTGTATCAAATTATGAAGATCATCATTGGCAAAGAAAAGTTTTAA

ATGAAGCTTTTACCTCCCAACAGCTAAAGAATTATTTTCCTGCTTTTACATTGCATACTG

ATTTGCTAATGAAAGT

 

>1097675832709 new exon 1 with one possible frameshift or there is another exon

MCMVYIAVLILLCLIVFF

ANVLKRFYHPLRNFPSPQENLITGHYSYFYRYDHVKTLLNFGKQFEKNGLYTLDTLN (1)

ATGTGTATGGTTTATATAGCAGTATTGATTTTAT

TATGTTTAATAGTATTCTTTGCTAATGTTTTAAAGCGTTTTTATCATCCGCTTCGTAAT

TTTCCCTCACCTCAAGAAAATTTAATTACAGGCCATTATAGCTATTTTTATCGTTATGAT

CATGTCAAGACTTTGTTAAATTTTGGAAAGCAGTTTGAAAAGAATGGCTTATATACATTA

GATACATTAAATGGT

 

N-terminal EST sequences for hydra P450s

>DN812964.1 ACAC-aac48b12.g1 Hydra EST UCI 7..same as DN812371.1

MYTIGIAVLIFLCFSLFFANILKRFYHPLRKLPSPKENFFTAHYGYFNGYDQINAVINFG 280

KQFKERGLYTLDTLN

>DN810769.1 ACAC-aac19b13.g1 Hydra EST UCI 7.. same as DN812371.1

MYTIGIAVLIFLCFSLFFANILKRFYHPLRKLPSPKENFFTAHYGYFNGYDQINAVINFG 256

KQFKERGLYTLDTLN

>DN816152.1 ACAC-aac24b14.g1 Hydra EST UCI 7.. same as DN812371.1

IAVLIFLCFSLFFANILKRFYHPLRKLPSPKENFFTAHYGYFNGYDQINAVINFGKQFKE 199

RGLYTLDTLN

>CN775805.1 tae77f11.x1 Hydra EST Darmstadt .. same as DN812371.1

IAVLIFLCFSLFFANILKRFYHPLRKLPSPKENFFTAHYGYFNGYDQINAVINFGKQFKE 185

RGLYTLDTLN

>BP514308.1 BP514308 Hydra magnipapillata c...have this one

MYSIYIAIIIVPLVFFVAVFFKRFYHQFRLLPSPKESLITCHYSYFDVHDHVNTLLNFG 208

KEFKDYGLYTINTL

>BP514307.1 BP514307 Hydra magnipapillata c...same as BP514308.1

MYSIYIAIIIVPLVFFVAVFFKRFYHQFRLLPSPKESLITCHYSYFDVHDHVNTLLNFG 208

KEFKDYGLYTINTL

>BP505238.1 BP505238 Hydra magnipapillata c... same as BP514308.1

MYSIYIAIIIVPLVFFVAVFFKRFYHQFRLLPSPKESLITCHYSYFDVHDHVNTLLNFG 209

KEFKDYGLYTINTL

>CO509836.1 tai58f02.y1 Hydra EST UCI 5 ALP .. same as BP514308.1

IYIAIIIVPLVFFVAVFFKRFYHQFRLLPSPKESLITCHYSYFDVHDHVNTLLNFGKEF 181

KDYGLYTINTL

>DN813094.1 ACAC-aab89g09.g1 Hydra EST UCI 7..= 1097675463974

MFLICLALLILSIGLFFLRYLLKRIFHPLQLLPSPKEQLITGHISHFQGRDHSNTFLGFN 303

EKFKEEGLCTLDTL

>DN603400.1 ACAC-aac10m18.g1 Hydra EST UCI 7..= 1097675463974

same as 1096091465110 DN813094.1 DN137655.1

MFLICLALLILSIGLFFLRYLLKRIFHPLQLLPSPKEQLITGHISHFQGRDHSNTFLGFN 283

EKFKEEGLCTLDTL

>DN137655.1 ACAE-aaa07c04.g1 Hydra EST UCI 5.. ..= 1097675463974

same as 1096091465110 DN813094.1 DN137655.1

LICLALLILSIGLFFLRYLLKRIFHPLQLLPSPKEQLITGHISHFQGRDHSNTFLGFNEK 192

FKEEGLCTLDTL

>CN567598.1 tag12b09.x1 Hydra EST -Kiel 1 Hy..we have this one

LLKRIFHPLRFLPSPKEQLITGHINHFQGRDHSSTYLSFNEKFK