[Bioperl-l] Problems reading Genbank file

gert thijs gert.thijs@esat.kuleuven.ac.be
Tue, 14 Nov 2000 20:40:08 +0100


Hello,

I am trying to read some sequences from a GenBank flat file and store them
in a hash, with the accession number of each sequence as the key. But when I
print the accession number, all I get is 'unknown'.
Here is the code I use to read the GenBank sequences and store them in a hash.
I have included the test file I use as an attachment to this mail.

 
use Bio::SeqIO;

# read all data from temporary gb flat file
my $inStream = Bio::SeqIO->new(-file => "<test.gb", -format => 'Genbank');
my %seqList = ();
while ( my $seq = $inStream->next_seq() ) {
    # use the accession number of each record as the hash key
    my $key = $seq->accession_number;
    print "$key \n";
    $seqList{$key} = $seq;
}
$inStream->close;
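
As a cross-check that does not depend on BioPerl, the accession numbers can also be read straight from the ACCESSION lines of the flat file. What follows is only a minimal sketch in plain Perl. The helper name accessions_in_genbank is made up for illustration and is not a Bio::SeqIO method. It assumes every record carries an ACCESSION line and is terminated by a line starting with "//".

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the primary accession
# of each record in a GenBank flat file, read directly from the ACCESSION lines.
sub accessions_in_genbank {
    my ($fh) = @_;
    my @accessions;
    my $seen = 0;    # true once the current record's ACCESSION line was read
    while ( my $line = <$fh> ) {
        if ( $line =~ m{^//} ) {
            # "//" terminates a record, so start looking again
            $seen = 0;
        }
        elsif ( !$seen && $line =~ /^ACCESSION\s+(\S+)/ ) {
            # keep only the first token, the primary accession
            push @accessions, $1;
            $seen = 1;
        }
    }
    return @accessions;
}

# Usage: perl script.pl test.gb
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    print "$_\n" for accessions_in_genbank($fh);
    close $fh;
}
```

Run on the attached test.gb, this should list AF016236 and AF057044, which can then be compared with what accession_number returns for the same records.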


Thanx,
Gert Thijs

 
==========================================================
+ Gert Thijs              gert.thijs@esat.kuleuven.ac.be +
+                                                        +
+ Dept. Elektrotechniek ESAT-SISTA                       +
+ Kardinaal Mercierlaan, 94                              +
+ B-3001 HEVERLEE  Belgium                               +
+ Tel :  +32-16-32 18 84 ---- Fax : +32-16-32 19 70      +
==========================================================
Attachment: test.gb

LOCUS       AF016236     7990 bp    DNA             BCT       06-JAN-1998
DEFINITION  Rhodobacter sphaeroides DMSO/TMAO-sensor kinase (dorS),
            DMSO/TMAO-response regulator (dorR), DMSO/TMAO-cytochrome
            c-containing subunit (dorC), DMSO-membrane protein (dorB), and
            DMSO/TMAO-reductase (dorA) genes, complete cds.
ACCESSION   AF016236
VERSION     AF016236.1  GI:2353766
KEYWORDS    .
SOURCE      Rhodobacter sphaeroides.
  ORGANISM  Rhodobacter sphaeroides
            Bacteria; Proteobacteria; alpha subdivision; Rhodobacter group;
            Rhodobacter.
REFERENCE   1  (bases 1 to 7990)
  AUTHORS   Mouncey,N.J., Choudhary,M. and Kaplan,S.
  TITLE     Characterization of genes encoding dimethyl sulfoxide reductase of
            Rhodobacter sphaeroides 2.4.1T: an essential metabolic gene
            function encoded on chromosome II
  JOURNAL   J. Bacteriol. 179 (24), 7617-7624 (1997)
  MEDLINE   98062189
REFERENCE   2  (bases 1 to 7990)
  AUTHORS   Mouncey,N.J., Choudhary,M. and Kaplan,S.
  TITLE     Direct Submission
  JOURNAL   Submitted (25-JUL-1997) Microbiology and Molecular Genetics,
            University of Texas Medical School, 6431 Fannin, Houston, TX 77030,
            USA
FEATURES             Location/Qualifiers
     source          1..7990
                     /organism="Rhodobacter sphaeroides"
                     /strain="2.4.1T"
                     /db_xref="taxon:1063"
                     /chromosome="2"
     RBS             159..165
     gene            175..2622
                     /gene="dorS"
     CDS             175..2622
                     /gene="dorS"
                     /codon_start=1
                     /transl_table=11
                     /product="DMSO/TMAO-sensor kinase"
                     /protein_id="AAB94870.1"
                     /db_xref="GI:2353767"
                     /translation="MIAEKSERFFPFAVSAELAPVGVSAAERSALADYLESSETLLTE
                     RVVAYASTRSYSHLVSTLPEAWRSSVQGLTDSVILMLDHQSAEAAIDYDADIGTDPST
                     AYGIEAGLRHRLRGISLETFIGAFKGYRDVYLNVTAEARVPAAMREGWLRLLRGFFDR
                     AEIGICAHWSGELGTLDHDQLLSVNRALVNEKNKYLTIFESMNNPVLLVDDGGRIENM
                     NFAAARLFLQDALPGSVYYGPEANLRFADLAGFDLEAVKLRGEAGDVLTCIGERWYSI
                     TAQEMLDVSRKFVGIVVTFHDVTEARRAREQAEALARAKTDFLATMSHEIRTPIHSIG
                     GVTELLKQSELASRDRGYVDAIERSTEVLASIVSDVLDYARIESGLVELEQVDFSIDQ
                     ILDDVARMMQPLVRRKPQLRIVIERTDLPAGPGKMQASLRQILINLTSNAVKFTPEGT
                     VVIGAERLAGGHRFRFTVSDTGPGIAAEKLEEIFKPYIQSDSSISRRHGGTGLGLAIC
                     RRLAGHLGGRLDVRSTPGFGSRFTLEVALAPGGSEPPGDATGDAPPPARALDLLVVED
                     DEVNALVAQSLLSAAGHGVRVAGTGEAALVRPRRSTAFDLVLTDLNLPDMDGLELART
                     IRRHADRHTAELPLVALSAHGPGVDPAALTEAGIDAFLGKPFHFARLEEILSRLVGSS
                     SPTPLGKAAPARRALQSVDLCVLRGHAEALGRTSAARIVQTFRQSVLETARALELAMD
                     EADMRSVTSLAHRLKGAARHLGFRGLSDKAEQVETAAAADGCEAALVLELVSDCRAAP
                     ALADLAWAEASAGVAES"
     gene            complement(2640..3340)
                     /gene="dorR"
     CDS             complement(2640..3340)
                     /gene="dorR"
                     /codon_start=3
                     /transl_table=11
                     /product="DMSO/TMAO-response regulator"
                     /protein_id="AAB94871.1"
                     /db_xref="GI:2353768"
                     /translation="MKKNYHMLVVEDDPVSRQTLAMYLRKENHEVSEARDGEQMRRVF
                     PKGDVDVVMLDINMPGKDGLSILRELPRQSEVGIIMVTSRKEDVDRIVALEFGADDYV
                     TKPYNMREILPRAKNFARRVAALRLVRPDQPATTFDGWTLDAAHWALTDPAGNHVKLT
                     RAEFELLATFVAHPGQVLTRDQLMNHVGRRGHETFDRTIDVLVRRIRRKIEADPSDPR
                     LIVTVHGIGYVFQA"
     RBS             complement(3350..3355)
     misc_feature    3422..3432
                     /note="putative DorR binding-site"
     misc_feature    3433..3443
                     /note="putative DorR binding-site"
     misc_feature    3454..3464
                     /note="putative DorR binding-site"
     misc_feature    3476..3486
                     /note="putative DorR binding-site"
     RBS             3535..3543
     gene            3571..4785
                     /gene="dorC"
     CDS             3571..4785
                     /gene="dorC"
                     /codon_start=1
                     /transl_table=11
                     /product="DMSO/TMAO-cytochrome c-containing subunit"
                     /protein_id="AAB94872.1"
                     /db_xref="GI:2353769"
                     /translation="MGRSRGRASEAKVISRIWKAFWRPSTKWGLGVLLVTGGIAGAVG
                     WNGFHYVVEKTTTTEFCISCHSMRDNNYEEYKTTIHYQNTSGVRAECADCHVPKSGWK
                     LYRAKLLAAKDLWGEIQGTIDTREKFEAHRLEMAETVWADMKANDSATCRTCHSFNAM
                     DFAHQKPEASKQMQQAMNEGGTCIDCHKGIAHKLPDMASGYRALFSKLEKASQSLKPS
                     KGETLYPLQTIEAYLERPSGDKAKGDGRLLAATPMQVVDVKGEWVQVAVKGWQQEGAE
                     RVIYEKQGKRIFNAALAPTATGSIVAGASMVDPDTEQTWTDVSLTAWVRNRDLTDDQE
                     ALWQYGKQMFNGACGMCHVLPHTEHFLANQWIGTLNAMKSRAPLDDEQFRLVQRYVQM
                     HAKDVEPEGAAE"
     RBS             4767..4774
                     /gene="dorC"
     gene            4782..5462
                     /gene="dorB"
     CDS             4782..5462
                     /gene="dorB"
                     /codon_start=1
                     /transl_table=11
                     /product="DMSO-membrane protein"
                     /protein_id="AAB94873.1"
                     /db_xref="GI:2353770"
                     /translation="MTFAHSFPSAHMPVPAPAAGAGEIAPLCAWLAEVFIAPPSAPEI
                     GAYRRGEAAAWLASLAADPDFAPGAAAMRQALAGEGSDEALAARLGTAFNRLFLGFGG
                     RRTVVPCESAWRGNGRLYQAPAAEMQHLFARADLSLGAGCVEPPDHISVELALLSFLL
                     VSGDPGTSAMKERLQGWIPAFCARCLEEDTTGFWGGAARLLTAAVAACPARDEARQDR
                     HTEERKAR"
     RBS             5443..5454
                     /gene="dorB"
     gene            5459..7927
                     /gene="dorA"
     CDS             5459..7927
                     /gene="dorA"
                     /codon_start=1
                     /transl_table=11
                     /product="DMSO/TMAO-reductase"
                     /protein_id="AAB94874.1"
                     /db_xref="GI:2353771"
                     /translation="MTKLSGQELHAELSRRAFLSYTAAVGALGLCGTSLLAQGARAEG
                     LANGEVMSGCHWGVFKARVENGRAVAFEPWDKDPAPSHQLPGVLDSIYSPTRIKYPMV
                     RREFLEKGVNADRSTRGNGDFVRVTWDEALDLVAKELKRVQESYGPTGTFGGSYGWKN
                     PGRLHNCQVLMRRALNLAGGFVNSSGDYSTGAAQIIMPHVMGTLEVYEQQTAWPVVVD
                     NTELMVFWAADPVKTNQIGWVVPDHGAFAGMQAMKEKGTKVICINPVRTETADYFGAE
                     LVSPRPQTDVALMLGMAHTLYSEDLHDKDFIENCTSGFDIFAAYLTGESDGTPKTAEW
                     AAEICGLPAEQIKELARRFVGGRTMLAAGWSIQRMHHGEQAHWMLVTLASMIGQIGLP
                     GGGFGLSYHYSNGGSPTSDGPALGGISDGGKPVEGAAWLSASGAASIPCARVVDMLLN
                     PGGEFQFNGATATYPDVKLAYWVGGNPFAHHQDRNRMLKAWEKLETFIVQDFQWTATA
                     RHADIVLPATTSYERNDIESVGDYSNRAILAMKKVVDPLYEARSDYDIFAALTERLGK
                     GKEFTEGRDEMGWISSFYEAAVKQAEFKQMEMPSFEDFWSEGIVEFPITEGANFVRYA
                     DFREDPLFNPLGTPSGLIEIYSKNIEKMGYDDCPAHPTWMEPAERLGGPGAKYPLHVV
                     ASHPNSRLHSQLNGTSLRDLYAVAGHEPCLINPDDAAARGIADGDVLRVFNDRGQILV
                     GAKVSDAVMPGAIQVYEGGWYDPLDPSEEGTLDKYGDVNVLSLDVGTSKLAQGNCGQT
                     ILADVEKYAGAPVTVTVFDTPKGP"
     stem_loop       7932..7963
BASE COUNT     1345 a   2668 c   2681 g   1296 t
ORIGIN      
        1 tccgcatttg acgtcaatca aggattgtcc cgcattaacc tatcagatcg gccgagacgg
       61 tctgccgcag tcgaaggcgg cggatcatgg agatggacgt gccgggcgcg cggaacgggc
      121 aaggggctcg cgccccggag ccccacttca tgcgccttgg aaggagtttg gtcgatgata
      181 gccgagaagt cggagcggtt cttccccttt gcggtcagtg cggaacttgc gcccgtgggc
      241 gtctcggcgg ccgaacggag cgcgcttgcc gactatctgg agtcgagcga gacccttctg
      301 accgaacgcg ttgtcgccta cgccagcacc cgcagctaca gccacctcgt ctcgacgctg
      361 cccgaggcct ggcgctcgtc cgttcagggg ctgacggact ccgtcatcct catgctcgac
      421 caccagtcgg ccgaagccgc catcgactat gacgcagata tcggcaccga tcccagcacc
      481 gcctatggca tcgaggccgg tctccgccac cgcctgcggg gcatctcgct cgagaccttc
      541 atcggcgcct tcaagggcta tcgcgatgtc tatctgaatg tcaccgccga ggcgcgggtt
      601 cccgccgcga tgcgcgaggg gtggttgcgc ctcctgcggg gtttcttcga ccgggccgag
      661 attgggatct gcgcccattg gagcggcgag ctgggcactc tcgatcacga ccagctgctc
      721 tcggtgaacc gggcgctcgt caacgagaag aacaagtatc tgaccatctt cgaaagcatg
      781 aacaatccgg tgctgctggt ggatgacggc gggcgcatcg agaacatgaa cttcgccgcc
      841 gcgcgtctct tcctgcagga tgccctgccg ggctcggtct attacgggcc ggaggcgaac
      901 ctcaggttcg ccgatctcgc gggtttcgac ctcgaggcgg tgaagctgcg cggcgaggcg
      961 ggcgatgtgc tgacctgcat cggcgagcgc tggtactcga tcacggcgca ggagatgctg
     1021 gacgtcagcc gcaagttcgt gggcatcgtg gtcaccttcc acgacgtgac cgaagcgcgg
     1081 cgggcgcgcg aacaggccga ggcgctggcc cgcgccaaga ccgacttcct cgccacgatg
     1141 agccacgaga tccgcacccc gatccacagc atcggggggg tcaccgaact tctcaagcag
     1201 tccgagcttg cctcccgcga ccgcggctat gttgatgcga tcgagcggtc gaccgaggtg
     1261 ctcgcctcga tcgtgagcga cgtgctcgat tacgcgcgga tcgagtccgg gctggtcgag
     1321 ctcgagcagg tggatttctc gatcgaccag atcctcgacg atgtggcgcg gatgatgcag
     1381 ccgctggtgc gccgcaagcc gcagcttcgc atcgtgatcg agcggacgga cctgcccgcc
     1441 ggtcctggga agatgcaggc aagcttgcgg cagatcctca tcaatctcac gagcaacgcg
     1501 gtgaagttca ccccggaggg aaccgttgtg atcggggccg agcgcctcgc cggcggccat
     1561 cgcttccgct tcaccgtgag cgataccggg ccgggcatcg cggccgagaa gctcgaggag
     1621 atcttcaaac cctatatcca gtccgacagc tcgatctcgc gccgccacgg cggcaccggc
     1681 ctcggtctcg cgatctgccg gaggctcgcc ggacatctgg gggggcgcct cgacgtgcgc
     1741 agcacgcccg gcttcggcag ccgcttcacg ctggaagtgg cgctcgctcc gggcgggagc
     1801 gagcccccgg gcgatgcgac gggcgacgcg cctccgccgg cacgggcgct ggatctgctg
     1861 gtggtcgagg atgacgaggt gaatgcgctg gtggcgcaga gcctgctctc ggctgccggc
     1921 cacggcgtcc gggtcgccgg caccggcgag gcggcgctcg ttcgccctcg gcggagcacc
     1981 gctttcgacc tcgtgctgac ggacctcaac ctgccggaca tggacgggct ggagctggcc
     2041 cgcacgatcc gccgccacgc cgacaggcac acggcggagc tgccgttggt ggcgctctcc
     2101 gcgcatggcc cgggtgtgga tccggcggcg ctgaccgagg cggggatcga cgccttcctg
     2161 ggcaaaccct tccatttcgc gcgtcttgaa gagatcctct cccgtctggt cggaagttcc
     2221 tcgcccacgc cgctgggcaa ggccgcgccg gcccggcgcg ccttgcagtc ggtggatctc
     2281 tgcgtgctgc ggggtcatgc cgaggcgctg ggacgtacct ctgccgcacg gatcgtccag
     2341 accttccggc agagcgtgct cgagacggcc cgtgcgctgg aactggcgat ggatgaggcc
     2401 gacatgcggt ccgtgacgtc tctggcccac cggctgaagg gggcggcgcg gcatctgggc
     2461 ttcaggggtc tctccgacaa ggcagagcag gtcgagaccg cggccgcggc ggacggctgc
     2521 gaggccgctc tggtgctcga gctggtctcc gactgccgcg ccgctccggc gctggccgat
     2581 ctcgcctggg ccgaagccag tgccggagtg gccgagagct gaggcggtct cttccggcgt
     2641 caggcctgga agacgtagcc gatgccgtga acggtcacga tcagccgcgg gtcggaaggg
     2701 tccgcctcga tcttgcggcg gatgcggcgc actagcacgt cgatggtccg gtcgaaggtc
     2761 tcgtgcccgc ggcggccgac atggttcatc agctggtcgc gggtcaggac ctgcccggga
     2821 tgggccacga aggtggccag cagctcgaac tcggcccgtg tcagtttgac atgattgccc
     2881 gcgggatcgg tcagcgccca atgggccgcg tcgagggtcc agccgtcgaa ggtcgtcgcc
     2941 ggctggtccg gccgcaccag ccgcagggcc gccacccgcc gggcgaaatt ctttgcccgc
     3001 ggcaggatct cgcgcatgtt gtatggcttg gtcacataat cgtccgcgcc gaactccagc
     3061 gccacgatcc ggtccacatc ctccttccgg ctcgtcacca tgatgatgcc gacctcggac
     3121 tgccggggca gttcccgcag aattgacagc ccgtccttgc ccggcatgtt gatgtcgagc
     3181 atcaccacat ccacgtcgcc cttggggaag acgcggcgca tttgttcgcc gtcgcgcgct
     3241 tcgctgactt cgtgattttc cttgcgcaga tacatcgcga gcgtctggcg gctgaccgga
     3301 tcgtcttcga cgacaagcat gtggtagttt ttcttcatga cgcgcgaggt ctcctgcggc
     3361 cggttggacc taatgcaccc tttcgcgccc cgatttcaac ggcaactcat tcacttggcc
     3421 gctgttaaca tcctgttcac atcattttac gccaggttaa caatctgacg caacgcggtt
     3481 cacaccgctc ctccaccttg gctttcaaca gaggcagcaa gccggtggac cttcggggaa
     3541 ggaccggcgc gcccgccgca ttcctgcggc atggggcgtt ctcgcggtcg ggcttcggag
     3601 gcaaaagtga tcagcaggat ttggaaggct ttctggcgac cgagcacgaa atgggggctc
     3661 ggcgtcctgc tcgtgaccgg cggcatcgcc ggtgcggtcg gatggaacgg gttccactat
     3721 gtggtggaaa agaccaccac gacggaattc tgcatcagct gccactcgat gcgggacaac
     3781 aactacgagg aatacaagac caccatccac taccagaaca cctcgggcgt gcgggcggaa
     3841 tgcgccgact gtcacgtccc gaaatccggc tggaagctct accgcgcgaa gctcctcgcc
     3901 gcgaaggacc tctggggcga aattcagggc accatcgaca cgcgtgagaa gttcgaggcg
     3961 caccggctcg agatggccga gaccgtctgg gccgacatga aggccaacga ctcggccacc
     4021 tgccggacct gccactcgtt caacgcgatg gacttcgccc accagaagcc cgaggcctcg
     4081 aagcagatgc agcaggcgat gaacgagggc ggaacctgca tcgactgcca caagggcatc
     4141 gcccacaagc tgcccgacat ggccagcggc taccgcgcgc tgttctcgaa gctcgagaag
     4201 gcctcgcagt cgctcaagcc cagcaagggc gagacgctct atccgctcca gaccatcgag
     4261 gcctatctcg agcggccctc gggcgacaag gcgaagggcg acgggcggct tctggccgcg
     4321 acgccgatgc aggtggtcga cgtgaagggt gagtgggtgc aggtcgcggt gaagggctgg
     4381 cagcaggaag gcgccgagcg ggtcatctac gagaagcagg gcaagcggat tttcaacgcc
     4441 gcactggcgc cgacggccac gggctcgatc gtggcgggcg cgtccatggt cgatccggac
     4501 accgaacaga cctggacgga tgtctcgctg acggcgtggg tgcgcaaccg cgacctgaca
     4561 gacgaccagg aagcgctctg gcagtatggc aagcagatgt tcaacggtgc ctgcggcatg
     4621 tgtcacgtcc tgccccacac cgagcatttc ctcgccaacc agtggatcgg cacgctcaac
     4681 gccatgaaga gccgggcgcc gctcgatgac gaacagttcc gcctcgtgca gcgctacgtc
     4741 cagatgcatg cgaaggacgt ggaaccggaa ggagctgcgg aatgaccttc gcgcattcct
     4801 tccccagcgc ccacatgccc gtcccggcgc ctgccgccgg ggccggcgag atcgccccgc
     4861 tctgtgcctg gctggccgaa gtgttcatcg ccccgccgtc ggcccccgag atcggcgcct
     4921 atcgccgcgg ggaagccgcg gcctggttgg ccagccttgc ggccgacccc gacttcgccc
     4981 ccggcgccgc cgccatgcgg caggcgctgg ccggggaggg cagcgacgaa gccctcgcag
     5041 cccggctcgg gacggccttc aaccggctgt tcctcggctt cggcggccgc cgcacggtgg
     5101 tgccgtgcga atccgcctgg cggggaaacg ggcggcttta tcaggccccg gcggccgaga
     5161 tgcagcatct cttcgcccgg gccgaccttt cgctcggcgc aggctgcgtc gagccgcccg
     5221 accacatctc ggtcgagctc gcgctcctgt ccttcctgct cgtgagcggg gatcccggca
     5281 ctagcgccat gaaagaacgc ctgcagggct ggatcccggc cttctgcgca cgttgcctcg
     5341 aagaggatac gacgggcttc tggggaggcg ccgcgcgtct cctgaccgcc gcggtggccg
     5401 catgccccgc ccgggacgaa gcccggcaag accgtcatac ggaagaaagg aaagccagat
     5461 gactaagttg tcaggtcagg agctgcatgc cgaactctcg cggcgcgcct tcctgagcta
     5521 tacggcggct gtgggggctc tcggtctctg cggcacctcg ctcctcgcgc agggagcccg
     5581 cgcggaaggt ctcgccaacg gcgaggtcat gtcgggctgc cactggggcg tgttcaaggc
     5641 ccgggtcgag aacggccgcg ccgtggcctt cgagccctgg gacaaggacc ccgcgccgtc
     5701 gcaccagctg ccgggcgtgc tcgattcgat ctattcgccc acgcggatca aatatccgat
     5761 ggtgcgccgc gaattcctcg agaagggcgt gaacgccgac cgctccaccc gcggcaacgg
     5821 cgacttcgtc cgcgtcacct gggatgaagc gctcgacctc gtggccaagg aactgaagcg
     5881 cgttcaggaa agctacgggc ccaccggcac cttcggcggc tcctacggct ggaaaaaccc
     5941 gggccggctg cacaactgtc aggtcctcat gcgccgcgcg ctgaatctcg cgggcgggtt
     6001 cgtgaactcg tcgggcgact attcgaccgg cgccgcgcag atcatcatgc cgcatgtcat
     6061 gggcacgctc gaggtctacg agcagcagac cgcctggccc gtggtggtgg acaacaccga
     6121 actgatggtc ttctgggccg ccgatccggt gaagaccaac cagatcggct gggtggtccc
     6181 cgaccatggc gccttcgcgg gcatgcaggc aatgaaggaa aagggcacca aggtcatctg
     6241 catcaacccc gtgcgcaccg agacggccga ctatttcggc gccgaactcg tgtcgccgcg
     6301 gccgcagacc gacgtggcgc tgatgctcgg catggcgcac acgctctaca gcgaagatct
     6361 gcacgacaag gacttcatcg aaaactgcac ctcgggcttc gacatcttcg cggcctacct
     6421 gaccggcgag agcgacggca cgcccaagac ggccgaatgg gccgccgaga tctgcggcct
     6481 gccggccgag cagatcaagg aactcgcccg ccgcttcgtg ggcggccgga cgatgctcgc
     6541 cgcgggctgg tcgatccagc ggatgcacca tggcgaacag gcgcactgga tgctcgtcac
     6601 gctggcctcg atgatcggcc agatcggtct tccgggcggc ggcttcggcc ttagctacca
     6661 ttactccaac ggtggctcgc ccacgagcga cggcccggcg ctgggcggta tttcggacgg
     6721 cggcaagccg gtcgaaggtg cggcctggct gtcggcgagc ggcgcggctt cgatcccctg
     6781 cgcccgggtg gtggacatgc tgctcaatcc gggcggcgag ttccagttca acggtgccac
     6841 ggcgacctat cccgacgtga agctggccta ctgggtgggc ggcaacccct tcgcgcacca
     6901 ccaggaccgc aaccggatgc tcaaggcctg ggaaaagctc gagaccttca tcgtgcagga
     6961 cttccagtgg accgccaccg cgcgccacgc cgacatcgtc ctgccggcga cgacctccta
     7021 cgaacgcaac gacatcgagt cggtgggcga ctattcgaac cgcgccatcc tcgcgatgaa
     7081 gaaggtggtc gatccgctct acgaggcccg gtcggactac gacatcttcg cagccctgac
     7141 ggagcgtctg ggcaagggca aggaattcac cgaaggccgc gacgagatgg gctggatcag
     7201 ctcgttctac gaggcggcgg tgaagcaggc cgagttcaag cagatggaga tgccgtcgtt
     7261 cgaggacttc tggtcggaag ggatcgtcga gttcccgatc accgagggcg cgaacttcgt
     7321 tcgctatgcc gacttccgcg aggatccgct gttcaacccc ctcggcacgc cctcgggcct
     7381 gatcgagatc tactcgaaga acatcgagaa gatgggctat gacgattgcc cggcccatcc
     7441 gacctggatg gaaccggccg agcgtctcgg cgggccgggg gcgaaatatc cgctccatgt
     7501 ggtggcgagc cacccgaact cgcggctgca ctcgcagctg aacggcacct cgctgcgcga
     7561 cctctatgcg gtggcggggc acgagccctg tctcatcaac cccgacgatg cggccgcgcg
     7621 cggcatcgcg gacggcgatg tgctgcgggt gttcaacgac cgcgggcaga tcctcgtggg
     7681 cgcgaaggtg agcgacgcgg tgatgccggg cgcgatccag gtctacgagg gcggctggta
     7741 cgacccgctc gacccctcgg aggaaggcac gctcgacaaa tacggcgacg tgaacgtgct
     7801 gtcgctcgac gtcggcacct cgaagctggc gcagggcaac tgcggccaga ccatcctcgc
     7861 ggatgtcgaa aaatatgcgg gcgcgccggt gacggtgacc gtgttcgaca cgccgaaggg
     7921 accctgaggc gccccggccg gggcggcggt tcccccgccc gccttcacct tccccggccc
     7981 gcaccgcttg 
//




LOCUS       AF057044     2300 bp    mRNA            PLN       15-APR-1998
DEFINITION  Arabidopsis thaliana acyl-CoA oxidase (ACX1) mRNA, complete cds.
ACCESSION   AF057044
VERSION     AF057044.1  GI:3044213
KEYWORDS    .
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
            Magnoliophyta; eudicotyledons; core eudicots; Rosidae; eurosids II;
            Brassicales; Brassicaceae; Arabidopsis.
REFERENCE   1  (bases 1 to 2300)
  AUTHORS   Hooks,M.A., Kellas,F. and Graham,I.A.
  TITLE     An acyl-CoA oxidase gene of Arabidopsis thaliana
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2300)
  AUTHORS   Hooks,M.A., Kellas,F. and Graham,I.A.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-APR-1998) Division of Biochemistry and Molecular
            Biology, University of Glasgow, University Ave., Glasgow G12 8QQ,
            United Kingdom
FEATURES             Location/Qualifiers
     source          1..2300
                     /organism="Arabidopsis thaliana"
                     /cultivar="Columbia"
                     /db_xref="taxon:3702"
                     /tissue_type="seedling hypocotyl"
                     /clone_lib="CD4-15: Keiber"
                     /dev_stage="3 days old"
     gene            1..2300
                     /gene="ACX1"
     CDS             77..2071
                     /gene="ACX1"
                     /EC_number="1.3.3.6"
                     /codon_start=1
                     /product="acyl-CoA oxidase"
                     /protein_id="AAC13498.1"
                     /db_xref="GI:3044214"
                     /translation="MEGIDHLADERNKAEFDVEDMKIVWAGSRHAFEVSDRIARLVAS
                     DPVFEKSNRARLSRKELFKSTLRKCAHAFKRIIELRLNEEEAGRLRHFIDQPAYVDLH
                     WGMFVPAIKGQGTEEQQKKWLSLANKMQIIGCYAQTELGHGSNVQGLETTATFDPKTD
                     EFVIHTPTQTASKWWPGGLGKVSTHAVVYARLITNGKDYGIHGFIVQLRSLEDHSPLP
                     NITVGDIGTKMGNGAYNSMDNGFLMFDHVRIPRDQMLMRLSKVTREGEYVPSDVPKQL
                     VYGTMVYVRQTIVADASNALSRAVCIATRYSAVRRQFGAHNGGIETQVIDYKTQQNRL
                     FPLLASAYAFRFVGEWLKWLYTDVTERLAASDFATLPEAHACTAGLKSLTTTATADGI
                     EECRKLCGGHGYLWCSGLPELFAVYVPACTYEGDNVVLQLQVARFLMKTVAQLGSGKV
                     PVGTTAYMGRAAHLLQCRSGVQKAEDWLNPDVVLEAFEARALRMAVTCAKNLSKFENQ
                     EQGFQELLADLVEAAIAHCQLIVVSKFIAKLEQDIGGKGVKKQLNNLCYIYALYLLHK
                     HLGDFLSTNCITPKQASLANDQLRSLYTQVRPNAVALVDAFNYTDHYLNSVLGRYDGN
                     VYPKLFEEALKDPLNDSVVPDGYQEYLRPVLQQQLRTARL"
     misc_feature    2060..2068
                     /gene="ACX1"
                     /note="putative targeting signal to peroxisomes"
BASE COUNT      592 a    476 c    561 g    671 t
ORIGIN      
        1 tttttttcct atcatctctg agagttttct cgagaaactt ttgagtgttt agctactaga
       61 ttctgaatta cgaatcatgg aaggaattga tcacctcgcc gatgagagaa acaaagcaga
      121 gttcgacgtt gaggatatga agatcgtctg ggctggttcc cgccacgctt ttgaggtttc
      181 cgatcgaatt gcccgccttg tcgccagcga tccggtgttt gagaaaagca atcgagctcg
      241 gttgagtagg aaggagctgt ttaagagtac gttgagaaaa tgtgcccatg cgtttaaaag
      301 gattatcgag cttcgtctca atgaggaaga agcaggaaga ttgaggcact ttatcgacca
      361 gcctgcctat gtggatctgc actggggaat gtttgtgcct gctattaagg ggcagggtac
      421 agaggagcag cagaagaagt ggttgtcgct ggccaataag atgcagatta ttgggtgtta
      481 tgcacagact gagcttggtc atggctcaaa tgttcaagga cttgagacaa ctgccacatt
      541 tgatcccaag actgatgagt ttgtaattca cactccaact cagactgcat ccaaatggtg
      601 gcctggtggt ttgggaaaag tttctactca tgctgttgtt tacgctcgtc tcataactaa
      661 cggaaaagac tacggtatcc atggattcat cgtgcaactg cgaagcttag aagatcattc
      721 tcctcttccg aatataactg ttggtgatat cgggacaaag atgggaaatg gagcatataa
      781 ttcaatggac aacgggtttc ttatgtttga tcatgttcgc attcctagag atcaaatgct
      841 catgaggctg tcaaaagtta caagagaagg agaatatgtt ccatcggatg ttccaaagca
      901 gctggtatat ggtactatgg tgtatgtgag acaaacaatt gtggctgatg cttccaatgc
      961 actatctcga gcagtttgca tagctacaag atacagtgca gtgcggaggc aatttggcgc
     1021 acataatggt ggcattgaga cacaggtgat tgattataaa actcagcaga acaggctatt
     1081 tcctctgcta gcatctgcat atgcatttcg atttgttgga gagtggctaa aatggctgta
     1141 cacggatgta actgaaagac tggcggctag tgatttcgca actttgcctg aggctcatgc
     1201 atgcactgca ggattgaagt ctctcaccac cacagccact gcggatggca ttgaagaatg
     1261 tcgtaagtta tgtggtggac atggatactt gtggtgcagt gggctccccg agctgtttgc
     1321 tgtatatgtt cctgcctgca catacgaagg agacaatgtt gtgctgcaat tacaggttgc
     1381 tcgattcctc atgaagacag tcgcccagct gggatctgga aaggttcctg ttggcacaac
     1441 tgcttatatg ggccgggcag cacatctttt gcaatgtcgt tctggtgttc aaaaggctga
     1501 ggattggtta aaccctgatg ttgtactgga agctttcgaa gctagggctc tcagaatggc
     1561 tgttacgtgt gccaaaaatc tcagcaagtt tgagaatcag gaacaaggat tccaagagct
     1621 cttggctgat ttggttgagg ccgctattgc tcattgccaa ttgattgttg tttccaagtt
     1681 catagcgaaa ctggagcaag acataggtgg caaaggagtg aagaaacagc tgaataatct
     1741 gtgttacatt tatgctcttt atctcctcca caaacatctc ggcgatttcc tctccactaa
     1801 ctgcatcact cccaaacaag cctctcttgc taacgaccag ctccgttcct tatacactca
     1861 ggtccggcct aatgcggttg cacttgtgga cgccttcaat tacaccgacc attacttgaa
     1921 ctcggttctt ggccgttacg acggtaatgt gtacccaaag ctctttgagg aagcgttgaa
     1981 ggatccattg aacgactcgg tggttcctga tgggtaccaa gaataccttc gacctgtgct
     2041 tcagcagcaa cttcgtaccg ctaggctctg aagagttttc tttgcttgat actcgatatg
     2101 gttaatcaca ttagacttgc ttcgtccttc ttcttcgtct tcttcttctt ctcgctttga
     2161 ataatttcgc agtttaaaaa ctggcgatgc ccttatttat atgtagcaat gtaatagtta
     2221 atgtacgatc gtcatatggc ggaattttag tactattttt cgttttcaat gcaacattaa
     2281 tacaattgat cgtttctact 
//

