[Bioperl-l] extracting ORGANISM line from genbank file

Anna Kostikova geoeco at rambler.ru
Mon Aug 24 05:20:13 EDT 2009


Dear all,

I am trying to extract the species name from the ORGANISM line of a
GenBank file. In fact, I only need the first line under the ORGANISM
tag (i.e. genus + species). I thought it would be possible to do this
with the SeqBuilder object by stating

 $builder->add_wanted_slot('display_id','species');

The problem, however, is that I get an empty output file as a result.
What might be wrong with the script (see below)?
Thanks a lot in advance for any ideas,

-------------------------------------------

#!/usr/bin/perl
use strict;
use Bio::SeqIO;
use Bio::Seq::SeqBuilder;

my $usage = "genbank_to_fasta_cleaning.pl infile outfile \n";
my $infile = shift or die $usage;
my $infileformat = 'Genbank';
my $outfile = shift or die $usage;
my $outfileformat = 'raw';
my $i = 0;

my $seq_in = Bio::SeqIO->new('-file'   => "<$infile",
                             '-format' => $infileformat);

my $seq_out = Bio::SeqIO->new('-file'   => ">$outfile",
                              '-format' => $outfileformat);

my $builder = $seq_in->sequence_builder();

$builder->want_none();
$builder->add_wanted_slot('display_id','species');

while (my $seq = $seq_in->next_seq()) {
    $seq_out->write_seq($seq);
}

exit;

----------------------------------------------------
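
To make the desired result concrete, here is a minimal plain-Perl sketch
that does not use BioPerl at all. It assumes the usual GenBank flat-file
layout, in which the ORGANISM line carries genus + species and the lineage
follows on later lines. The helper name organism_names and the sample
record fragment are hypothetical and serve only as an illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the text of every
# ORGANISM line found in a string holding GenBank flat-file records.
sub organism_names {
    my ($text) = @_;
    my @names;
    # Capture only the rest of the ORGANISM line itself, not the
    # lineage lines that follow it.
    while ($text =~ /^[ \t]*ORGANISM[ \t]+(.+?)[ \t]*$/mg) {
        push @names, $1;
    }
    return @names;
}

# Made-up record fragment, for illustration only.
my $sample = <<'END';
SOURCE      Arabidopsis thaliana (thale cress)
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta.
END

print "$_\n" for organism_names($sample);
```

For the fragment above this prints the single line "Arabidopsis thaliana".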

Anna


