[Bioperl-l] extracting ORGANISM line from genbank file
Anna Kostikova
geoeco at rambler.ru
Mon Aug 24 05:20:13 EDT 2009
Dear all,
I am trying to extract the species taxonomy from the ORGANISM line. In fact, I
only need the first line under the ORGANISM tag (i.e. genus + species). I
thought that it would be possible to do this with the SeqBuilder object by
stating
$builder->add_wanted_slot('display_id','species');
The problem, however, is that I get an empty file as a result.
What might be wrong with the script (see below)?
Thanks a lot in advance for any ideas,
-------------------------------------------
#!/usr/bin/perl
use strict;
use Bio::SeqIO;
use Bio::Seq::SeqBuilder;
my $usage = "genbank_to_fasta_cleaning.pl infile outfile \n";
my $infile = shift or die $usage;
my $infileformat = 'Genbank' ;
my $outfile = shift or die $usage;
my $outfileformat = 'raw';
my $i = 0;
my $seq_in = Bio::SeqIO->new('-file'   => "<$infile",
                             '-format' => $infileformat);
my $seq_out = Bio::SeqIO->new('-file'   => ">$outfile",
                              '-format' => $outfileformat);
my $builder = $seq_in->sequence_builder();
$builder->want_none();
$builder->add_wanted_slot('display_id','species');
while(my $seq = $seq_in->next_seq()) {
    $seq_out->write_seq($seq);
}
exit;
----------------------------------------------------
Anna
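
A possible direction, offered as a sketch rather than a confirmed diagnosis: the 'raw' output format writes only the sequence string, and the builder above was told not to keep the sequence, so there may simply be nothing for write_seq() to print. Assuming that is the cause, one option is to skip the Bio::SeqIO writer and print the wanted slots directly. The tab-separated output layout and the 'unknown' placeholder below are illustrative choices, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;

my $usage   = "usage: $0 infile outfile\n";
my $infile  = shift or die $usage;
my $outfile = shift or die $usage;

my $seq_in = Bio::SeqIO->new(-file => "<$infile", -format => 'genbank');

# Ask the builder to populate only the two slots that are needed.
my $builder = $seq_in->sequence_builder();
$builder->want_none();
$builder->add_wanted_slot('display_id', 'species');

# Write plain text ourselves instead of going through a SeqIO writer.
open(my $out, '>', $outfile) or die "Cannot write $outfile: $!";
while (my $seq = $seq_in->next_seq()) {
    my $species = $seq->species;    # Bio::Species object, or undef if absent
    # binomial() gives genus + species; 'unknown' is a placeholder (assumption)
    my $name = $species ? $species->binomial() : 'unknown';
    print {$out} join("\t", $seq->display_id, $name), "\n";
}
close($out) or die "Cannot close $outfile: $!";
```

Run it the same way as the original script (infile, then outfile); each record should then produce one line holding the display ID and the genus + species name.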