[Bioperl-l] reading multiple swissprot records from a single file

Jason Stajich jason.stajich at duke.edu
Wed Jan 5 16:44:57 EST 2005


SeqIO reads a stream of records, each terminated by '//', and it only 
processes one record at a time.  You just keep calling next_seq until it 
gets to the end of the file or filehandle.  That is why we typically 
wrap the call in a while loop.
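
To make the "one record at a time" idea concrete, here is a minimal 
sketch in plain Perl, with no BioPerl involved.  It is not how SeqIO is 
implemented internally; it only illustrates reading a '//'-delimited 
stream record by record.  The two toy records and their IDs are made up 
for the illustration.

```perl
use strict;
use warnings;

# Two toy records in Swiss-Prot flat-file layout; the IDs are made up.
my $data = "ID   TEST1_HUMAN\nDE   first toy record\n//\n"
         . "ID   TEST2_HUMAN\nDE   second toy record\n//\n";

# Open the string as an in-memory filehandle (a core Perl feature).
open(my $fh, '<', \$data) or die "could not open in-memory handle: $!";

my @ids;
{
    # Make readline return everything up to and including "//\n".
    local $/ = "//\n";
    while (my $record = <$fh>) {
        # Only this one record is held in $record at this point.
        my ($id) = $record =~ /^ID\s+(\S+)/m;
        push @ids, $id if defined $id;
    }
}
close($fh);

print "$_\n" for @ids;
```

With SeqIO you get the same streaming behaviour from next_seq, plus a 
parsed sequence object instead of a raw string.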

For example, suppose you wanted to make a new file which only had your 
keepers in it:

use Bio::SeqIO;

my $in  = Bio::SeqIO->new(-format => 'swiss', -file => 'sprot42.dat');
my $out = Bio::SeqIO->new(-format => 'swiss', -file => '>keepers.swiss');

# next_seq returns one record per call and undef at the end of the input
while( my $seq = $in->next_seq ) {
    my $keep = 0;
    for my $feature ( $seq->get_SeqFeatures ) {
        # decide whether this feature meets your criteria; if so, set $keep = 1
    }
    if( $keep ) {
        $out->write_seq($seq);
    }
}
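
One way the "feature criteria" step might be filled in is a small helper 
that checks the primary tag of each feature.  This is only a sketch: the 
helper name has_feature_tag is hypothetical, and the tag 'TRANSMEM' in 
the usage comment is just an example of a Swiss-Prot feature key, so 
substitute whatever criteria you actually care about.

```perl
use strict;
use warnings;

# Hypothetical helper: true if any feature on $seq has one of the given
# primary tags.  $seq is any object offering get_SeqFeatures, and each
# feature must offer primary_tag (as Bio::SeqFeatureI objects do).
sub has_feature_tag {
    my ($seq, @wanted) = @_;
    my %wanted = map { $_ => 1 } @wanted;
    for my $feature ($seq->get_SeqFeatures) {
        return 1 if $wanted{ $feature->primary_tag };
    }
    return 0;
}

# Intended use inside the while loop (needs BioPerl):
#   $out->write_seq($seq) if has_feature_tag($seq, 'TRANSMEM');
```

Returning as soon as one feature matches avoids walking the rest of the 
feature list for records you already know you want.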

If you want to use a filehandle instead of a file, just use the -fh 
parameter instead of -file.  See Bio::Root::IO for more information.

This might be useful if you were streaming data in from zcat [zcat reads 
gzipped files and produces a stream of the unzipped data].

  # the trailing '|' is necessary to tell perl to pipe the output
  open(FH, "zcat sprot42.dat.gz |") || die("could not open file with zcat");
  my $in = Bio::SeqIO->new(-fh => \*FH, -format => 'swiss');

Or save the handle in a variable:

  my $fh;
  # the trailing '|' is necessary to tell perl to pipe the output
  open($fh, "zcat sprot42.dat.gz |") || die("could not open file with zcat");
  my $in = Bio::SeqIO->new(-fh => $fh, -format => 'swiss');
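
As a variation on the zcat examples above, the pipe can also be opened 
with the list form of open, which passes the file name to zcat directly 
instead of through a shell.  This is a sketch only: the helper name 
open_zcat is hypothetical, and it assumes a Unix-like system with zcat 
on the PATH.

```perl
use strict;
use warnings;

# Hypothetical helper: return a read handle on the decompressed contents
# of a gzipped file, using the external zcat program.
sub open_zcat {
    my ($path) = @_;
    # List-form pipe open: '-|' means "read from this command's output".
    # The file name is a separate argument, so no shell ever parses it.
    open(my $fh, '-|', 'zcat', $path)
        or die "could not start zcat on $path: $!";
    return $fh;
}

# Intended use (needs BioPerl):
#   my $in = Bio::SeqIO->new(-fh => open_zcat('sprot42.dat.gz'),
#                            -format => 'swiss');
```

Note that close on a piped handle returns false if zcat exited with an 
error, so it is worth checking when you are done reading.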


-jason
On Jan 5, 2005, at 3:48 PM, Daily, Kenneth Michael wrote:

> I'm having trouble using bioperl to parse a file with multiple 
> (thousands) of swissprot records in them. Is there a way to do this 
> with SeqIO and such? The way I understand it, if I use a filehandle to 
> read in the data, it still is expecting only one record in the file. 
> Can I use a FH to read in a record, which ends with //, then put this 
> variable into a SeqIO object to manipulate it? I need to look at each 
> record and decide if I want to keep it based on the features it has. I 
> have a program using standard parsing techniques but want to do this 
> with bioperl if possible. Thanks for any help.
>
> Kenny Daily
> IU School of Informatics
> kmdaily at indiana dot edu
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/


