[Bioperl-l] Problem with Bio::SeqIO opening gzipped files

Brian Osborne brian_osborne at cognia.com
Fri Nov 14 11:43:54 EST 2003


Jason,

This is odd because the SeqIO HOWTO says you can do the trick that Zayed is
trying. From the HOWTO:

use Bio::SeqIO;
      # get command-line arguments, or die with a usage statement
      my $usage = "gzip2fasta.pl infile informat outfile\n";
      my $infile = shift or die $usage;
      my $informat = shift or die $usage;
      my $outfile = shift or die $usage;

      # create one SeqIO object to read in, and another to write out
      my $seqin = Bio::SeqIO->new('-file' => "/usr/local/bin/gunzip $infile |",
                                  '-format' => $informat);

      my $seqout = Bio::SeqIO->new('-file' => ">$outfile",
                                   '-format' => 'Fasta');

      # write each entry in the input to the output file
      while (my $inseq = $seqin->next_seq) {
            $seqout->write_seq($inseq);
      }
      exit;

Should I correct the HOWTO?

Brian O.


-----Original Message-----
From: bioperl-l-bounces at portal.open-bio.org
[mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Jason Stajich
Sent: Friday, November 14, 2003 11:15 AM
To: Zayed Albertyn
Cc: bioperl-l at bioperl.org; Andreas Kahari
Subject: Re: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files

When you pass in -file there is an implicit assumption that it is a
filename you are passing in, NOT a stream.

If you want to make this work, open the pipe yourself and pass the
filehandle with -fh (you can replace 'zcat' with 'gunzip -c' if you prefer):
 open(my $fh, "zcat $filename.gz |") or die "Cannot run zcat: $!";
 my $seqio = Bio::SeqIO->new(-fh => $fh, -format => 'genbank');

You can also pass multiple files to that zcat call:
 open($fh, "zcat $file1 $file2 ... |");
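
For reference, here is a slightly more defensive sketch of the same -fh
approach. It assumes a 'gunzip' binary on the PATH and Perl 5.8 or later
for the list form of a piped open. The helper name open_gunzip_pipe is
made up for illustration and is not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): open a read pipe from
# "gunzip -c" for one or more files. The list form of open() bypasses
# the shell, so spaces or odd characters in paths need no quoting.
sub open_gunzip_pipe {
    my @files = @_;
    open(my $fh, '-|', 'gunzip', '-c', @files)
        or die "Cannot start gunzip: $!";
    return $fh;
}

# Only runs when file names are given on the command line.
if (@ARGV) {
    require Bio::SeqIO;
    my $fh    = open_gunzip_pipe(@ARGV);
    my $seqio = Bio::SeqIO->new(-fh => $fh, -format => 'genbank');
    while (my $seq = $seqio->next_seq) {
        print $seq->display_id, "\n";
    }
    # close() on a pipe returns false if the child exited non-zero
    close($fh) or warn "gunzip exited with status $?\n";
}
```

Checking the return value of close() matters here because a corrupt or
missing .gz file only shows up as a non-zero exit status from gunzip.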

-jason
On Fri, 14 Nov 2003, Zayed  Albertyn wrote:

> Hi Andreas
>
> Adding the -c switch still doesn't work. I still get the same error
> message. Input is the full path to the file e.g.
>
> /cip0/db/GENBANK/RELEASE137/gbest13.seq.gz  == $path/$file
>
> I've written another script that does the normal
> open(FILE,"/bin/gunzip -c file1 |")
>
> and it works fine
>
> Z
>
> >
> >     my $seq_in = Bio::SeqIO::new(
> >     '-file'   => "/bin/gunzip -c $path/$file|",
> >     '-format' => 'genbank'
> >     );
> >
> >
> >
> > --
> > |()()|      Andreas Kähäri                                |(==)|
> > |)()(|      EMBL, European Bioinformatics Institute       |=)(=|
> > |()()|      Wellcome Trust Genome Campus, Hinxton         |(==)|
> > |)()(|      Cambridge, CB10 1SD                           |=)(=|
> > |()()|      United Kingdom                                |(==)|
> >
>
> -----------------------------------------------
> From: Zayed Albertyn
> Electric Genetics PTY Ltd
> Tel: +27 21 959 3645; Mobile: +2782 480 6097
> www.egenetics.com
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
