[Bioperl-l] Problem with Bio::SeqIO opening gzipped files

Jason Stajich jason at cgt.duhs.duke.edu
Fri Nov 14 12:07:12 EST 2003


Ah right - I guess that should work - adding the "-c" to gunzip is really
what's needed to correct the HOWTO.

I just make a separate filehandle myself to make sure things work.
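
For example, here is a minimal sketch of that approach, assuming BioPerl is
installed and gunzip is on your PATH. The open_gunzip_pipe helper is a name
made up for illustration, not part of BioPerl, and the GenBank-in/Fasta-out
formats are simply the ones mentioned in this thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): start "gunzip -c" on one or
# more gzipped files and return a filehandle that reads its output.
sub open_gunzip_pipe {
    my @files = @_;
    die "no input files given\n" unless @files;
    # List-form open does not go through the shell, so spaces or other
    # odd characters in the paths are passed to gunzip untouched.
    open(my $fh, '-|', 'gunzip', '-c', @files)
        or die "cannot start gunzip: $!\n";
    return $fh;
}

# Only runs when file names are given on the command line.
if (@ARGV) {
    require Bio::SeqIO;    # needs BioPerl installed
    my $fh     = open_gunzip_pipe(@ARGV);
    # Pass the stream with -fh; -file is reserved for plain file names.
    my $seqin  = Bio::SeqIO->new(-fh => $fh,       -format => 'genbank');
    my $seqout = Bio::SeqIO->new(-fh => \*STDOUT,  -format => 'fasta');
    while (my $seq = $seqin->next_seq) {
        $seqout->write_seq($seq);
    }
    # close() on a pipe returns false if the child exited non-zero.
    close($fh) or warn "gunzip failed (wait status $?)\n";
}
```

Checking the return value of close() on the pipe is how a gunzip failure
(for instance a corrupt archive) gets noticed, since open() itself only
reports whether the process could be started.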

-jason
On Fri, 14 Nov 2003, Brian Osborne wrote:

> Jason,
>
> This is odd because the SeqIO HOWTO says you can do the trick that Zayed is
> trying. From the HOWTO:
>
> use Bio::SeqIO;
>       # get command-line arguments, or die with a usage statement
>       my $usage = "gzip2fasta.pl infile informat outfile\n";
>       my $infile = shift or die $usage;
>       my $informat = shift or die $usage;
>       my $outfile = shift or die $usage;
>
>       # create one SeqIO object to read in, and another to write out
>       my $seqin = Bio::SeqIO->new('-file' => "/usr/local/bin/gunzip $infile |",
>                                   '-format' => $informat);
>
>       my $seqout = Bio::SeqIO->new('-file' => ">$outfile",
>                                    '-format' => 'Fasta');
>
>       # write each entry in the input to the output file
>       while (my $inseq = $seqin->next_seq) {
>             $seqout->write_seq($inseq);
>       }
>       exit;
>
> I should correct the HOWTO?
>
> Brian O.
>
>
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org
> [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Jason Stajich
> Sent: Friday, November 14, 2003 11:15 AM
> To: Zayed Albertyn
> Cc: bioperl-l at bioperl.org; Andreas Kahari
> Subject: Re: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files
>
> When you pass in -file there is an implicit assumption that it is a
> filename you are passing in, NOT a stream.
>
> If you want to make this work, do this (you can replace 'zcat' with
> 'gunzip -c' if you prefer):
>  open(my $fh, "zcat $filename.gz |") or die "cannot run zcat: $!";
>  my $seqio = Bio::SeqIO->new(-fh => $fh, -format => 'genbank');
>
> You can also provide multiple files in that zcat
>  open($fh, "zcat $file1 $file2 ... |");
>
> -jason
> On Fri, 14 Nov 2003, Zayed  Albertyn wrote:
>
> > Hi Andreas
> >
> > Adding the -c switch still doesn't work. I still get the same error
> > message. Input is the full path to the file e.g.
> >
> > /cip0/db/GENBANK/RELEASE137/gbest13.seq.gz  == $path/$file
> >
> > I've written another script that does the normal
> > open(FILE,"/bin/gunzip -c file1 |")
> >
> > and it works fine
> >
> > Z
> >
> > >
> > >     my $seq_in = Bio::SeqIO->new(
> > >     '-file'   => "/bin/gunzip -c $path/$file|",
> > >     '-format' => 'genbank'
> > >     );
> > >
> > >
> > >
> > > --
> > > |()()|      Andreas Kähäri                                |(==)|
> > > |)()(|      EMBL, European Bioinformatics Institute       |=)(=|
> > > |()()|      Wellcome Trust Genome Campus, Hinxton         |(==)|
> > > |)()(|      Cambridge, CB10 1SD                           |=)(=|
> > > |()()|      United Kingdom                                |(==)|
> > >
> >
> > -----------------------------------------------
> > From: Zayed Albertyn
> > Electric Genetics PTY Ltd
> > Tel: +27 21 959 3645; Mobile: +2782 480 6097
> > www.egenetics.com
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu


