[Bioperl-l] Re: [Bioperl-guts-l] Notification: incoming/996

Jason Stajich jason@chg.mc.duke.edu
Tue, 31 Jul 2001 18:38:33 -0400 (EDT)


Happy it worked! Welcome to bioperl 0.7! 

Ah yes, a header line with no sequence is different from an empty file. 
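
For the archives, here is a minimal sketch of why the one-character change
in the patch quoted below matters. It assumes the entry handed to the regex
is just the header line plus a trailing newline (leading '>' already
stripped), and parse_entry is a made-up helper for illustration, not a
BioPerl function.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split one FASTA entry into
# header and sequence with the given pattern. Returns an empty list if
# the pattern does not match.
sub parse_entry {
    my ($entry, $pattern) = @_;
    my ($top, $sequence) = $entry =~ $pattern
        or return;
    $sequence =~ s/\s//g;    # drop newlines and other whitespace
    return ($top, $sequence);
}

# Pattern from the 0.6.1 side of the patch: needs at least one sequence char.
my $old = qr/^(.+?)\n([^>]+)/s;
# Pattern from the patched side: the sequence part may be empty.
my $new = qr/^(.+?)\n([^>]*)/s;

# Assumed input: a header line followed by nothing.
my $header_only = "seq1 header line, no sequence\n";

my @old_result = parse_entry($header_only, $old);
my @new_result = parse_entry($header_only, $new);

print "old pattern: ", (@old_result ? "matched" : "no match"), "\n";
print "new pattern: length ", length($new_result[1]), "\n";
```

With the old pattern the header-only entry does not match at all, which is
where the parse exception came from; with the new one it matches and the
sequence is the empty string. A truly empty file never gets this far,
since there is no entry to read.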

I will close this bug then.
-jason

On Tue, 31 Jul 2001, Neilay Dedhia wrote:

> Hi Jason, 
> 
> I installed Bioperl 0.7 and the bug I mentioned has been fixed
> in that release. In Bioperl 0.7, given a fasta file which has only
> a header line but no sequence, Bio::SeqIO does return a valid Bio::Seq
> object, whereas in BioPerl 0.6 it threw an exception. The bug
> was in Bio::SeqIO::fasta. 
> 
> Therefore in BioPerl 0.7, if $fasta_file contains only the header line but
> no sequence, the following call returns 0. 
> 
> 	Bio::SeqIO->new(-file => $fasta_file)->next_seq->length()
> 
> The above call sequence was for illustrative purposes. Since I knew
> that $fasta_file existed and did contain one sequence, I
> concatenated the method calls. 
> 
> Thanks for your help!
> 
> Neilay
> 
> 
> On Mon, 30 Jul 2001, Jason Stajich wrote:
> 
> > Neilay -
> > 
> > I'm not sure I agree - you should be testing whether next_seq returns a
> > valid Seq or PrimarySeq object before you try to call length.  The
> > defined behavior of the next_seq method is that it returns a null if
> > there is no valid sequence to read in.  I'm not sure I understand your
> > statement here.
> > 
> > > If the fasta file contains no sequence,
> > > the following call which should ideally return
> > > 0 throws an exception "Cannot parse entry".
> > 
> > Are you asking for it to throw an exception OR return 0 here?  
> >
> > You should try upgrading to bioperl 0.7 as we are currently not going back
> > and fixing any bugs on the 0.6 branch.  Perhaps I am not seeing the same
> > behavior as you since I am running off the main trunk.
> > 
> > SeqIO is an iterator which means you cannot expect next_seq to always
> > return a valid object - if it returns null then it was at the end of the
> > data stream.
> > 
> > use Bio::SeqIO;
> > 
> > my $seqio = Bio::SeqIO->new(-file => $file);
> > my $saw_any = 0;
> > 
> > # next_seq returns null once the stream is exhausted
> > while( my $seq = $seqio->next_seq ) {
> >   $saw_any = 1;
> > }
> > 
> > if( ! $saw_any ) {
> >   # file was empty
> > }
> > 
> > -jason
> > On Mon, 30 Jul 2001 bioperl-bugs@bioperl.org wrote:
> > 
> > > JitterBug notification
> > > 
> > > new message incoming/996
> > > 
> > > Message summary for PR#996
> > > 	From: dedhia@cshl.org
> > > 	Subject: Bio::SeqIO::fasta::next_primary_seq throws an exception if sequence length is zero.
> > > 	Date: Mon, 30 Jul 2001 11:17:39 -0400
> > > 	0 replies 	0 followups
> > > 
> > > ====> ORIGINAL MESSAGE FOLLOWS <====
> > > 
> > > From dedhia@cshl.org Mon Jul 30 11:17:39 2001
> > > Received: from localhost (localhost [127.0.0.1])
> > > 	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f6UFHdw07710
> > > 	for <bioperl-bugs@pw600a.bioperl.org>; Mon, 30 Jul 2001 11:17:39 -0400
> > > Date: Mon, 30 Jul 2001 11:17:39 -0400
> > > Message-Id: <200107301517.f6UFHdw07710@pw600a.bioperl.org>
> > > From: dedhia@cshl.org
> > > To: bioperl-bugs@bioperl.org
> > > Subject: Bio::SeqIO::fasta::next_primary_seq throws an exception if sequence length is zero. 
> > > 
> > > Full_Name: Neilay Dedhia
> > > Module: Bio::SeqIO::fasta
> > > Version: 0.6.1
> > > PerlVer: 5.00502
> > > OS: Solaris
> > > Submission from: (NULL) (143.48.7.14)
> > > 
> > > 
> > > If the fasta file contains no sequence,
> > > the following call which should ideally return
> > > 0 throws an exception "Cannot parse entry". 
> > > 
> > > $length = Bio::SeqIO->new(-file => $file)
> > >                      ->next_seq()
> > >                      ->length(); 
> > > 
> > > Here is a patch:
> > > 
> > > *** /usr/local/lib/perl5/site_perl/5.005/Bio/SeqIO/fasta_old.pm Mon Jul 30 10:35:47 2001
> > > --- /usr/local/lib/perl5/site_perl/5.005/Bio/SeqIO/fasta.pm     Mon Jul 30 10:36:20 2001
> > > ***************
> > > *** 111,117 ****
> > >         return unless $entry = $self->_readline;
> > >     }
> > >   
> > > !   my ($top,$sequence) = $entry =~ /^(.+?)\n([^>]+)/s
> > >       or $self->throw("Can't parse entry");
> > >     my ($id,$fulldesc) = $top =~ /^\s*(\S+)\s*(.*)/
> > >       or $self->throw("Can't parse fasta header");
> > > --- 111,117 ----
> > >         return unless $entry = $self->_readline;
> > >     }
> > >   
> > > !   my ($top,$sequence) = $entry =~ /^(.+?)\n([^>]*)/s
> > >       or $self->throw("Can't parse entry");
> > >     my ($id,$fulldesc) = $top =~ /^\s*(\S+)\s*(.*)/
> > >       or $self->throw("Can't parse fasta header");
> > > 
> > > 
> > > _______________________________________________
> > > Bioperl-guts-l mailing list
> > > Bioperl-guts-l@bioperl.org
> > > http://bioperl.org/mailman/listinfo/bioperl-guts-l
> > > 
> > 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
> 

Jason Stajich
jason@chg.mc.duke.edu
Center for Human Genetics
Duke University Medical Center 
http://www.chg.duke.edu/