[Bioperl-l] Re: [Bioperl-guts-l] Notification: incoming/996

Neilay Dedhia dedhia@phage.cshl.org
Tue, 31 Jul 2001 17:02:11 -0400 (EDT)


Hi Jason, 

I installed Bioperl 0.7 and the bug I mentioned has been fixed
in that release. In Bioperl 0.7, given a fasta file which has only
a header line but no sequence, Bio::SeqIO does return a valid Bio::Seq
object, whereas in BioPerl 0.6 it threw an exception. The bug
was in Bio::SeqIO::fasta. 
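
To illustrate the difference, here is a minimal sketch in core Perl
(no BioPerl required) that applies the two patterns from the patch in
the original report quoted below to a header-only entry. The sample
strings and the parse_entry helper are made up for the illustration
and are not part of Bio::SeqIO::fasta.

```perl
use strict;
use warnings;

# Hypothetical sample entries: one with a header line only, one normal.
my $header_only = "seq1 empty test record\n";
my $normal      = "seq2 short record\nACGT\nTTGA\n";

# Pattern before the patch: '+' demands at least one sequence character.
my $old_re = qr/^(.+?)\n([^>]+)/s;
# Pattern after the patch: '*' also accepts an empty sequence body.
my $new_re = qr/^(.+?)\n([^>]*)/s;

# Illustrative helper (an assumption, not the module's code): returns
# (header, sequence) on a match and an empty list otherwise.
sub parse_entry {
    my ($re, $entry) = @_;
    my ($top, $sequence) = $entry =~ $re
        or return;            # the module throws an exception at this point
    $sequence =~ s/\s//g;     # drop newlines and other whitespace
    return ($top, $sequence);
}

for my $case ([old => $old_re], [new => $new_re]) {
    my ($name, $re) = @$case;
    my @parsed = parse_entry($re, $header_only);
    print "$name pattern: ",
          (@parsed ? "parsed, length " . length($parsed[1]) : "no match"),
          "\n";
}
```

With the header-only entry, the old pattern fails to match, which is
where the exception came from, while the new pattern matches with an
empty sequence of length 0. Both patterns parse the normal entry the
same way.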

Therefore, in BioPerl 0.7, if $fasta_file contains only a header line and
no sequence, the following call returns 0:

	Bio::SeqIO->new(-file => $fasta_file)->next_seq->length()

The above call sequence was for illustrative purposes. Since I knew
that $fasta_file existed and did contain one sequence, I
concatenated the method calls. 
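
For a file whose contents are not known in advance, a guarded version
along the lines Jason suggests in the quoted reply could look like the
sketch below. The helper name first_seq_length is hypothetical; it only
assumes a reader object with a next_seq method that returns undef at
the end of the stream, and a sequence object with a length method.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the length of the first sequence from a
# reader, or undef if the reader has nothing to return.
sub first_seq_length {
    my ($seqio) = @_;
    my $seq = $seqio->next_seq;
    return undef unless defined $seq;   # end of stream: no sequence read
    return $seq->length;                # may legitimately be 0
}

# Intended use with BioPerl installed (not executed in this sketch):
#   use Bio::SeqIO;
#   my $len = first_seq_length(Bio::SeqIO->new(-file => $fasta_file));
#   print defined $len ? "length $len\n" : "no sequence in file\n";
```

This keeps a zero-length sequence (a defined result of 0) distinct
from the case where no sequence was read at all (undef).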

Thanks for your help!

Neilay


On Mon, 30 Jul 2001, Jason Stajich wrote:

> Neilay -
> 
> I'm not sure I agree - you should be testing whether next_seq returns a
> valid Seq or PrimarySeq object before you try and call length.  The
> defined behavior of next_seq method is if there is no valid sequence to
> read in, then it returns a null.  I'm not sure I understand your statement
> here.
> 
> > If the fasta file contains no sequence,
> > the following call which should ideally return
> > 0 throws an exception "Cannot parse entry".
> 
> Are you asking for it to throw an exception OR return 0 here?  
>
> You should try upgrading to bioperl 0.7 as we are currently not going back
> and fixing any bugs on the 0.6 branch.  Perhaps I am not seeing the same
> behavior as you since I am running off the main trunk.
> 
> SeqIO is an iterator which means you cannot expect next_seq to always
> return a valid object - if it returns null then it was at the end of the
> data stream.
> 
> my $seqio = new Bio::SeqIO(-file => $file);
> my $saw_any = 0;
> 
> while( my $seq = $seqio->next_seq ) {
>  $saw_any = 1;
> }
> 
> if( ! $saw_any ) {
>  # file was empty
> }
> 
> -jason
> On Mon, 30 Jul 2001 bioperl-bugs@bioperl.org wrote:
> 
> > JitterBug notification
> > 
> > new message incoming/996
> > 
> > Message summary for PR#996
> > 	From: dedhia@cshl.org
> > 	Subject: Bio::SeqIO::fasta::next_primary_seq throws an exception if sequence length is zero.
> > 	Date: Mon, 30 Jul 2001 11:17:39 -0400
> > 	0 replies 	0 followups
> > 
> > ====> ORIGINAL MESSAGE FOLLOWS <====
> > 
> > >From dedhia@cshl.org Mon Jul 30 11:17:39 2001
> > Received: from localhost (localhost [127.0.0.1])
> > 	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f6UFHdw07710
> > 	for <bioperl-bugs@pw600a.bioperl.org>; Mon, 30 Jul 2001 11:17:39 -0400
> > Date: Mon, 30 Jul 2001 11:17:39 -0400
> > Message-Id: <200107301517.f6UFHdw07710@pw600a.bioperl.org>
> > From: dedhia@cshl.org
> > To: bioperl-bugs@bioperl.org
> > Subject: Bio::SeqIO::fasta::next_primary_seq throws an exception if sequence length is zero. 
> > 
> > Full_Name: Neilay Dedhia
> > Module: Bio::SeqIO::fasta
> > Version: 0.6.1
> > PerlVer: 5.00502
> > OS: Solaris
> > Submission from: (NULL) (143.48.7.14)
> > 
> > 
> > If the fasta file contains no sequence,
> > the following call which should ideally return
> > 0 throws an exception "Cannot parse entry". 
> > 
> > $length = Bio::SeqIO->new(-file => $file)
> >                      ->next_seq()
> >                      ->length(); 
> > 
> > Here is a patch:
> > 
> > *** /usr/local/lib/perl5/site_perl/5.005/Bio/SeqIO/fasta_old.pm Mon Jul 30 10:35:47 2001
> > --- /usr/local/lib/perl5/site_perl/5.005/Bio/SeqIO/fasta.pm     Mon Jul 30 10:36:20 2001
> > ***************
> > *** 111,117 ****
> >         return unless $entry = $self->_readline;
> >     }
> >   
> > !   my ($top,$sequence) = $entry =~ /^(.+?)\n([^>]+)/s
> >       or $self->throw("Can't parse entry");
> >     my ($id,$fulldesc) = $top =~ /^\s*(\S+)\s*(.*)/
> >       or $self->throw("Can't parse fasta header");
> > --- 111,117 ----
> >         return unless $entry = $self->_readline;
> >     }
> >   
> > !   my ($top,$sequence) = $entry =~ /^(.+?)\n([^>]*)/s
> >       or $self->throw("Can't parse entry");
> >     my ($id,$fulldesc) = $top =~ /^\s*(\S+)\s*(.*)/
> >       or $self->throw("Can't parse fasta header");
> > 
> > 
> > _______________________________________________
> > Bioperl-guts-l mailing list
> > Bioperl-guts-l@bioperl.org
> > http://bioperl.org/mailman/listinfo/bioperl-guts-l
> > 
> 
> Jason Stajich
> jason@chg.mc.duke.edu
> Center for Human Genetics
> Duke University Medical Center 
> http://www.chg.duke.edu/ 
> 
> 
> 
>