Bioperl: Bio::Tools::Blast->new on sequence file w/no hits, throws fatal exception. Any ideas?

Steve Chervitz sac@neomorphic.com (Steve A. Chervitz)
Thu, 27 May 1999 20:09:49 -0700 (PDT)


Timothy M. Kunau (CBC) writes:
 > I hope this is a silly question,

Definitely not. Probably a good candidate for a FAQ.

 > I have a PERL script to parse BLASTX files and format the data
 > for our databases. This operation fails:
 > 
 > 
 >     $blastObj = Bio::Tools::Blast->new( -file => $seqfile,
 >         -parse  => 1,  
 >         -signif => '1e-1',
 >         -stats => 1 );

You should definitely be wrapping your Blast object creation statement
in an eval {} block (Perl's equivalent to a try {}). For example:

  eval {	
      $blastObj = Bio::Tools::Blast->new( -file => $seqfile,
  	  -parse  => 1,  
  	  -signif => '1e-1',
  	  -stats => 1 );
  };
  
  # Catch the exception and ignore it until we're done processing.
  if ($@) {
     push @errs, $@;
  }
  
  # After processing all of your reports,
  if(@errs) {
      # print out the errors to STDERR (or to a log file).
      printf STDERR "\n*** %d Blast reports produced fatal errors:\n", 
  		     scalar(@errs);
      foreach(@errs) { print STDERR $_; }
  }

You might wonder: why does the blast object throw an exception when
I gave it a perfectly valid, parsable blast report? Well, when you
specify a -signif criterion, you are adding a requirement that the
blast object should only contain hits at or below the indicated
significance value. Since hits are the raison d'etre of a parsed blast 
object, the object dies since it cannot attach any hits to itself (am
I personifying the blast object too much?).

Throwing an exception guarantees that a blast object won't be used any
further. It also provides a convenient screen, since you know that
every object created will have at least one hit at or below your
-signif cutoff.

If you are interested in every blast object, then omit the -signif
parameter in your call to new() or parse(). I haven't come across a
situation where I needed to work with Blast objects that had hits
failing to meet my -signif criterion. I'd be interested to know if
someone has. When working with files, you could just call new() a
second time, omitting the -signif parameter. This wouldn't work when
parsing from a stream.

BTW, I'd strongly recommend using the above eval {} strategy when
doing any reasonably complex operation in a situation where an
exception could halt a long-running script prematurely. It's a good
"safe scripting" practice.

Cheers,

Steve Chervitz

 > 
 > The EXCEPTION report looks like this:
 > 
 > 
 >     -------------------- EXCEPTION --------------------
 >     MSG: No hits were found.
 >     CONTEXT: Error in object Bio::Tools::Blast "3201671"
 >     SCRIPT: ./parse
 >     STACK: 
 >     Bio::Tools::Blast::_parse(1521)
 >     Bio::Tools::Blast::parse(1476)
 >     Bio::Tools::SeqAnal::_initialize(286)
 >     Bio::Tools::Blast::_initialize(1051)
 >     Bio::Root::Object::new(455)
 >     main::parse_file(348)
 >     main::prepare_file(291)
 >     File::Find::finddir(155)
 >     File::Find::finddir(182)
 >     File::Find::find_opt(109)
 >     File::Find::find(202)
 >     main::./parse-em.pl(99)
 > ---------------------------------------------------
 > 
 > 
 > where $seqfile is a normal BLASTX result file with no hits. 
 > The $seqfile also contains the following line:
 > 
 > 
 >      ***** No hits found ******
 > 
 > 
 > The EXCEPTION then aborts my script.
 > 
 > Short of grep'ing the file for the existence of this line, is 
 > there a preferred method to trap and handle this exception more
 > gracefully? I had thought:
 > 
 > 
 >      @bohits = $blastObj->num_hits('total');
 > 
 > 
 > where the number of elements in the array could be tested to be
 > greater than zero (0). However, the error occurs in the Blast->new
 > invocation so there is no new $blastObj to test. <sigh>
 > 
 > Does anyone else have a better work-around?
 > 
 > Thanks,
 > 
 > Tim
 > -- 
 > Timothy M. Kunau  kunau@ahc.umn.edu  Computational Biology Ctrs.
 > Genomic Database Administrator, Computing and Bioinformatics, AHC
 > 612-626-6937  http://www.cbc.umn.edu/~kunau  http://www.cbc.umn.edu
 > =========== Bioperl Project Mailing List Message Footer =======
 > Project URL: http://bio.perl.org/
 > For info about how to (un)subscribe, where messages are archived, etc:
 > http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
 > ====================================================================
 > 
=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://bio.perl.org/
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
====================================================================