[Bioperl-l] parsing tblastn using BPlite

Jason Eric Stajich jason@cgt.mc.duke.edu
Fri, 26 Oct 2001 08:43:41 -0400 (EDT)


Many apologies for my stupidity in not spotting the problem at the outset.

Let's go back to the beginning.  Looking at your code, the reason you
get 16k vs 32k is that _parseHeader is being called twice:
once when you instantiate a new report and then again when you call
$report->_parseHeader.  You have to be particularly careful when calling a
private method like _parseHeader.  I'm not sure I like our SYNOPSIS
code giving this as an example (pretty sure >I< amended the example to
have this in there...)

You should either instantiate a new report in each loop or call
_parseHeader, but not both.  My apologies for the misleading examples and
for not catching this the first time you posted your question.  The
corrected code is below (slightly modified from your example).

As for the BPlite error you got (test 30), this is just a string <-> float
conversion issue: most likely perl is not formatting numbers in a
platform-independent way, or we didn't quite get this fixed for 0.9.0.
What OS/hardware/perl version are you running on?  We'll make sure this
is handled in future releases.
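
To illustrate the kind of string <-> float mismatch I mean, here is a
minimal sketch.  The helper nearly_equal is hypothetical (it is not part
of BPlite or the test suite) and the values are made up; the point is only
that two spellings of the same number differ as strings but agree when
compared numerically with a tolerance.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BPlite: compare two values numerically
# with a relative tolerance instead of comparing their string forms.
sub nearly_equal {
    my ($x, $y, $rel_tol) = @_;
    $rel_tol = 1e-6 unless defined $rel_tol;
    return 1 if $x == $y;
    my $scale = abs($x) > abs($y) ? abs($x) : abs($y);
    return abs($x - $y) <= $rel_tol * $scale ? 1 : 0;
}

my $parsed   = "2.0e-05";   # made-up value, as text read from a report
my $expected = 0.00002;     # the same number written as a Perl literal

# String comparison depends on how each side happens to be formatted.
print "string compare:  ", ($parsed eq $expected ? "same" : "different"), "\n";

# Numeric comparison with a tolerance does not.
print "numeric compare: ", (nearly_equal($parsed, $expected) ? "same" : "different"), "\n";
```

How perl prints a float (the number of exponent digits, for instance) can
vary with the platform's C library, which is why a test that compares the
printed form can pass on one machine and fail on another.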

Additionally, I have started work on a simpler search result parsing
system (Bio::SearchIO) which may be more intuitive and will do event-based
parsing.

It can currently only handle NCBI XML reports, but I would like to
integrate text BLAST reports, HMMER, and possibly FASTA (if there are some
volunteers out there to help), perhaps in time for the 1.0 release.

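use strict;
use Bio::Tools::BPlite;

my $file   = $ARGV[0];    # BLAST report file (may hold several reports)
my $cutoff = $ARGV[1];    # P-value cutoff
my $total_rec = 0;

open(BLAST, $file) or die "cannot open $file: $!";
# new() already parses the first report's header, so create the report
# object once, outside the loop
my $report = new Bio::Tools::BPlite(-fh=>\*BLAST);
{
    $total_rec++;
    my $QUERY = $report->query;

  SBJCT: while( my $sbjct = $report->nextSbjct ) {
      my $SUBJ = $sbjct->name;
    HSP: while( my $hsp = $sbjct->nextHSP ) {
        my $HSP = $hsp->P;
        if ($HSP < $cutoff) {
            # print only the first HSP under the cutoff for this query
            print "$QUERY\t$SUBJ\t$HSP\n";
            last SBJCT;
        }
    }
  }
    # move on to the next report's header; stop when it returns -1
    last if ($report->_parseHeader == -1);
    redo;
}
close(BLAST);
print STDERR "total records .$total_rec.\n";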



-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu