[Bioperl-l] bl2seq parsing in SearchIO

Jason Stajich jason at cgt.mc.duke.edu
Sun Apr 27 13:50:09 EDT 2003


At long last we have bl2seq parsing in SearchIO via the ever-complex
Bio::SearchIO::blast driver.  I've added tests to t/SearchIO.t - if you
can find cases that break, please submit a bug via bugzilla
(http://bugzilla.bioperl.org/).  Along with Steve's recent changes, I
think we can finally parse the full suite of all things BLAST in one
module.  Future work down this track will include some performance
improvements and working out the event-based parsing model a little better
so that it can be a bit simpler and more reusable.


As soon as we've got things tested out a little more, we can start
incorporating this in the tutorial, StandAloneBlast, etc.  Note that this
will only go on the main trunk, but it will be part of the 1.3.x developer
series of releases, which I hope will be forthcoming before the beginning
of summer.  As always, you can get the code immediately through CVS.

Some notes:

I still can't figure out how to differentiate between BLASTX and TBLASTN,
since bl2seq doesn't report the algorithm used, so I assume it is BLASTX by
default.  You can supply the program type with -report_type in the
SearchIO constructor, e.g.

use Bio::SearchIO;

my $parser = Bio::SearchIO->new(-format      => 'blast',
                                -file        => 'bl2seq.tblastn.report',
                                -report_type => 'tblastn');

This only really affects where the frame and strand information is put -
it will always be on the $hsp->query instead of on the $hsp->hit part of
the feature pair for blastx and tblastn reports produced by bl2seq.
Hope that's clear...
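
If you want to check for yourself where the values land, something like
the following rough sketch should do it.  It assumes you have a bl2seq
TBLASTN report saved as 'bl2seq.tblastn.report' (a placeholder file name)
and it only uses the usual next_result / next_hit / next_hsp iteration.
The show() helper is just something made up for this example so that
unset values print cleanly.

```perl
use strict;
use warnings;
use Bio::SearchIO;

# Placeholder file name: point this at your own bl2seq output.
my $file = 'bl2seq.tblastn.report';

my $parser = Bio::SearchIO->new(-format      => 'blast',
                                -file        => $file,
                                -report_type => 'tblastn');

# Illustrative helper (not part of BioPerl): print 'n/a' for unset values.
sub show { my $v = shift; return defined $v ? $v : 'n/a' }

while (my $result = $parser->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            # Print both halves of the feature pair so you can see
            # which side carries the frame and strand information.
            printf "query: strand=%s frame=%s\n",
                show($hsp->query->strand), show($hsp->query->frame);
            printf "hit:   strand=%s frame=%s\n",
                show($hsp->hit->strand), show($hsp->hit->frame);
        }
    }
}
```

Running it once with and once without -report_type on the same report
should show whether the option changes which side gets the values.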


Note to people wanting to implement a SearchIO parser - don't use blast.pm
as your starting point for learning this if you don't want to warp your
mind... =)

-jason

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

