[Bioperl-l] Loading Blast Report in a minimal way

Sendu Bala bix at sendu.me.uk
Mon Oct 8 05:54:15 EDT 2007


zhihuali wrote:
> Hi netters,
> 
> I'm using SearchIO to parse my blast reports. They are extremely 
> large, and not surprisingly, parsing is extremely slow and sometimes 
> the system crashes due to memory problems. As I can handle small 
> reports quickly, it seems like a problem related to the way SearchIO 
> works: it slurps the whole report into memory and builds millions of 
> objects.
> 
> I've checked old posts and some people used FastHitEventBuilder to 
> build hit objects without any hsp objects. And some people suggested 
> using tabular output of blast. But in my case I need to go to each of
> the hsps of each hit, parse the alignment, and gather the
> information needed if that hsp fits certain criteria, and then move
> on to the next hsp, or jump over to the next hit, or exit the
> processing, according to the information I have already got. An ideal
> way would be to read one hsp at a time from the report into memory.
> Is there some way to modify SearchIO (or build another Search Event)
> to do this?

Use Bio::SearchIO::blast_pull, which is a pull parser: it only parses
the parts of the report you actually ask for.

(i.e.
use Bio::SearchIO;
my $in = Bio::SearchIO->new(-format => 'blast_pull',
                            -file   => 't/data/new_blastn.txt');
)

It doesn't yet support all kinds of Blast report, however. Let me know
how you get on.
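
For the hsp-by-hsp walk you describe, something like the following might
do as a starting point. It is only a sketch: scan_report, $wanted and
$max_keep are names I made up for illustration, and it assumes the usual
SearchIO accessors (next_result, next_hit, next_hsp, query_name, name,
evalue) behave the same on blast_pull objects for your kind of report.

```perl
use strict;
use warnings;
use Bio::SearchIO;

# Hypothetical helper (not part of BioPerl): walk a report one HSP at a
# time, keep the first HSP of each hit that passes the caller's test,
# and stop reading as soon as enough HSPs have been collected.
sub scan_report {
    my ($file, $wanted, $max_keep) = @_;

    my $in = Bio::SearchIO->new(-format => 'blast_pull',
                                -file   => $file);
    my @kept;

    RESULT: while (my $result = $in->next_result) {
        HIT: while (my $hit = $result->next_hit) {
            while (my $hsp = $hit->next_hsp) {
                # $wanted is a caller-supplied code ref; skip HSPs it rejects
                next unless $wanted->($hsp);

                push @kept, [$result->query_name, $hit->name, $hsp->evalue];

                last RESULT if @kept >= $max_keep;  # exit the processing
                next HIT;                           # jump to the next hit
            }
        }
    }
    return \@kept;
}

# Example call (assumes the test file from the BioPerl distribution):
# my $kept = scan_report('t/data/new_blastn.txt',
#                        sub { $_[0]->length('total') >= 100 }, 10);
```

The loop labels are what give you the "next hsp / next hit / stop" control
flow; whatever test you need on the alignment goes in the code ref.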

Cheers,
Sendu.

