[Bioperl-l] Re: reading blast report

Siddhartha Basu sidd.basu at gmail.com
Thu Jan 14 15:15:45 EST 2010


On Thu, 14 Jan 2010, Jason Stajich wrote:

> What aspects of the report are you loading?  You might consider producing
> the blast report as tab-delimited (-m 8 format) if you are only interested
> in the start/end positions and scores of alignments. It is a simpler,
> reduced dataset that gives the parser a lower memory footprint.

I think this would be the better approach, since I am mostly interested in
start/end/score data only.
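
To make the tab-delimited route concrete, here is a minimal sketch in plain
Perl, assuming the report was produced with -m 8 and has the usual 12
tab-separated columns. The names parse_m8_line, each_m8_hit and the field
labels are my own illustrations, not BioPerl API, and the callback stands in
for whatever database loading code is used:

```perl
use strict;
use warnings;

# Column order of BLAST tabular output (-m 8): 12 tab-separated fields.
# The labels below are illustrative names chosen for this sketch.
my @M8_FIELDS = qw(query subject pct_identity aln_length mismatches gap_opens
                   q_start q_end s_start s_end evalue bit_score);

# Turn one tabular line into a hash ref.
# Returns undef for blank lines, comment lines, or lines without 12 columns.
sub parse_m8_line {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^\s*(?:#|$)/;
    my @cols = split /\t/, $line;
    return unless @cols == @M8_FIELDS;
    my %hit;
    @hit{@M8_FIELDS} = @cols;
    return \%hit;
}

# Read hits one line at a time, so only the current hit is held in memory.
# $callback is a hypothetical hook, e.g. code that inserts a row into chado.
sub each_m8_hit {
    my ($fh, $callback) = @_;
    while (my $line = <$fh>) {
        my $hit = parse_m8_line($line) or next;
        $callback->($hit);
    }
    return;
}
```

Called as each_m8_hit($fh, sub { my $hit = shift; ... }), this only ever
keeps one line in memory. As far as I know, Bio::SearchIO can also read this
format with -format => 'blasttable', if the object interface is preferred.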

>
> SearchIO uses -format => blast by default. You can try
> -format => blast_pull instead, which parses lazily when creating objects
> and will reduce memory consumption.

That's another good option. But just out of curiosity: does the
regular blast parser load the entire file into memory, considering that
the output consists of multiple Results concatenated together into a
single file? Could anybody clarify?

thanks, 
-siddhartha


>
> -jason
> On Jan 14, 2010, at 11:15 AM, Siddhartha Basu wrote:
>
> > Hi,
> > I have a script that reads a tblastn report (13000 records) and loads it
> > into a chado database (Bio::Chado::Schema module); however, the machine
> > runs out of memory. Apart from the database loading, I am trying to
> > figure out whether reading with the SearchIO module could itself consume
> > a lot of memory. So, when I am reading a blast file and getting the
> > result object ....
> >
> > while (my $result = $searchio->next_result)
> >
> > * Does the searchio object load a huge chunk of the file into memory, or
> >  does it read only part of the result on each iteration?
> >
> > * Would indexing the blast report and then reading from the index be much
> >  faster, and why? And is there any way I could iterate through each
> >  record in the index? Would that be helpful?
> >
> > -siddhartha
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
> http://fungalgenomes.org/
>