[Bioperl-l] Bio::DB::SeqFeature::Store::memory -> filter_by_type very slow
Jelle Scholtalbers
j.scholtalbers at gmail.com
Fri Feb 5 15:36:50 UTC 2010
Hi,
A different issue I encountered today with
Bio::DB::SeqFeature::Store::memory concerns the BINSIZE constant that this
module sets:
use constant BINSIZE => 10_000;
It would be nice to be able to set this dynamically, since different GFF
files call for different bin sizes. I ran into this when I used a file whose
positions had been multiplied by 1000; with that file the program ran fairly
quickly, in 3-4 min. After dividing the positions by 1000 (which is what I
want), the program took ~30 min to finish. The slowdown was traceable to
Bio::DB::SeqFeature::Store::memory::filter_by_location. Setting BINSIZE to 1
solved the issue for this file, but for another GFF file that value is far
too low.
Is this already possible and I just did not see it, or would this be an
option to add?
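
To illustrate why the scale of the coordinates matters, here is a minimal
sketch. It is not the module's actual code: bins_for and distinct_bins are
hypothetical helpers, the coordinates are made up, and I am assuming that a
feature is indexed under every bin from int(start/BINSIZE) to
int(end/BINSIZE).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: list the bin numbers that a
# feature spanning [$start, $end] falls into for a given bin size.
sub bins_for {
    my ($start, $end, $binsize) = @_;
    my $first = int($start / $binsize);
    my $last  = int($end / $binsize);
    return ($first .. $last);
}

# Count how many distinct bins a set of [start, end] features occupies.
sub distinct_bins {
    my ($binsize, @features) = @_;
    my %seen;
    for my $feature (@features) {
        $seen{$_} = 1 for bins_for(@$feature, $binsize);
    }
    return scalar keys %seen;
}

# Made-up coordinates: once multiplied by 1000, once at their real scale.
my @scaled = (
    [1_000_000, 1_200_000],
    [4_000_000, 4_050_000],
    [9_000_000, 9_030_000],
);
my @unscaled = map { [ $_->[0] / 1000, $_->[1] / 1000 ] } @scaled;

printf "scaled,   bin size 10000: %d bins\n", distinct_bins(10_000, @scaled);
printf "unscaled, bin size 10000: %d bins\n", distinct_bins(10_000, @unscaled);
printf "unscaled, bin size 1:     %d bins\n", distinct_bins(1, @unscaled);
```

With a bin size of 10_000, all three unscaled features end up in the same
bin, so a lookup by location cannot narrow anything down and has to check
every feature. With a bin size of 1, each feature is spread over one bin per
base, which is why that value would be far too low for a file with long
features.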
Cheers,
Jelle
2010/2/1 Chris Fields <cjfields at illinois.edu>
> Jelle,
>
> Seems reasonable, but Lincoln and Scott know that code better and are
> better suited to comment on it. Lincoln, Scott?
>
> chris
>
> On Feb 1, 2010, at 6:24 AM, Jelle Scholtalbers wrote:
>
> > Hi,
> > I used the Bio::DB::SeqFeature::Store::memory module to load a GFF3
> > file, which I could then query from my script. To retrieve features I
> > used, for example,
> > $db->features(-type => 'BAC:FPC', -seq_id => 'chromosome0')
> > However, when profiling my script I found that 60% of the running time
> > went into filter_by_type from Bio::DB::SeqFeature::Store::memory.
> > Replacing this call with
> > my @features = grep { $_->type eq 'BAC:FPC' }
> >     $db->features(-seq_id => 'chromosome0');
> > which gave me the same results, took just a fraction of the earlier
> > run time. My script went from 60 min to 4 min for the same result, with
> > this call (which is made often) as the only change.
> > Can/should this be fixed, or is this just the faster way to do it?
> >
> > Cheers,
> > Jelle
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>