[Bioperl-l] location binner object

Jason Stajich jason@cgt.mc.duke.edu
Wed, 8 May 2002 10:46:20 -0400 (EDT)


I'm generating a bunch of Bio::LocationI objects and would like to test
whether some are within a specified distance of each other.  I want to
be able to do fast lookups to see if a location is within some range.
This seems pretty similar to Lincoln's binning in Bio::DB::GFF for
locations, but would it be possible to do this in memory or in a BDB
file?  I am generating a lot of these and don't need to keep them once
I've processed them and identified the best choices.

Essentially I want to be able to take a location from list A and see if
any locations in list B fall in the range of X bp downstream of A.
There are plenty of implementation options.  Probably the fastest and
easiest to implement would be a single vector as long as the full range
covered by all the locations, with ptrs to the objects in the slots that
each location overlaps, but this is a memory hog.  I could map it to a
BDB file and just eat the disk space, since this is essentially
generating a tempfile.
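
As a point of comparison with the per-base vector, here is a minimal
sketch of a fixed-width bin index held in memory.  It is written in
Python purely for illustration and is not BioPerl code.  The class and
method names are hypothetical, locations are assumed to be 1-based
inclusive (start, end) pairs on the forward strand, "within X bp
downstream of A" is assumed to mean overlapping the window that starts
just past the end of A, and the 1000 bp bin width is an arbitrary
choice.

```python
from collections import defaultdict


class LocationBinner:
    """In-memory index from fixed-width bins to the locations touching them."""

    def __init__(self, bin_size=1000):  # bin width in bp; arbitrary assumption
        self.bin_size = bin_size
        self.bins = defaultdict(list)

    def add(self, start, end, obj=None):
        # Store one shared tuple in every bin the location overlaps.
        loc = (start, end, obj)
        for b in range(start // self.bin_size, end // self.bin_size + 1):
            self.bins[b].append(loc)

    def overlapping(self, start, end):
        # Visit only the bins the query touches, then filter exactly.
        seen, hits = set(), []
        for b in range(start // self.bin_size, end // self.bin_size + 1):
            for loc in self.bins.get(b, ()):
                if id(loc) not in seen and loc[0] <= end and loc[1] >= start:
                    seen.add(id(loc))
                    hits.append(loc)
        return hits

    def downstream_of(self, a_end, x):
        # Forward strand assumed: the window is the x bp after a_end.
        return self.overlapping(a_end + 1, a_end + x)


# Usage: list B goes into the index, then each A is queried against it.
index = LocationBinner()
for start, end in [(150, 190), (190, 210), (240, 300), (5000, 5100)]:
    index.add(start, end)
hits = index.downstream_of(200, 50)  # window is 201..250
```

Memory here grows with the number of locations times the bins each one
spans, rather than with the total sequence length, at the cost of an
exact overlap check on every candidate in the bins visited.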

Anyway, Lincoln, do you have any input here?  Is it just going to be
easier to slap everything into an SQL backend with Bio::DB::GFF rather
than reinventing the wheel, or can I basically just run everything
through the Bio::DB::GFF binning but store it in a BDB file?  I'm happy
to adapt this to
some sort of location aggregator/binner object for everyone's use.
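
To make the "bins in a throwaway disk file" idea concrete, here is a
rough Python sketch, again only an illustration.  Python's standard
shelve module stands in for a BDB file, which is an assumption and not
how Bio::DB::GFF stores anything.  The function names, the 1000 bp bin
width, and the (start, end) representation are all hypothetical.

```python
import os
import shelve
import tempfile


def build_disk_bins(locations, path, bin_size=1000):
    """Write each (start, end) location into every bin it overlaps."""
    with shelve.open(path, flag="n") as db:  # "n": always start a new file
        for start, end in locations:
            for b in range(start // bin_size, end // bin_size + 1):
                key = str(b)  # shelve keys must be strings
                bucket = db.get(key, [])
                bucket.append((start, end))
                db[key] = bucket  # reassign so the change is written back


def query_disk_bins(path, start, end, bin_size=1000):
    """Return the sorted, de-duplicated locations overlapping start..end."""
    hits = set()
    with shelve.open(path, flag="r") as db:
        for b in range(start // bin_size, end // bin_size + 1):
            for s, e in db.get(str(b), []):
                if s <= end and e >= start:
                    hits.add((s, e))
    return sorted(hits)


# Usage: the index lives in a temporary directory and vanishes afterwards.
with tempfile.TemporaryDirectory() as tmpdir:
    bins_path = os.path.join(tmpdir, "bins")
    build_disk_bins([(150, 190), (190, 210), (240, 300), (900, 1100)], bins_path)
    found = query_disk_bins(bins_path, 201, 250)
```

Rewriting a whole bucket on every append is slow for dense bins, so a
real version would probably batch the writes per bin before storing.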

Thanks.

-jason
-- 
Jason Stajich
Duke University
jason at cgt.mc.duke.edu