[Bioperl-l] Article
Keith Player
keithplayer at hotmail.com
Thu Nov 2 00:22:07 UTC 2006
Sorry I didn't include the article link originally. You can view the full text
for free:
http://www.genome.org/cgi/content/abstract/12/10/1599
When I mentioned the R-tree, I was referring to the current
implementation.
I should point out that I didn't actually try the Perl module directly; I
implemented the binning schema straight in MySQL. I found that with the SQL I
mentioned previously the database performed better than with the binning
schema, I assume because of fewer disk seeks. I tested on a dataset of around
30k records and on another the same size as the one in the paper. The binning
outperformed the queries as described in the paper, but the SQL I mentioned in
the first post outperformed the binning schema by around a factor of 3.
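For readers unfamiliar with binning, here is a minimal Python sketch of the general idea: each feature is stored in the smallest bin that fully contains it, and a range query only has to look at the bins it touches in each tier. The tier sizes, the catch-all bin, and the function names are assumptions made for illustration; they are not taken from the paper or from the Perl module.

```python
# Hypothetical tier sizes (in bases); the real schema may use different values.
BIN_SIZES = [1_000, 10_000, 100_000, 1_000_000, 10_000_000, 100_000_000]


def assign_bin(start, end):
    """Return (bin_size, bin_index) of the smallest bin fully containing [start, end]."""
    for size in BIN_SIZES:
        # A feature fits in a bin of this size if both ends fall in the same bin.
        if start // size == end // size:
            return size, start // size
    # Catch-all bin for features that cross a boundary in every tier.
    return 0, 0


def bins_to_search(qstart, qend):
    """Return (bin_size, first_index, last_index) for every tier a query touches."""
    tiers = [(size, qstart // size, qend // size) for size in BIN_SIZES]
    tiers.append((0, 0, 0))  # always include the catch-all bin
    return tiers
```

A query then restricts each tier to the bin indexes returned by `bins_to_search` before applying the exact start/end comparison, which is what lets the index prune most rows.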
The new binning schema might make all this moot, especially if it removes
layers so that groups/features next to each other are saved on the
same/adjacent pages. The only question then would be whether database
optimization is affected by the binning.
Also, does needing to know the largest length of a group/feature make the SQL
statement I created impractical?
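The SQL from the first post is not reproduced in this message, so the following is only a sketch of one common form of range query that depends on the longest feature length. It uses Python's sqlite3 module rather than MySQL so that it is self-contained. The table name `feature`, the columns `fstart` and `fstop`, and the helper `overlapping` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per feature, indexed on its start coordinate.
conn.execute(
    "CREATE TABLE feature (id INTEGER PRIMARY KEY, fstart INTEGER, fstop INTEGER)"
)
conn.execute("CREATE INDEX feature_start ON feature (fstart)")
conn.executemany(
    "INSERT INTO feature VALUES (?, ?, ?)",
    [(1, 100, 200), (2, 150, 5000), (3, 4000, 4100), (4, 6000, 6100)],
)


def overlapping(conn, qstart, qstop):
    """Return ids of features overlapping [qstart, qstop]."""
    # The longest feature bounds how far left of qstart an overlapping
    # feature can begin, which turns the search into a closed range on fstart.
    (max_len,) = conn.execute("SELECT MAX(fstop - fstart) FROM feature").fetchone()
    if max_len is None:  # empty table
        return []
    cur = conn.execute(
        "SELECT id FROM feature "
        "WHERE fstart BETWEEN ? AND ? AND fstop >= ? "
        "ORDER BY id",
        (qstart - max_len, qstop, qstart),
    )
    return [row[0] for row in cur]
```

The practical cost is that `max_len` has to be kept current, either by recomputing it as above or by caching it and updating it on insert. A single very long feature also widens the scanned range for every query.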