[Bioperl-l] Next-gen modules
Chris Fields
cjfields at illinois.edu
Wed Jun 17 13:09:54 EDT 2009
On Jun 17, 2009, at 8:27 AM, Tristan Lefebure wrote:
> Hello,
> Regarding next-gen sequences and bioperl, in my experience
> another issue is bioperl speed. For example, if you want to
> trim bad-quality bases at the ends of 1E6 Solexa reads using
> Bio::SeqIO::fastq and some methods in Bio::Seq::Quality,
> well, you've got to be patient (but maybe I missed some
> shortcuts...).
The key issues affecting speed in bioperl are instantiation of
contained objects and inheritance. Of the two, inheritance matters
much more, since it affects the contained objects as well as the
container.
http://www.bioperl.org/wiki/Why_BioPerl_is_slow
Moose/Perl6 roles/traits are one way around that issue, but we are a
long way from getting that running. I think getting that to work
decently would be a from-the-ground-up endeavor (see my past posts on
biomoose/bioperl6).
> A pure perl solution will be between 100 and 1000x faster...
> Would it be possible to have an ultra-light quality object
> with few simple methods for next-gen reads?
>
> I can contribute some tests if that sounds like an important
> point.
>
> -Tristan
I don't think the quality objects themselves are that heavy; I think
the main impediment is inheritance. One could get around that a bit
by using a direct_new method to create a blessed hash directly, then
reimplementing methods to lazily create any contained objects on the
fly.
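
A minimal sketch of that idea follows. It is an illustration only, not
existing bioperl code: the package names My::LightRead and
My::QualScores, the Phred+33 decoding, and the $BUILT counter are all
assumptions made up for the example. Only the direct_new name comes
from the suggestion above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a heavier contained object
# (not a real bioperl class).
package My::QualScores;

our $BUILT = 0;    # counts constructions, only to make the laziness visible

sub new {
    my ($class, %args) = @_;
    $BUILT++;
    # Assumption: Sanger-style (Phred+33) quality characters.
    my @scores = map { ord($_) - 33 } split //, $args{raw};
    return bless { scores => \@scores }, $class;
}

sub scores { return @{ $_[0]{scores} } }

# Hypothetical lightweight read class.
package My::LightRead;

# direct_new: bless a plain hash directly, skipping any inherited
# constructor chain and named-argument processing.
sub direct_new {
    my ($class, $id, $seq, $raw_qual) = @_;
    return bless { id => $id, seq => $seq, raw_qual => $raw_qual }, $class;
}

sub id  { return $_[0]{id} }
sub seq { return $_[0]{seq} }

# The contained object is only built on first access, then cached.
sub qual {
    my ($self) = @_;
    $self->{qual} ||= My::QualScores->new(raw => $self->{raw_qual});
    return $self->{qual};
}

package main;

my $read = My::LightRead->direct_new('read1', 'ACGT', 'IIII');

# No quality object has been constructed yet at this point.
printf "%s: quality objects built so far = %d\n",
    $read->id, $My::QualScores::BUILT;

# First call to qual() constructs the contained object.
my @q = $read->qual->scores;
printf "scores = %s, quality objects built = %d\n",
    "@q", $My::QualScores::BUILT;
```

Reads that never have their quality inspected never pay for the
contained object, which is where most of the saving would come from
when iterating over millions of reads.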
chris