[Bioperl-l] Next-gen modules
Elia Stupka
e.stupka at ucl.ac.uk
Wed Jun 17 17:49:38 UTC 2009
I would suggest developing the "standard" version first, then moving
on to potential optimizations.
When we went through a similar argument in Ensembl about 8 years ago
we ended up dropping Bio::Root completely...
If one is truly after performance for these large next-gen projects,
it comes down to pure piping and shell, worrying about where files live
and how they are copied, and sticking to the systems level as much as
possible. That is quite far from Bioperl altogether, so I think it's a
whole different level of optimization, probably outside the scope of
Bioperl.
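
To make the "piping" idea concrete, here is a minimal sketch of a streaming end-trimmer that avoids building any per-read objects. It is written in Python purely for illustration, not as BioPerl code. The names `trim_ends` and `trim_stream`, the quality threshold of 20, and the offset of 33 are all assumptions; older Solexa/Illumina pipelines used an offset of 64, so the offset would need adjusting for such data. It also assumes strict four-line FASTQ records with no wrapped sequence lines.

```python
def trim_ends(seq, qual, min_q=20, offset=33):
    """Strip bases scoring below min_q from both ends of one read.

    offset=33 is an assumption (Sanger-style encoding); pass offset=64
    for older Solexa/Illumina files.
    """
    cutoff = chr(min_q + offset)  # compare encoded characters directly
    start, end = 0, len(qual)
    while start < end and qual[start] < cutoff:
        start += 1
    while end > start and qual[end - 1] < cutoff:
        end -= 1
    return seq[start:end], qual[start:end]


def trim_stream(src, dst, min_q=20, offset=33):
    """Read four-line FASTQ records from src, write trimmed ones to dst."""
    while True:
        header = src.readline()
        if not header:  # end of input
            break
        seq = src.readline().rstrip("\n")
        plus = src.readline()
        qual = src.readline().rstrip("\n")
        seq, qual = trim_ends(seq, qual, min_q, offset)
        if seq:  # drop reads that were trimmed away entirely
            dst.write(f"{header}{seq}\n{plus}{qual}\n")
```

Called as `trim_stream(sys.stdin, sys.stdout)` from a small script (say, a hypothetical `trim.py`), this could sit in a pipeline such as `zcat reads.fastq.gz | python trim.py | gzip > trimmed.fastq.gz`, so that nothing but the current record is ever held in memory.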
Elia
On 17 Jun 2009, at 18:09, Chris Fields wrote:
>
> On Jun 17, 2009, at 8:27 AM, Tristan Lefebure wrote:
>
>> Hello,
>> Regarding next-gen sequences and bioperl, in my
>> experience, another issue is bioperl speed. For example, if
>> you want to trim bad quality bases at the ends of 1E6 Solexa
>> reads using Bio::SeqIO::fastq and some methods in
>> Bio::Seq::Quality, well, you've got to be patient (but maybe
>> I missed some shortcuts...).
>
> The key issues affecting speed in bioperl are contained object
> instantiation and inheritance (and between those two, the latter
> much more so as it plays a role with contained objects as well as
> the container).
>
> http://www.bioperl.org/wiki/Why_BioPerl_is_slow
>
> Moose/Perl6 roles/traits are one way around that issue, but we are a
> ways off from getting that running. I think to get that working
> decently would be a from-ground-up endeavor (see my past posts on
> biomoose/bioperl6).
>
>> A pure Perl solution will be between 100 and 1000x faster...
>> Would it be possible to have an ultra-light quality object
>> with few simple methods for next-gen reads?
>>
>> I can contribute some tests if that sounds like an important
>> point.
>>
>> -Tristan
>
> I don't think the quality objects themselves are that heavy; I think
> the main impediment is inheritance. One could get around that a bit
> by using a direct_new method to create a blessed hash directly, then
> reimplementing methods to lazily create any contained objects on the fly.
>
> chris
>
---
Senior Lecturer, Bioinformatics
UCL Cancer Institute
Paul O' Gorman Building
University College London
Gower Street
WC1E 6BT
London
UK
Office (UCL): +44 207 679 6493
Office (ICMS): +44 0207 8822374
Mobile: +44 7597 566 194
Mobile (Italy): +39 338 8448801