[Bioperl-l] Next-gen modules

Elia Stupka e.stupka at ucl.ac.uk
Wed Jun 17 17:49:38 UTC 2009


I would suggest developing the "standard" version first, then moving
on to potential optimizations.

When we went through a similar argument in Ensembl about 8 years ago  
we ended up dropping Bio::Root completely...

If one is truly after performance for these large next-gen projects,
it comes down to plain pipes and shell tools, paying attention to where
files live and how often they get copied, and sticking to the systems
level as much as possible. That is quite far from Bioperl altogether,
so I think it's a whole different level of optimization issues,
probably outside the scope of Bioperl.
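
To give a rough idea of what a streaming, object-free filter looks
like, here is a minimal sketch of the read-trimming case mentioned
below, written in Python rather than Perl purely for illustration. The
function names, the quality threshold of 20 and the Phred+33 offset are
all assumptions (Solexa pipelines of this era often used a different
encoding), and none of this is Bioperl or Ensembl code.

```python
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream.

    Stops at end of input or at the first blank header line.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def trim_ends(seq: str, qual: str, min_q: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Trim bases scoring below min_q from both ends of a read.

    min_q and offset are assumed defaults, not values from the thread.
    """
    scores = [ord(c) - offset for c in qual]
    start, end = 0, len(scores)
    while start < end and scores[start] < min_q:
        start += 1
    while end > start and scores[end - 1] < min_q:
        end -= 1
    return seq[start:end], qual[start:end]


def filter_stream(inp: TextIO, out: TextIO, min_q: int = 20, offset: int = 33) -> None:
    """Read FASTQ from inp, write trimmed reads to out, drop empty reads."""
    for header, seq, qual in read_fastq(inp):
        seq, qual = trim_ends(seq, qual, min_q, offset)
        if seq:
            out.write(f"{header}\n{seq}\n+\n{qual}\n")

# In a script one would call: filter_stream(sys.stdin, sys.stdout)
```

Used as a filter between other tools, something like this holds one
read in memory at a time and writes no intermediate files, which is the
kind of systems-level approach I mean.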

Elia

On 17 Jun 2009, at 18:09, Chris Fields wrote:

>
> On Jun 17, 2009, at 8:27 AM, Tristan Lefebure wrote:
>
>> Hello,
>> Regarding next-gen sequences and bioperl, in my
>> experience another issue is bioperl speed. For example, if
>> you want to trim bad quality bases at the ends of 1E6 Solexa
>> reads using Bio::SeqIO::fastq and some methods in
>> Bio::Seq::Quality, well, you've got to be patient (but
>> maybe I missed some shortcuts...).
>
> The key issues affecting speed in bioperl are instantiation of
> contained objects and inheritance (of the two, inheritance matters
> much more, as it plays a role with contained objects as well as
> the container).
>
> http://www.bioperl.org/wiki/Why_BioPerl_is_slow
>
> Moose/Perl6 roles/traits are one way around that issue, but we are a  
> ways off from getting that running.  I think to get that working  
> decently would be a from-ground-up endeavor (see my past posts on  
> biomoose/bioperl6).
>
>> A pure Perl solution will be between 100 and 1000x faster...
>> Would it be possible to have an ultra-light quality object
>> with few simple methods for next-gen reads?
>>
>> I can contribute some tests if that sounds like an important
>> point.
>>
>> -Tristan
>
> I don't think the quality objects themselves are that heavy; I think
> the main impediment is inheritance.  One could get around that a bit
> by using a direct_new method to create a blessed hash directly, then
> reimplementing methods to lazily create any contained objects on the
> fly.
>
> chris
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
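
Chris's suggestion of a direct_new constructor plus lazily created
contained objects could be sketched roughly as follows. This is a
Python analogy rather than Perl: allocating with `cls.__new__` and
setting attributes by hand stands in for blessing a hash directly. All
class and attribute names here are hypothetical stand-ins, not the
actual Bioperl API.

```python
class Root:
    """Stand-in for a generic base class with a costly constructor."""

    def __init__(self, **kwargs):
        self.verbose = kwargs.pop("verbose", 0)


class PrimarySeq(Root):
    """Stand-in for a contained sequence object."""

    def __init__(self, seq="", **kwargs):
        super().__init__(**kwargs)
        self.seq = seq


class QualityRead(Root):
    """Container that normally builds its contained object eagerly."""

    def __init__(self, seq="", qual=(), **kwargs):
        super().__init__(**kwargs)        # walks the constructor chain
        self._seq = seq
        self._seq_obj = PrimarySeq(seq=seq)  # eager contained object
        self.qual = list(qual)

    @classmethod
    def direct_new(cls, seq, qual):
        """Build an instance without running any __init__ in the chain."""
        obj = cls.__new__(cls)   # bare allocation, no constructors called
        obj.verbose = 0
        obj._seq = seq
        obj._seq_obj = None      # contained object deferred until needed
        obj.qual = qual
        return obj

    @property
    def seq_obj(self):
        """Create the contained object on first access, then reuse it."""
        if self._seq_obj is None:
            self._seq_obj = PrimarySeq(seq=self._seq)
        return self._seq_obj
```

A caller that only needs the raw string and scores never pays for the
contained object; code that does ask for `seq_obj` gets it built once
on first access. How much this saves in Perl would need measuring.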

---
Senior Lecturer, Bioinformatics
UCL Cancer Institute
Paul O'Gorman Building
University College London
Gower Street
WC1E 6BT
London
UK

Office (UCL): +44 207 679 6493
Office (ICMS): +44 0207 8822374

Mobile: +44 7597 566 194
Mobile (Italy): +39 338 8448801



