[Bioperl-l] Next-gen modules

Wed Jun 17 22:24:50 UTC 2009

George Hartzell wrote:
> Sendu Bala writes:
>  > Tristan Lefebure wrote:
>  > > Hello,
>  > > Regarding next-gen sequences and bioperl, in my 
>  > > experience, another issue is bioperl speed. For example, if 
>  > > you want to trim bad quality bases at the ends of 1E6 Solexa 
>  > > reads using Bio::SeqIO::fastq and some methods in 
>  > > Bio::Seq::Quality, well, you've got to be patient (but 
>  > > maybe I missed some shortcuts...).
>  > 
>  > This is my concern as well. Or, rather, is there actually a significant 
>  > set of users out there who are dealing with next-gen sequencing and 
>  > would consider using BioPerl for their work?
>  > 
>  > I'm working with all the 1000-genomes data at the Sanger, and we at 
>  > least are probably never going to use BioPerl for the work.
>  > [...]
> 
> Is it purely a speed issue, or are there other issues (e.g. stability,
> correctness, compatibility) that are contributing to your decision?

Too heavy-weight, too slow, too memory intensive, missing too much 
functionality in any case. If I have to write new parsers and wrappers, 
I may as well make them fast (which means they don't "fit" into BioPerl).

> What *are* you using?

There are already great tools written in C that do all the heavy lifting, 
and the rest is done in Perl written for speed and low memory.