[Bioperl-l] Next-gen modules

Wed Jun 17 13:27:12 UTC 2009

Hello,
Regarding next-gen sequences and bioperl, in my 
experience another issue is bioperl speed. For example, if 
you want to trim bad-quality bases at the ends of 1E6 Solexa 
reads using Bio::SeqIO::fastq and some methods in 
Bio::Seq::Quality, well, you've got to be patient (but 
maybe I missed some shortcuts...).

A pure Perl solution will be between 100 and 1000x faster... 
Would it be possible to have an ultra-light quality object 
with a few simple methods for next-gen reads?
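
To make concrete what I mean by a lightweight approach, here is a 
minimal sketch of end trimming on plain strings, with no sequence 
objects at all. It is written in Python purely for illustration; 
the same logic ports directly to plain Perl. The function names 
(trim_ends, read_fastq), the quality offset of 64 and the cutoff 
of 20 are my assumptions for the example, not anything taken from 
bioperl, and it assumes strict 4-line FASTQ records.

```python
def trim_ends(seq, qual, min_q=20, offset=64):
    """Trim bases scoring below min_q from both ends of a read.

    offset=64 and min_q=20 are illustrative assumptions; adjust
    them to the encoding and cutoff of your own data.
    """
    scores = [ord(c) - offset for c in qual]
    start, end = 0, len(scores)
    # Advance past low-quality bases at the 5' end.
    while start < end and scores[start] < min_q:
        start += 1
    # Step back over low-quality bases at the 3' end.
    while end > start and scores[end - 1] < min_q:
        end -= 1
    return seq[start:end], qual[start:end]


def read_fastq(handle):
    """Yield (header, seq, qual) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line) stops iteration
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual
```

Each read is handled as two strings and a list of integers, so 
the per-read cost is a couple of loops rather than object 
construction and method dispatch.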

I can contribute some tests if that sounds like an important 
point.

-Tristan

On Wednesday 17 June 2009 08:02:11 Mark A. Jensen wrote:
> Elia--
> I say a definite +1; in fact, this sounds like it should
> be a Hot Topic (see
> http://www.bioperl.org/wiki/Category:Hot_Topics for some
> others you might have missed in your hiatus...). I will
> create a page that can be a central point for wish lists,
> discussion, etc.
>
> There has been much discussion of late about FASTQ:
> http://lists.open-bio.org/pipermail/bioperl-l/2009-June/030187.html
> http://lists.open-bio.org/pipermail/bioperl-l/2009-May/029970.html
> http://lists.open-bio.org/pipermail/bioperl-l/2009-May/029911.html
> http://lists.open-bio.org/pipermail/bioperl-l/2009-April/029765.html
>
> cheers from a newbie,
> Mark
>
> ----- Original Message -----
> From: "Elia Stupka" <e.stupka at ucl.ac.uk>
> To: <bioperl-l at lists.open-bio.org>
> Sent: Wednesday, June 17, 2009 7:29 AM
> Subject: [Bioperl-l] Next-gen modules
>
> > Dear all,
> >
> > after several years of absence I am slowly coming back
> > to Bioperl, and hope to contribute again to its
> > development.
> >
> > One area that I was thinking of starting from, since we
> > are actively involved with it, is to improve Bioperl's
> > support for next-gen sequencing data, tools, etc. Since
> > I am sure I have missed out on a lot of recent
> > developments, do let me know if/what is useful.
> >
> > One example that comes to mind is that the conversion
> > of various formats to/from FASTQ does not seem to be
> > supported. Some code can be found within Li Heng's
> > script: http://maq.sourceforge.net/fq_all2std.pl but
> > it would be good if it could make its way into SeqIO?
> > And similarly, potentially, for other next-gen sequence
> > formats?
> >
> > Similarly, there seems to be little in bioperl-run to
> > support tools that have been developed in this area,
> > such as Maq, BowTie, TopHat, etc?
> >
> > Do let me know if there is a past thread on this, or
> > other people actively developing, etc. so that I can
> > find out what priorities are.
> >
> > thanks and best regards to all (old friends and new),
> >
> > Elia
> >
> > ---
> > Senior Lecturer, Bioinformatics
> > UCL Cancer Institute
> > Paul O' Gorman Building
> > University College London
> > Gower Street
> > WC1E 6BT
> > London
> > UK
> >
> > Office (UCL): +44 207 679 6493
> > Office (ICMS): +44 0207 8822374
> >
> > Mobile: +44 7597 566 194
> > Mobile (Italy): +39 338 8448801
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l