[Bioperl-l] Next-gen modules

Elia Stupka e.stupka at ucl.ac.uk
Wed Jun 17 12:54:22 UTC 2009


Dear Mark,

thanks a lot for the pointers.

With regards to FASTQ parsing:

-My understanding from reading past threads is that we should work on a
single format, i.e. FASTQ, and interpret the quality "flavours" as just
quality conversions, right?

-However, I assume we would still want to support a simple way for the
user to say format => 'fastq-solexa', using the nomenclature adopted in
BioPython, as suggested by Peter, right?

-I also saw Heikki's "long essay", but have not yet compared it to Heng
Li's code at http://maq.sourceforge.net/fq_all2std.pl. I guess we
would hope they produce identical outputs, which will be a good check.
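
To make the idea of treating the quality "flavours" as plain quality
conversions concrete, here is a minimal Python sketch. It is not BioPerl
code and has not been compared against fq_all2std.pl. It assumes the usual
conventions: fastq-sanger stores Phred scores at ASCII offset 33,
fastq-illumina stores Phred scores at offset 64, and fastq-solexa stores
Solexa scores at offset 64. The names VARIANTS and convert_quality are
made up for illustration.

```python
import math

# Assumed conventions for each variant: (ASCII offset, score type).
VARIANTS = {
    "fastq-sanger": (33, "phred"),
    "fastq-solexa": (64, "solexa"),
    "fastq-illumina": (64, "phred"),
}


def solexa_to_phred(q):
    """Map a Solexa score to the equivalent Phred score."""
    return 10.0 * math.log10(10.0 ** (q / 10.0) + 1.0)


def phred_to_solexa(q):
    """Map a Phred score to a Solexa score, clamped at -5.

    The formula is undefined for Phred 0, so that case is clamped too.
    """
    if q <= 0:
        return -5.0
    return max(-5.0, 10.0 * math.log10(10.0 ** (q / 10.0) - 1.0))


def convert_quality(qual, src, dst):
    """Re-encode a FASTQ quality string from variant src to variant dst."""
    src_offset, src_type = VARIANTS[src]
    dst_offset, dst_type = VARIANTS[dst]
    out = []
    for ch in qual:
        q = ord(ch) - src_offset
        if src_type != dst_type:
            if src_type == "solexa":
                q = solexa_to_phred(q)
            else:
                q = phred_to_solexa(q)
        # Scores are stored as integers, so round after converting.
        out.append(chr(int(round(q)) + dst_offset))
    return "".join(out)
```

For example, convert_quality("hhh", "fastq-illumina", "fastq-sanger")
returns "III": only the offset changes, because both variants hold Phred
scores. Going from fastq-solexa also applies the score formula, so the
low-quality end of the scale shifts while high scores stay nearly the same.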

Finally, I saw Tristan's reply to Heikki's thread, so what is the  
status quo? Is it moving forward?

cheers

Elia



On 17 Jun 2009, at 13:02, Mark A. Jensen wrote:

> Elia--
> I say a definite +1; in fact, this sounds like it should be a Hot
> Topic (see http://www.bioperl.org/wiki/Category:Hot_Topics for some
> others you might have missed in your hiatus...). I will create a page
> that can be a central point for wish lists, discussion, etc.
>
> There has been much discussion of late about FASTQ:
> http://lists.open-bio.org/pipermail/bioperl-l/2009-June/030187.html
> http://lists.open-bio.org/pipermail/bioperl-l/2009-May/029970.html
> http://lists.open-bio.org/pipermail/bioperl-l/2009-May/029911.html
> http://lists.open-bio.org/pipermail/bioperl-l/2009-April/029765.html
>
> cheers from a newbie, Mark
>
> ----- Original Message -----
> From: "Elia Stupka" <e.stupka at ucl.ac.uk>
> To: <bioperl-l at lists.open-bio.org>
> Sent: Wednesday, June 17, 2009 7:29 AM
> Subject: [Bioperl-l] Next-gen modules
>
>
>> Dear all,
>> after several years of absence I am slowly coming back to Bioperl,
>> and hope to contribute again to its development.
>> One area that I was thinking of starting from, since we are
>> actively involved with it, is to improve Bioperl's support for
>> next-gen sequencing data, tools, etc. Since I am sure I have missed
>> out on a lot of recent developments, do let me know if/what is useful.
>> One example that comes to mind is that the conversion of various
>> formats to/from FASTQ does not seem to be supported. Some code can
>> be found within Li Heng's script:
>> http://maq.sourceforge.net/fq_all2std.pl
>> but it would be good if it could make its way into SeqIO? And
>> similarly, potentially, for other next-gen sequence formats?
>> Similarly, there seems to be little in bioperl-run to support
>> tools that have been developed in this area, such as Maq, BowTie,
>> TopHat, etc?
>> Do let me know if there is a past thread on this, or other people
>> actively developing, etc. so that I can find out what priorities are.
>> thanks and best regards to all (old friends and new),
>> Elia
>> ---
>> Senior Lecturer, Bioinformatics
>> UCL Cancer Institute
>> Paul O' Gorman Building
>> University College London
>> Gower Street
>> WC1E 6BT
>> London
>> UK
>> Office (UCL): +44 207 679 6493
>> Office (ICMS): +44 0207 8822374
>> Mobile: +44 7597 566 194
>> Mobile (Italy): +39 338 8448801
