[Biopython] A third FASTQ variant from Illumina 1.3+ ?!!

Peter biopython at maubp.freeserve.co.uk
Mon Jun 15 11:49:29 UTC 2009


On Fri, Jun 5, 2009 at 8:10 PM, Peter<biopython at maubp.freeserve.co.uk> wrote:
> On Fri, Jun 5, 2009 at 1:02 PM, Peter<biopython at maubp.freeserve.co.uk> wrote:
>> On Fri, Jun 5, 2009 at 12:47 PM, Peter<biopython at maubp.freeserve.co.uk> wrote:
>>> Oh dear - it sounds like Solexa/Illumina have just made the whole FASTQ
>>> thing much much worse by introducing a third version of the FASTQ file
>>> format. ...
>
> I'm proposing to support this new FASTQ variant in Bio.SeqIO under the
> format name "fastq-illumina" (unless anyone has a better idea). In the
> meantime, anyone happy installing Biopython from CVS/github can try
> this out - but be warned it will need full testing.
>
> Comments on the (updated) docstring for the Bio.SeqIO.QualityIO module
> would also be welcome - you can read this online here:
> http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/SeqIO/QualityIO.py?cvsroot=biopython

I've since had an email conversation with an Illumina employee, who
confirmed that the new FASTQ variant has been introduced. The offset of
64 was indeed chosen so that the new Illumina 1.3+ files (PHRED scores,
offset 64) would more or less work even with code still expecting the
original Solexa/Illumina files (Solexa scores, offset 64).
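
To illustrate why sharing the offset makes the files "more or less work",
here is a rough sketch in plain Python. It is not the Bio.SeqIO code, and
the helper names (solexa_to_phred, decode_as_phred64, decode_as_solexa64)
are made up for this example. It assumes the usual relation between the
two scales, Q_phred = 10 * log10(10^(Q_solexa / 10) + 1).

```python
import math

# Assumption for this sketch: both Solexa/Illumina variants use ASCII offset 64.
ILLUMINA_OFFSET = 64


def solexa_to_phred(q_solexa):
    """Convert a Solexa score to the equivalent PHRED score."""
    return 10 * math.log10(10 ** (q_solexa / 10.0) + 1)


def decode_as_phred64(char):
    """Read a quality character as an Illumina 1.3+ PHRED score."""
    return ord(char) - ILLUMINA_OFFSET


def decode_as_solexa64(char):
    """Read the same character as an old-style Solexa score,
    then express it on the PHRED scale for comparison."""
    return solexa_to_phred(ord(char) - ILLUMINA_OFFSET)


if __name__ == "__main__":
    # Characters encoding raw values 40, 30, 20, 10, 2 and 0 at offset 64.
    for char in "h^TJB@":
        as_phred = decode_as_phred64(char)
        as_solexa = decode_as_solexa64(char)
        print("%r  read as PHRED: %2d   read as Solexa: %6.3f (PHRED scale)"
              % (char, as_phred, as_solexa))
```

For good reads the misinterpretation barely matters: at a score of 40 the
two scales differ by less than 0.001. It only becomes noticeable for poor
quality bases, where the difference reaches about 3 at a raw value of 0.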

Peter


