[Open-bio-l] FASTQ support in Biopython, BioPerl, and EMBOSS

Peter biopython at maubp.freeserve.co.uk
Thu Jul 30 21:50:34 UTC 2009


On Thu, Jul 30, 2009 at 9:08 PM, Chris Fields<cjfields at illinois.edu> wrote:
>
> I do think if it affects performance to a significant enough degree we
> can do this silently, we just need to ensure this is well-documented.

Agreed.

> My opinion is this use will prove to be an edge case anyway (most will
> want conversion to Sanger vs. Illumina/Solexa).

Absolutely.

Going from Solexa/Illumina to Sanger FASTQ will be more common
(and there are no truncation issues). Going from Sanger FASTQ to
Solexa or Illumina FASTQ will be rarer, and while truncation is
possible, it requires very high scores (above PHRED 62), which are
only likely to arise from a consensus alignment or similar.
So yes, it should be an edge case.

I guess this expected usage supports the argument for issuing a
warning on truncation, even with a modest performance overhead
(because it only slows down the rarer expected usage).
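
To make the PHRED 62 limit concrete, here is a rough sketch (not the
actual Biopython code; the function name sanger_to_illumina is made up
for illustration). It assumes the usual encodings: Sanger FASTQ stores
PHRED scores with an ASCII offset of 33, Illumina 1.3+ FASTQ uses an
offset of 64, and both stop at ASCII 126. It covers only the Illumina
case; Solexa FASTQ would also need the PHRED to Solexa score mapping.

```python
import warnings

SANGER_OFFSET = 33    # Sanger FASTQ: PHRED 0..93 stored as ASCII 33..126
ILLUMINA_OFFSET = 64  # Illumina 1.3+ FASTQ: PHRED 0..62 stored as ASCII 64..126
MAX_ASCII = 126       # highest printable ASCII character, "~"


def sanger_to_illumina(quality):
    """Convert a Sanger FASTQ quality string to Illumina 1.3+ encoding.

    Hypothetical helper: scores above PHRED 62 cannot be represented,
    so they are capped at 62 and a single warning is issued.
    """
    max_phred = MAX_ASCII - ILLUMINA_OFFSET  # 126 - 64 = 62
    phreds = [ord(char) - SANGER_OFFSET for char in quality]
    # This check is the extra cost paid only on the rarer conversion.
    if any(score > max_phred for score in phreds):
        warnings.warn("PHRED scores above %d truncated" % max_phred)
    return "".join(chr(min(score, max_phred) + ILLUMINA_OFFSET)
                   for score in phreds)
```

Going the other way (offset 64 to offset 33) needs no such check, since
every score up to 62 fits comfortably in the Sanger range of 0 to 93.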

But let's get some benchmarks done to help settle this...

Peter


