[Open-bio-l] FASTQ support in Biopython, BioPerl, and EMBOSS

Chris Fields cjfields at illinois.edu
Thu Jul 30 21:19:56 EDT 2009


I tend to agree. I don't think any performance savings from silent
truncation will be worth the headache of having to repeatedly explain
why it's happening, when a simple warning or error message
('value X out of range for fastq format y') would suffice.
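
To sketch what failing loudly could look like (a minimal illustration
only, not the Biopython, BioPerl, or EMBOSS code; the function and
constant names are made up for this example, and the offsets assume
Sanger at ASCII 33 and Illumina 1.3+ at ASCII 64 with a top PHRED
score of 62):

```python
# Hypothetical names throughout; this is an illustration, not library code.
SANGER_OFFSET = 33       # Sanger FASTQ: PHRED 0..93 stored as ASCII 33..126
ILLUMINA_OFFSET = 64     # Illumina 1.3+ FASTQ: PHRED 0..62 stored as ASCII 64..126
ILLUMINA_MAX_PHRED = 62  # highest score that still fits below ASCII 127


def sanger_to_illumina(quality: str) -> str:
    """Re-encode a Sanger quality string as Illumina 1.3+.

    Raises ValueError instead of truncating scores that do not fit.
    """
    converted = []
    for char in quality:
        phred = ord(char) - SANGER_OFFSET
        if not 0 <= phred <= ILLUMINA_MAX_PHRED:
            # Refuse to alter the data: report the offending value and stop.
            raise ValueError(
                f"value {phred} out of range for fastq format illumina"
            )
        converted.append(chr(phred + ILLUMINA_OFFSET))
    return "".join(converted)
```

The range check is one comparison per base, which is the overhead the
benchmarks would need to measure against a version that clips silently.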

chris

On Jul 30, 2009, at 6:52 PM, Aaron Mackey wrote:

> I would strongly warn against truncation, for any reason.  Use the  
> formulas you have for quality-encoding conversions, but do not  
> assume that you know more than I do about what my data contains, or  
> that you are in any way helping me by altering my data, silently or  
> otherwise.  Said another way, feel free to warn me that my data may  
> contain garbage, and utterly fail to convert it for me, but do not  
> try to fix it for me.
>
> -Aaron
>
> On Thu, Jul 30, 2009 at 5:50 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> On Thu, Jul 30, 2009 at 9:08 PM, Chris Fields <cjfields at illinois.edu> wrote:
> >
> > I do think if it affects performance to a significant enough degree
> > we can do this silently, we just need to ensure this is
> > well-documented.
>
> Agreed.
>
> > My opinion is that this use will prove to be an edge case anyway
> > (most will want conversion to Sanger rather than to Illumina/Solexa).
>
> Absolutely.
>
> Going from Solexa/Illumina to Sanger FASTQ will be more common
> (and there are no truncation issues). Going from Sanger FASTQ to
> Solexa or Illumina FASTQ will be rarer, and while a truncation is
> possible it requires very high scores (above PHRED 62) which are
> likely only to be possible from a consensus alignment or such like.
> i.e. Yes, it should be an edge case.
>
> I guess this expected usage supports the argument about issuing a
> warning on truncation, even with a modest performance overhead
> (because it only slows down the rarer expected usage).
>
> But let's get some benchmarks done to help settle this...
>
> Peter