[Bioperl-l] warning: Bio::Index::Fastq;

Peter Cock p.j.a.cock at googlemail.com
Tue Mar 11 23:31:37 UTC 2014


On Tue, Mar 11, 2014 at 11:18 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> My feeling is that we could implement a switch to allow fast parsing
> if the 4-line convention is used, and a more diligent parser that deals
> with trickier FASTQ files.  Frankly, I don't know offhand of any
> sequencers that produce line-wrapped FASTQ (maybe Roche
> 454? but the standard there is SFF, and anyway 454 is going away...).
>
> Even the PacBio and Moleculo data we have seen all uses the 4-line format.
>
> chris

Yes, outside some historical files within Sanger (where
FASTQ was invented), we seem to be fortunate that line-wrapped
FASTQ has been avoided, even with longer read lengths. Maybe
our NAR paper recommendation helped with this?

I think at this point assuming four lines per record is a safe default
(especially if backed with basic sanity testing of the @ and + lines).
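
As a rough illustration of that default, here is a minimal sketch in
Python (not Perl, and not the Bio::Index::Fastq code). The function name
iter_fastq_four_line is hypothetical, and the checks shown are just one
reasonable reading of "basic sanity testing": the title line starts
with @, the separator line starts with +, any title repeated on the +
line matches, and the sequence and quality strings have equal length.

```python
import io
from typing import Iterator, TextIO, Tuple


def iter_fastq_four_line(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (title, sequence, quality), assuming four lines per record.

    Hypothetical helper for illustration only. It raises ValueError as
    soon as a record breaks the four-line convention, so a caller could
    fall back to a slower parser that copes with line wrapping.
    """
    record_number = 0
    while True:
        title_line = handle.readline()
        if not title_line:
            return  # clean end of file between records
        record_number += 1
        seq_line = handle.readline()
        plus_line = handle.readline()
        qual_line = handle.readline()
        if not qual_line:
            raise ValueError(f"record {record_number}: file ends mid-record")

        title_line = title_line.rstrip("\r\n")
        seq = seq_line.rstrip("\r\n")
        plus_line = plus_line.rstrip("\r\n")
        qual = qual_line.rstrip("\r\n")

        # Sanity check 1: the first line of a record must start with '@'.
        if not title_line.startswith("@"):
            raise ValueError(f"record {record_number}: title lacks '@'")
        # Sanity check 2: the third line must start with '+'. In a
        # line-wrapped file this is usually more sequence, so it fails here.
        if not plus_line.startswith("+"):
            raise ValueError(f"record {record_number}: third line lacks '+'")
        # Sanity check 3: a title repeated after '+' must match the '@' title.
        title = title_line[1:]
        if len(plus_line) > 1 and plus_line[1:] != title:
            raise ValueError(f"record {record_number}: '+' title mismatch")
        # Sanity check 4: one quality character per base.
        if len(seq) != len(qual):
            raise ValueError(f"record {record_number}: length mismatch")

        yield title, seq, qual


if __name__ == "__main__":
    # Note the quality string may itself start with '@'; with fixed
    # line positions that is not ambiguous.
    example = io.StringIO("@read1 demo\nACGT\n+\n@+!I\n")
    for title, seq, qual in iter_fastq_four_line(example):
        print(title, seq, qual)
```

Because the line positions are fixed, a quality string beginning with @
or + causes no confusion here, and a wrapped file fails loudly at the
first record instead of being silently misparsed.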

Peter


