[Open-bio-l] FASTQ records with no sequence?

Chris Fields cjfields at illinois.edu
Thu Jul 30 19:59:50 UTC 2009


On Jul 30, 2009, at 10:35 AM, Peter wrote:

> Hi all,
>
> On the continuing topic of the nebulous FASTQ format, are there
> any strong views as to whether a FASTQ file could hold records
> without a sequence (and therefore no quality scores)? This could
> make sense as output from an (aggressive) quality filter.
>
> This was a discussion I meant to start on the OBF list, not the
> EMBOSS list - so here is the start of the thread:
> http://lists.open-bio.org/pipermail/emboss/2009-July/003707.html
>
> Basically in some contexts an empty FASTQ record makes sense,
> so perhaps we should include examples of this for our test suite.
> However, there is more than one reasonable way to represent
> such a record (either omitting the sequence and quality lines, or
> including blank sequence and quality lines).
>
> On Thu, Jul 30, 2009 at 4:09 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
>>
>> Peter C. wrote:
>>
>>> As we are recommending no line wrapping on output this means
>>> typical FASTQ records would be four lines - so doing the same
>>> makes sense here too.
>>
>> I vote for 4 lines on output.
>
> If we want to allow zero length sequences, then yes, I would also
> vote for the 4 line output (i.e. blank lines for the sequence and
> the quality string).

Same here.
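
To make the four-line convention concrete, here is a rough Python sketch
of a writer that emits every record unwrapped, so a zero-length read
simply gets blank sequence and quality lines. The function name
format_fastq is made up for illustration and is not taken from any of
the Bio* projects.

```python
def format_fastq(title, seq, qual):
    """Return one FASTQ record as exactly four lines, with no line wrapping.

    Hypothetical helper for illustration only. A zero-length read is
    written with blank sequence and quality lines.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    # Title line, sequence line, bare '+' line, quality line.
    return "@{}\n{}\n+\n{}\n".format(title, seq, qual)


# An ordinary record followed by an empty one, as a quality filter
# might produce after trimming every base from the second read.
text = format_fastq("read1", "ACGT", "IIII") + format_fastq("read2", "", "")
```

Each record then occupies four lines regardless of read length, which
keeps simple line-counting tools working.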

>> It should be possible to allow zero lines on input depending on
>> where the '+' check is.
>
> Yes, I'm pretty sure a parser could cope with any of the zero length
> sequence FASTQ examples I gave.
>
> Peter

Should be easy to do this with bioperl as well.
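
For what it's worth, a minimal sketch of the idea in Python (not
BioPerl code, and parse_fastq is a made-up name). It assumes unwrapped
records, and decides whether a sequence line is present by checking
whether the line after the title starts with '+', which is roughly the
'+' check Peter Rice mentions. It accepts both the blank-line form and
the omitted-line form of an empty record.

```python
def parse_fastq(lines):
    """Yield (title, seq, qual) tuples from an iterable of FASTQ lines.

    Hypothetical sketch: assumes no line wrapping. An empty record may
    have blank sequence/quality lines or omit them entirely.
    """
    rows = [line.rstrip("\r\n") for line in lines]
    i, n = 0, len(rows)
    while i < n:
        if not rows[i].startswith("@"):
            raise ValueError("expected '@' title line at line %d" % (i + 1))
        title = rows[i][1:]
        i += 1
        # If the next line is already the '+' line, the sequence line
        # was omitted; otherwise it is the (possibly blank) sequence.
        if i < n and not rows[i].startswith("+"):
            seq = rows[i]
            i += 1
        else:
            seq = ""
        if i >= n or not rows[i].startswith("+"):
            raise ValueError("missing '+' line for record %r" % title)
        i += 1
        if seq:
            if i >= n:
                raise ValueError("missing quality line for record %r" % title)
            qual = rows[i]
            i += 1
        else:
            qual = ""
            # A blank quality line is optional for an empty record.
            if i < n and rows[i] == "":
                i += 1
        if len(seq) != len(qual):
            raise ValueError("length mismatch in record %r" % title)
        yield title, seq, qual
```

Because the sequence alphabet never includes '+', the check is
unambiguous for the sequence line; a wrapped multi-line FASTQ file
would need a more careful parser than this.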

chris


